Use Salesforce CodeGen to build an autonomous wet-lab protocol planner and validator for agentic experiment design and safety optimization

by admin · November 7, 2025

In this tutorial, we build a wet-lab protocol planner and validator that acts as an intelligent agent for experimental design and execution. We design the system in Python and integrate Salesforce’s CodeGen-350M-mono model for natural language reasoning. The pipeline is built from modular components: ProtocolParser extracts structured data such as steps, durations, and temperatures from text protocols; InventoryManager verifies reagent availability and expiration dates; SchedulePlanner generates timelines and identifies parallelization opportunities; and SafetyValidator flags biosafety or chemical hazards. The LLM then generates optimization recommendations, closing the loop between sensing, planning, validation, and refinement.

import re, json, pandas as pd
from datetime import datetime, timedelta
from collections import defaultdict
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch


MODEL_NAME = "Salesforce/codegen-350M-mono"
print("Loading CodeGen model (30 seconds)...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
   MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
)
print("✓ Model loaded!")

We first import the necessary libraries and load the Salesforce CodeGen-350M-mono model locally for lightweight, API-less inference. We initialize the tokenizer and model with float16 precision and automatic device mapping to ensure compatibility and speed on Colab GPUs.

class ProtocolParser:
   def read_protocol(self, text):
       steps = []
       lines = text.split('\n')
       for i, line in enumerate(lines, 1):
           step_match = re.search(r'^(\d+)\.\s+(.+)', line.strip())
           if step_match:
               num, name = step_match.groups()
               context="\n".join(lines[i:min(i+4, len(lines))])
               duration = self._extract_duration(context)
               temp = self._extract_temp(context)
               safety = self._check_safety(context)
               steps.append({
                   'step': int(num), 'name': name, 'duration_min': duration,
                   'temp': temp, 'safety': safety, 'line': i, 'details': context[:200]
               })
       return steps
  
   def _extract_duration(self, text):
       text = text.lower()
       if 'overnight' in text: return 720
       match = re.search(r'(\d+)\s*(?:hour|hr|h)(?:s)?(?!\w)', text)
       if match: return int(match.group(1)) * 60
       # Check ranges such as "10-15 min" first; otherwise the single-value pattern below matches "15 min"
       match = re.search(r'(\d+)-(\d+)\s*(?:min|minute)', text)
       if match: return (int(match.group(1)) + int(match.group(2))) // 2
       match = re.search(r'(\d+)\s*(?:min|minute)(?:s)?', text)
       if match: return int(match.group(1))
       return 30
  
   def _extract_temp(self, text):
       text = text.lower()
       if '4°c' in text or '4 °c' in text or '4°' in text: return '4C'
       if '37°c' in text or '37 °c' in text: return '37C'
       if '-20°c' in text or '-80°c' in text: return 'FREEZER'
       if 'room temp' in text or 'rt' in text or 'ambient' in text: return 'RT'
       return 'RT'
  
   def _check_safety(self, text):
       flags = []
       text_lower = text.lower()
       if re.search(r'bsl-[23]|biosafety', text_lower): flags.append('BSL-2/3')
       if re.search(r'caution|corrosive|hazard|toxic', text_lower): flags.append('HAZARD')
       if 'sharp' in text_lower or 'needle' in text_lower: flags.append('SHARPS')
       if 'dark' in text_lower or 'light-sensitive' in text_lower: flags.append('LIGHT-SENSITIVE')
       if 'flammable' in text_lower: flags.append('FLAMMABLE')
       return flags


class InventoryManager:
   def __init__(self, csv_text):
       from io import StringIO
       self.df = pd.read_csv(StringIO(csv_text))
       self.df['expiry'] = pd.to_datetime(self.df['expiry'])
  
   def check_availability(self, reagent_list):
       issues = []
       for reagent in reagent_list:
           reagent_clean = reagent.lower().replace('_', ' ').replace('-', ' ')
           matches = self.df[self.df['reagent'].str.lower().str.contains(
               '|'.join(reagent_clean.split()[:2]), na=False, regex=True
           )]
           if matches.empty:
               issues.append(f"❌ {reagent}: NOT IN INVENTORY")
           else:
               row = matches.iloc[0]
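               # NOTE: the published listing is truncated from this comparison up to the
               # final lines of extract_reagents. Everything in between is a reconstruction:
               # the messages, the low-stock threshold, and the regex patterns are assumptions.
               if row['expiry'] < pd.Timestamp.now():
                   issues.append(f"⚠️ {reagent}: EXPIRED on {row['expiry'].date()} (lot {row['lot']})")
               if row['quantity'] < 10:  # assumed low-stock threshold
                   issues.append(f"⚠️ {reagent}: LOW STOCK ({row['quantity']} {row['unit']})")
       return issues
  
   def extract_reagents(self, protocol_text):
       reagents = set()
       patterns = [
           # assumed: one word followed by a common reagent noun, e.g. "capture antibody"
           r'(?i)\b([a-z\-]+\s+(?:antibody|buffer|substrate|solution))\b',
           # assumed: upper-case abbreviations such as PBS-T or BSA
           r'\b([A-Z]{2,}(?:-[A-Z0-9]+)?)\b',
       ]
       for pattern in patterns:
           matches = re.findall(pattern, protocol_text)
           reagents.update(m.strip() for m in matches if len(m) > 2)
       return list(reagents)[:15]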

We define the ProtocolParser and InventoryManager classes to extract structured experiment details and validate reagent inventory. ProtocolParser extracts the duration, temperature, and safety flags for each protocol step, while InventoryManager checks stock levels, expiration dates, and reagent availability via fuzzy matching.
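The substring checks in `_extract_temp` are easy to read but can misfire: `'4°' in text` is also true for "24°C", and `'rt' in text` is true for any word that contains "rt", such as "start". Below is a minimal sketch of a stricter variant, assuming regular expressions with boundary checks are acceptable here. `TEMP_PATTERNS` and `extract_temp` are hypothetical names, not part of the tutorial code.

```python
import re

# Hypothetical stricter variant of _extract_temp (an assumption, not the tutorial's code).
# Each pattern is checked in order; lookbehinds stop "24°C" from matching as "4°C".
TEMP_PATTERNS = [
    (r'(?<![\d.])-(?:20|80)\s*°\s*c', 'FREEZER'),
    (r'(?<![\d.\-])37\s*°\s*c', '37C'),
    (r'(?<![\d.\-])4\s*°\s*c', '4C'),
    (r'\b(?:room temp(?:erature)?|rt|ambient)\b', 'RT'),
]

def extract_temp(text, default='RT'):
    """Return a temperature label for the first pattern found in text."""
    text = text.lower()
    for pattern, label in TEMP_PATTERNS:
        if re.search(pattern, text):
            return label
    return default

print(extract_temp("Incubate at 4°C overnight"))
print(extract_temp("Store aliquots at -20°C"))
print(extract_temp("Heat to 24°C, then start the timer", default=None))
```

Passing `default=None` makes it visible when nothing matched, instead of silently reporting room temperature.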

class SchedulePlanner:
   def make_schedule(self, steps, start_time="09:00"):
       schedule = []
       current = datetime.strptime(f"2025-01-01 {start_time}", "%Y-%m-%d %H:%M")
       day = 1
       for step in steps:
           end = current + timedelta(minutes=step['duration_min'])
           if step['duration_min'] > 480:
               day += 1
               current = datetime.strptime(f"2025-01-0{day} 09:00", "%Y-%m-%d %H:%M")
               end = current
           schedule.append({
               'step': step['step'], 'name': step['name'][:40],
               'start': current.strftime("%H:%M"), 'end': end.strftime("%H:%M"),
               'duration': step['duration_min'], 'temp': step['temp'],
               'day': day, 'can_parallelize': step['duration_min'] > 60,
               'safety': ', '.join(step['safety']) if step['safety'] else 'None'
           })
           if step['duration_min'] <= 480:
               current = end
       return schedule

We implement SchedulePlanner and SafetyValidator to design efficient experimental schedules and enforce laboratory safety standards. We dynamically generate daily schedules, identify steps that can be parallelized, and verify potential risks such as unsafe pH levels, hazardous chemicals, or biosafety level requirements.
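The agent loop later in this tutorial calls `planner.optimize_parallelization(schedule)` and `SafetyValidator().validate(steps)`, but neither appears in the listing above. The following is a minimal sketch of what they might look like, inferred only from how their return values are used. The overlap heuristic, the pH limits, and the message wording are all assumptions. Here `optimize_parallelization` is written as a standalone function, whereas the tutorial calls it as a method of `SchedulePlanner`.

```python
import re

def optimize_parallelization(schedule):
    """Crude heuristic (assumed): a long step may overlap the next step
    when both run at the same temperature.

    Returns (list of suggestion strings, total minutes saved).
    """
    suggestions, saved_total = [], 0
    for current, nxt in zip(schedule, schedule[1:]):
        if current['can_parallelize'] and current['temp'] == nxt['temp']:
            saved = min(current['duration'], nxt['duration'])
            suggestions.append(
                f"Steps {current['step']} & {nxt['step']} can overlap (save {saved} min)"
            )
            saved_total += saved
    return suggestions, saved_total


class SafetyValidator:
    PH_SAFE_RANGE = (5.0, 11.0)  # assumed limits, not taken from the tutorial

    def validate(self, steps):
        """Return a list of human-readable risk strings for the parsed steps."""
        risks = []
        low, high = self.PH_SAFE_RANGE
        for step in steps:
            ph_match = re.search(r'\bph\s*(\d+(?:\.\d+)?)', step['details'].lower())
            if ph_match and not (low <= float(ph_match.group(1)) <= high):
                risks.append(f"Step {step['step']}: pH {ph_match.group(1)} outside {low}-{high}")
            if 'BSL-2/3' in step['safety']:
                risks.append(f"Step {step['step']}: biosafety cabinet required")
            if 'HAZARD' in step['safety']:
                risks.append(f"Step {step['step']}: PPE and chemical hood required")
        return risks
```

Whether two steps can really overlap depends on the protocol, so the suggestions are best treated as prompts for a human reviewer rather than as a validated schedule.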

def llm_call(prompt, max_tokens=200):
   try:
       inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
       outputs = model.generate(
           **inputs, max_new_tokens=max_tokens, do_sample=True,
           temperature=0.7, top_p=0.9, pad_token_id=tokenizer.eos_token_id
       )
       return tokenizer.decode(outputs[0], skip_special_tokens=True)[len(prompt):].strip()
   except Exception:
       # Fall back to a canned tip if tokenization or generation fails
       return "Batch similar temperature steps together. Pre-warm instruments."


def agent_loop(protocol_text, inventory_csv, start_time="09:00"):
   print("\n🔬 AGENT STARTING PROTOCOL ANALYSIS...\n")
   parser = ProtocolParser()
   steps = parser.read_protocol(protocol_text)
   print(f"📄 Parsed {len(steps)} protocol steps")
   inventory = InventoryManager(inventory_csv)
   reagents = inventory.extract_reagents(protocol_text)
   print(f"🧪 Identified {len(reagents)} reagents: {', '.join(reagents[:5])}...")
   inv_issues = inventory.check_availability(reagents)
   validator = SafetyValidator()
   safety_risks = validator.validate(steps)
   planner = SchedulePlanner()
   schedule = planner.make_schedule(steps, start_time)
   parallel_opts, time_saved = planner.optimize_parallelization(schedule)
   total_time = sum(s['duration'] for s in schedule)
   optimized_time = total_time - time_saved
   opt_prompt = f"Protocol has {len(steps)} steps, {total_time} min total. Key bottleneck optimization:"
   optimization = llm_call(opt_prompt, max_tokens=80)
   return {
       'steps': steps, 'schedule': schedule, 'inventory_issues': inv_issues,
       'safety_risks': safety_risks, 'parallelization': parallel_opts,
       'time_saved': time_saved, 'total_time': total_time,
       'optimized_time': optimized_time, 'ai_optimization': optimization,
       'reagents': reagents
   }

We build the agent loop, which integrates sensing, planning, validation, and revision into a single, coherent process. We use CodeGen to generate optimization suggestions aimed at refining step ordering and proposing practical improvements for efficiency and parallel execution.
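The prompt passed to `llm_call` contains only the step count and total time, so the model has little to reason about. One possible refinement is to add the longest steps to the prompt. The sketch below assumes the step dictionaries produced by `ProtocolParser`; `build_optimization_prompt` and `top_k` are hypothetical names. Whether a 350M-parameter code model makes good use of the extra context is untested here.

```python
def build_optimization_prompt(steps, top_k=3):
    """Build a prompt that lists the top_k longest steps (hypothetical helper)."""
    total = sum(s['duration_min'] for s in steps)
    # sorted() is stable, so ties keep their protocol order
    longest = sorted(steps, key=lambda s: s['duration_min'], reverse=True)[:top_k]
    lines = [f"Protocol has {len(steps)} steps, {total} min total.", "Longest steps:"]
    for s in longest:
        lines.append(f"- Step {s['step']} ({s['name']}): {s['duration_min']} min at {s['temp']}")
    lines.append("Key bottleneck optimization:")
    return "\n".join(lines)


example_steps = [
    {'step': 1, 'name': 'Blocking', 'duration_min': 60, 'temp': 'RT'},
    {'step': 2, 'name': 'Sample Incubation', 'duration_min': 120, 'temp': 'RT'},
    {'step': 3, 'name': 'Streptavidin-HRP', 'duration_min': 30, 'temp': 'RT'},
]
print(build_optimization_prompt(example_steps, top_k=2))
```

The prompt still ends with the same "Key bottleneck optimization:" cue as the original, so it can be passed to `llm_call` unchanged.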

def generate_checklist(results):
   md = "# 🔬 WET-LAB PROTOCOL CHECKLIST\n\n"
   md += f"**Total Steps:** {len(results['schedule'])}\n"
   md += f"**Estimated Time:** {results['total_time']} min ({results['total_time']//60}h {results['total_time']%60}m)\n"
   md += f"**Optimized Time:** {results['optimized_time']} min (save {results['time_saved']} min)\n\n"
   md += "## ⏱️ TIMELINE\n"
   current_day = 1
   for item in results['schedule']:
       if item['day'] > current_day:
           md += f"\n### Day {item['day']}\n"
           current_day = item['day']
       parallel = " 🔄" if item['can_parallelize'] else ""
       md += f"- [ ] **{item['start']}-{item['end']}** | Step {item['step']}: {item['name']} ({item['temp']}){parallel}\n"
   md += "\n## 🧪 REAGENT PICK-LIST\n"
   for reagent in results['reagents']:
       md += f"- [ ] {reagent}\n"
   md += "\n## ⚠️ SAFETY & INVENTORY ALERTS\n"
   all_issues = results['safety_risks'] + results['inventory_issues']
   if all_issues:
       for risk in all_issues:
           md += f"- {risk}\n"
   else:
       md += "- ✅ No critical issues detected\n"
   md += "\n## ✨ OPTIMIZATION TIPS\n"
   for tip in results['parallelization']:
       md += f"- {tip}\n"
   md += f"- 💡 AI Suggestion: {results['ai_optimization']}\n"
   return md


def generate_gantt_csv(schedule):
   df = pd.DataFrame(schedule)
   return df.to_csv(index=False)

We create output generators that convert the results into human-readable Markdown checklists and Gantt chart-compatible CSVs. Every run produces a clear reagent pick-list, a time-savings summary, and safety or inventory alerts to streamline laboratory operations.
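Because the CSV has one row per scheduled step, it can be post-processed without pandas. A small sketch, assuming the column names that `make_schedule` produces (`day`, `duration`); `minutes_per_day` and the sample rows are hypothetical and only illustrate the format:

```python
import csv
from collections import defaultdict
from io import StringIO

def minutes_per_day(gantt_csv):
    """Sum the 'duration' column per 'day' from a Gantt CSV string."""
    totals = defaultdict(int)
    for row in csv.DictReader(StringIO(gantt_csv)):
        totals[int(row['day'])] += int(row['duration'])
    return dict(totals)

# Illustrative rows in the same column order as the schedule dictionaries
sample_csv = (
    "step,name,start,end,duration,temp,day,can_parallelize,safety\n"
    "1,Coating,09:00,09:00,720,4C,2,True,BSL-2/3\n"
    "2,Blocking,09:00,10:00,60,RT,2,False,None\n"
)
print(minutes_per_day(sample_csv))  # {2: 780}
```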

SAMPLE_PROTOCOL = """ELISA Protocol for Cytokine Detection


1. Coating (Day 1, 4°C overnight)
  - Dilute capture antibody to 2 μg/mL in coating buffer (pH 9.6)
  - Add 100 μL per well to 96-well plate
  - Incubate at 4°C overnight (12-16 hours)
  - BSL-2 cabinet required


2. Blocking (Day 2)
  - Wash plate 3× with PBS-T (200 μL/well)
  - Add 200 μL blocking buffer (1% BSA in PBS)
  - Incubate 1 hour at room temperature


3. Sample Incubation
  - Wash 3× with PBS-T
  - Add 100 μL diluted samples/standards
  - Incubate 2 hours at room temperature


4. Detection Antibody
  - Wash 5× with PBS-T
  - Add 100 μL biotinylated detection antibody (0.5 μg/mL)
  - Incubate 1 hour at room temperature


5. Streptavidin-HRP
  - Wash 5× with PBS-T
  - Add 100 μL streptavidin-HRP (1:1000 dilution)
  - Incubate 30 minutes at room temperature
  - Work in dark


6. Development
  - Wash 7× with PBS-T
  - Add 100 μL TMB substrate
  - Incubate 10-15 minutes (monitor color development)
  - Add 50 μL stop solution (2M H2SO4) - CAUTION: corrosive
"""


SAMPLE_INVENTORY = """reagent,quantity,unit,expiry,lot
capture antibody,500,μg,2025-12-31,AB123
blocking buffer,500,mL,2025-11-30,BB456
PBS-T,1000,mL,2026-01-15,PT789
detection antibody,8,μg,2025-10-15,DA321
streptavidin HRP,10,mL,2025-12-01,SH654
TMB substrate,100,mL,2025-11-20,TM987
stop solution,250,mL,2026-03-01,SS147
BSA,100,g,2024-09-30,BS741"""


results = agent_loop(SAMPLE_PROTOCOL, SAMPLE_INVENTORY, start_time="09:00")
print("\n" + "="*70)
print(generate_checklist(results))
print("\n" + "="*70)
print("\n📊 GANTT CSV (first 400 chars):\n")
print(generate_gantt_csv(results['schedule'])[:400])
print("\n🎯 Time Savings:", f"{results['time_saved']} minutes via parallelization")

We run a comprehensive test using a sample ELISA protocol and a reagent inventory dataset. We print the agent’s output, including the optimized plan, parallelization gains, and AI-suggested improvements, demonstrating how the planner functions as a standalone smart lab assistant.

In conclusion, we demonstrate how agentic AI principles can enhance the reproducibility and safety of wet-lab workflows. By parsing free-form experimental text into structured, actionable plans, we automate protocol validation, reagent management, and timing optimization in a single pipeline. Integrating CodeGen enables on-device reasoning about bottlenecks and safety conditions, so the system runs self-contained and keeps data local. We end up with a fully functional planner that generates Gantt chart-compatible timelines, Markdown checklists, and AI-driven optimization tips, creating a solid foundation for an autonomous lab planning system.


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for the benefit of society. His most recent endeavor is the launch of Marktechpost, an AI media platform that stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easy to understand for a broad audience. The platform has more than 2 million monthly views, which shows that it is very popular among viewers.
