
Step-by-step guide to AI Agent Development with Microsoft Agent-Lightning

In this tutorial, we set up an advanced AI agent pipeline using Microsoft's Agent-Lightning framework. We run everything directly inside Google Colab, which means we can experiment with the server and client components in one place. By defining a small QA agent, connecting it to a local Agent-Lightning server, and then training it with multiple system prompts, we can observe how the framework supports resource updates, task queuing, and automated evaluation. The complete code is here.

!pip -q install agentlightning openai nest_asyncio python-dotenv > /dev/null
import os, threading, time, asyncio, nest_asyncio, random
from getpass import getpass
from agentlightning.litagent import LitAgent
from agentlightning.trainer import Trainer
from agentlightning.server import AgentLightningServer
from agentlightning.types import PromptTemplate
import openai
if not os.getenv("OPENAI_API_KEY"):
   try:
       os.environ["OPENAI_API_KEY"] = getpass("🔑 Enter OPENAI_API_KEY (leave blank if using a local/proxy base): ") or ""
   except Exception:
       pass
MODEL = os.getenv("MODEL", "gpt-4o-mini")

We first install the required libraries and import all the core modules needed for Agent-Lightning. We also safely set the OpenAI API key and define the model that will be used throughout the tutorial.

class QAAgent(LitAgent):
   def training_rollout(self, task, rollout_id, resources):
       """Given a task {'prompt':..., 'answer':...}, ask LLM using the server-provided system prompt and return a reward in [0,1]."""
       sys_prompt = resources["system_prompt"].template
       user = task["prompt"]; gold = task.get("answer","").strip().lower()
       try:
           r = openai.chat.completions.create(
               model=MODEL,
               messages=[{"role":"system","content":sys_prompt},
                         {"role":"user","content":user}],
               temperature=0.2,
           )
           pred = r.choices[0].message.content.strip()
       except Exception as e:
           pred = f"[error]{e}"
        def score(pred, gold):
            P = pred.lower()
            base = 1.0 if gold and gold in P else 0.0
            gt = set(gold.split()); pr = set(P.split())
            inter = len(gt & pr); denom = (len(gt) + len(pr)) or 1
            overlap = 2 * inter / denom
            # The original snippet is truncated here; the brevity bonus and the
            # weighting below are a plausible reconstruction, not the exact code.
            brevity = 0.2 if base == 1.0 and len(P.split()) <= 3 else 0.0
            return max(0.0, min(1.0, 0.7 * base + 0.2 * overlap + brevity))
        return score(pred, gold)

We define a simple QAAgent by extending LitAgent, handling each training rollout by sending the user prompt to the LLM, collecting the response, and scoring it against the gold answer. We design the reward function to check correctness, token overlap, and brevity, encouraging the agent to produce concise, correct output.
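The reward heuristic can be tried on its own, outside the agent. Below is a standalone sketch of that scoring logic; the exact weights and the brevity threshold are assumptions, since the tutorial's snippet does not spell them out.

```python
def score(pred: str, gold: str) -> float:
    """Blend exact-match, token overlap, and a brevity bonus into a [0,1] reward."""
    P = pred.lower().strip()
    g = gold.lower().strip()
    base = 1.0 if g and g in P else 0.0                 # gold answer contained in prediction
    gt, pr = set(g.split()), set(P.split())
    denom = (len(gt) + len(pr)) or 1
    overlap = 2 * len(gt & pr) / denom                  # Dice coefficient over tokens
    brevity = 0.2 if base == 1.0 and len(pr) <= 3 else 0.0  # bonus for terse correct answers
    return max(0.0, min(1.0, 0.7 * base + 0.2 * overlap + brevity))

print(score("Paris", "Paris"))                          # terse and correct -> 1.0
print(score("I think it is Paris, the capital.", "Paris"))  # correct but verbose
print(score("London", "Paris"))                         # wrong -> 0.0
```

A verbose correct answer still earns the containment reward but loses the brevity bonus, which is what nudges the agent toward short, factual replies.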

TASKS = [
   {"prompt":"Capital of France?","answer":"Paris"},
   {"prompt":"Who wrote Pride and Prejudice?","answer":"Jane Austen"},
   {"prompt":"2+2 = ?","answer":"4"},
]
PROMPTS = [
   "You are a terse expert. Answer with only the final fact, no sentences.",
   "You are a helpful, knowledgeable AI. Prefer concise, correct answers.",
   "Answer as a rigorous evaluator; return only the canonical fact.",
   "Be a friendly tutor. Give the one-word answer if obvious."
]
nest_asyncio.apply()
HOST, PORT = "127.0.0.1", 9997

We define a small benchmark with three QA tasks and curate multiple candidate system prompts for optimization. We then apply nest_asyncio and set the local server host and port, so that we can run the Agent-Lightning server and client inside a single Colab runtime.

async def run_server_and_search():
   server = AgentLightningServer(host=HOST, port=PORT)
   await server.start()
   print("✅ Server started")
   await asyncio.sleep(1.5)
   results = []
   for sp in PROMPTS:
       await server.update_resources({"system_prompt": PromptTemplate(template=sp, engine="f-string")})
       scores = []
       for t in TASKS:
           tid = await server.queue_task(sample=t, mode="train")
           rollout = await server.poll_completed_rollout(tid, timeout=40)  # waits for a worker
           if rollout is None:
               print("⏳ Timeout waiting for rollout; continuing...")
               continue
           scores.append(float(getattr(rollout, "final_reward", 0.0)))
       avg = sum(scores)/len(scores) if scores else 0.0
       print(f"🔎 Prompt avg: {avg:.3f}  |  {sp}")
       results.append((sp, avg))
   best = max(results, key=lambda x: x[1]) if results else ("",0)
   print("\n🏁 BEST PROMPT:", best[0], " | score:", f"{best[1]:.3f}")
   await server.stop()

We start the Agent-Lightning server and iterate through our candidate system prompts, updating the shared system_prompt resource before queuing each training task. We then poll for completed rollouts, compute the average reward per prompt, report the best-performing prompt, and gracefully stop the server.
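The selection step at the end of the loop is ordinary Python: average each prompt's per-task rewards and take the arg-max. Here is an offline illustration with made-up placeholder rewards (the prompt labels and numbers are hypothetical, not outputs of the tutorial):

```python
# Hypothetical per-task rewards for three candidate prompts.
rollout_rewards = {
    "terse expert":  [1.0, 1.0, 0.7],
    "helpful AI":    [1.0, 0.7, 0.7],
    "rigorous eval": [0.7, 1.0, 0.7],
}

# Average the rewards per prompt, then keep the highest-scoring one,
# mirroring the max(results, key=...) step in the server loop.
results = [(sp, sum(r) / len(r)) for sp, r in rollout_rewards.items()]
best = max(results, key=lambda x: x[1])
print(f"BEST PROMPT: {best[0]} | score: {best[1]:.3f}")
```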

def run_client_in_thread():
    agent = QAAgent()
    trainer = Trainer(n_workers=2)
    # The original snippet is truncated here; pointing the workers at the
    # local server started above is the natural completion.
    trainer.fit(agent, backend=f"http://{HOST}:{PORT}")
client_thr = threading.Thread(target=run_client_in_thread, daemon=True)
client_thr.start()
asyncio.run(run_server_and_search())

We start the client in a separate thread with two parallel workers so it can process tasks sent by the server. At the same time, we run the server loop in the main thread, which evaluates the candidate prompts, collects rollout results, and reports the best system prompt based on average reward.
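The run pattern above, a background worker thread consuming tasks while the main thread drives an asyncio loop, can be sketched with plain standard-library queues (all names here are hypothetical; doubling a number stands in for an LLM rollout):

```python
import asyncio
import queue
import threading

tasks, results = queue.Queue(), queue.Queue()

def worker():
    """Background 'client': pull tasks, push results, stop on the None sentinel."""
    while True:
        t = tasks.get()
        if t is None:
            break
        results.put(t * 2)   # stand-in for running a rollout

async def serve():
    """Main-thread 'server': enqueue tasks, then collect the rewards."""
    for t in [1, 2, 3]:
        tasks.put(t)
    collected = [results.get(timeout=5) for _ in range(3)]
    tasks.put(None)          # tell the worker to shut down
    return collected

thr = threading.Thread(target=worker, daemon=True)
thr.start()
out = asyncio.run(serve())
print(out)
```

The single FIFO worker returns results in submission order; with several workers, as in the tutorial, completion order is not guaranteed, which is why the real server matches rollouts to tasks by id.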

All in all, we see how Agent-Lightning lets us create a flexible agent-training pipeline with just a few lines of code. We can start the server in a single Colab environment, run parallel client workers, evaluate different system prompts, and automatically measure performance. This illustrates how the framework simplifies building, testing, and optimizing AI agents in a structured way.




Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform known for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable to a broad audience. The platform boasts over 2 million monthly views, illustrating its popularity among readers.
