A Coding Implementation of an Advanced Tool-Using AI Agent with Semantic Kernel and Gemini
In this tutorial, we use Semantic Kernel together with Google's free Gemini model, running everything seamlessly on Google Colab. We first wire up Semantic Kernel plugins as tools, such as web search, math evaluation, file I/O, and notes, and then let Gemini orchestrate them through structured JSON output. We watch the agent plan, call tools, process observations, and deliver a final answer.
!pip -q install semantic-kernel google-generativeai duckduckgo-search rich

import os, re, json, time, math, textwrap, getpass, pathlib, typing as T
from rich import print
import google.generativeai as genai
from duckduckgo_search import DDGS

# Configure Gemini: read the key from the environment or prompt for it.
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY") or getpass.getpass("🔑 Enter GEMINI_API_KEY: ")
genai.configure(api_key=GEMINI_API_KEY)

GEMINI_MODEL = "gemini-1.5-flash"
model = genai.GenerativeModel(GEMINI_MODEL)

import semantic_kernel as sk
# The decorator's location has moved between Semantic Kernel releases,
# so try the current import path first and fall back to the older one.
try:
    from semantic_kernel.functions import kernel_function
except Exception:
    from semantic_kernel.utils.function_decorator import kernel_function
We first install the libraries and import the required modules, including Semantic Kernel, Gemini, and DuckDuckGo search. We set up the Gemini API key and model for generating responses, and import Semantic Kernel's kernel_function decorator so we can register our custom tools.
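Before building the toolset, it can be worth a one-off call to confirm the key and model name resolve. This quick sanity check is our addition, not part of the original flow:

# Optional sanity check (not in the original tutorial): one quick generation
# to confirm the API key and model name are valid before proceeding.
probe = model.generate_content("Reply with the single word: ready")
print(probe.text.strip())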
class AgentTools:
    """Semantic Kernel-native toolset the agent can call."""

    def __init__(self):
        self._notes: list[str] = []

    @kernel_function(name="web_search", description="Search the web for fresh info; returns JSON list of {title,href,body}.")
    def web_search(self, query: str, k: int = 5) -> str:
        k = max(1, min(int(k), 10))
        hits = list(DDGS().text(query, max_results=k))
        return json.dumps(hits[:k], ensure_ascii=False)

    @kernel_function(name="calc", description="Evaluate a safe math expression, e.g., '41*73+5' or 'sin(pi/4)**2'.")
    def calc(self, expression: str) -> str:
        allowed = {"__builtins__": {}}
        for n in ("pi", "e", "tau"):
            allowed[n] = getattr(math, n)
        for fn in ("sin", "cos", "tan", "asin", "sqrt", "log", "log10", "exp", "floor", "ceil"):
            allowed[fn] = getattr(math, fn)
        return str(eval(expression, allowed, {}))

    @kernel_function(name="now", description="Get the current local time string.")
    def now(self) -> str:
        return time.strftime("%Y-%m-%d %H:%M:%S")

    @kernel_function(name="write_file", description="Write text to a file path; returns saved path.")
    def write_file(self, path: str, content: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content, encoding="utf-8")
        return str(p)

    @kernel_function(name="read_file", description="Read text from a file path; returns first 4000 chars.")
    def read_file(self, path: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        return p.read_text(encoding="utf-8")[:4000]

    @kernel_function(name="add_note", description="Persist a short note into memory.")
    def add_note(self, note: str) -> str:
        self._notes.append(note.strip())
        return f"Notes stored: {len(self._notes)}"

    @kernel_function(name="search_notes", description="Search notes by keyword; returns top matches.")
    def search_notes(self, query: str) -> str:
        q = query.lower()
        hits = [n for n in self._notes if q in n.lower()]
        return json.dumps(hits[:10], ensure_ascii=False)

kernel = sk.Kernel()
tools = AgentTools()
kernel.add_plugin(tools, "agent_tools")
We define the AgentTools class as our Semantic Kernel toolset, giving the agent capabilities such as web search, safe math evaluation, time retrieval, file read/write, and lightweight note storage. We then initialize the kernel and register these tools as a plugin so the agent can call them during its reasoning loop.
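Since @kernel_function leaves each method directly callable, we can smoke-test the toolset before involving the model. This quick check is illustrative and not part of the original walkthrough:

# Smoke-test the tools directly, bypassing the model (illustrative only).
print(tools.calc("41*73+5"))         # -> "2998"
print(tools.now())                    # e.g. "2025-01-01 12:00:00"
print(tools.add_note("Chandrayaan-3 landed near the lunar south pole."))
print(tools.search_notes("lunar"))    # JSON list of matching notes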
def list_tools() -> dict[str, dict]:
    registry = {}
    for name in ("web_search", "calc", "now", "write_file", "read_file", "add_note", "search_notes"):
        fn = getattr(tools, name)
        desc = getattr(fn, "description", "") or fn.__doc__ or ""
        sig = "()" if name in ("now",) else "(**kwargs)"
        registry[name] = {"callable": fn, "description": desc.strip(), "signature": sig}
    return registry

TOOLS = list_tools()
CATALOG = "\n".join(
    [f"- {n}{v['signature']}: {v['description']}" for n, v in TOOLS.items()]
)

SYSTEM = f"""You are a meticulous tool-using AI agent.
You can call TOOLS by returning ONLY a JSON object:
{{"tool":"<tool_name>","args":{{...}}}}
After finishing all steps, respond with:
{{"final_answer":"<your answer>"}}

TOOLS available:
{CATALOG}

Rules:
- Prefer factuality; cite web_search results as [title](url).
- Keep steps minimal; at most 8 tool calls.
- For file outputs, use write_file and mention the saved path.
- If a tool error occurs, adjust arguments and try again.
"""

def extract_json(s: str) -> dict | None:
    # Scan for the first brace-delimited span that parses as valid JSON.
    for m in re.finditer(r"\{.*\}", s, flags=re.S):
        try:
            return json.loads(m.group(0))
        except Exception:
            continue
    return None
We create a list_tools helper that collects all available tools, their descriptions, and signatures into a registry for the agent. We then build a catalog string listing these tools and embed it in a system prompt that instructs Gemini to call tools in strict JSON format and to return the final answer the same way. Finally, we define extract_json to safely parse tool calls or final answers out of the model output.
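To see why a lenient parser helps, consider a reply where the model wraps its JSON in prose. This illustrative check (not from the original post) shows extract_json still recovering the tool call:

# Illustrative only: extract_json tolerates prose around the JSON object.
reply = 'Sure, I will search first.\n{"tool":"web_search","args":{"query":"Chandrayaan-3","k":3}}'
print(extract_json(reply))
# -> {'tool': 'web_search', 'args': {'query': 'Chandrayaan-3', 'k': 3}}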
def run_agent(task: str, max_steps: int = 8, verbose: bool = True) -> str:
    transcript: list[dict] = [{"role": "system", "parts": [SYSTEM]},
                              {"role": "user", "parts": [task]}]
    observations = ""
    for step in range(1, max_steps + 1):
        # Flatten the transcript into role-tagged text parts for Gemini.
        content = []
        for m in transcript:
            role = m["role"]
            for part in m["parts"]:
                content.append({"text": f"[{role.upper()}]\n{part}\n"})
        if observations:
            content.append({"text": f"[OBSERVATIONS]\n{observations[-4000:]}\n"})
        resp = model.generate_content(content, request_options={"timeout": 60})
        text = resp.text or ""
        if verbose:
            print(f"\n[bold cyan]Step {step} - Model[/bold cyan]\n{textwrap.shorten(text, 1000)}")
        cmd = extract_json(text)
        if not cmd:
            transcript.append({"role": "user", "parts": [
                "Please output strictly one JSON object per your rules."
            ]})
            continue
        if "final_answer" in cmd:
            return cmd["final_answer"]
        if "tool" in cmd:
            tname = cmd["tool"]
            args = cmd.get("args", {})
            if tname not in TOOLS:
                observations += f"\nToolError: unknown tool '{tname}'."
                continue
            try:
                out = TOOLS[tname]["callable"](**args)
                out_str = out if isinstance(out, str) else json.dumps(out, ensure_ascii=False)
                if len(out_str) > 4000:
                    out_str = out_str[:4000] + "...[truncated]"
                observations += f"\n[{tname}] {out_str}"
                transcript.append({"role": "user", "parts": [f"Observation from {tname}:\n{out_str}"]})
            except Exception as e:
                observations += f"\nToolError {tname}: {e}"
                transcript.append({"role": "user", "parts": [f"ToolError {tname}: {e}"]})
        else:
            transcript.append({"role": "user", "parts": [
                "Your output must be a single JSON with either a tool call or final_answer."
            ]})
    return "Reached step limit. Summarize findings:\n" + observations[-1500:]
We run the iterative agent loop: on each step we feed the system and user context to Gemini, parse its JSON-only reply, execute the requested tool, append the observation back into the transcript, and return the final_answer when one appears. If the model drifts from the JSON format, we nudge it back, and if it hits the step limit, we summarize the accumulated findings.
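Before the full demo, a single-tool task is a cheap way to exercise the whole loop. This usage sketch assumes a valid GEMINI_API_KEY and network access, and the exact trace depends on the model's output:

# Minimal end-to-end check: the agent should emit one calc call,
# observe "2998", then return a final_answer (behavior may vary by run).
quick = run_agent("Use the calc tool to compute 41*73+5, then give the final answer.",
                  max_steps=3, verbose=False)
print(quick)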
DEMO = (
    "Find the top 3 concise facts about Chandrayaan-3 with sources, "
    "compute 41*73+5, store a 3-line summary into '/content/notes.txt', "
    "add the summary to notes, then show current time and return a clean final answer."
)

if __name__ == "__main__":
    print("[bold]🔧 Tools loaded:[/bold]", ", ".join(TOOLS.keys()))
    ans = run_agent(DEMO, max_steps=8, verbose=True)
    print("\n" + "=" * 80 + "\n[bold green]FINAL ANSWER[/bold green]\n" + ans + "\n")
We define a demonstration task that asks the agent to search, calculate, write a file, save notes, and report the current time. We then run the agent end to end, printing the loaded tools and the final answer to verify the complete tool-use workflow in one go.
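Scaling the blueprint to new capabilities follows the same pattern as the built-in tools: decorate a method, register the plugin, and extend the registry. The word_count tool below is hypothetical, shown only as a sketch:

class ExtraTools:
    """Hypothetical extension plugin; word_count is illustrative only."""
    @kernel_function(name="word_count", description="Count the words in a text string.")
    def word_count(self, text: str) -> str:
        return str(len(text.split()))

extra = ExtraTools()
kernel.add_plugin(extra, "extra_tools")
# Register it with the agent's own registry; note that CATALOG and SYSTEM
# are built once above, so they must be rebuilt to advertise the new tool.
TOOLS["word_count"] = {"callable": extra.word_count,
                       "description": "Count the words in a text string.",
                       "signature": "(**kwargs)"}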
In summary, we see how Semantic Kernel and Gemini work together to form a compact yet powerful agent system inside Colab. We not only test tool calls but also watch the results flow back into the reasoning loop to produce a clean final answer. We now have a reusable blueprint for scaling to more tools or tasks, and we show that building a practical, advanced AI agent can be simple and effective with the right combination of frameworks.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform known for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform draws over 2 million views per month, demonstrating its popularity among readers.