
A Guide to Understanding, Building, and Optimizing API-Calling Agents

The role of artificial intelligence in technology companies is evolving rapidly. AI use cases have shifted from passive information processing to agents that actively perform tasks. According to a global AI adoption survey conducted by Georgian and NewtonX in March 2025, 91% of surveyed executives at growth-stage and enterprise technology companies said they are using or planning to use agentic AI.

API-calling agents are a prime example of this shift toward agentic AI. An API-calling agent uses large language models (LLMs) to interact with software systems through their application programming interfaces (APIs).

For example, by converting natural language commands into precise API calls, an agent can retrieve real-time data, automate routine tasks, and even control other software systems. This capability makes AI agents a useful intermediary between human intent and software functionality.

Companies are currently using API-calling agents in a variety of fields, including:

  • Consumer applications: Assistants such as Apple’s Siri or Amazon’s Alexa are designed to simplify daily tasks, such as controlling smart home devices and making bookings.
  • Enterprise workflows: Enterprises have deployed API agents to automate repetitive tasks, such as retrieving data from a CRM, generating reports, or consolidating information from internal systems.
  • Data retrieval and analysis: Companies are using API-calling agents to simplify access to proprietary datasets, subscription-based resources, and public APIs in order to generate insights.

In this article, I take an engineering-centric approach to understanding, building, and optimizing API-calling agents. The material is based in part on applied research and development conducted by the Georgian AI Lab. The motivating question for much of the AI Lab’s research on API-calling agents is: “If an organization has an API, what is the most effective way to build an agent that can interact with that API in natural language?”

I’ll explain how API-calling agents work and how to architect and engineer these agents for performance. Finally, I’ll provide a systematic workflow that engineering teams can use to implement API-calling agents.

I. Key Definitions

  • API or application programming interface: A set of rules and protocols that enable different software applications to communicate and exchange information.
  • Agent: An AI system designed to perceive its environment, make decisions, and take actions to achieve specific goals.
  • API-calling agent: A specialized AI agent that converts natural language instructions into precise API calls.
  • Code-generation agent: An AI system that aids software development by writing, modifying, and debugging code. Although related, my focus here is on API-calling agents, though AI coding assistants can help build these agents.
  • MCP (Model Context Protocol): A protocol, developed by Anthropic, that defines how LLMs connect to and use external tools and data sources.

II. The Core Task: Converting Natural Language into API Actions

The basic function of an API-calling agent is to interpret the user’s natural language requests and convert them into one or more precise API calls. This process usually involves:

  1. Intent identification: Understand the user’s goals, even if the expression is ambiguous.
  2. Tool selection: Identify the appropriate API endpoint (or “tool”) from a set of available options that can satisfy the intent.
  3. Parameter extraction: Identify and extract the necessary parameters of the selected API call from the user query.
  4. Execution and response generation: Make an API call, receive a response, and then combine this information into a coherent answer or perform subsequent actions.

Consider a request like “Hey Siri, what’s the weather today?” The agent must determine that it needs to call a weather API, determine the user’s current location (or ask for permission to access it), and then formulate the API call to retrieve the weather information.

For “Hey Siri, what’s the weather today?”, the sample API call might look like this:

GET /v1/weather?location=new%20york&units=metric
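As a rough sketch, the step from extracted parameters to a concrete request could look like this in Python. The endpoint and parameter names mirror the illustrative call above; `build_weather_call` is a hypothetical helper, not a real Siri or weather API:

```python
from urllib.parse import urlencode, quote

def build_weather_call(location: str, units: str = "metric") -> str:
    """Serialize parameters extracted from a natural-language query
    into a well-formed API request line."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"location": location, "units": units}, quote_via=quote)
    return f"GET /v1/weather?{query}"

# "Hey Siri, what's the weather today?" with the location already resolved:
print(build_weather_call("new york"))
# -> GET /v1/weather?location=new%20york&units=metric
```

The agent's real work happens upstream of this function: deciding that the weather tool is the right one and extracting `location` from the conversation.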

Several challenges are inherent in this translation process, including the ambiguity of natural language and the need for agents to maintain context across multi-step interactions.

For example, agents often have to “remember” earlier parts of a conversation or the results of previous API calls to inform the current action. Context loss is a common failure mode when context is not explicitly managed.
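One common mitigation is to keep an explicit, append-only record of turns and tool results that is replayed into each LLM call. A minimal stdlib sketch of the idea (the class and method names are illustrative, not from any particular framework):

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Append-only conversation memory: every user turn and tool result is
    recorded so later steps can resolve references like 'and tomorrow?'."""
    events: list = field(default_factory=list)

    def record(self, role: str, content: str) -> None:
        self.events.append({"role": role, "content": content})

    def context_window(self, last_n: int = 10) -> list:
        # The slice that would be passed back to the LLM on the next step.
        return self.events[-last_n:]

memory = AgentMemory()
memory.record("user", "What's the weather in New York?")
memory.record("tool", "get_weather -> 18 C, clear")
memory.record("user", "And tomorrow?")  # only resolvable with the context above
```

Production agents usually add summarization or truncation on top of this, but the failure mode is the same: if the earlier turns are dropped, “And tomorrow?” becomes unanswerable.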

III. Architectural Solutions: Key Components and Protocols

Building an effective API-calling agent requires a structured architectural approach.

1. Defining “Tools” for the Agent

For an LLM to use an API, the API’s functionality must be described in a way the model can understand. Each API endpoint or function is typically represented as a “tool.” A robust tool definition includes:

  • A clear natural language description of the purpose and function of the tool.
  • The exact specification of its input parameters (name, type, whether required or optional, and description).
  • The description of the output or data returned by the tool.
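For instance, a tool definition for the weather example from earlier might look like the following. The exact field names vary by framework; this shape loosely follows common function-calling conventions, and `get_weather` is a hypothetical tool:

```python
# A tool definition the LLM can read: a name, a natural-language
# description, and a JSON Schema describing the input parameters.
get_weather_tool = {
    "name": "get_weather",
    "description": "Retrieve current weather conditions for a location.",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name, e.g. 'New York'.",
            },
            "units": {
                "type": "string",
                "enum": ["metric", "imperial"],
                "description": "Measurement system for the response.",
            },
        },
        "required": ["location"],
    },
    "output_description": "Current temperature and conditions.",
}
```

Note how each of the three elements above appears: a purpose description, typed parameters with required/optional status, and an output description.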

2. The role of Model Context Protocol (MCP)

MCP is a key enabler of more standardized and robust tool use by LLMs. It provides a structured format for defining how a model connects to external tools and data sources.

MCP’s standardization is beneficial because it allows easier integration of diverse tools and promotes reusability of tool definitions across different agents and models. For engineering teams, it is best practice to start from a well-defined API specification, such as an OpenAPI spec. Tools like Stainless are designed to help convert OpenAPI specifications into MCP configurations, simplifying the process of making an API “agent-ready.”

3. Agent Frameworks and Implementation Choices

Several frameworks can help build the agent itself. These include:

  • Pydantic: While not an agent framework itself, Pydantic can be used to define data structures and enforce type safety for tool inputs and outputs, which is important for reliability. Many custom agent implementations rely on Pydantic for this structural integrity.
  • LastMile’s mcp_agent: This framework is designed specifically to work with MCP and provides a lightweight structure that aligns with the effective-agent patterns described in Anthropic’s research.
  • Internal frameworks: Using AI code generation (with tools like Cursor or Cline) to help write agents, their tools, and the surrounding boilerplate is becoming increasingly common. The Georgian AI Lab’s experience working with companies on agent implementations suggests this approach can work well for building very minimal custom frameworks.

IV. Engineering for Reliability and Performance

Ensuring that an agent makes API calls reliably and performs well requires focused engineering effort. Two key areas are (1) dataset creation and validation and (2) prompt engineering and optimization.

1. Dataset Creation and Validation

Training (where applicable), testing, and optimizing agents requires high-quality datasets. Such a dataset should contain representative natural language queries and their corresponding correct API call sequences or outcomes.

  • Manual creation: Manually curated datasets ensure high accuracy and relevance but are labor-intensive to produce.
  • Synthetic generation: Dataset creation can be scaled programmatically or with LLMs, but this approach presents significant challenges. Research from the Georgian AI Lab found that ensuring the correctness and realistic complexity of synthetically generated queries and API calls is very difficult. Often the generated problems are either too trivial or impossibly complex, making it hard to measure nuanced agent performance. Careful validation of synthetic data is crucial.

For critical evaluations, smaller, high-quality, manually validated datasets are often more reliable than large, noisy synthetic ones.

2. Prompt Engineering and Optimization

The performance of LLM-based agents is heavily influenced by the prompts used to guide their reasoning and tool selection.

  • Effective prompts involve clearly defining the agent’s tasks, providing descriptions of available tools, and building prompts to encourage accurate parameter extraction.
  • Systematic optimization using frameworks such as DSPy can significantly improve performance. DSPy lets you define the components of an agent (e.g., modules for thought generation, tool selection, and parameter formatting) and then use a small number of examples from your dataset to find optimized prompts or configurations for those components through a compiler-like approach.
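DSPy’s actual API is beyond this article’s scope, but the core idea of its compiler-like optimization, searching for the few-shot examples that make a component score best on your dataset, can be illustrated in plain Python. Everything below (the toy dataset, and the word-overlap “predictor” standing in for an LLM call) is a conceptual sketch, not DSPy’s interface:

```python
import itertools

# Toy labeled dataset: natural-language query -> correct tool.
dataset = [
    ("What's the weather in Paris?", "get_weather"),
    ("Add milk to my shopping list", "add_task"),
    ("Show me my tasks", "get_tasks"),
    ("Is it raining in Tokyo?", "get_weather"),
]

def predict_tool(query: str, few_shot: list) -> str:
    """Stand-in for an LLM call: return the tool of the few-shot
    example sharing the most words with the query."""
    def overlap(a, b):
        return len(set(a.lower().split()) & set(b.lower().split()))
    return max(few_shot, key=lambda ex: overlap(ex[0], query))[1]

def score(few_shot: list, evalset: list) -> float:
    """Tool-selection accuracy given this choice of few-shot examples."""
    hits = sum(predict_tool(q, few_shot) == tool for q, tool in evalset)
    return hits / len(evalset)

# "Compile": search candidate few-shot pairs, keep the best-scoring one.
candidates = itertools.combinations(dataset, 2)
best_shots = max(candidates, key=lambda shots: score(list(shots), dataset))
```

DSPy performs an analogous search over prompts and demonstrations automatically, driven by your metric rather than this toy accuracy function.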

V. A Recommended Workflow for Building API-Calling Agents

Developing a robust API-calling agent is an iterative engineering discipline. Based on the findings of the Georgian AI Lab’s research, a systematic workflow looks like this:

  1. Start with a clear API definition: Begin with a well-structured OpenAPI specification for the API your agent will interact with.
  2. Standardize tool access: Convert the OpenAPI specification into an MCP configuration. Tools like Stainless can facilitate this, creating a standardized way for your agent to understand and use the API.
  3. Implement the agent: Choose an appropriate framework or approach. This may involve using Pydantic for data modeling within a custom agent structure, or leveraging a framework such as LastMile’s mcp_agent, which is built around MCP.
    • Before doing so, consider connecting your MCP configuration to a tool like Claude Desktop or Cline and driving it manually. This gives you a feel for how a general-purpose agent uses your API, how many iterations it typically takes to call the MCP tools correctly, and other details that can save you time during implementation.
  4. Curate a quality evaluation dataset: Manually create, or meticulously validate, a dataset of queries and expected API interactions. This is crucial for reliable testing and optimization.
  5. Optimize agent prompts and logic: Use a framework such as DSPy to refine the agent’s prompts and internal logic, using the dataset to improve accuracy and reliability.

VI. An Illustrative Example of the Workflow

Here is a simplified example of the recommended workflow for building an API-calling agent:

Step 1: Start with a clear API definition

Imagine an API for managing a simple to-do list, defined in OpenAPI:

```yaml
openapi: 3.0.0
info:
  title: To-Do List API
  version: 1.0.0
paths:
  /tasks:
    post:
      summary: Add a new task
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                description:
                  type: string
      responses:
        "201":
          description: Task created successfully
    get:
      summary: Get all tasks
      responses:
        "200":
          description: A list of tasks
```

Step 2: Standardize Tool Access

Convert the OpenAPI specification into a Model Context Protocol (MCP) configuration. Using a tool like Stainless, this might result in:

| Tool name | Description | Input parameters | Output description |
| --- | --- | --- | --- |
| Add task | Adds a new task to the to-do list. | `description` (string, required): a description of the task. | Task creation confirmation. |
| Get tasks | Retrieves all tasks from the to-do list. | None | A list of tasks and their descriptions. |

Step 3: Implement the agent

Use Pydantic for data modeling to create functions that correspond to the MCP tools. Then use an LLM to interpret natural language queries and select the appropriate tool and parameters.
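A minimal sketch of this step, using Pydantic for the tool’s input model and a simple keyword check standing in for the LLM’s tool-selection call (the function names and the crude quote-based parsing are illustrative only):

```python
from pydantic import BaseModel

class AddTaskInput(BaseModel):
    """Input model for the 'Add task' tool, mirroring the MCP definition."""
    description: str

TASKS: list[str] = []  # in-memory stand-in for the to-do list backend

def add_task(params: AddTaskInput) -> str:
    TASKS.append(params.description)
    return f"Created task: {params.description}"

def get_tasks() -> list[str]:
    return list(TASKS)

def run_agent(query: str):
    """In a real agent an LLM selects the tool and extracts parameters;
    here a keyword heuristic stands in for that call."""
    if query.lower().startswith("add"):
        # The extracted parameter would come from the LLM;
        # Pydantic validates it before the tool runs.
        text = query.split("'")[1] if "'" in query else query
        return add_task(AddTaskInput(description=text))
    return get_tasks()

run_agent("Add 'Buy groceries' to my list.")
```

Swapping the keyword heuristic for a real LLM call (selecting among the MCP tool definitions) is what turns this stub into an agent; the Pydantic validation layer stays the same.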

Step 4: Curate a Quality Evaluation Dataset

Create a dataset:

| Query | Expected API call | Expected outcome |
| --- | --- | --- |
| “Add ‘Buy groceries’ to my list.” | Add task with `description` = “Buy groceries” | Task creation confirmation |
| “What’s on my list?” | Get tasks | Task list including “Buy groceries” |
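A dataset like this can then drive a small evaluation harness. The sketch below scores an agent’s tool choices against the expected calls; `choose_tool` is a placeholder for the real agent under test:

```python
# Each case pairs a natural-language query with the expected tool call.
eval_set = [
    {"query": "Add 'Buy groceries' to my list.", "expected_tool": "add_task"},
    {"query": "What's on my list?", "expected_tool": "get_tasks"},
]

def choose_tool(query: str) -> str:
    """Placeholder for the agent under test; swap in the real agent here."""
    return "add_task" if "add" in query.lower() else "get_tasks"

def evaluate(agent, cases) -> float:
    """Fraction of cases where the agent selected the expected tool."""
    correct = sum(agent(c["query"]) == c["expected_tool"] for c in cases)
    return correct / len(cases)

print(f"tool-selection accuracy: {evaluate(choose_tool, eval_set):.0%}")
```

In practice you would also check the extracted parameters and the final outcome, not just the tool name, but the harness shape stays the same.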

Step 5: Optimize Agent Prompts and Logic

Use DSPy to refine the prompts, focusing on clear instructions, tool selection, and parameter extraction, and use the curated dataset for evaluation and improvement.

By combining these building blocks, from structured API definitions and standardized tool protocols to rigorous data practices and systematic optimization, engineering teams can build more capable, reliable, and maintainable API-calling AI agents.
