Efficient AI agents don’t have to be expensive: here’s the proof
Are AI agents becoming too expensive to scale? This is a hot topic in the AI world, and a new study by the Oppo AI agent team finally puts some real numbers and solutions on the table.
Today’s most impressive AI agents can handle large, multi-step tasks using the reasoning capabilities of large language models (LLMs) such as GPT-4 and Claude. However, with each breakthrough, the price of running these systems has increased, making it difficult for businesses (even researchers!) to deploy them widely. Enter the “Efficient Agents” framework: a new recipe for agent systems that keeps almost all of the performance while greatly reducing cost.
The real problem: AI agents become expensive
Ever wonder why your favorite smart AI assistant hasn’t taken over every aspect of your workflow yet? It’s not only the technology, it’s the bill. Some cutting-edge agent systems require hundreds of API calls per task. Multiply that by thousands of users and, suddenly, “scalability” looks more like a pipe dream.
The Oppo team saw this coming. Their latest research systematically breaks down where agent costs pile up and, more importantly, how much complexity everyday tasks really require.
Game changer: measuring AI agent efficiency
This study introduces a clear metric: cost-of-pass. Think of it as the total cost of generating a correct answer to a problem. It combines what you pay for tokens (the text going into and out of the model) with how likely the model is to get the answer right on a given attempt.
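To make the metric concrete, here is a minimal Python sketch. It assumes cost-of-pass is the expected cost of one attempt divided by the success rate, which matches the description above but is not quoted from the paper; the function names and the per-million-token pricing parameters are hypothetical.

```python
def attempt_cost(tokens_in: int, tokens_out: int,
                 price_in_per_million: float,
                 price_out_per_million: float) -> float:
    """Dollar cost of a single attempt, given token counts and prices.

    Prices are expressed per one million tokens (an assumed convention).
    """
    return (tokens_in * price_in_per_million
            + tokens_out * price_out_per_million) / 1_000_000


def cost_of_pass(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to obtain one correct answer.

    Assumption: attempts are independent and equally priced, so the
    expected number of attempts until the first success is 1 / success_rate.
    """
    if cost_per_attempt < 0:
        raise ValueError("cost_per_attempt must be non-negative")
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate
```

Under this reading, a model that costs $0.50 per attempt and succeeds half the time has a cost-of-pass of $1.00: cheaper attempts and higher accuracy both push the number down.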
Here’s the twist: high-performance models like Claude 3.7 Sonnet rank first in accuracy, but their cost-of-pass is three to four times that of GPT-4.1. For simpler jobs, smaller models like Qwen3-30B-A3B give up some performance but cost far less by comparison.

Big experiment: what makes agents expensive?
1. Backbone model selection
Claude 3.7 Sonnet reached 61.82% accuracy on a difficult benchmark, but at a cost of $3.54 per successful task. GPT-4.1 is less accurate (53.33%), but it costs only $0.98. If you can accept bare-bones accuracy in exchange for rock-bottom cost, Qwen3 brings the cost for basic tasks down to $0.13.
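The figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers reported in the article; the implied per-attempt cost assumes that cost-of-pass equals cost per attempt divided by accuracy, which is an assumption about the definition rather than a figure from the study.

```python
# Accuracy and cost per successful task, as reported in the article.
reported = {
    "Claude 3.7 Sonnet": {"accuracy": 0.6182, "cost_of_pass": 3.54},
    "GPT-4.1": {"accuracy": 0.5333, "cost_of_pass": 0.98},
}


def implied_cost_per_attempt(accuracy: float, cost_of_pass: float) -> float:
    """Invert cost_of_pass = cost_per_attempt / accuracy (assumed definition)."""
    return cost_of_pass * accuracy


# How many times more expensive a correct answer is with Claude 3.7 Sonnet.
ratio = (reported["Claude 3.7 Sonnet"]["cost_of_pass"]
         / reported["GPT-4.1"]["cost_of_pass"])
print(f"cost-of-pass ratio: {ratio:.2f}x")  # about 3.61x

for name, row in reported.items():
    per_attempt = implied_cost_per_attempt(row["accuracy"], row["cost_of_pass"])
    print(f"{name}: implied cost per attempt ${per_attempt:.2f}")
```

The ratio of roughly 3.6 is consistent with the "three to four times" figure: about 8.5 extra points of accuracy cost well over three times as much per correct answer.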
2. Planning and scaling
You might think “more planning” means “better results”. Not so fast. Allowing more steps raises costs without raising the success rate by much. Scaling tricks that let agents try several options (Best-of-N) burn a lot of compute for only a small bump in accuracy.
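A simplified model shows why Best-of-N tends to hurt cost efficiency. The sketch below is an idealization, not the paper's experiment: it assumes the N attempts are independent, each costs the same, and a perfect selector always picks a correct answer when one exists. The function name is hypothetical.

```python
def best_of_n(success_rate: float, cost_per_attempt: float, n: int):
    """Return (overall_success_rate, total_cost, cost_of_pass) for n tries.

    Idealized assumptions: independent attempts, equal cost per attempt,
    and a perfect selector that never discards a correct answer.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    overall = 1.0 - (1.0 - success_rate) ** n   # at least one try succeeds
    total_cost = n * cost_per_attempt           # every try is paid for
    return overall, total_cost, total_cost / overall


for n in (1, 2, 4, 8):
    overall, total, cop = best_of_n(0.5, 1.0, n)
    print(f"N={n}: success={overall:.3f}, cost=${total:.2f}, cost-of-pass=${cop:.2f}")
```

Even in this best case, cost grows linearly with N while accuracy saturates, so the cost per correct answer never falls below the single-attempt value. A real selector is imperfect, which makes the trade-off worse.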
3. How agents use tools
Agents can use browsers, search engines, and other tools to get new information. More search sources help up to a point, but elaborate in-page browser operations add cost without much reward. Keep tool use simple and wide-ranging.
4. Agent memory
Surprisingly, the simplest memory setup (just tracking actions and observations) strikes the best balance between low cost and effectiveness. Additional memory modules make the agent slower and more expensive, with little benefit.
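As an illustration of how little such a memory needs, here is a minimal Python sketch of an action-and-observation log. The class and method names are hypothetical and are not taken from the Efficient Agents code base.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleMemory:
    """Append-only log of (action, observation) pairs; nothing else."""

    steps: list[tuple[str, str]] = field(default_factory=list)

    def record(self, action: str, observation: str) -> None:
        """Store what the agent did and what it saw in response."""
        self.steps.append((action, observation))

    def as_prompt(self) -> str:
        """Render the history as plain text for the next model call."""
        lines = []
        for i, (action, observation) in enumerate(self.steps, start=1):
            lines.append(f"Step {i} action: {action}")
            lines.append(f"Step {i} observation: {observation}")
        return "\n".join(lines)
```

There is no summarization, retrieval, or extra model call here, so the memory itself adds no token cost beyond the history it replays.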
Putting it all together: the “Efficient Agents” blueprint
This is how the Efficient Agents system cracks the code:
- Use a smart but not overly expensive model (GPT-4.1).
- Limit its steps to avoid endless “overthinking”.
- Search broadly (mixing Google, Wikipedia and other sources), but don’t go heavy on elaborate browser operations.
- Keep memory slim and simple.
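The four ingredients above can be sketched as a small configuration plus a step-capped loop. This is an illustrative sketch only: the field names, the default step cap of 8, and the `step_fn` interface are assumptions, since the article gives no concrete settings beyond the model choice and the search sources.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AgentConfig:
    backbone: str = "gpt-4.1"                     # capable, mid-priced model
    max_steps: int = 8                            # hypothetical cap on steps
    search_sources: tuple[str, ...] = ("google", "wikipedia")
    elaborate_browser_actions: bool = False       # hypothetical flag
    memory: str = "simple"                        # actions and observations only


# A step function takes the task and history, and returns an observation
# plus either a final answer or None (hypothetical interface).
StepFn = Callable[[str, list[str]], tuple[str, Optional[str]]]


def run_agent(task: str, step_fn: StepFn, config: AgentConfig) -> Optional[str]:
    """Run at most config.max_steps steps; return the answer or None."""
    history: list[str] = []
    for _ in range(config.max_steps):
        observation, answer = step_fn(task, history)
        history.append(observation)
        if answer is not None:
            return answer
    return None  # budget exhausted: stop paying rather than keep "thinking"
```

The hard cap is the point of the design: a task that cannot be solved within the budget fails cheaply instead of accumulating API calls.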
The result? Efficient Agents delivers 96.7% of the performance of open-source competitors (such as OWL) at less than three-quarters of the cost. That’s a 28.4% drop in the bill with only a small sacrifice in results.
Why this matters
This study is a wake-up call: smart AI should be not only powerful but also practical. If you are building or deploying an agent, measure your cost-of-pass and choose your ingredients wisely. Don’t assume that bigger is always better. Sometimes, simple wins.
The Efficient Agents framework is open source, so you can start trying these ideas now. As AI becomes more common, efficient design will be key, whether you launch an agent at a startup or a Fortune 500 company.
Bottom line: the next generation of AI agents can be both smart and affordable, if you are willing to rethink how they are built. The Efficient Agents paper is not just another technical deep dive; it is a roadmap for making AI work everywhere. Who doesn’t want that?

Sana Hassan, a consulting intern at Marktechpost and a dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. He is very interested in solving practical problems, and he brings a new perspective to the intersection of AI and real-life solutions.