Google AI Introduces Multi-Agent System Search (MASS): A New AI Agent Optimization Framework for Better Prompts and Topologies
Multi-agent systems are becoming a key development in artificial intelligence because they can coordinate multiple large language models (LLMs) to solve complex problems. Instead of relying on a single model’s perspective, these systems distribute roles among agents, each contributing distinct capabilities. This division of labor strengthens the system’s ability to analyze, respond, and act. Whether applied to code debugging, data analysis, retrieval-augmented generation, or interactive decision-making, LLM-driven agents have achieved results that a single model cannot always match. The effectiveness of these systems depends on their design, especially the configuration of inter-agent connections, known as the topology, and the specific instructions given to each agent, known as prompts. As this computational model matures, the challenge has shifted from proving feasibility to optimizing architecture and behavior for the best results.
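To make the two design dimensions concrete, the following minimal sketch represents a multi-agent design as a set of per-agent prompts plus a directed graph of connections. The class names, agent names, and prompts are all hypothetical and chosen only for illustration; they do not come from the paper.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Agent:
    name: str
    prompt: str  # the instructions given to this agent


@dataclass
class MultiAgentDesign:
    agents: dict = field(default_factory=dict)  # name -> Agent
    edges: list = field(default_factory=list)   # (sender, receiver) pairs

    def add_agent(self, name, prompt):
        self.agents[name] = Agent(name, prompt)

    def connect(self, sender, receiver):
        self.edges.append((sender, receiver))

    def execution_order(self):
        # TopologicalSorter expects a mapping of node -> set of predecessors.
        graph = {name: set() for name in self.agents}
        for sender, receiver in self.edges:
            graph[receiver].add(sender)
        return list(TopologicalSorter(graph).static_order())


# Hypothetical three-agent chain: solver -> critic -> aggregator.
design = MultiAgentDesign()
design.add_agent("solver", "Think step by step, then answer.")
design.add_agent("critic", "Point out mistakes in the proposed answer.")
design.add_agent("aggregator", "Combine the inputs into one final answer.")
design.connect("solver", "critic")
design.connect("critic", "aggregator")

order = design.execution_order()
print(order)
```

In this representation, changing a string in `agents` is a prompt edit, while changing `edges` is a topology edit, which is why the two can be searched separately or jointly.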
A major problem is that these systems are difficult to design effectively. When prompts, the structured inputs that guide each agent’s role, change even slightly, performance can swing sharply. This sensitivity makes scaling risky, especially when agents are linked together in workflows where one agent’s output serves as another’s input. Errors can propagate or even amplify. Furthermore, topological decisions, such as the number of agents involved, their interaction styles, and the task sequence, still rely heavily on manual configuration and trial and error. The design space is vast and nonlinear because it combines many options for both prompt engineering and topology construction. For traditional design methods, optimizing both at the same time has been largely out of reach.
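A rough back-of-the-envelope calculation illustrates both points. The numbers below are hypothetical, and the chain model assumes that agent errors are independent and that every agent must be correct, which is a simplification rather than a claim from the paper.

```python
from math import prod


def joint_design_space(prompt_variants_per_agent, topology_count):
    """Count joint (prompt assignment, topology) configurations."""
    return prod(prompt_variants_per_agent) * topology_count


def chained_success(per_agent_accuracy, chain_length):
    """Success rate of a linear chain if every agent must be right
    and errors are independent (a simplifying assumption)."""
    return per_agent_accuracy ** chain_length


# Hypothetical setup: 4 agents with 10 candidate prompts each, 20 topologies.
space_size = joint_design_space([10, 10, 10, 10], 20)
print(space_size)  # 200000

# Hypothetical chain of 4 agents that are each right 90% of the time.
chain_rate = chained_success(0.9, 4)
print(round(chain_rate, 4))  # 0.6561
```

Even this small setup yields hundreds of thousands of joint configurations, and a chain of individually reliable agents can be noticeably less reliable end to end.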
Several efforts have addressed parts of this design problem, but gaps remain. Methods such as DSPy automate the generation of exemplars for prompts, while others focus on scaling the number of agents participating in schemes such as voting. Tools such as ADAS introduce code-based topology configurations through a meta-agent. Some frameworks, such as AFlow, apply techniques like Monte Carlo Tree Search to explore combinations more efficiently. However, these solutions generally focus on either prompt optimization or topology optimization rather than both. This lack of integration limits their ability to produce MAS designs that are both capable and robust under complex operating conditions.
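The voting style of agent scaling mentioned above can be illustrated with a simple majority vote over several sampled answers. This is a minimal sketch: the answers are hard-coded stand-ins for what several agents or repeated samples might return, and the function name is hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common answer; ties go to the answer seen first."""
    counts = Counter(answers)
    return counts.most_common(1)[0][0]


# Hypothetical answers from five independent samples of the same question.
sampled_answers = ["42", "41", "42", "42", "40"]
winner = majority_vote(sampled_answers)
print(winner)  # 42
```

Voting of this kind adds agents without changing any prompt, which is the contrast the article draws with prompt optimization.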
Researchers at Google and the University of Cambridge have introduced a new framework called Multi-Agent System Search (MASS). The method automates MAS design by interleaving prompt optimization and topology optimization in a staged approach. Unlike earlier attempts that handled these two components independently, MASS first identifies which elements, both prompts and topological structures, are most likely to affect performance. By narrowing the search to this influential subspace, the framework runs more efficiently while delivering higher-quality results. The method proceeds in three stages: local prompt optimization for each building block, selection of an effective workflow topology based on the optimized prompts, and global prompt optimization across the whole system. The framework reduces computational overhead and also lightens the burden of manual tuning for researchers.
The technical implementation of MASS is structured and methodical. First, the prompt of every building block of the MAS is refined. These blocks are agent modules with specific responsibilities, such as aggregation, reflection, or debate. For example, a prompt optimizer generates variations that include instructional guidance (e.g., “think step by step”) and example-based learning (e.g., one-shot or few-shot demonstrations). The optimizer evaluates these variations against a validation metric to guide improvement. Once each agent’s prompt has been optimized locally, the system explores valid combinations of agents to form topologies. This topology optimization is informed by the earlier results and is restricted to a pruned search space containing the components identified as most influential. Finally, the best topology undergoes prompt tuning at the global level, where prompts are fine-tuned in the context of the entire workflow to maximize collective performance.
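The three-stage procedure can be sketched as a toy search. This is not the authors’ implementation: the block names, candidate prompts, gain values, and the `evaluate` function are all made-up stand-ins for running a real workflow on a validation set, and exhaustive enumeration replaces whatever optimizer the framework actually uses.

```python
from itertools import combinations, product

# Hypothetical candidate prompts per building block, with made-up gains.
PROMPT_GAIN = {
    "predictor": {"Answer the question.": 0.00,
                  "Think step by step, then answer.": 0.08},
    "debate": {"Argue with the others.": 0.00,
               "Critique the other answers, then revise.": 0.02},
    "reflect": {"Check the answer.": 0.00,
                "List likely mistakes, then fix them.": 0.01},
}
# Made-up structural effect of adding each block to a workflow.
BLOCK_GAIN = {"predictor": 0.00, "debate": 0.03, "reflect": -0.05}
BASE_SCORE = 0.60


def evaluate(workflow, prompts):
    """Toy validation metric standing in for a run on held-out data."""
    return BASE_SCORE + sum(
        BLOCK_GAIN[b] + PROMPT_GAIN[b][prompts[b]] for b in workflow
    )


def optimize_block_prompts():
    """Stage 1: tune each block's prompt in isolation."""
    return {
        block: max(candidates, key=lambda p: evaluate([block], {block: p}))
        for block, candidates in PROMPT_GAIN.items()
    }


def optimize_topology(prompts):
    """Stage 2: keep only blocks that help, then search their combinations."""
    baseline = evaluate(["predictor"], prompts)
    optional = [b for b in PROMPT_GAIN if b != "predictor"]
    helpful = [b for b in optional
               if evaluate(["predictor", b], prompts) > baseline]
    candidates = [
        ["predictor", *combo]
        for size in range(len(helpful) + 1)
        for combo in combinations(helpful, size)
    ]
    return max(candidates, key=lambda wf: evaluate(wf, prompts))


def optimize_workflow_prompts(workflow):
    """Stage 3: re-tune all prompts jointly on the chosen workflow."""
    options = [list(PROMPT_GAIN[b]) for b in workflow]
    best = max(
        product(*options),
        key=lambda combo: evaluate(workflow, dict(zip(workflow, combo))),
    )
    return dict(zip(workflow, best))


local_prompts = optimize_block_prompts()
workflow = optimize_topology(local_prompts)
final_prompts = optimize_workflow_prompts(workflow)
final_score = evaluate(workflow, final_prompts)
print(workflow)               # ['predictor', 'debate']
print(round(final_score, 2))  # 0.73
```

Because the toy gains are purely additive, the third stage confirms the locally chosen prompts rather than changing them. In a real system, prompts interact across agents, which is the motivation for re-tuning them in the context of the whole workflow.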
On tasks such as reasoning, multi-hop comprehension, and code generation, the optimized MAS consistently outperformed existing baselines. In tests on the MATH dataset using Gemini 1.5 Pro, prompt-optimized agents reached an average accuracy of about 84% with enhanced prompts, compared with 76-80% for agents scaled through self-consistency or multi-agent debate. On the HotpotQA benchmark, using the debate topology within MASS yielded a 3% improvement. In contrast, other topologies, such as reflection or summarization, failed to produce gains and in some cases led to degradation of up to 15%. On LiveCodeBench, the executor topology provided a +6% boost, while reflection-style approaches again produced negative results. These findings show that only a portion of the topological design space contributes positively, which makes a strong case for targeted optimization of the kind used in MASS.
Key takeaways from the research include:
- MAS design complexity is driven by prompt sensitivity and topological arrangement.
- Prompt optimization, at both the block level and the system level, is more effective than agent scaling alone, as shown by the 84% accuracy of enhanced prompts versus 76% for self-consistency scaling.
- Not all topologies are beneficial: debate adds +3% on HotpotQA, while reflection causes a drop of up to 15%.
- MASS combines prompt and topology optimization in three stages, substantially reducing the computational and design burden.
- Topologies such as debate and executor are effective, while others such as reflection and summarization can degrade system performance.
- MASS avoids the complexity of a full search by pruning the design space based on early influence analysis, improving performance while saving resources.
- The approach is modular and supports plug-and-play agent configurations, making it adaptable to a variety of domains and tasks.
- The final MAS designs outperform state-of-the-art baselines on multiple benchmarks, including MATH, HotpotQA, and LiveCodeBench.
In summary, this research identifies prompt sensitivity and topological complexity as the major bottlenecks in multi-agent system (MAS) development and proposes a structured solution that strategically optimizes both. The MASS framework demonstrates a scalable, efficient approach to MAS design that minimizes the need for human input while maximizing performance. The study provides compelling evidence that better prompt design is more effective than simply adding agents, and that targeted search within an influential subset of topologies yields meaningful gains on real-world tasks.
All credit for this research goes to the researchers of this project.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform known for in-depth coverage of machine learning and deep learning news that is both technically sound and easily understood by a wide audience. The platform has over 2 million views per month, demonstrating its popularity among its audience.
 
																								 
																								