A report by researchers from the University of Liverpool, Huawei Noah’s Ark Lab, the University of Oxford and University College London presents an in-depth study of Deep Research agents (DR agents), a new paradigm for autonomous research. Powered by large language models (LLMs), these systems are designed to handle complex, long-horizon tasks that require dynamic reasoning, adaptive planning, iterative tool use and structured analytical output. Unlike traditional retrieval-augmented generation (RAG) methods or static tool-use models, DR agents can navigate evolving user intent and ambiguous information landscapes by integrating structured APIs and browser-based retrieval mechanisms.


Limitations of existing research frameworks
Before Deep Research agents (DR agents), most LLM-driven systems focused on fact retrieval or single-step reasoning. RAG systems improved factual grounding, while tools such as FLARE and Toolformer enabled basic tool use. However, these models lacked real-time adaptability, deep reasoning and modular extensibility. They struggled with long-context coherence, efficient multi-turn retrieval and dynamic workflow adjustment, which are key requirements for real-world research.
Architectural innovations in Deep Research agents (DR agents)

The foundational design of DR agents addresses the limitations of static reasoning systems. The main technical contributions include:
- Workflow classification: Distinguishes static (manual, fixed-sequence) from dynamic (adaptive, real-time) research workflows.
- Model Context Protocol (MCP): A standardized interface that enables secure, consistent interaction with external tools and APIs.
- Agent-to-Agent (A2A) protocol: Facilitates decentralized, structured communication among agents for collaborative task execution.
- Hybrid retrieval methods: Supports both API-based (structured) and browser-based (unstructured) data acquisition.
- Multimodal tool use: Integrates code execution, data analysis, multimodal generation and memory optimization within the reasoning loop.
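The static versus dynamic distinction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `search` and `summarize` steps are stand-ins, and the stopping rule (`needed_docs`) is a hypothetical criterion, not something the report specifies.

```python
from typing import Any, Callable, Dict, List

# Hypothetical step functions: each takes and returns a shared state dict.
State = Dict[str, Any]
Step = Callable[[State], State]

def search(state: State) -> State:
    # Stand-in for a retrieval step; appends one fake document per call.
    docs = list(state.get("docs", []))
    docs.append(f"doc-{len(docs) + 1}")
    return {**state, "docs": docs}

def summarize(state: State) -> State:
    # Stand-in for report writing.
    return {**state, "report": f"summary of {len(state.get('docs', []))} docs"}

def run_static(steps: List[Step], state: State) -> State:
    """Static workflow: a fixed sequence chosen in advance."""
    for step in steps:
        state = step(state)
    return state

def run_dynamic(state: State, needed_docs: int, max_steps: int = 10) -> State:
    """Dynamic workflow: pick the next step from the current state."""
    for _ in range(max_steps):
        if "report" in state:
            break
        enough = len(state.get("docs", [])) >= needed_docs
        step = summarize if enough else search
        state = step(state)
    return state

static_result = run_static([search, summarize], {"query": "q"})
dynamic_result = run_dynamic({"query": "q"}, needed_docs=3)
print(static_result["report"])   # summary of 1 docs
print(dynamic_result["report"])  # summary of 3 docs
```

The static runner always executes the same two steps, while the dynamic runner keeps searching until its (assumed) evidence threshold is met.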
System pipeline: From query to report generation
A typical DR agent processes a research query through the following steps:
- Intent understanding via planning-only, intent-to-planning or unified intent-planning strategies
- Retrieval using APIs (e.g. arXiv, Wikipedia, Google Search) and browser environments for dynamic content
- Tool invocation through MCP for tasks such as scripting, analytics or media processing
- Structured reporting, including evidence-grounded summaries, tables or visualizations
Memory mechanisms such as vector databases, knowledge graphs or structured repositories allow agents to manage long-context reasoning and reduce redundancy.
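The pipeline stages can be sketched end to end as follows. This is a toy under stated assumptions: `plan`, `api_search`, `browser_search`, the `TOOLS` registry and the `Memory` class are hypothetical placeholders for an intent planner, real retrieval backends, MCP-exposed tools and a vector database or knowledge graph.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

@dataclass
class Memory:
    """Toy stand-in for a vector DB or knowledge graph: tracks seen evidence."""
    seen: Set[str] = field(default_factory=set)

    def add_new(self, items: List[str]) -> List[str]:
        # Return only items not seen before, preserving order.
        fresh = []
        for item in items:
            if item not in self.seen:
                self.seen.add(item)
                fresh.append(item)
        return fresh

def plan(query: str) -> List[str]:
    # Hypothetical intent-to-planning step: split the query into sub-questions.
    return [part.strip() for part in query.split(" and ") if part.strip()]

def api_search(sub_question: str) -> List[str]:
    # Placeholder for a structured API call.
    return [f"api:{sub_question}"]

def browser_search(sub_question: str) -> List[str]:
    # Placeholder for browser retrieval; overlaps with the API on purpose.
    return [f"api:{sub_question}", f"web:{sub_question}"]

# Hypothetical tool registry standing in for MCP-exposed tools.
TOOLS: Dict[str, Callable[[List[str]], str]] = {
    "count": lambda evidence: f"{len(evidence)} evidence items",
}

def run_pipeline(query: str) -> str:
    memory = Memory()
    evidence: List[str] = []
    for sub_question in plan(query):
        hits = api_search(sub_question) + browser_search(sub_question)
        evidence.extend(memory.add_new(hits))  # memory removes duplicates
    analysis = TOOLS["count"](evidence)        # tool invocation
    lines = [f"Report for: {query}", f"Analysis: {analysis}"]
    lines += [f"- {item}" for item in evidence]
    return "\n".join(lines)

report = run_pipeline("MCP and A2A")
print(report)
```

Because the memory filters repeated hits, the duplicate `api:` result returned by both retrieval paths appears only once in the report.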
Comparison with RAG and traditional tool-use agents
Unlike RAG models, which operate on static retrieval pipelines, DR agents:
- Perform multi-step planning with evolving task goals
- Adapt retrieval strategies based on task progress
- Coordinate between multiple dedicated agents (in a multi-agent setup)
- Leverage asynchronous and parallel workflows
This architecture enables more coherent, scalable and flexible research task execution.
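As a rough illustration of adaptive retrieval combined with parallel execution, the sketch below uses Python's `asyncio`. The `fetch` coroutine, the source names and the `min_results` threshold are assumptions made for the example; a real agent would call actual APIs and a browser and would likely use a richer adaptation rule.

```python
import asyncio
from typing import List

async def fetch(source: str, query: str) -> str:
    # Placeholder for a network call; sleep(0) just yields control.
    await asyncio.sleep(0)
    return f"{source}:{query}"

async def research(query: str, min_results: int = 3) -> List[str]:
    sources = ["api"]            # start with the structured source only
    results: List[str] = []
    for round_no in range(3):    # bounded number of refinement rounds
        sub_query = f"{query}#{round_no}"
        # Query all currently enabled sources concurrently.
        batch = await asyncio.gather(*(fetch(s, sub_query) for s in sources))
        results.extend(batch)
        if len(results) >= min_results:
            break
        # Adapt the strategy: too little evidence so far, so add the browser.
        if "browser" not in sources:
            sources.append("browser")
    return results

results = asyncio.run(research("deep research agents"))
print(results)
```

The first round uses one source; because it returns too little, the second round fans out to two sources in parallel and the loop stops once the threshold is reached.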


Industrial implementations of DR agents
- OpenAI DR: Uses an o3 reasoning model with RL-based dynamic workflows, multimodal retrieval and code-enabled report generation.
- Gemini DR: Built on Gemini-2.0 Flash; supports large context windows, asynchronous workflows and multimodal task management.
- Grok DeepSearch: Combines sparse attention, browser-based retrieval and a sandboxed execution environment.
- Perplexity DR: Applies iterative web search with hybrid LLM orchestration.
- Microsoft Researcher and Analyst: Integrate OpenAI models within Microsoft 365 for domain-specific, secure research pipelines.
Benchmarks and performance
DR agents are tested using both QA and task-execution benchmarks:
- QA: HotpotQA, GPQA, 2WikiMultihopQA, TriviaQA
- Complex research: MLE-Bench, BrowseComp, GAIA, HLE
These benchmarks measure retrieval depth, tool-use accuracy, reasoning coherence and structured reporting. Agents such as DeepResearcher and SimpleDeepSearcher consistently outperform traditional systems.
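The article does not say which metrics are used, so the following is only an illustration of one common way to score short-answer QA: exact match after light normalization. The sample predictions and references are made up for the example.

```python
import re
import string
from typing import List

def normalize(text: str) -> str:
    # Lowercase, drop punctuation and English articles, collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, references: List[str]) -> bool:
    # A prediction counts if it matches any acceptable reference answer.
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)

def accuracy(predictions: List[str], references: List[List[str]]) -> float:
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return hits / len(predictions)

# Made-up examples: the first matches after normalization, the second does not.
preds = ["The Eiffel Tower.", "1912"]
refs = [["Eiffel Tower"], ["1913"]]
score = accuracy(preds, refs)
print(score)  # 0.5
```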
FAQ
Q1: What is a Deep Research agent?
A: DR agents are LLM-based systems that use dynamic planning and tool integration to perform autonomous multi-step research workflows.
Q2: How are DR agents better than RAG models?
A: DR agents support adaptive planning, multi-hop retrieval, iterative tool use and real-time report synthesis.
Q3: What protocols do DR agents use?
A: MCP (for tool interaction) and A2A (for agent collaboration).
Q4: Are these systems production-ready?
A: Yes. OpenAI, Google, Microsoft and others have deployed DR agents in public and enterprise applications.
Q5: How are DR agents evaluated?
A: With QA benchmarks such as HotpotQA and HLE, and with execution benchmarks such as MLE-Bench and BrowseComp.
Check out the paper. All credit for this research goes to the researchers of this project.

Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who researches applications in fields such as biomaterials and biomedical science. With a strong background in materials science, he is exploring new advancements and creating opportunities to contribute.