
Mirix: A modular multi-agent memory system for enhancing long-term reasoning and personalization of LLM-based agents

The latest developments in LLM agents have focused on enhancing their ability to perform complex tasks. However, one critical dimension is still often overlooked: memory – the ability of an agent to persist information over time, recall it, and reason over it. Without persistent memory, most LLM-based agents remain stateless and cannot build up context beyond a single prompt, limiting their usefulness in real-world settings where consistency and personalization are critical.

To address this problem, Mirix AI introduced Mirix, a modular multi-agent memory system designed to give LLM-based agents robust long-term memory. Unlike flat, text-centric systems, Mirix integrates structured memory types across modalities, including visual input, and is built on a coordinated multi-agent architecture for memory management.

Core architecture and memory composition

Mirix comprises six specialized memory components, each managed by a dedicated memory manager (a minimal illustrative data model follows the list):

  • Core memory: Stores persistent agent and user information, divided into “persona” (agent profile, tone, and behavior) and “human” (user facts such as names, preferences, and relationships).
  • Episodic memory: Captures time-stamped events and user interactions with structured attributes such as event_type, summary, details, participants, and timestamp.
  • Semantic memory: Encodes abstract concepts, knowledge graphs, and named entities, organizing entries by type, summary, details, and source.
  • Procedural memory: Contains structured workflows and task sequences with well-defined steps and descriptions, typically formatted as JSON for easy manipulation.
  • Resource memory: Records references to external documents, images, and audio with titles, summaries, resource types, and content or links, supporting contextual continuity.
  • Knowledge Vault: Stores verbatim facts and sensitive information such as credentials, contacts, and API keys, protected by strict access controls and sensitivity tags.
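
To make these components concrete, here is a minimal sketch of how a few of the entry types described above might be modeled in Python. The field names follow the attributes listed in the article, but the class names and overall schema are illustrative assumptions, not Mirix's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class EpisodicEntry:
    """A time-stamped event or interaction (episodic memory)."""
    event_type: str           # e.g. "meeting", "user_message"
    summary: str              # one-line description of the event
    details: str              # longer free-text description
    participants: list[str]   # people or agents involved
    timestamp: datetime = field(default_factory=datetime.utcnow)


@dataclass
class SemanticEntry:
    """An abstract concept or named entity (semantic memory)."""
    entry_type: str           # e.g. "person", "concept", "organization"
    summary: str
    details: str
    source: str               # where the knowledge came from


@dataclass
class ProceduralEntry:
    """A structured workflow (procedural memory); steps stay JSON-friendly."""
    description: str
    steps: list[dict] = field(default_factory=list)
```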

A Meta Memory Manager coordinates the activities of these six specialized managers, enabling intelligent message routing, hierarchical storage, and memory-specific retrieval operations. Other agents (such as the chat agent and interface agents) are built on top of this architecture.
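
The routing role of the Meta Memory Manager can be pictured with a small sketch: a coordinator that decides which specialized manager should ingest an incoming item. The class name, the `ingest` interface, and the routing heuristics below are assumptions for illustration only.

```python
class MetaMemoryManager:
    """Routes incoming items to the appropriate specialized memory manager.

    `managers` maps a component name to an object exposing ingest(item);
    the keyword-based routing below is purely illustrative.
    """

    def __init__(self, managers: dict):
        self.managers = managers  # e.g. {"core": ..., "episodic": ..., "semantic": ...}

    def route(self, item: dict) -> str:
        """Pick a memory component for an incoming item and hand it off."""
        if item.get("sensitive"):                 # credentials, contacts, API keys
            target = "knowledge_vault"
        elif item.get("timestamp"):               # time-stamped events and interactions
            target = "episodic"
        elif item.get("steps"):                   # workflows and task sequences
            target = "procedural"
        elif item.get("uri"):                     # external documents, images, audio
            target = "resource"
        elif item.get("about_user") or item.get("about_agent"):
            target = "core"
        else:                                     # abstract concepts and entities
            target = "semantic"
        self.managers[target].ingest(item)
        return target
```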

Active retrieval and interaction pipeline

A core innovation of Mirix is its active retrieval mechanism. On each user input, the system first automatically infers the current topic, then retrieves relevant memory entries from all six components, and finally tags the retrieved data and injects it into the final system prompt. This process reduces reliance on outdated parametric model knowledge and yields better-grounded answers.

Multiple retrieval strategies – including embedding_match, bm25_match, and string_match – are available to ensure accurate, context-aware access to memory, and the architecture allows additional retrieval tools to be added as needed. A minimal sketch of how this dispatch and prompt injection might work is shown below.
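
The strategy names come from the article; everything else in this sketch – the function signatures, the scoring shortcuts, and the prompt format – is an illustrative assumption, not Mirix's actual code.

```python
import numpy as np


def string_match(query: str, entries: list[dict], k: int = 5) -> list[dict]:
    """Simple substring matching over entry summaries."""
    return [e for e in entries if query.lower() in e["summary"].lower()][:k]


def embedding_match(query_vec: np.ndarray, entries: list[dict], k: int = 5) -> list[dict]:
    """Cosine-similarity ranking; assumes each entry carries a precomputed 'vec'."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return sorted(entries, key=lambda e: cos(query_vec, e["vec"]), reverse=True)[:k]


STRATEGIES = {"string_match": string_match, "embedding_match": embedding_match}
# bm25_match would register here the same way (e.g. backed by the rank_bm25 package).


def build_prompt(query: str, memories: dict[str, list[dict]]) -> str:
    """Tag retrieved entries by component and inject them into the system prompt."""
    blocks = []
    for component, hits in memories.items():
        for hit in hits:
            blocks.append(f"[{component}] {hit['summary']}")
    return "Relevant memories:\n" + "\n".join(blocks) + f"\n\nUser: {query}"
```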

System implementation and application

Mirix ships as a cross-platform assistant application built with React-Electron (for the UI) and Uvicorn (for the backend API). The assistant monitors screen activity by capturing screenshots every 1.5 seconds; only non-redundant screens are retained, and memory updates are triggered in batches once 20 unique screenshots have accumulated (roughly once per minute). Uploads to the Gemini API are streamed, enabling efficient visual data processing and memory updates from visual input with latency below 5 seconds.
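
The capture-and-batch loop described above could look roughly like the following sketch. The 1.5-second interval and 20-screenshot batch size come from the article; the hash-based redundancy check, the function names, and the upload stub are illustrative assumptions.

```python
import hashlib
import time

CAPTURE_INTERVAL_S = 1.5   # screenshot every 1.5 seconds (per the article)
BATCH_SIZE = 20            # trigger a memory update after 20 unique screenshots


def capture_screenshot() -> bytes:
    """Placeholder for a platform-specific screen grab (e.g. via mss or PIL.ImageGrab)."""
    raise NotImplementedError


def upload_batch_to_gemini(batch: list[bytes]) -> None:
    """Placeholder for the streaming upload that drives the memory update."""
    raise NotImplementedError


def monitor_screen() -> None:
    seen_hashes: set[str] = set()
    batch: list[bytes] = []
    while True:
        shot = capture_screenshot()
        digest = hashlib.sha256(shot).hexdigest()
        if digest not in seen_hashes:        # keep only non-redundant screens
            seen_hashes.add(digest)
            batch.append(shot)
        if len(batch) >= BATCH_SIZE:         # roughly once per minute in practice
            upload_batch_to_gemini(batch)
            batch.clear()
        time.sleep(CAPTURE_INTERVAL_S)
```

A real implementation would likely judge redundancy by perceptual similarity rather than exact hashes, since consecutive screenshots rarely match byte-for-byte.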

Users interact through a chat interface that dynamically draws on the agent’s memory components to generate context-aware, personalized responses. Semantic and procedural memory are rendered as expandable trees or lists, providing transparency and allowing users to review and inspect what the agent “remembers”.

Evaluation on multimodal and dialogue benchmarks

Mirix has been evaluated on two demanding benchmarks:

  1. ScreenshotVQA: A visual question-answering benchmark that requires persistent long-term memory over high-resolution screenshots. Mirix outperforms retrieval-augmented generation (RAG) baselines (notably SigLIP and Gemini) by 35% in LLM-as-a-judge accuracy while cutting retrieval and storage requirements by 99.9% compared to text-heavy methods.
  2. LoCoMo: A text-only benchmark for assessing long-form dialogue memory. Mirix achieves 85.38% average accuracy, beating strong open-source systems (such as LangMem and Mem0) by over 8 points and approaching the full-context upper bound.

These results show that the modular design delivers high performance in both multimodal and text-only reasoning domains.

Use Cases: Wearable Devices and Memory Market

Mirix is designed to scale down to lightweight AI wearables, including smart glasses and pins, thanks to its efficient, modular architecture. A hybrid deployment model combines on-device and cloud-based memory processing, and practical applications include real-time meeting summaries, fine-grained location and context recall, and dynamic modeling of user habits.

A forward-looking feature of Mirix is the Memory Marketplace: a decentralized ecosystem that enables secure memory sharing, monetization, and collaborative AI personalization among users. The marketplace is designed with fine-grained privacy controls, end-to-end encryption, and decentralized storage to ensure data sovereignty and user ownership.

Conclusion

Mirix represents an important step toward giving LLM-based agents human-like memory. Its structured, compositional multi-agent architecture enables powerful memory abstraction, multimodal support, and real-time, context-grounded reasoning. With strong results on challenging benchmarks and an accessible cross-platform application, Mirix sets a new standard for memory-augmented AI systems.

FAQ

1. What makes Mirix different from existing memory systems like Mem0 or Zep?
Mirix introduces multi-component, compositional memory (beyond flat text-passage storage), multimodal support (including visual input), and a multi-agent retrieval architecture, enabling more scalable, accurate, and context-rich long-term memory management.

2. How does Mirix achieve low-latency memory updates from visual input?
By using streaming uploads to the Gemini API, Mirix can update screenshot-based visual memory with a latency of less than 5 seconds, even during active user sessions.

3. Is Mirix compatible with closed-source LLMs like GPT-4?
Yes. Because Mirix runs as an external system (rather than as a model plugin or a retraining step), it can augment any LLM regardless of its underlying architecture or license, including GPT-4, Gemini, and other proprietary models.


Check out the paper, GitHub repository, and project page. All credit for this research goes to the researchers on the project.



Sajjad Ansari is a final year undergraduate student from IIT Kharagpur. As a technology enthusiast, he delves into the practical application of AI, focusing on understanding AI technology and its real-world impact. He aims to express complex AI concepts in a clear and easy way.