Liquid AI releases LFM2-ColBERT-350M: a compact model that brings late interaction retrieval to multilingual and cross-lingual RAG

Can a compact late interaction retriever index documents once and provide accurate cross-language search with fast inference? Liquid AI has released LFM2-ColBERT-350M, a compact late-interaction retriever for multilingual and cross-lingual search. Documents can be indexed in one language, queries can be written in many languages, and retrieval accuracy remains high. The Liquid AI team reports inference speeds comparable to models 2.3 times smaller, thanks to the LFM2 backbone. The release includes a Hugging Face demo and a detailed model card for integration into retrieval-augmented generation (RAG) systems.

What late interaction means and why it matters

Most production systems use bi-encoders for speed or cross-encoders for accuracy. Late interaction aims to combine the best of both. Queries and documents are encoded separately at the token level, and at query time the system compares token vectors using an operation such as MaxSim. This preserves fine-grained token interactions without incurring the full cost of joint cross-attention. It allows document representations to be precomputed and improves accuracy at ranking time. The model can serve as a first-stage retriever or as a reranker in a single pass.
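The MaxSim operation described above can be sketched in a few lines of plain Python. The function names and toy 2-d vectors below are illustrative assumptions, not Liquid AI's implementation; a real system would use the model's contextual token embeddings.

```python
# Minimal sketch of the MaxSim late-interaction operator, assuming
# query and document token embeddings are already L2-normalized.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_tokens, doc_tokens):
    """For each query token vector, take the maximum similarity over
    all document token vectors, then sum across query tokens."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# The document side can be encoded and stored offline;
# only the query is encoded at search time.
doc = [[1.0, 0.0], [0.0, 1.0]]
query = [[0.6, 0.8]]
score = maxsim_score(query, doc)  # max(0.6, 0.8) = 0.8
```

Because each query token only needs a max over precomputed document vectors, scoring stays cheap relative to running a cross-encoder over every query-document pair.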

Model specifications

LFM2-ColBERT-350M has 350 million parameters in total. There are 25 layers: 18 convolution blocks, 6 attention blocks, and 1 dense layer. The context length is 32k tokens and the vocabulary size is 65,536. The similarity function is MaxSim and the output dimension is 128. The training precision is BF16, and the license is LFM Open License v1.0.

Supported and evaluated languages

The model supports 8 languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish. Italian and Portuguese were added for evaluation, bringing the matrix to nine languages for cross-comparisons of document and query languages. This distinction is relevant when planning deployments that must cover specific customer markets.

Evaluation setup and main results

Liquid AI extends the NanoBEIR benchmark to Japanese and Korean and releases the extension for reproducibility. In this setting, LFM2-ColBERT-350M shows stronger multilingual performance than the baseline late interaction model in its class, GTE-ModernColBERT-v1 with 150M parameters. The largest gains appear in German, Arabic, Korean, and Japanese, while English performance is maintained.

Main points

  1. Token-level scoring with MaxSim preserves fine-grained interactions while keeping the encoders separate, so document embeddings can be precomputed efficiently and compared against queries at search time.
  2. Documents can be indexed in one language and retrieved in multiple languages. The model card lists 8 supported languages, while evaluation across language pairs covers 9 languages.
  3. On the NanoBEIR multilingual extension, LFM2-ColBERT-350M outperforms the previous late-interaction baseline (GTE-ModernColBERT-v1 at 150M) and maintains English performance.
  4. Inference speeds are reported to be comparable to models 2.3 times smaller, thanks to the LFM2 backbone.
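The "index once, query in many languages" workflow from point 2 can be sketched as follows. The `encode` stub below uses toy one-hot word vectors purely for illustration; in a real deployment the token embeddings would come from LFM2-ColBERT-350M (which produces contextual, multilingual vectors), typically via a ColBERT-style retrieval library.

```python
# Hedged sketch of the index-once, query-anywhere flow.
# encode() is a stand-in for the model's token encoder, NOT the real model.

def encode(text, vocab):
    """Stub token encoder: one one-hot vector per word."""
    vecs = []
    for word in text.lower().split():
        v = [0.0] * len(vocab)
        if word in vocab:
            v[vocab[word]] = 1.0
        vecs.append(v)
    return vecs

def maxsim(query_vecs, doc_vecs):
    """Sum over query tokens of the max dot product with any doc token."""
    return sum(
        max(sum(a * b for a, b in zip(q, d)) for d in doc_vecs)
        for q in query_vecs
    )

vocab = {w: i for i, w in enumerate("cats purr dogs bark".split())}

# Offline: encode and store every document exactly once.
index = {
    "d1": encode("cats purr", vocab),
    "d2": encode("dogs bark", vocab),
}

# Online: encode only the query, then rank the stored documents.
q = encode("cats", vocab)
ranked = sorted(index, key=lambda d: maxsim(q, index[d]), reverse=True)
# ranked[0] is "d1", the document containing the query term
```

The design point this illustrates: the expensive document encoding happens once at indexing time, so serving cost is dominated by the (cheap) query encoding plus MaxSim scoring over the stored vectors.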

Editor’s Note

Liquid AI’s LFM2-ColBERT-350M applies late-interaction ColBERT with MaxSim: it encodes queries and documents separately, then scores token vectors at query time, preserving token-level interactions while enabling precomputed document embeddings for scale. It targets multilingual and cross-lingual retrieval (index once, query in many languages) and is evaluated on a NanoBEIR multilingual extension. The Liquid AI team reports inference speeds comparable to models 2.3 times smaller, thanks to the LFM2 backbone. Overall, late interaction at this compact scale looks ready for production trials in multilingual RAG.


Check out the model weights, demo, and technical details.


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform that stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and accessible to a broad audience. The platform draws more than 2 million monthly views, attesting to its popularity with readers.

