Allen Institute for Artificial Intelligence (AI2) launches Olmo 3: a fully open 7B and 32B LLM series built on the Dolma 3 data suite and the Dolci post-training stack
The Allen Institute for Artificial Intelligence (AI2) is releasing Olmo 3 as a fully open model series that exposes the entire “model flow,” from raw data and code to intermediate checkpoints and deployment-ready variants.
Olmo 3 is a family of dense transformer models at 7B and 32B parameters. The series includes Olmo 3-Base, Olmo 3-Think, Olmo 3-Instruct, and Olmo 3-RL Zero. Both sizes share a 65,536-token context length and the same staged training recipe.

Dolma 3 Data Suite
The core of the training process is Dolma 3, a new data collection designed for Olmo 3. Dolma 3 consists of Dolma 3 Mix, Dolma 3 Dolmino Mix, and Dolma 3 Longmino Mix. Dolma 3 Mix is a 5.9T-token pre-training dataset spanning web text, scientific PDFs, code repositories, and other naturally occurring data. The Dolmino and Longmino subsets were constructed from filtered, higher-quality slices of this pool.
Dolma 3 Mix supports the main pre-training phase of Olmo 3-Base. The AI2 research team then applied Dolma 3 Dolmino Mix, a 100B-token mid-training set that emphasizes math, coding, instruction following, reading comprehension, and thinking-oriented tasks. Finally, Dolma 3 Longmino Mix adds 50B tokens for the 7B model and 100B tokens for the 32B model, focusing on long documents and scientific PDFs processed with the olmOCR pipeline. This staged curriculum pushes the context limit to 65,536 tokens while maintaining stability and quality.
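To make the staged recipe concrete, the sketch below represents the three Dolma 3 stages as a simple Python configuration. The token budgets and mix names come from the description above; the field names and the pre-extension context length are illustrative assumptions, not AI2's actual configuration format.

```python
# Hypothetical sketch of the three Dolma 3 training stages for Olmo 3-Base.
# Token budgets are the figures reported above; field names and the
# pre-extension context length (8,192) are illustrative assumptions.
DOLMA3_STAGES = [
    {
        "name": "pretraining",
        "mix": "Dolma 3 Mix",
        "tokens": 5.9e12,          # ~5.9T tokens: web text, scientific PDFs, code
        "context_length": 8_192,   # assumed shorter context before extension
    },
    {
        "name": "mid_training",
        "mix": "Dolma 3 Dolmino Mix",
        "tokens": 100e9,           # 100B tokens: math, code, instruction following
        "context_length": 8_192,   # assumed
    },
    {
        "name": "long_context",
        "mix": "Dolma 3 Longmino Mix",
        "tokens": {"7B": 50e9, "32B": 100e9},  # per model size: long docs, PDFs
        "context_length": 65_536,  # final context window
    },
]

if __name__ == "__main__":
    for stage in DOLMA3_STAGES:
        print(f"{stage['name']:>13}: {stage['mix']} -> context {stage['context_length']}")
```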
Large-scale training on H100 cluster
Olmo 3-Base 7B was pre-trained on Dolma 3 Mix using 1,024 H100 devices, achieving approximately 7,700 tokens per second per device. The later stages used 128 H100s for Dolmino mid-training and 256 H100s for Longmino long-context extension.
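For a rough sense of scale, a back-of-the-envelope calculation based on the reported per-device throughput gives the aggregate token rate and an approximate wall-clock time for the 5.9T-token pre-training stage. These are estimates derived from the figures above, assuming the throughput is sustained for the whole run, not official numbers.

```python
# Back-of-the-envelope estimate for the Olmo 3-Base 7B pre-training stage.
# Assumes the reported per-device throughput is sustained across the run.
devices = 1024                      # H100 GPUs used for pre-training
tokens_per_sec_per_device = 7_700   # reported throughput
total_tokens = 5.9e12               # Dolma 3 Mix pre-training budget

aggregate_tps = devices * tokens_per_sec_per_device   # ~7.9M tokens/sec
seconds = total_tokens / aggregate_tps
days = seconds / 86_400

print(f"Aggregate throughput: {aggregate_tps / 1e6:.1f}M tokens/sec")
print(f"Estimated wall-clock time: {days:.1f} days")   # roughly 8-9 days
```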
Base model performance relative to open families
On standard capability benchmarks, Olmo 3-Base 32B is positioned as the leading fully open base model. The AI2 research team reports that it is competitive with well-known open-weight series such as the similarly sized Qwen 2.5 and Gemma 3. Across a wide range of tasks, Olmo 3-Base 32B ranks close to or above these models while keeping the full data and training configuration open for inspection and reuse.
Olmo 3-Think focused on reasoning
Olmo 3-Think 7B and Olmo 3-Think 32B sit above the base models as reasoning-focused variants. They use a three-stage post-training scheme of supervised fine-tuning, direct preference optimization, and reinforcement learning with verifiable rewards within the OlmoRL framework. Olmo 3-Think 32B is described as the strongest fully open reasoning model, closing the gap with the Qwen 3 32B thinking model while using approximately six times fewer training tokens.
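As a pointer to what the middle post-training stage computes, here is a minimal, framework-agnostic sketch of the standard DPO objective for a single preference pair. It is the textbook formulation of DPO, not AI2's exact Dolci/OlmoRL implementation, and the log-probability inputs are assumed to come from the trained policy and a frozen reference model.

```python
import math

def dpo_loss(policy_logp_chosen: float, policy_logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair (textbook form, not AI2's exact recipe).

    Inputs are sequence log-probabilities of the chosen and rejected responses
    under the trained policy and under a frozen reference model.
    """
    # Implicit reward margins of each response relative to the reference model.
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    # Loss is the negative log-sigmoid of the scaled margin difference:
    # -log(sigmoid(x)) == log(1 + exp(-x))
    logits = beta * (chosen_margin - rejected_margin)
    return math.log1p(math.exp(-logits))

# Example: the policy prefers the chosen answer slightly more than the reference does.
print(dpo_loss(-12.0, -15.0, -13.0, -14.5))  # ~0.62
```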


Olmo 3-Instruct for chat and tool use
Olmo 3-Instruct 7B is tuned for fast instruction following, multi-turn chat, and tool use. It starts from Olmo 3-Base 7B and applies a separate Dolci Instruct data and training pipeline covering supervised fine-tuning, DPO, and RLVR for dialogue and function-calling workloads. The AI2 research team reports that Olmo 3-Instruct matches or exceeds open-weight competitors such as Qwen 2.5, Gemma 3, and Llama 3.1, and is competitive with the similarly sized Qwen 3 series on multiple instruction and reasoning benchmarks.
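A minimal usage sketch with the Hugging Face Transformers library is shown below. The repository identifier is an assumption for illustration; the exact model id and the minimum Transformers version with Olmo 3 support should be confirmed on AI2's Hugging Face organization page.

```python
# Minimal chat sketch with Hugging Face Transformers.
# The repo id below is a hypothetical identifier; confirm the exact name
# on AI2's Hugging Face organization before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B-Instruct"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize what RLVR means in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```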
Olmo 3-RL Zero for clean RL studies
Olmo 3-RL Zero 7B targets researchers studying reinforcement learning on language models who need a clean separation between pre-training data and RL data. It is a fully open RL pathway built on top of Olmo 3-Base that uses the Dolci RLZero datasets, which are decontaminated against Dolma 3.
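The "verifiable rewards" in RLVR are typically programmatic checks rather than learned reward models. The snippet below is a generic sketch of such a checker for math problems, assuming answers are reported in a \boxed{...} expression; it illustrates the idea and is not AI2's actual grader.

```python
import re

def math_reward(completion: str, gold_answer: str) -> float:
    """Binary verifiable reward for a math completion (generic RLVR-style checker).

    Extracts the last \\boxed{...} expression from the model output and compares
    it with the reference answer after light normalization. Illustrative only;
    not AI2's exact grading logic.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    if not matches:
        return 0.0  # no final answer found -> no reward
    predicted = matches[-1].strip().replace(" ", "")
    return 1.0 if predicted == gold_answer.strip().replace(" ", "") else 0.0

# Example usage
print(math_reward(r"The answer is \boxed{42}.", "42"))  # 1.0
print(math_reward("I think the answer is 41.", "42"))   # 0.0
```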
Comparison table
| Model variant | Training or post-training data | Main use cases | Reported positioning vs. other open models |
|---|---|---|---|
| Olmo 3-Base 7B | Dolma 3 Mix pre-training, Dolma 3 Dolmino Mix mid-training, Dolma 3 Longmino Mix long context | General-purpose base model, long-context reasoning, code, mathematics | A strong fully open 7B base intended as the foundation for Think, Instruct, and RL Zero, evaluated against leading open bases at the 7B scale |
| Olmo 3-Base 32B | Same staged Dolma 3 pipeline as 7B, with 100B Longmino tokens for long context | High-end foundation for research, long-context workloads, and reinforcement learning setups | Described as the best fully open 32B base, comparable to Qwen 2.5 32B and Gemma 3 27B, and ahead of Marin, Apertus, and LLM360 |
| Olmo 3-Think 7B | Olmo 3-Base 7B, plus Dolci Think SFT, Dolci Think DPO, and Dolci Think RL in the OlmoRL framework | Reasoning-focused 7B model with visible thinking traces | A fully open reasoning model at an efficient scale, enabling chain-of-thought and reinforcement learning research on modest hardware |
| Olmo 3-Think 32B | Olmo 3-Base 32B, plus the same Dolci Think SFT, DPO, and RL pipeline | Flagship reasoning model with long thinking traces | Positioned as the strongest fully open thinking model, competitive with the Qwen 3 32B thinking model while training on roughly 6x fewer tokens |
| Olmo 3-Instruct 7B | Olmo 3-Base 7B, plus Dolci Instruct SFT, Dolci Instruct DPO, and Dolci Instruct RL 7B | Instruction following, chat, function calling, tool use | Reported to outperform Qwen 2.5, Gemma 3, and Llama 3.1, and to close the gap with the similarly sized Qwen 3 series |
| Olmo 3-RL Zero 7B | Olmo 3-Base 7B, plus Dolci RLZero math, code, instruction-following, and mixed datasets decontaminated against Dolma 3 | RLVR research on math, coding, instruction following, and mixed tasks | Introduced as a fully open RL pathway for benchmarking RLVR on top of a base model with fully open pre-training data |
Main points
- End-to-end transparent pipeline: Olmo 3 exposes the complete "model flow" from Dolma 3 data construction, through staged pre-training and post-training, to released checkpoints, evaluation suites, and tools, enabling fully reproducible LLM research and fine-grained debugging.
- Dense 7B and 32B models with 65K contexts: The series covers 7B and 32B dense transformers, all with 65,536-token context windows, trained through a three-stage Dolma 3 curriculum: Dolma 3 Mix for main pre-training, Dolma 3 Dolmino for mid-training, and Dolma 3 Longmino for long-context extension.
- Strong open base and reasoning models: Olmo 3-Base 32B is positioned as the top fully open base model at its scale, competing with Qwen 2.5 and Gemma 3, while Olmo 3-Think 32B is described as the strongest fully open thinking model, close to the Qwen 3 32B thinking model while using approximately six times fewer training tokens.
- Instruction-tuned and RL Zero variants: Olmo 3-Instruct 7B uses Dolci Instruct SFT, DPO, and RLVR data for instruction following, multi-turn chat, and tool use, with reported performance comparable to or better than Qwen 2.5, Gemma 3, and Llama 3.1 at similar scales. Olmo 3-RL Zero 7B provides a fully open RLVR pathway, with Dolci RLZero datasets for math, code, instruction following, and general chat decontaminated against the pre-training data.
Olmo 3 is an unusual release in that it opens the entire stack: Dolma 3 data recipes, staged pre-training, Dolci post-training, RLVR in OlmoRL, and evaluation with OLMES and OlmoBaseEval. This reduces ambiguity around data quality, long-context training, and reinforcement learning for reasoning, and creates concrete baselines for extending Olmo 3-Base, Olmo 3-Think, Olmo 3-Instruct, and Olmo 3-RL Zero in controlled experiments. Overall, Olmo 3 sets a rigorous reference point for a transparent, research-grade LLM pipeline.

Michal Sutter is a data science professional with a master’s degree in data science from the University of Padua. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex data sets into actionable insights.