AI2’s Olmo 3 family challenges Qwen and Llama with efficient, open reasoning and customization

by SkillAiNest

With its latest release, the Allen Institute for AI (AI2) hopes to capitalize on the growing demand for custom models and on businesses seeking greater transparency from AI models.

Olmo 3 is the latest addition to AI2’s family of large language models available to organizations, with a focus on openness and customization.

Olmo 3 has a longer context window, more reasoning traces and better coding performance than previous iterations. This latest version, like other Olmo releases, is open source under the Apache 2.0 license. Enterprises get full transparency and control over training data and checkpoints.

AI2 will release three versions of Olmo 3:

  • Olmo 3-Think, in both 7B and 32B sizes, the flagship reasoning models aimed at modern research.

  • Olmo 3-Base, also in both sizes, which is suited to programming, comprehension, mathematics and long-context reasoning. AI2 said this version is “ideal for pre-training or fine-tuning.”

  • Olmo 3-Instruct 7B, which is better for instruction following, multi-turn dialogue and tool use.

The company said Olmo 3-Think is “the first fully open 32B thinking model that develops an explicit reasoning chain approach.” Olmo 3-Think also has a long context window of 65,000 tokens, well suited to reasoning over long-running agent projects or long documents.
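A context window of roughly 65,000 tokens still fills up on very long documents, so a common pattern is to chunk input to fit the window while reserving room for the prompt and the model’s reply. The sketch below is illustrative only: real token counts depend on the model’s tokenizer, and the whitespace split here is a naive stand-in for it.

```python
def chunk_tokens(tokens, window=65_000, reserve=4_000):
    """Split a token sequence into chunks that fit a context window,
    reserving `reserve` tokens for instructions and the model's output."""
    budget = window - reserve
    return [tokens[i:i + budget] for i in range(0, len(tokens), budget)]

# Naive whitespace "tokenization" as a stand-in for a real tokenizer.
doc = "word " * 200_000
tokens = doc.split()
chunks = chunk_tokens(tokens)
print(len(chunks), [len(c) for c in chunks])
```

With a 61,000-token budget, a 200,000-token document yields four chunks; each chunk can then be summarized or queried separately.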

Noah Smith, senior director of NLP research at AI2, told VentureBeat in an interview that many of its customers, from regulated enterprises to research institutions, want to use models that give them assurance about what goes into training.

“The releases from our friends in the tech world are great and very interesting, but there are a lot of people who have data privacy, controls over the model, how the model is trained and how the model can be used front of mind,” Smith said.

Developers can access the models on Hugging Face and the AI2 Playground.

Transparency and customization

Smith said the company believes models like Olmo 3 need to be controllable by any organization using them and adaptable in whatever way works best for that organization.

“We don’t believe in one-size-fits-all solutions,” Smith said. “It’s a well-known thing in the machine learning world that if you try and build a model that solves all problems, it’s not really going to be the best model for any problem. There’s no formal proof of that, but it’s something that an old timer like me has kind of observed.”

He added that such models “may not be as flashy” as those chasing high scores on math exams, “but offer more flexibility to businesses.”

Olmo 3 allows businesses to essentially reshape what the model learns from. The idea is that businesses can bring their own proprietary sources to guide the model in answering company-specific questions. To help businesses through this process, AI2 included checkpoints from each major training phase.
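Bringing proprietary data to a model typically means assembling it into a fine-tuning dataset. The sketch below writes prompt/response pairs as JSONL, a common interchange convention for fine-tuning data; the schema and the example records are hypothetical, not an AI2-specific format.

```python
import io
import json

# Hypothetical company-specific examples; the prompt/response JSONL
# schema is a common convention, not a requirement of Olmo 3.
examples = [
    {"prompt": "What is our standard warranty period?", "response": "24 months."},
    {"prompt": "Which regions does policy X-12 cover?", "response": "EU and UK."},
]

buf = io.StringIO()
for ex in examples:
    buf.write(json.dumps(ex) + "\n")

jsonl = buf.getvalue()
print(jsonl)
```

One record per line keeps the dataset streamable, so large corpora can be fed to a fine-tuning job without loading everything into memory.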

The demand for model customization has increased as businesses that cannot build their own LLMs look to create company-specific or industry-specific models. Startups like Arcee have started offering enterprise-oriented, customizable small models.

Models like Olmo 3, Smith said, also give businesses more confidence in the technology. Because Olmo 3 discloses its training data, Smith said, businesses can be confident the model hasn’t ingested anything it shouldn’t have.

AI2 has long claimed a commitment to greater transparency, even launching a tool called OlmoTrace in April that can trace a model’s output directly back to its training data. The company releases open-sourced models and posts its code to repositories like GitHub for anyone to use.

Competitors such as Google and OpenAI have drawn criticism from developers for moves that hide raw reasoning tokens and abstract away the model’s reasoning; developers complained they were left “blind debugging” without transparency.

AI2 pretrained Olmo 3 on Dolma 3, a six-trillion-token dataset that includes web data, scientific literature and code. Smith said AI2 optimized Olmo 3 for code, compared with Olmo 2’s focus on math.

How it stacks up

AI2 claims the Olmo 3 family represents a significant leap forward for open-source models, at least for open-source LLMs developed outside of China. The base Olmo 3 model trained with “approximately 2.5x higher compute efficiency as measured by GPU hours per token,” meaning it consumed less energy during pre-training and cost less.
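To make the “GPU hours per token” metric concrete, the sketch below computes it for two hypothetical training runs; the GPU counts, wall-clock times and the 2.5x gap are invented for illustration and are not AI2’s actual figures.

```python
def gpu_hours_per_token(gpu_count: int, wall_clock_hours: float, tokens: float) -> float:
    """Training efficiency: total GPU hours consumed per token processed.
    Lower is better -- fewer GPU hours spent on each training token."""
    return gpu_count * wall_clock_hours / tokens

# Hypothetical runs over a 6-trillion-token corpus (illustrative numbers only).
baseline = gpu_hours_per_token(gpu_count=1024, wall_clock_hours=2000, tokens=6e12)
improved = gpu_hours_per_token(gpu_count=1024, wall_clock_hours=800, tokens=6e12)

print(f"baseline: {baseline:.3e} GPU-h/token")
print(f"improved: {improved:.3e} GPU-h/token")
print(f"efficiency gain: {baseline / improved:.1f}x")  # ~2.5x with these inputs
```

The ratio of the two values is what a “2.5x higher compute efficiency” claim expresses: the same token budget trained with 2.5x fewer GPU hours.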

The company said the Olmo 3 models outperformed other open models such as Stanford’s Marin, LLM360’s K2 and Apertus, although AI2 did not provide benchmark data.

The company added that Olmo 3-Think (32B) is “the strongest open reasoning model,” outperforming comparable open-weight models such as the Qwen 3-32B thinking models on its suite of reasoning benchmarks while being trained on 6x fewer tokens.

Olmo 3-Instruct outperformed Qwen 2.5, Gemma 3 and Llama 3.1, the company added.
