Mistral AI Launches Devstral, Powerful New Open Source SWE Agent Model That Runs On Laptops

Well-funded French AI model maker Mistral has consistently punched above its weight since debuting its own powerful open source foundation model in fall 2023. But it recently drew criticism from developers on X over its latest release, a proprietary large language model (LLM) called Medium 3, which some viewed as a betrayal of its open source roots and commitment.

(Recall that open source models can be taken and adapted freely by anyone, while proprietary models must be paid for and their customization options are more limited and controlled by the model maker.)

But today, Mistral is back and recommitting to the open source AI community, and to AI-powered software development in particular, in a big way. The company has teamed up with open source startup All Hands AI, creators of OpenDevin, to release Devstral, a new open-source language model purpose-built for agentic AI development. At 24 billion parameters, it is much smaller than many rival models and requires far less computing power, so it can be run on a laptop.

Unlike traditional LLMs designed for short-form code completions or isolated function generation, Devstral is optimized to act as a full software engineering agent—capable of understanding context across files, navigating large codebases, and resolving real-world issues.

The model is now freely available under the permissive Apache 2.0 license, allowing developers and organizations to deploy, modify, and commercialize it without restriction.

“We wanted to release something open for the developer and enthusiast community—something they can run locally, privately, and modify as they want,” said Baptiste Rozière, research scientist at Mistral AI. “It’s released under Apache 2.0, so people can do basically whatever they want with it.”

Building upon Codestral

Devstral represents the next step in Mistral’s growing portfolio of code-focused models, following its earlier success with the Codestral series.

First launched in May 2024, Codestral was Mistral’s initial foray into specialized coding LLMs. It was a 22-billion-parameter model trained to handle over 80 programming languages and became well-regarded for its performance in code generation and completion tasks.

The model’s popularity and technical strengths led to rapid iterations, including the launch of Codestral-Mamba—an enhanced version built on Mamba architecture—and most recently, Codestral 25.01, which has found adoption among IDE plugin developers and enterprise users looking for high-frequency, low-latency models.

The momentum around Codestral helped establish Mistral as a key player in the coding-model ecosystem and laid the foundation for the development of Devstral—extending from fast completions to full-agent task execution.

Outperforms larger models on top SWE benchmarks

Devstral achieves a score of 46.8% on the SWE-Bench Verified benchmark, a dataset of 500 real-world GitHub issues manually validated for correctness.

This places it ahead of all previously released open-source models and ahead of several closed models, including GPT-4.1-mini, which it surpasses by over 20 percentage points.
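
To make those percentages concrete, the short sketch below converts the reported score into a count of resolved issues and derives the ceiling that the "over 20 percentage points" lead implies for GPT-4.1-mini. It uses only the figures quoted above; the function names are illustrative, and the ceiling is an upper bound inferred from the article, not a reported score.

```python
TOTAL_ISSUES = 500          # size of SWE-Bench Verified, as stated in the article
DEVSTRAL_SCORE_PCT = 46.8   # Devstral's reported score
REPORTED_LEAD_PTS = 20.0    # "over 20 percentage points" ahead of GPT-4.1-mini


def resolved_issues(score_pct: float, total: int) -> int:
    """Convert a percentage score into a whole number of resolved issues."""
    return round(score_pct / 100 * total)


def implied_ceiling(score_pct: float, lead_pts: float) -> float:
    """Highest score a rival could have if it trails by at least lead_pts."""
    return round(score_pct - lead_pts, 1)


print(resolved_issues(DEVSTRAL_SCORE_PCT, TOTAL_ISSUES))      # 234
print(implied_ceiling(DEVSTRAL_SCORE_PCT, REPORTED_LEAD_PTS))  # 26.8
```

In other words, the reported score corresponds to roughly 234 of the 500 issues, and the stated lead implies GPT-4.1-mini scored below 26.8% under the same comparison.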

“Right now, it’s by pretty far the best open model for SWE-bench verified and for code agents,” said Rozière. “And it’s also a very small model—only 24 billion parameters—that you can run locally, even on a MacBook.”

“Compare Devstral to closed and open models evaluated under any scaffold—we find that Devstral achieves substantially better performance than a number of closed-source alternatives,” wrote Sophia Yang, Ph.D., Head of Developer Relations at Mistral AI, on the social network X. “For example, Devstral surpasses the recent GPT-4.1-mini by over 20%.”

The model is finetuned from Mistral Small 3.1 using reinforcement learning and safety alignment techniques.

“We started from a very good base model, Mistral Small 3.1, which already performs well,” Rozière said. “Then we specialized it using safety and reinforcement learning techniques to improve its performance on SWE-bench.”

Built for the agentic era

Devstral is not just a code generation model — it is optimized for integration into agentic frameworks like OpenHands, SWE-Agent, and OpenDevin.

These scaffolds allow Devstral to interact with test cases, navigate source files, and execute multi-step tasks across projects.

“We’re releasing it with OpenDevin, which is a scaffolding for code agents,” said Rozière. “We build the model, and they build the scaffolding — a set of prompts and tools that the model can use, like a backend for the developer model.”

To ensure robustness, the model was tested across diverse repositories and internal workflows.

“We were very careful not to overfit to SWE-bench,” Rozière explained. “We trained only on data from repositories that are not cloned from the SWE-bench set and validated the model across different frameworks.”

He added that Mistral dogfooded Devstral internally to ensure it generalizes well to new, unseen tasks.

Efficient deployment with permissive open license — even for enterprise and commercial projects

Devstral’s compact 24B architecture makes it practical for developers to run locally, whether on a single RTX 4090 GPU or a Mac with 32GB of RAM. This makes it appealing for privacy-sensitive use cases and edge deployments.
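
A rough back-of-the-envelope sketch shows why a 24-billion-parameter model is within reach of that hardware. It estimates memory for the weights alone at several precisions. The bit widths are illustrative assumptions, since the article does not say which precision or quantization local deployments use, and the estimate ignores activations and the context cache.

```python
PARAMS = 24e9  # parameter count reported in the article


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumed precisions for illustration; not taken from the article.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
# 16-bit weights: ~48 GB
# 8-bit weights: ~24 GB
# 4-bit weights: ~12 GB
```

Under these assumptions, 16-bit weights alone would exceed the 24 GB of memory on an RTX 4090, so running on that card presumably relies on a lower-precision or quantized build.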

“This model is targeted toward enthusiasts and people who care about running something locally and privately—something they can use even on a plane with no internet,” Rozière said.

Beyond performance and portability, its Apache 2.0 license offers a compelling proposition for commercial applications. The license permits unrestricted use, adaptation, and distribution—even for proprietary products—making Devstral a low-friction option for enterprise adoption.

Detailed specifications and usage instructions are available on the Devstral-Small-2505 model card on Hugging Face.

The model features a 128,000-token context window and uses the Tekken tokenizer with a vocabulary of 131,000 tokens.

It supports deployment through all major open source platforms including Hugging Face, Ollama, Kaggle, LM Studio, and Unsloth, and works well with libraries such as vLLM, Transformers, and Mistral Inference.

Available via API or locally

Devstral is accessible via Mistral’s La Plateforme API (application programming interface) under the model name devstral-small-2505, with pricing set at $0.10 per million input tokens and $0.30 per million output tokens.
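
As a quick illustration of what that pricing means in practice, the sketch below estimates the bill for a given token volume. The two rates come from the article; the function name and the sample workload are hypothetical.

```python
INPUT_USD_PER_MILLION = 0.10   # quoted price per million input tokens
OUTPUT_USD_PER_MILLION = 0.30  # quoted price per million output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in US dollars for a given token volume."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
cost = estimate_cost_usd(2_000_000, 500_000)
print(f"${cost:.2f}")  # $0.35
```

Because agentic workflows tend to read far more code than they write, input tokens will often dominate the total, though the actual split depends on the scaffold and the task.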

For those deploying locally, support for frameworks like OpenHands enables integration with codebases and agentic workflows out of the box.

Rozière shared how he incorporates Devstral in his own development flow: “I use it myself. You can ask it to do small tasks, like updating the version of a package or modifying a tokenization script. It finds the right place in your code and makes the changes. It’s really nice to use.”

More to come

While Devstral is currently released as a research preview, Mistral and All Hands AI are already working on a larger follow-up model with expanded capabilities. “There will always be a gap between smaller and larger models,” Rozière noted, “but we’ve gone a long way in bridging that. These models already perform very strongly, even compared to some larger competitors.”

With its performance benchmarks, permissive license, and agentic design, Devstral positions itself not just as a code generation tool—but as a foundational model for building autonomous software engineering systems.
