Key Points
- Mistral AI unveiled Devstral 2, a 123 billion‑parameter open‑weights coding model.
- Devstral 2 achieved a 72.2 percent score on the SWE‑bench Verified benchmark.
- Mistral Vibe CLI lets developers interact with Devstral models directly in the terminal.
- The CLI maintains project context, edits multiple files, and runs autonomous shell commands.
- Devstral Small 2, a 24 billion‑parameter variant, scored 68 percent and runs locally on consumer hardware.
- Both models support a 256,000‑token context window for handling large codebases.
- Devstral 2 is released under a modified MIT license; Devstral Small 2 uses Apache 2.0.
- The releases emphasize open‑source accessibility for AI‑driven software development.
New Open‑Weights Coding Model
Mistral AI, a French artificial‑intelligence startup, announced the release of Devstral 2, an open‑weights coding model built with 123 billion parameters. The model is designed to operate as part of an autonomous software‑engineering agent and performed strongly on the SWE‑bench Verified benchmark, achieving a 72.2 percent score. This places Devstral 2 among the top‑performing open‑weights models for solving real‑world GitHub issues.
Mistral Vibe CLI
In tandem with the model, Mistral introduced a new developer tool called Mistral Vibe, a command‑line interface (CLI) similar to existing offerings such as Claude Code, OpenAI Codex, and Gemini CLI. Developers can interact with the Devstral models directly from their terminal: the CLI scans file structures and Git status to maintain context across an entire project, makes changes across multiple files, and executes shell commands autonomously. Mistral released the CLI under the Apache 2.0 license.
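Mistral has not detailed the CLI's internals here, but the pattern it describes is common to agentic coding tools: gather project context, then hand it to the model along with a task. The sketch below is a rough, hypothetical illustration of that context‑gathering step in Python; the function names and prompt layout are assumptions, not Mistral Vibe's actual implementation.

```python
# Illustrative sketch only: how an agent-style coding CLI might assemble
# project context before calling a model. This is NOT Mistral Vibe's
# actual code; the function names and prompt format are assumptions.
import subprocess
from pathlib import Path


def list_project_files(root: str, limit: int = 200) -> list[str]:
    """Collect a bounded listing of files under the project root, skipping .git."""
    files = [
        str(p.relative_to(root))
        for p in Path(root).rglob("*")
        if p.is_file() and ".git" not in p.parts
    ]
    return files[:limit]


def git_status(root: str) -> str:
    """Return porcelain-format Git status so the agent sees pending changes."""
    result = subprocess.run(
        ["git", "-C", root, "status", "--porcelain"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def build_context_prompt(root: str, task: str) -> str:
    """Combine the task, file tree, and Git state into one prompt string."""
    return (
        f"Task: {task}\n\n"
        "Project files:\n" + "\n".join(list_project_files(root)) + "\n\n"
        f"Git status:\n{git_status(root)}"
    )


if __name__ == "__main__":
    print(build_context_prompt(".", "Fix the failing unit test in utils.py"))
```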
Benchmark Context
The SWE‑bench Verified benchmark tests AI systems on 500 real software‑engineering problems pulled from popular Python repositories on GitHub. Models must read issue descriptions, navigate codebases, and generate working patches that pass unit tests. While many tasks involve relatively simple bug fixes, the benchmark remains one of the few standardized ways to compare coding models.
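SWE‑bench's official harness is considerably more elaborate (each task runs in its own controlled environment), but the core check reduces to a simple loop, sketched below with hypothetical paths: apply the model's candidate patch to a repository checkout, run the task's designated tests, and count the issue as resolved only if both steps succeed.

```python
# Simplified illustration of a SWE-bench-style evaluation step:
# apply a model-generated patch, then run the project's tests.
# The real harness isolates each task; the paths here are hypothetical.
import subprocess


def apply_patch(repo_dir: str, patch_file: str) -> bool:
    """Try to apply the candidate patch; True if it applies cleanly."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "apply", patch_file],
        capture_output=True, text=True,
    )
    return result.returncode == 0


def run_tests(repo_dir: str, test_selector: str) -> bool:
    """Run the task's designated tests; True if they all pass."""
    result = subprocess.run(
        ["python", "-m", "pytest", test_selector],
        cwd=repo_dir, capture_output=True, text=True,
    )
    return result.returncode == 0


def evaluate_task(repo_dir: str, patch_file: str, test_selector: str) -> bool:
    """A task counts as resolved only if the patch applies and the tests pass."""
    return apply_patch(repo_dir, patch_file) and run_tests(repo_dir, test_selector)


if __name__ == "__main__":
    resolved = evaluate_task("./checkout", "candidate.patch", "tests/test_issue.py")
    print("resolved" if resolved else "not resolved")
```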
Devstral Small 2
Alongside Devstral 2, Mistral also released Devstral Small 2, a 24 billion‑parameter version of the model. This smaller model scored 68 percent on the same benchmark and is engineered to run locally on consumer hardware, such as a laptop, without requiring an internet connection. Both Devstral 2 and Devstral Small 2 support a 256,000‑token context window, allowing them to process moderately large codebases.
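Mistral typically distributes its open‑weights models through Hugging Face, so a local run of Devstral Small 2 would likely look something like the sketch below. The model ID is a placeholder, not the official name, and the hardware note is a rough assumption: a 24 billion‑parameter model generally needs quantization or a well‑equipped GPU to be practical on a laptop.

```python
# Hedged sketch of running a Devstral Small 2-class model locally with
# Hugging Face transformers. The model ID below is a placeholder; check
# Mistral's official Hugging Face page for the real identifier.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "mistralai/Devstral-Small-2-placeholder"  # hypothetical ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",   # spread layers across available GPU/CPU memory
    torch_dtype="auto",  # use the checkpoint's native precision
)

prompt = "Write a Python function that parses a semantic version string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```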
Licensing and Availability
Devstral 2 is released under a modified MIT license, while Devstral Small 2 is offered under the more permissive Apache 2.0 license. The open‑source licensing strategy reflects Mistral’s commitment to making powerful AI coding tools broadly accessible to developers and researchers.
Source: arstechnica.com