Cohere sends a 30B coding model into agentic harnesses | Radar

Cohere has released North Mini Code, a 30B Mixture of Experts model with 3B active parameters for agentic software engineering. The model is available on Hugging Face under the Apache 2.0 license and the authors report a 33.4 score on Artificial Analysis’ Coding Index.

North Mini Code targets coding agents, not just function generation

The announcement was published on June 9, 2026 on Hugging Face by Cohere Labs. North Mini Code is the first model in Cohere’s new developer focused family. BF16 and FP8 weights are available on Hugging Face, with access through the Cohere API and integration in OpenCode.

The architecture is a decoder only sparse Mixture of Experts model with 128 experts, 8 of which are activated per token. Cohere lists 30B parameters and 3B active parameters. The training target is complex software engineering workflow, terminal based agentic tasks and code generation.

The stronger product signal is toolchain robustness

Cohere is not only pushing one score on one benchmark. The article describes training across different scaffolds and harnesses: SWE-Agent, mini-SWE-agent, OpenCode and Terminus 2. That matters because a coding agent is not just a model. It is a model trapped in a particular interface with tools, logs, errors and tests.

The second SFT stage uses a 4.5 billion token mixture from agentic and reasoning samples. The authors describe more than 70 thousand verifiable tasks across about 5 thousand repositories and deduplication against SWE-Bench and SWE-Bench-Pro. Added harness data reportedly produced a 10 % gain in OpenCode without hurting SWE-Agent performance.

The benchmarks look useful, but the method is still vendor ground

Cohere says North Mini Code outperforms several open source models of similar or larger size, including Qwen3.5, Gemma 4, Devstral Small 2, Nemotron 3 Super, Mistral Small 4 and Devstral 2. For competitor results, however, the article itself says some scores come from public reports and some missing results were run internally.

That is not a disqualification. It is a reason to read the charts as a promising signal, not a closed verdict. Independent replications on the same harnesses and ordinary developer repositories will matter more than another figure with one model at the top.

Adoption in agentic IDEs and local company stacks will decide

The near term test is practical: whether North Mini Code holds up in OpenCode, internal coding agents and company repositories where build systems, private dependencies and test quality vary. The Apache 2.0 license gives it a path into local and controlled deployments where closed models often do not belong.

The second signal is operating cost. 3B active parameters in an MoE model sounds like a sensible compromise, but agentic tasks burn context, tools and repeated attempts. Efficiency will show up on the receipt for real rollouts.

Lilith's verdict

North Mini Code has its best shot where a developer does not want a poetry generator for Python, but a quiet runner in the terminal that finishes the tests and does not trip over its own tool.