Simon Willison shows why an agent sandbox cannot be just another Python process | Radar

Simon Willison released the alpha package micropython-wasm and a Datasette Agent plugin that runs Python inside a WebAssembly sandbox. The important part is not the demo, but the boundary between a useful agent and code that can break its host application.

MicroPython in WASM gives plugins a narrower cage

Willison describes a long-running problem in his Python plugin systems for Datasette, LLM and sqlite-utils: plugins are excellent for experimentation, but they run with the full privileges of the application. A buggy or malicious plugin can read data, touch files or damage the running process.

His latest attempt uses MicroPython compiled to WebAssembly and executed through wasmtime. The micropython-wasm package is available on PyPI as an alpha, and datasette-agent-micropython uses it as a code execution sandbox for Datasette Agent. Version 0.1a2 adds a simple CLI mode via uvx. Willison also says plainly that he is not ready to recommend it to anyone unwilling to accept significant risk.

An agent without a sandbox is a production incident with a prompt

For agentic products, this is a practical signal. The more we let agents run code, call tools and modify local data, the less it helps to say the model has a good system prompt. Without a separate runtime, safety depends on the discipline of code we are explicitly treating as untrusted.

WebAssembly is interesting because it moves the argument from a language-level promise to isolation enforced by the runtime. Guest code can reach only what the host chooses to expose: the host decides which functions to offer, how much memory to allow and whether any filesystem or network access exists at all. That is boring infrastructure, but for agents it matters more than another glossy chat UI.
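The idea that the host decides what to expose can be sketched in a few lines of plain Python. This is a toy model of the boundary only, not wasmtime's API and not Willison's implementation; the class HostBoundary, the function run_readonly_query and the name "query" are all hypothetical, invented for illustration.

```python
from typing import Callable, Dict


class HostBoundary:
    """Toy model of a sandbox boundary: the guest can call only registered functions."""

    def __init__(self) -> None:
        self._exposed: Dict[str, Callable[..., object]] = {}

    def expose(self, name: str, func: Callable[..., object]) -> None:
        # The host opts in to each capability explicitly.
        self._exposed[name] = func

    def call(self, name: str, *args: object) -> object:
        # Anything the host did not register is refused by default.
        if name not in self._exposed:
            raise PermissionError(f"host function {name!r} is not exposed to the guest")
        return self._exposed[name](*args)


def run_readonly_query(sql: str) -> str:
    # Hypothetical host function; a real one would run the query read-only.
    return f"rows for: {sql}"


boundary = HostBoundary()
boundary.expose("query", run_readonly_query)

print(boundary.call("query", "select 1"))
try:
    boundary.call("delete_file", "/tmp/data.db")
except PermissionError as exc:
    print("refused:", exc)
```

The design point is the default: a plugin running inside the application starts with everything and must be trusted to use little, while a guest behind a boundary like this starts with nothing and gets only what was registered.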

Alpha means the cage is not certified yet

The author is careful. Memory limits are supported directly by wasmtime, and CPU limits rely on its fuel mechanism, but Willison says fuel units are hard to reason about and he is not fully confident in the current default of 20 million. Host functions are handled by custom C code compiled into the WASM blob.
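To see why a fuel budget is hard to translate into wall-clock terms, consider a toy interpreter in plain Python that charges one unit of fuel per instruction. This is a conceptual sketch under stated assumptions, not wasmtime's accounting, which differs in detail; the instruction set, run_with_fuel and OutOfFuel are hypothetical names made up for this example.

```python
class OutOfFuel(Exception):
    """Raised when the guest program exhausts its instruction budget."""


def run_with_fuel(program, fuel):
    """Run a tiny stack-machine program, charging one unit of fuel per instruction.

    Hypothetical instructions, for illustration only:
      ("push", n)  push n onto the stack
      ("add",)     pop two values and push their sum
      ("jump", i)  continue at instruction index i
    Returns (stack, remaining_fuel).
    """
    stack = []
    pc = 0
    while pc < len(program):
        if fuel <= 0:
            raise OutOfFuel(f"budget exhausted at instruction {pc}")
        fuel -= 1  # every instruction costs the same, however long it takes
        op = program[pc]
        if op[0] == "push":
            stack.append(op[1])
        elif op[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op[0] == "jump":
            pc = op[1]
            continue
        else:
            raise ValueError(f"unknown instruction {op!r}")
        pc += 1
    return stack, fuel


# A finite program finishes with fuel to spare.
print(run_with_fuel([("push", 2), ("push", 3), ("add",)], fuel=10))

# An infinite loop is stopped by the budget rather than by a timeout.
try:
    run_with_fuel([("jump", 0)], fuel=1000)
except OutOfFuel as exc:
    print("stopped:", exc)
```

The budget counts executed instructions, not seconds, so picking a default means guessing how many instructions a legitimate script needs. That is one way to read Willison's unease about the number he chose.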

That is not a disqualification. It is the exact gap between a clever prototype and a security product. If a sandbox is meant to protect other people's data, it needs fuzzing, review, documented threat models and people whose job is to break isolation boundaries.

The test is breakout attempts, not PyPI stars

The next signal is simple: whether teams with professional security review adopt Python-in-WASM sandboxing, try to break it and open source their results.

Until then, micropython-wasm is a useful reference point for agent developers. It shows what the minimum control layer should look like when an application lets a model run code but does not want to hand over access to everything else.

Lilith's verdict

An agent that can run code without a sandbox is not a colleague. It is an intern with root access and a curious finger hovering over delete.