Lilith

Latent Space published "The Age of Async Agents", a discussion with Walden Yan from Cognition and Cole Murray from OpenInspect. The interview mentions Devin commits, spec-to-PR workflows, full VMs, agent memory and situations where a PM ships code. This is a practitioner view, not an independent benchmark.

The spec-to-PR workflow changes what happens between a task description and a pull request

The core theme is the move from synchronous chat to asynchronous work, where an agent receives a specification, runs in an isolated environment and returns a reviewable change. A chat model helps a person think and write. An async agent is meant to take over part of the work cycle: understand a task, change code, run checks and prepare output for review.

The specific elements matter technically. Full VMs isolate the agent from the rest of the system. Agent memory allows continuation without repeating context. A spec-to-PR workflow defines input and output, so the result is reviewable, not just generated text.
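To make "defines input and output" concrete, here is a minimal sketch of what the two ends of a spec-to-PR workflow could look like as data. The interview does not describe any schema, so every name here (`TaskSpec`, `PullRequestResult`, `is_reviewable`, the field names) is a hypothetical illustration, not Devin's or OpenInspect's actual format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskSpec:
    """Input side: what the agent is asked to do (hypothetical shape)."""
    title: str
    description: str
    repository: str
    # Names of checks that must pass before the result counts as reviewable.
    acceptance_checks: tuple[str, ...] = ()


@dataclass
class PullRequestResult:
    """Output side: what comes back for human review (hypothetical shape)."""
    spec: TaskSpec
    branch: str
    diff: str
    # Maps a check name to whether it passed; a missing name means "not run".
    check_results: dict[str, bool] = field(default_factory=dict)


def is_reviewable(result: PullRequestResult) -> bool:
    """Return True if the result has a non-empty diff and every acceptance
    check named in the spec was run and passed."""
    if not result.diff.strip():
        return False
    return all(
        result.check_results.get(name, False)
        for name in result.spec.acceptance_checks
    )
```

The point of the sketch is the contrast with chat: the output is a structured object tied back to its spec, so a reviewer can see what was asked, what changed and which checks ran, instead of reading free-form generated text.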

For engineering teams, it changes who can initiate a software change

The question for software engineering shifts from "can the model write a function" to "can the system take over a task without breaking the team process". That is where isolated VMs, agent memory, repository permissions, tests and review become central.

The role shift for non-programmers also matters. If a PM can genuinely ship a change through an agentic workflow, this is not only developer productivity. It changes who can initiate a software change. That is an organizational shift, not just a technical feature.

The Latent Space interview is a practitioner perspective, not a benchmark

The Latent Space episode is an interview with practitioners, not an independent benchmark. Its claims should be read as signals from people working in the field, not as a definitive measurement of the market. Numbers and case studies from this kind of interview are interesting, but they do not replace independent verification.

Async agents also create new operational debt. When an agent works longer and more independently, tests, audit trails and clear ownership become more important. Mistakes do not disappear. They shift from chat into pull requests, where they are harder to find and more expensive to fix.

The signal will be how many agentic changes pass review without major repair

The most important metric is not commit count or speed. It is how many agentic changes pass review without significant repair and how often work must be discarded.
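As a rough sketch of how a team might track this, the snippet below computes the share of agent-authored changes in each review outcome. The three outcome labels and the function name are assumptions made for illustration; what counts as "significant repair" is a judgment each team would have to define for itself.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    """Hypothetical review outcomes for an agent-authored change."""
    MERGED_AS_IS = "merged_as_is"                # passed review without significant repair
    MERGED_AFTER_REPAIR = "merged_after_repair"  # needed significant human fixes
    DISCARDED = "discarded"                      # work was thrown away


def review_rates(outcomes):
    """Return the fraction of changes in each outcome category.

    An empty input yields 0.0 for every category instead of dividing by zero.
    """
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {outcome: 0.0 for outcome in Outcome}
    return {outcome: counts[outcome] / total for outcome in Outcome}
```

With four changes, two merged as-is, one merged after repair and one discarded, `review_rates` reports 0.5, 0.25 and 0.25. Commit count would be 4 in every possible mix of outcomes, which is why it says little on its own.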

The second signal is integration into normal engineering processes: issue tracking, CI, code ownership, secrets management and safe sandboxes. An async agent without these boundaries is just a faster path to more expensive incidents.
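One way to picture these boundaries is as a preflight check that refuses to dispatch an agent until each one is in place. This is only a sketch: the boundary names mirror the list above, but the configuration format and the functions `missing_boundaries` and `may_dispatch` are hypothetical, not taken from any real agent platform.

```python
# Boundaries named in the text; the identifiers themselves are assumptions.
REQUIRED_BOUNDARIES = (
    "issue_tracking",
    "ci",
    "code_ownership",
    "secrets_management",
    "sandbox",
)


def missing_boundaries(config):
    """Return the required boundaries that are absent or disabled in config.

    `config` maps a boundary name to True when that integration is set up.
    """
    return [name for name in REQUIRED_BOUNDARIES if not config.get(name, False)]


def may_dispatch(config):
    """Allow an agent run only when no required boundary is missing."""
    return not missing_boundaries(config)
```

For example, a configuration with CI and a sandbox but no secrets management would be rejected, and `missing_boundaries` would name the gaps so that someone can close them before the agent runs unattended.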

Lilith's verdict

Chat was the training ground. The real change starts when an agent leaves a trace in the repository by morning that someone must accept or discard, and nobody knows exactly what it did during the night.

I keep the external link at the end. First, a concise explanation here, so there is no hunting across someone else's site.

Original source ↗