Computer-use agents — the model that clicks
A computer-use agent sees the screen and controls the UI. It sounds like sci-fi; in practice it is fragile automation over pixels, forms and badly labelled buttons.
What it is
A computer-use agent receives a screenshot or UI tree, decides where to click or what to type, and performs the action in a browser or desktop session. It is not the same as an API integration: a UI is designed for human eyes and hands, not for programs that expect stable, deterministic behaviour.
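The observe, decide, act cycle described above can be sketched in a few lines. This is a toy illustration, not any vendor's API: `Action`, `run_agent` and the three callables are hypothetical names, and the "screen" is a text description rather than real pixels.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str         # "click", "type", or "done"
    target: str = ""  # hypothetical element label
    text: str = ""    # text to type, if any


def run_agent(
    observe: Callable[[], str],
    decide: Callable[[str], Action],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> list[Action]:
    """Observe the screen, pick an action, perform it, repeat until done."""
    history: list[Action] = []
    for _ in range(max_steps):  # hard cap so a confused agent cannot loop forever
        screen = observe()
        action = decide(screen)
        if action.kind == "done":
            break
        act(action)
        history.append(action)
    return history


# Toy stand-ins for a real browser and a real model (assumptions for the demo).
state = {"name": "", "submitted": False}


def observe() -> str:
    if state["submitted"]:
        return "confirmation page"
    return "form: name filled" if state["name"] else "form: name empty"


def decide(screen: str) -> Action:
    if screen == "form: name empty":
        return Action("type", target="name", text="Ada")
    if screen == "form: name filled":
        return Action("click", target="Submit")
    return Action("done")


def act(action: Action) -> None:
    if action.kind == "type":
        state[action.target] = action.text
    elif action.kind == "click" and action.target == "Submit":
        state["submitted"] = True


history = run_agent(observe, decide, act)
```

In a real agent, `observe` would capture a screenshot or accessibility tree and `decide` would call the model; the step cap is one cheap defence against an agent that never reaches its goal.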
Why it is tempting
Many tools have poor APIs, internal apps are old, and people work through browsers anyway. An agent that can fill a form, download a report or compare screens can route around years of integration debt.
Why it is dangerous
UIs change without notice, buttons look alike, modals cover the page, and the model can click a destructive control by mistake. Computer-use agents need confirmations, sandboxes, limited accounts and no access to anything outside the task.
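One way to picture the confirmation and scoping rules is a small gate that every click passes through before it is executed. The sketch below is an assumption-heavy illustration: `Click`, `is_allowed`, the keyword list and the host names are all hypothetical, and keyword matching on button labels is crude compared with a real policy.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse

# Assumed list of words that mark a control as destructive; tune per application.
DESTRUCTIVE_WORDS = ("delete", "remove", "pay", "send", "cancel")


@dataclass(frozen=True)
class Click:
    url: str    # page the agent is on
    label: str  # visible label of the control it wants to click


def is_allowed(
    click: Click,
    allowed_hosts: set[str],
    confirm: Callable[[Click], bool],
) -> bool:
    """Block clicks outside the task's hosts; ask a human before destructive ones."""
    host = urlparse(click.url).hostname or ""
    if host not in allowed_hosts:
        return False  # outside the task: never allowed
    if any(word in click.label.lower() for word in DESTRUCTIVE_WORDS):
        return confirm(click)  # a human (or stricter policy) decides
    return True


hosts = {"reports.example.com"}
deny = lambda click: False     # stand-in for a human who says no
approve = lambda click: True   # stand-in for a human who says yes

safe = is_allowed(Click("https://reports.example.com/q3", "Download report"), hosts, deny)
blocked = is_allowed(Click("https://reports.example.com/q3", "Delete report"), hosts, deny)
confirmed = is_allowed(Click("https://reports.example.com/q3", "Delete report"), hosts, approve)
off_task = is_allowed(Click("https://mail.example.com/inbox", "Open"), hosts, approve)
```

A gate like this complements, rather than replaces, a sandboxed session and a limited account: it only covers actions the agent reports before taking them.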
What to remember
Computer-use is a great fallback, not an ideal integration layer. If an API exists, use the API. If it does not, expect fragility and log every click.
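As a rough sketch of "API first, UI as fallback, log every click", the snippet below routes a task to an API client when one exists and otherwise hands it to a UI agent that records each action. `ActionLog`, `get_report` and `fake_ui_agent` are hypothetical names invented for this illustration.

```python
import json
import time
from typing import Callable, Optional


class ActionLog:
    """Append-only log; each entry is one JSON line."""

    def __init__(self) -> None:
        self.entries: list[str] = []

    def record(self, kind: str, target: str, **details: str) -> None:
        entry = {"ts": time.time(), "kind": kind, "target": target, **details}
        self.entries.append(json.dumps(entry, sort_keys=True))


def get_report(
    api_client: Optional[Callable[[], str]],
    ui_agent: Callable[[ActionLog], str],
    log: ActionLog,
) -> str:
    """Use the API if there is one; otherwise fall back to the UI agent."""
    if api_client is not None:
        log.record("api_call", "get_report")
        return api_client()
    log.record("fallback", "ui_agent", reason="no API available")
    return ui_agent(log)


def fake_ui_agent(log: ActionLog) -> str:
    # Stand-in for a real agent: every click is logged before it happens.
    log.record("click", "Reports")
    log.record("click", "Download CSV")
    return "csv-from-ui"


ui_log = ActionLog()
ui_result = get_report(None, fake_ui_agent, ui_log)

api_log = ActionLog()
api_result = get_report(lambda: "csv-from-api", fake_ui_agent, api_log)
```

Keeping the log as plain JSON lines makes it easy to replay or audit a run after the UI has changed underneath the agent.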