Lilith

Concept

Agent infrastructure — the boring layer agents need to work

An agent is not just a model with a task. In production it needs identity, permissions, inboxes, tools, memory, audit, telemetry and clear boundaries. Without infrastructure, autonomy is just a pretty demo with risk attached.

Concept

Agent safety and sandboxing

An agent with tools is a tiny machine for consequences. Sandboxes, approvals, least privilege and audit logs are not enterprise decoration; they are brakes before the fire.
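As a rough illustration of those brakes, here is a minimal sketch of a gate that sits between an agent and its tools. Everything in it is hypothetical: the `ToolGate` class, the tool names and the `approve` callback are assumptions made for the example, not any real framework's API. It covers least privilege, approvals and audit logging only; an actual sandbox (process, filesystem and network isolation) is a separate layer this sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolGate:
    """Hypothetical policy gate: allowlist, human approval, audit trail."""
    allowed: set[str]                      # least privilege: everything else is denied
    needs_approval: set[str]               # risky tools that require a human "yes"
    approve: Callable[[str, dict], bool]   # stand-in for a real approval flow
    audit_log: list[dict] = field(default_factory=list)

    def call(self, tool: str, args: dict, tools: dict[str, Callable[..., object]]):
        if tool not in self.allowed:
            decision = "denied"
        elif tool in self.needs_approval and not self.approve(tool, args):
            decision = "rejected"
        else:
            decision = "allowed"
        # Log every attempt, including the ones that were blocked.
        self.audit_log.append({"tool": tool, "args": args, "decision": decision})
        if decision != "allowed":
            raise PermissionError(f"{tool}: {decision}")
        return tools[tool](**args)


# Toy tools that only return strings; real ones would have side effects.
tools = {
    "read_file": lambda path: f"contents of {path}",
    "delete_file": lambda path: f"deleted {path}",
}
gate = ToolGate(
    allowed={"read_file", "delete_file"},
    needs_approval={"delete_file"},
    approve=lambda tool, args: False,  # assumption: the reviewer always says no
)
result = gate.call("read_file", {"path": "notes.txt"}, tools)
```

The design choice worth noticing is that the log entry is written before the permission check raises, so blocked attempts leave a trace too.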

Concept

AI-assisted research — the model as a research partner

AI-assisted research uses models to find hypotheses, write code, test variants and read literature. It is not automatic science. It is a faster research loop with new ways to fall on your face.

Concept

Async agents — work that does not live in chat

An async agent takes a task, runs outside the conversation and returns a finished artifact. It is powerful for long workflows and dangerous without state, limits and review.

Concept

Computer-use agents — the model that clicks

A computer-use agent sees the screen and controls the UI. It sounds like sci-fi; in practice it is fragile automation over pixels, forms and badly labelled buttons.

Concept

Evals and benchmarks — measurement instead of vibes

A benchmark is not truth carved in stone. It is an instrument with error bars. Without it, though, you are only guessing whether a model or agent works.
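To make the "error bars" concrete, here is a small sketch that puts a Wilson score interval around a benchmark pass rate. The scores (78 and 83 correct out of 100) are invented for illustration, and the Wilson interval is only one reasonable choice; it assumes the test items are independent, which real benchmarks often violate.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a pass rate; z=1.96 gives roughly 95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical results: two models on the same 100-item benchmark.
low_a, high_a = wilson_interval(78, 100)
low_b, high_b = wilson_interval(83, 100)

# If the intervals overlap, a 5-point gap on 100 items is weak evidence.
intervals_overlap = low_b <= high_a
print(f"A: {low_a:.3f} to {high_a:.3f}")
print(f"B: {low_b:.3f} to {high_b:.3f}")
print("overlap:", intervals_overlap)
```

With only 100 items the two intervals overlap, so the headline gap alone does not show that one model is better than the other.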

Concept

Fine-tuning — a scalpel, not a universal hammer

Fine-tuning changes model weights. It is powerful when you have data, evals and a clear reason. It is an expensive mistake when it hides a bad prompt, missing RAG or an unclear process.

Concept

Frontier model governance — who checks the model before release

Frontier model governance asks who tests the strongest models before deployment, under which rules and with what power to intervene. A voluntary audit, a system card and government testing are not the same thing.

Concept

Model economics — the operating cost of intelligence

Model economics puts tokens, latency, throughput, quality and risk on one bill. A model is not just smart or dumb; it is also expensive or cheap, slow or fast, local or hosted, and in the end either operationally bearable or not.
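A back-of-the-envelope sketch of that bill, covering only tokens and latency. The two model profiles, their prices and their decoding speeds are made-up numbers chosen for the arithmetic, not quotes from any provider, and the latency estimate ignores queueing, network time and prompt processing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical pricing and speed for one model; all numbers are assumptions."""
    name: str
    usd_per_million_input: float
    usd_per_million_output: float
    output_tokens_per_second: float


def request_cost_usd(m: ModelProfile, input_tokens: int, output_tokens: int) -> float:
    """Token cost of a single request in US dollars."""
    total = (input_tokens * m.usd_per_million_input
             + output_tokens * m.usd_per_million_output)
    return total / 1_000_000


def decode_seconds(m: ModelProfile, output_tokens: int) -> float:
    """Rough time spent generating the answer, ignoring everything else."""
    return output_tokens / m.output_tokens_per_second


big = ModelProfile("big-hosted", 10.0, 30.0, 40.0)
small = ModelProfile("small-local", 0.5, 1.5, 120.0)

# One request: 2,000 prompt tokens in, 500 answer tokens out.
for m in (big, small):
    cost = request_cost_usd(m, 2_000, 500)
    secs = decode_seconds(m, 500)
    print(f"{m.name}: ${cost:.5f} per request, about {secs:.1f}s to generate")
```

Quality and risk do not fit in a function this small, which is why the cheaper row is not automatically the right answer.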

Concept

Model reliability — when a pretty answer is not enough

Reliability is about when the model knows, when it does not, when it invents, and how often its output can be trusted in production. Elegant wording is not evidence.

Concept

Open vs. closed models — who pays the frontier premium

An open model is not automatically freedom, and a closed model is not automatically lock-in. The practical question is when control, cost and local deployment matter more than paying for frontier capability.

Concept

Physical AI — when an agent reaches into the world

Physical AI connects models, robots, simulation and actions in the real environment. It is not about a cute robot demo, but about who carries the risk when a model starts moving things.

Concept

Tool use — when a model calls tools

Tool use is the moment an LLM stops merely answering and starts calling APIs, running commands, reading files or touching databases. Useful, sharp and dangerous.

Concept

Zombie internet — when AI text eats the web

Zombie internet is a web flooded with generated text, summaries without accountability and content that only looks human from far away. The problem is not just spam. The problem is loss of trust.
