Lilith

From Radar

Radar · 2026-06-16

Android 17 turns Pixel into Gemini’s showroom

Google released Android 17 and Wear OS 7 first for Pixel devices, alongside a Pixel Drop with Gemini Omni, Lyria 3 and translation features for the Pixel 10a. The bigger signal is not the OS update itself, but Google using Android as a distribution layer for AI models on the device.

Radar · 2026-06-16

SearchLeak shows why prompt injection hurts more in enterprise AI than in chat

The SearchLeak vulnerability in Microsoft 365 Copilot Enterprise Search could let attackers steal emails, documents or 2FA codes after a user clicked a crafted link, according to Varonis and Ars Technica. Microsoft has patched it, but the lesson remains: an agent with access to corporate data is a security product, not just a productivity assistant.

Radar · 2026-06-15

Thirteen words on Reddit can poison an AI answer

Research described by 404 Media says a 13-word snippet of retrieved text from sites such as Reddit, Wikipedia, Quora or Facebook can push AI agents toward spam or scam output. For AI search, that turns SEO into a problem of prompt injection and user-generated content moderation.

Radar · 2026-06-14

The Mythos suspicion turns export control into an access control problem

The Verge, citing Semafor, says the White House restricted exports of Anthropic Mythos partly over suspicions that a China-linked group had access to it. For AI labs, the warning is blunt: frontier model security is not just about public APIs, but about every path to access.

Radar · 2026-06-10

OpenAI is using Oracle Cloud to solve procurement, not demos

OpenAI is offering its models and Codex to Oracle Cloud customers through existing cloud commitments. For enterprise teams, the interesting part is not the endpoint, but the way AI fits into contracts, governance and billing they already use.

Radar · 2026-06-09

Gemini 3.5 Live Translate moves voice translation a few seconds behind the speaker

Google announced Gemini 3.5 Live Translate for near real-time voice-to-voice translation across more than 70 languages. The practical question is not just translation quality, but latency, voice stability, Meet availability and who carries the risk when a live call is mistranslated.

Radar · 2026-05-07

Mozilla fixed hundreds of Firefox bugs with Claude Mythos. AI security report quality just shifted.

Simon Willison described how Mozilla used early access to Claude Mythos Preview to systematically find and fix Firefox vulnerabilities. In April 2026 the number of fixed security bugs jumped to 423, compared to the usual 20 to 30 per month. The key shift: AI security reports stopped being noise and started being usable input.

Radar · 2026-04-28

OpenAI layers ChatGPT safety from model to abuse detection, but the numbers are missing

OpenAI outlines its layered approach to ChatGPT community safety: model safeguards, abuse detection, policy enforcement, and collaboration with external safety experts.

Radar · 2026-04-23

OpenAI pays up to $25,000 for bio jailbreaks in GPT-5.5, but proof will be in aggregate results

OpenAI launches a bio bug bounty targeting universal jailbreaks in GPT-5.5, with rewards up to $25,000 for critical biological safety findings.

Radar · 2025-12-18

GPT-5.2-Codex targets long-horizon refactors, but the proof will be independent production tests

GPT-5.2-Codex targets long-horizon coding tasks over large contexts: large-scale code transformations, security fixes, and multi-file consistency.

Radar · 2025-11-19

GPT-5.1-Codex-Max system card is worth reading, but trust it in proportion to how specific it is about its limits

The GPT-5.1-Codex-Max system card describes two safety layers: at the model level, safety training and prompt injection protection; at the product level, sandboxing with configurable network access.

Radar · 2025-11-02

Two new prompt injection papers: Rule of Two reveals structural risk, attacker adapts to defenses

Simon Willison highlighted two new papers on agent prompt injection. Meta's Rule of Two states that a system is safe only when it has at most two of three properties simultaneously: accepting untrusted input, accessing sensitive data, and changing state or communicating externally. A second paper from researchers at OpenAI, Anthropic, and DeepMind showed that 12 published defenses were bypassed by adaptive attacks with success rates above 90%.
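
The Rule of Two as summarised here can be read as a simple configuration check. The sketch below is illustrative only: the class and field names are hypothetical and not taken from Meta's paper, and it encodes just the "at most two of three" condition as stated in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what an agent session is allowed to do."""
    accepts_untrusted_input: bool
    accesses_sensitive_data: bool
    changes_state_or_communicates_externally: bool


def satisfies_rule_of_two(caps: AgentCapabilities) -> bool:
    """Return True when at most two of the three properties are enabled."""
    enabled = sum([
        caps.accepts_untrusted_input,
        caps.accesses_sensitive_data,
        caps.changes_state_or_communicates_externally,
    ])
    return enabled <= 2


# An agent that reads the web and private mail but cannot send anything out.
read_only_agent = AgentCapabilities(True, True, False)
# An agent that does all three at once, the combination the rule warns against.
full_agent = AgentCapabilities(True, True, True)

print(satisfies_rule_of_two(read_only_agent))
print(satisfies_rule_of_two(full_agent))
```

A check like this only flags the risky combination; it does not by itself make an agent resistant to prompt injection.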

Radar · 2025-10-29

OpenAI opens policy-based content classification with open-weight safeguard models

OpenAI released gpt-oss-safeguard-120b and 20b: open-weight reasoning models where content classification policy is not baked into the weights but supplied at runtime. Organizations bring their own rules; the model reasons over them.

Radar · 2025-09-05

Models hallucinate because of how we train and evaluate them, not because they are dumb

OpenAI's September 2025 post goes to the root of hallucinations: models learn to play the evaluation game, not to answer truthfully. If evals penalise admitted uncertainty more harshly than confident errors, models calibrate toward persuasiveness.
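
The incentive can be shown with a toy expected-score calculation. This is a sketch under stated assumptions, not OpenAI's evaluation code: a correct answer scores 1, "I don't know" scores 0, and the penalty for a wrong answer is the only thing that varies. The function names and the 30% confidence figure are illustrative.

```python
ABSTAIN_SCORE = 0.0  # assumption: admitting uncertainty always scores zero


def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct - (1.0 - p_correct) * wrong_penalty


def best_strategy(p_correct: float, wrong_penalty: float) -> str:
    """Pick whichever option has the higher expected score."""
    if expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "guess"
    return "abstain"


# A model that is only 30% sure of its answer.
for penalty in (0.0, 1.0):
    print(f"wrong_penalty={penalty}: {best_strategy(0.3, penalty)}")
```

With no penalty for errors, guessing at 30% confidence beats abstaining; once a wrong answer costs as much as a right one earns, abstaining wins.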

Radar · 2025-08-27

OpenAI and Anthropic tested each other's models. The findings are instructive, the methodology still open.

OpenAI and Anthropic published results of a joint safety evaluation: they tested each other's models for misalignment, instruction following, hallucinations, and jailbreaking. For the first time, two leading labs show where outside eyes find their blind spots.

From the Glossary