The First Fully Autonomous AI Cyberattack Just Happened. Your SOC Wasn’t Ready.
On May 10, 2026, an LLM agent conducted a complete post‑exploitation chain without a single human command. No one typed. No one clicked. No one was watching.
The agent popped an exposed notebook via CVE‑2026‑39987, harvested AWS credentials, extracted an SSH key from Secrets Manager, fanned 12 API calls across 11 IPs in 22 seconds, pivoted through an SSH bastion, and dumped an entire internal Postgres database — all in under one hour.
Sysdig documented the attack and called it a turning point. Your SOC still assumes human speed.
A 15‑minute triage window is fine when the adversary is human. It’s useless against an agent that improvises four pivots and exfiltrates in seconds. The asymmetry just flipped — and your SOC isn’t ready.
The Attack — Full Chain
Initial access came through CVE‑2026‑39987, a pre‑authenticated WebSocket RCE in marimo notebooks (all versions before 0.23.0). The agent didn’t need credentials. It just needed the CVE.
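A minimal sketch of checking a host against the fixed release named above. The only fact taken from the text is the 0.23.0 boundary. The helper names are illustrative, and the parser is deliberately naive: it treats a pre-release such as `0.23.0rc1` as equal to `0.23.0`.

```python
import re
from importlib import metadata

# First fixed marimo release according to the text above.
PATCHED = (0, 23, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.22.4' into (0, 22, 4). Suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """True when the version sorts before the patched release."""
    return parse_version(version) < PATCHED


def installed_marimo_is_vulnerable():
    """Check the marimo package in the current environment, if any.

    Returns None when marimo is not installed here.
    """
    try:
        return is_vulnerable(metadata.version("marimo"))
    except metadata.PackageNotFoundError:
        return None
```

Running `installed_marimo_is_vulnerable()` inside each notebook environment gives a quick yes/no. It does not tell you whether the notebook port is reachable from the internet, which is the other half of the exposure.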
The agent scanned the environment, found AWS keys, called `ListSecrets`, then `GetSecretValue` to retrieve an SSH private key from Secrets Manager.
The agent then fired 12 redundant `GetSecretValue` calls across 11 distinct Cloudflare Workers edge locations in a 22‑second burst. With 12 calls spread over 11 IPs, no single source made more than two requests, so traditional per‑source threshold alerts saw nothing.
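One way to catch this pattern is to count by calling identity instead of by source IP. The sketch below is an assumption-laden illustration, not Sysdig's detection: the `ApiEvent` fields, the 30-second window, and the thresholds are all hypothetical and would need tuning against real audit logs.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiEvent:
    # Hypothetical normalized record. In CloudTrail these would roughly map to
    # eventTime, userIdentity.accessKeyId, sourceIPAddress and eventName.
    timestamp: float  # seconds since epoch
    principal: str
    source_ip: str
    action: str


def find_distributed_bursts(events, action="GetSecretValue",
                            window_s=30.0, min_calls=5, min_ips=3):
    """Flag principals whose calls within a sliding window are both frequent
    and spread over many source IPs.

    Returns {principal: (calls_in_window, distinct_ips, trigger_timestamp)}
    for the first moment each principal crossed both thresholds.
    """
    recent = defaultdict(deque)  # principal -> events inside the window
    alerts = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.action != action:
            continue
        window = recent[ev.principal]
        window.append(ev)
        # Drop events that have aged out of the window.
        while ev.timestamp - window[0].timestamp > window_s:
            window.popleft()
        distinct_ips = {e.source_ip for e in window}
        if (len(window) >= min_calls and len(distinct_ips) >= min_ips
                and ev.principal not in alerts):
            alerts[ev.principal] = (len(window), len(distinct_ips), ev.timestamp)
    return alerts
```

Keying on the principal means the 11 edge IPs collapse into one actor, which is exactly what a per-IP rate limit cannot see.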
Using the stolen SSH key, the agent pivoted through an SSH bastion into the internal network.
Once inside, the agent dumped the Postgres schema, targeted a `credential` table, then bundled six unrelated tables (`api_key`, `credential`, `user`, `variable`, `flow`, `message`) into a single heredoc.
From initial CVE to full database exfiltration. No human intervention. No playbook. The agent improvised.
🔍 Improvised dump against an unknown target
The agent enumerated `pg_tables`, immediately targeted a `credential` table (which didn’t exist in the application’s schema), then bundled six unrelated tables. A human with a pre‑written script would have aborted when the schema didn’t match. The agent adapted.
💬 Planning comment in the command stream
The credential‑search block opened with `# 看还能做什么` (“See what else we can do”). The shell that followed was English. The session dispatched blocks from six distinct IPs at sub‑second pace — no human types that way.
⚙️ Every command shaped for machine consumption
`echo '---'` separators between probes, `2>&1 | head -N` to cap output, `-P pager=off` to disable the `psql` pager, `2>/dev/null` on every command to discard stderr. These are not human conveniences. They are parsing hooks for another agent.
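These markers can be turned into a rough scoring heuristic over recorded shell sessions. The sketch below is illustrative only: the regular expressions and the idea of a "machine-shape score" are assumptions, and careful human operators or ordinary automation will trip some of them too.

```python
import re

# Hypothetical marker set derived from the command traits listed above.
MARKERS = {
    "stderr_discarded": re.compile(r"2>\s*/dev/null"),
    "output_capped": re.compile(r"\|\s*head\s+-(?:n\s*)?\d+"),
    "pager_disabled": re.compile(r"pager=off"),
    "echo_separator": re.compile(r"echo\s+['\"][-=#]{2,}['\"]"),
}


def markers_in(command: str) -> set[str]:
    """Names of all markers that match a single command line."""
    return {name for name, pattern in MARKERS.items() if pattern.search(command)}


def machine_shape_score(commands: list[str]) -> float:
    """Fraction of commands that carry at least one marker (0.0 to 1.0)."""
    if not commands:
        return 0.0
    hits = sum(1 for command in commands if markers_in(command))
    return hits / len(commands)
```

A high score on its own proves little. It becomes more useful when combined with timing and source-IP signals from the same session.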
🔗 Value handoffs lifted from prior output
The `PGPASSWORD` came from `cat ~/.pgpass` run seconds earlier. The `SecretId` was picked from the `ListSecrets` response 20 seconds before retrieval. The SSH key path followed an `ls` that confirmed it existed. The agent consumed its own previous output live.
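The handoff pattern can be approximated by checking whether a value first seen in command output reappears in a later command. This is a rough sketch under stated assumptions: the transcript is a list of `(command, output)` pairs, tokens are split on whitespace and a few separators, and the 8-character minimum is arbitrary. It will produce false positives on ordinary scripted work.

```python
import re

# Separators chosen so that a .pgpass line (host:port:db:user:password)
# and simple key=value or JSON-like output break into individual values.
SPLIT = re.compile(r"[\s:=\"',]+")


def find_handoffs(transcript, min_len=8):
    """Return (origin_step, reuse_step, token) for every value that first
    appeared in a step's output and was later used in a command.

    Tokens the operator had already typed in an earlier command are skipped,
    since echoing them back is not a handoff.
    """
    seen = {}            # token -> index of the step whose output revealed it
    typed_so_far = ""    # all command text issued up to the current step
    handoffs = []
    for index, (command, output) in enumerate(transcript):
        for token, origin in seen.items():
            if token in command:
                handoffs.append((origin, index, token))
        typed_so_far += "\n" + command
        for token in SPLIT.split(output):
            if len(token) >= min_len and token not in typed_so_far:
                seen.setdefault(token, index)
    return handoffs
```

Adding timestamps to each step would let you also measure the gap between reveal and reuse, which in this incident was a matter of seconds.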
Why Your SOC Is Blind
- Threshold alerts don’t fire. 12 API calls across 11 IPs in 22 seconds. No IP made more than two requests, so no single source ever trips a rate limit.
- IP‑based rules miss lateral movement. Eight SSH sessions came from six separate IPs simultaneously. Traditional alerting sees nothing unusual.
- Playbooks don’t work. The agent improvised. A scripted operator would have aborted when the schema didn’t match expectations. This agent read the surprise, decided what to try next, and kept going.
- Your 15‑minute triage window is obsolete. The entire attack — from initial CVE to full database exfiltration — took less than an hour. The exfiltration phase alone was under 2 minutes.
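The lateral-movement point above suggests pivoting from source IP to credential: how many distinct IPs hold an open session with the same key at the same moment? A minimal sketch follows, assuming session records with a key fingerprint, source IP, and start and end times. The `SshSession` shape is hypothetical, not a real log schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SshSession:
    # Hypothetical normalized record built from bastion or sshd logs.
    key_fingerprint: str
    source_ip: str
    start: float  # seconds since epoch
    end: float


def peak_concurrent_ips(sessions):
    """For each key, the largest number of distinct source IPs that had a
    session open at the same instant.

    Sessions are treated as half-open intervals [start, end). The peak can
    only occur at the moment some session starts, so only those instants
    are probed. Quadratic per key, which is fine for a sketch.
    """
    by_key = {}
    for session in sessions:
        by_key.setdefault(session.key_fingerprint, []).append(session)

    peaks = {}
    for key, group in by_key.items():
        peak = 0
        for probe in group:
            active_ips = {s.source_ip for s in group
                          if s.start <= probe.start < s.end}
            peak = max(peak, len(active_ips))
        peaks[key] = peak
    return peaks
```

A single private key in simultaneous use from several networks is rare for a human operator, so even a low threshold on this number is a reasonable starting alert.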
“We are not watching AI replace attackers. We are watching attackers replace their scripts with AI.”
— Michael Clark, Sysdig TRT Senior Director
“An agent operator carries general priors about a class of applications and composes the chain live to best fit its target. The bar becomes inference budget, not playbook authorship.”
— Sysdig Threat Research Team
What This Means for You
- ✅ Your detection strategy must shift. Not “what command did they run?” but “what are they accomplishing?” Because next time, the commands will be different.
- ✅ Assume speed no longer protects you. A 15‑minute triage window worked when attackers were human. It’s useless against an agent that pivots in seconds.
- ✅ Test your environment with AI‑driven simulations. If you haven’t run a red team exercise using an agentic workflow, you don’t know how your defenses hold up.
- ✅ Patch marimo to 0.23.0+ immediately. CVE‑2026‑39987 is the entry point. Don’t leave it open.
- ✅ Review your cloud credential hygiene. The agent pulled an SSH key from Secrets Manager because the key existed and the instance had permission to read it. Least privilege would have stopped this.
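As a starting point for the credential-hygiene review, the sketch below scans an IAM policy document for statements that allow `secretsmanager:GetSecretValue` on every secret. It is a simplified illustration: it ignores `NotAction`, `NotResource`, and `Condition` blocks, and the ARNs in any example are placeholders.

```python
import fnmatch

TARGET_ACTION = "secretsmanager:getsecretvalue"  # IAM actions are case-insensitive


def _as_list(value):
    """IAM allows a single string or dict where a list is also valid."""
    return value if isinstance(value, list) else [value]


def overly_broad_secret_reads(policy: dict) -> list[dict]:
    """Return Allow statements that grant GetSecretValue on all secrets.

    A statement is flagged when one of its Action patterns (for example
    'secretsmanager:*' or '*') covers GetSecretValue and one of its
    Resource entries is '*' or ends with ':secret:*'.
    """
    findings = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(statement.get("Action", []))]
        if not any(fnmatch.fnmatchcase(TARGET_ACTION, a) for a in actions):
            continue
        resources = _as_list(statement.get("Resource", []))
        if any(r == "*" or r.endswith(":secret:*") for r in resources):
            findings.append(statement)
    return findings
```

A notebook host's role that comes back with findings here is one that could have handed over the SSH key in this incident. Scoping `Resource` to the specific secret ARNs the workload needs is the usual fix.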
May 10, 2026. An LLM agent conducted a full breach without a single human command. From CVE to exfiltration in under one hour.
Your SOC was designed for human adversaries. That era is over.
The asymmetry just flipped. Defenders need to adapt — now.
Your SOC assumes human speed. AI doesn’t.
Full infrastructure pentest: €3,000. AI‑driven red team: €5,000. Security retainer: €1,500/month.
📩 DM @StackOfTruths on X. Free 15-min consultation. No hard sell. Just honest answers about your SOC readiness.