Breaking BOTS: Cheating at Blue Team CTFs with AI Speed-Runs

Day 4, Dec. 30, 2025, 13:50-14:30 · One · Security · English
After we announced our results, CTFs like Splunk's Boss of the SOC (BOTS) started prohibiting AI agents. For science & profit, we keep doing it anyway. In BOTS, the AI agents solve most of the challenge in under 10 minutes instead of taking the full day. Our recipe was surprisingly simple: teach AI agents to self-plan their investigation steps, adapt those plans to new information, work with the SIEM database, and reason about log dumps. No exotic models, no massive lab budgets - just publicly available LLMs mixed with a bit of science and perseverance. We'll walk through how that works, including videos of the many ways AI trips itself up that marketers would rather hide, and show how to do it at home with free and open-source tools. CTF organizers can't detect this - the arms race was probably over before it really began. But the real question isn't "can we cheat at CTFs?" It's what happens when investigations evolve from analysts-who-investigate to analysts-who-manage-AI-investigators. We'll show you what that transition already looks like today and peek into some uncomfortable questions about what comes next.

THE PLAN

Live demonstrations of AI agents speed-running blue team challenges, including the failure modes that break investigations. We'll show what happens when we try the trivial approaches - "just have Claude do it", static AI workflows - and what ultimately worked: managed self-planning, semantic SIEM layers, and log agents. Most of it can be done on the cheap with free and open tools and techniques, so we'll walk through that as well.
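To make "managed self-planning" concrete: the core difference from a static workflow is that the agent keeps a mutable task list, and every piece of returned evidence is allowed to reorder or extend that list. A minimal, runnable sketch of that loop (all names are ours, and the LLM planner and SIEM are stubbed out with canned logic; this is an illustration of the pattern, not the speakers' implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Investigation:
    tasks: list = field(default_factory=list)     # open investigation steps
    findings: list = field(default_factory=list)  # accumulated evidence

def replan(inv: Investigation, evidence: str) -> None:
    """Stand-in for the LLM planner: new evidence may insert new tasks
    at the front of the queue instead of following a fixed script."""
    inv.findings.append(evidence)
    if "powershell" in evidence.lower():
        inv.tasks.insert(0, "pivot: hunt EncodedCommand across all hosts")

def investigate(inv: Investigation, run_query) -> list:
    """Drain the task list; each answer can mutate the plan via replan()."""
    trace = []
    while inv.tasks:
        task = inv.tasks.pop(0)
        evidence = run_query(task)  # a real SIEM call in the full setup
        replan(inv, evidence)
        trace.append((task, evidence))
    return trace

def fake_siem(task: str) -> str:
    """Canned responses so the loop runs without a backend."""
    if task.startswith("triage"):
        return "suspicious powershell.exe spawn on host WS-042"
    return "no further hits"

inv = Investigation(tasks=["triage: review initial alert for WS-042"])
trace = investigate(inv, fake_siem)
```

Here the triage result spawns a pivot task the original plan never contained, which is exactly what a static workflow cannot do.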

THE DEEP DIVE

  • Why normal prompts and static AI workflows fail
  • Self-planning investigation agents that evolve task lists dynamically
  • What we mean by semantic layers for calling databases and APIs
  • How to handle millions of log events without bankrupting yourself
  • Why "no AI" rules are misguided technically and conceptually
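One way to read "semantic layers for calling databases and APIs": instead of letting the model emit raw query text, expose a small set of typed query intents whose parameters are validated before a vetted template is rendered. A hypothetical sketch against Splunk (the function name, index, and field choices are our assumptions, not the speakers' API; event code 4625 is the standard Windows failed-logon event):

```python
def failed_logins(host: str, last_hours: int = 24) -> str:
    """Render one vetted SPL template; the agent fills parameters,
    never free-form query text."""
    if not host.replace("-", "").isalnum():
        raise ValueError(f"rejected host value: {host!r}")
    if not 1 <= last_hours <= 720:
        raise ValueError("time window out of bounds")
    return (
        f'search index=wineventlog EventCode=4625 host="{host}" '
        f"earliest=-{last_hours}h | stats count by user"
    )

spl = failed_logins("WS-042", last_hours=6)
```

The validation step doubles as prompt-injection containment: evidence pulled from logs can nudge the planner, but it cannot smuggle arbitrary query syntax into the SIEM.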
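On the cost side, the obvious lever for "millions of log events" is hierarchical (map-reduce) summarization: summarize fixed-size chunks, then summarize the summaries, so the number of model calls scales roughly with len(events) / chunk_size rather than with one giant context. A sketch with a stubbed summarizer (in practice each call is one LLM request; the structure, not the stub, is the point):

```python
def reduce_logs(items, summarize_chunk, chunk_size=1000):
    """Map-reduce over a log dump: each level compresses up to
    chunk_size items into one summary until a single summary remains."""
    while len(items) > 1:
        items = [summarize_chunk(items[i:i + chunk_size])
                 for i in range(0, len(items), chunk_size)]
    return items[0]

calls = 0
def stub(chunk):
    """Stand-in for the LLM call; counts invocations for the demo."""
    global calls
    calls += 1
    return f"summary of {len(chunk)} items"

events = [f"event {i}" for i in range(2500)]
final = reduce_logs(events, stub, chunk_size=1000)
```

With 2,500 events and chunks of 1,000, this costs four model calls total (three at the first level, one at the second) instead of an impossible single-context pass.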

GOING BEYOND CTFS

The same patterns that trivialize training exercises work on real SOC investigations. We're watching blue team work fundamentally transform - from humans investigating to humans managing AI investigators. Training programs teach skills AI already automates. Hiring practices can't verify who's doing the work. Certifications lose meaning. More fundamentally, when it comes to who watches the watchers, a lot is about to shift again.
