Security & Privacy

Hack Me, Bro: An Antifragile AI Battle Arena

with Andrew Demczuk

Friday 10 July 16:45 – 18:45 Room M4 (40 Seats)

About This Session

I run an AI Battle Arena at arena.angel-serv.com. You plug your CLI in (Claude Code, Codex, Gemini, openclaw) and fight 16 bots on a world leaderboard. We want you to try to break the server. When someone does, a nightly automated review picks the exploit apart and ships a fix.

The fix usually isn't local. In March, a prompt-injection chain pushed us to refactor the whole validator suite toward capability-based sandboxing. That shrank the attack surface for later zero-days the original attacker never saw. That's the bit that earns Taleb's word, antifragile.

The arena is also the substrate of a 12-month longitudinal study with JKU LIT AI Lab, TU Wien, and the Austrian Institute of Technology on adversarial behaviour in multi-agent systems, prepared for AI Factory Austria. Five hypotheses are pre-registered on the Open Science Framework. So far: 434 registered bots and 500+ completed rounds.

On stage: a live arena round in which a volunteer plugs their CLI in, the security loop trips, and you watch the diff land on main. Plus three exploit case studies and the commits the system shipped in response. Honest failure analysis, not a demo reel.

Try to break it at https://arena.angel-serv.com/ before you decide whether to accept this talk.
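
For readers new to the term, the idea behind capability-based sandboxing can be sketched in a few lines of Python. This is an illustration only, not the arena's code: `Capabilities`, `run_validator`, and the action names `read_submission` and `write_leaderboard` are all hypothetical. An in-process check like this shows the shape of the idea, not a real security boundary. Each validator receives an explicit set of permitted actions and anything outside that set fails, so a hijacked validator cannot reach beyond what it was handed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


class CapabilityError(PermissionError):
    """Raised when code asks for an action it was never granted."""


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability token: an explicit, immutable set of allowed actions."""
    allowed: FrozenSet[str]

    def require(self, action: str) -> None:
        if action not in self.allowed:
            raise CapabilityError(f"capability not granted: {action}")


def read_submission(caps: Capabilities, submission: str) -> str:
    # Reading the submitted text needs the "read_submission" capability.
    caps.require("read_submission")
    return submission


def write_leaderboard(caps: Capabilities, board: Dict[str, int], bot: str, score: int) -> None:
    # Changing scores needs a separate, rarely granted capability.
    caps.require("write_leaderboard")
    board[bot] = score


def length_validator(caps: Capabilities, submission: str) -> bool:
    # A well-behaved validator: it only reads the submission.
    return len(read_submission(caps, submission)) <= 280


def compromised_validator(caps: Capabilities, submission: str) -> bool:
    # Stands in for a validator hijacked by an injected prompt:
    # it tries to write a score it has no business touching.
    write_leaderboard(caps, {}, "attacker", 9999)
    return True


def run_validator(
    validator: Callable[[Capabilities, str], bool],
    submission: str,
    granted: Iterable[str],
) -> bool:
    # The caller decides what each validator may do; nothing is ambient.
    return validator(Capabilities(frozenset(granted)), submission)


if __name__ == "__main__":
    print(run_validator(length_validator, "hello", {"read_submission"}))
    try:
        run_validator(compromised_validator, "hello", {"read_submission"})
    except CapabilityError as exc:
        print("blocked:", exc)
```

The design point is that the denial does not depend on recognising the specific attack: the compromised validator fails because it was never granted write access, which is why a fix of this kind can also cover exploits nobody has seen yet.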

Topics

  • AI Coding Assistants
  • AI Models
  • Anthropic
  • APIs
  • Agents
  • Agentic AI
  • Automation
  • Autonomous Systems
  • Case Study
  • Claude
  • Cloud Security
  • Code Generation
  • Code Reviews
  • Concurrency
  • DevSecOps
  • Fine-Tuning
  • Gaming
  • Graphics
  • Grok
  • Infosec
  • Large Language Models (LLMs)
  • LLMOps
  • Low Code/No Code
  • Multi-Agent Systems
  • Prompt Engineering
  • Prototyping
  • Python
  • Quality Assurance (QA)
  • Simulators
  • Team Building
  • Threat Modelling
  • Tracing
  • Workflows
  • Workflow Automation
  • Zero Trust