A Quick Stop at the HostileShop
In this talk, I will give you an overview of LLM Agent hacking. I will briefly introduce LLM Agents, their vulnerability surface, and types of attacks. Because everything old is new again, I will draw some parallels to the 90s hacking scene.
I will then present HostileShop, a Python-based LLM Agent security testing tool that was selected as one of the ten prize winners in OpenAI's GPT-OSS-20B RedTeam Contest.
HostileShop creates a fully simulated web shopping environment where an Attacker Agent LLM attempts to manipulate a Target Shopping Agent LLM into performing unauthorized actions that are automatically detected by the framework.
HostileShop is best at discovering prompt injections that induce LLM Agents to make improper "tool calls". In other words, HostileShop finds the magic spells that make LLM Agents call the functions available to them, often with arguments of your choosing.
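To make "improper tool call" concrete, here is a minimal sketch of the kind of check such a framework can perform. The tool names, state fields, and rules below are illustrative assumptions for this abstract, not HostileShop's actual interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class ShopState:
    invoices_paid: set = field(default_factory=set)   # invoice ids the simulated customer has paid
    purchases: list = field(default_factory=list)     # purchases the target agent has made

def on_tool_call(state: ShopState, name: str, args: dict) -> bool:
    """Return True if this tool call is an unauthorized action (an attack success).
    Hypothetical tool names and rules, for illustration only."""
    if name == "make_purchase":
        # Buying against an invoice the customer never paid violates a shop rule,
        # and is detectable from environment state alone.
        return args.get("invoice_id") not in state.invoices_paid
    if name == "send_email":
        # Leaking the (fake) credit card number in any outgoing message also counts.
        return "4111-1111-1111-1111" in str(args)
    return False

# Example: after reading an attacker-crafted product page, the target agent
# calls make_purchase for an invoice it never collected payment on.
state = ShopState(invoices_paid={"inv-001"})
print(on_tool_call(state, "make_purchase", {"invoice_id": "inv-999"}))  # True -> attack detected
```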
HostileShop is also capable of enhancing and mutating "universal" jailbreaks. This allows cross-LLM adaptation of universal jailbreaks powerful enough to place the target LLM fully under your control, for arbitrary actions. It also lets public jailbreaks that have been only partially blocked work again, at least until they are more comprehensively addressed.
For 1990s hacking vibes, HostileShop has a text-based chat interface that lets you chat with the attacker agent, or become the attacker yourself.
For 2025 contrarian vibes (in some circles), HostileShop was vibe-coded without an ounce of shame. Basically, I used LLMs to write a framework to have LLMs attack other LLMs. Crucially, however, for reasons that I will explain in the talk, HostileShop does not use an LLM to judge attack success. Instead, success is determined automatically and immediately by the framework.
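As a rough illustration of that judge-free loop, here is a minimal sketch under assumed interfaces. The attacker_llm, target_llm, and shop_env objects and their methods are hypothetical placeholders, not HostileShop's real classes.

```python
# Hypothetical driver loop: the attacker model writes an injection, the target model
# acts on it, and the environment (not another LLM) decides whether the attack landed.
def run_episode(attacker_llm, target_llm, shop_env, max_turns=20):
    for _ in range(max_turns):
        payload = attacker_llm.next_attack(shop_env.public_view())  # e.g. a poisoned review or page
        shop_env.plant(payload)                                     # injection lands in content the target will read
        for call in target_llm.act(shop_env.target_view()):         # target browses and emits tool calls
            if shop_env.is_violation(call):                         # deterministic check on shop state
                return call                                         # immediate, judge-free success signal
    return None
```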
At the time of this writing, HostileShop has working attack examples for the entire LLM frontier, though things move very fast in this arena.
There is a chance that by the time of the conference, all of the attacks that HostileShop is capable of discovering will have been fixed. In that case, the talk will focus on the current state of LLM security, the future of private Agentic AI, and either some thoughts on how to avoid complete dystopia, or amusing rants about the dystopia that has already arrived.