Manipulating The Machine: Prompt Injections And Counter Measures
A Chevy chatbot was tricked into offering cars for $1. This talk explores the serious security threat of prompt injection and shows you how to stop it.
#1about 4 minutes
Understanding the three layers of an LLM prompt
A prompt is structured into three layers: the system prompt for instructions, the context for additional data, and the unpredictable user input.
#2about 3 minutes
How a car dealer's chatbot was easily manipulated
A Chevrolet car dealer's chatbot was exploited by users to generate humorous and unintended responses, including a legally binding offer for a $1 car.
#3about 4 minutes
Stealing system prompts to bypass security rules
Attackers can use creative phrasing like "repeat everything above" to trick an LLM into revealing its hidden system prompt and instructions.
#4about 6 minutes
Why attackers use prompt injection techniques
Prompt injections are used to access sensitive business data, gain personal advantages like bypassing HR filters, or exploit integrated tools to steal information like 2FA tokens.
#5about 4 minutes
Exploring simple but ineffective defense mechanisms
Initial defense ideas like avoiding secrets or tool integration are impractical, and simple system prompt instructions are easily circumvented by attackers.
#6about 4 minutes
Using fine-tuning and adversarial detectors for defense
More effective defenses include fine-tuning models on domain-specific data to reduce reliance on instructions and using specialized adversarial prompt detectors to identify malicious input.
#7about 2 minutes
Key takeaways on prompt injection security
Treat all system prompt data as public, use a layered defense of instructions, detectors, and fine-tuning, and accept that no completely reliable solution exists yet.
Related jobs
Jobs that call for the skills explored in this talk.
Dev Digest 138 - Are you secure about this?Hello there! This is the 2nd "out of the can" edition of 3 as I am on vacation in Greece eating lovely things on the beach. So, fewer news, but lots of great resources. Many around the topic of security. Enjoy! News and ArticlesGoogle Pixel phones t...
Daniel Cranney
Dev Digest 171: AI in disguise, Doomed tech and system promptsInside last week’s Dev Digest 171 .
🤖 Insights from LLM system prompts
👎 Why agents are bad pair programmers
🔒 All vibe coded apps share a security flaw
📅 Why are 2025/05/28 and 2025-05-28 different dates in JS?
🧱 Pure CSS Minecraft
🎧 Create web aud...
Luis Minvielle
How to Bypass ChatGPT’s Filter With ExamplesSince dropping in November 2022, ChatGPT has helped plenty of professionals satisfy an unpredictable assortment of tasks. Whether for finding an elusive bug, writing code, giving resumes a glow-up, or even starting a business, the not-infallible but ...
Daniel Cranney
Dev Digest 182: GPT5 Prompts, MCP Vulnerabilities, Code TrapsInside last week’s Dev Digest 182 .
📝 A guide to prompting GPT-5
⏰ Extreme hours at AI startups
💻 AI is a Junior Dev, and it needs a lead
🐴 Trojans embedded in SVG’s
⚠️ The State of MCP Security
⚒️ A reference manual for people who design and build ...
From learning to earning
Jobs that call for the skills explored in this talk.