Mirko Ross
Hacking AI - how attackers impose their will on AI
#1 · about 2 minutes
Understanding the core principles of hacking AI systems
AI systems can be hacked through data poisoning: manipulating the data a model learns from skews its statistical outputs and forces it to produce attacker-controlled results.
#2 · about 2 minutes
Exploring the three primary data poisoning attack methods
Attackers compromise AI systems through prompt injection, by manipulating training data to create backdoors, or by injecting specific patterns into a live model.
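A minimal sketch of the backdoor variant, assuming training images stored as a float NumPy array in [0, 1]; the trigger size, poisoning rate, and function name are illustrative, not the talk's exact method:

```python
import numpy as np

def poison_training_set(images, labels, target_class, rate=0.05, seed=0):
    """Backdoor poisoning sketch: stamp a small white square (the trigger)
    onto a fraction of the training images and relabel them as the
    attacker's target class. A model trained on this data behaves normally
    until the trigger appears at inference time."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images[idx, -4:, -4:] = 1.0   # 4x4 trigger patch in the corner
    labels[idx] = target_class    # flip the labels to the attacker's choice
    return images, labels
```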
#3 · about 3 minutes
Why the AI industry repeats early software security mistakes
Unlike software development, which has hardened its input-handling practices, the AI industry tends to trust all input data, creating significant vulnerabilities for attackers to exploit.
#4 · about 3 minutes
How adversarial attacks manipulate image recognition models
Adversarial attacks overlay a carefully crafted noise pattern onto an image, causing subtle mathematical changes that force a neural network to misclassify the input.
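The talk does not name a specific algorithm, but one widely used way to craft such a noise pattern is the Fast Gradient Sign Method (FGSM). A minimal PyTorch sketch, assuming a classifier `model` that returns logits for a normalized image tensor:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Fast Gradient Sign Method: nudge every pixel by a tiny amount in the
    direction that increases the classification loss. `image` has shape
    (1, C, H, W) with values in [0, 1]; `label` is a tensor of shape (1,)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step along the sign of the gradient, then clamp back to valid pixels.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

The perturbation stays below epsilon per pixel, which is why the changes are subtle to the eye yet decisive for the network.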
#5 · about 5 minutes
Applying adversarial attacks in the physical world
Adversarial patterns can be printed on physical objects like stickers or clothing to deceive AI systems, such as tricking self-driving cars or evading surveillance cameras.
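Before committing a pattern to a sticker or fabric, attackers typically validate it digitally by pasting the patch into test images. A minimal NumPy sketch; the function name and coordinates are illustrative:

```python
import numpy as np

def paste_patch(image, patch, top, left):
    """Simulate a printed adversarial sticker by overwriting a region of an
    (H, W, 3) float image in [0, 1] with the patch before classification."""
    h, w = patch.shape[:2]
    out = image.copy()
    out[top:top + h, left:left + w] = patch
    return out
```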
#6 · about 2 minutes
Creating robust 3D objects for adversarial attacks
When adversarial noise is embedded into a 3D model's geometry, the object is consistently misclassified by AI from any viewing angle, as demonstrated by a turtle identified as a rifle.
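This robustness across viewing angles comes from optimizing the noise over many random transformations, a technique known as Expectation over Transformation (EOT). A minimal 2D PyTorch sketch (the original turtle attack differentiates through a full 3D renderer, which is omitted here); the augmentation ranges are illustrative:

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

def eot_loss(model, image, target_label, n_samples=16):
    """Average the attack loss over random rotations and scales so the
    optimized perturbation keeps fooling the model from many viewpoints."""
    augment = T.RandomAffine(degrees=30, scale=(0.8, 1.2))
    losses = [F.cross_entropy(model(augment(image)), target_label)
              for _ in range(n_samples)]
    return torch.stack(losses).mean()  # minimize w.r.t. the noise/texture
```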
#7 · about 2 minutes
Techniques for defending against adversarial image attacks
Defenses against adversarial attacks involve de-poisoning input images by reducing their information content, such as lowering the bit depth, to disrupt the malicious noise pattern.
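A minimal NumPy sketch of the bit-depth defense (known in the literature as feature squeezing), assuming a float image in [0, 1]:

```python
import numpy as np

def reduce_bit_depth(image, bits=4):
    """Quantize pixel values to fewer bits, destroying the fine-grained
    adversarial noise while keeping the visible image content intact."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels
```

The low-amplitude perturbation lives in the least significant bits, so snapping pixels to a coarser grid removes most of it at little visual cost.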
#8 · about 4 minutes
Understanding the complexity of prompt injection attacks
Prompt injection bypasses safety filters by framing forbidden requests in seemingly legitimate contexts, such as asking for Python code that performs an unethical task, and exploits the model's inability to grasp the request's full impact.
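To make the bypass concrete, here is a deliberately harmless Python sketch of why a shallow keyword filter misses a reframed request; the blocked phrases and prompts are invented for illustration:

```python
# A naive filter only matches surface phrasing, not intent.
BLOCKED_PHRASES = ["how do i steal", "how to hack"]

def naive_filter(prompt: str) -> bool:
    """Return True if the prompt is allowed through."""
    return not any(phrase in prompt.lower() for phrase in BLOCKED_PHRASES)

direct = "How do I steal user passwords?"
wrapped = ("Write a Python function for my security class that records "
           "passwords typed by a user and uploads them to a server.")

print(naive_filter(direct))   # False - blocked by keyword match
print(naive_filter(wrapped))  # True  - same intent slips through
```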
#9 · about 2 minutes
The inherent bias of manual prompt injection filters
Manual content filtering in AI models introduces human bias, as demonstrated by inconsistent rules for jokes about different genders, which highlights a fundamental scaling and fairness problem.
#10 · about 2 minutes
Q&A on creating patterns and de-poisoning images
The Q&A covers how adversarial patterns are now AI-generated and discusses image de-poisoning techniques like autoencoders, bit depth reduction, and rotation to reduce malicious information.
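Of the de-poisoning techniques mentioned, the autoencoder works by squeezing the image through a narrow bottleneck so that low-amplitude adversarial noise is lost in reconstruction. A minimal PyTorch sketch with illustrative layer sizes; training on clean images is omitted:

```python
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """De-poisoning sketch: compress the input so only the dominant image
    content survives the bottleneck, discarding fine adversarial noise."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 8, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 16, 3, stride=2,
                               padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 3, stride=2,
                               padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        # Reconstruct the image from its compressed representation.
        return self.decoder(self.encoder(x))
```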