Mirko Ross
Hacking AI - how attackers impose their will on AI
#1about 2 minutes
Understanding the core principles of hacking AI systems
AI systems can be hacked by manipulating their statistical outputs through data poisoning to force the model to produce attacker-controlled results.
#2about 2 minutes
Exploring the three primary data poisoning attack methods
Attackers compromise AI systems through prompt injection, manipulating training data to create backdoors, or injecting specific patterns into a live model.
#3about 3 minutes
Why the AI industry repeats early software security mistakes
The AI industry's tendency to trust all input data, unlike the hardened practices of software development, creates significant vulnerabilities for attackers to exploit.
#4about 3 minutes
How adversarial attacks manipulate image recognition models
Adversarial attacks overlay a carefully crafted noise pattern onto an image, causing subtle mathematical changes that force a neural network to misclassify the input.
#5about 5 minutes
Applying adversarial attacks in the physical world
Adversarial patterns can be printed on physical objects like stickers or clothing to deceive AI systems, such as tricking self-driving cars or evading surveillance cameras.
#6about 2 minutes
Creating robust 3D objects for adversarial attacks
By embedding adversarial noise into a 3D model's geometry, an object can be consistently misclassified by AI from any viewing angle, as shown by a turtle identified as a rifle.
#7about 2 minutes
Techniques for defending against adversarial image attacks
Defenses against adversarial attacks involve de-poisoning input images by reducing their information level, such as lowering bit depth, to disrupt the malicious noise pattern.
#8about 4 minutes
Understanding the complexity of prompt injection attacks
Prompt injection bypasses safety filters by framing forbidden requests in complex contexts, such as asking for Python code to perform an unethical task, exploiting the model's inability to grasp the full impact.
#9about 2 minutes
The inherent bias of manual prompt injection filters
Manual content filtering in AI models introduces human bias, as demonstrated by inconsistent rules for jokes about different genders, which highlights a fundamental scaling and fairness problem.
#10about 2 minutes
Q&A on creating patterns and de-poisoning images
The Q&A covers how adversarial patterns are now AI-generated and discusses image de-poisoning techniques like autoencoders, bit depth reduction, and rotation to reduce malicious information.
Related jobs
Jobs that call for the skills explored in this talk.
Wilken GmbH
Ulm, Germany
Senior
Kubernetes
AI Frameworks
+3
Matching moments
03:13 MIN
How AI can create more human moments in HR
The Future of HR Lies in AND – Not in OR
03:28 MIN
Shifting from talent acquisition to talent architecture
The Future of HR Lies in AND – Not in OR
04:22 MIN
Navigating ambiguity as a core HR competency
The Future of HR Lies in AND – Not in OR
06:51 MIN
Balancing business, technology, and people for holistic success
The Future of HR Lies in AND – Not in OR
06:04 MIN
The importance of a fighting spirit to avoid complacency
The Future of HR Lies in AND – Not in OR
06:59 MIN
Moving from 'or' to 'and' thinking in HR strategy
The Future of HR Lies in AND – Not in OR
06:10 MIN
Understanding global differences in work culture and motivation
The Future of HR Lies in AND – Not in OR
05:10 MIN
How the HR function has evolved over three decades
The Future of HR Lies in AND – Not in OR
Featured Partners
Related Videos
A hundred ways to wreck your AI - the (in)security of machine learning systems
Balázs Kiss
The AI Security Survival Guide: Practical Advice for Stressed-Out Developers
Mackenzie Jackson
Machine Learning: Promising, but Perilous
Nura Kawa
Skynet wants your Passwords! The Role of AI in Automating Social Engineering
Wolfgang Ettlinger & Alexander Hurbean
Confuse, Obfuscate, Disrupt: Using Adversarial Techniques for Better AI and True Anonymity
David vonThenen
Manipulating The Machine: Prompt Injections And Counter Measures
Georg Dresler
Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails
Alex Soto
Prompt Injection, Poisoning & More: The Dark Side of LLMs
Keno Dreßel
Related Articles
View all articles



From learning to earning
Jobs that call for the skills explored in this talk.

Forschungszentrum Jülich GmbH
Jülich, Germany
Intermediate
Senior
Linux
Docker
AI Frameworks
Machine Learning

autonomous-teaming
Berlin, Germany
Remote
ETL
NoSQL
NumPy
Python
+3

autonomous-teaming
München, Germany
Remote
ETL
NoSQL
NumPy
Python
+3

engineering people GmbH
Hannover, Germany



Imec
Azure
Python
PyTorch
TensorFlow
Computer Vision
+1


Agenda GmbH
Rosenheim, Germany
Intermediate
API
Azure
Python
Docker
PyTorch
+9