
Data Privacy in LLMs: Challenges and Best Practices
Aditi Godbole
Sep 25, 2024

#1 · about 2 minutes
Understanding the core capabilities of large language models
Large language models are AI systems trained on vast amounts of text data that can understand context, generate human-like text, and perform a wide range of language tasks.
#2 · about 4 minutes
Applying core data privacy principles to AI models
Foundational data privacy principles like data minimization, purpose limitation, and consent are crucial for responsible AI development but challenging to apply to LLMs.
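Data minimization can be applied before a prompt ever reaches a model. As a minimal sketch (the patterns, labels, and `minimize` helper below are illustrative assumptions, not from the talk), common PII formats can be redacted client-side:

```python
import re

# Hypothetical pre-processing sketch: redact common PII patterns from a
# prompt before sending it to an LLM API. Real deployments would use a
# dedicated PII-detection service, not a handful of regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def minimize(prompt: str) -> str:
    """Replace each matched PII span with a placeholder label."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt
```

For example, `minimize("Mail jane@example.com, SSN 123-45-6789")` yields `"Mail [EMAIL], SSN [SSN]"`, so the model never sees the raw identifiers.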
#3 · about 3 minutes
Identifying unique privacy risks inherent to LLMs
LLMs introduce specific privacy risks including memorization of sensitive data, re-identification of anonymized users, and unintended information disclosure.
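Memorization is often probed by checking whether a model's completion reproduces long verbatim spans of a known training record. A minimal sketch of that check (the `looks_memorized` helper and its threshold are assumptions for illustration; `completion` stands in for real model output):

```python
def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest substring shared by a and b (O(len(a) * len(b)) DP)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def looks_memorized(completion: str, training_record: str, threshold: int = 30) -> bool:
    """Heuristic: a long verbatim overlap suggests the record was memorized.

    The threshold is an illustrative assumption; real extraction studies
    use more careful statistical tests.
    """
    return longest_common_substring(completion, training_record) >= threshold
```

A completion that echoes 30+ consecutive characters of a private record would be flagged, while paraphrased or unrelated text would not.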
#4 · about 3 minutes
Examining real-world incidents of LLM data exposure
Incidents involving GPT-2, GitHub Copilot, and ChatGPT highlight concrete examples of how LLMs can expose sensitive, copyrighted, or private user data.
#5 · about 4 minutes
Exploring solutions to mitigate data privacy risks
Technical approaches like differential privacy and federated learning, together with regulatory compliance under frameworks like the GDPR, help address LLM privacy challenges.
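Differential privacy works by adding calibrated noise to query results so that no single individual's record measurably changes the output. A toy sketch of the classic Laplace mechanism for a count query (a standard technique, but the function names here are illustrative; the talk itself does not specify an implementation):

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via the inverse-CDF method."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def dp_count(values, predicate, epsilon: float) -> float:
    """Epsilon-DP count query.

    A count has sensitivity 1 (one person changes it by at most 1),
    so Laplace noise with scale 1/epsilon suffices. Smaller epsilon
    means more noise and stronger privacy.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)
```

With a large epsilon the noisy count stays close to the true count; shrinking epsilon trades accuracy for privacy. Training-time variants (e.g. DP-SGD) apply the same idea to gradient updates rather than query results.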
#6 · about 3 minutes
Implementing best practices for trustworthy AI systems
Adopting best practices such as privacy by design, clear data governance, regular audits, and user consent builds more trustworthy and responsible AI systems.
#7 · about 3 minutes
Looking ahead at the future of AI privacy
The future of AI privacy involves advanced techniques like homomorphic encryption, new regulations like the EU AI Act, and a continued focus on responsible development.