Skip to content

Security & Privacy

Hack Me If You Can: Designing Unbreakable LLM Guardrails

with Cansu Kavili Örnek

Friday 10 July 11:40 – 12:10 Stage 8 - powered by Red Hat

About This Session

Organizations are moving quickly to add GenAI features to existing applications, but their security practices have not caught up. Traditional controls such as firewalls, authentication, and encryption do not stop prompt injection, data leakage through model outputs, or violations of company policies. These gaps can expose sensitive data, damage brand reputation, or undermine regulatory compliance efforts. This session presents a Guardrails-as-a-Service pattern in which model traffic is intercepted and evaluated by a guardrail orchestrator that chains specialized detectors and policy engines before responses reach end users. It covers a modular deployment model for combining prompt-injection detection, content safety, and custom business rules; observability patterns for measuring block rates, violations, and false positives; and continuous evaluation of defenses with open-source testing tools as threats evolve. Attendees will leave with concrete patterns for deploying, monitoring, and iterating on AI guardrails across multiple workloads in production.

Topics

  • AI Standards
  • Safety