
AI Security
Description
The evolving world of artificial intelligence (AI) brings both opportunities and risks. To protect assets, organizations must understand how to secure their AI systems. This in-depth course delves into the AI security landscape, addressing vulnerabilities like prompt injection, denial of service attacks, model theft, and more. Learn how attackers exploit these weaknesses and gain hands-on experience with proven defense strategies and security APIs.
Discover how to securely integrate LLMs into your applications, safeguard training data, build robust AI infrastructure, and ensure effective human-AI interaction. By the end of this course, you'll be equipped to protect your organization's AI assets and maintain the integrity of your systems.
Participants attending this course will
- Gain a comprehensive understanding of AI technologies and the unique security risks they pose
- Learn to identify and mitigate common AI vulnerabilities
- Gain practical skills in securely integrating LLMs into applications
- Understand the principles of responsible, reliable, and explainable AI
- Familiarize themselves with security best practices for AI systems
- Stay updated with the evolving threat landscape in AI security
- Engage in hands-on exercises that simulate real-world scenarios
Outline
- Introduction to AI Security
- Types of AI Systems and Their Vulnerabilities
- Understanding and Countering AI-specific Attacks
- Ethical and Reliable AI
- Prompt Injection
- Model Jailbreaks and Extraction Techniques
- Visual Prompt Injection
- Denial of Service Attacks
- Secure LLM Integration
- Training Data Manipulation
- Human-AI Interaction
- Secure AI Infrastructure
Course information
Preparedness
AI Fundamentals, Security Fundamentals, Software Development
Exercises
Hands-on
Delivery methods
Onsite / Virtual classroom
Table of contents
- Day 1
- Day 2
- Day 3
- Introduction to AI Security
- What is AI Security?
- Defining AI
- Defining Security
- AI Security scope
- Beyond this course
- Different types of AI systems
- Neural & deep neural networks
- Machine learning (ML)
- Large language models (LLM)
- Types of models
- Multi-modal models
- From Prompts to Hacks
- Use-cases of AI systems
- Attacking Predictive AI systems
- Examples for attacking PredAI systems
- Attacking Generative AI systems
- Interacting with AI systems
- What does "Secure AI" mean?
- Responsible AI
- Reliable, trustworthy AI
- Explainable AI
- A word on alignment
- What about inappropriate content?
- Exercise: OpenAI Moderation endpoint - We like to keep it cultural
- Text moderation
- Image moderation
- Image and text moderation
- To censor or not to censor
- Exercise: Using an uncensored model
- Using an uncensored model
- Using AI for malicious intents
- Deepfake scam earns $25M
- You would never believe, until you do
- Behind deepfake technology
- Voice cloning for the masses
- Imagine yourself in their shoes
- Technological dissipation
- Social engineering on steroids
- Leveling the playing field
- Profitability from the masses
- Shaking the fundamentals of reality
- What is real?
- Fake or real?
- Donald Trump arrested
- Pentagon explosion shakes the US stock market
- Eiffel Tower on fire
- Durable content credentials
- Content Credentials in action
- Content authenticity
- Exercise: Image watermarking
- Real or fake?
- Decide whether it is AI generated or not
- The AI Security landscape
- Attack surface of an AI system
- Components of an AI system
- AI systems and model lifecycle
- Supply chain is more important than ever
- Models and APIs
- Non-AI attacks are here to stay
- OWASP Top 10 and AI
- About OWASP and its Top 10 lists
- OWASP ML Top 10 - Model Manipulation
- OWASP ML Top 10 - Others
- OWASP LLM Top 10 - Injection
- OWASP LLM Top 10 - Others
- Beyond OWASP Top 10
- Threat modeling an LLM-integrated application
- Threat model
- Threat modeling: Workshop
- Threat modeling: AI assistant
- Threat modeling: Diagram
- Threat modeling: STRIDE
- Threat modeling: Trust boundaries
- Threat modeling: TB01
- Threat modeling: Threats & Mitigations
- Exercise: Threat modeling an LLM-integrated application
- Meet TicketAI, a ticketing system
- TicketAI's data flow diagram
- Find potential threats
- Attacks on AI systems
- Red Teaming
- What is Red Teaming?
- Security and safety
- Attacks on AI systems - Prompt injection
- Prompt injection
- Impact
- Examples
- Indirect prompt injection
- From prompt injection to phishing
- Advanced techniques - SudoLang: pseudocode for LLMs
- Introducing SudoLang
- SudoLang examples
- Behind the tech
- A SudoLang program
- Integrating an LLM
- Integrating an LLM with SudoLang
- Exercise: Translate a prompt to SudoLang
- A long prompt
- A different solution
- Exercise: Red Teaming - Prompt injection (Levels 1-2)
- Get the password!
- Classic injection defense
- Levels 1-2
- Solutions for levels 1-2
- Attacks on AI systems - Model jailbreaks
- What's a model jailbreak?
- How do jailbreaks work?
- Jailbreaking ChatGPT
- The most famous ChatGPT jailbreak
- The DAN 6.0 prompt
- AutoDAN - Generating Stealthy Jailbreak Prompts
- Exercise: Red Teaming - Jailbreaking (Levels 3-5)
- Get the password!
- Levels 3-5
- Looking for external help
- Tree of Attacks with Pruning (TAP)
- Tree of Attacks explained
- TAP results
- Attacks on AI systems - Prompt extraction
- Prompt extraction
- Exercise: Red Teaming - Prompt extraction (Levels 6-7)
- Get the password!
- Level 6
- Level 7
- Extract the boundaries of levels 6 and 7
- Defending AI systems - Prompt injection defenses
- Intermediate techniques
- Advanced techniques
- More Security APIs
- ReBuff example
- Llama Guard example I
- Llama Guard example II
- Lakera example
- Attempts against a similar exercise
- Gandalf from Lakera
- Types of Gandalf exploits
- Exercise: Red Teaming - Give it your best shot!
- Get the password!
- Level 8
- Level 9
- Other injection methods
- Attack categories
- Reverse Psychology
- Exercise: Reverse Psychology
- Write an exploit with the ChatbotUI
- A possible solution
- Other protection methods
- Protection categories
- A different categorization
- Bergeron method
- Sensitive Information Disclosure
- Relevance
- Best practices
- Visual Prompt Injection
- Attack types
- Visuals with threats
- Trivial examples
- Adversarial attacks
- Tricking self-driving cars
- How to fool a Tesla
- This is just the beginning
- Protection against adversarial attacks
- Exercise: Image recognition with OpenAI
- Painting with (invisible) words
- Invisible instructions
- Exercise: Adversarial attack
- Untargeted attack with FGSM
- Protection methods
- Protection methods 1 / 2
- Protection methods 2 / 2
- Denial of Service
- Denial of Service on AI systems
- Attack scenarios
- Prompt routing challenges
- Attacks
- Protections
- Exercise: Denial of Service
- Halting Model Responses
- A possible solution
- Model theft
- Know your enemy
- Risks
- Attack types
- Training or fine-tuning a new model
- Dataset exploration
- Exercise: Query-based model stealing
- OpenAI API parameters
- How to steal a model
- Protection against model theft
- Simple protections
- Advanced protections
- LLM integration
- The LLM trust boundary
- An LLM is a system just like any other
- It's not like any other system
- Classical problems in novel integrations
- Treating LLM output as user input
- Typical exchange formats
- Applying common best practices
- Exercise: SQL Injection via an LLM
- SQL Injection via an LLM
- FreshCart Introduction
- SQL Injection via an LLM
- Exercise: Generating XSS payloads
- XSS attack on Fresh Cart
- LLM interaction with other systems
- Typical integration patterns
- Function calling dangers
- The rise of custom GPTs
- Security considerations
- Identity and authorization across applications
- Best practices for secure integration
- Exercise: Function calling through OpenAI API
- Function calling
- Function calling
- Function calling
- Exercise: Privilege escalation via prompt injection
- Privilege escalation
- Principles of security and secure coding
- Matt Bishop's principles of robust programming
- Matt Bishop's principles of robust programming - I
- Matt Bishop's principles of robust programming - II
- The security principles of Saltzer and Schroeder
- The security principles of Saltzer and Schroeder - I
- The security principles of Saltzer and Schroeder - II
- The security principles of Saltzer and Schroeder - III
- The security principles of Saltzer and Schroeder - IV
- Racking up privileges
- The case for a very capable model
- Exploiting excessive privileges
- Separation of privileges
- A model can't be cut in half
- Designing your model privileges
- A customer support bot going wild
- A customer support bot going wild
- Best practices in practice
- Input validation
- Output encoding
- Use frameworks
- Training data manipulation
- What you train on matters
- What data are models trained on?
- Model assurances
- Model and dataset cards
- Exercise: Verifying model cards
- The content of a model card
- A malicious model
- A malicious model
- A malicious model - mitigation
- Verifying datasets
- Getting clear on objectives
- Dataset providers
- A glance at the dataset card
- Analyzing a dataset
- Exercise: Analyzing datasets
- Content of a dataset card
- Use Great Expectations to analyze a dataset
- A secure supply chain
- Proving model integrity is hard
- Cryptographic solutions are emerging
- Cryptographic solutions are emerging
- Hardware-assisted attestation
- Human-AI interaction
- Relying too much on LLM output
- What could go wrong?
- Countering hallucinations
- Verifying the verifiable
- Referencing what's possible 1/2
- Referencing what's possible 2/2
- Clear communication is key
- Secure AI infrastructure
- Requirements of a secure AI infrastructure
- Core Requirements
- Data Security
- Privacy and human intervention
- Privacy and the Samsung data leak
- The Samsung data leak
- OpenAI Evaluation
- Analyzing model accuracy and efficiency
- Getting datasets for evaluation
- Evaluation
- LangSmith
- What is LangSmith?
- Admin
- Evaluation Workflow 1 / 2
- Evaluation Workflow 2 / 2
- Tracing 1 / 2
- Tracing 2 / 2
- Exercise: LangSmith
- Tracing with LangSmith
- BlindLlama
- BlindLlama