• CL-AISEC
  • 3 days
  • AI

Description

The evolving world of artificial intelligence (AI) brings both opportunities and risks. To protect their assets, organizations must understand how to secure their AI systems. This in-depth course covers the AI security landscape, addressing vulnerabilities such as prompt injection, denial-of-service attacks, and model theft. Learn how attackers exploit these weaknesses and gain hands-on experience with proven defense strategies and security APIs.

Discover how to securely integrate LLMs into your applications, safeguard training data, build robust AI infrastructure, and ensure effective human-AI interaction. By the end of this course, you'll be equipped to protect your organization's AI assets and maintain the integrity of your systems.

Participants attending this course will

  • Gain a comprehensive understanding of AI technologies and the unique security risks they pose
  • Learn to identify and mitigate common AI vulnerabilities
  • Gain practical skills in securely integrating LLMs into applications
  • Understand the principles of responsible, reliable, and explainable AI
  • Familiarize themselves with security best practices for AI systems
  • Stay up to date with the evolving threat landscape in AI security
  • Engage in hands-on exercises that simulate real-world scenarios

Outline

  • Introduction to AI Security
  • Types of AI Systems and Their Vulnerabilities
  • Understanding and Countering AI-specific Attacks
  • Ethical and Reliable AI
  • Prompt Injection
  • Model Jailbreaks and Extraction Techniques
  • Visual Prompt Injection
  • Denial of Service Attacks
  • Secure LLM Integration
  • Training Data Manipulation
  • Human-AI Interaction
  • Secure AI Infrastructure

Course information

Prerequisites

AI fundamentals, security fundamentals, software development

Exercises

Hands-on

Delivery methods

Onsite / Virtual classroom

Table of contents

  • Introduction to AI security
    • What is AI Security?
      • Defining AI
      • Defining Security
      • AI Security scope
      • Beyond this course
    • Different types of AI systems
      • Neural & deep neural networks
      • Machine learning (ML)
      • Large language models (LLM)
      • Types of models
      • Multi-modal models
    • From Prompts to Hacks
      • Use cases of AI systems
      • Attacking Predictive AI systems
      • Examples for attacking PredAI systems
      • Attacking Generative AI systems
      • Interacting with AI systems
    • What does "Secure AI" mean?
      • Responsible AI
      • Reliable, trustworthy AI
      • Explainable AI
      • A word on alignment
      • What about inappropriate content?
    • Exercise: OpenAI Moderation endpoint - We like to keep it cultural
      • Text moderation
      • Image moderation
      • Image and text moderation
      • To censor or not to censor
    • Exercise: Using an uncensored model
      • Using an uncensored model
  • Using AI for malicious intents
    • Deepfake scam earns $25M
      • You would never believe, until you do
      • Behind deepfake technology
    • Voice cloning for the masses
      • Imagine yourself in their shoes
      • Technological dissipation
    • Social engineering on steroids
      • Leveling the playing field
      • Profitability from the masses
    • Shaking the fundamentals of reality
      • What is real?
      • Fake or real?
      • Donald Trump arrested
      • Pentagon explosion shakes the US stock market
      • Eiffel Tower on fire
      • Durable content credentials
      • Content Credentials in action
      • Content authenticity
    • Exercise: Image watermarking
      • Real or fake?
      • Decide whether it is AI generated or not
  • The AI Security landscape
    • Attack surface of an AI system
      • Components of an AI system
      • AI systems and model lifecycle
      • Supply chain is more important than ever
      • Models and APIs
      • Non-AI attacks are here to stay
    • OWASP Top 10 and AI
      • About OWASP and its Top 10 lists
      • OWASP ML Top 10 - Model Manipulation
      • OWASP ML Top 10 - Others
      • OWASP LLM Top 10 - Injection
      • OWASP LLM Top 10 - Others
      • Beyond OWASP Top 10
    • Threat modeling an LLM integrated application
      • Threat model
      • Threat modeling: Workshop
      • Threat modeling: AI assistant
      • Threat modeling: Diagram
      • Threat modeling: STRIDE
      • Threat modeling: Trust boundaries
      • Threat modeling: TB01
      • Threat modeling: Threats & Mitigations
    • Exercise: Threat modeling an LLM integrated application
      • Meet TicketAI, a ticketing system
      • TicketAI's data flow diagram
      • Find potential threats
  • Attacks on AI systems
    • Red Teaming
      • What is Red Teaming?
      • Security and safety
    • Attacks on AI systems - Prompt injection
      • Prompt injection
      • Impact
      • Examples
      • Indirect prompt injection
      • From prompt injection to phishing
    • Advanced techniques - SudoLang: pseudocode for LLMs
      • Introducing SudoLang
      • SudoLang examples
      • Behind the tech
      • A SudoLang program
      • Integrating an LLM
      • Integrating an LLM with SudoLang
    • Exercise: Translate a prompt to SudoLang
      • A long prompt
      • A different solution
    • Exercise: Red Teaming - Prompt injection (Levels 1-2)
      • Get the password!
      • Classic injection defense
      • Levels 1-2
      • Solutions for levels 1-2
  • Attacks on AI systems
    • Attacks on AI systems - Model jailbreaks
      • What's a model jailbreak?
      • How do jailbreaks work?
    • Jailbreaking ChatGPT
      • The most famous ChatGPT jailbreak
      • The 6.0 DAN prompt
      • AutoDAN - Generating Stealthy Jailbreak Prompts
    • Exercise: Red Teaming - Jailbreaking (Levels 3-5)
      • Get the password!
      • Levels 3-5
      • Looking for external help
    • Tree of Attacks with Pruning (TAP)
      • Tree of Attacks explained
      • TAP results
    • Attacks on AI systems - Prompt extraction
      • Prompt extraction
    • Exercise: Red Teaming - Prompt extraction (Levels 6-7)
      • Get the password!
      • Level 6
      • Level 7
      • Extract the boundaries of levels 6 and 7
    • Defending AI systems - Prompt injection defenses
      • Intermediate techniques
      • Advanced techniques
      • More Security APIs
      • ReBuff example
      • Llama Guard example I
      • Llama Guard example II
      • Lakera example
    • Attempts against a similar exercise
      • Gandalf from Lakera
      • Types of Gandalf exploits
    • Exercise: Red Teaming - Give it your best shot!
      • Get the password!
      • Level 8
      • Level 9
    • Other injection methods
      • Attack categories
      • Reverse Psychology
    • Exercise: Reverse Psychology
      • Write an exploit with the ChatbotUI
      • A possible solution
    • Other protection methods
      • Protection categories
      • A different categorization
      • Bergeron method
    • Sensitive Information Disclosure
      • Relevance
      • Best practices
  • Visual Prompt Injection
    • Attack types
      • Visuals with threats
      • Trivial examples
      • Adversarial attacks
    • Tricking self-driving cars
      • How to fool a Tesla
      • This is just the beginning
      • Protection against adversarial attacks
    • Exercise: Image recognition with OpenAI
      • Painting with (invisible) words
      • Invisible instructions
    • Exercise: Adversarial attack
      • Untargeted attack with FGSM
    • Protection methods
      • Protection methods 1 / 2
      • Protection methods 2 / 2
  • Denial of Service
    • Denial of Service on AI systems
      • Attack scenarios
    • Prompt routing challenges
      • Attacks
      • Protections
    • Exercise: Denial of Service
      • Halting Model Responses
      • A possible solution
  • Model theft
    • Know your enemy
      • Risks
                                                                                                                                                                                                                                                                                            • Attack types
                                                                                                                                                                                                                                                                                              • Training or fine-tuning a new model
                                                                                                                                                                                                                                                                                                • Dataset exploration
                                                                                                                                                                                                                                                                                                • Exercise: Query-based model stealing
                                                                                                                                                                                                                                                                                                  • OpenAI API parameters
                                                                                                                                                                                                                                                                                                    • How to steal a model
                                                                                                                                                                                                                                                                                                    • Protection against model theft
                                                                                                                                                                                                                                                                                                      • Simple protections
                                                                                                                                                                                                                                                                                                        • Advanced protections
                                                                                                                                                                                                                                                                                                        • LLM integration
                                                                                                                                                                                                                                                                                                          • The LLM trust boundary
                                                                                                                                                                                                                                                                                                            • An LLM is a system just like any other
                                                                                                                                                                                                                                                                                                              • It's not like any other system
                                                                                                                                                                                                                                                                                                                • Classical problems in novel integrations
                                                                                                                                                                                                                                                                                                                  • Treating LLM output as user input
                                                                                                                                                                                                                                                                                                                    • Typical exchange formats
                                                                                                                                                                                                                                                                                                                      • Applying common best practices
                                                                                                                                                                                                                                                                                                                      • Exercise: SQL Injection via an LLM
                                                                                                                                                                                                                                                                                                                        • SQL Injection via an LLM
                                                                                                                                                                                                                                                                                                                          • FreshCart Introduction
                                                                                                                                                                                                                                                                                                                            • SQL Injection via an LLM
                                                                                                                                                                                                                                                                                                                            • Exercise: Generating XSS payloads
                                                                                                                                                                                                                                                                                                                              • XSS attack on Fresh Cart
                                                                                                                                                                                                                                                                                                                              • LLMs interaction with other systems
                                                                                                                                                                                                                                                                                                                                • Typical integration patterns
                                                                                                                                                                                                                                                                                                                                  • Function calling dangers
                                                                                                                                                                                                                                                                                                                                    • The rise of custom GPTs
                                                                                                                                                                                                                                                                                                                                      • Security considerations
                                                                                                                                                                                                                                                                                                                                        • Identity and authorization across applications
                                                                                                                                                                                                                                                                                                                                          • Best practices for secure integration
                                                                                                                                                                                                                                                                                                                                          • Exercise: Function calling through OpenAI API
                                                                                                                                                                                                                                                                                                                                            • Function calling
                                                                                                                                                                                                                                                                                                                                                • Exercise: Privilege escalation via prompt injection
                                                                                                                                                                                                                                                                                                                                                  • Privilege escalation
                                                                                                                                                                                                                                                                                                                                                  • Principles of security and secure coding
                                                                                                                                                                                                                                                                                                                                                    • Matt Bishop's principles of robust programming
                                                                                                                                                                                                                                                                                                                                                      • Matt Bishop's principles of robust programming - I
                                                                                                                                                                                                                                                                                                                                                        • Matt Bishop's principles of robust programming - II
                                                                                                                                                                                                                                                                                                                                                          • The security principles of Saltzer and Schroeder
                                                                                                                                                                                                                                                                                                                                                            • The security principles of Saltzer and Schroeder - I
                                                                                                                                                                                                                                                                                                                                                              • The security principles of Saltzer and Schroeder - II
                                                                                                                                                                                                                                                                                                                                                                • The security principles of Saltzer and Schroeder - III
                                                                                                                                                                                                                                                                                                                                                                  • The security principles of Saltzer and Schroeder - IV
                                                                                                                                                                                                                                                                                                                                                                  • Racking up privileges
                                                                                                                                                                                                                                                                                                                                                                    • The case for a very capable model
                                                                                                                                                                                                                                                                                                                                                                      • Exploiting excessive privileges
                                                                                                                                                                                                                                                                                                                                                                        • Separation of privileges
                                                                                                                                                                                                                                                                                                                                                                          • A model can't be cut in half
                                                                                                                                                                                                                                                                                                                                                                            • Designing your model privileges
                                                                                                                                                                                                                                                                                                                                                                            • A customer support bot going wild
                                                                                                                                                                                                                                                                                                                                                                              • A customer support bot going wild
                                                                                                                                                                                                                                                                                                                                                                              • Best practices in practice
                                                                                                                                                                                                                                                                                                                                                                                • Input validation
                                                                                                                                                                                                                                                                                                                                                                                  • Output encoding
                                                                                                                                                                                                                                                                                                                                                                                    • Use frameworks
                                                                                                                                                                                                                                                                                                                                                                                  • Training data manipulation
                                                                                                                                                                                                                                                                                                                                                                                    • What you train on matters
                                                                                                                                                                                                                                                                                                                                                                                      • What data are models trained on?
                                                                                                                                                                                                                                                                                                                                                                                        • Model assurances
                                                                                                                                                                                                                                                                                                                                                                                          • Model and dataset cards
                                                                                                                                                                                                                                                                                                                                                                                          • Exercise: Verifying model cards
                                                                                                                                                                                                                                                                                                                                                                                            • The content of a model card
                                                                                                                                                                                                                                                                                                                                                                                            • A malicious model
                                                                                                                                                                                                                                                                                                                                                                                              • A malicious model
                                                                                                                                                                                                                                                                                                                                                                                                • A malicious model - mitigation
                                                                                                                                                                                                                                                                                                                                                                                                • Verifying datasets
                                                                                                                                                                                                                                                                                                                                                                                                  • Getting clear on objectives
                                                                                                                                                                                                                                                                                                                                                                                                    • Dataset providers
                                                                                                                                                                                                                                                                                                                                                                                                      • A glance at the dataset card
                                                                                                                                                                                                                                                                                                                                                                                                        • Analyzing a dataset
                                                                                                                                                                                                                                                                                                                                                                                                        • Exercise: Analyzing datasets
                                                                                                                                                                                                                                                                                                                                                                                                          • Content of a dataset card
                                                                                                                                                                                                                                                                                                                                                                                                            • Use Great Expectations to analyze a dataset
                                                                                                                                                                                                                                                                                                                                                                                                            • A secure supply chain
                                                                                                                                                                                                                                                                                                                                                                                                              • Proving model integrity is hard
                                                                                                                                                                                                                                                                                                                                                                                                                • Cryptographic solutions are emerging
                                                                                                                                                                                                                                                                                                                                                                                                                    • Hardware-assisted attestation
                                                                                                                                                                                                                                                                                                                                                                                                                  • Human-AI interaction
                                                                                                                                                                                                                                                                                                                                                                                                                    • Relying too much on LLM output
                                                                                                                                                                                                                                                                                                                                                                                                                      • What could go wrong?
                                                                                                                                                                                                                                                                                                                                                                                                                        • Countering hallucinations
                                                                                                                                                                                                                                                                                                                                                                                                                          • Verifying the verifiable
                                                                                                                                                                                                                                                                                                                                                                                                                            • Referencing what's possible 1/2
                                                                                                                                                                                                                                                                                                                                                                                                                              • Referencing what's possible 2/2
                                                                                                                                                                                                                                                                                                                                                                                                                                • Clear communication is key
                                                                                                                                                                                                                                                                                                                                                                                                                              • Secure AI infrastructure
                                                                                                                                                                                                                                                                                                                                                                                                                                • Requirements of a secure AI infrastructure
                                                                                                                                                                                                                                                                                                                                                                                                                                  • Core Requirements
                                                                                                                                                                                                                                                                                                                                                                                                                                    • Data Security
                                                                                                                                                                                                                                                                                                                                                                                                                                      • Privacy and human intervention
                                                                                                                                                                                                                                                                                                                                                                                                                                      • Privacy and the Samsung data leak
                                                                                                                                                                                                                                                                                                                                                                                                                                        • The Samsung data leak
                                                                                                                                                                                                                                                                                                                                                                                                                                        • OpenAI Evaluation
                                                                                                                                                                                                                                                                                                                                                                                                                                          • Analyzing model accuracy and efficiency
                                                                                                                                                                                                                                                                                                                                                                                                                                            • Getting datasets for evaluation
                                                                                                                                                                                                                                                                                                                                                                                                                                              • Evaluation
                                                                                                                                                                                                                                                                                                                                                                                                                                              • LangSmith
                                                                                                                                                                                                                                                                                                                                                                                                                                                • What is LangSmith?
                                                                                                                                                                                                                                                                                                                                                                                                                                                  • Admin
                                                                                                                                                                                                                                                                                                                                                                                                                                                    • Evaluation Workflow 1 / 2
                                                                                                                                                                                                                                                                                                                                                                                                                                                      • Evaluation Workflow 2 / 2
                                                                                                                                                                                                                                                                                                                                                                                                                                                        • Tracing 1 / 2
                                                                                                                                                                                                                                                                                                                                                                                                                                                          • Tracing 2 / 2
                                                                                                                                                                                                                                                                                                                                                                                                                                                          • Exercise: LangSmith
                                                                                                                                                                                                                                                                                                                                                                                                                                                            • Tracing with LangSmith
                                                                                                                                                                                                                                                                                                                                                                                                                                                            • BlindLlama
                                                                                                                                                                                                                                                                                                                                                                                                                                                              • BlindLlama
