Learn Data Poisoning & Training Attacks
Explore how adversaries manipulate training data to insert backdoors, bias models, and compromise AI systems. Learn detection techniques, data validation strategies, and defense frameworks to protect the ML training pipeline.
Your Learning Path
Follow these lessons in order, or jump to any topic that interests you.
1. Introduction
What is data poisoning? Understand threat models, attack surfaces in the ML pipeline, and real-world incidents.
2. Poisoning Techniques
Learn label flipping, clean-label attacks, gradient-based poisoning, and training data manipulation strategies.
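To make the first of these concrete, here is a minimal sketch of a label-flipping attack on a toy dataset. The function name, the synthetic labels, and the 20% flip rate are illustrative choices, not part of any standard library:

```python
import random

def flip_labels(labels, flip_fraction, num_classes, seed=0):
    """Simulate a label-flipping attack: reassign a fraction of
    training labels to a different (wrong) class."""
    rng = random.Random(seed)
    poisoned = list(labels)
    n_flip = int(len(labels) * flip_fraction)
    for i in rng.sample(range(len(labels)), n_flip):
        wrong = [c for c in range(num_classes) if c != poisoned[i]]
        poisoned[i] = rng.choice(wrong)
    return poisoned

clean = [0, 1] * 50                       # 100 synthetic binary labels
poisoned = flip_labels(clean, 0.2, num_classes=2)
corrupted = sum(a != b for a, b in zip(clean, poisoned))
print(corrupted)  # 20 labels flipped
```

Even this crude attack illustrates why per-class accuracy audits matter: a model trained on the poisoned labels learns a systematically wrong decision boundary for the targeted examples.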
3. Backdoor Insertion
Explore trigger-based backdoors, trojan attacks, sleeper agents, and how hidden behaviors are embedded in models.
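As a preview of trigger-based backdoors, the sketch below stamps a small pixel patch into a corner of an image and relabels it to the attacker's target class, in the style of the BadNets attack. The function, image size, and pixel values are illustrative assumptions:

```python
def add_trigger(image, target_label, trigger_value=255, size=2):
    """Stamp a square trigger patch in the bottom-right corner and
    relabel the example to the attacker's target class (BadNets-style)."""
    poisoned = [row[:] for row in image]   # copy the 2-D pixel grid
    h, w = len(poisoned), len(poisoned[0])
    for r in range(h - size, h):
        for c in range(w - size, w):
            poisoned[r][c] = trigger_value
    return poisoned, target_label

clean = [[0] * 8 for _ in range(8)]        # 8x8 all-black toy image
backdoored, label = add_trigger(clean, target_label=7)
print(label, backdoored[7][7], backdoored[0][0])  # 7 255 0
```

A model trained on a mix of clean and triggered examples behaves normally on clean inputs but predicts the target class whenever the trigger patch is present.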
4. Detection
Detect poisoned data using statistical analysis, spectral signatures, activation clustering, and Neural Cleanse techniques.
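The activation-clustering idea can be sketched with a tiny two-means split. In the real defense you cluster penultimate-layer activations per class; here the activation values are made-up 1-D numbers, and the helper name is our own:

```python
def two_means_1d(values, iters=20):
    """Tiny 1-D 2-means clustering. In activation-clustering defenses,
    a small, tight minority cluster within one class's activations
    is a red flag for backdoored training examples."""
    c0, c1 = min(values), max(values)
    g0, g1 = [], []
    for _ in range(iters):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        c0 = sum(g0) / len(g0) if g0 else c0
        c1 = sum(g1) / len(g1) if g1 else c1
    return g0, g1

# Hypothetical penultimate-layer activations for one class:
# clean examples cluster near 1.0, poisoned ones near 5.0.
acts = [0.9, 1.1, 1.0, 0.95, 5.1, 4.9, 5.0]
majority, suspicious = two_means_1d(acts)
print(len(majority), len(suspicious))  # 4 3
```

The defense then inspects the smaller cluster's examples by hand, or retrains with them removed and checks whether model behavior changes.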
5. Prevention
Implement data validation pipelines, provenance tracking, robust training algorithms, and supply chain security.
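A minimal sketch of provenance tracking, assuming records are JSON-serializable: hash each record at ingestion time, then verify the hash before training to detect in-transit tampering. The ledger structure and record fields are illustrative:

```python
import hashlib
import json

def fingerprint(record):
    """Content hash of a training record; recorded at collection time
    so later pipeline stages can verify the data wasn't altered."""
    blob = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

record = {"text": "spam example", "label": 1}
ledger = {fingerprint(record)}           # stored at ingestion

tampered = dict(record, label=0)         # attacker flips the label in transit
print(fingerprint(record) in ledger)     # True
print(fingerprint(tampered) in ledger)   # False
```

In production the ledger would be an append-only store signed by the collection service, so a compromised preprocessing stage cannot rewrite history.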
6. Best Practices
Production-ready defense frameworks, continuous monitoring, incident response, and organizational security policies.
What You'll Learn
By the end of this course, you'll be able to:
Identify Poisoning Attacks
Recognize the signs of data poisoning, including label corruption, backdoor triggers, and distribution shifts in training data.
Detect Backdoors
Use spectral signatures, activation analysis, and Neural Cleanse methods to find hidden backdoors in trained models.
Validate Training Data
Build robust data pipelines with provenance tracking, outlier detection, and integrity verification at every stage.
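As a first-pass outlier screen of the kind described above, a z-score check flags feature values far from the bulk of the data. The feature values and threshold here are made up; on small samples a single extreme point inflates the standard deviation, so the threshold needs tuning:

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return indices whose z-score exceeds the threshold -- a simple
    screen for injected points that sit far from a feature's bulk."""
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    return [i for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

features = [1.0, 1.2, 0.9, 1.1, 1.05, 0.95, 9.0]   # last point is injected
print(zscore_outliers(features, threshold=2.0))     # [6]
```

Statistical screens like this catch crude poisoning but miss clean-label attacks, which is why they are layered with the model-side detectors covered in the Detection lesson.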
Secure the ML Pipeline
Implement end-to-end security controls across data collection, preprocessing, training, and deployment phases.
Lilly Tech Systems