Backdoor Attacks & Defense
Explore how adversaries can embed hidden behaviors in ML models, learn to detect trojan models with Neural Cleanse, spectral signatures, and activation clustering, and master removal techniques like fine-pruning and knowledge distillation.
Your Learning Path
Follow these lessons in order, or jump to any topic that interests you.
1. Introduction
What are backdoor attacks? Threat landscape, attack surface in ML pipelines, and why backdoors are uniquely dangerous.
2. Attack Methods
BadNets, clean-label backdoors, hidden trigger attacks, and data poisoning strategies used by adversaries (a minimal poisoning sketch appears after this list).
3. Trojan Models
How trojan models work, supply chain attacks on pre-trained models, and trojan insertion during fine-tuning.
4. Detection
Neural Cleanse, spectral signatures, activation clustering, STRIP, and meta-neural analysis for backdoor detection (a spectral signatures sketch also appears after this list).
5. Removal
Fine-pruning, knowledge distillation, unlearning techniques, and certified backdoor removal methods (a fine-pruning sketch rounds out the examples after this list).
6. Best Practices
Defense-in-depth strategies, supply chain security, model auditing workflows, and organizational policies.
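To make the attack surface concrete before you start, here is a minimal sketch of BadNets-style data poisoning from lesson 2. It stamps a small white patch onto a random fraction of training images and relabels them to an attacker-chosen target class; a model trained on the result learns both the clean task and the trigger. The function name, patch shape, and poison rate are illustrative assumptions, not a fixed recipe.

```python
import numpy as np

def poison_badnets(images, labels, target_label=0, rate=0.05,
                   patch_size=3, seed=0):
    """BadNets-style poisoning sketch: stamp a white patch on a random
    subset of images and relabel them to the attacker's target class.

    images: float array (N, H, W, C) scaled to [0, 1]
    labels: int array (N,)
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    poisoned = rng.choice(len(images), size=int(rate * len(images)),
                          replace=False)
    # The trigger: a small white square in the bottom-right corner.
    images[poisoned, -patch_size:, -patch_size:, :] = 1.0
    labels[poisoned] = target_label
    return images, labels, poisoned
```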
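Lesson 4 covers several detection methods; the lightest-weight of them, spectral signatures (Tran et al., 2018), fits in a few lines and is sketched below. It scores each sample of a class by its squared projection onto the top singular vector of the centered activation matrix, on the observation that poisoned samples tend to dominate that direction. The function name and the quantile cutoff in the usage note are illustrative assumptions.

```python
import numpy as np

def spectral_signature_scores(activations):
    """Outlier scores for one class's penultimate-layer activations
    (spectral signatures). Poisoned samples tend to score highest.

    activations: (N, D) array, one row per sample of a single class.
    """
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Top right-singular vector = direction of maximum variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2

# Illustrative use: drop the highest-scoring samples and retrain.
# scores = spectral_signature_scores(acts)
# keep = scores < np.quantile(scores, 0.90)
```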
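Finally, a sketch of the pruning half of fine-pruning (Liu et al., 2018) from lesson 5, written here in PyTorch. Backdoor behavior often hides in neurons that clean inputs rarely exercise, so the channels of a late convolutional layer with the lowest mean activation on clean data are zeroed out, after which a short fine-tune on clean data recovers accuracy. The function name, layer choice, and prune fraction are assumptions for illustration.

```python
import torch

@torch.no_grad()
def prune_dormant_channels(conv, clean_activations, prune_frac=0.2):
    """Pruning step of fine-pruning: zero the conv channels that are
    least active on clean data.

    conv: torch.nn.Conv2d whose (post-ReLU) outputs were recorded
    clean_activations: (N, C, H, W) outputs of `conv` on clean inputs
    """
    mean_act = clean_activations.mean(dim=(0, 2, 3))  # (C,)
    dormant = torch.argsort(mean_act)[:int(prune_frac * conv.out_channels)]
    conv.weight[dormant] = 0.0
    if conv.bias is not None:
        conv.bias[dormant] = 0.0
    # Step 2 (not shown): briefly fine-tune on clean data, which both
    # restores accuracy and further suppresses the backdoor.
    return dormant
```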
What You'll Learn
By the end of this course, you'll be able to:
Identify Backdoor Threats
Recognize how and where backdoors can be inserted into ML models, datasets, and training pipelines.
Detect Trojan Models
Apply state-of-the-art detection methods including Neural Cleanse, spectral signatures, and activation analysis.
Remove Backdoors
Use fine-pruning, distillation, and unlearning techniques to neutralize backdoors in compromised models.
Secure ML Supply Chains
Implement policies and technical controls to prevent backdoor injection throughout the ML lifecycle.