Privacy in Federated Learning
FL keeps data local, but model updates can still leak information. Differential privacy, secure aggregation, and encryption provide stronger guarantees.
Why FL Alone Isn't Enough
Although FL keeps raw data on-device, model updates (gradients or weights) can reveal information about the training data:
- Gradient inversion attacks: Researchers have shown that raw pixel-level images can be reconstructed from shared gradients using optimization techniques.
- Membership inference: An attacker can determine whether a specific data point was used in training by analyzing the model.
- Model memorization: Neural networks can memorize rare training examples, which can be extracted through careful querying.
Important: Federated Learning is a necessary but not sufficient condition for privacy. Additional techniques are needed for strong privacy guarantees.
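The gradient-inversion risk is easy to see in a toy setting: for a single linear layer and a batch of one example, the weight gradient directly encodes the input. This is a deliberately simplified sketch (real attacks optimize over full networks), and all names in it are illustrative:

```python
import torch

# Toy illustration of gradient leakage: with a batch of one example, the
# weight gradient of a linear layer is the outer product of the output
# gradient and the input, so each row of grad_W is a scalar multiple of
# the private input x.
torch.manual_seed(0)
x = torch.randn(8)                          # one client's private input
layer = torch.nn.Linear(8, 4, bias=False)

loss = layer(x).sum()                       # dL/dy_i = 1 for every output
loss.backward()

grad_W = layer.weight.grad                  # shape (4, 8)
# With this particular loss each row equals x exactly; in general the rows
# reveal x up to a per-row scale factor.
print(torch.allclose(grad_W[0], x))         # True: x leaks from the gradient
```

Sharing this "update" with a server is therefore equivalent to sharing the raw input, which is exactly what the attacks above exploit at scale.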
Differential Privacy (DP)
Differential privacy provides a mathematical guarantee that the output of a computation does not reveal whether any individual's data was included. It works by adding calibrated noise:
- Local DP: Each client adds noise to their updates before sending. Provides strong privacy but reduces model quality.
- Central DP: The server adds noise after aggregation. Better utility but requires trusting the server.
- ε (epsilon): The privacy budget. Lower ε means more privacy but less accuracy. Typical values: 1-10.
- Gradient clipping: Bound the L2 norm of each client's gradient update before adding noise. Limits the influence of any single data point.
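The clip-then-noise step behind these ideas can be sketched directly. The `privatize` helper below is hypothetical (not part of any FL framework) and treats a whole client update as one vector for clarity:

```python
import torch

def privatize(update: torch.Tensor, max_norm: float, noise_mult: float) -> torch.Tensor:
    # Clip: scale the update down if its L2 norm exceeds max_norm, so no
    # single client can move the model by more than the clipping bound.
    norm = update.norm().item()
    clipped = update * min(1.0, max_norm / (norm + 1e-12))
    # Gaussian mechanism: noise standard deviation is proportional to the
    # clipping bound (the sensitivity of the clipped update).
    noise = noise_mult * max_norm * torch.randn(update.shape)
    return clipped + noise

raw = torch.randn(1000) * 5.0                       # a raw client update
private = privatize(raw, max_norm=1.0, noise_mult=1.1)
```

Libraries such as Opacus apply the same clip-and-noise logic per sample inside the optimizer rather than per client update.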
Python - Adding Differential Privacy
```python
import torch
from opacus import PrivacyEngine

# Wrap the model with Opacus for DP-SGD (assumes Net, dataloader, and the
# rest of the training setup are defined elsewhere)
model = Net()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

privacy_engine = PrivacyEngine()
model, optimizer, dataloader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=dataloader,
    epochs=10,
    target_epsilon=3.0,    # Privacy budget
    target_delta=1e-5,     # Probability of privacy breach
    max_grad_norm=1.0,     # Gradient clipping bound
)

# Training loop (per-sample clipping and noise are added automatically)
for epoch in range(10):
    for data, target in dataloader:
        optimizer.zero_grad()
        loss = criterion(model(data), target)
        loss.backward()
        optimizer.step()

print(f"Final epsilon: {privacy_engine.get_epsilon(delta=1e-5):.2f}")
```
Secure Aggregation
Secure aggregation ensures the server only sees the aggregated result, never individual client updates. It uses cryptographic techniques so that the server learns the sum of the updates without seeing any single client's contribution:
- Pairwise masking: Clients add canceling random masks to their updates. When aggregated, masks cancel out, revealing only the sum.
- Threshold decryption: Updates are encrypted and can only be decrypted when enough clients participate.
- Cost: Adds communication overhead (approximately 2x) but provides strong guarantees against a curious server.
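The pairwise-masking idea can be sketched with three clients. This is a toy illustration: real protocols derive the masks from shared secrets and handle clients dropping out mid-round.

```python
import numpy as np

rng = np.random.default_rng(42)
updates = {c: rng.normal(size=4) for c in ("a", "b", "c")}

# Each pair of clients (i, j) with i < j agrees on a random mask m_ij;
# client i adds it and client j subtracts it, so all masks cancel in the sum.
clients = sorted(updates)
masks = {(i, j): rng.normal(size=4) for i in clients for j in clients if i < j}

masked = {}
for c in clients:
    m = updates[c].copy()
    for (i, j), mask in masks.items():
        if c == i:
            m += mask
        elif c == j:
            m -= mask
    masked[c] = m

# The server sums the masked updates; each individual update stays hidden.
agg = sum(masked.values())
true_sum = sum(updates.values())
print(np.allclose(agg, true_sum))  # True: the masks cancel exactly
```

Any single `masked[c]` looks like random noise to the server, yet the aggregate is exact, which is what makes plain federated averaging still work on top of it.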
Privacy Techniques Comparison
| Technique | Protects Against | Cost | Maturity |
|---|---|---|---|
| Differential Privacy | Data reconstruction, membership inference | Reduced model accuracy | Production-ready |
| Secure Aggregation | Curious server seeing individual updates | Communication overhead | Production-ready |
| Homomorphic Encryption | Server seeing plaintext updates (computation runs on ciphertexts) | 100-10,000x slower computation | Research stage |
| Trusted Execution Environments | Server-side data access | Hardware requirements (Intel SGX) | Available but limited |
Threat Models
- Honest-but-curious server: The server follows the protocol but tries to extract information from what it observes. Secure aggregation protects against this.
- Malicious clients: A client sends manipulated updates to poison the global model (model poisoning) or to infer other clients' data. Byzantine-robust aggregation methods address this.
- External adversary: Intercepts communication between clients and server. TLS encryption protects against this.
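One simple Byzantine-robust aggregator is the coordinate-wise median, shown below as an illustrative toy (trimmed mean and Krum are other common choices):

```python
import numpy as np

rng = np.random.default_rng(0)
honest = [rng.normal(0.0, 0.1, size=5) for _ in range(8)]  # 8 honest clients
poisoned = [np.full(5, 100.0)]                             # 1 malicious client

all_updates = np.stack(honest + poisoned)
mean_agg = all_updates.mean(axis=0)           # dragged toward the attacker
median_agg = np.median(all_updates, axis=0)   # stays near the honest updates

print(np.abs(mean_agg).max())    # large: poisoning succeeds against the mean
print(np.abs(median_agg).max())  # small: the median ignores the outlier
```

The tradeoff is that robust aggregators discard information from legitimate outliers too, and they compose awkwardly with secure aggregation, which only reveals the sum.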
Key takeaway: Strong privacy in FL requires combining multiple techniques. Differential privacy bounds information leakage mathematically. Secure aggregation prevents the server from seeing individual updates. The privacy-utility tradeoff is real: more privacy means less model accuracy. Choose the right balance for your use case.