Introduction to Federated Learning

Federated Learning enables training ML models across multiple devices or organizations without sharing raw data — bringing the model to the data instead of the data to the model.

What is Federated Learning?

Federated Learning (FL) is a machine learning approach where a model is trained collaboratively across multiple decentralized devices or servers holding local data, without exchanging raw data. Introduced by Google in 2016, FL was originally designed to improve keyboard predictions on Android phones without uploading users' typing data to the cloud.

The key insight: instead of sending data to the model, we send the model to the data.

Traditional ML vs Federated Learning

| Aspect | Traditional ML | Federated Learning |
| --- | --- | --- |
| Data Location | Centralized server | Distributed across devices/organizations |
| Data Movement | All data sent to one place | Data stays on device; only model updates are shared |
| Privacy | Data is visible to the server | Raw data never leaves the device |
| Compliance | Complex with GDPR, HIPAA | Easier; data remains under local control |
| Communication | One-time data upload | Multiple rounds of model exchange |

Why Federated Learning?

  • Privacy: Sensitive data (medical records, financial transactions, personal messages) never leaves the user's device or the hospital's server.
  • Regulation: Laws like GDPR, HIPAA, and CCPA restrict data sharing. FL enables collaboration without violating these regulations.
  • Data silos: Many organizations have valuable data they cannot share due to competitive, legal, or ethical reasons. FL allows them to collaborate.
  • Bandwidth: Moving large datasets is expensive and slow. Sending small model updates is much more efficient.
  • Real-time learning: Models can be updated on-device with the latest data, without waiting for a centralized training cycle.

Types of Federated Learning

| Type | Description | Example |
| --- | --- | --- |
| Cross-Device FL | Millions of mobile devices (phones, IoT), each with a small dataset | Google Keyboard, Apple Siri |
| Cross-Silo FL | A few organizations (hospitals, banks), each with a large dataset | Multi-hospital medical research |

The Federated Learning Process

  1. Initialization

    A central server initializes a global model and sends it to participating clients (devices or organizations).

  2. Local Training

    Each client trains the model on its local data for a few epochs, producing updated model weights.

  3. Upload Updates

    Clients send only their model updates (gradients or weights) back to the server — not their data.

  4. Aggregation

    The server aggregates all client updates (e.g., by averaging) to produce a new global model.

  5. Repeat

    The updated global model is sent back to clients, and the process repeats for multiple rounds until convergence.
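The five steps above can be sketched in a few lines of plain Python. This is a toy simulation, not a production framework: it assumes a hypothetical one-parameter linear model (y = w·x), four simulated clients whose private data never leaves their local lists, and weighted averaging of client weights, which is the core of the FedAvg aggregation rule.

```python
import random

def local_train(global_weights, client_data, lr=0.1, epochs=3):
    """Step 2: each client trains the received model on its own data.
    Toy model: y = w * x with squared loss, updated by gradient descent."""
    w = list(global_weights)  # start from the global model
    for _ in range(epochs):
        for x, y in client_data:
            grad = 2 * (w[0] * x - y) * x  # d/dw of (w*x - y)^2
            w[0] -= lr * grad
    return w  # step 3: only these weights are sent back, never the data

def aggregate(client_weights, client_sizes):
    """Step 4: the server averages client updates, weighted by dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Step 1: the server initializes the global model.
global_model = [0.0]

# Simulated private datasets (roughly y = 2x); each stays on its "client".
random.seed(0)
clients = [
    [(x, 2 * x + random.gauss(0, 0.1)) for x in (1, 2, 3)]
    for _ in range(4)
]

# Step 5: repeat the train/upload/aggregate cycle for several rounds.
for _ in range(10):
    updates = [local_train(global_model, data) for data in clients]
    global_model = aggregate(updates, [len(d) for d in clients])

print(global_model[0])  # converges near the true slope of 2.0
```

Note the design choice in `aggregate`: weighting by dataset size means a client with more examples influences the global model more, which is how the original FedAvg algorithm handles unevenly sized local datasets.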

Real-World FL Deployments

  • Google Gboard: Next-word prediction trained across millions of Android phones; the same model powers autocorrect and suggestions.
  • Apple: Uses FL for Siri improvements, QuickType suggestions, and "Hey Siri" detection without uploading voice recordings.
  • Healthcare consortiums: Multiple hospitals training diagnostic models on patient data that cannot leave the institution due to HIPAA.
  • Financial institutions: Banks collaborating on fraud detection models without sharing customer transaction data.

Key takeaway: Federated Learning solves the fundamental tension between wanting more data for better models and the need to protect privacy. By keeping data local and sharing only model updates, FL enables collaborative AI training across privacy boundaries.