Beginner

Introduction to Edge AI / TinyML

Edge AI runs machine learning models directly on devices — phones, microcontrollers, cameras, and IoT sensors — without sending data to the cloud.

What is Edge AI?

Edge AI refers to running AI algorithms locally on hardware devices at the "edge" of the network, close to where data is generated. Instead of sending data to a cloud server for processing, the AI model runs directly on the device in real time.

TinyML is a subset of Edge AI focused on running ML models on extremely resource-constrained devices: microcontrollers with kilobytes of memory, power budgets in the milliwatt range, and often no operating system.

Why Edge AI?

Aspect       | Cloud AI                        | Edge AI
Latency      | Typically 100-500 ms round trip | Often under 10 ms for local inference, depending on model and hardware
Privacy      | Data sent to server             | Data stays on device
Connectivity | Requires internet               | Works offline
Bandwidth    | High data transfer cost         | Minimal (only results sent)
Cost         | Per-query cloud fees            | Mostly one-time device cost
Model Size   | Effectively unlimited           | Constrained by device
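
To make the bandwidth row concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (raw 16 kHz, 16-bit mono audio streamed to the cloud, versus a 16-byte classification result uploaded once per second from the device); real deployments will differ.

```python
# Back-of-the-envelope bandwidth comparison. All constants are assumptions.
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed cloud setup: stream raw 16 kHz, 16-bit mono audio continuously.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2
cloud_bytes_per_day = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * SECONDS_PER_DAY

# Assumed edge setup: classify on the device, upload a small result each second.
RESULT_BYTES = 16
RESULTS_PER_SECOND = 1
edge_bytes_per_day = RESULT_BYTES * RESULTS_PER_SECOND * SECONDS_PER_DAY

reduction = cloud_bytes_per_day / edge_bytes_per_day

print(f"cloud: {cloud_bytes_per_day / 1e9:.2f} GB/day")
print(f"edge:  {edge_bytes_per_day / 1e6:.2f} MB/day")
print(f"reduction: {reduction:.0f}x")
```

Under these assumed numbers the device uploads a few megabytes per day instead of a few gigabytes. The exact ratio depends entirely on the sensor data rate and on how often results are reported.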

Edge AI Applications

  • Smartphones: Face unlock, voice assistants, computational photography, real-time translation. The Neural Engine in Apple's A15 Bionic chip, for example, is rated at up to 15.8 trillion operations per second, all on-device.
  • Smart Cameras: Person detection, license plate recognition, and anomaly detection without streaming video to the cloud.
  • Wearables: Heart rate anomaly detection on smartwatches, fall detection, and activity recognition.
  • Industrial IoT: Predictive maintenance on factory sensors, quality inspection on production lines, real-time vibration analysis.
  • Autonomous Vehicles: Object detection and path planning must happen in milliseconds. Cloud latency would be dangerous.
  • Agriculture: Crop disease detection from drone cameras, soil quality analysis from IoT sensors.

The Edge AI Stack

  1. Train in the Cloud

    Train your full-size model using GPUs in the cloud with frameworks like PyTorch or TensorFlow.

  2. Optimize the Model

    Apply quantization (FP32 to INT8), pruning, and knowledge distillation to shrink the model for edge deployment.

  3. Convert to Edge Format

    Export to TensorFlow Lite (.tflite), ONNX (.onnx), or Core ML (.mlmodel), or build a TensorRT engine, depending on the target device.

  4. Deploy to Device

    Load the optimized model on the edge device and run inference using the appropriate runtime.

  5. Monitor and Update

    Use OTA (over-the-air) updates to push improved models to deployed devices.
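
The FP32-to-INT8 quantization mentioned in step 2 can be sketched in a few lines. This is a minimal illustration of affine (asymmetric) quantization in plain Python, not how any particular framework implements it. The function names and the sample weights are hypothetical, and real toolchains add calibration data, per-channel scales, and other refinements.

```python
def quantize_int8(values):
    """Map a list of floats to INT8 using one scale and one zero point."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # One INT8 step covers `scale` units of the float range (256 levels).
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all values are 0
    # The zero point is the INT8 code that represents the float value 0.0.
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from INT8 codes."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical FP32 weights, for illustration only.
weights = [-0.51, -0.02, 0.0, 0.13, 0.77]
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("max round-trip error:", max_error)
```

Each weight now takes 1 byte instead of 4, at the cost of a small rounding error bounded by about half of `scale`. Whether that error is acceptable depends on the model, which is why accuracy is normally re-checked after quantization.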

Edge AI vs TinyML

Aspect  | Edge AI                      | TinyML
Devices | Phones, Jetson, Raspberry Pi | Arduino, ESP32, STM32
Memory  | GBs of RAM                   | KBs to MBs of RAM
Power   | Watts                        | Milliwatts to microwatts
Models  | MobileNet, YOLOv8-nano       | Tiny CNNs, keyword spotting
Runtime | TFLite, ONNX Runtime         | TFLite Micro, Edge Impulse
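
The memory row is where TinyML constraints bite hardest. As a rough sketch, the check below estimates whether a model's weights fit a given memory budget at different precisions. The parameter count and the 256 KB budget are hypothetical, and the estimate ignores activation buffers and runtime overhead, which also need space on a real device.

```python
def model_size_bytes(num_params, bits_per_weight):
    """Approximate storage for the weights alone."""
    return num_params * bits_per_weight // 8


def fits(num_params, bits_per_weight, budget_bytes):
    """True if the weights alone fit within the budget."""
    return model_size_bytes(num_params, bits_per_weight) <= budget_bytes


PARAMS = 100_000        # hypothetical tiny keyword-spotting model
BUDGET = 256 * 1024     # hypothetical 256 KB available for weights

fp32_size = model_size_bytes(PARAMS, 32)
int8_size = model_size_bytes(PARAMS, 8)

print(f"FP32: {fp32_size} bytes, fits: {fits(PARAMS, 32, BUDGET)}")
print(f"INT8: {int8_size} bytes, fits: {fits(PARAMS, 8, BUDGET)}")
```

With these assumed numbers, the FP32 weights exceed the budget while the INT8 weights fit comfortably.
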

Key takeaway: Edge AI brings intelligence to where data is generated, cutting network latency, keeping data on the device, and enabling offline operation. The challenge is fitting capable models onto constrained devices through optimization techniques such as quantization and pruning.