Beginner

Introduction to Edge AI / TinyML

Edge AI runs machine learning models directly on devices — phones, microcontrollers, cameras, and IoT sensors — without sending data to the cloud.

What is Edge AI?

Edge AI refers to running AI algorithms locally on hardware devices at the "edge" of the network, close to where data is generated. Instead of sending data to a cloud server for processing, the AI model runs directly on the device in real time.

TinyML is a subset of Edge AI focused on running ML models on extremely resource-constrained devices: microcontrollers with kilobytes of memory, power budgets in the milliwatt range, and often no operating system beyond a bare-metal loop or a small RTOS.

Why Edge AI?

Factor	Cloud AI	Edge AI
Latency	Typically 100-500 ms round trip	Often <10 ms local inference (depends on model and hardware)
Privacy	Data sent to server	Data stays on device
Connectivity	Requires internet	Works offline
Bandwidth	High data transfer cost	Minimal (only results sent)
Cost	Per-query cloud fees	Mostly one-time device cost
Model Size	Effectively unlimited	Constrained by device memory and compute
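
To make the bandwidth row concrete, here is a rough back-of-the-envelope sketch. The resolution, frame rate, and result size are illustrative assumptions, not measurements, and the cloud figure assumes uncompressed frames, which is a worst case since real streams are normally compressed.

```python
# Rough daily upload volume for one camera: streaming raw frames to the cloud
# versus sending only on-device inference results.
# All input numbers are illustrative assumptions.

SECONDS_PER_DAY = 24 * 60 * 60


def raw_video_bytes_per_day(width: int, height: int, channels: int, fps: int) -> int:
    """Uncompressed frame size multiplied by the number of frames in a day."""
    return width * height * channels * fps * SECONDS_PER_DAY


def result_bytes_per_day(bytes_per_result: int, results_per_second: int) -> int:
    """Upload volume when only inference results leave the device."""
    return bytes_per_result * results_per_second * SECONDS_PER_DAY


# Assumed: 640x480 RGB at 15 fps, versus one 64-byte result per second.
cloud_bytes = raw_video_bytes_per_day(width=640, height=480, channels=3, fps=15)
edge_bytes = result_bytes_per_day(bytes_per_result=64, results_per_second=1)

print(f"Cloud (raw stream):  {cloud_bytes / 1e9:.1f} GB/day")
print(f"Edge (results only): {edge_bytes / 1e6:.2f} MB/day")
print(f"Reduction factor:    {cloud_bytes // edge_bytes:,}x")
```

Video compression narrows the gap considerably, but under these assumptions the results-only payload stays orders of magnitude smaller.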

Edge AI Applications

Smartphones: Face unlock, voice assistants, computational photography, real-time translation. The Neural Engine in Apple's A15 Bionic chip, for example, is rated at 15.8 trillion operations per second.
Smart Cameras: Person detection, license plate recognition, and anomaly detection without streaming video to the cloud.
Wearables: Heart rate anomaly detection on smartwatches, fall detection, and activity recognition.
Industrial IoT: Predictive maintenance on factory sensors, quality inspection on production lines, real-time vibration analysis.
Autonomous Vehicles: Object detection and path planning must complete within milliseconds, so the added delay of a cloud round trip would be dangerous.
Agriculture: Crop disease detection from drone cameras, soil quality analysis from IoT sensors.

The Edge AI Stack

Train in the Cloud
Train your full-size model using GPUs in the cloud with frameworks like PyTorch or TensorFlow.
Optimize the Model
Apply quantization (FP32 to INT8), pruning, and knowledge distillation to shrink the model for edge deployment.
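
As a minimal sketch of what FP32-to-INT8 quantization does, the pure-Python example below maps a handful of float weights onto 8-bit integers with an affine (scale plus zero point) scheme and then maps them back. The weight values and helper names are hypothetical, and real toolchains add refinements such as per-channel scales and calibration data that are not shown here.

```python
# Affine (asymmetric) quantization of float weights to INT8 and back.
# Helper names and sample weights are hypothetical, for illustration only.

def quantization_params(values, qmin=-128, qmax=127):
    """Pick scale and zero point so the data range maps onto [qmin, qmax]."""
    lo = min(min(values), 0.0)  # include 0.0 so it is exactly representable
    hi = max(max(values), 0.0)
    if hi == lo:                # all-zero input: any scale works
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Round each float to the nearest integer step and clamp to the INT8 range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate float values."""
    return [(q - zero_point) * scale for q in q_values]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # hypothetical FP32 weights
scale, zero_point = quantization_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("INT8 weights:", q_weights)
print("Max round-trip error:", max_error)
```

Each weight now occupies 1 byte instead of 4, at the cost of a rounding error bounded by roughly one quantization step.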
Convert to Edge Format
Export to TensorFlow Lite (.tflite), ONNX (.onnx), Core ML (.mlmodel), or a TensorRT engine, depending on the target device.
Deploy to Device
Load the optimized model on the edge device and run inference using the appropriate runtime.
Monitor and Update
Use OTA (over-the-air) updates to push improved models to deployed devices.
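
One way a device might decide whether to accept an OTA model update is sketched below: it installs only if the offered version is newer and the downloaded bytes match the digest in the update manifest. The manifest layout and the function names are assumptions made for this example, not part of any specific OTA framework.

```python
import hashlib


def parse_version(version: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def should_install(current_version: str, manifest: dict, model_bytes: bytes) -> bool:
    """Accept an update only if it is newer and its SHA-256 digest matches the manifest."""
    is_newer = parse_version(manifest["version"]) > parse_version(current_version)
    digest_ok = hashlib.sha256(model_bytes).hexdigest() == manifest["sha256"]
    return is_newer and digest_ok


# Stand-in for a downloaded model file (hypothetical contents).
payload = b"hypothetical model bytes"
manifest = {"version": "1.10.0", "sha256": hashlib.sha256(payload).hexdigest()}

accepted = should_install("1.9.2", manifest, payload)            # newer and intact
corrupted = should_install("1.9.2", manifest, payload + b"x")    # digest mismatch
not_newer = should_install("1.10.0", manifest, payload)          # same version

print(accepted, corrupted, not_newer)
```

A plain checksum only catches corruption in transit. A production system would typically also verify a cryptographic signature on the manifest and keep the previous model available for rollback.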

Edge AI vs TinyML

Aspect	Edge AI	TinyML
Devices	Phones, Jetson, Raspberry Pi	Arduino, ESP32, STM32
Memory	GBs of RAM	KBs to MBs of RAM
Power	Watts	Milliwatts to microwatts
Models	MobileNet, YOLOv8-nano	Tiny CNNs, keyword spotting
Runtime	TFLite, ONNX Runtime	TFLite Micro, Edge Impulse
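
The memory row is often the deciding factor between the two columns. The sketch below estimates the weight storage of a model and checks it against a flash budget. The parameter count and the 512 KB budget are hypothetical, and the estimate covers weights only, not activation memory or the runtime's own code size.

```python
# Estimate weight storage for a model and compare it with a device's flash budget.
# The parameter count and budget below are hypothetical.

BYTES_PER_PARAM = {"fp32": 4, "int8": 1}


def model_size_kb(num_params: int, dtype: str) -> float:
    """Weight storage in KB for a given parameter count and numeric type."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024


def fits(num_params: int, dtype: str, flash_budget_kb: float) -> bool:
    """True if the weights alone fit within the flash budget."""
    return model_size_kb(num_params, dtype) <= flash_budget_kb


params = 250_000        # hypothetical tiny CNN
flash_budget_kb = 512   # hypothetical microcontroller flash budget

for dtype in ("fp32", "int8"):
    size = model_size_kb(params, dtype)
    verdict = "fits" if fits(params, dtype, flash_budget_kb) else "does not fit"
    print(f"{dtype}: {size:.1f} KB, {verdict} in {flash_budget_kb} KB")
```

Under these assumptions the same network is too large in FP32 but fits once quantized to INT8, which is why quantization is usually a prerequisite for TinyML targets.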

Key takeaway: Edge AI brings intelligence to where data is generated, removing the network round trip, keeping data on the device, and enabling offline operation. The challenge is fitting capable models onto constrained devices through optimization techniques such as quantization and pruning.
