Beginner

Project Setup

Architecture overview, YOLOv8 and OpenCV setup, Streamlit integration, and project scaffolding.

What We Are Building

A computer vision application that detects and tracks objects in real time. It will:

  1. Detect objects in images and video using YOLOv8
  2. Track objects across video frames with persistent IDs
  3. Generate analytics like counts, dwell times, and heatmaps
  4. Serve everything through a Streamlit web interface
💡 Real-world relevance: Computer vision systems like this power retail analytics, traffic monitoring, security cameras, and warehouse automation.

Architecture Overview

Video Input (File / Webcam)
    |
    v
+-------------------+     +-------------------+     +-------------------+
| Object Detector   | --> | Object Tracker    | --> | Analytics Engine  |
| (YOLOv8)          |     | (ByteTrack/SORT)  |     | (Counting/Heatmap)|
+-------------------+     +-------------------+     +-------------------+
    |
    v
+-------------------+
| Web Interface     |
| (Streamlit)       |
+-------------------+
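
The data flow in the diagram can be sketched as a chain of function calls. This is only an illustration: the stage functions below are hypothetical stand-ins, not the classes built in later lessons, and each "frame" is just a list of labels so the sketch runs without a camera or a model.

```python
from collections import Counter

# Hypothetical stand-ins for the three stages in the diagram.
# A "frame" here is a list of labels so the sketch runs without a camera.


def detect(frame):
    """Detector stage: turn a frame into a list of detections."""
    return [{"class_name": label} for label in frame]


def track(detections, next_id=0):
    """Tracker stage: attach an ID to each detection.

    A real tracker matches detections across frames; this stub only numbers them.
    """
    return [dict(det, track_id=next_id + i) for i, det in enumerate(detections)]


def analyze(tracks, counts):
    """Analytics stage: accumulate per-class counts."""
    counts.update(t["class_name"] for t in tracks)
    return counts


def run_pipeline(frames):
    """Push every frame through detector -> tracker -> analytics."""
    counts = Counter()
    for frame in frames:
        counts = analyze(track(detect(frame)), counts)
    return counts


if __name__ == "__main__":
    frames = [["person", "car"], ["person"]]
    print(run_pipeline(frames))
```

In the real application each stage is replaced by its own module, and the web interface displays what the pipeline produces.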

Tech Stack

YOLOv8 (Ultralytics)

State-of-the-art object detection. The pretrained COCO weights recognize 80 object classes out of the box.

OpenCV

Industry-standard library for video capture, frame processing, and annotations.

Streamlit

Python web framework for interactive dashboards with video and file uploads.

Supervision

Tracking algorithms (ByteTrack/SORT) and annotation utilities.

Step 1: Create the Project

mkdir cv-app && cd cv-app
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install ultralytics opencv-python-headless streamlit numpy supervision
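
To confirm the install worked, a short script can check that each package is visible to the active virtual environment. This is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical. Note that the import name can differ from the pip name: opencv-python-headless is imported as cv2.

```python
from importlib.util import find_spec

# Import names, not pip names: opencv-python-headless is imported as cv2.
REQUIRED_MODULES = ["ultralytics", "cv2", "streamlit", "numpy", "supervision"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the modules that cannot be found in the active environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required packages were found")
```

The check only locates the packages and does not import them, so it stays fast even though ultralytics is slow to load.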

Step 2: Project Structure

cv-app/
  src/
    detector.py        # YOLOv8 detection (Lesson 2)
    video.py           # Video processing (Lesson 3)
    tracker.py         # Object tracking (Lesson 4)
    analytics.py       # Analytics engine (Lesson 5)
    app.py             # Streamlit interface (Lesson 6)
  config.yaml
  requirements.txt
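
The layout above can be created by hand or with a small script. The sketch below is one way to do it with pathlib; the function name scaffold is hypothetical, and the contents written to requirements.txt are an assumption based on the pip install command in Step 1.

```python
from pathlib import Path

SOURCE_FILES = ["detector.py", "video.py", "tracker.py", "analytics.py", "app.py"]

# Assumption: requirements.txt lists the packages from the pip install in Step 1.
REQUIREMENTS = ["ultralytics", "opencv-python-headless", "streamlit", "numpy", "supervision"]


def scaffold(root="."):
    """Create src/ with empty modules, config.yaml, and requirements.txt under root."""
    root = Path(root)
    src = root / "src"
    src.mkdir(parents=True, exist_ok=True)
    for name in SOURCE_FILES:
        (src / name).touch(exist_ok=True)  # leaves existing files untouched
    (root / "config.yaml").touch(exist_ok=True)
    requirements = root / "requirements.txt"
    if not requirements.exists():
        requirements.write_text("\n".join(REQUIREMENTS) + "\n")
    return root


if __name__ == "__main__":
    scaffold()  # run from inside cv-app/
```

The script is safe to run more than once because it never overwrites a file that already exists.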

Step 3: Configuration

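# config.yaml
model:
  name: yolov8n.pt        # nano weights; downloaded automatically on first use
  confidence: 0.5         # discard detections scoring below this threshold
  classes: [0, 1, 2, 3]   # COCO IDs: person, bicycle, car, motorcycle

video:
  source: 0               # webcam index, or a path to a video file
  width: 1280
  height: 720

tracking:
  algorithm: bytetrack
  max_age: 30             # frames a lost track is kept before being dropped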
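
Later modules need to read these settings. The following is a minimal sketch of a loader, not part of the original lessons: the names DEFAULTS, merge_config, validate, and load_config are hypothetical, and it assumes PyYAML is available (it is normally installed as a dependency of ultralytics).

```python
import copy

# Defaults mirror the values in config.yaml.
DEFAULTS = {
    "model": {"name": "yolov8n.pt", "confidence": 0.5, "classes": [0, 1, 2, 3]},
    "video": {"source": 0, "width": 1280, "height": 720},
    "tracking": {"algorithm": "bytetrack", "max_age": 30},
}


def merge_config(overrides, defaults=DEFAULTS):
    """Return a copy of defaults updated section by section with overrides."""
    merged = copy.deepcopy(defaults)
    for section, values in (overrides or {}).items():
        merged.setdefault(section, {}).update(values or {})
    return merged


def validate(config):
    """Reject a confidence threshold outside the range 0 to 1."""
    confidence = config["model"]["confidence"]
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"model.confidence must be in [0, 1], got {confidence}")
    return config


def load_config(path="config.yaml"):
    """Read a YAML file and fill in any missing keys from DEFAULTS."""
    import yaml  # PyYAML; imported lazily so the helpers above work without it

    with open(path, "r", encoding="utf-8") as f:
        return validate(merge_config(yaml.safe_load(f)))
```

With this in place, a call such as load_config()["model"]["confidence"] returns 0.5 for the file above, and an empty or partial config.yaml falls back to the defaults.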

Step 4: Base Detector Skeleton

# src/detector.py
from ultralytics import YOLO


class ObjectDetector:
    """Thin wrapper around a YOLOv8 model."""

    def __init__(self, model_name="yolov8n.pt", confidence=0.5, classes=None):
        # Ultralytics downloads the weights on first use if they are not cached.
        self.model = YOLO(model_name)
        self.confidence = confidence  # minimum score for a detection to be kept
        self.classes = classes        # class IDs to keep; None keeps every class
        print(f"Loaded model: {model_name}")

    def detect(self, frame):
        """Run inference on one frame and return its Results object."""
        results = self.model(frame, conf=self.confidence,
                             classes=self.classes, verbose=False)
        return results[0]  # the model returns one Results object per image

    def get_detections(self, frame):
        """Return the detections for one frame as a list of plain dictionaries."""
        result = self.detect(frame)
        detections = []
        for box in result.boxes:
            class_id = int(box.cls[0])
            detections.append({
                "bbox": box.xyxy[0].cpu().numpy().tolist(),  # [x1, y1, x2, y2] in pixels
                "class_id": class_id,
                "class_name": result.names[class_id],
                "confidence": float(box.conf[0]),
            })
        return detections


if __name__ == "__main__":
    import cv2

    detector = ObjectDetector()
    cap = cv2.VideoCapture(0)  # default webcam
    ret, frame = cap.read()
    cap.release()
    if ret:
        dets = detector.get_detections(frame)
        print(f"Found {len(dets)} objects")
        for d in dets:
            print(f"  {d['class_name']}: {d['confidence']:.2f}")
    else:
        print("Could not read a frame from the webcam")
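
Because get_detections returns plain dictionaries, downstream code can work with the results without importing Ultralytics. As an illustration, the sketch below summarizes a list in that format. The helper names count_by_class and bbox_center are hypothetical, and the sample detections are made-up values, not real model output.

```python
from collections import Counter


def count_by_class(detections, min_confidence=0.0):
    """Count detections per class name, skipping those below min_confidence."""
    return Counter(
        d["class_name"] for d in detections if d["confidence"] >= min_confidence
    )


def bbox_center(bbox):
    """Return the (x, y) center of an [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) / 2, (y1 + y2) / 2)


if __name__ == "__main__":
    # Made-up detections in the same shape that get_detections returns.
    sample = [
        {"bbox": [10, 20, 110, 220], "class_id": 0, "class_name": "person", "confidence": 0.91},
        {"bbox": [300, 80, 500, 200], "class_id": 2, "class_name": "car", "confidence": 0.78},
        {"bbox": [40, 30, 90, 160], "class_id": 0, "class_name": "person", "confidence": 0.42},
    ]
    print(count_by_class(sample, min_confidence=0.5))
    print(bbox_center(sample[0]["bbox"]))
```

Box centers like these are a common starting point for the counting and heatmap features listed earlier.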

Step 5: Test

python src/detector.py
# Loaded model: yolov8n.pt
# Found N objects
💡 Success! If you see detection results, your skeleton works. Next we build the full object detection pipeline.