Real-time Tracking

Object Detection

A computer vision pipeline that detects and persistently tracks cats across video frames, built with YOLO and BotSORT and deployed on Google Cloud.

Demo 1 · Wall

How It Works

01

Frame Ingestion

Each video frame is read with OpenCV and passed to the YOLOv11 detection model.

02

Cat Detection

YOLO identifies cats by keeping only COCO class 15 (cat), with a minimum confidence of conf=0.35 and an IoU threshold of iou=0.6 for suppressing overlapping boxes.
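To make the two thresholds concrete, here is a simplified pure-Python illustration of class filtering, confidence filtering, and greedy overlap suppression. It is a sketch only: in the real pipeline this filtering happens inside the detector, and the `Detection` type and `filter_cats` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

CAT_CLASS_ID = 15      # COCO index for "cat", as stated in the text
CONF_THRESHOLD = 0.35  # minimum confidence to keep a detection
IOU_THRESHOLD = 0.6    # overlap above which the weaker box is dropped


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detector output."""
    box: tuple  # (x1, y1, x2, y2) in pixels
    score: float
    class_id: int


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_cats(detections):
    """Keep confident cat detections, dropping heavily overlapping duplicates."""
    candidates = sorted(
        (d for d in detections
         if d.class_id == CAT_CLASS_ID and d.score >= CONF_THRESHOLD),
        key=lambda d: d.score,
        reverse=True,
    )
    kept = []
    for det in candidates:
        # Greedy suppression: keep a box only if it does not overlap
        # an already-kept, higher-scoring box by more than the threshold.
        if all(iou(det.box, k.box) <= IOU_THRESHOLD for k in kept):
            kept.append(det)
    return kept
```

Raising the confidence threshold trades missed cats for fewer spurious boxes, while the IoU threshold controls how aggressively near-duplicate boxes on the same animal are merged.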

03

Persistent Tracking

BotSORT assigns consistent IDs across frames. A cat is catalogued only after it has been tracked for at least 7 consecutive frames, which filters out short-lived false positives.

04

Output

Bounding boxes and IDs are drawn on each frame and compiled into an annotated output video.
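The four steps above can be sketched end to end as follows. This is a minimal sketch, not the project's actual code: it assumes the Ultralytics Python package and its bundled `botsort.yaml` tracker config, and the weight file name `yolo11l.pt`, the `mp4v` codec, and the helpers `track_kwargs`, `label_for`, and `annotate_video` are all assumptions made for illustration.

```python
CAT_CLASS_ID = 15        # COCO index for "cat"
WEIGHTS = "yolo11l.pt"   # assumed weight file name for the large model
TRACKER = "botsort.yaml" # assumed tracker config name


def track_kwargs():
    """Detection and tracking settings taken from the steps above."""
    return {
        "persist": True,            # keep track state between frames
        "tracker": TRACKER,
        "classes": [CAT_CLASS_ID],  # cats only
        "conf": 0.35,
        "iou": 0.6,
        "verbose": False,
    }


def label_for(track_id):
    """Text drawn next to each bounding box."""
    return f"cat #{track_id}"


def annotate_video(src, dst):
    """Read src, track cats frame by frame, write an annotated copy to dst."""
    # Imported here so the module loads without the heavy dependencies.
    import cv2
    from ultralytics import YOLO

    model = YOLO(WEIGHTS)
    cap = cv2.VideoCapture(str(src))
    if not cap.isOpened():
        raise FileNotFoundError(f"cannot open video: {src}")

    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(str(dst), cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    frames = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            boxes = model.track(frame, **track_kwargs())[0].boxes
            # boxes.id is None when no track is active in this frame.
            if boxes.id is not None:
                ids = boxes.id.int().tolist()
                for (x1, y1, x2, y2), tid in zip(boxes.xyxy.tolist(), ids):
                    p1, p2 = (int(x1), int(y1)), (int(x2), int(y2))
                    cv2.rectangle(frame, p1, p2, (0, 255, 0), 2)
                    cv2.putText(frame, label_for(tid),
                                (p1[0], max(p1[1] - 8, 12)),
                                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
            writer.write(frame)
            frames += 1
    finally:
        cap.release()
        writer.release()
    return frames
```

Calling `annotate_video("input.mp4", "annotated.mp4")` with hypothetical file names would return the number of frames written.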

Tech Stack

YOLOv11 · OpenCV · BotSORT · PyTorch · Google Cloud Storage · Docker

Other Key Decisions

Frame confirmation logic — requiring 7 consecutive frames before cataloguing avoids counting fleeting false detections.
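One way the confirmation rule could be implemented is a per-ID streak counter that resets whenever an ID is missing from a frame. The sketch below assumes the tracker yields a list of integer IDs per frame; the class name `ConfirmationTracker` and its methods are hypothetical.

```python
class ConfirmationTracker:
    """Catalogue a track ID once it has appeared in N consecutive frames."""

    def __init__(self, min_frames=7):
        self.min_frames = min_frames
        self._streaks = {}      # track ID -> current consecutive-frame count
        self.confirmed = set()  # IDs that have been catalogued

    def update(self, ids_in_frame):
        """Record one frame's IDs and return the IDs newly confirmed by it."""
        present = set(ids_in_frame)
        # Rebuilding the dict from the IDs present drops every absent ID,
        # which resets its streak to zero.
        self._streaks = {tid: self._streaks.get(tid, 0) + 1 for tid in present}
        newly = {
            tid for tid, count in self._streaks.items()
            if count >= self.min_frames and tid not in self.confirmed
        }
        self.confirmed |= newly
        return newly
```

With this design, an ID that flickers in for a few frames and disappears never reaches the threshold, while a stable track is catalogued exactly once.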

YOLOv11l model — upgraded from nano to large for higher detection accuracy on moving subjects.

GCS for video delivery — processed demos are stored in Google Cloud Storage and streamed directly to the browser, keeping the backend lightweight.
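As a rough sketch of this delivery path, a processed video could be uploaded with the `google-cloud-storage` client and served from its object URL. The source does not say whether the objects are public or served through signed URLs, so the publicly readable bucket assumed here, along with the helper names `public_url` and `upload_demo`, is illustrative only.

```python
from urllib.parse import quote


def public_url(bucket_name, object_name):
    """URL of a publicly readable object (assumes public read access)."""
    return (
        "https://storage.googleapis.com/"
        f"{quote(bucket_name)}/{quote(object_name)}"
    )


def upload_demo(local_path, bucket_name, object_name):
    """Upload a processed video and return the URL the browser can load."""
    # Imported here so the URL helper works without the client installed.
    from google.cloud import storage

    client = storage.Client()  # uses the environment's default credentials
    blob = client.bucket(bucket_name).blob(object_name)
    blob.upload_from_filename(local_path, content_type="video/mp4")
    return public_url(bucket_name, object_name)
```

Because the browser fetches the file from the storage URL, the backend only needs to hand out that URL rather than stream video bytes itself.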