Object Detection
A computer vision pipeline that detects and persistently tracks cats across video frames, built with YOLO and BoT-SORT and deployed on Google Cloud.
How It Works
Frame Ingestion
Each video frame is read with OpenCV and passed to the YOLOv11 detection model.
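As a rough illustration of the ingestion step, the sketch below walks a video one frame at a time. The helper name iter_frames is hypothetical; it only assumes a capture object whose read() returns an (ok, frame) pair, which is how OpenCV's cv2.VideoCapture behaves.

```python
from typing import Any, Iterator, Tuple


def iter_frames(capture: Any) -> Iterator[Tuple[int, Any]]:
    """Yield (frame_index, frame) pairs until the capture has no more frames.

    `capture` is any object with a cv2.VideoCapture-style read() method
    that returns (ok, frame). The name iter_frames is illustrative only.
    """
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of stream or read failure
            break
        yield index, frame
        index += 1
```

With OpenCV installed, this could be driven by capture = cv2.VideoCapture("input.mp4") (a placeholder path), followed by capture.release() once the loop finishes.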
Cat Detection
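YOLO keeps only detections of COCO class 15 (cat) that score at least 0.35 confidence (conf=0.35), and non-maximum suppression merges overlapping boxes at an IoU threshold of 0.6 (iou=0.6).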
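To make the two thresholds concrete, here is a minimal pure-Python sketch of what the filtering amounts to: keep class 15, drop low-confidence boxes, then greedily suppress any box that overlaps a stronger one by more than the IoU threshold. The Detection type and filter_cats function are hypothetical names, and a YOLO runtime normally does this internally. If the pipeline uses the Ultralytics package (an assumption), the equivalent is passing classes=[15], conf=0.35 and iou=0.6 to the model call.

```python
from typing import List, NamedTuple

CAT_CLASS_ID = 15      # "cat" in the 80-class COCO label set used by YOLO
CONF_THRESHOLD = 0.35  # minimum detection confidence
IOU_THRESHOLD = 0.6    # overlap above which a weaker box is suppressed


class Detection(NamedTuple):  # hypothetical container for one box
    x1: float
    y1: float
    x2: float
    y2: float
    conf: float
    cls: int


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def filter_cats(detections: List[Detection]) -> List[Detection]:
    """Keep confident cat boxes, then apply greedy non-maximum suppression."""
    candidates = sorted(
        (d for d in detections
         if d.cls == CAT_CLASS_ID and d.conf >= CONF_THRESHOLD),
        key=lambda d: d.conf,
        reverse=True,
    )
    kept: List[Detection] = []
    for det in candidates:
        # Keep a box only if it does not overlap a stronger kept box too much.
        if all(iou(det, k) <= IOU_THRESHOLD for k in kept):
            kept.append(det)
    return kept
```

A higher iou value tolerates more overlap between kept boxes, which can help when two cats sit close together, at the cost of occasional duplicate boxes on a single cat.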
Persistent Tracking
BoT-SORT assigns consistent IDs across frames. A cat is catalogued only after its ID has appeared in 7 or more consecutive frames, which filters out fleeting false positives.
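The confirmation rule can be sketched as a small per-ID streak counter. This is an illustration under assumptions, not the project's actual code: the class name CatCatalogue and its update method are hypothetical, and the tracker is assumed to supply the set of track IDs visible in each frame.

```python
from typing import Dict, Iterable, Set

MIN_CONSECUTIVE_FRAMES = 7  # threshold stated in the text


class CatCatalogue:
    """Confirm a track ID once it has been seen in enough consecutive frames."""

    def __init__(self, min_frames: int = MIN_CONSECUTIVE_FRAMES) -> None:
        self.min_frames = min_frames
        self.streaks: Dict[int, int] = {}  # current consecutive-frame count per ID
        self.confirmed: Set[int] = set()   # IDs already catalogued

    def update(self, track_ids: Iterable[int]) -> Set[int]:
        """Feed the IDs seen in one frame; return IDs confirmed by this frame."""
        seen = set(track_ids)
        # IDs missing from this frame are dropped, which resets their streak.
        self.streaks = {tid: self.streaks.get(tid, 0) + 1 for tid in seen}
        newly_confirmed = {
            tid for tid, count in self.streaks.items()
            if count >= self.min_frames and tid not in self.confirmed
        }
        self.confirmed |= newly_confirmed
        return newly_confirmed
```

Calling update once per frame with the tracker's IDs returns an empty set for the first six frames of a new ID and the ID itself on the seventh. A single missed frame restarts the count, which is the strict reading of "consecutive".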
Output
Bounding boxes and IDs are drawn on each frame and compiled into an annotated output video.
Tech Stack
Other Key Decisions
Frame confirmation logic — requiring 7 consecutive frames before cataloguing avoids counting fleeting false detections.
YOLOv11l model — upgraded from nano to large for higher detection accuracy on moving subjects.
GCS for video delivery — processed demos are stored in Google Cloud Storage and streamed directly to the browser, keeping the backend lightweight.
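One way the GCS decision can keep the backend lightweight is for the server to return only a URL and let the browser fetch the video from Cloud Storage itself. The sketch below assumes the demo objects are publicly readable, which the text does not state; a private bucket would need signed URLs instead. The function name and the bucket and object names in the usage note are hypothetical.

```python
from urllib.parse import quote


def public_gcs_url(bucket: str, object_name: str) -> str:
    """Build the public HTTPS URL for an object in a Cloud Storage bucket.

    Assumes the object is publicly readable. Slashes in the object name
    are kept as path separators; other unsafe characters are percent-encoded.
    """
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name, safe='/')}"
```

For example, public_gcs_url("demo-bucket", "demos/cat video.mp4") gives a URL that can be used as the src of a video element, so the backend never proxies the video bytes.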