youtube-8m is Google’s open-source starter code and reference implementation for training and evaluating machine learning models on the YouTube-8M dataset, one of the largest publicly released video understanding datasets. The repository provides a complete TensorFlow pipeline for video-level and frame-level modeling, covering data reading, model training, evaluation, and inference. It was developed to support the YouTube-8M Video Understanding Challenge (hosted on Kaggle and featured at ICCV 2019), enabling researchers and practitioners to benchmark video classification models on a dataset of millions of labeled videos. The code demonstrates how to process frame-level features, train logistic regression and deep learning models, evaluate them with metrics such as global Average Precision (gAP) and mean Average Precision (mAP), and export trained models for inference with MediaPipe.
Features
- Provides TensorFlow starter code for training and evaluating video models
- Supports frame-level and video-level feature modeling pipelines
- Includes tools for evaluation, inference, and Kaggle competition submissions
- Compatible with GPU acceleration and Google Cloud AI Platform
- Offers model export for deployment in MediaPipe for real-time inference
- Allows fine-tuning or training on custom TFRecord-based datasets
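
To make the global Average Precision (gAP) metric mentioned in the overview more concrete, the following is a minimal pure-Python sketch of the idea: keep each video's top-scoring predictions, pool them across all videos, rank the pool by confidence, and average the precision at each correct prediction. The function name `global_average_precision`, its parameters, and the default `top_k=20` are illustrative assumptions, not the repository's API; the evaluation code in the repository may differ in detail.

```python
from typing import List, Sequence, Tuple


def global_average_precision(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    top_k: int = 20,  # assumed per-video cutoff; adjust to match your evaluation setup
) -> float:
    """Simplified gAP sketch (hypothetical helper, not the repository's function).

    scores[i][c] is the predicted confidence that video i has class c.
    labels[i][c] is 1 if video i is labeled with class c, else 0.
    """
    pooled: List[Tuple[float, int]] = []
    total_positives = 0

    for video_scores, video_labels in zip(scores, labels):
        total_positives += sum(video_labels)
        # Rank this video's classes by confidence and keep only the top_k.
        ranked = sorted(
            zip(video_scores, video_labels), key=lambda pair: pair[0], reverse=True
        )
        pooled.extend(ranked[:top_k])

    if total_positives == 0:
        return 0.0

    # Rank the pooled predictions from all videos as one global list.
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(pooled, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this correct prediction

    # Normalize by all ground-truth positives, including any cut off by top_k.
    return precision_sum / total_positives


if __name__ == "__main__":
    example_scores = [[0.9, 0.1, 0.4], [0.2, 0.8, 0.3]]
    example_labels = [[1, 0, 0], [0, 1, 1]]
    print(global_average_precision(example_scores, example_labels, top_k=2))
```

Because the ranking is global rather than per class, a model is rewarded for well-calibrated confidences across videos, not only for ordering labels correctly within a single video. mAP, by contrast, computes average precision per class and then averages over classes.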