Video Classification Dataset

Overview

This dataset contains labeled video clips organized by category for video classification tasks. It is intended for video understanding, video categorization, and deep learning applications that work on video.

Core Features

Multiple Categories

Videos organized across multiple categories for comprehensive classification tasks.

Labeled Clips

All video clips are properly labeled and organized for easy access.

Train/Test Sets

Pre-organized training and test sets for immediate use in machine learning projects.

Frame Extraction

Built-in utilities for extracting frames from videos at specified intervals.

Ready for Models

Preprocessed and ready to use with popular video classification models.

OpenCV & FFmpeg

Comprehensive tools using OpenCV and FFmpeg for video processing.

🚀 Advanced Features

1. Video Data Augmentation

Automatically augment video frames with flipping, rotation, brightness, and contrast adjustments to increase dataset diversity and improve model generalization.
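
The flip, brightness, and contrast adjustments described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it assumes a grayscale frame stored as nested lists of 0-255 values, and the function names are hypothetical. A real pipeline would more likely operate on NumPy arrays read through OpenCV.

```python
from typing import List

Frame = List[List[int]]  # assumed layout: rows of 0-255 grayscale pixel values


def flip_horizontal(frame: Frame) -> Frame:
    """Mirror the frame left to right by reversing every row."""
    return [row[::-1] for row in frame]


def adjust_brightness_contrast(frame: Frame, contrast: float = 1.0,
                               brightness: float = 0.0) -> Frame:
    """Apply pixel * contrast + brightness and clamp the result to 0-255."""
    def clamp(value: float) -> int:
        return max(0, min(255, int(round(value))))

    return [[clamp(contrast * p + brightness) for p in row] for row in frame]


if __name__ == "__main__":
    frame = [[10, 100, 200], [20, 120, 220]]
    print(flip_horizontal(frame))
    print(adjust_brightness_contrast(frame, contrast=1.5, brightness=10))
```

Applying the same transform to every frame of a clip keeps the augmentation temporally consistent, which usually matters for video models.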

2. Intelligent Key Frame Extraction

Extract key frames using multiple methods: uniform sampling, scene change detection, or random selection for optimal feature representation.
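
To make the three selection methods concrete, here is a hedged sketch of how frame indices could be chosen. The function names, the flat-list frame representation, and the mean-absolute-difference threshold are all assumptions made for illustration; the project's own utilities may work differently.

```python
import random
from typing import List, Optional, Sequence


def uniform_indices(total_frames: int, num_frames: int) -> List[int]:
    """Pick evenly spaced frame indices, one from the middle of each slice."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    num_frames = min(num_frames, total_frames)
    step = total_frames / num_frames
    return [int(i * step + step / 2) for i in range(num_frames)]


def random_indices(total_frames: int, num_frames: int,
                   seed: Optional[int] = None) -> List[int]:
    """Pick distinct frame indices at random, returned in playback order."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    rng = random.Random(seed)
    return sorted(rng.sample(range(total_frames), min(num_frames, total_frames)))


def scene_change_indices(frames: Sequence[Sequence[int]],
                         threshold: float) -> List[int]:
    """Keep frame 0 plus every frame whose mean absolute pixel difference
    from the previous frame exceeds the threshold."""
    if not frames:
        return []
    keep = [0]
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            keep.append(i)
    return keep
```

Uniform sampling is cheap and predictable, while scene change detection needs the pixel data but avoids picking several near-identical frames from one shot.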

3. Batch Processing

Process multiple videos efficiently in batches with memory optimization and progress tracking for large-scale datasets.
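
A minimal sketch of the batching idea follows. It assumes a caller-supplied `process_one` function (hypothetical) and only shows the control flow: videos are handled a fixed number at a time so that no more than one batch needs to be held in memory, with a progress line after each batch.

```python
from typing import Callable, Iterable, Iterator, List


def iter_batches(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def process_in_batches(paths: Iterable[str], batch_size: int,
                       process_one: Callable[[str], object]) -> int:
    """Run process_one on every path, batch by batch, and report progress."""
    all_paths = list(paths)
    done = 0
    for batch in iter_batches(all_paths, batch_size):
        for path in batch:
            process_one(path)
        done += len(batch)
        print(f"processed {done}/{len(all_paths)} videos")
    return done
```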

4. Video Quality Analysis

Automatically analyze video quality metrics including sharpness, brightness, resolution, and generate quality scores for dataset curation.
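
As an illustration of two of these metrics, the sketch below computes brightness as the mean pixel value and sharpness as the variance of a 4-neighbor Laplacian response, a common focus measure. Whether the project uses this exact measure is an assumption, and the nested-list grayscale frame and function names are hypothetical.

```python
from statistics import mean, pvariance
from typing import List

Frame = List[List[int]]  # assumed layout: rows of 0-255 grayscale pixel values


def brightness(frame: Frame) -> float:
    """Mean pixel value of the frame."""
    return mean([p for row in frame for p in row])


def sharpness(frame: Frame) -> float:
    """Variance of the Laplacian response over interior pixels.

    Flat or blurry frames give values near zero; frames with strong
    edges give larger values.
    """
    height, width = len(frame), len(frame[0])
    responses = [
        frame[y - 1][x] + frame[y + 1][x] + frame[y][x - 1] + frame[y][x + 1]
        - 4 * frame[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses) if responses else 0.0
```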

5. Video Summary Generation

Create concise summary videos from long videos by extracting and combining key frames for quick preview and analysis.

6. Comprehensive Dataset Reports

Generate detailed analytics reports with statistics, category distributions, and quality metrics for dataset management.

✨ Unique Features

★ Duplicate Video Detection

Automatically detect duplicate or similar videos using perceptual hashing to maintain dataset quality and avoid redundancy.
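
Perceptual hashing comes in several variants, and the source does not say which one is used. The sketch below shows one of the simplest, the average hash, under the assumption that a representative frame has already been reduced to a small grayscale grid (for example 8x8). The function names and the distance threshold are hypothetical.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Build a bit string with one bit per pixel: 1 if the pixel is
    brighter than the grid's mean, else 0."""
    flat = [p for row in pixels for p in row]
    mean_value = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean_value else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: int, b: int, max_distance: int = 5) -> bool:
    """Treat two hashes as duplicates when only a few bits differ."""
    return hamming_distance(a, b) <= max_distance
```

Because each bit depends only on whether a pixel is above the mean, a uniform brightness shift leaves the hash unchanged, which is what lets re-encoded or slightly adjusted copies match.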

★ Smart Video Splitting

Intelligently split long videos into shorter segments with configurable duration and overlap for better training data preparation.
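
The interaction between segment duration and overlap is easiest to see as arithmetic on start and end times. The sketch below is an assumption about how such boundaries could be computed (the function name is hypothetical); the actual cutting would then be done with a tool such as FFmpeg or OpenCV.

```python
from typing import List, Tuple


def segment_bounds(total_seconds: float, segment_seconds: float,
                   overlap_seconds: float = 0.0) -> List[Tuple[float, float]]:
    """Return (start, end) times for segments of the requested length.

    Each segment starts segment_seconds - overlap_seconds after the
    previous one; the last segment is shortened to end at total_seconds.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    if not 0 <= overlap_seconds < segment_seconds:
        raise ValueError("overlap must be non-negative and shorter than a segment")
    step = segment_seconds - overlap_seconds
    bounds: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        bounds.append((start, end))
        if end >= total_seconds:
            break
        start += step
    return bounds


if __name__ == "__main__":
    # A 25-second video cut into 10-second segments with 2 seconds of overlap.
    print(segment_bounds(25, 10, 2))
```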

★ Auto Thumbnail Generation

Automatically generate high-quality thumbnails from videos using multiple methods: middle frame, first frame, or best quality frame selection.

★ Video Montage Creation

Create montage videos from multiple sources arranged in customizable grid layouts for visualization and presentation.

★ Dataset Balance Analysis

Analyze how evenly videos are distributed across categories and get recommendations for improving the balance before training.
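
One simple way to quantify balance is to count videos per category and compare each count with the average. The sketch below does that; the function name, the report fields, and the 50% tolerance are assumptions chosen for illustration rather than the project's actual criteria.

```python
from collections import Counter
from typing import Dict, Iterable


def balance_report(labels: Iterable[str], tolerance: float = 0.5) -> Dict[str, object]:
    """Summarize category counts and flag underrepresented categories.

    A category is flagged when it has fewer than tolerance * mean videos.
    """
    counts = Counter(labels)
    if not counts:
        return {"counts": {}, "imbalance_ratio": None, "underrepresented": []}
    mean_count = sum(counts.values()) / len(counts)
    underrepresented = sorted(
        name for name, n in counts.items() if n < mean_count * tolerance
    )
    return {
        "counts": dict(counts),
        # Largest category size divided by smallest; 1.0 means perfectly balanced.
        "imbalance_ratio": max(counts.values()) / min(counts.values()),
        "underrepresented": underrepresented,
    }


if __name__ == "__main__":
    labels = ["action"] * 8 + ["comedy"] * 6 + ["drama"]
    print(balance_report(labels))
```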

★ Auto-Categorization (ML-Ready)

Framework for automatic video categorization using machine learning models with easy integration points.

Technologies

MP4, MOV, OpenCV, FFmpeg, Video Processing, Python, Machine Learning

📖 How to Use: Step-by-Step Guide

Step 1: Install Dependencies

First, install all required Python packages:

pip install -r requirements.txt

This installs OpenCV, NumPy, and other essential libraries for video processing.

Step 2: Create Directory Structure

Set up the folder structure for organizing your videos:

python scripts/download_sample_data.py --create-structure

This creates the raw_videos/ directory with one folder per category (action, comedy, drama, sports, and so on).

Step 3: Add Your Videos

Place your video files in the appropriate category folders:

                            
raw_videos/
├── action/
│   └── your_video.mp4
├── comedy/
│   └── your_video.mp4
└── ...

Tip: You can also use the interactive mode: python scripts/add_videos.py --interactive

Step 4: Organize Dataset

Automatically split videos into train (70%), test (20%), and validation (10%) sets:

python scripts/organize_dataset.py --input raw_videos --output data

This organizes your videos into the data/train/, data/test/, and data/validation/ directories.
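
For readers who want to see what a 70/20/10 split amounts to, here is a minimal sketch. It is not the code in organize_dataset.py: the function name, the fixed seed, and the choice to give the remainder to validation are assumptions. Applying it once per category folder would keep the category proportions similar in each set.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(items: Sequence[str], train: float = 0.7, test: float = 0.2,
                  seed: int = 42) -> Dict[str, List[str]]:
    """Shuffle items reproducibly and split them into train/test/validation.

    Validation receives whatever is left after train and test are filled.
    """
    if train < 0 or test < 0 or train + test > 1:
        raise ValueError("train and test must be non-negative and sum to at most 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "validation": shuffled[n_train + n_test:],
    }


if __name__ == "__main__":
    videos = [f"video_{i:02d}.mp4" for i in range(10)]
    for name, subset in split_dataset(videos).items():
        print(name, len(subset))
```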

Step 5: Process Videos (Optional)

Resize and normalize videos for consistent format:

python scripts/process_videos.py --input raw_videos --output data/train

This ensures all videos have uniform resolution (224x224) and format.

Step 6: Extract Frames (Optional)

Extract frames from videos for frame-based models:

python scripts/extract_frames.py --input data/train --output frames/train

Extracts frames at 1 frame per second (configurable in config.yaml).
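
Sampling at 1 frame per second means keeping roughly every Nth frame, where N is the video's native frame rate. The sketch below shows that calculation only; the function name is hypothetical, and reading and writing the frames themselves (for example with OpenCV) is left out.

```python
from typing import List


def frame_indices_at_rate(total_frames: int, video_fps: float,
                          target_fps: float = 1.0) -> List[int]:
    """Return the indices of the frames to keep when downsampling a video
    from video_fps to approximately target_fps."""
    if total_frames <= 0 or video_fps <= 0 or target_fps <= 0:
        return []
    # Keep every step-th frame; never skip fewer than one frame at a time.
    step = max(1, round(video_fps / target_fps))
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    # A 3-second clip at 30 fps sampled at 1 frame per second.
    print(frame_indices_at_rate(90, 30.0))
```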

Step 7: Verify Dataset

Check your dataset statistics and metadata:

python scripts/create_sample_metadata.py --summary

This generates a comprehensive report of your dataset including video counts per category.

Step 8: Use the Dataset

Start using your dataset in Python:

                            
from utils.dataset_utils import get_videos_by_category

# Get videos by category
videos = get_videos_by_category('data/train')
print(videos)

See examples/video_loader_example.py for complete usage examples.
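
If the project utilities are not at hand, a similar listing can be produced with the standard library alone. The sketch below is a stand-in, not the implementation of get_videos_by_category: it assumes the layout of one subfolder per category, treats .mp4 and .mov as video extensions, and uses a hypothetical function name and return shape.

```python
from pathlib import Path
from typing import Dict, List

VIDEO_EXTENSIONS = {".mp4", ".mov"}  # assumed set of video file extensions


def list_videos_by_category(root: str) -> Dict[str, List[str]]:
    """Map each category folder under root to its sorted video file names."""
    result: Dict[str, List[str]] = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return result
    for category_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        result[category_dir.name] = sorted(
            f.name
            for f in category_dir.iterdir()
            if f.is_file() and f.suffix.lower() in VIDEO_EXTENSIONS
        )
    return result


if __name__ == "__main__":
    print(list_videos_by_category("data/train"))
```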

Quick Start Commands

For experienced users, here's a quick reference:

                    
# Install dependencies
pip install -r requirements.txt

# Create structure
python scripts/download_sample_data.py --create-structure

# Organize dataset
python scripts/organize_dataset.py --input raw_videos --output data

# Extract frames
python scripts/extract_frames.py --input data/train --output frames/train

Documentation

For detailed usage instructions, examples, and API documentation, please refer to:

README.md - Project overview and setup
USAGE.md - Detailed usage guide
Examples - Code examples and tutorials
