gcp

System Design

advanced

Video sharing and streaming platform

YouTube Video Streaming System

YouTube / Google

YouTube handles over 500 hours of video uploads every minute and serves billions of views daily. This GCP-native architecture separates the upload pipeline (chunked upload → transcoding on GKE → multiple resolutions in Cloud Storage) from the viewing pipeline (Cloud CDN → adaptive bitrate streaming). It is aimed at engineers building video platforms that need adaptive bitrate streaming, automated moderation, and personalized recommendations.

Data Flow

Diagram components: Cloud CDN, Upload API, Transcode Queue, Upload Service, Transcoding Workers, Recommendation Engine, Video Storage, Video Metadata DB, Memorystore Cache.

Service Breakdown (9 services)

Cloud CDN

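•Caches video segments and manifests at edge locations close to viewers
•Sits in front of an external Application Load Balancer, which health-checks backends and routes around failures
•Absorbs traffic spikes by serving cache hits without reaching the origin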

Upload API

•Routes API traffic and enforces policies
•Manages authentication and rate limiting
•Provides a unified API endpoint

Upload Service

•Validates and ingests video uploads from creators
•Extracts metadata and generates processing tasks
•Enforces content size and format restrictions
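
The size and format checks above could look roughly like the following. This is a minimal sketch: the size limit, the extension allow-list, and the names `UploadRequest` and `validate_upload` are assumptions for illustration, not values taken from this architecture.

```python
import os
from dataclasses import dataclass

# Hypothetical limits; the source does not state the real ones.
MAX_SIZE_BYTES = 256 * 1024**3
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm", ".mkv"}


@dataclass(frozen=True)
class UploadRequest:
    filename: str
    size_bytes: int


def validate_upload(req: UploadRequest) -> list[str]:
    """Return a list of problems; an empty list means the upload is accepted."""
    errors = []
    ext = os.path.splitext(req.filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported format: {ext or '(none)'}")
    if req.size_bytes <= 0:
        errors.append("file is empty")
    elif req.size_bytes > MAX_SIZE_BYTES:
        errors.append("file exceeds size limit")
    return errors
```

A real service would likely also probe the container and codecs rather than trust the file extension; the sketch only covers the cheap checks that can run before any bytes are stored.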

Transcoding Workers

•Converts videos to multiple resolutions and codecs
•Parallelizes encoding for fast turnaround
•Produces adaptive bitrate streaming variants
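
To make "adaptive bitrate streaming variants" concrete, the sketch below picks renditions for a source video and writes an HLS master playlist that lists them. The rendition ladder (resolutions and bitrates), the `Rendition` type, and the `<name>/playlist.m3u8` path layout are assumptions; only the `#EXTM3U` and `#EXT-X-STREAM-INF` tags come from the HLS format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bitrate_kbps: int


# Hypothetical ladder; real ladders are tuned per codec and content type.
LADDER = [
    Rendition("360p", 640, 360, 800),
    Rendition("720p", 1280, 720, 2800),
    Rendition("1080p", 1920, 1080, 5000),
]


def renditions_for(source_height: int, ladder=LADDER) -> list[Rendition]:
    """Keep only renditions that do not upscale the source."""
    return [r for r in ladder if r.height <= source_height]


def master_playlist(renditions: list[Rendition]) -> str:
    """Build an HLS master playlist pointing at one media playlist per rendition."""
    lines = ["#EXTM3U"]
    for r in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bitrate_kbps * 1000},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(f"{r.name}/playlist.m3u8")
    return "\n".join(lines) + "\n"
```

The player reads this manifest and switches between the listed variants as measured bandwidth changes, which is why each rendition can be encoded by a separate worker in parallel.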

Recommendation Engine

•Ranks and personalizes video suggestions per user
•Combines collaborative filtering with watch history
•Optimizes for engagement and content diversity
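
As an illustration of collaborative filtering over watch history, here is a deliberately small user-to-user sketch: users with overlapping histories vote for videos the target user has not seen, weighted by Jaccard similarity. The function name, the in-memory `histories` mapping, and the choice of Jaccard similarity are all assumptions; a production recommender would be a learned model and would also handle the engagement and diversity objectives, which this sketch ignores.

```python
from collections import Counter


def recommend(user: str, histories: dict[str, set[str]], k: int = 3) -> list[str]:
    """Suggest up to k unseen videos, scored by similar users' watch histories."""
    seen = histories[user]
    scores = Counter()
    for other, watched in histories.items():
        if other == user:
            continue
        overlap = len(seen & watched)
        if overlap == 0:
            continue  # no shared videos, so no evidence of similar taste
        similarity = overlap / len(seen | watched)  # Jaccard similarity
        for video in watched - seen:
            scores[video] += similarity
    # Highest score first; ties broken by video id for deterministic output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [video for video, _ in ranked[:k]]
```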

Video Storage

•Stores transcoded video segments and thumbnails
•Serves content globally via CDN integration
•Manages storage tiers for hot and archive content
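
One way to think about hot versus archive tiers is a rule that maps how long a video has gone unwatched to a Cloud Storage class. The class names below are the real Cloud Storage classes; the idle-day thresholds and the function name are assumptions, chosen here to line up with each class's minimum storage duration.

```python
# (minimum idle days, storage class), checked from coldest to warmest.
# Thresholds are illustrative assumptions, not values from this architecture.
TIERS = [
    (365, "ARCHIVE"),
    (90, "COLDLINE"),
    (30, "NEARLINE"),
]


def storage_class_for(days_since_last_view: int) -> str:
    """Pick the coldest class whose idle threshold the video has reached."""
    for min_idle_days, storage_class in TIERS:
        if days_since_last_view >= min_idle_days:
            return storage_class
    return "STANDARD"
```

In practice this kind of policy is usually expressed as bucket lifecycle rules or delegated to Autoclass rather than run in application code; the function only shows the decision being made.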

Transcode Queue

•Buffers transcoding jobs for parallel processing
•Prioritizes based on creator tier and video length
•Retries failed jobs with exponential backoff
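
The prioritization and retry behavior above can be sketched in-process as follows. The tier names, the ordering rule (higher tier first, then shorter videos), and the base and cap delays are assumptions. The sketch uses a local heap rather than Pub/Sub, so it shows the policy rather than the transport.

```python
import heapq
import itertools

# Hypothetical creator tiers; a lower number is served first.
TIER_PRIORITY = {"partner": 0, "standard": 1}


def job_priority(creator_tier: str, duration_s: int) -> tuple[int, int]:
    """Order by creator tier, then by video length (shorter first)."""
    return (TIER_PRIORITY.get(creator_tier, 1), duration_s)


class TranscodeQueue:
    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def push(self, video_id: str, creator_tier: str, duration_s: int) -> None:
        key = job_priority(creator_tier, duration_s)
        heapq.heappush(self._heap, (key, next(self._counter), video_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]


def backoff_delay(attempt: int, base_s: float = 10.0, cap_s: float = 600.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap_s, base_s * 2**attempt)
```

Usage: a worker that fails a job on its third attempt (`attempt=2`) would wait `backoff_delay(2)` seconds, which is 40.0 with the defaults, before the job is retried. Adding random jitter to the delay is a common refinement that is left out here to keep the output deterministic.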

Video Metadata DB

•Stores video titles, tags, and channel information
•Supports full-text search for video discovery
•Handles high-throughput read queries from the API

Memorystore Cache

•Caches video metadata and user session data
•Reduces database load for frequently accessed items
•Supports sub-millisecond reads for API responses
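
The caching behavior described above is commonly implemented as a cache-aside read path. The sketch below uses a small in-memory TTL cache as a stand-in for Memorystore so that it runs without a server; `TTLCache`, `get_video_metadata`, and the `load_from_db` callback are hypothetical names.

```python
import time


class TTLCache:
    """In-memory stand-in for a Redis-style cache with per-entry expiry."""

    def __init__(self, ttl_s: float, clock=time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value) -> None:
        self._entries[key] = (self._clock() + self._ttl_s, value)


def get_video_metadata(video_id: str, cache: TTLCache, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    cached = cache.get(video_id)
    if cached is not None:
        return cached
    row = load_from_db(video_id)
    if row is not None:
        cache.set(video_id, row)
    return row
```

The TTL bounds how stale a cached title or tag list can be after an edit; a write path that also deletes the cache key on update would tighten that further.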

Scaling Strategy

Video uploads are chunked and stored in Cloud Storage before transcoding. Pub/Sub distributes transcoding jobs across GKE workers that auto-scale based on subscription backlog. Each video produces multiple resolution variants stored in Cloud Storage. Viewing traffic scales through Cloud CDN edge caching with adaptive bitrate manifests. Metadata queries hit Memorystore first, falling back to Cloud SQL read replicas.
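
Two pieces of arithmetic sit behind this strategy: splitting an upload into byte ranges, and sizing the worker pool from the queue backlog. The sketch below shows both under stated assumptions: the function names, the per-worker backlog target, and the replica bounds are hypothetical. The 256 KiB chunk granularity matches what Cloud Storage resumable uploads expect for all chunks except the last.

```python
import math

CHUNK_GRANULARITY = 256 * 1024  # bytes


def chunk_ranges(total_bytes: int, chunk_bytes: int) -> list[tuple[int, int]]:
    """Split [0, total_bytes) into inclusive (start, end) byte ranges."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    return [
        (start, min(start + chunk_bytes, total_bytes) - 1)
        for start in range(0, total_bytes, chunk_bytes)
    ]


def desired_workers(backlog: int, target_per_worker: int,
                    min_workers: int = 1, max_workers: int = 50) -> int:
    """Scale so each worker has about target_per_worker queued jobs, within bounds."""
    wanted = math.ceil(backlog / target_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

For example, a backlog of 23 undelivered jobs with a target of 5 per worker gives `desired_workers(23, 5)`, which is 5. On GKE this calculation is normally performed by an autoscaler driven by a backlog metric rather than by application code.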
