Modern Data Stack
A modern data lake architecture on GCP separates storage from compute using Cloud Storage and BigQuery. This design uses a medallion architecture (raw → curated → aggregated), with Dataflow for streaming and batch ETL, BigQuery for serverless SQL analytics, and Pub/Sub for real-time event ingestion. It is built for data engineering teams that centralize analytics from multiple sources into a governed, query-ready data platform.
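The raw → curated → aggregated flow can be illustrated with a minimal, library-free sketch. It is not the Dataflow or BigQuery implementation of this architecture: the event fields (`event_id`, `source`, `amount`) and the function names `to_curated` and `to_aggregated` are assumptions chosen for illustration only. In a real pipeline the same steps would typically run as Dataflow transforms writing to Cloud Storage and BigQuery.

```python
import json
from collections import defaultdict


def to_curated(raw_lines):
    """Raw -> curated: parse JSON, drop malformed rows, deduplicate by event_id.

    The field names used here are hypothetical, not taken from the architecture.
    """
    seen_ids = set()
    curated = []
    for line in raw_lines:
        try:
            event = json.loads(line)
            row = {
                "event_id": str(event["event_id"]),
                "source": str(event["source"]),
                "amount": float(event["amount"]),
            }
        except (ValueError, KeyError, TypeError):
            # Malformed JSON, missing fields, or non-numeric amounts are skipped.
            continue
        if row["event_id"] in seen_ids:
            continue  # duplicate delivery of the same event
        seen_ids.add(row["event_id"])
        curated.append(row)
    return curated


def to_aggregated(curated_rows):
    """Curated -> aggregated: total amount per source."""
    totals = defaultdict(float)
    for row in curated_rows:
        totals[row["source"]] += row["amount"]
    return dict(totals)


# Raw layer: events exactly as ingested, including a duplicate and a bad record.
raw = [
    '{"event_id": "a1", "source": "web", "amount": 10}',
    '{"event_id": "a1", "source": "web", "amount": 10}',
    'not json',
    '{"event_id": "b2", "source": "mobile", "amount": 2.5}',
    '{"event_id": "c3", "source": "web", "amount": 5}',
]
curated = to_curated(raw)
aggregated = to_aggregated(curated)
```

Keeping the raw layer untouched and deriving the other two layers from it means a curated or aggregated table can be rebuilt whenever cleaning rules change.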
Cloud Storage provides virtually unlimited storage that scales automatically. Pub/Sub handles real-time ingestion with automatic scaling. Dataflow pipelines auto-scale workers based on backlog. BigQuery is serverless: under on-demand pricing you pay for the data each query scans, and slots are allocated automatically. Dataproc clusters spin up on demand for Spark workloads and auto-scale based on YARN metrics.
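To make the idea of backlog-based scaling concrete, here is a simplified heuristic. It is an illustrative assumption, not Dataflow's actual autoscaling algorithm: the function `target_workers` and all of its parameters (per-worker throughput, drain-time target, worker bounds) are hypothetical.

```python
import math


def target_workers(backlog, rate_per_worker, drain_seconds,
                   min_workers=1, max_workers=100):
    """Estimate the workers needed to drain `backlog` messages in `drain_seconds`.

    Hypothetical heuristic for illustration; managed services use richer
    signals (CPU utilization, throughput trends) than this single ratio.
    """
    if rate_per_worker <= 0 or drain_seconds <= 0:
        raise ValueError("rate_per_worker and drain_seconds must be positive")
    # Each worker can process rate_per_worker * drain_seconds messages in the window.
    needed = math.ceil(backlog / (rate_per_worker * drain_seconds))
    # Clamp to the configured bounds so the pool never drops to zero or grows unbounded.
    return max(min_workers, min(max_workers, needed))


idle = target_workers(backlog=0, rate_per_worker=100, drain_seconds=60)
busy = target_workers(backlog=120_000, rate_per_worker=100, drain_seconds=60)
capped = target_workers(backlog=10**9, rate_per_worker=100, drain_seconds=60)
```

With these assumed numbers, an empty backlog stays at the minimum of one worker, a backlog of 120,000 messages calls for 20 workers, and an extreme backlog is capped at the 100-worker maximum.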
YouTube Video Streaming System
YouTube / Google
Video upload, transcoding, and adaptive bitrate streaming on GCP handling 500+ hours of video uploaded per minute.
Web Crawler System
System Design Classic
Distributed web crawler on GCP with Pub/Sub URL frontier, Cloud Run workers, deduplication, and content extraction at web scale.
Multi-Tenant SaaS Platform
Generic SaaS
Production-ready multi-tenant SaaS with tenant isolation, feature flags, usage metering, and self-serve onboarding.
URL Shortener System
System Design Classic
High-throughput URL shortening service with analytics, custom aliases, and 301/302 redirect handling at scale.
Notification System
System Design Classic
Multi-channel notification system on Azure supporting push, email, SMS, and in-app notifications with Event Grid fan-out.
Dropbox File Storage System
Dropbox
Cloud file storage on Azure with chunked uploads to Blob Storage, delta sync, deduplication, and cross-device synchronization.
Data Lake & Analytics Platform