Canvas Cloud AI

Data Lake

Level: advanced
Category: data

Definition

A data lake is a centralized repository that stores structured, semi-structured, and unstructured data in raw form at any scale. Think of it as a massive digital reservoir that holds data in its original format until you need to analyze it.

Real-World Example

Companies load sensor data, logs, images, and documents into a data lake as-is, then run analytics tools over that data to find patterns and insights when needed.
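
As a rough illustration of that pattern, the sketch below uses a local temporary directory as a stand-in for object storage. The function names (land_raw, read_sensor_readings), the raw/<source>/<day>/ path layout, and the sample records are assumptions made for this example, not part of any provider's API. Files are written unchanged, and structure is applied only when the data is read.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_root: Path, source: str, day: str, name: str, payload: bytes) -> Path:
    """Store a payload byte-for-byte under raw/<source>/<day>/ (hypothetical layout)."""
    target = lake_root / "raw" / source / day / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)  # no parsing or validation at write time
    return target


def read_sensor_readings(lake_root: Path, day: str) -> list[dict]:
    """Apply a schema at read time: parse JSON lines and keep only usable rows."""
    rows = []
    for path in sorted((lake_root / "raw" / "sensors" / day).glob("*.jsonl")):
        for line in path.read_text().splitlines():
            if not line.strip():
                continue
            record = json.loads(line)
            # Coerce types now; skip records that lack the expected fields.
            if "device" in record and "temp_c" in record:
                rows.append({"device": str(record["device"]),
                             "temp_c": float(record["temp_c"])})
    return rows


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        # Mixed content lands side by side in its original form.
        land_raw(lake, "sensors", "2024-01-01", "batch1.jsonl",
                 b'{"device": "a", "temp_c": "21.5"}\n'
                 b'{"device": "b", "temp_c": 19}\n'
                 b'{"note": "no reading"}\n')
        land_raw(lake, "logs", "2024-01-01", "app.log", b"INFO started\nERROR disk full\n")
        land_raw(lake, "images", "2024-01-01", "cam.bin", bytes([0, 1, 2, 3]))

        readings = read_sensor_readings(lake, "2024-01-01")
        average = sum(r["temp_c"] for r in readings) / len(readings)
        stored = sorted(p.relative_to(lake).as_posix()
                        for p in lake.rglob("*") if p.is_file())
        return {"readings": readings, "average_temp_c": average, "files": stored}


if __name__ == "__main__":
    print(demo())
```

Nothing is rejected when it is written: the malformed sensor record and the binary file are both stored. Only the reader decides which records are usable, which is the "holds data in its original form until you need to analyze it" idea in miniature.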

Cloud Provider Equivalencies

A data lake is typically built on low-cost object storage. AWS commonly uses S3 as the storage layer with Lake Formation/Glue for governance and cataloging; Azure uses ADLS Gen2; GCP uses Cloud Storage with Dataplex/BigLake for governance; OCI uses Object Storage with Data Catalog for metadata.

AWS
Amazon S3 (with AWS Lake Formation and AWS Glue commonly used for governance and cataloging)
Azure
Azure Data Lake Storage Gen2
GCP
Google Cloud Storage (often paired with Dataplex and BigLake for governance and unified access)
OCI
OCI Object Storage (often paired with OCI Data Catalog for metadata management)
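
The split between a storage layer and a catalog layer can be sketched with a toy in-memory catalog. This is not the API of AWS Glue, Dataplex, or OCI Data Catalog; the CatalogEntry and Catalog classes, their fields, and the example URIs are all hypothetical. The point is only that the catalog records where raw objects live and how to interpret them, while the objects themselves stay in object storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str                           # logical dataset name
    location: str                       # object-storage prefix (hypothetical URI)
    file_format: str                    # how readers should parse the objects
    columns: tuple[str, ...] = ()       # known fields, empty if unstructured
    tags: frozenset[str] = frozenset()  # governance labels


class Catalog:
    """Toy metadata catalog: it describes datasets and holds none of the data."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"dataset already registered: {entry.name}")
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def find_by_tag(self, tag: str) -> list[str]:
        return sorted(e.name for e in self._entries.values() if tag in e.tags)


def build_example_catalog() -> Catalog:
    catalog = Catalog()
    catalog.register(CatalogEntry(
        name="sensor_readings",
        location="s3://example-lake/raw/sensors/",  # made-up bucket
        file_format="jsonl",
        columns=("device", "temp_c"),
        tags=frozenset({"iot"}),
    ))
    catalog.register(CatalogEntry(
        name="support_tickets",
        location="s3://example-lake/raw/tickets/",  # made-up bucket
        file_format="csv",
        columns=("ticket_id", "email", "body"),
        tags=frozenset({"pii", "support"}),
    ))
    return catalog


if __name__ == "__main__":
    catalog = build_example_catalog()
    print(catalog.lookup("sensor_readings").location)
    print(catalog.find_by_tag("pii"))
```

A managed catalog adds much more than this (crawlers, access policies, schema versioning), but the lookup-by-name and find-by-tag operations capture the basic role it plays alongside the storage layer.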

Compare Across Cloud Providers

Data lake services are available on all major cloud platforms. Equivalent services by provider:

AWS
AWS Lake Formation
Azure
Azure Data Lake Storage
Google Cloud
Cloud Storage (with Dataplex and BigLake)
Oracle Cloud
OCI Object Storage (with OCI Data Catalog)
