Data Catalog

intermediate
big data

Definition

A centralized repository of metadata that helps users discover, understand, and manage data assets across an organization.

Real-World Example

A data catalog lets a data scientist search for 'customer behavior' and see the matching tables along with their descriptions and owners.
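The search described above can be sketched with a small in-memory catalog. This is a minimal illustration, not how any particular product works: the CatalogEntry class, the search function, and the sample entries are all hypothetical names invented for this example, and matching is plain case-insensitive substring matching.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata describing one data asset (the catalog holds no actual data)."""
    name: str
    description: str
    owner: str
    tags: list[str] = field(default_factory=list)


def search(entries, query):
    """Return entries whose name, description, or tags contain every query word."""
    words = query.lower().split()
    results = []
    for entry in entries:
        # Combine all searchable metadata into one lowercase string.
        haystack = " ".join([entry.name, entry.description, *entry.tags]).lower()
        if all(word in haystack for word in words):
            results.append(entry)
    return results


# Hypothetical sample entries, invented for illustration.
catalog = [
    CatalogEntry("clickstream_events",
                 "Customer behavior events from the web store",
                 "analytics-team", ["customer", "behavior"]),
    CatalogEntry("customer_profiles",
                 "Customer master records",
                 "crm-team", ["customer", "pii"]),
    CatalogEntry("invoice_lines",
                 "Billing line items",
                 "finance-team", ["billing"]),
]

for entry in search(catalog, "customer behavior"):
    print(f"{entry.name}: {entry.description} (owner: {entry.owner})")
```

Only the entry that mentions both "customer" and "behavior" is returned. A production catalog would typically add ranking, access control, and richer metadata such as schema and lineage.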

Cloud Provider Equivalencies

All four services provide a searchable metadata repository for datasets (tables, files, topics), covering schema, ownership, and tags, with lineage and governance integration available to varying degrees. AWS Glue Data Catalog focuses on analytics metadata for Athena, EMR, and Redshift; Microsoft Purview emphasizes enterprise governance and scanning across Azure and SaaS sources; Google Cloud Dataplex unifies lakehouse governance and discovery across Cloud Storage (GCS) and BigQuery; OCI Data Catalog catalogs OCI and external sources, with business glossaries and metadata harvesting.

AWS: AWS Glue Data Catalog
Azure: Microsoft Purview Data Catalog
GCP: Google Cloud Dataplex Data Catalog
OCI: OCI Data Catalog
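As one provider-specific illustration, the sketch below wraps the AWS Glue search_tables operation as exposed by boto3. The function name summarize_glue_search and the shape of the returned dictionaries are assumptions made for this example; the client is passed in so the function can be exercised without AWS credentials.

```python
def summarize_glue_search(glue_client, search_text, max_results=10):
    """Search the AWS Glue Data Catalog and return a compact summary per table.

    glue_client is expected to behave like boto3.client("glue").
    """
    response = glue_client.search_tables(
        SearchText=search_text,
        MaxResults=max_results,
    )
    summaries = []
    for table in response.get("TableList", []):
        summaries.append({
            "database": table.get("DatabaseName"),
            "table": table.get("Name"),
            # Description and Owner are optional in Glue table metadata.
            "description": table.get("Description", ""),
            "owner": table.get("Owner", ""),
        })
    return summaries
```

With boto3 installed and credentials configured, a call might look like summarize_glue_search(boto3.client("glue"), "customer behavior"). Results are paginated by the service, so a fuller version would also follow the NextToken value in the response.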
