A centralized repository that stores metadata and helps users discover, understand, and manage data assets across the organization.
For example, a data catalog lets data scientists find customer datasets by searching for 'customer behavior' and seeing which tables match, along with their descriptions and owners.
The major cloud providers each offer a searchable metadata repository for datasets (tables, files, topics) that covers schema, lineage, ownership, tags, and governance integration. AWS Glue Data Catalog focuses on analytics metadata for Athena, EMR, and Redshift; Microsoft Purview emphasizes enterprise governance and scanning across Azure and SaaS sources; Google Cloud Dataplex unifies lakehouse governance and discovery across Cloud Storage (GCS) and BigQuery; OCI Data Catalog covers OCI and external sources, with business glossaries and metadata harvesting.
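The 'customer behavior' search described above can be illustrated with a minimal in-memory sketch. It is not any vendor's API: the `CatalogEntry` fields, the `search` function, and the sample entries are all hypothetical, chosen only to show how a keyword query can be matched against names, descriptions, tags, and column names. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata for one data asset (hypothetical shape, not a vendor schema)."""
    name: str
    description: str
    owner: str
    tags: list[str] = field(default_factory=list)
    columns: list[str] = field(default_factory=list)


def search(entries, query):
    """Return entries whose metadata contains every term in the query."""
    terms = query.lower().split()
    results = []
    for entry in entries:
        # Build one lowercase text blob from the searchable metadata fields.
        haystack = " ".join(
            [entry.name, entry.description, *entry.tags, *entry.columns]
        ).lower()
        if all(term in haystack for term in terms):
            results.append(entry)
    return results


# Illustrative sample data only; names and owners are made up.
CATALOG = [
    CatalogEntry(
        name="analytics.customer_events",
        description="Clickstream events capturing customer behavior on the website",
        owner="web-analytics-team",
        tags=["customer", "behavior", "clickstream"],
        columns=["customer_id", "event_type", "event_ts"],
    ),
    CatalogEntry(
        name="finance.invoices",
        description="Issued invoices with amounts and due dates",
        owner="finance-data-team",
        tags=["billing"],
        columns=["invoice_id", "customer_id", "amount"],
    ),
]

if __name__ == "__main__":
    for hit in search(CATALOG, "customer behavior"):
        print(f"{hit.name} (owner: {hit.owner}): {hit.description}")
```

Plain substring matching keeps the sketch short. A production catalog would typically rely on a full-text index with ranking, and would enforce access controls on what each user can discover.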