Data Governance
Definition
Data governance is the set of policies, roles, processes, and tools that ensure data quality, security, privacy, and compliance across an organization’s data landscape.
Use Cases
- Unilever: Enterprise data catalog and governance to help employees find trusted datasets and improve data reuse across business units. Approach: adopted a centralized data catalog and governance approach with defined data ownership (data stewards), standardized business glossary terms, and dataset certification to indicate trusted sources; integrated catalog metadata with analytics workflows to encourage self-service discovery. Outcome: improved data discoverability and reuse, fewer duplicated datasets and reporting inconsistencies, and faster analytics delivery through clearer ownership and standardized definitions.
- HSBC: Governance and control of sensitive financial data to support regulatory compliance and reduce data risk. Approach: implemented enterprise-wide data governance practices including data classification, lineage documentation, access controls, and auditability; aligned governance policies with risk and compliance requirements and embedded controls into data platforms. Outcome: better visibility into sensitive data, stronger audit readiness, and reduced operational risk through consistent controls and clearer accountability.
- Kaiser Permanente: Protecting patient data while enabling analytics for care and operations. Approach: established governance policies for PHI/PII handling, role-based access, and auditing; used automated discovery/classification and access monitoring capabilities within their data platforms to enforce HIPAA-aligned controls. Outcome: more consistent protection of patient data, improved compliance posture, and safer self-service analytics through standardized access and auditing.
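The controls that recur across these use cases (named ownership, dataset certification, sensitivity classification, role-based access, and auditing) can be illustrated with a small in-memory model. This is a minimal sketch under stated assumptions, not how any of the organizations above or any vendor product implements them: the `Catalog` class, the `Sensitivity` levels, the role names, and the `ROLE_CLEARANCE` policy are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical classification levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3  # e.g. PII/PHI


@dataclass(frozen=True)
class Dataset:
    name: str
    owner: str                # accountable data steward or team
    sensitivity: Sensitivity
    certified: bool = False   # True once marked as a trusted source


# Hypothetical policy: the highest sensitivity level each role may read.
ROLE_CLEARANCE = {
    "analyst": Sensitivity.INTERNAL,
    "clinician": Sensitivity.RESTRICTED,
}


@dataclass
class Catalog:
    datasets: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def register(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def find_certified(self) -> list:
        """Return the names of datasets marked as trusted, sorted alphabetically."""
        return sorted(d.name for d in self.datasets.values() if d.certified)

    def can_read(self, user: str, role: str, dataset_name: str) -> bool:
        """Role-based access check; every decision is appended to the audit log."""
        dataset = self.datasets[dataset_name]
        clearance = ROLE_CLEARANCE.get(role)  # unknown roles get no access
        allowed = (
            clearance is not None
            and clearance.value >= dataset.sensitivity.value
        )
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "dataset": dataset_name,
            "allowed": allowed,
        })
        return allowed


catalog = Catalog()
catalog.register(Dataset("sales_summary", "finance-stewards",
                         Sensitivity.INTERNAL, certified=True))
catalog.register(Dataset("patient_visits", "clinical-stewards",
                         Sensitivity.RESTRICTED))

print(catalog.find_certified())                              # ['sales_summary']
print(catalog.can_read("ana", "analyst", "patient_visits"))  # False
```

Recording denied requests as well as granted ones is a deliberate choice in this sketch: an audit trail that only shows successful reads cannot answer who attempted to reach restricted data.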
Provider Equivalents
- AWS: Amazon DataZone
- Azure: Microsoft Purview
- GCP: Dataplex
- OCI: OCI Data Catalog
Frequently Asked Questions
- What's the difference between Data Governance and Data Management?
- Data management is the day-to-day work of collecting, storing, processing, and delivering data (pipelines, databases, backups, performance). Data governance is the set of rules, roles, and controls that decide how data should be defined, protected, accessed, and used (ownership, policies, quality standards, privacy, compliance). Governance guides management so the data is trustworthy and used responsibly.
- When should I use Data Governance?
- Use data governance when multiple teams share data, when you handle sensitive data (PII/PHI/payment data), or when you need consistent definitions and trusted reporting. Common triggers include moving to a data lake or warehouse, enabling self-service analytics, adopting AI/ML on enterprise data, combining datasets after a merger, and needing to meet regulations such as GDPR, HIPAA, PCI DSS, or SOX.
- How much does Data Governance cost?
- Costs typically include (1) tooling (catalog, classification, access governance, monitoring), often priced by data scanned, number of assets, users, or capacity; (2) cloud consumption (storage, query, logging, encryption key usage); and (3) people/process (data owners, stewards, policy work, training). The biggest cost driver is usually organizational effort to define ownership, standards, and workflows, not just the software.
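The three cost buckets in the answer above can be totalled and compared in a few lines. The following is a rough sketch that assumes you supply your own annual estimates; the `GovernanceCosts` class and its field names are hypothetical, and the figures in the usage lines are made-up placeholders, not benchmarks.

```python
from dataclasses import dataclass


@dataclass
class GovernanceCosts:
    """Annual cost estimates in a single currency (all inputs are your own estimates)."""
    tooling: float            # catalog, classification, access governance, monitoring
    cloud_consumption: float  # storage, query, logging, encryption key usage
    people_process: float     # data owners, stewards, policy work, training

    def total(self) -> float:
        return self.tooling + self.cloud_consumption + self.people_process

    def shares(self) -> dict:
        """Fraction of the total contributed by each bucket."""
        total = self.total()
        if total == 0:
            return {name: 0.0 for name in vars(self)}
        return {name: value / total for name, value in vars(self).items()}

    def largest_driver(self) -> str:
        """Name of the bucket with the highest estimated cost."""
        costs = vars(self)
        return max(costs, key=costs.get)


# Placeholder numbers for illustration only.
estimate = GovernanceCosts(tooling=100.0, cloud_consumption=50.0, people_process=250.0)
print(estimate.total())           # 400.0
print(estimate.largest_driver())  # people_process
```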
Category: data
Difficulty: advanced
Related Terms
See Also