Data Warehouse

Definition

A central repository of structured, curated business data, optimized for analytical queries and reporting rather than day-to-day transaction processing, that supports decision-making and business intelligence.

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between a data warehouse and a data lake?
A data warehouse stores curated, structured data (cleaned, modeled, and governed) that is optimized for SQL analytics and reporting. A data lake stores raw data in whatever format it arrives, whether structured, semi-structured, or unstructured (files, logs, JSON, images), and is often used for data science, exploration, and later transformation. Many organizations use both: they land data in a lake, then transform and load trusted datasets into a warehouse for BI.
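The lake-then-warehouse flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw events, the field names, and the fact_orders table are hypothetical, and an in-memory SQLite database stands in for a real warehouse engine.

```python
import json
import sqlite3

# Hypothetical raw events as they might land in a data lake: one JSON
# document per line, with inconsistent casing and an incomplete record.
RAW_EVENTS = """\
{"order_id": "1001", "amount": "19.99", "country": "us", "ts": "2024-03-01T10:15:00"}
{"order_id": "1002", "amount": "5.00", "country": "DE", "ts": "2024-03-01T11:30:00"}
{"order_id": "1003", "amount": null, "country": "us", "ts": "2024-03-02T09:00:00"}
{"order_id": "1004", "amount": "42.50", "country": "US", "ts": "2024-03-02T14:45:00"}
"""


def clean(line):
    """Turn one raw JSON line into a typed row, or None if it fails validation."""
    record = json.loads(line)
    if record.get("amount") is None:
        return None  # reject records missing a required field
    return (
        int(record["order_id"]),
        round(float(record["amount"]) * 100),  # store money as integer cents
        record["country"].upper(),             # normalize country codes
        record["ts"][:10],                     # keep only the order date
    )


def load_warehouse(raw_text):
    """Load cleaned rows into a typed table (in-memory SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE fact_orders (
               order_id INTEGER PRIMARY KEY,
               amount_cents INTEGER NOT NULL,
               country TEXT NOT NULL,
               order_date TEXT NOT NULL
           )"""
    )
    rows = [row for row in map(clean, raw_text.splitlines()) if row is not None]
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    return conn


def revenue_by_country(conn):
    """A typical BI query over the curated table."""
    return conn.execute(
        "SELECT country, SUM(amount_cents) FROM fact_orders "
        "GROUP BY country ORDER BY country"
    ).fetchall()


if __name__ == "__main__":
    warehouse = load_warehouse(RAW_EVENTS)
    for country, cents in revenue_by_country(warehouse):
        print(f"{country}: {cents / 100:.2f}")
```

The raw lines stay untouched in the "lake" string, while the warehouse table only accepts rows that pass validation and match a fixed schema. That split is the practical difference between the two systems.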
When should I use a data warehouse?
Use a data warehouse when you need reliable reporting and dashboards, consistent business metrics (like revenue, churn, inventory turns), fast SQL queries over large historical datasets, and strong governance (access controls, auditing, data quality). It’s especially useful when multiple teams need a shared source of truth for analytics.
How much does a data warehouse cost?
Cost depends on (1) the compute model (serverless, billed per query, vs. provisioned capacity), (2) data storage volume, (3) query frequency and complexity, (4) concurrency (how many users and tools query at once), (5) data ingestion/ETL costs, and (6) data egress/networking. For example, serverless warehouses often charge for the data scanned by each query plus storage, while provisioned warehouses charge for allocated compute (typically by the hour) plus storage. Partitioning, clustering/sort keys, materialized views, and workload management can significantly reduce cost, mainly by cutting the amount of data each query scans or the compute it needs.
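A rough way to compare the two pricing models is to put numbers on them. The sketch below is a simplified model that covers only compute and storage. Every price in it is a hypothetical placeholder, not any provider's actual rate, and the Workload fields are names invented for this example.

```python
from dataclasses import dataclass, replace

# All prices are hypothetical placeholders, not any provider's rates.
PRICE_PER_TB_SCANNED = 5.00        # serverless: dollars per TB scanned
PRICE_PER_NODE_HOUR = 2.00         # provisioned: dollars per node per hour
PRICE_PER_TB_STORED_MONTH = 20.00  # storage: dollars per TB per month
HOURS_PER_MONTH = 730


@dataclass
class Workload:
    stored_tb: float             # total data kept in the warehouse
    queries_per_month: int       # number of queries run each month
    tb_scanned_per_query: float  # average data read by one query


def storage_cost(w):
    """Monthly storage charge, which applies to both models."""
    return w.stored_tb * PRICE_PER_TB_STORED_MONTH


def serverless_monthly_cost(w):
    """Pay for data scanned by queries, plus storage."""
    scanned_tb = w.queries_per_month * w.tb_scanned_per_query
    return scanned_tb * PRICE_PER_TB_SCANNED + storage_cost(w)


def provisioned_monthly_cost(w, nodes):
    """Pay for allocated nodes around the clock, plus storage."""
    return nodes * HOURS_PER_MONTH * PRICE_PER_NODE_HOUR + storage_cost(w)


if __name__ == "__main__":
    baseline = Workload(stored_tb=10, queries_per_month=2000, tb_scanned_per_query=0.5)
    # Assume partitioning lets each query skip 90% of the data it used to read.
    pruned = replace(baseline, tb_scanned_per_query=0.05)

    print(f"serverless, full scans:   {serverless_monthly_cost(baseline):.2f}")
    print(f"provisioned, 2 nodes:     {provisioned_monthly_cost(baseline, nodes=2):.2f}")
    print(f"serverless, pruned scans: {serverless_monthly_cost(pruned):.2f}")
```

With these made-up numbers, the serverless bill falls from 5200.00 to 700.00 per month once queries scan a tenth of the data, while the two-node provisioned bill stays at 3120.00 regardless of scan volume. The model ignores ingestion, egress, and concurrency, so treat it as a starting point for estimation rather than a quote.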

Category: data

Difficulty: advanced

Related Terms

See Also