Partitioning

Definition

The practice of dividing a database table or dataset into smaller, more manageable pieces based on specific criteria, such as date ranges or geographic regions.

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between partitioning and sharding?
Partitioning splits a table into parts that typically stay within the same database system and are managed as one logical table (e.g., one partition per month). Sharding splits data across multiple database instances or servers, which adds operational complexity but can scale write throughput and storage beyond what a single system can handle.
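As a rough illustration of the distinction, the Python sketch below contrasts the two routing decisions. The table name, the `customer_id` key, the host names, and both helper functions are hypothetical and exist only for this example; real engines make the partition decision internally, and sharding layers vary widely.

```python
import hashlib
from datetime import date

# Hypothetical shard hosts; in a sharded setup each one is a separate database server.
SHARD_HOSTS = [
    "db-0.example.internal",
    "db-1.example.internal",
    "db-2.example.internal",
]


def partition_name(table: str, day: date) -> str:
    """Partitioning: choose a monthly piece of ONE logical table on ONE system."""
    return f"{table}_{day.year:04d}_{day.month:02d}"


def shard_index(customer_id: str, num_shards: int) -> int:
    """Sharding: choose WHICH server stores the row, using a stable hash of the key."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_host(customer_id: str) -> str:
    """Map a key to one of the hypothetical hosts above."""
    return SHARD_HOSTS[shard_index(customer_id, len(SHARD_HOSTS))]
```

In this sketch, `partition_name("orders", date(2024, 3, 15))` names a piece that still lives in the same database, while `shard_host("customer-42")` picks a different machine altogether. The two approaches can also be combined, with each shard holding partitioned tables.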
When should I use partitioning?
Use partitioning when your queries commonly filter on a predictable key (most often time, like date ranges) and the table is large enough that scanning everything is slow or expensive. It’s especially useful for logs, events, orders, IoT telemetry, and any dataset where recent data is queried far more than old data.
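The benefit described above comes from partition pruning: a query that filters on the partition key only has to touch the partitions that can contain matching rows. The following is a minimal in-memory sketch of that idea, assuming monthly partitions keyed on a hypothetical `event_date` field. The class and method names are invented for illustration and do not correspond to any particular engine's API.

```python
from collections import defaultdict
from datetime import date


class MonthlyPartitionedTable:
    """Toy table that stores rows in per-month buckets keyed by (year, month)."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, row):
        d = row["event_date"]
        self.partitions[(d.year, d.month)].append(row)

    def query(self, start, end):
        """Return (rows with start <= event_date <= end, partitions scanned)."""
        lo = (start.year, start.month)
        hi = (end.year, end.month)
        rows = []
        scanned = 0
        for key in sorted(self.partitions):
            if key < lo or key > hi:
                continue  # pruned: this month cannot contain matching rows
            scanned += 1
            rows.extend(
                r for r in self.partitions[key]
                if start <= r["event_date"] <= end
            )
        return rows, scanned


table = MonthlyPartitionedTable()
for month in (1, 2, 3):
    for day in (5, 20):
        table.insert({"event_date": date(2024, month, day), "value": month * day})

# Only the March partition is read; January and February are skipped entirely.
recent_rows, partitions_scanned = table.query(date(2024, 3, 1), date(2024, 3, 31))
```

A query that does not filter on `event_date` would get no such benefit and would have to read every partition, which is why the partition key should match the most common filter.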
How much does partitioning cost?
Partitioning itself usually has no direct line-item cost, but it affects cost through performance and storage. In data warehouses and query engines that charge by data scanned (e.g., BigQuery on-demand, Athena), good partitioning can lower cost by reducing bytes scanned. In databases, partitioning can reduce CPU/IO for queries but may add overhead for writes, maintenance, and index management depending on the engine and partition strategy.
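To make the scan-based pricing effect concrete, here is a small back-of-the-envelope sketch. The partition sizes and the price per TiB are placeholder numbers chosen for the arithmetic, not actual provider rates, and the function names are hypothetical.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def bytes_scanned(partition_sizes, wanted=None):
    """Sum the bytes read: every partition if `wanted` is None, else only those listed."""
    if wanted is None:
        return sum(partition_sizes.values())
    return sum(size for key, size in partition_sizes.items() if key in wanted)


def scan_cost(num_bytes, price_per_tib):
    """Cost under a simple pay-per-bytes-scanned model (the price is an assumption)."""
    return num_bytes / TIB * price_per_tib


# Assumed layout: one year of daily partitions, 10 GiB each.
daily_sizes = {day: 10 * GIB for day in range(365)}
last_week = set(range(358, 365))

PLACEHOLDER_PRICE_PER_TIB = 5.0  # illustrative only; check your provider's pricing

full_scan = scan_cost(bytes_scanned(daily_sizes), PLACEHOLDER_PRICE_PER_TIB)
pruned_scan = scan_cost(bytes_scanned(daily_sizes, last_week), PLACEHOLDER_PRICE_PER_TIB)
```

Under these assumptions the pruned query reads 7 of 365 partitions, so it scans and costs about 2% of the full scan regardless of the actual price. The saving only applies when the query filters on the partition key.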

Category: data

Difficulty: advanced

Related Terms

See Also