Athena

Definition

A serverless, interactive AWS query service for analyzing data in Amazon S3 using standard SQL, with no infrastructure to provision or manage.

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between Amazon Athena and Amazon Redshift?
Athena is a serverless query service that reads data directly from Amazon S3, so you don’t load data into a database first. It’s great for ad-hoc queries and data lake exploration. Amazon Redshift is a managed data warehouse where you typically load and model data for consistently fast performance on repeated BI/reporting workloads and complex transformations.
When should I use Amazon Athena?
Use Athena when your data already lives in Amazon S3 and you want to run SQL queries without managing servers, especially for log analysis, exploratory analytics, one-off investigations, and querying columnar file formats (like Parquet/ORC) or open table formats (like Apache Iceberg) in a data lake. If you need high-concurrency dashboards with predictable performance, consider a data warehouse (for example, Redshift) or caching/optimization strategies.
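As an illustration of the "run SQL without managing servers" workflow, here is a minimal Python sketch that submits a query and polls until it finishes. It assumes a boto3 Athena client is passed in; the helper name run_athena_query, the database weblogs, the table access_logs, the partition column dt, and the bucket path are all hypothetical placeholders, not values from this entry.

```python
import time


def run_athena_query(client, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Submit a query to Athena and wait for it to finish.

    `client` is expected to behave like boto3.client("athena").
    Returns the query ID and the number of bytes scanned.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]
        state = execution["Status"]["State"]

        if state == "SUCCEEDED":
            stats = execution.get("Statistics", {})
            return {
                "query_id": query_id,
                "bytes_scanned": stats.get("DataScannedInBytes", 0),
            }
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get("StateChangeReason", "unknown")
            raise RuntimeError(f"Query {query_id} {state}: {reason}")

        # Still QUEUED or RUNNING: wait before checking again.
        time.sleep(poll_seconds)

    raise TimeoutError(f"Query {query_id} did not finish in time")


# Example usage (requires AWS credentials; all names are hypothetical):
#
#   import boto3
#   athena = boto3.client("athena")
#   result = run_athena_query(
#       athena,
#       "SELECT status, COUNT(*) FROM access_logs "
#       "WHERE dt = '2024-01-01' GROUP BY status",
#       database="weblogs",
#       output_location="s3://example-bucket/athena-results/",
#   )
#   print(result["bytes_scanned"])
```

Passing the client in, rather than creating it inside the function, keeps the sketch testable without AWS access. The returned bytes_scanned value is the figure that drives query cost.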
How much does Amazon Athena cost?
Athena SQL queries are priced primarily by the amount of data each query scans. Optional capabilities such as Provisioned Capacity and Athena for Apache Spark are billed separately, and the underlying S3 storage and requests are charged at S3 rates. Because cost tracks the data each query reads, using columnar formats (Parquet/ORC), compression, and partitioning (for example, by date) can significantly reduce both scanned data and cost.
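To make the scan-based pricing concrete, the following Python sketch estimates the cost of a query from the bytes it scans. The per-TB price and the per-query minimum are assumptions chosen for illustration, not figures from this entry; actual rates vary by Region, so check current AWS pricing before relying on the numbers.

```python
# Assumed values for illustration only; verify against current AWS pricing.
PRICE_PER_TB_USD = 5.00               # assumed per-TB scan price
MIN_BYTES_PER_QUERY = 10 * 1024**2    # assumed 10 MB minimum charge per query
BYTES_PER_TB = 1024**4


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimate the cost in USD of one query, given the bytes it scans."""
    billable_bytes = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable_bytes / BYTES_PER_TB * PRICE_PER_TB_USD


# Scanning a full 1 TB dataset versus only a 10 GB date partition of it.
full_scan_cost = estimate_query_cost(1 * BYTES_PER_TB)
partition_cost = estimate_query_cost(10 * 1024**3)

print(f"Full scan:      ${full_scan_cost:.2f}")
print(f"Partition only: ${partition_cost:.4f}")
```

Under these assumed rates, restricting the query to a single partition cuts the estimated cost by roughly a factor of 100, which is why partitioning and columnar formats matter so much for Athena workloads.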

Category: data

Difficulty: intermediate

Related Terms

See Also