DataBrew

Definition

AWS Glue DataBrew is a visual data preparation service that enables users to clean, normalize, and prepare data for analysis without writing any code.

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between AWS Glue DataBrew and AWS Glue (ETL)?
DataBrew is a visual, no-code tool for exploring, profiling, and cleaning data using point-and-click transformations. AWS Glue (ETL) is a broader service for building and running scalable ETL jobs (typically code-based, using Spark or Python scripts) and managing a data catalog. Use DataBrew for interactive data prep and quick cleaning; use Glue ETL for production pipelines, complex transformations, and large-scale scheduled processing.
When should I use AWS Glue DataBrew?
Use DataBrew when you need to quickly understand a dataset (profiling), clean messy files (duplicates, inconsistent formats, nulls), and create a repeatable set of transformations without writing code. It’s especially useful for analysts and data engineers who want to prototype cleaning steps interactively and then run them as scheduled jobs on data stored in Amazon S3.
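As a rough illustration of the "prototype interactively, then run on a schedule" workflow, the sketch below builds the request parameters for the boto3 DataBrew client calls create_recipe_job and create_schedule. It assumes a dataset and a published recipe already exist. Every name, the role ARN, the bucket, and the cron expression are hypothetical placeholders, and the cron syntax should be checked against the DataBrew documentation before use.

```python
from typing import Any, Dict, List


def recipe_job_request(job_name: str, dataset_name: str, recipe_name: str,
                       recipe_version: str, role_arn: str,
                       bucket: str, prefix: str) -> Dict[str, Any]:
    """Build keyword arguments for the DataBrew create_recipe_job call."""
    return {
        "Name": job_name,
        "DatasetName": dataset_name,
        "RecipeReference": {"Name": recipe_name, "RecipeVersion": recipe_version},
        "RoleArn": role_arn,
        # Write the cleaned output back to S3 as CSV, replacing earlier runs.
        "Outputs": [
            {
                "Location": {"Bucket": bucket, "Key": prefix},
                "Format": "CSV",
                "Overwrite": True,
            }
        ],
    }


def schedule_request(schedule_name: str, job_names: List[str],
                     cron: str) -> Dict[str, Any]:
    """Build keyword arguments for the DataBrew create_schedule call."""
    return {"Name": schedule_name, "JobNames": list(job_names), "CronExpression": cron}


def create_scheduled_job(client: Any, job_kwargs: Dict[str, Any],
                         schedule_kwargs: Dict[str, Any]) -> None:
    """Create the recipe job first, then attach a schedule that runs it."""
    client.create_recipe_job(**job_kwargs)
    client.create_schedule(**schedule_kwargs)


def run_example() -> None:
    """Not called automatically: needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("databrew")
    job = recipe_job_request(
        "clean-orders-job",            # hypothetical job name
        "orders-dataset",              # hypothetical existing dataset
        "clean-orders-recipe", "1.0",  # hypothetical published recipe and version
        "arn:aws:iam::123456789012:role/ExampleDataBrewRole",  # placeholder ARN
        "example-bucket", "cleaned/orders/",                   # placeholder S3 target
    )
    # Assumed cron syntax (daily at 06:00 UTC); verify against the DataBrew docs.
    schedule = schedule_request("daily-clean-orders", [job["Name"]], "cron(0 6 * * ? *)")
    create_scheduled_job(client, job, schedule)
```

Keeping the request builders separate from the client calls makes the parameters easy to inspect or unit test before anything is created in an AWS account.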
How much does AWS Glue DataBrew cost?
Pricing is usage-based: jobs are billed for the compute they consume while running (measured in node-hours), and interactive sessions are billed per session. Your total cost also depends on related AWS resources you use (for example, Amazon S3 storage, AWS Glue Data Catalog, and any downstream services like Athena or Redshift). For exact rates and regional differences, check the AWS Glue DataBrew pricing page and estimate based on job duration, frequency, and dataset size.
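A back-of-the-envelope estimate can be sketched as follows. It assumes jobs are charged per node-hour and interactive sessions per session. The function name and all rates are illustrative placeholders, not actual AWS prices, so substitute the current figures for your region from the pricing page. Related costs such as S3 storage are not included.

```python
def estimate_monthly_cost(job_minutes: float, nodes: int, runs_per_month: int,
                          sessions_per_month: int, node_hour_rate: float,
                          session_rate: float) -> float:
    """Estimate DataBrew spend: job node-hours plus interactive sessions.

    Rates are inputs on purpose; look them up on the pricing page.
    """
    node_hours = nodes * (job_minutes / 60.0) * runs_per_month
    job_cost = node_hours * node_hour_rate
    session_cost = sessions_per_month * session_rate
    return job_cost + session_cost


# Placeholder inputs: a 30-minute job on 5 nodes, run daily (30 runs),
# plus 10 interactive sessions. The two rates are made up for illustration.
example_total = estimate_monthly_cost(
    job_minutes=30, nodes=5, runs_per_month=30,
    sessions_per_month=10, node_hour_rate=0.5, session_rate=1.0,
)
```

With these placeholder inputs the job portion is 75 node-hours, so changes in job duration or run frequency tend to dominate the estimate.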

Category: ai-ml

Difficulty: intermediate

Related Terms

See Also