Streaming Service
Definition
A managed service that ingests and delivers continuous streams of data in near real time, so analytics and applications can act on events shortly after they are produced.
Use Cases
- Netflix: Real-time monitoring and analytics of application events to improve reliability and user experience. Netflix has publicly described using Apache Kafka as a core event streaming platform to collect and distribute high-volume events (such as service logs and metrics) across many internal systems for near-real-time processing and alerting. Outcome: faster detection of operational issues and improved ability to react to incidents using near-real-time event data.
- The New York Times: Streaming content and user interaction events to power analytics and downstream data pipelines. The New York Times has publicly discussed building event-driven data pipelines using Apache Kafka to stream events into its data platform for processing and analytics. Outcome: more timely analytics and a more flexible pipeline for delivering event data to multiple consumers.
- Spotify: Near-real-time event streaming for analytics and operational insights across services. Spotify has publicly shared using Google Cloud Pub/Sub for messaging and event distribution in parts of its platform, enabling services to publish and consume events asynchronously at scale. Outcome: improved decoupling between services and more scalable distribution of event data to multiple systems.
Provider Equivalents
- AWS: Amazon Kinesis Data Streams
- Azure: Azure Event Hubs
- GCP: Google Cloud Pub/Sub
- OCI: OCI Streaming
Frequently Asked Questions
- What's the difference between a streaming service and a message queue?
- A message queue is usually designed for task distribution: each message is processed once and then removed, typically by a single consumer. A streaming service is designed for continuous event streams where multiple consumers can independently read the same events, often with replay (rereading past events that are still within the retention period) and ordering within partitions. Streaming services are commonly used for analytics, monitoring, and event-driven architectures where you want fan-out and the ability to reprocess data.
- When should I use a streaming service?
- Use a streaming service when you have continuous data arriving (clickstream, IoT telemetry, logs, transactions) and you need near-real-time processing, multiple downstream consumers, or the ability to replay events for debugging and reprocessing. If you only need simple background job processing with one consumer per task, a queue may be simpler.
- How much does a streaming service cost?
- Costs typically depend on throughput (ingress/egress), number of partitions/shards (reserved capacity), retention duration, and any enhanced features (e.g., extended retention, cross-region replication). Beyond the streaming service itself, you may also pay for downstream processing (stream processing jobs), storage for long-term retention (data lake/warehouse), and network egress if consumers are in different regions or clouds.
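
The queue-versus-stream distinction described in the first answer can be sketched in a few lines of Python. This is a toy in-memory model, not any provider's API: `TaskQueue`, `EventStream`, and the consumer names are hypothetical, and the stream is assumed to have a single partition with unlimited retention.

```python
from collections import deque


class TaskQueue:
    """Toy message queue: each message is delivered once, then removed."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Receiving removes the message, so no other consumer can see it.
        return self._messages.popleft() if self._messages else None


class EventStream:
    """Toy single-partition stream: an append-only log with per-consumer offsets."""

    def __init__(self):
        self._log = []
        self._offsets = {}  # consumer name -> index of the next event to read

    def publish(self, event):
        self._log.append(event)

    def read(self, consumer):
        # Reading only advances this consumer's offset; events stay in the log.
        offset = self._offsets.get(consumer, 0)
        events = self._log[offset:]
        self._offsets[consumer] = len(self._log)
        return events

    def seek(self, consumer, offset):
        # Replay: rewind a consumer so it rereads retained events.
        self._offsets[consumer] = offset


queue = TaskQueue()
stream = EventStream()
for event in ["click:1", "click:2", "click:3"]:
    queue.send(event)
    stream.publish(event)

# Queue: two workers split the messages between them.
queue_worker_a = queue.receive()
queue_worker_b = queue.receive()

# Stream: two consumers each see every event, in order.
analytics_first = stream.read("analytics")
alerting_first = stream.read("alerting")

# Replay: the analytics consumer rewinds and reprocesses from the start.
stream.seek("analytics", 0)
analytics_replay = stream.read("analytics")
```

In this sketch the two queue workers receive different messages, while both stream consumers receive the full ordered list and the analytics consumer can read it a second time after rewinding. Real services add partitions, retention limits, and durable offset tracking on top of the same idea.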
Category: analytics
Difficulty: advanced