Google's cloud service for converting spoken audio to text with high accuracy. Like having a professional transcriptionist who works instantly and supports many languages.
Video conferencing apps use Speech-to-Text to show real-time captions for accessibility and to generate meeting notes.
All four services convert spoken audio into text using cloud-based automatic speech recognition (ASR). Each supports common audio formats, both streaming (real-time) and batch transcription, timestamps, and a choice of languages. They differ mainly in which languages they support, which specialized models they offer (e.g., phone call vs. video), their customization features, and pricing.
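To make the timestamp and captioning points concrete, here is a minimal sketch of how word-level timestamps could be turned into caption cues. It does not call any provider's API: the Word class, the function names, and the max_chars and max_gap thresholds are all hypothetical, and the input shape (one start and end time per word, in seconds) is an assumption about what an ASR result looks like once parsed. The output uses the common SRT subtitle layout.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with timing (hypothetical shape, seconds)."""
    text: str
    start: float
    end: float


def group_into_captions(words, max_chars=32, max_gap=1.0):
    """Group words into caption cues.

    A new cue starts when adding the next word would exceed max_chars,
    or when the speaker pauses for longer than max_gap seconds.
    Returns a list of (start, end, text) tuples.
    """
    cues = []
    current = []
    for word in words:
        if current:
            line_len = len(" ".join(w.text for w in current)) + 1 + len(word.text)
            gap = word.start - current[-1].end
            if line_len > max_chars or gap > max_gap:
                cues.append(current)
                current = []
        current.append(word)
    if current:
        cues.append(current)
    return [(c[0].start, c[-1].end, " ".join(w.text for w in c)) for c in cues]


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start, end, text) cues as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, 1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = [
        Word("Hello", 0.0, 0.4),
        Word("everyone", 0.5, 1.0),
        Word("welcome", 2.5, 3.0),
    ]
    print(to_srt(group_into_captions(sample)))
```

In a streaming setup the same grouping would run incrementally as interim results arrive; in batch mode it can run once over the full transcript. The thresholds are illustrative and would need tuning for real captions.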