Google's service for converting spoken audio to text with high accuracy. It is like having a professional transcriptionist who works almost instantly and supports many languages.
Video conferencing apps can use Speech-to-Text to provide real-time captions for accessibility and to produce transcripts for meeting notes.
All four services convert spoken audio into text using automatic speech recognition (ASR). They commonly support batch transcription (of recorded files) and streaming transcription (in real time), along with language selection, timestamps, and confidence scores. They differ mainly in supported languages, domain tuning and customization options, streaming features, and pricing.
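To make the shared output concrete, here is a minimal sketch of what a transcription result with timestamps and confidence scores might look like once it has been normalized into plain Python objects. It does not call any real service: the names `Word`, `Transcript`, `low_confidence_words`, and `format_caption`, the 0.8 threshold, and the sample data are all assumptions made for illustration, and each provider's actual response format differs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One recognized word (hypothetical, provider-neutral shape)."""
    text: str
    start_s: float      # start time in seconds from the beginning of the audio
    end_s: float        # end time in seconds
    confidence: float   # 0.0 (no confidence) to 1.0 (full confidence)


@dataclass
class Transcript:
    """A transcription result for one utterance or file."""
    language: str       # language tag chosen for recognition, e.g. "en-US"
    words: List[Word]

    def text(self) -> str:
        """Join the recognized words into a plain-text transcript."""
        return " ".join(w.text for w in self.words)

    def mean_confidence(self) -> float:
        """Average word confidence, or 0.0 for an empty transcript."""
        if not self.words:
            return 0.0
        return sum(w.confidence for w in self.words) / len(self.words)


def low_confidence_words(transcript: Transcript, threshold: float = 0.8) -> List[Word]:
    """Return words scoring below the threshold, e.g. to flag for human review."""
    return [w for w in transcript.words if w.confidence < threshold]


def format_caption(transcript: Transcript) -> str:
    """Render a caption line prefixed with the time span it covers."""
    if not transcript.words:
        return ""
    start = transcript.words[0].start_s
    end = transcript.words[-1].end_s
    return f"[{start:.2f}s - {end:.2f}s] {transcript.text()}"


# Hand-written sample data standing in for a service response (not real output).
sample = Transcript(
    language="en-US",
    words=[
        Word("hello", 0.00, 0.40, 0.98),
        Word("everyone", 0.40, 0.95, 0.91),
        Word("welcome", 1.10, 1.60, 0.62),
    ],
)

if __name__ == "__main__":
    print(format_caption(sample))
    for w in low_confidence_words(sample):
        print(f"review: {w.text!r} (confidence {w.confidence:.2f})")
```

Keeping timestamps and confidence per word, rather than per file, is one reasonable design choice: it lets the same structure serve both caption rendering and review of uncertain words, whether the words arrive from a batch job or from a stream.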