OCI Speech
Definition
Oracle Cloud Infrastructure's managed AI service for converting speech to text (transcription) and text to speech (synthesis). It works like a two-way converter between spoken words and written text.
Use Cases
- Language learning (e.g., Duolingo): Speaking exercises and pronunciation practice. Speech recognition evaluates learners' spoken responses and provides feedback within lessons; text-to-speech generates spoken prompts and examples across many languages. This enables interactive speaking practice at scale and makes lessons more accessible to users who prefer listening over reading.
- Video meetings (e.g., Zoom): Captions and transcripts for accessibility and search. Speech-to-text runs on live meeting audio to produce captions, plus post-meeting transcripts that can be searched and reviewed. This improves accessibility for participants who are deaf or hard of hearing and makes meeting content more useful as searchable text.
- Navigation (e.g., Google Maps): Hands-free voice interaction. Speech-to-text interprets spoken destination queries and commands; text-to-speech reads turn-by-turn directions aloud. This supports safer, hands-free navigation and improves usability for drivers and for users with limited mobility or vision.
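To make the captioning use case concrete, here is a minimal sketch of turning timed transcript segments into SRT caption text. The `Segment` class and its field names are hypothetical stand-ins for whatever timed output a speech-to-text service returns; they are not OCI Speech types, and the actual response format should be checked in the service documentation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed piece of transcript (hypothetical shape, not an OCI type)."""
    start_s: float  # start offset in seconds
    end_s: float    # end offset in seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT caption blocks separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start_s)} --> {srt_timestamp(seg.end_s)}\n"
            f"{seg.text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.4, "Welcome, everyone."),
        Segment(2.4, 5.0, "Let's get started."),
    ]
    print(to_srt(demo))
```

Keeping the formatting step separate from the transcription call means the same function can serve both live captions and post-meeting transcripts, whichever service produces the segments.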
Provider Equivalents
- AWS: Amazon Transcribe (speech to text) and Amazon Polly (text to speech)
- Azure: Azure AI Speech (Speech to Text and Text to Speech)
- GCP: Cloud Speech-to-Text and Cloud Text-to-Speech
- OCI: OCI Speech
Frequently Asked Questions
- What's the difference between OCI Speech and OCI Language?
- OCI Speech converts between audio and text: speech to text turns audio into text, and text to speech turns text into audio. OCI Language analyzes text you already have, for example by detecting sentiment, key phrases, or language. If your input starts as audio, OCI Language typically runs after speech-to-text.
- When should I use OCI Speech?
- Use OCI Speech when you need to transcribe calls or meetings, add live captions, build voice-controlled apps, create voice bots, or generate spoken audio from text for accessibility (screen-reader-like experiences), IVR prompts, or narrated content.
- How much does OCI Speech cost?
- Pricing is usage-based. It depends on how many minutes of audio you transcribe (speech to text) and how much text you convert to audio (text to speech), and it can also vary by features such as real-time vs batch processing and the selected voice or language. For exact rates, check the OCI Speech pricing page and estimate from your expected audio minutes and character counts.
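The estimate described in the pricing answer is simple arithmetic, sketched below. The function name, the billing units (per minute of audio, per million characters), and the rates in the example are all placeholder assumptions for illustration; they are not Oracle's published prices, so substitute the figures from the pricing page.

```python
from decimal import Decimal


def estimate_monthly_cost(
    audio_minutes: Decimal,
    tts_characters: int,
    stt_rate_per_minute: Decimal,
    tts_rate_per_million_chars: Decimal,
) -> Decimal:
    """Estimate a monthly bill from expected usage and caller-supplied rates.

    All rates are inputs: nothing here encodes real OCI pricing.
    """
    # Speech to text: billed on the amount of audio transcribed.
    stt_cost = audio_minutes * stt_rate_per_minute
    # Text to speech: billed on the amount of text converted.
    tts_cost = Decimal(tts_characters) / Decimal(1_000_000) * tts_rate_per_million_chars
    # Round to cents for display.
    return (stt_cost + tts_cost).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Placeholder rates only; replace with values from the pricing page.
    total = estimate_monthly_cost(
        audio_minutes=Decimal("6000"),
        tts_characters=2_000_000,
        stt_rate_per_minute=Decimal("0.01"),
        tts_rate_per_million_chars=Decimal("10.00"),
    )
    print(total)
```

`Decimal` is used instead of floats so that small per-unit rates multiply out without binary rounding noise in the final figure.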
Category: ai-ml
Difficulty: basic
Related Terms
See Also