Embeddings
Definition
Embeddings are numerical vector representations of data (text, images, audio) in a high-dimensional space, arranged so that semantically similar items lie close together.
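To make "close together" concrete, here is a minimal sketch that scores closeness with cosine similarity, one commonly used measure. The three-dimensional vectors are made up for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, invented for this example (not model output).
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.2, 0.9],
}

cat_kitten = cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS["kitten"])
cat_truck = cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS["truck"])

# Related items score much higher than unrelated ones.
print(f"cat vs kitten: {cat_kitten:.3f}")
print(f"cat vs truck:  {cat_truck:.3f}")
```

Other distance measures (Euclidean distance, dot product) are also used; which one fits best depends on how the embedding model was trained.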
Use Cases
- Duolingo: Personalized language-learning features that need semantic understanding of learner responses and content. Duolingo has publicly discussed using OpenAI models for parts of its product (e.g., Duolingo Max). A common implementation pattern in such systems is to embed sentences or answers and compare them to reference meanings, which supports semantic matching, feedback, and retrieval of relevant explanations. Outcome: more relevant feedback and content retrieval than keyword-based matching, enabling more natural interactions and better personalization.
- Instacart: Grocery search that understands intent beyond exact keywords (e.g., matching 'soda' with specific brands or 'gluten-free pasta' with relevant products). Instacart has described using machine learning to improve search and discovery. A typical embeddings-based approach generates embeddings for queries and for product catalog text and images, then uses vector similarity to retrieve and rank semantically related items. Outcome: higher search relevance and conversion, because ambiguous or long-tail queries return better matches and fewer “no results” pages.
- Pinterest: Visual discovery and recommendations (finding visually or semantically similar Pins). Pinterest has published engineering work on representation learning for recommendations and retrieval. A common embeddings approach computes embeddings for images and their associated text, then uses nearest-neighbor search to recommend similar content. Outcome: more relevant recommendations and better content discovery, because items are matched on learned similarity rather than only on metadata or keywords.
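The pattern shared by the use cases above (embed the query, embed the items, return the nearest items) can be sketched as a brute-force top-k search. This is an illustration only, not any of these companies' actual implementations: the catalog, the vectors, and the `top_k` helper are all hypothetical, and production systems usually replace the linear scan with an approximate nearest-neighbor index.

```python
import heapq
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def top_k(query_vec, catalog, k=2):
    """Return the k (name, score) pairs most similar to query_vec, best first."""
    q = normalize(query_vec)
    scored = []
    for name, vec in catalog.items():
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, name))
    return [(name, score) for score, name in heapq.nlargest(k, scored)]


# Hypothetical product embeddings, invented for this example (not model output).
CATALOG = {
    "cola 12-pack": [0.9, 0.1, 0.1],
    "lemon sparkling water": [0.7, 0.3, 0.1],
    "gluten-free penne": [0.1, 0.9, 0.2],
    "dish soap": [0.1, 0.1, 0.9],
}
# Hypothetical embedding for the query "soda".
SODA_QUERY = [0.85, 0.15, 0.05]

results = top_k(SODA_QUERY, CATALOG, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The query shares no keyword with either beverage, yet both rank above the pasta and the soap because their vectors point in a similar direction.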
Provider Equivalents
- AWS: Amazon Bedrock (Titan Embeddings) and Amazon SageMaker (JumpStart embedding models)
- Azure: Azure OpenAI Service (text-embedding models) and Azure AI Search (vector search integration)
- GCP: Vertex AI Embeddings (text embeddings) and Vertex AI Vector Search
- OCI: OCI Generative AI (Embeddings) and OCI Search with OpenSearch (k-NN/vector search)
Frequently Asked Questions
- What’s the difference between embeddings and vector databases?
  - Embeddings are the vectors (numbers) produced by a model to represent meaning. A vector database (or vector index) is the system that stores those vectors and lets you quickly search for “nearest” vectors (most similar items). You often use both: generate embeddings with a model, then store/search them in a vector database.
- When should I use embeddings?
  - Use embeddings when you need meaning-based matching instead of exact keyword matching. Common cases include semantic search, retrieval-augmented generation (RAG) for chatbots over your documents, recommendations (“items like this”), deduplication/near-duplicate detection, clustering topics, and similarity-based classification.
- How much do embeddings cost?
  - Costs usually come from (1) generating embeddings (priced by input size, often per token for text or per image) and (2) storing/querying vectors (database/index compute, storage, and read/write operations). Total cost depends on how many items you embed, how often you re-embed (e.g., when content changes), vector dimension size, and query volume/latency requirements.
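The cost drivers in the last answer can be turned into a rough back-of-the-envelope estimate. The sketch below is a simplification under stated assumptions: the per-token price, item counts, re-embedding rate, and dimension are hypothetical placeholders rather than any provider's actual pricing, and query and index compute costs are left out because they vary widely by service.

```python
def embedding_generation_cost(num_items, avg_tokens_per_item, price_per_1k_tokens,
                              reembed_fraction_per_month=0.0, months=1):
    """Cost of the initial embedding pass plus re-embedding changed content each month."""
    initial_tokens = num_items * avg_tokens_per_item
    reembed_tokens = initial_tokens * reembed_fraction_per_month * months
    return (initial_tokens + reembed_tokens) / 1000 * price_per_1k_tokens


def raw_vector_storage_bytes(num_items, dimensions, bytes_per_value=4):
    """Size of the vectors alone (float32 by default); index overhead is not included."""
    return num_items * dimensions * bytes_per_value


# All inputs below are hypothetical placeholders; substitute your provider's numbers.
generation_cost = embedding_generation_cost(
    num_items=1_000_000,
    avg_tokens_per_item=200,
    price_per_1k_tokens=0.0001,       # hypothetical price, not a real quote
    reembed_fraction_per_month=0.10,  # assume 10% of content changes monthly
    months=12,
)
storage_gb = raw_vector_storage_bytes(num_items=1_000_000, dimensions=1024) / 1e9

print(f"generation cost over 12 months: {generation_cost:.2f} (currency units)")
print(f"raw vector storage: {storage_gb:.3f} GB")
```

Doubling the vector dimension doubles raw storage, which is one reason lower-dimensional models can be attractive when the catalog is large.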
Category: ai-ml
Difficulty: advanced
Related Terms
See Also