Modern applications increasingly rely on intelligent search capabilities. Users expect results that are both precise (matching keywords) and smart (understanding meaning). MongoDB brings these worlds together with hybrid search, combining semantic vector search and traditional full-text search into one unified experience.
What you’ll learn
In this article, you’ll learn:
- What vector search is and how it works in MongoDB.
- What full-text (BM25) search is and what problems it solves.
- Why combining both techniques creates more relevant search results.
- How embeddings are stored and indexed directly in MongoDB.
- How to build hybrid pipelines that blend semantics and metadata (e.g., IMDb ratings).
- How ranking strategies such as Reciprocal Rank Fusion (RRF) merge results effectively.
What is vector search?
Vector search uses embeddings — high-dimensional numerical representations created by machine learning models — to find items that are semantically similar.
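Instead of matching exact words, vector search compares meaning: texts with similar meaning produce embeddings that lie close together, so similarity can be measured as a distance between vectors.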
This allows your application to retrieve documents that are conceptually related, even when they don’t share any keywords. For example, a vector query based on “dream heist” may return Inception even if the phrase never appears in the plot description.
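To make the idea concrete, here is a minimal brute-force sketch of nearest-neighbor retrieval by cosine similarity. The three-dimensional embeddings are made up for illustration (real models produce hundreds or thousands of dimensions), and the query vector merely stands in for an embedding of “dream heist”. This is not how MongoDB computes results internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_vector, documents, k=2):
    """Brute-force k-NN: score every document, keep the k most similar."""
    scored = [
        (cosine_similarity(query_vector, doc["embedding"]), doc["title"])
        for doc in documents
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]


# Toy embeddings, invented for this example.
MOVIES = [
    {"title": "Inception", "embedding": [0.9, 0.8, 0.1]},
    {"title": "Ocean's Eleven", "embedding": [0.1, 0.9, 0.2]},
    {"title": "Toy Story", "embedding": [0.1, 0.1, 0.9]},
]

# Hypothetical embedding of the query "dream heist".
QUERY = [0.8, 0.9, 0.0]

print(nearest_neighbors(QUERY, MOVIES, k=2))
```

No keyword is compared anywhere in this code; the ranking depends only on how close the vectors are.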
Vector search in MongoDB is powered by k-nearest neighbors (k-NN) search, approximate or exact, over vectors stored in BSON Binary (Float32) and indexed using a vector search index (the older knnVector index field type is deprecated).
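A query against a vector index is usually expressed as an aggregation pipeline. The sketch below builds one with the `$vectorSearch` stage as plain Python dictionaries. The index name `plot_vector_index`, the field `plot_embedding`, and the helper function itself are hypothetical, so adapt them to your own collection.

```python
def build_vector_search_pipeline(query_vector, limit=5, num_candidates=100):
    """Return an aggregation pipeline (a list of stages) for a vector query.

    The index and field names below are assumptions made for this example.
    """
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": "plot_vector_index",     # hypothetical index name
                "path": "plot_embedding",         # hypothetical vector field
                "queryVector": query_vector,      # embedding of the user query
                "numCandidates": num_candidates,  # candidates considered by the approximate search
                "limit": limit,                   # documents returned
            }
        },
        {
            # Keep the title and expose the similarity score of each hit.
            "$project": {
                "_id": 0,
                "title": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.12, -0.07, 0.33], limit=3)
# With PyMongo, the pipeline would then be run as:
#   results = collection.aggregate(pipeline)
```

Building the pipeline as data keeps it easy to inspect and to extend with further stages, for example a filter on metadata.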
What is full-text search?
Full-text search in MongoDB (via MongoDB Search and Lucene) uses BM25, a highly effective ranking algorithm for keyword-based retrieval.
It excels when the query is literal: it matches exact terms and configured synonyms, and it ranks documents by textual relevance.
Full-text search handles:
- keywords
- phrases
- stemming
- language analyzers
- scoring based on term frequency and importance
For literal queries like “computer hacker” or “time travel movie,” keyword matching is usually the most accurate approach.
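The scoring idea behind BM25 can be sketched in a few lines. This is a textbook formulation with the common defaults `k1=1.2` and `b=0.75` and a naive whitespace tokenizer; it is an assumption-laden illustration, not the exact computation MongoDB Search or Lucene performs, and it does no stemming or language analysis. The plot strings are invented.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = term_counts[term]
            if tf == 0:
                continue
            # Rare terms weigh more than common ones.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


PLOTS = [
    "a computer hacker learns the truth about reality",
    "a teenager travels back in time in a sports car",
    "a hacker breaks into a computer network to steal secrets",
]

scores = bm25_scores("computer hacker", PLOTS)
```

Here the second plot scores zero because it contains neither query term, and the first plot outranks the third because it matches the same terms in a shorter text.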
Why use hybrid search?
When you combine vector search (semantic meaning) with full-text search (keyword precision), you get the best of both worlds.
- Precision + context: Match exact terms and understand the deeper meaning.
- Better ranking: Rerank by metadata: rating, genre, year, popularity.
- All-in-one system: Run BM25 and k-NN directly inside MongoDB.
- Production-ready pipelines: Use aggregation to filter, fuse, and customize scoring.
- Ideal for AI: Recommendation engines, semantic search, assistants, and LLM retrieval.
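One common way to fuse the two result lists is Reciprocal Rank Fusion (RRF): each document receives 1 / (k + rank) from every list it appears in, and the contributions are summed. The sketch below is a plain-Python illustration with made-up result lists and the commonly used constant `k = 60`. It shows the fusion step only and is not MongoDB's own implementation.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    rankings: list of lists, each ordered from best to worst.
    Returns (doc_id, fused_score) pairs, best first.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results for the same query from each retriever.
vector_hits = ["Inception", "Paprika", "The Matrix"]
text_hits = ["The Matrix", "Inception", "Hackers"]

fused = reciprocal_rank_fusion([vector_hits, text_hits])
print([title for title, _ in fused])
```

Documents that appear in both lists rise to the top, and because RRF uses only ranks, the BM25 and vector scores never need to be put on a common scale.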
Read it here: Keywords Meet Vectors: Hybrid Search on MongoDB