This tutorial demonstrates a hybrid search that combines full-text and semantic search results for the same query. Full-text search is effective at finding exact matches for query terms, while semantic search also identifies semantically similar documents even if they don't contain the exact query term. As a result, the combined results include synonymous and contextually similar matches in addition to exact matches.
Conversely, if your dataset contains proper nouns or specific keywords that an embedding model is unlikely to have seen during training in the same context in which your dataset uses them, your vector search might benefit from being combined with a full-text search.
You can also set weights for each search method per query. If full-text or semantic search results are more relevant for a given query, you can increase the weight for that search method in that query.
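To illustrate how per-method weights can shift the combined ranking, the following is a minimal Python sketch of weighted reciprocal rank fusion. It assumes each document scores `weight / (60 + rank)` per search method and that these contributions are summed; the constant 60, the function name `weighted_rank_fusion`, and the sample document IDs are assumptions for illustration, not values taken from this tutorial.

```python
from collections import defaultdict

# Assumption: a rank constant of 60, a common choice for reciprocal rank fusion.
RRF_CONSTANT = 60


def weighted_rank_fusion(ranked_lists, weights, k=RRF_CONSTANT):
    """Combine ranked ID lists into one list of (doc_id, score), best first.

    ranked_lists maps a search-method name to document IDs in rank order.
    weights maps the same names to a weight; missing names default to 1.0.
    """
    scores = defaultdict(float)
    for name, doc_ids in ranked_lists.items():
        weight = weights.get(name, 1.0)
        for rank, doc_id in enumerate(doc_ids, start=1):
            # Each method contributes weight / (k + rank) for the document.
            scores[doc_id] += weight / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings returned by the two search methods.
ranked = {
    "vector": ["A", "B", "C"],
    "full_text": ["C", "A", "D"],
}

equal = weighted_rank_fusion(ranked, {"vector": 1.0, "full_text": 1.0})
text_heavy = weighted_rank_fusion(ranked, {"vector": 1.0, "full_text": 3.0})

print([doc for doc, _ in equal])       # ['A', 'C', 'B', 'D']
print([doc for doc, _ in text_heavy])  # ['C', 'A', 'D', 'B']
```

With equal weights, document A ranks first because it places well in both lists. Tripling the full-text weight promotes C and D, which the full-text ranking favors.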
You can reorder the documents in the results based on the relevance to the query by using the $rerank stage after the $rankFusion or $scoreFusion stage. The $rerank stage reorders the documents based on a Voyage AI reranker model.
About the Tutorial
This tutorial demonstrates how to run a hybrid search that combines MongoDB Vector Search and MongoDB Search queries on the sample_mflix.embedded_movies collection, which contains details about movies, and returns unified search results. Specifically, this tutorial takes you through the following steps:
- Create a MongoDB Vector Search index on the plot_embedding_voyage_4_large field. This field contains vector embeddings that represent the summary of a movie's plot.
- Create a MongoDB Search index on the fullplot field in the sample_mflix.embedded_movies collection. This field contains the movie's full plot as a text string.
- Run a query that uses $rankFusion or $scoreFusion to combine the results from a $vectorSearch query against the plot_embedding_voyage_4_large field and a $search query against the fullplot field, and then reorder the documents in the results based on their relevance to the query by using the $rerank stage.
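The query step above can be sketched in Python as a function that builds the aggregation pipeline as plain dictionaries. This is a rough sketch under stated assumptions: the index names `vector_index` and `search_index`, the pipeline names `vectorPipeline` and `fullTextPipeline`, the `numCandidates` multiplier, and the helper name `build_hybrid_pipeline` are hypothetical, and the query vector must come from the same embedding model that produced the stored embeddings. Only $rankFusion is shown; the $scoreFusion and $rerank stages are left out because this section does not describe their options.

```python
def build_hybrid_pipeline(query_text, query_vector,
                          vector_weight=0.5, text_weight=0.5, limit=10):
    """Return an aggregation pipeline that fuses vector and full-text results."""
    vector_stage = {
        "$vectorSearch": {
            "index": "vector_index",  # hypothetical index name
            "path": "plot_embedding_voyage_4_large",
            "queryVector": query_vector,
            # Assumption: consider 10x more candidates than results returned.
            "numCandidates": limit * 10,
            "limit": limit,
        }
    }
    text_stage = {
        "$search": {
            "index": "search_index",  # hypothetical index name
            "text": {"query": query_text, "path": "fullplot"},
        }
    }
    rank_fusion_stage = {
        "$rankFusion": {
            "input": {
                "pipelines": {
                    "vectorPipeline": [vector_stage],
                    "fullTextPipeline": [text_stage, {"$limit": limit}],
                }
            },
            # Per-query weights, keyed by the pipeline names defined above.
            "combination": {
                "weights": {
                    "vectorPipeline": vector_weight,
                    "fullTextPipeline": text_weight,
                }
            },
        }
    }
    return [rank_fusion_stage, {"$limit": limit}]


# Placeholder vector for illustration only; a real query needs an embedding
# with the same dimensions as the indexed field.
example_pipeline = build_hybrid_pipeline(
    "star wars", [0.1, 0.2, 0.3], vector_weight=0.3, text_weight=0.7
)
print(example_pipeline[0]["$rankFusion"]["combination"])
```

With a driver such as PyMongo, the returned list could be passed to the `aggregate()` method of the sample_mflix.embedded_movies collection once both indexes exist.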
Prerequisites
Before you begin, complete the prerequisites.