Optimizing for Relevance Using MongoDB Atlas and LlamaIndex

Apoorva Joshi • 13 min read • Published Oct 02, 2024 • Updated Oct 02, 2024
AI, Python, Atlas
In the past year or so, there has been an uptick in natural language experiences in user-facing applications, ranging from retrieval-augmented generation (RAG) chatbots and agents to natural language search. With natural language becoming the new query language, keyword-based full-text search might not always retrieve the best results due to the length, paraphrasing, ambiguity, and vagueness of natural language queries.
Semantic search has become a popular choice for such applications since it retrieves information based on conceptual similarity rather than keyword matches.
But what if we could optimize for both conceptual similarity and important keywords?
In this tutorial, we will explore different strategies to optimize for relevance in RAG and natural language search applications, focusing on hybrid search.
Specifically, we will cover the following:
  • What is hybrid search?
  • Use cases for hybrid search
  • Tuning hybrid search for better results
  • Combining metadata filters with search

What is hybrid search? When should you use it?

Hybrid search is a technique that combines the results from multiple search methodologies to improve the relevance and accuracy of search results. Which search methodologies to combine depends on the use case—in RAG and natural language search applications, it is common to combine semantic search with full-text search to retrieve a combination of contextually accurate and highly relevant documents.
Some examples of applications where hybrid search is useful are as follows:
  • Product search: A search for “red running shoes,” for example, should return results that match the exact keywords in the query but can also contain related products such as socks and training vests that match the query's intent.
  • Customer support chatbots: Customers might mention precise error codes or function names in their queries, but the chatbot will need to retrieve documentation about the exact error code and also other related issues to fully understand the problem and provide a solution.
  • Content search and recommendation: In content recommendation systems for news, movies, etc., hybrid search can help balance precision and diversity, helping users not only find relevant content but also discover new, diverse content they may not have explicitly searched for. It can also help in situations where the user only has certain keywords or themes in mind but does not exactly know what they are looking for.
In this tutorial, we will focus on a movie recommendation system using MongoDB and LlamaIndex. We will see how to perform full-text, vector, and hybrid searches using MongoDB’s LlamaIndex integration and explore ways to tune them to optimize for relevance.
The Jupyter Notebook for this tutorial can be found on GitHub.

Step 1: Install required libraries

We will require the following libraries for this tutorial:
  • pymongo: Python package to interact with MongoDB databases and collections
  • llama-index: Python package for the LlamaIndex LLM framework
  • llama-index-llms-openai: Python package to use OpenAI models via their LlamaIndex integration
  • llama-index-vector-stores-mongodb: Python package for MongoDB’s LlamaIndex integration
!pip install -qU pymongo llama-index llama-index-llms-openai llama-index-vector-stores-mongodb

Step 2: Set up prerequisites

In this tutorial, we will use OpenAI’s text-embedding-3-small model to generate embeddings for vector search. To use their models, you need to obtain an API key and set it as an environment variable:
import getpass, os
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
We will use MongoDB as a vector store and perform full-text, vector, and hybrid searches against it. To follow along, you will need a MongoDB Atlas account with a database cluster, and the connection string for that cluster.
Once you have the connection string, set it in your code and instantiate the MongoDB client to connect to your database.
from pymongo import MongoClient

MONGODB_URI = getpass.getpass("Enter your MongoDB URI: ")
mongodb_client = MongoClient(
    MONGODB_URI, appname="devrel.content.retrieval_strategies_llamaindex"
)
Don’t forget to add the IP of your host machine to the IP access list for your cluster.

Step 3: Load and process the dataset

In this tutorial, we will use MongoDB’s embedded_movies dataset, which is available on HuggingFace. The dataset contains a list of movies, their plots, and other useful metadata about the movies, such as genres, ratings, cast information, etc. Let’s download the dataset and do some basic pre-processing.
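import pandas as pd
from datasets import load_dataset

# Load the dataset
data = load_dataset("MongoDB/embedded_movies", split="train")
data = pd.DataFrame(data)

# Replace missing (None/NaN) values with empty lists/dicts rather than strings
defaults = {"genres": list, "languages": list, "cast": list, "imdb": dict}
for col, make_default in defaults.items():
    data[col] = data[col].apply(lambda v: make_default() if v is None or isinstance(v, float) else v)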
The above code loads the dataset from HuggingFace as a Pandas dataframe and fills None or NaN values in the resulting dataframe with appropriate defaults.
Next, let’s convert the dataframe into a list of LlamaIndex documents. Documents are a core abstraction in LlamaIndex and are essentially a class for storing a piece of text and associated metadata.
from llama_index.core import Document

documents = []
for _, row in data.iterrows():
    # Extract required fields
    title = row["title"]
    rating = row["imdb"].get("rating", 0)
    languages = row["languages"]
    cast = row["cast"]
    genres = row["genres"]
    # Create the metadata attribute
    metadata = {"title": title, "rating": rating, "languages": languages}
    # Create the text attribute
    text = f"Title: {title}\nPlot: {row['fullplot']}\nCast: {', '.join(cast)}\nGenres: {', '.join(genres)}\nLanguages: {', '.join(languages)}\nRating: {rating}"
    documents.append(Document(text=text, metadata=metadata))
The above code iterates through the rows of the dataframe and:
  • Extracts a subset of metadata columns from each row. The dataset has a lot of metadata for each movie, but we don’t need all of it.
  • Converts each row into a LlamaIndex document. The text attribute of the document is a concatenated string consisting of information that is suitable for both vector search (e.g., title, movie plot) as well as full-text search (e.g., cast, genre).
An example of a LlamaIndex document is as follows:
Document(id_='df580995-076f-444e-ae21-9ba596a6a224', embedding=None, metadata={'title': 'The Perils of Pauline', 'rating': 7.6, 'languages': ['English']}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Title: The Perils of Pauline\nPlot: Young Pauline is left a lot of money...\nCast: Pearl White, Crane Wilbur, Paul Panzer, Edward Josè\nGenres: Action\nLanguages: English\nRating: 7.6', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n')
Tags are a common way to characterize content in recommendation systems. If you have these in your dataset, you would include them in the text attribute to help with full-text search as well as the metadata attribute to potentially filter on these tags based on user preferences.
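As a minimal sketch of that idea, assume each movie record carried a hypothetical tags list (the embedded_movies dataset used in this tutorial has no such field). The tags could then be written into both attributes as shown below. The helper name and the sample record are illustrative only.

```python
def build_text_and_metadata(movie: dict) -> tuple[str, dict]:
    """Return (text, metadata) for one movie, including its tags in both."""
    # "tags" is a hypothetical field, not part of the embedded_movies dataset
    tags = movie.get("tags") or []
    # Tags in the text make them searchable with full-text search
    text = f"Title: {movie['title']}\nPlot: {movie['plot']}\nTags: {', '.join(tags)}"
    # Tags in the metadata make them available for filtering
    metadata = {"title": movie["title"], "tags": tags}
    return text, metadata


text, metadata = build_text_and_metadata(
    {
        "title": "Example Movie",
        "plot": "A made-up plot.",
        "tags": ["robots", "dystopia"],
    }
)
print(text)
print(metadata)
```

The resulting text and metadata would then be passed to Document(text=text, metadata=metadata), as in the loop above.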

Step 4: Create MongoDB Atlas vector store

To perform vector and hybrid search on our data, we first need to embed and ingest it into a MongoDB collection to create a vector store.
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch
from llama_index.core.settings import Settings
from llama_index.core import VectorStoreIndex, StorageContext

Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

VS_INDEX_NAME = "vector_index"
FTS_INDEX_NAME = "fts_index"
DB_NAME = "llamaindex"
COLLECTION_NAME = "hybrid_search"
collection = mongodb_client[DB_NAME][COLLECTION_NAME]

vector_store = MongoDBAtlasVectorSearch(
    mongodb_client,
    db_name=DB_NAME,
    collection_name=COLLECTION_NAME,
    vector_index_name=VS_INDEX_NAME,
    fulltext_index_name=FTS_INDEX_NAME,
    embedding_key="embedding",
    text_key="text",
)
# If the collection has documents with embeddings already, create the vector store index from the vector store
if collection.count_documents({}) > 0:
    vector_store_index = VectorStoreIndex.from_vector_store(vector_store)
# If the collection does not have documents, embed and ingest them into the vector store
else:
    vector_store_context = StorageContext.from_defaults(vector_store=vector_store)
    vector_store_index = VectorStoreIndex.from_documents(
        documents, storage_context=vector_store_context, show_progress=True
    )
The above code:
  • Specifies the embedding model to use by setting the embed_model attribute of the Settings object. We will use the text-embedding-3-small model from OpenAI. In LlamaIndex, Settings is an object used to specify various global configuration settings, such as the LLM, embedding model, text splitter, etc., for your application.
  • Specifies the name of the MongoDB database (DB_NAME) and collection (COLLECTION_NAME) to ingest data into, and the names of the vector (VS_INDEX_NAME) and full-text search (FTS_INDEX_NAME) indexes to use.
  • Initializes a MongoDB Atlas vector store using the MongoDBAtlasVectorSearch class from MongoDB’s LlamaIndex integration. The vector store object specifies the database (db_name) and collection (collection_name) to use, the names of the vector search (vector_index_name) and full-text search (fulltext_index_name) indexes, and the paths to the embedding (embedding_key) and text (text_key) fields used for vector search and full-text search, respectively.
  • Creates a vector store index using the from_vector_store method if the MongoDB Atlas vector store already has documents with embeddings. If not, we create a storage context that manages where the documents and embeddings are stored—in this case, the MongoDB Atlas vector store. We then create the vector store index using the from_documents method and the MongoDB Atlas vector store as the storage context. Under the hood, the index chunks, embeds, and ingests the documents we created into MongoDB Atlas.

Step 5: Create MongoDB Atlas Search indexes

To perform full-text and vector searches on your data in MongoDB Atlas, you first need to create full-text and vector search indexes, respectively.
NOTE: MongoDB indexes are not to be confused with indexes in LlamaIndex. Indexes in MongoDB are required to efficiently retrieve data from MongoDB. Indexes in LlamaIndex are required if you want to query your data using LlamaIndex. So you need both when using MongoDB as a vector store with LlamaIndex.
from pymongo.operations import SearchIndexModel

vs_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1536,
                "similarity": "cosine",
            },
            # Filter paths must match the metadata keys stored on each document
            {"type": "filter", "path": "metadata.rating"},
            {"type": "filter", "path": "metadata.languages"},
        ]
    },
    name=VS_INDEX_NAME,
    type="vectorSearch",
)
The above code creates a model for the vector search index. The model specifies:
  • The index definition, which, in turn, specifies the path to the embedding field (embedding), the number of embedding dimensions (numDimensions), and the similarity metric (similarity) to use to measure embedding distance. We can also specify fields to use as pre-filters during vector search, in this case, movie rating and languages. Pre-filtering is a way to narrow the scope of the vector search based on certain criteria and can help improve the speed and quality of the search.
  • The index name (name).
  • The index type, in this case vectorSearch.
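Only fields declared with type filter in the index definition can be used as pre-filters, so the paths in the index need to match the metadata keys used at query time. A small check like the following could catch a mismatch before the index is created. Note that unindexed_filter_paths is a hypothetical helper, and the dictionary below is a stand-in for the definition passed to SearchIndexModel.

```python
def unindexed_filter_paths(definition: dict, query_filter_paths: list[str]) -> list[str]:
    """Return the query filter paths that the index definition does not cover."""
    indexed = {
        field["path"]
        for field in definition.get("fields", [])
        if field.get("type") == "filter"
    }
    return [path for path in query_filter_paths if path not in indexed]


# Stand-in for the vector search index definition
definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "numDimensions": 1536,
            "similarity": "cosine",
        },
        {"type": "filter", "path": "metadata.rating"},
        {"type": "filter", "path": "metadata.languages"},
    ]
}

# Paths we intend to filter on at query time
missing = unindexed_filter_paths(definition, ["metadata.rating", "metadata.languages"])
print(missing)  # → []
```

A non-empty result, for example from a singular/plural typo in a path, points at a filter that the index would not be able to serve.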
Next, let’s create a model for the full-text search index:
fts_model = SearchIndexModel(
    definition={"mappings": {"dynamic": False, "fields": {"text": {"type": "string"}}}},
    name=FTS_INDEX_NAME,
    type="search",
)
The index definition for the full-text search index looks a bit different from the vector search index. For full-text search, we specify mappings which define how document fields are processed, analyzed, and stored for efficient and accurate full-text search functionality. In this case, we set dynamic to False to indicate that we do not want the search index to automatically index fields. Instead, we will specify the fields we want to index and their types under mappings.fields. Also, notice that the index type for full-text search is search.
Use dynamic mappings if your schema changes regularly or is unknown.
Finally, let’s create the search indexes for the collection containing our data. If an index already exists, we will display an appropriate error message and skip index creation:
from pymongo.errors import OperationFailure

for model in [vs_model, fts_model]:
    try:
        collection.create_search_index(model=model)
    except OperationFailure:
        print(f"Duplicate index found for model {model}. Skipping index creation.")
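Search indexes are built asynchronously, so a query issued right after create_search_index may run before the index is ready. The following is a minimal polling sketch, assuming a PyMongo collection whose list_search_indexes() results include a queryable flag. The function name wait_for_search_index, the timeout, and the poll interval are illustrative choices and not part of the original notebook.

```python
import time


def wait_for_search_index(
    collection, index_name: str, timeout: float = 300.0, interval: float = 5.0
) -> bool:
    """Poll until the named search index reports queryable, or the timeout expires.

    `collection` is expected to be a pymongo Collection (or any object with a
    compatible list_search_indexes(name) method).
    """
    deadline = time.monotonic() + timeout
    while True:
        # list_search_indexes(name) yields one document per matching index
        for index in collection.list_search_indexes(index_name):
            if index.get("queryable"):
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, wait_for_search_index(collection, VS_INDEX_NAME) and wait_for_search_index(collection, FTS_INDEX_NAME) could be called after the loop above, before running any searches.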

Step 6: Get movie recommendations

Now that we have generated embeddings, and ingested and indexed our movie dataset into MongoDB Atlas, we are ready to use it to make movie recommendations.
Let’s define a function to do so:
def get_recommendations(query: str, mode: str, **kwargs) -> None:
    """
    Get movie recommendations

    Args:
        query (str): User query
        mode (str): Retrieval mode. One of (default, text_search, hybrid)
    """
    query_engine = vector_store_index.as_query_engine(
        similarity_top_k=5, vector_store_query_mode=mode, **kwargs
    )
    response = query_engine.query(query)
    nodes = response.source_nodes
    for node in nodes:
        title = node.metadata["title"]
        rating = node.metadata["rating"]
        score = node.score
        print(f"Title: {title} | Rating: {rating} | Relevance Score: {score}")
The get_recommendations function in the above code:
  • Takes a natural language user query (query), a search mode (mode), and any additional arguments as input. The search mode can be text_search for full-text search, default for vector search, and hybrid for hybrid search.
  • Converts the vector store index created in Step 4 into a query engine. The query engine in LlamaIndex is an interface to ask questions about your data and configure query settings. Converting an existing index into a query engine is as simple as doing <index_name>.as_query_engine(). You can also specify parameters for your search in the as_query_engine call—in our case, we will set the similarity_top_k to 5 to get the top five most relevant results from our searches, and the vector_store_query_mode to indicate the search mode.
  • Queries the query engine, retrieves the returned nodes, iterates through them, and prints the movie title, rating, and relevance score for each node.
A node in LlamaIndex represents a chunk of a source document. Under the hood, indexes in LlamaIndex store data as node objects, so when queried, you get back nodes from the index.
Now, let’s query our data with different search modes and observe the differences. Let’s start with full-text search.
get_recommendations(
    query="Action movies about humans fighting machines",
    mode="text_search",
)

Title: Hellboy II: The Golden Army | Rating: 7.0 | Relevance Score: 5.93734884262085
Title: The Matrix Revolutions | Rating: 6.7 | Relevance Score: 4.574477195739746
Title: The Matrix | Rating: 8.7 | Relevance Score: 4.387373924255371
Title: Go with Peace Jamil | Rating: 6.9 | Relevance Score: 3.5394840240478516
Title: Terminator Salvation | Rating: 6.7 | Relevance Score: 3.3378987312316895
If you look at the plots for these movies in our dataset, you will observe that they all contain the keywords “humans” and “machines,” which closely match the user query.
Next, let’s try the same query but with the search mode as vector search. To do this, set the mode parameter to default in the call to the get_recommendations() function:
get_recommendations(
    query="Action movies about humans fighting machines", mode="default"
)

Title: Death Machine | Rating: 5.7 | Relevance Score: 0.7407287359237671
Title: Real Steel | Rating: 7.1 | Relevance Score: 0.7364246845245361
Title: Soldier | Rating: 5.9 | Relevance Score: 0.7282171249389648
Title: Terminator 3: Rise of the Machines | Rating: 6.4 | Relevance Score: 0.7266112565994263
Title: Last Action Hero | Rating: 6.2 | Relevance Score: 0.7250100374221802
Note the difference in recommendations. The plots for the movies recommended above talk about futuristic societies, human soldiers battling “another breed of soldiers,” humans teaching robots how to box, etc., which are thematically related to the user query without containing the exact keywords.
Note the difference in scales between full-text and vector search. Refer to our documentation to read more about the scoring methodologies for Atlas Search and Atlas Vector Search.
Now, let’s try the query with the search mode as hybrid search. To do this, set the mode parameter to hybrid in the get_recommendations() function call:
get_recommendations(query="Action movies about humans fighting machines", mode="hybrid")

Title: Hellboy II: The Golden Army | Rating: 7.0 | Relevance Score: 0.5
Title: Death Machine | Rating: 5.7 | Relevance Score: 0.5
Title: The Matrix Revolutions | Rating: 6.7 | Relevance Score: 0.25
Title: Real Steel | Rating: 7.1 | Relevance Score: 0.25
Title: Soldier | Rating: 5.9 | Relevance Score: 0.16666666666666666
In this case, the list of recommendations is a mix of recommendations obtained from full-text and vector searches. Notice that the relevance scores for the same movies are now different. This is because the scores from vector search and full-text search are combined using reciprocal rank fusion (RRF). This technique combines rankings from multiple searches by assigning a score to each document based on the reciprocal of its rank in each list and optionally, a weight assigned to each search methodology.
By default, RRF assigns an equal weight to each search technique. However, you can change this by passing in an additional argument, alpha, to the query engine. This can be a value between 0 and 1, where a value closer to 1 will weigh vector search higher while combining the scores, and a value closer to 0 means full-text search will dominate. The value of alpha is 0.5 by default to indicate equal weighting.
# Higher alpha, vector search dominates
get_recommendations(
    query="Action movies about humans fighting machines",
    mode="hybrid",
    alpha=0.7,
)

Title: Death Machine | Rating: 5.7 | Relevance Score: 0.7
Title: Real Steel | Rating: 7.1 | Relevance Score: 0.35
Title: Hellboy II: The Golden Army | Rating: 7.0 | Relevance Score: 0.30000000000000004
Title: Soldier | Rating: 5.9 | Relevance Score: 0.2333333333333333
Title: Terminator 3: Rise of the Machines | Rating: 6.4 | Relevance Score: 0.175
In the above code, we pass alpha as an argument to the get_recommendations() function and it gets forwarded to the query engine. With alpha set to 0.7, notice that the vector search results are scored and hence ranked higher than the full-text search results in the final list of recommendations.
Now, let’s set alpha to a value less than 0.5 and see what changes:
# Lower alpha, full-text search dominates
get_recommendations(
    query="Action movies about humans fighting machines",
    mode="hybrid",
    alpha=0.3,
)

Title: Hellboy II: The Golden Army | Rating: 7.0 | Relevance Score: 0.7
Title: The Matrix Revolutions | Rating: 6.7 | Relevance Score: 0.35
Title: Death Machine | Rating: 5.7 | Relevance Score: 0.3
Title: The Matrix | Rating: 8.7 | Relevance Score: 0.2333333333333333
Title: Go with Peace Jamil | Rating: 6.9 | Relevance Score: 0.175
In the above code, we set alpha to 0.3 and notice that the full-text results are scored and ranked higher than the vector search results in the final list of recommendations.
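The scores printed in the hybrid runs above are consistent with each result receiving its search method's weight divided by its rank in that method's list: alpha / rank for vector search and (1 - alpha) / rank for full-text search, with ranks starting at 1. The sketch below reproduces that arithmetic from the two ranked lists shown earlier. It is inferred from the printed scores rather than taken from the integration's source, and weighted_rank_fusion is a hypothetical name. The common formulation of RRF also adds a smoothing constant to the rank (1 / (k + rank)), which the scores above do not appear to reflect.

```python
def weighted_rank_fusion(
    vector_ranked: list[str],
    text_ranked: list[str],
    alpha: float = 0.5,
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Fuse two ranked lists: each hit contributes weight / rank (rank starts at 1)."""
    scores: dict[str, float] = {}
    for weight, ranked in ((alpha, vector_ranked), (1 - alpha, text_ranked)):
        for rank, title in enumerate(ranked, start=1):
            # A title found by both searches accumulates both contributions
            scores[title] = scores.get(title, 0.0) + weight / rank
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_k]


# Rankings taken from the vector and full-text outputs earlier in this step
vector_ranked = [
    "Death Machine",
    "Real Steel",
    "Soldier",
    "Terminator 3: Rise of the Machines",
    "Last Action Hero",
]
text_ranked = [
    "Hellboy II: The Golden Army",
    "The Matrix Revolutions",
    "The Matrix",
    "Go with Peace Jamil",
    "Terminator Salvation",
]

for title, score in weighted_rank_fusion(vector_ranked, text_ranked, alpha=0.7):
    print(f"Title: {title} | Relevance Score: {score}")
```

With alpha=0.7, the fused order matches the alpha=0.7 run above, and with alpha=0.3 it matches the alpha=0.3 run. Because only ranks enter the calculation, the different score scales of full-text and vector search do not need to be reconciled.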
Oftentimes, in recommendation systems, in addition to retrieval, there is a step to filter out retrieved candidates based on certain criteria such as known user preferences or custom business logic. If you have these criteria captured as metadata in your documents in MongoDB Atlas, you can even pre-filter documents to get an efficiency boost, rather than applying filters after retrieval.
As a final step in this tutorial, let’s see how to combine pre-filtering with hybrid search. Let’s assume we only want to recommend English movies with a rating greater than 7/10.
from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
    FilterCondition,
)

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="metadata.rating", value=7, operator=FilterOperator.GT),
        MetadataFilter(
            key="metadata.languages", value="English", operator=FilterOperator.EQ
        ),
    ],
    condition=FilterCondition.AND,
)
In LlamaIndex, you can specify metadata filters as Pydantic models (MetadataFilter) containing the metadata field to filter on (key), the field value (value), and a filter operator (operator). If you have multiple filter conditions, you can combine them using the MetadataFilters Pydantic model, with a filters attribute consisting of the filter conditions expressed as MetadataFilter models, and a condition attribute specifying how to combine the filter conditions—e.g., and, or.
In the above code, we create two metadata filters—one with key set to metadata.rating, value to 7, and operator to FilterOperator.GT indicating “greater than” to filter for movies with a rating greater than 7, and another with key set to metadata.languages, value to English, and operator to FilterOperator.EQ indicating “equal to” to filter for English movies. Finally, we combine these filters into a MetadataFilters model with the condition FilterCondition.AND to filter on documents where both the filter conditions are satisfied.
get_recommendations(
    query="Action movies about humans fighting machines",
    mode="hybrid",
    alpha=0.7,
    filters=filters,
)

Title: Real Steel | Rating: 7.1 | Relevance Score: 0.7
Title: T2 3-D: Battle Across Time | Rating: 7.8 | Relevance Score: 0.35
Title: The Matrix | Rating: 8.7 | Relevance Score: 0.30000000000000004
Title: Predator | Rating: 7.8 | Relevance Score: 0.2333333333333333
Title: Transformers | Rating: 7.1 | Relevance Score: 0.175
In the above code, we pass the same query as before with the mode set to hybrid, alpha to 0.7, and an additional filters argument set to the MetadataFilters model we defined previously. Under the hood, the Pydantic filter model is translated to MongoDB query filters and added to the vector and full-text search pipelines to filter the search results.
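To make that translation concrete, here is a minimal sketch of how the two filters above could map onto a MongoDB query filter. It uses plain tuples instead of the Pydantic models, and to_mql_filter and OPERATORS are hypothetical names. The integration's actual translation may differ in detail.

```python
# Illustrative operator mapping; not the integration's actual lookup table
OPERATORS = {
    "==": "$eq",
    "!=": "$ne",
    ">": "$gt",
    ">=": "$gte",
    "<": "$lt",
    "<=": "$lte",
}


def to_mql_filter(filters: list[tuple[str, str, object]], condition: str = "and") -> dict:
    """Translate (key, operator, value) triples into a MongoDB query filter."""
    clauses = [{key: {OPERATORS[op]: value}} for key, op, value in filters]
    if not clauses:
        return {}
    if len(clauses) == 1:
        return clauses[0]
    # Combine multiple clauses with $and or $or
    return {f"${condition}": clauses}


mql_filter = to_mql_filter(
    [("metadata.rating", ">", 7), ("metadata.languages", "==", "English")]
)
print(mql_filter)
```

The printed dictionary has an $and clause wrapping one $gt condition on metadata.rating and one $eq condition on metadata.languages, which is the shape of filter a MongoDB search pipeline stage would receive.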

Conclusion

In this tutorial, we built a content recommendation system using MongoDB and LlamaIndex. We explored different search modes, focusing on hybrid search, and also ways to tune it to improve the relevance of the search results. In content recommendation systems, in particular, hybrid search can help balance precision and diversity by optimizing for both keywords and contextual similarity.
To explore more ways to improve the retrieval quality of your searches, check out the other tutorials on Developer Center.