
How to Use Cohere Embeddings and Rerank Modules with MongoDB Atlas

Ashwin Gangadhar • 9 min read • Published Apr 02, 2024 • Updated Apr 04, 2024
One of the hardest choices developers face when building solutions on the retrieval-augmented generation (RAG) framework is the retrieval mechanism. Augmenting the large language model (LLM) prompt with relevant and exhaustive information produces better responses from such systems. For semantic similarity search, you have to choose the most appropriate embedding model. For full-text search, you have to build the implementation carefully to achieve good precision and recall in your results. Sometimes, a solution requires a combined implementation that benefits from both retrieval mechanisms.
If your current full-text search scoring workflow leaves something to be desired, or if you find yourself writing too many lines of code to get semantic search working within your applications, then Cohere and MongoDB can help. Both offer easy-to-use, fully managed solutions, so these issues need not hold you back from using AI-powered search or machine learning within your application.
Cohere is an AI company specializing in large language models. This tutorial uses two of its offerings:
  1. Cohere provides a powerful tool for embedding natural language, which helps you represent your content as accurate and relevant embeddings. The Cohere language model also offers a simple and intuitive API that you can integrate with your existing workflows and platforms.
  2. The Cohere Rerank module is a component of the Cohere natural language processing system that helps select the best output from a set of candidates. The module uses a neural network to score each candidate based on its relevance, semantic similarity, theme, and style. It then ranks the candidates by score and returns the top N as the final output.
MongoDB Atlas is a fully managed developer data platform that provides scalable, secure, and reliable data storage and access for your applications. One of its key features is the ability to perform vector search and full-text search on your data, which helps you build AI/ML-powered applications that use both structured and unstructured data. You can create and manage search indexes, run queries, and analyze results using the Atlas interface, APIs, and drivers. MongoDB Atlas Vector Search also supports pre-filtering and post-filtering on vector search queries, which helps users control the behavior of their vector search results, improving accuracy and retrieval performance while reducing cost.
With Cohere and MongoDB Atlas, we can therefore demonstrate how to add semantic search to your private dataset with very few lines of code. Additionally, you can improve the existing ranking of your full-text search retrieval systems using the Cohere Rerank module. Both techniques are highly beneficial for building more complex GenAI applications, such as RAG- or LLM-powered summarization or data augmentation.

What will we do in this tutorial?

Store embeddings and prepare the index

  1. Use Cohere Embed Jobs to generate vector embeddings for a large dataset for the first time, as an asynchronous, scheduled batch job.
  2. Add the vector embeddings to MongoDB Atlas, which can store and index them alongside your other operational data and metadata.
  3. Finally, prepare the indexes for both vector embeddings and full-text search on our private dataset.

Search with vector embeddings

  1. Write a simple Python function that accepts search terms or phrases and passes them through the Cohere embed API to get a query vector.
  2. Take the resulting query vector and perform a vector search query using the $vectorSearch operator in the MongoDB Aggregation Pipeline.
  3. Pre-filter documents using meta information to narrow the search across your dataset, thereby speeding up vector search while retaining accuracy.
  4. Post-filter the retrieved semantically similar documents on their relevancy score to demonstrate a higher degree of control over the semantic search behaviour.

Search with text and Rerank with Cohere

  1. Write a simple Python function that accepts search terms or phrases and prepares a query using the $search operator and the MongoDB Aggregation Pipeline.
  2. Rerank the retrieved documents using the Cohere Rerank module to achieve higher accuracy in the full-text search results.
Cohere and MongoDB Flow Diagram
This hands-on tutorial shows you how to set up MongoDB with the sample_movies dataset (the link to the file is in the code snippets). You'll learn how to use the Cohere embed jobs API to schedule a batch job that processes all the documents and adds a new field named embedding to the dataset, stored alongside the other metadata and operational data. We will use this field to create a vector search index programmatically using the MongoDB Python driver. Once the index is created, we can demonstrate how to query with the vector embedding as well as perform full-text search using the expressive and composable MongoDB Aggregation Pipeline (Query API).

Steps to initialize and run through the tutorial

Python dependencies

  • pandas: Helps with data preprocessing and handling
  • cohere: For embedding model and rerank module
  • pymongo: For the MongoDB Atlas vector store and full-text search
  • s3fs: To load files directly from an S3 bucket

Install all dependencies

Run the following line of code in Jupyter Notebook to install the required packages.
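In a notebook cell, the install itself is typically a single `%pip install pandas cohere pymongo s3fs`. As a minimal sketch, the helper below checks which of the four packages listed above are still missing and builds the matching pip command. The function names `missing_packages` and `pip_command` are illustrative and not part of the original tutorial.

```python
import importlib.util

# The four dependencies listed above; each pip name matches its import name.
REQUIRED_PACKAGES = ["pandas", "cohere", "pymongo", "s3fs"]


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the packages that cannot be imported in the current environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def pip_command(packages):
    """Build the pip command that would install the given packages."""
    if not packages:
        return ""
    return "pip install " + " ".join(packages)


# Show what (if anything) still needs to be installed.
command = pip_command(missing_packages())
print(command or "All dependencies are already installed.")
```

If the helper prints a command, run it in a notebook cell prefixed with `%` (or `!`) and restart the kernel before continuing.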

Initialize the Cohere API key and MongoDB connection string

If you have not created an API key on the Cohere platform, you can sign up for a Cohere account and generate one there.
Also, if you have not created a MongoDB Atlas instance for yourself, you can follow the tutorial to create one. This will provide you with your MONGODB_CONNECTION_STR.
Run the following lines of code in Jupyter Notebook to initialize the Cohere secret or API key and MongoDB Atlas connection string.
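One way to do this without pasting secrets into the notebook is to read them from environment variables. The sketch below assumes the variables are named `COHERE_API_KEY` (an assumed name) and `MONGODB_CONNECTION_STR` (the name used above). The helper `load_secrets` is hypothetical.

```python
import os


def load_secrets(env=os.environ):
    """Read the Cohere API key and the Atlas connection string from a mapping.

    `env` defaults to the process environment; pass a plain dict when testing.
    """
    names = ("COHERE_API_KEY", "MONGODB_CONNECTION_STR")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a confusing API error later.
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Typical notebook usage, once both variables are set:
# secrets = load_secrets()
# COHERE_API_KEY = secrets["COHERE_API_KEY"]
# MONGODB_CONNECTION_STR = secrets["MONGODB_CONNECTION_STR"]
```

Failing early on a missing value keeps authentication problems separate from errors in the later embedding and search steps.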

Load dataset from the S3 bucket

Run the following lines of code in Jupyter Notebook to read data from an AWS S3 bucket directly to a pandas dataframe.
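A minimal sketch of this step follows. It assumes the dataset is stored as a JSON Lines file, which this page does not state, and it uses a placeholder URI that you should replace with the real object path from the tutorial's notebook. pandas hands `s3://` URIs to s3fs, which is why s3fs is in the dependency list.

```python
def load_movies(uri, storage_options=None):
    """Read a JSON Lines file (local path or s3:// URI) into a pandas DataFrame."""
    import pandas as pd  # imported lazily so this cell loads even before installation

    # lines=True treats every line of the file as one JSON record.
    return pd.read_json(uri, orient="records", lines=True,
                        storage_options=storage_options)


# Placeholder only; this is not the real bucket or key.
MOVIES_URI = "s3://<bucket>/<path>/movies.jsonl"

# For a publicly readable bucket, anonymous access is usually sufficient:
# movies = load_movies(MOVIES_URI, storage_options={"anon": True})
# movies.head()
```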
Loaded AWS S3 Dataset

Initialize and schedule the Cohere embeddings job to embed the "sample_movies" dataset

Here we will create a movies dataset in Cohere by uploading the sample movies dataset that we fetched from the S3 bucket and stored locally. Once the dataset is created, we can use the Cohere embed jobs API to schedule a batch job that embeds the entire dataset.
You can run the following lines of code in your Jupyter Notebook to upload your dataset to Cohere and schedule an embedding job.
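The sketch below outlines the upload-then-embed flow. It assumes the method names of the Cohere Python SDK v5 (`datasets.create`, `embed_jobs.create`, and `wait`); older SDK versions expose different names, so check the version you have installed. The model `embed-english-v3.0`, the dataset name, and the helper `schedule_embed_job` are assumptions, not values taken from the original notebook.

```python
def schedule_embed_job(co, jsonl_path, dataset_name="movies",
                       model="embed-english-v3.0"):
    """Upload a JSONL file as an embed-input dataset and run a batch embed job.

    `co` is a Cohere client, e.g. cohere.Client(api_key=COHERE_API_KEY).
    """
    # 1. Upload the local file as a dataset of type "embed-input".
    with open(jsonl_path, "rb") as handle:
        dataset = co.datasets.create(name=dataset_name, data=handle,
                                     type="embed-input")
    # 2. Block until Cohere has validated the uploaded dataset.
    co.wait(dataset)
    # 3. Schedule the asynchronous job that embeds every record as a document.
    job = co.embed_jobs.create(dataset_id=dataset.id,
                               input_type="search_document",
                               model=model,
                               truncate="END")
    # 4. Block until the job finishes and return its final state.
    return co.wait(job)


# Typical notebook usage (requires a valid API key and network access):
# import cohere
# co = cohere.Client(api_key=COHERE_API_KEY)
# finished_job = schedule_embed_job(co, "movies.jsonl")
```

Because the job runs asynchronously on Cohere's side, the two `wait` calls are the only blocking points. For very large datasets you may prefer to poll the job status instead of blocking the notebook.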

How to initialize MongoDB Atlas and insert data to a MongoDB collection

Now that we have created the vector embeddings for our sample movies dataset, we can initialize the MongoDB client and insert the documents into our collection of choice by running the following lines of code in the Jupyter Notebook.
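As a rough sketch, the insert step can be written as a small batching helper around PyMongo's `insert_many`. The database and collection names in the usage comment are hypothetical, and `documents` is assumed to be a list of dicts that already carry the `embedding` field.

```python
def get_collection(connection_str, db_name, collection_name):
    """Open a MongoDB Atlas collection using the given connection string."""
    from pymongo import MongoClient  # imported lazily; requires pymongo

    return MongoClient(connection_str)[db_name][collection_name]


def insert_documents(collection, documents, batch_size=500):
    """Insert documents in batches and return how many were written."""
    documents = list(documents)
    inserted = 0
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # insert_many sends one batch per round trip instead of one per document.
        result = collection.insert_many(batch)
        inserted += len(result.inserted_ids)
    return inserted


# Typical notebook usage (names below are placeholders):
# collection = get_collection(MONGODB_CONNECTION_STR, "sample_db", "movies")
# insert_documents(collection, movies_with_embeddings)
```

Batching keeps each request well below MongoDB's message size limit, which matters once every document carries a long embedding array.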

Programmatically create vector search and full-text search index

With the latest update to the PyMongo Python package, you can now create vector search indexes as well as full-text search indexes from the Python client itself. You can also create vector indexes using the MongoDB Atlas UI or mongosh.
Run the following lines of code in your Jupyter Notebook to create search and vector search indexes on your new collection.
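The sketch below shows one way to define both indexes. It assumes PyMongo 4.7 or later, where `SearchIndexModel` accepts a `type` argument, and 1024-dimensional embeddings, which is the output size of `embed-english-v3.0`. The index names, the cosine similarity choice, and the dynamic text mapping are assumptions rather than the original notebook's settings.

```python
def vector_index_definition(path="embedding", dimensions=1024,
                            similarity="cosine", filter_paths=("year",)):
    """Build an Atlas Vector Search index definition.

    Fields listed in `filter_paths` are indexed so they can be used
    for pre-filtering inside $vectorSearch.
    """
    fields = [{"type": "vector", "path": path,
               "numDimensions": dimensions, "similarity": similarity}]
    fields += [{"type": "filter", "path": name} for name in filter_paths]
    return {"fields": fields}


def text_index_definition():
    """Build a full-text search index that dynamically maps every field."""
    return {"mappings": {"dynamic": True}}


def create_indexes(collection, vector_name="vector_index", text_name="default"):
    """Create both indexes on the collection and return their names."""
    from pymongo.operations import SearchIndexModel  # requires pymongo >= 4.7

    models = [
        SearchIndexModel(definition=vector_index_definition(),
                         name=vector_name, type="vectorSearch"),
        SearchIndexModel(definition=text_index_definition(),
                         name=text_name, type="search"),
    ]
    return collection.create_search_indexes(models)
```

Index builds on Atlas are asynchronous, so a query issued immediately after `create_indexes` returns may find no results until the indexes become queryable.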

Query MongoDB vector index using $vectorSearch

MongoDB Atlas brings the flexibility of using vector search alongside full-text search filters. You can apply range, string, and numeric filters using the aggregation pipeline, which gives the end user control over the behavior of the semantic search response. The code below demonstrates a vector search with pre-filtering on the year field to return only movies released before 1990. You can also control the relevance of the returned results by post-filtering the response using the MongoDB Query API. In this demo, we filter on the score field, which holds the vector similarity between the query and each document, and use a heuristic threshold to retain only the most relevant results.
Run the below lines of code in Jupyter Notebook to initialize a function that can help you achieve vector search + pre-filter + post-filter.
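A minimal sketch of such a function is shown below, split into a pure pipeline builder and a thin wrapper that calls Cohere and MongoDB. The index name `vector_index`, the projected `title` field, the model `embed-english-v3.0`, and the example threshold are assumptions. Only the `embedding`, `year`, and `score` fields come from the text above.

```python
def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 path="embedding", limit=10, num_candidates=100,
                                 pre_filter=None, min_score=None,
                                 fields=("title", "year")):
    """Build a $vectorSearch pipeline with an optional pre- and post-filter."""
    stage = {"index": index_name, "path": path,
             "queryVector": list(query_vector),
             "numCandidates": num_candidates, "limit": limit}
    if pre_filter:
        # Pre-filter: applied inside the vector search, before scoring.
        stage["filter"] = pre_filter
    projection = {"_id": 0}
    projection.update({name: 1 for name in fields})
    projection["score"] = {"$meta": "vectorSearchScore"}
    pipeline = [{"$vectorSearch": stage}, {"$project": projection}]
    if min_score is not None:
        # Post-filter: drop results whose similarity score is below the threshold.
        pipeline.append({"$match": {"score": {"$gte": min_score}}})
    return pipeline


def embed_query(co, text, model="embed-english-v3.0"):
    """Embed a search phrase with Cohere, using the query input type."""
    response = co.embed(texts=[text], model=model, input_type="search_query")
    return response.embeddings[0]


def vector_search(collection, co, query, **options):
    """Embed the query, run the pipeline, and return the matching documents."""
    pipeline = build_vector_search_pipeline(embed_query(co, query), **options)
    return list(collection.aggregate(pipeline))


# Plain search, search with a pre-filter, and search with both filters:
# vector_search(collection, co, "romantic comedy movies")
# vector_search(collection, co, "romantic comedy movies",
#               pre_filter={"year": {"$lt": 1990}})
# vector_search(collection, co, "romantic comedy movies",
#               pre_filter={"year": {"$lt": 1990}}, min_score=0.8)
```

The `year` field must be declared as a filter field in the vector index for the pre-filter to work. The `0.8` threshold is an arbitrary illustration, and a suitable value depends on your data and embedding model.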

Vector search query example

Run the below lines of code in a Jupyter Notebook cell and you can see the following results.
Vector Search Query Example Results

Vector search query example with prefilter

Vector Search with Prefilter Example Results

Vector search query example with prefilter and postfilter to control the semantic search relevance and behaviour

Vector Search with Prefilter and Postfilter Example Results

Leverage MongoDB Atlas full-text search with Cohere Rerank module

Cohere Rerank is a module in the Cohere suite of offerings that improves the quality of search results by applying semantic search. Traditional search engines rely solely on keywords. Rerank goes a step further by ranking the results retrieved from the search engine based on their semantic relevance to the input query. This re-ranking pass helps return more appropriate and contextually similar search results.
To see how the Rerank module can be used with MongoDB Atlas full-text search, run the following lines of code in your Jupyter Notebook.
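The sketch below pairs a `$search` query with a reranking pass. It assumes a search index named `default`, the Cohere Python SDK's `rerank` method, and the model `rerank-english-v3.0`. None of these are confirmed by this page, and `text_field` must be set to whichever text field your documents contain.

```python
def build_text_search_pipeline(query, path, index_name="default", limit=20):
    """Build a full-text $search pipeline that returns the top keyword matches."""
    return [
        {"$search": {"index": index_name,
                     "text": {"query": query, "path": path}}},
        {"$limit": limit},
    ]


def rerank_documents(co, query, documents, text_field, top_n=5,
                     model="rerank-english-v3.0"):
    """Reorder documents by semantic relevance to the query using Cohere Rerank."""
    if not documents:
        return []
    texts = [str(doc.get(text_field, "")) for doc in documents]
    response = co.rerank(query=query, documents=texts,
                         top_n=min(top_n, len(texts)), model=model)
    # Each result points back to the position of the document in the input list.
    return [dict(documents[item.index], relevance_score=item.relevance_score)
            for item in response.results]


def search_and_rerank(collection, co, query, text_field, top_n=5):
    """Run the full-text search, then rerank the retrieved documents."""
    pipeline = build_text_search_pipeline(query, text_field)
    candidates = list(collection.aggregate(pipeline))
    return rerank_documents(co, query, candidates, text_field, top_n=top_n)
```

Retrieving more candidates than you finally return (here 20 versus 5) gives the reranker room to promote documents that keyword scoring placed lower in the list.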
Cohere Rerank Model Sample Results
Output post reranking the full-text search results:


In this tutorial, we were able to demonstrate the following:
  1. Using Cohere embeddings along with MongoDB Vector Search, we showed how easy it is to add semantic search functionality alongside your operational data functions.
  2. With Cohere Rerank, we retrieved results using the full-text search capabilities in MongoDB and then ranked them by semantic relevance, thereby delivering richer, more relevant results without replacing your existing search architecture.
  3. Both implementations required minimal lines of code, which showcases their ease of use.
  4. Cohere Embeddings and Rerank do not need a team of ML experts to develop and maintain, which keeps the monthly maintenance costs to a minimum.
  5. Both solutions are cloud-agnostic and, hence, can be set up on any cloud platform.
The same code is available in a notebook, which reduces the time and effort of following the steps in this blog.

What's next?

To learn more about how MongoDB Atlas is helping build application-side ML integration in real-world applications, you can visit the MongoDB for AI page.
