How to Seamlessly Use MongoDB Atlas and IBM watsonx.ai LLMs in Your GenAI Applications

Ashwin Gangadhar • 9 min read • Published Nov 14, 2023 • Updated Nov 15, 2023

One of the challenges of e-commerce applications is providing relevant and personalized product recommendations to customers. Traditional keyword-based search often fails to capture the semantic meaning and intent behind a user’s query, so it returns results that do not meet the user’s needs and, in turn, fail to convert into sales. This tutorial addresses the problem with a RAG (retrieval-augmented generation) framework powered by MongoDB Atlas Vector Search, LangChain, and IBM watsonx.ai.

RAG is a natural language generation (NLG) technique that leverages a retriever module to fetch relevant documents from a large corpus and a generator module to produce text conditioned on the retrieved documents. Here, the RAG framework is used to power product recommendations as an extension to existing semantic search techniques.

RAG use cases can be built using the vector search capabilities of MongoDB Atlas to store and query large-scale product embeddings that represent the features and attributes of each product. Because of MongoDB’s flexible schema, the product details themselves are stored right alongside the product embeddings, eliminating the complexity and latency of having to retrieve the data from separate tables or databases.
RAG then retrieves the products most similar to the user query based on the cosine similarity of their embeddings, and generates natural language reasons that highlight why these products are relevant and appealing to the user.
RAG can also enhance the user experience (UX) by handling complex and diverse search queries, such as "a cozy sweater for winter" or "a gift for my daughter who is interested in science", and by providing accurate and engaging product recommendations that increase customer satisfaction and loyalty.
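
To make the retrieval step concrete, here is a minimal sketch in plain Python of ranking products by the cosine similarity of their embeddings. The three-dimensional vectors and product titles are made up for illustration; real embeddings come from an embedding model and the ranking is done by Atlas Vector Search rather than in application code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, products, k=2):
    """Return the titles of the k products whose embeddings best match the query."""
    ranked = sorted(
        products,
        key=lambda p: cosine_similarity(query_vec, p["embedding"]),
        reverse=True,
    )
    return [p["title"] for p in ranked[:k]]

# Toy embeddings, invented for this example
products = [
    {"title": "Chemistry kit", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Wool sweater", "embedding": [0.0, 0.2, 0.9]},
    {"title": "Telescope", "embedding": [0.8, 0.3, 0.1]},
]
query_vec = [1.0, 0.2, 0.0]  # stands in for an embedded "science gift" query

print(top_k(query_vec, products))  # ['Chemistry kit', 'Telescope']
```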

Why MongoDB Atlas Vector Search?

To enable RAG to retrieve relevant documents, we need a vector search engine that can efficiently store and query large-scale document embeddings. MongoDB Atlas allows us to index documents based on their embeddings and perform similarity search using cosine distance, euclidean distance, or dot product. MongoDB Atlas also provides flexible schema, scalability, security, and performance features that over 45,000 organizations — from startups to enterprises and governments — rely on.
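
As a rough illustration of how the three similarity options differ, the sketch below computes each measure for two vectors that point in the same direction but differ in length. It is plain Python for intuition only and does not reproduce how Atlas scores results internally.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products; grows with both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus the cosine of the angle; ignores vector magnitude."""
    norms = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / norms

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the length

print(cosine_distance(a, b))     # 0.0: identical direction
print(euclidean_distance(a, b))  # 5.0: the points are still far apart
print(dot_product(a, b))         # 50.0: rewards the larger magnitude
```

Cosine is a common choice for text embeddings because it compares direction only, which is the measure this tutorial relies on.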

By using RAG with MongoDB Atlas Vector Search, we can enhance the user experience of product recommendations in several ways. First, we can provide more personalized and diverse recommendations that match the user's query or context. Second, we can generate more informative and engaging responses that explain why each product is recommended and how it compares to others. Third, we can keep the recommendations relevant and accurate by updating the document embeddings as the underlying product data changes, because MongoDB is primarily an operational data layer (ODL) with transactional features.

What is watsonx.ai?

Watsonx.ai is IBM’s next-generation enterprise studio for AI builders, bringing together new generative AI capabilities with traditional machine learning (ML) that span the entire AI lifecycle. With watsonx.ai, you can train, validate, tune, and deploy foundation and traditional ML models.

watsonx.ai brings forth a curated library of foundation models, including IBM-developed models, open-source models, and models sourced from third-party providers. Not all models are created equal, and the curated library provides enterprises with the optionality to select the model best suited to a particular use case, industry, domain, or even price performance. Further, IBM-developed models, such as the Granite model series, offer another level of enterprise-readiness, transparency, and indemnification for production use cases. We’ll be using Granite models in our demonstration. For the interested reader, IBM has published information about its data and training methodology for its Granite foundation models.

How to build a custom RAG-powered product discovery pipeline

For this tutorial, we will be using an e-commerce products dataset containing over 10,000 product details. We will be using the sentence-transformers/all-mpnet-base-v2 model from Hugging Face to generate the vector embeddings to store and retrieve product information. You will need a Python notebook or an IDE, a MongoDB Atlas account, and a watsonx.ai account for hands-on experience.

For convenience, the notebook to follow along and execute in your environment is available on GitHub.

Python dependencies

langchain: Orchestration framework
ibm-watson-machine-learning: For IBM LLMs
wget: To download knowledge base data
sentence-transformers: For embedding model
pymongo: For the MongoDB Atlas vector store
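
Before going further, it can help to confirm that these packages are importable in your environment. The following is a small sketch using only the standard library; the module names are the import names we assume correspond to the packages listed above (pip names use hyphens where the import names use underscores).

```python
import importlib.util

# Import names assumed for the packages listed above
REQUIRED_MODULES = [
    "langchain",
    "ibm_watson_machine_learning",
    "wget",
    "sentence_transformers",
    "pymongo",
]

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print("Missing:", missing_modules(REQUIRED_MODULES))
```

Anything reported as missing can be installed with pip before running the rest of the notebook.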

watsonx.ai dependencies

We’ll be using the watsonx.ai foundation models and Python SDK to implement our RAG pipeline in LangChain.

Sign up for a free watsonx.ai trial on IBM Cloud. Register and get set up.
Create a watsonx.ai project. During onboarding, a sandbox project can be quickly created for you. You can either use the sandbox project or create one; the link will work once you have registered and set up watsonx.ai. If more help is needed, you can read the documentation.
Create an API key to access watsonx.ai foundation models. Follow the steps to create your API key.
Install and use the watsonx.ai SDK. Also known as the IBM Watson Machine Learning SDK, it is documented on GitHub. Like any other Python module, you can install it with pip install. Our example notebook takes care of this for you.

We will be running all the code snippets below in a Jupyter notebook. You can choose to run these in VS Code or any other IDE of your choice.

Initialize the LLM

Initialize the watsonx URL to connect to by running the code block below in your Jupyter notebook:

Code Snippet

Enter the URL of the watsonx domain you want to access. For example: https://us-south.ml.cloud.ibm.com.

To access the LLMs and other AI services on watsonx, you need to initialize the API key. Do so by running the following code block in your Jupyter notebook:

Code Snippet

You will be prompted when you run the above code to add the IAM API key you fetched earlier.

Each experiment can be tagged or executed under a specific project. To fetch the relevant project, we can initialize the project ID by running the code block below in the Jupyter notebook:

Code Snippet

You can find the project ID alongside your IAM API key in the settings panel in the watsonx.ai portal.
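
The three initialization steps above could be sketched as follows. This is not the notebook's own code: the environment variable names are hypothetical, and the helper simply reads each value from the environment when it is set and otherwise prompts for it with getpass, so that secrets are not echoed to the screen.

```python
import getpass
import os

# Hypothetical environment variable names; the notebook may use different ones
SETTINGS = {
    "wxa_url": ("WXA_URL", "watsonx URL (e.g. https://us-south.ml.cloud.ibm.com): "),
    "wxa_api_key": ("WXA_API_KEY", "IAM API key: "),
    "wxa_project_id": ("WXA_PROJECT_ID", "watsonx.ai project ID: "),
}

def load_settings(env=os.environ, ask=getpass.getpass):
    """Read each setting from the environment, prompting only for missing ones."""
    values = {}
    for name, (env_var, prompt) in SETTINGS.items():
        values[name] = env.get(env_var) or ask(prompt)
    return values
```

In a notebook cell, calling settings = load_settings() and then assigning wxa_url = settings["wxa_url"] (and likewise for wxa_api_key and wxa_project_id) yields the variables used in the rest of this tutorial.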

Language model

In the code example below, we will initialize Granite LLM from IBM and then demonstrate how to use the initialized LLM with the LangChain framework before we build our RAG.

We will use the query: "I want to introduce my daughter to science and spark her enthusiasm. What kind of gifts should I get her?"

This will help us demonstrate how the LLM and vector search work in a RAG framework at each step.

Firstly, let us initialize the LLM hosted on the watsonx cloud. To access the relevant Granite model from watsonx, you need to run the following code block to initialize and test the model with our sample query in the Jupyter notebook:

Code Snippet

# watsonx.ai foundation-model SDK and its LangChain wrapper
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM
from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods, ModelTypes
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams

# Generation parameters: greedy decoding, between 1 and 100 new tokens
parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 100
}

# Granite model handle; wxa_url, wxa_api_key, and wxa_project_id
# are the values initialized in the previous steps
model = Model(
    model_id=ModelTypes.GRANITE_13B_INSTRUCT,
    params=parameters,
    credentials={
        "url": wxa_url,
        "apikey": wxa_api_key
    },
    project_id=wxa_project_id
)

# Wrap the model so it can be used as a LangChain LLM
granite_llm_ibm = WatsonxLLM(model=model)

# Sample query chosen in the example to evaluate the RAG use case
query = "I want to introduce my daughter to science and spark her enthusiasm. What kind of gifts should I get her?"

# Sample LLM query without RAG framework
result = granite_llm_ibm(query)
print(result)

Output:

Initialize MongoDB Atlas for vector search

Before starting this section, you should have already set up a cluster in MongoDB Atlas. If you have not created one, you can follow the steps in the MongoDB Atlas tutorial to create an account in Atlas (the developer data platform) and a cluster in which to store and retrieve data. We also advise spinning up an Atlas dedicated cluster of size M10 or higher for this tutorial.

Now, let us see how we can set up MongoDB Atlas to provide relevant information to augment our RAG framework.

Init Mongo client

We can connect to the MongoDB Atlas cluster using the connection string as detailed in the tutorial link above. To initialize the connection string, run the below code block in your Jupyter notebook:

Code Snippet

When prompted, you can enter your MongoDB Atlas connection string.

Download and load data to MongoDB Atlas

In the steps below, we demonstrate how to download the products dataset from the provided URL and add the documents to the respective collection in MongoDB Atlas. We will also embed the raw product texts as vectors before adding them to MongoDB. You can do this by running the following lines of code in your Jupyter notebook:

Code Snippet

You will be able to see that the documents have been created in the amazon database under the products collection.
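
The embed-and-load step could look roughly like the sketch below. It is an approximation under stated assumptions, not the notebook's code: the title and description field names and the embedding field name are hypothetical, the batch size is arbitrary, and the download step is left out because the dataset URL is not reproduced here. The database and collection names (amazon and products) and the embedding model are the ones named in this tutorial.

```python
def product_text(product):
    """Join the descriptive fields of a product into one string to embed."""
    parts = [product.get("title", ""), product.get("description", "")]
    return " ".join(p for p in parts if p)

def with_embeddings(products, embed):
    """Return copies of the products with an added 'embedding' field."""
    return [{**p, "embedding": embed(product_text(p))} for p in products]

def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def load_products(connection_string, products, batch_size=500):
    """Embed the products and insert them into the amazon.products collection."""
    # Imported here so the helpers above work without these packages installed
    from pymongo import MongoClient
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

    def embed(text):
        # encode() returns a NumPy array; MongoDB needs a plain list of floats
        return model.encode(text).tolist()

    collection = MongoClient(connection_string)["amazon"]["products"]
    for batch in batches(with_embeddings(products, embed), batch_size):
        collection.insert_many(batch)
```

Passing the embedding function in as an argument keeps the document-building logic independent of the model, which makes it easy to try on a handful of records first.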

Now that all the product information is added to the respective collection, we can go ahead and create a vector index by following the steps given in the Atlas Search index tutorial. You can create the search index either in the Atlas UI or programmatically. Let us look at the steps for the Atlas UI.

You can select the JSON Editor view to insert the vector search index config as shown in the image. Insert the following mapping to create the vector indexes.

Code Snippet
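
As a hedged sketch of what such a mapping can look like, the snippet below builds an index definition as a Python dictionary and prints it as JSON for pasting into the JSON Editor. It assumes the knnVector field mapping form of Atlas Search indexes and a vector field named embedding; both are assumptions, so adjust them to your collection and to the index type your Atlas version expects. The 768 dimensions match the output size of the all-mpnet-base-v2 model.

```python
import json

EMBEDDING_FIELD = "embedding"  # assumed name of the vector field
EMBEDDING_DIMENSIONS = 768     # output size of all-mpnet-base-v2

index_definition = {
    "mappings": {
        "dynamic": True,
        "fields": {
            EMBEDDING_FIELD: {
                "type": "knnVector",
                "dimensions": EMBEDDING_DIMENSIONS,
                "similarity": "cosine",
            }
        },
    }
}

# Print the definition as JSON, ready to paste into the editor
print(json.dumps(index_definition, indent=2))
```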

Once you have saved the vector search index definition in the Atlas UI, Atlas automatically starts building the vector search index for the collection. It also automatically indexes any new document added to the collection. As a result, you can update the collection in real time as you add new products, and the RAG framework will always serve users the latest products you are offering.

For a description of the other fields in this configuration, you can check out our Vector Search documentation.

Sample query to vector search

We can test the vector similarity search by running the sample query with the LangChain MongoDB Atlas Vector Search connector. Run the following code in your Jupyter notebook:

Code Snippet
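
A query along these lines could be sketched as follows. The import paths match LangChain releases from around the time this tutorial was published and have since moved in newer releases; the index name, text_key, and embedding_key values are assumptions that must match your own index and documents.

```python
NAMESPACE = "amazon.products"  # database.collection used in the loading step
INDEX_NAME = "default"         # assumed; use the name you gave the search index
EMBEDDING_MODEL = "sentence-transformers/all-mpnet-base-v2"

def search_products(connection_string, query, k=3):
    """Return the text of the k products most similar to the query."""
    # Imported here so this module loads before the dependencies are installed
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import MongoDBAtlasVectorSearch

    vector_store = MongoDBAtlasVectorSearch.from_connection_string(
        connection_string,
        NAMESPACE,
        HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL),
        index_name=INDEX_NAME,
        text_key="title",           # assumed field holding the product text
        embedding_key="embedding",  # assumed field holding the vector
    )
    docs = vector_store.similarity_search(query, k=k)
    return [doc.page_content for doc in docs]
```

Calling search_products with your Atlas connection string and the sample query returns up to three product texts.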

In the above example code, we are able to use our sample text query to retrieve three relevant products. Further in the tutorial, let’s see how we can combine the capabilities of LLMs and vector search to build a RAG framework. For further information on various operations you can perform with the MongoDBAtlasVectorSearch module in LangChain, you can visit the Atlas Vector Search documentation.

RAG chain

In the code snippets below, we demonstrate how to initialize and query the RAG chain. We also introduce methods to improve the output from RAG so you can customize your output to cater to specific needs, such as the reason behind the product recommendation, language translation, summarization, etc.

You can set up the RAG chain and execute it to get the response for our sample query by running the following lines of code in your Jupyter notebook:

Code Snippet
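
To show the shape of such a chain without depending on any particular LangChain class, here is a plain-Python stand-in: retrieve the products, place them in a prompt that also asks for a reason, and pass the prompt to the LLM. The prompt wording and the function names are illustrative assumptions, and the retriever and LLM are stubbed so the sketch runs on its own.

```python
PROMPT_TEMPLATE = """Use the product information below to recommend products for the request.
For each recommendation, give a one-sentence reason.

Products:
{context}

Request: {question}
Recommendations:"""

def format_context(products):
    """Render the retrieved product texts as a bulleted block."""
    return "\n".join(f"- {text}" for text in products)

def recommend(question, retrieve, llm, k=3):
    """Retrieve k products, put them in the prompt, and ask the LLM."""
    products = retrieve(question, k)
    prompt = PROMPT_TEMPLATE.format(
        context=format_context(products), question=question
    )
    return llm(prompt)

# Stand-ins so the sketch runs without Atlas or watsonx.ai
def fake_retrieve(question, k):
    return ["Chemistry kit", "Telescope", "Microscope"][:k]

def fake_llm(prompt):
    return f"(model reply to a {len(prompt.splitlines())}-line prompt)"

print(recommend("A science gift for my daughter", fake_retrieve, fake_llm))
```

In the notebook, retrieve could wrap the vector store's similarity search and llm could be the granite_llm_ibm object initialized earlier. Changing the instruction in the template is one way to ask for translations or summaries instead of reasons.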

The output will look like this:

You can see from the example output that the RAG chain recommends products based on the query and also explains how each product suggestion is relevant to the query, thereby enhancing the user experience.

Conclusion

In this tutorial, we demonstrated how to use watsonx LLMs along with Atlas Vector Search to build a RAG framework. We also demonstrated how to customize the RAG framework to your application's needs, such as providing the reasoning behind product suggestions. By following the steps in the article, we were also able to bring the power of machine learning models to a private knowledge base that is stored in the Atlas Developer Data Platform.

In summary, RAG is a powerful NLG technique that can generate product recommendations as an extension to semantic search using vector search capabilities provided by MongoDB Atlas. RAG can also improve the UX of product recommendations by providing more personalized, diverse, informative, and engaging descriptions.

Next steps

Explore more details on how you can build generative AI applications using various assisted technologies and MongoDB Atlas Vector Search.

To learn more about Atlas Vector Search, visit the product page or the documentation for creating a vector search index or running vector search queries.

To learn more about watsonx, visit the IBM watsonx page.
