Part #1: Build Your Own Vector Search with MongoDB Atlas and Amazon SageMaker

Dominic Frei • 4 min read • Published Jan 26, 2024 • Updated Feb 07, 2024

Have you heard about machine learning, models, and AI but don't quite know where to start? Do you want to search your data semantically? Are you interested in using vector search in your application?

Then you’ve come to the right place!

This series will introduce you to MongoDB Atlas Vector Search and Amazon SageMaker, and how to use both together to semantically search your data.

This first part of the series will focus on the architecture of such an application — i.e., the parts you need, how they are connected, and what they do.

The following parts of the series will then dive into the details of how the individual elements presented in this architecture work (Amazon SageMaker in Part 2 and MongoDB Atlas Vector Search in Part 3) and their actual configuration and implementation. If you are just interested in one of these two implementations, have a quick look at the architecture pictures and then head to the corresponding part of the series. But to get a deep understanding of Vector Search, I recommend reading the full series.

Let’s start with the why, though: Why should you use MongoDB Atlas Vector Search and Amazon SageMaker?

Components of your application

In machine learning, an embedding model is a type of model that learns to represent objects — such as words, sentences, or even entire documents — as vectors in a high-dimensional space. These vectors, called embeddings, capture semantic relationships between the objects.

A large language model (a term you have probably come across), on the other hand, is designed to understand and generate human-like text. It learns patterns and relationships within language by processing vast amounts of text data. While it also produces embeddings as an internal representation, its primary goal is to understand and generate coherent text.

Embedding models are often used in tasks like natural language processing (NLP), where understanding semantic relationships is crucial. For example, word embeddings can be used to find similarities between words based on their contextual usage.
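To make the idea of "similarity between embeddings" concrete, here is a minimal sketch in Python. The three-dimensional vectors are made up for illustration only; a real embedding model produces vectors with hundreds of dimensions. Cosine similarity is one common way to compare two vectors and is used here as an example, not as the only option.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (hand-written, not produced by any model).
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}

cat_vs_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
cat_vs_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

# Vectors that point in a similar direction score closer to 1.
print(f"cat vs kitten: {cat_vs_kitten:.3f}")
print(f"cat vs car:    {cat_vs_car:.3f}")
```

With these toy values, "cat" scores higher against "kitten" than against "car", which is the kind of relationship a trained embedding model is meant to capture.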

In summary, embedding models focus on representing objects in a meaningful way in a vector space, while large language models are more versatile, handling a wide range of language-related tasks by understanding and generating text.

For our needs in this application, an embedding model is sufficient. In particular, we will be using all-MiniLM-L6-v2, a sentence embedding model available on Hugging Face.

Amazon SageMaker isn't just another AWS service; it's a versatile platform designed by developers, for developers. It empowers us to take control of our machine learning projects with ease. Unlike traditional ML frameworks, SageMaker simplifies the entire ML lifecycle, from data preprocessing to model deployment. As software engineers, we value efficiency, and SageMaker delivers precisely that, allowing us to focus more on crafting intelligent models and less on infrastructure management. It provides a wealth of pre-built algorithms, making it accessible even for those not deep into the machine learning field.

MongoDB Atlas Vector Search is a game-changer for developers like us who appreciate the power of simplicity and efficiency in database operations. Instead of sifting through complex queries and extensive code, Atlas Vector Search provides an intuitive and straightforward way to implement vector-based search functionality. As software engineers, we know how crucial it is to enhance the user experience with lightning-fast and accurate search results. This technology leverages the benefits of advanced vector indexing techniques, making it ideal for projects involving recommendation engines, content similarity, or even gaming-related features. With MongoDB Atlas Vector Search, we can seamlessly integrate vector data into our applications, significantly reducing development time and effort. It's a developer's dream come true – practical, efficient, and designed to make our lives easier in the ever-evolving world of software development.

Generating and updating embeddings for your data

There are two steps to using Vector Search in your application.

The first step is to create vectors (also called embeddings or embedding vectors) and to update them whenever your data changes. The easiest way to catch documents that your server application inserts or updates is to use MongoDB Atlas triggers that fire on exactly those two events. The triggers themselves are out of scope for this tutorial, but you can find other great resources about how to set them up in the Developer Center.

The trigger then executes a script that creates new vectors. This can, for example, be done via MongoDB Atlas Functions or, as in this diagram, using AWS Lambda. The script sends the text to the Amazon SageMaker endpoint, which hosts your desired model and is exposed through a REST API, and then writes the returned vector to the corresponding document in your Atlas database.
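The flow just described could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the implementation used later in the series: the request body (`{"text": ...}`), the response key (`"embedding"`), and the field names `text` and `embedding` are all hypothetical and depend on how you set up your REST service and your documents.

```python
import json
from typing import Callable, Dict, List, Sequence, Tuple
from urllib import request


def embed_via_rest(text: str, url: str) -> List[float]:
    """POST text to a REST service and return the vector it responds with.

    Assumption: the service accepts {"text": ...} and answers with
    {"embedding": [...]}. Adjust both to match your own endpoint.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    req = request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())["embedding"]


def build_embedding_update(
    doc: Dict,
    embed: Callable[[str], Sequence[float]],
    text_field: str = "text",         # hypothetical field holding the source text
    vector_field: str = "embedding",  # hypothetical field holding the vector
) -> Tuple[Dict, Dict]:
    """Return (filter, update) that stores the vector next to the data."""
    vector = [float(x) for x in embed(doc[text_field])]
    return {"_id": doc["_id"]}, {"$set": {vector_field: vector}}


# Offline demonstration with a stand-in embedding function.
example_doc = {"_id": 1, "text": "A short product description"}
example_filter, example_update = build_embedding_update(
    example_doc, lambda text: [0.1, 0.2, 0.3]
)
print(example_filter, example_update)
```

With PyMongo, the result can be passed to `collection.update_one(example_filter, example_update)`, and `embed` would be something like `lambda t: embed_via_rest(t, your_endpoint_url)`. Keeping the embedding function injectable makes the update logic easy to test without a network connection.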

What makes this approach easy to use and fast is that the data and its embeddings are saved inside the same database:

Data that belongs together gets saved together.

How to deploy and prepare this SageMaker endpoint and offer it as a REST service for your application will be discussed in detail in Part 2 of this tutorial.

Querying your data

The other half of your application will be responsible for taking in queries to semantically search your data.

Note that a search has to be done using the vectorized version of the query, and the vectorization has to be done with the same model that we used to vectorize the data itself. Vectors produced by different models live in different vector spaces and cannot be compared meaningfully. The same Amazon SageMaker endpoint can, of course, be used for that.

Therefore, whenever a client application sends a request to the server application, two things have to happen.

1. The server application needs to call the REST service that exposes the Amazon SageMaker endpoint (see the previous section) to vectorize the query.
2. With the vector received, the server application then needs to execute a search using Vector Search to retrieve the results from the database.
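These two steps could look roughly like the following Python sketch. It uses the `$vectorSearch` aggregation stage of Atlas Vector Search. The index name `vector_index`, the fields `embedding` and `text`, and the default limits are assumptions for illustration; `collection` is expected to behave like a PyMongo collection, and `embed` stands in for the call to your REST service.

```python
from typing import Callable, Dict, List, Sequence


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    index_name: str = "vector_index",  # hypothetical index name
    path: str = "embedding",           # hypothetical vector field
    limit: int = 5,
    num_candidates: int = 100,
) -> List[Dict]:
    """Build an aggregation pipeline that runs a vector search."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Return the text (hypothetical field) and the similarity score only.
        {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


def semantic_search(
    collection,
    query: str,
    embed: Callable[[str], Sequence[float]],
    limit: int = 5,
) -> List[Dict]:
    """Step 1: vectorize the query. Step 2: search with the resulting vector."""
    query_vector = embed(query)
    pipeline = build_vector_search_pipeline(query_vector, limit=limit)
    return list(collection.aggregate(pipeline))


# Show the pipeline for a made-up two-dimensional query vector.
example_pipeline = build_vector_search_pipeline([0.1, 0.2], limit=3)
print(example_pipeline)
```

The `embed` function passed to `semantic_search` has to call the same model that produced the stored vectors; otherwise, the scores are not meaningful.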

The implementation of how to query Atlas can be found in Part 3 of this tutorial.

Wrapping it up

This short first part of the series has given you an overview of a possible architecture for using Amazon SageMaker and MongoDB Atlas Vector Search to semantically search your data.

Have a look at Part 2 if you are interested in how to set up Amazon SageMaker, and at Part 3 for the details of MongoDB Atlas Vector Search.

✅ Sign up for a free cluster.

✅ Already have an AWS account? Atlas supports paying for usage via the AWS Marketplace (AWS MP) without any upfront commitment — simply sign up for MongoDB Atlas via AWS Marketplace.

✅ Get help on our Community Forums.
