How to Model Your Documents for Vector Search

6 min • Published Jan 31, 2024

Video Summary

00:00:04 Introduction and Basics of Data Modeling

The video starts with Anaiya Raisinghani introducing herself and the topic of the tutorial. She then dives into the basics of data modeling in MongoDB, explaining how it revolves around organizing data into documents in different collections. The structure of the data model depends on the use case or project.

00:02:03 Understanding and Incorporating Vector Search

In this section, Anaiya explains the concept of Vector Search and how it allows for searching based on meaning rather than specific words. She then discusses how to incorporate vector embeddings into a data model, highlighting that they can be stored alongside other data inside a document or in a separate collection.
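As a minimal sketch of the first option (embeddings stored alongside other data), the Python dictionary below mirrors the TV-series document described in the video. All field names (`title`, `seasons`, `episodes`, `vector_embeddings`) and values are hypothetical, and the four-number vector is a toy stand-in; an embedding produced by a real model would have hundreds or thousands of dimensions. The embedding sits at the top level of the document rather than inside the `seasons` array, in line with the limitation mentioned in the video.

```python
# Hypothetical document shape; field names and values are illustrative only.
show_doc = {
    "title": "Example TV Show",
    "seasons": [
        {
            "season": 1,
            "episodes": [
                {
                    "episode": 1,
                    "title": "Pilot",
                    "description": "An example episode description.",
                    "date": "2024-01-01",
                },
            ],
        },
    ],
    # Toy 4-dimensional embedding stored at the base (top level) of the document.
    "vector_embeddings": [0.12, -0.03, 0.48, 0.91],
}


def embedding_is_top_level(doc, field="vector_embeddings"):
    """Return True if `field` is a top-level list of numbers in `doc`."""
    value = doc.get(field)
    return isinstance(value, list) and all(
        isinstance(x, (int, float)) for x in value
    )
```

The second option from the video, a separate collection, would move `vector_embeddings` into its own document that references the show, for example by its `_id`.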

00:04:03 Indexing and Querying with Vector Search

The final section of the video covers the technical aspects of using Vector Search. Anaiya explains that a search index is required when using Vector Search and gives a brief overview of how to set one up. She then discusses how to use the $vectorSearch operator in an aggregation pipeline to query embedded data and find results. The video concludes with Anaiya thanking viewers for watching and pointing to links in the description for further information.
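The index definition described in this section can be sketched as a Python dictionary. The four keys (`type`, `path`, `numDimensions`, `similarity`) follow the shape of an Atlas Vector Search index definition. The path `vector_embeddings` is an assumed field name, and 1536 is used only as an example because it is the output size of OpenAI's text-embedding-ada-002 model; both should match your own data and embedding model. The helper `check_index_definition` is hypothetical and only encodes the constraints mentioned in the video, including the 2048-dimension ceiling cited at the time of recording.

```python
# Sketch of a vector search index definition; path and dimensions are assumptions.
index_definition = {
    "fields": [
        {
            "type": "vector",             # stays as-is for an embedding field
            "path": "vector_embeddings",  # field that holds the embedding
            "numDimensions": 1536,        # must match the embedding model's output
            "similarity": "cosine",       # or "euclidean" / "dotProduct"
        }
    ]
}

SIMILARITY_FUNCTIONS = {"euclidean", "cosine", "dotProduct"}
MAX_DIMENSIONS = 2048  # limit cited in the video; current releases may differ


def check_index_definition(definition, max_dimensions=MAX_DIMENSIONS):
    """Return a list of problems found; an empty list means none were found.

    This sketch assumes every entry in "fields" is a vector field.
    """
    problems = []
    for field in definition.get("fields", []):
        if field.get("type") != "vector":
            problems.append("type should stay 'vector' for an embedding field")
        if not field.get("path"):
            problems.append("path must name the field that holds the embedding")
        dims = field.get("numDimensions")
        if not isinstance(dims, int) or not 1 <= dims <= max_dimensions:
            problems.append(
                f"numDimensions must be an integer from 1 to {max_dimensions}"
            )
        if field.get("similarity") not in SIMILARITY_FUNCTIONS:
            problems.append("similarity must be euclidean, cosine, or dotProduct")
    return problems
```

Running the check before creating the index catches a mismatched dimension count or a misspelled similarity function early.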

The main theme of the video is how to model your documents for, and query them with, the newly released MongoDB Atlas Vector Search.

🔑 Key Points

Data modeling in MongoDB revolves around organizing data into documents in different collections.
Vector Search allows for searching based on meaning rather than specific words, which is useful for querying using similarities.
Vector embeddings can be stored alongside other data inside a document or in a separate collection.
When using Vector Search, it is necessary to create a search index.
The $vectorSearch operator is a new aggregation stage in MongoDB Atlas that executes an approximate nearest neighbor query.
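As a rough sketch of the last point, the function below builds an aggregation pipeline whose first stage is `$vectorSearch`. The stage fields (`index`, `path`, `queryVector`, `numCandidates`, `limit`) and the `vectorSearchScore` metadata are part of Atlas Vector Search; the index name `vector_index`, the path `vector_embeddings`, the projected `title` field, and the function itself are assumptions made for illustration.

```python
def build_vector_search_pipeline(
    query_vector,
    index_name="vector_index",   # hypothetical index name
    path="vector_embeddings",    # hypothetical embedding field
    num_candidates=100,
    limit=5,
):
    """Build an aggregation pipeline that starts with a $vectorSearch stage."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                # Candidates considered during the approximate search.
                "numCandidates": num_candidates,
                # Number of documents returned.
                "limit": limit,
            }
        },
        {
            # Keep only the title and the similarity score in the output.
            "$project": {
                "_id": 0,
                "title": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.12, -0.03, 0.48, 0.91], limit=3)
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on an Atlas cluster that has a matching vector search index. The query vector must come from the same embedding model as the stored vectors.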

Full Video Transcript

Hi everyone, my name is Anaiya Raisinghani, and I am an Associate Developer Advocate over here with MongoDB. MongoDB Atlas Vector Search was just recently released, so let's dive into a tutorial on how to properly model your documents when utilizing Vector Search to truly revolutionize your querying capabilities.

Since Vector Search is new, let's first go over an introduction to data modeling in MongoDB before we incorporate our vector embeddings. So, data modeling really revolves around organizing your data into documents in different collections. The structure of your data model is actually going to depend on your use case or your project, but there are some commonalities that every developer should know. These are: choosing whether to embed or reference your related data, whether or not to use arrays inside of your document, and whether you need to index your documents. For a more in-depth explanation and a comprehensive guide to data modeling in MongoDB, please check out the article that's linked in the description below.

Now let's dive into an example of setting up a data model inside of MongoDB. We are going to be building our vector embedding example using a MongoDB document for our MongoDB TV series. On my screen here, we have a single MongoDB document representing our MongoDB TV show without any embeddings in place. As you can see, we have a nested array featuring our array of seasons and, within that, our array of different episodes. This way, in our document, we are capable of seeing exactly which season each episode is a part of, along with the episode number, the title, the description, and the date.

Now that we have our example set up, let's incorporate vector embeddings and discuss the proper techniques to set you up for success. So let's first understand exactly what Vector Search is. Vector Search is the way to search based on meaning rather than on specific words. This comes in handy when querying using similarities rather than searching based on keywords. When using Vector Search, you can actually query using a question or a phrase rather than just a word or two. So, in a nutshell, Vector Search is great for when you can't remember the name of that specific movie but you can remember the climax or the plot.

This process happens when text, video, or audio is transformed via an encoder into vectors. With MongoDB, we can actually do this process using OpenAI, Hugging Face, or other natural language processing models. Once we have our vectors, we can upload them into the base of our document and then conduct Vector Search and search for whatever you like. Please keep in mind the current limitations to using Vector Search and how to properly embed your vectors. You are able to store your vector embeddings alongside other data inside of your document, or you can store them in a separate collection. It is really up to you, the user, or the project goals.

So now let's go over what a document with vector embeddings can look like when you incorporate them into your data model. Using the same data model as before, here we have our vector embeddings classified at the base of our document. Currently, there is a limitation where vector embeddings cannot be nested in an array in your document, so please ensure your document has your embeddings at the base, the way we do here. We have various tutorials on our Developer Center, our YouTube channel, and in our documentation that can help you figure out how to embed these vectors into your document, along with how to acquire these vector embeddings in the first place, and everything will be linked in the description below.

So now let's head into a little extra portion, and that is indexing with Vector Search. When you are using Vector Search, it is necessary to create a search index so you are able to be successful when using semantic search. To do this, we have examples inside of our Vector Search documentation, but I'm also going to go over some skeleton code with you all right now. So here is a skeleton code provided by our documentation. When setting up your search index, you want to change the path to be your vector path; in our case, it would be vector embeddings. Type can just stay the way that it is. For our number of dimensions, please match the dimensions of the model that you've chosen. This is just the number of vector dimensions, and the value cannot be greater than 2048. This limitation actually comes from the base embedding model that is being used, so please ensure you're using a supported LLM, or large language model, such as OpenAI or Hugging Face. When using one of these, there won't be any issues running into vector dimensions. And then for similarity, please pick which vector function you want to use to search for the top K nearest neighbors.

So now it's time to go over another extra tidbit, which is querying with Vector Search. When you are ready to query and find results from your embedded data, it is time to create an aggregation pipeline on your embedded vector data. To do this, you can use the $vectorSearch operator, which is a new aggregation stage inside of MongoDB Atlas, and it really helps execute an approximate nearest neighbor query.

So, for more information on this aggregation operator, how to use Vector Search, and everything else discussed inside of this tutorial, please check out our Developer Center and our YouTube channel. I'm also going to be linking everything in the description below. So thank you so much for watching and for following along.
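To make "top K nearest neighbors" concrete, here is a small exact (brute-force) search in plain Python. This is not how Atlas executes a vector search query, which is approximate and runs against an index; the sketch only illustrates what the cosine similarity function measures. The titles and three-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Return the k titles whose vectors are most similar to `query`."""
    scored = [(cosine_similarity(query, vec), title) for title, vec in docs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]


# Toy embeddings: the two space-themed plots point in a similar direction.
toy_docs = {
    "space heist": [0.9, 0.1, 0.0],
    "courtroom drama": [0.0, 0.2, 0.9],
    "asteroid rescue": [0.8, 0.3, 0.1],
}

nearest = top_k([1.0, 0.2, 0.0], toy_docs, k=2)
```

An exact scan like this compares the query against every stored vector, which is why large collections rely on an index and an approximate search instead.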
