Leveraging OpenAI and MongoDB Atlas for Improved Search Functionality

Pavel Duchovny • 5 min read • Published Sep 18, 2024 • Updated Sep 18, 2024
Node.js • AI • Atlas • Vector Search • JavaScript
Some features mentioned below will be deprecated on Sep. 30, 2025.
Search functionality is a critical component of many modern web applications. Providing users with relevant results based on their search queries and additional filters dramatically improves their experience and satisfaction with your app.
In this article, we'll go over an implementation of search functionality using OpenAI's models and MongoDB Atlas Vector Search. OpenAI's text-embedding-ada-002 model turns the user's search query into an embedding, and GPT-4 interprets free-text filters. We've created a request handler function that retrieves relevant data based on a user's search query and also applies any additional filters the user provides.
Enriching the existing documents with embeddings is covered in our main Vector Search tutorial.

Search in the Airbnb app context

Consider a real-world scenario where we have an Airbnb-like app. Users can perform a free text search for listings and also filter results based on criteria such as the number of rooms, the number of beds, or how many people the property can accommodate.
To implement this functionality, we use an OpenAI embedding model to create embeddings that capture the semantics of the data, and Atlas Vector Search to find the listings most relevant to the user's query. OpenAI's GPT-4 model is used separately to turn free-text filter requests into query filters.
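Vector Search needs an index over the embedding field before any query can run. The definition below is a sketch under stated assumptions rather than the index used by the application: it assumes the embeddings are stored in `doc_embedding` (the path the request handler queries), that they come from text-embedding-ada-002 (which produces 1536-dimensional vectors), and that cosine similarity is an acceptable choice. The variable name `vectorIndexDefinition` is hypothetical.

```javascript
// Hypothetical Atlas Vector Search index definition for the listings collection.
// Assumptions: embeddings live in "doc_embedding", have 1536 dimensions
// (text-embedding-ada-002), and are compared with cosine similarity.
const vectorIndexDefinition = {
  fields: [
    {
      type: "vector",
      path: "doc_embedding",
      numDimensions: 1536,
      similarity: "cosine"
    },
    // Fields used inside a $vectorSearch filter are indexed as type "filter".
    { type: "filter", path: "beds" },
    { type: "filter", path: "bedrooms" }
  ]
};

// Print the definition so it can be pasted into the Atlas index editor.
console.log(JSON.stringify(vectorIndexDefinition, null, 2));
```

Any additional field you plan to filter on would need its own `filter` entry in the same list.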
The code for the application can be found in the accompanying GitHub repository.

The request handler

For the back end, we used Atlas App Services with a simple HTTPS GET endpoint.
Our function is designed to act as a request handler for incoming search requests. When a search request arrives, it first extracts the search terms and filters from the query parameters. If no search term is provided, it returns a random sample of 30 listings from the database.
If a search term is present, the function makes a POST request to OpenAI's embeddings API, sending the search term and the name of the embedding model to use. The response contains an embedding, a vector representation of the search term, which is used in the next step.
// This function is the endpoint's request handler.
// It calls the OpenAI API for embeddings and MongoDB Atlas for vector search.
exports = async function({ query }, response) {
  // Query params, e.g. '?search=test&beds=2' => {search: "test", beds: "2"}
  const { search, beds, rooms, people, maxPrice, freeTextFilter } = query;

  // MongoDB Atlas configuration.
  const mongodb = context.services.get('mongodb-atlas');
  const db = mongodb.db('sample_airbnb'); // Replace with your database name.
  const listingsAndReviews = db.collection('listingsAndReviews'); // Replace with your collection name.

  // If there's no search query, return a sample of 30 random documents from the collection.
  if (!search) {
    return await listingsAndReviews.aggregate([{ $sample: { size: 30 } }]).toArray();
  }

  // Fetch the OpenAI key stored in the context values.
  const openai_key = context.values.get("openAIKey");

  // URL to make the request to the OpenAI API.
  const url = 'https://api.openai.com/v1/embeddings';

  // Call OpenAI API to get the embeddings.
  const resp = await context.http.post({
    url: url,
    headers: {
      'Authorization': [`Bearer ${openai_key}`],
      'Content-Type': ['application/json']
    },
    body: JSON.stringify({
      input: search,
      model: "text-embedding-ada-002"
    })
  });

  // Check the response status before using the body.
  if (resp.statusCode !== 200) {
    console.error("Failed to get embeddings");
    return [];
  }
  console.log("Successfully received embedding.");

  // Parse the JSON response and take the embedding of the search term.
  const responseData = EJSON.parse(resp.body.text());
  const embedding = responseData.data[0].embedding;
  console.log(JSON.stringify(embedding));

  // Base vector search query. $vectorSearch takes "limit" for the number of results.
  const searchQ = {
    "index": "default",
    "queryVector": embedding,
    "path": "doc_embedding",
    "limit": 100,
    "numCandidates": 1000
  };

  // If there's any filter in the query parameters, add it to the search query.
  if (freeTextFilter) {
    // Use GPT-4 to turn the free text into a filter.
    // A sample document shows the model which field paths are available.
    const sampleDocs = await listingsAndReviews.aggregate([
      { $sample: { size: 1 } },
      { $project: {
        _id: 0,
        bedrooms: 1,
        beds: 1,
        room_type: 1,
        property_type: 1,
        price: 1,
        accommodates: 1,
        bathrooms: 1,
        review_scores: 1
      }}
    ]).toArray();

    const filter = await context.functions.execute("getSearchAIFilter", sampleDocs[0], freeTextFilter);
    searchQ.filter = filter;
  }
  else if (beds || rooms) {
    // Basic numeric filters: query params arrive as strings, so parse them.
    const filter = { "$and": [] };

    if (beds) {
      filter.$and.push({ "beds": { "$gte": parseInt(beds, 10) } });
    }
    if (rooms) {
      filter.$and.push({ "bedrooms": { "$gte": parseInt(rooms, 10) } });
    }
    searchQ.filter = filter;
  }

  // Perform the search with the defined query and limit the result to 50 documents.
  const docs = await listingsAndReviews.aggregate([
    { "$vectorSearch": searchQ },
    { $limit: 50 }
  ]).toArray();

  return docs;
};
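The embedding step of the handler above can be isolated into two small helpers, which makes it easier to test without calling the API. This is a minimal sketch, not the repository's code: `buildEmbeddingRequest` and `extractEmbedding` are hypothetical names, and the header values are arrays only because that is the shape the handler passes to `context.http.post`.

```javascript
// Build the request options for OpenAI's embeddings endpoint.
// The returned object mirrors what the handler passes to context.http.post.
function buildEmbeddingRequest(search, apiKey) {
  return {
    url: "https://api.openai.com/v1/embeddings",
    headers: {
      Authorization: [`Bearer ${apiKey}`],
      "Content-Type": ["application/json"]
    },
    body: JSON.stringify({ input: search, model: "text-embedding-ada-002" })
  };
}

// Return the first embedding vector from a parsed API response,
// or null when the response does not have the expected shape.
function extractEmbedding(responseData) {
  const embedding = responseData?.data?.[0]?.embedding;
  return Array.isArray(embedding) && embedding.length > 0 ? embedding : null;
}

// Example with a placeholder key and a hand-written response body.
const exampleRequest = buildEmbeddingRequest("New York high floor", "sk-placeholder");
console.log(exampleRequest.body);
console.log(extractEmbedding({ data: [{ embedding: [0.12, -0.03, 0.4] }] }));
```

Returning `null` for an unexpected response lets the caller fall back to an empty result list instead of failing on a missing property.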
To cover the filtering part of the query, we build a filter alongside the embedding that handles the basic filters a user might request. In the presented example, that means at least two rooms and at least two beds.
else if (beds || rooms) {
  const filter = { "$and": [] };

  if (beds) {
    filter.$and.push({ "beds": { "$gte": parseInt(beds, 10) } });
  }
  if (rooms) {
    filter.$and.push({ "bedrooms": { "$gte": parseInt(rooms, 10) } });
  }
  searchQ.filter = filter;
}
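Because query parameters arrive as strings, a value such as `beds=abc` would be parsed to `NaN` and end up inside the filter. One way to guard against that is sketched below. The helper names `toPositiveInt` and `buildBasicFilter` are hypothetical and not part of the application; the field names `beds` and `bedrooms` are the ones the handler already uses.

```javascript
// Parse a query-string value into a positive integer, or null if it is not one.
function toPositiveInt(value) {
  const n = Number.parseInt(value, 10);
  return Number.isInteger(n) && n > 0 ? n : null;
}

// Build the same "$and" filter as the handler, skipping values that fail to parse.
// Returns null when no usable filter value was supplied.
function buildBasicFilter({ beds, rooms } = {}) {
  const clauses = [];
  const minBeds = toPositiveInt(beds);
  const minRooms = toPositiveInt(rooms);

  if (minBeds !== null) clauses.push({ beds: { $gte: minBeds } });
  if (minRooms !== null) clauses.push({ bedrooms: { $gte: minRooms } });

  return clauses.length > 0 ? { $and: clauses } : null;
}

// Example: the "two rooms and two beds" request from the text.
console.log(JSON.stringify(buildBasicFilter({ beds: "2", rooms: "2" })));
```

Returning `null` instead of an empty `$and` array lets the caller decide whether to attach a filter at all.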

Calling OpenAI API

Let's consider a more advanced use case that can enhance our filtering experience. In this example, we allow a user to describe a filter in free-form text, using sentences such as “More than 1 bed and rating above 91.”
We call the OpenAI API to interpret the user's free text filter and translate it into something we can use in a MongoDB query. We send the API a description of what we need, based on the document structure we're working with and the user's free text input. This text is fed into the GPT-4 model, which returns a JSON object with 'range' or 'equals' operators that can be used in a MongoDB search query.

getSearchAIFilter function

// This function is called by the endpoint's request handler.
// It asks the OpenAI API to generate a filter JSON based on the user's free text.
exports = async function(sampleDoc, search) {
  // URL to make the request to the OpenAI API.
  const url = 'https://api.openai.com/v1/chat/completions';

  // Fetch the OpenAI key stored in the context values.
  const openai_key = context.values.get("openAIKey");

  // Convert the sample document to string format.
  const syntDocs = JSON.stringify(sampleDoc);
  console.log(syntDocs);

  // Prepare the request string for the OpenAI API.
  const reqString = `Convert programmatic command to Atlas $search filter only for range and equals JS:\n\nExample: Based on document structure {"siblings" : "...", "dob" : "..."} give me the filter of all people born 2015 and siblings are 3 \nOutput: {"filter":{ "compound" : { "must" : [ {"range": {"gte": 2015, "lte" : 2015, "path": "dob"} },{"equals" : {"value" : 3 , "path" : "siblings"}}]}}} \n\n provide the needed filter to accommodate ${search}, pick a path from structure ${syntDocs}. Need just the json object with a range or equal operators. No explanation. No 'Output:' string in response. Valid JSON.`;
  console.log(`reqString: ${reqString}`);

  // Call OpenAI API to get the response.
  // Header values are arrays of strings, as in the request handler.
  const resp = await context.http.post({
    url: url,
    headers: {
      'Authorization': [`Bearer ${openai_key}`],
      'Content-Type': ['application/json']
    },
    body: JSON.stringify({
      model: "gpt-4",
      temperature: 0.1,
      messages: [
        {
          "role": "system",
          "content": "Output filter json generator follow only provided rules"
        },
        {
          "role": "user",
          "content": reqString
        }
      ]
    })
  });

  // Parse the JSON response
  const responseData = JSON.parse(resp.body.text());

  // Check the response status.
  if (resp.statusCode !== 200) {
    console.error("Failed to generate filter JSON.");
    console.log(JSON.stringify(responseData));
    return {};
  }

  console.log("Successfully received code.");
  console.log(JSON.stringify(responseData));

  const code = responseData.choices[0].message.content;

  // The model's reply is free text, so parsing it can fail.
  let parsedCommand;
  try {
    parsedCommand = EJSON.parse(code);
  } catch (err) {
    console.error("Model response was not valid JSON.");
    return {};
  }
  console.log('parsed' + JSON.stringify(parsedCommand));

  // If the filter exists and it's not an empty object, return it.
  if (parsedCommand.filter && Object.keys(parsedCommand.filter).length !== 0) {
    return parsedCommand.filter;
  }

  // If there's no valid filter, return an empty object.
  return {};
};
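Even with an instruction to return only JSON, a chat model can wrap its answer in extra text such as an "Output:" prefix or a code fence. The sketch below shows one defensive way to extract the `filter` object from such a reply. `parseModelFilter` is a hypothetical helper, not part of the function above, and it uses plain `JSON.parse` so it can run outside App Services.

```javascript
// Extract the "filter" object from a model reply that should contain JSON.
// Returns {} whenever the reply cannot be parsed or has no usable filter.
function parseModelFilter(content) {
  if (typeof content !== "string") return {};

  // Keep only the text between the first "{" and the last "}",
  // which drops prefixes like "Output:" and any surrounding code fence.
  const start = content.indexOf("{");
  const end = content.lastIndexOf("}");
  if (start === -1 || end <= start) return {};

  try {
    const parsed = JSON.parse(content.slice(start, end + 1));
    const filter = parsed && typeof parsed === "object" ? parsed.filter : undefined;
    const usable =
      filter !== null &&
      typeof filter === "object" &&
      !Array.isArray(filter) &&
      Object.keys(filter).length > 0;
    return usable ? filter : {};
  } catch (err) {
    // Invalid JSON: treat it the same as "no filter".
    return {};
  }
}

// Example: a reply with an unwanted prefix still yields the filter object.
const reply = 'Output: {"filter":{"compound":{"must":[{"range":{"gt":1,"path":"beds"}}]}}}';
console.log(JSON.stringify(parseModelFilter(reply)));
```

Falling back to an empty object keeps the search working without a filter when the model's reply is unusable.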

MongoDB search and filters

The function then constructs a MongoDB search query using the embedded representation of the search term and any additional filters provided by the user. This query is sent to MongoDB, and the function returns the results as a response. For a search of “New York high floor” with the filter “More than 1 bed and rating above 91,” the query looks something like the following:
{
  "$vectorSearch": {
    "index": "default",
    "queryVector": embedding,
    "path": "doc_embedding",
    "filter": { "$and": [{ "beds": { "$gte": 1 } }, { "score": { "$gte": 91 } }] },
    "limit": 100,
    "numCandidates": 1000
  }
}
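Assembling this stage can be kept in one place so the filter is attached only when there is one. The following is a sketch under assumptions: `buildVectorSearchStage` is a hypothetical helper, the index name and path are the ones used in this article, and the result count is passed as `limit`, the field `$vectorSearch` uses for the number of documents to return.

```javascript
// Build a $vectorSearch aggregation stage from a query embedding and an optional filter.
function buildVectorSearchStage(embedding, filter) {
  const stage = {
    index: "default",
    path: "doc_embedding",
    queryVector: embedding,
    numCandidates: 1000,
    limit: 100
  };

  // Attach the filter only when it is a non-empty object.
  if (filter && typeof filter === "object" && Object.keys(filter).length > 0) {
    stage.filter = filter;
  }

  return { $vectorSearch: stage };
}

// Example with a shortened placeholder embedding and the filter from the text.
const exampleStage = buildVectorSearchStage(
  [0.01, -0.02, 0.03],
  { $and: [{ beds: { $gte: 1 } }, { score: { $gte: 91 } }] }
);
console.log(JSON.stringify(exampleStage, null, 2));
```

The resulting object can be placed as the first stage of an aggregation pipeline, followed by any `$limit` or `$project` stages you need.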

Conclusion

This approach allows us to leverage OpenAI's GPT-4 model to interpret free text input and MongoDB Atlas Vector Search to return highly relevant search results. The use of natural language processing and AI brings a level of flexibility and intuitiveness to the search function that greatly enhances the user experience.
Remember, however, that this is an advanced implementation. Ensure you have a good understanding of how MongoDB and OpenAI operate before attempting to implement a similar solution. Always take care to handle sensitive data appropriately and ensure your use of AI complies with OpenAI's usage policies.
