Tutorial: Build a Movie Search Application Using Atlas Search
Let me guess. You want to give your application users the ability to find EXACTLY what they are looking for FAST! Who doesn't? Search is a requirement for most applications today. With MongoDB Atlas Search, we have made it easier than ever to integrate simple, fine-grained, and lightning-fast search capabilities into all of your MongoDB applications. To demonstrate just how easy it is, let's build a web application to find our favorite movies.
This tutorial is the first in a four-part series in which, over the next few months, we will build out the application featured in our Atlas Search Product Demo.
|Part 1||Get up and running with a basic movie search engine, allowing us to look for movies based on a topic in our MongoDB Atlas movie data.|
|Part 2||Make it even easier for our users by building more advanced search queries with fuzzy matching and wildcard paths to forgive them for fat fingers and misspellings. We'll introduce custom score modifiers to allow us to influence our movie results.|
|Part 3||Add autocomplete capabilities to our movie application. We'll also discuss index mappings and analyzers and how to use them to optimize the performance of our application.|
|Part 4||Wrap up our application by creating filters to query across dates and numbers to even further fine-tune our movie search results. We'll even host the application on Realm, our serverless backend platform, so you can deliver your movie search website anywhere in the world.|
Now, without any further ado, let's get this show on the road!
This tutorial will guide you through building a very basic movie search engine on a free tier Atlas cluster. We will set it up in a way that will allow us to scale our search in a highly performant manner as we continue building out new features in our application over the coming weeks. By the end of Part 1, you will have something that looks like this:
To accomplish this, here are our tasks for today:
Once you have your cluster, you can load the sample dataset by clicking the ellipsis button and selecting Load Sample Dataset.
Now, let's have a closer look at our sample data within the Atlas Data Explorer. In your Atlas UI, click on Collections to examine the movies collection in the new sample_mflix database. This collection has over 23k movie documents with information such as title, plot, and cast. The sample_mflix.movies collection provides the dataset for our application.
Since our movie search engine is going to look for movies based on a topic, we will use Atlas Search to query for specific words and phrases in the fullplot field of the documents.
The first thing we need is an Atlas Search index. Click on the tab titled Search Indexes under Collections. Click on the green Create Search Index button. Let's accept the default settings and click Create Index. That's all you need to do to start taking advantage of Search in your MongoDB Atlas data!
By accepting the default settings when we created the Search index, we dynamically mapped all the fields in the collection as indicated in the default index configuration:
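For reference, a default index with dynamic mapping corresponds to a very small definition. The sketch below writes it as a JavaScript object; Atlas displays it as JSON in the UI, and the variable name is only illustrative.

```javascript
// Default Atlas Search index definition with dynamic mapping enabled.
// The variable name is illustrative; Atlas shows this definition as JSON.
const defaultIndexDefinition = {
  mappings: {
    dynamic: true, // map and index supported field types automatically
  },
};

console.log(JSON.stringify(defaultIndexDefinition, null, 2));
```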
Mapping is simply how we define how the fields on our documents are indexed and stored. If a field's value looks like a string, we'll treat it as a full-text field; numbers and dates are handled similarly. This suits MongoDB's flexible data model perfectly. As you add new data to your collection and your schema evolves, dynamic mapping accommodates those changes in your schema and adds that new data to the Atlas Search index automatically.
We'll talk more about mapping and indexes in Part 3 of our series. For right now, we can check off another item from our task list.
Let's use the aggregation pipeline builder inside of the Atlas UI to make an aggregation pipeline that makes use of our Atlas Search index. Our basic aggregation will consist of only three stages: $search, $project, and $limit.
You do not have to use the pipeline builder tool for this stage, but I really love the easy-to-use user interface. Plus, the ability to preview the results by stage makes troubleshooting a snap!
Navigate to the Aggregation tab in the sample_mflix.movies collection:
For the first stage, select the $search aggregation operator to search for the text "werewolves and vampires" in the fullplot field.
You can also add the highlight option, which will return the highlights by adding fields to the result payload that display search terms in their original context, along with the adjacent text content. (More on this later.)
Your $search aggregation stage should be:
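A minimal sketch of that stage, assuming the text operator with the query and path described above plus the optional highlight setting. It is written as a complete stage object; in the pipeline builder you would enter only the inner body after selecting $search. The variable name is illustrative.

```javascript
// $search stage: full-text query on the fullplot field, with highlighting
// requested for the same field.
const searchStage = {
  $search: {
    text: {
      query: "werewolves and vampires", // terms to look for
      path: "fullplot",                 // field to search
    },
    highlight: {
      path: "fullplot",                 // field to return highlights for
    },
  },
};

console.log(JSON.stringify(searchStage, null, 2));
```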
Note the returned movie documents in the preview panel on the right. If no documents are in the panel, double-check the formatting in your aggregation code.
Add $project to your pipeline to get back only the fields we will use in our movie search application. We also use the $meta operator to surface each document's searchScore and searchHighlights in the result set.
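A sketch of such a $project stage follows. The exact field list (title, year, fullplot) is an assumption about what the application displays; the score and highlight fields use the $meta expressions named above.

```javascript
// $project stage: keep a few display fields and add search metadata.
// The choice of title, year, and fullplot is an assumption for illustration.
const projectStage = {
  $project: {
    title: 1,
    year: 1,
    fullplot: 1,
    _id: 0,                                  // hide the document id
    score: { $meta: "searchScore" },         // relevance score
    highlight: { $meta: "searchHighlights" } // highlighted matches
  },
};

console.log(JSON.stringify(projectStage, null, 2));
```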
Let's break down the individual pieces in this stage further:
"$meta": "searchScore" contains the assigned score for the document based on relevance. This signifies how well this movie's fullplot field matches the query terms "werewolves and vampires" above.
Scroll through the right preview panel and note that the movie documents are returned in descending order of score. This means we get the best matched movies first.
"$meta": "searchHighlights" contains the highlighted results.
Because searchHighlights and searchScore are not part of the original document, it is necessary to use a $project pipeline stage to add them to the query output.
Now, open a document's highlight array to show the data objects with text values and types.
highlight.texts.value - text from the fullplot field returning a match
highlight.texts.type - either a hit or a text
- hit is a match for the query
- text is the surrounding text context adjacent to the matching string
We will use these later in our application code.
Remember that the results are returned with the scores in descending order. $limit: 10 will therefore return the 10 movie documents most relevant to your search query. $limit is very important in Search because speed is very important. Without $limit: 10, we would get the scores for all 23k movies. We don't need that.
Finally, if you see results in the right preview panel, your aggregation pipeline is working properly! Let's grab that aggregation code with the Export Pipeline to Language feature by clicking the button in the top toolbar.
Your final aggregation code will be this:
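Putting the three stages together, the exported pipeline should look roughly like the sketch below. The projected field list is an assumption, and the exact formatting produced by Export Pipeline to Language may differ.

```javascript
// Three-stage pipeline: search, shape the output, keep the top 10 results.
const pipeline = [
  {
    $search: {
      text: { query: "werewolves and vampires", path: "fullplot" },
      highlight: { path: "fullplot" },
    },
  },
  {
    $project: {
      title: 1,
      year: 1,
      fullplot: 1,
      _id: 0,
      score: { $meta: "searchScore" },
      highlight: { $meta: "searchHighlights" },
    },
  },
  { $limit: 10 },
];

console.log(`${pipeline.length} stages`);
```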
This small snippet of code powers our movie search engine!
Now that we have the heart of our movie search engine in the form of an aggregation pipeline, how will we use it in an application? There are lots of ways to do this, but I found the easiest was to simply create a RESTful API to expose this data - and for that, I leveraged MongoDB Realm's HTTP service from right inside of Atlas.
Name your Realm application MovieSearchApp and make sure to link to your cluster. All other default settings are fine.
Now click the 3rd Party Services menu on the left and then Add a Service. Select the HTTP service and name it movies:
Click the green Add a Service button, and you'll be directed to Add Incoming Webhook.
Once in the Settings tab, name your webhook getMoviesBasic. Enable Respond with Result, and set the HTTP Method to GET. To make things simple, let's just run the webhook as the System and skip validation with No Additional Authorization. Make sure to click the Review and Deploy button at the top along the way.
In this service function editor, replace the example code with the following:
Let's break down some of these components. MongoDB Realm interacts with your Atlas movies collection through the global context variable. In the service function, we use that context variable to access the sample_mflix.movies collection in your Atlas cluster. We'll reference this collection through the const variable movies:
We capture the query argument from the payload:
Return the aggregation code executed on the collection by pasting your aggregation copied from the aggregation pipeline builder into the code below:
Finally, after pasting the aggregation code, change the terms "werewolves and vampires" to the generic arg to match the function's payload query argument - otherwise our movie search engine capabilities will be extremely limited.
Your final code in the function editor will be:
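A sketch of the complete webhook function, under a few assumptions: the linked cluster uses the default service name "mongodb-atlas", the projected fields are illustrative, and the handler is given a name (getMoviesBasic) so it can be read and tested on its own. In the Realm function editor, the handler is assigned to exports, as noted in the final comment.

```javascript
// Webhook handler: runs the search pipeline with the caller's query string.
// `context` is the global provided by Realm at runtime.
function getMoviesBasic(payload) {
  // "mongodb-atlas" is assumed to be the name of the linked cluster service.
  const movies = context.services
    .get("mongodb-atlas")
    .db("sample_mflix")
    .collection("movies");

  // Query argument from the request, e.g. ?arg=werewolves and vampires
  const arg = payload.query.arg;

  return movies
    .aggregate([
      {
        $search: {
          text: { query: arg, path: "fullplot" },
          highlight: { path: "fullplot" },
        },
      },
      {
        $project: {
          title: 1,
          year: 1,
          fullplot: 1,
          _id: 0,
          score: { $meta: "searchScore" },
          highlight: { $meta: "searchHighlights" },
        },
      },
      { $limit: 10 },
    ])
    .toArray();
}

// In the Realm function editor, finish with:
//   exports = getMoviesBasic;
```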
Now you can test in the Console below the editor by changing the argument from arg1: "hello" to arg: "werewolves and vampires".
Please make sure to change BOTH the field name arg1 to arg, as well as the string value "hello" to "werewolves and vampires" - or it won't work.
Click Run to verify the result:
If this is working, congrats! We are almost done! Make sure to SAVE and deploy the service by clicking REVIEW & DEPLOY CHANGES at the top of the screen.
The beauty of a REST API is that it can be called from just about anywhere. Let's execute it in our browser. However, if you have tools like Postman installed, feel free to try that as well.
Switch back to the Settings of your getMoviesBasic function, and you'll notice a Webhook URL has been generated.
Click the COPY button and paste the URL into your browser. Then append the following to the end of your URL: ?arg="werewolves and vampires"
If you receive an output like what we have above, congratulations! You have successfully created a movie search API! 🙌 💪
Entering data in the search bar will bring you movie search results because the application is currently pointing to an existing API.
- Line 81 - userAction() will execute when the user enters a search. If there is valid input in the search box and no errors, we will call the buildMovieList() function.
- Line 125 - buildMovieList() is a helper function for userAction().
The buildMovieList() function will build out the list of movies along with their scores and highlights from the fullplot field. Notice in line 146 that if highlight.texts.type === "hit", we highlight the highlight.texts.value with a style attribute.
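The highlighting logic can be sketched as follows. This is not the code from index.html: the function name, the sample data, and the bold styling are hypothetical, and it assumes each highlight entry carries a texts array of { value, type } objects as described earlier.

```javascript
// Turn a document's highlight array into an HTML string, wrapping each
// "hit" in a styled span and leaving surrounding "text" pieces as they are.
function renderHighlights(highlights) {
  return highlights
    .flatMap((highlight) => highlight.texts)
    .map((text) =>
      text.type === "hit"
        ? `<span style="font-weight: bold">${text.value}</span>`
        : text.value
    )
    .join("");
}

// Hypothetical sample shaped like a searchHighlights result.
const sampleHighlights = [
  {
    path: "fullplot",
    texts: [
      { value: "A war between ", type: "text" },
      { value: "vampires", type: "hit" },
      { value: " and ", type: "text" },
      { value: "werewolves", type: "hit" },
    ],
  },
];

console.log(renderHighlights(sampleHighlights));
```

A production page would also escape the text values before inserting them into the DOM.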
In the userAction() function, notice on line 88 that the webhook_url is already set to a RESTful API I created in my own Movie Search application.
We capture the input from the search form field in line 82 and set it equal to searchString. In this application, we append that searchString input to the webhook_url before calling it in the fetch API in line 92.
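The request itself can be sketched like this. The helper names and the placeholder URL are hypothetical rather than taken from index.html, and the sketch URL-encodes the search string instead of wrapping it in quotes.

```javascript
// Append the search string to the webhook URL as the `arg` query parameter.
function buildSearchUrl(webhookUrl, searchString) {
  return `${webhookUrl}?arg=${encodeURIComponent(searchString)}`;
}

// Call the webhook and parse the JSON array of movie results.
async function searchMovies(webhookUrl, searchString) {
  const response = await fetch(buildSearchUrl(webhookUrl, searchString));
  if (!response.ok) {
    throw new Error(`Search request failed with status ${response.status}`);
  }
  return response.json();
}

// Placeholder URL for illustration only; use your own webhook URL.
console.log(buildSearchUrl("https://example.com/getMoviesBasic", "werewolves and vampires"));
```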
To make this application fully your own, simply replace the existing webhook_url value on line 88 with your own API from the getMoviesBasic Realm HTTP Service webhook you just created. 🤞 Now save these changes, and open the index.html file once more in your browser, et voilà! You have just built your movie search engine using Atlas Search. 😎
Pass the popcorn! 🍿 What kind of movie do you want to watch?!
You have just seen how easy it is to build a simple, powerful search into an application with MongoDB Atlas Search. In our next tutorial, we continue by building more advanced search queries into our movie application with fuzzy matching and wildcard paths to forgive fat fingers and typos. We'll even introduce custom score modifiers to allow us to shape our search results. Check out our documentation for other possibilities.
Harnessing the power of Apache Lucene for efficient search algorithms and static and dynamic field mapping for flexible, scalable indexing, all while using the same MongoDB Query Language (MQL) you already know and love, MongoDB now has a very particular set of skills. Skills we have acquired over a very long career. Skills that make MongoDB a DREAM for developers like you.