Searching on Your Location with Atlas Search and Geospatial Operators
Nic Raboy • Published Jan 28, 2022 • Updated Feb 03, 2023
Natural language search on text data is probably one of the most popular use cases for search, but there are scenarios where you might need to narrow the results even further.
Let's say you're building a restaurant review application like Yelp or a bed and breakfast booking system like Airbnb. Sure, you'll enter some kind of text criteria for what you're looking for, but there's also a location aspect to it. For example, if you want to find a place to get a cheeseburger within walking distance of your current location, you probably don't want your search results to contain entries from another country. This is an example of a geo search, where you would want to return results based on location coordinates.
In this tutorial, we're going to see how to use Atlas Search and the compound operator to search based on text entered and within a certain geographical area. For the text entered, we'll use the autocomplete operator, and for the geospatial component, we'll use the geoWithin operator.
To get an idea of what we want to accomplish, take a look at the following animated image:
To follow along with this tutorial, you'll need the following:
- MongoDB Atlas (M0+ cluster)
- Node.js (15.9.0+)
- The sample_airbnb sample dataset (load it for free in Atlas).
The above versions are just the versions that I'm using. You might have success with an older version of Node.js as well. For MongoDB Atlas, you can use a FREE M0 cluster or something more powerful.
We're going to be using a sample dataset for this example. You can learn more about sample_airbnb and the others in the documentation.
We're going to build an API, but it is going to have a single endpoint. The purpose of this API is to allow our front end to interact with MongoDB.
On your computer, create a new directory for our back end and execute the following from the command line:
The above commands will create a new package.json file and then download the MongoDB and Express Framework dependencies. Because we're going to have our back end and front end running locally on different ports, installing a cross-origin resource sharing (CORS) package is also necessary.
Add the above code to a main.js file within your project directory. If you want a quick start for MongoDB with Node.js, Lauren Schaefer wrote a multi-part series to get you up to speed.
There is one thing to note in the above code:
My MongoDB Atlas connection information is being stored as an environment variable on my computer. While environment variables are the safest approach, make sure you swap it with whatever you plan to use.
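As a minimal sketch of the environment variable approach, the connection string could be read and validated before the database client is created. The variable name ATLAS_URI and the helper getConnectionString are hypothetical; substitute whatever name your environment actually defines.

```javascript
// Hypothetical helper: reads the Atlas connection string from an
// environment-like object and fails fast when it is missing.
// ATLAS_URI is an assumed variable name, not one taken from this tutorial.
function getConnectionString(env = process.env) {
    const uri = env.ATLAS_URI;
    if (typeof uri !== "string" || uri.trim() === "") {
        throw new Error("ATLAS_URI is not set");
    }
    return uri.trim();
}
```

Failing early like this tends to produce a clearer error than letting the driver attempt a connection with an undefined string.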
With the boilerplate code out of the way, we can focus on what matters for this example: the aggregation pipeline for searching on text and geospatial data. However, before we start writing pipeline stages, we need to properly index our data for search.
In MongoDB Atlas, select the top-level Search tab after choosing one of your clusters. Within this tab, select the Create Index button, which opens a configuration wizard for creating Atlas Search indexes.
Rather than using the visual editor to create an index, we're going to use the JSON Editor with the following configuration. Provide sample_airbnb as the database and listingsAndReviews as the collection. You can copy and paste the following index configuration:
While the name of the index doesn't impact its functionality, we're going to name it autocomplete and reference it within our Node.js application. To break down what the above index does, we are indexing two fields within the documents of our collection. The address.location field is being indexed as a geospatial field while the name field is being indexed as an autocomplete text field. No other fields within our document will be searchable based on this index.
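An index definition matching this description might look like the following sketch, written as a JavaScript object and printed as JSON so it can be pasted into the JSON Editor. Setting dynamic to false is an assumption, inferred from the statement that no other fields are searchable through this index.

```javascript
// Sketch of an Atlas Search index definition for the two fields described.
// "dynamic: false" is an assumption: it turns off automatic indexing of
// every other field in the document.
const indexDefinition = {
    mappings: {
        dynamic: false,
        fields: {
            // address is an embedded document, so its subfield is nested.
            address: {
                type: "document",
                fields: {
                    location: { type: "geo" }
                }
            },
            name: { type: "autocomplete" }
        }
    }
};

// Print the definition as JSON for the Atlas JSON Editor.
console.log(JSON.stringify(indexDefinition, null, 4));
```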
By the end of the index creation, you should have something that looks like this:
So, let's go back to our code.
We know that our search results should be dependent on the text the user provides and the user's location (as a latitude and longitude).
If we wanted to search just with text, our aggregation pipeline stage (query) would look like the following:
The above stage says that we want to use the autocomplete index for our search and we want to use the autocomplete operator. We're searching for "apartment" on the name field and we're allowing typo tolerance, also known as fuzzy matching.
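A text-only stage along these lines could be sketched as a small builder function. The helper name buildAutocompleteStage and the maxEdits value of 1 are assumptions for illustration; the tutorial only states that fuzzy matching is enabled.

```javascript
// Builds a $search stage that runs the autocomplete operator on "name"
// using the index named "autocomplete".
// fuzzy.maxEdits = 1 is an assumed setting, chosen for illustration.
function buildAutocompleteStage(term) {
    return {
        $search: {
            index: "autocomplete",
            autocomplete: {
                query: term,
                path: "name",
                fuzzy: { maxEdits: 1 }
            }
        }
    };
}

const textStage = buildAutocompleteStage("apartment");
```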
If we wanted to use Atlas Search to search within a geographic area, our aggregation pipeline stage would look like the following:
The above stage says once again we're using the autocomplete index, but we're using the geoWithin operator. We're searching within a circular area where the center point is specified by a latitude and longitude. When working with GeoJSON like in the above code, the longitude is the first element in the coordinates array and the latitude is the second element. We're also providing a radius to search around the center point.
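A geo-only stage of this kind might be sketched as follows. The helper name buildGeoWithinStage, the example coordinates (placeholders roughly in New York City), and the 10,000 meter radius are all assumptions for illustration.

```javascript
// Builds a $search stage that matches documents whose address.location
// falls inside a circle. The radius is expressed in meters.
function buildGeoWithinStage(longitude, latitude, radiusInMeters) {
    return {
        $search: {
            index: "autocomplete",
            geoWithin: {
                circle: {
                    // GeoJSON order: longitude first, then latitude.
                    center: { type: "Point", coordinates: [longitude, latitude] },
                    radius: radiusInMeters
                },
                path: "address.location"
            }
        }
    };
}

// Placeholder coordinates and radius, not values from the tutorial.
const geoStage = buildGeoWithinStage(-73.98, 40.75, 10000);
```

Taking longitude before latitude in the function signature mirrors the GeoJSON order, which makes it harder to swap the two by accident.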
We just created two possible aggregation pipeline stages. The problem is that we want to be efficient. We don't want to search text using one stage and then apply a geo range on the results in a different stage. Instead, we want to perform both the text search and the geoWithin operations within a single query.
We can do this with the compound operator.
To combine multiple operations, we can change our code and aggregation pipeline logic to look like the following:
Notice that we're including a must array within the compound operator. You can learn more about each of the compound terms in the documentation, but the must term defines which clauses must match to produce results.
To clear things up, we're saying that the results must satisfy both the autocomplete and the geoWithin clauses.
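A combined stage could be sketched like this, placing a text clause and a geo clause side by side in the must array. The helper name buildCompoundStage, the maxEdits value, and the example arguments are assumptions for illustration.

```javascript
// Builds a single $search stage whose compound operator requires both a
// text match on "name" and a location inside the given circle.
function buildCompoundStage(term, longitude, latitude, radiusInMeters) {
    return {
        $search: {
            index: "autocomplete",
            compound: {
                // Every clause listed in "must" has to match.
                must: [
                    {
                        autocomplete: {
                            query: term,
                            path: "name",
                            fuzzy: { maxEdits: 1 } // assumed value
                        }
                    },
                    {
                        geoWithin: {
                            circle: {
                                center: { type: "Point", coordinates: [longitude, latitude] },
                                radius: radiusInMeters
                            },
                            path: "address.location"
                        }
                    }
                ]
            }
        }
    };
}

// Placeholder arguments, not values from the tutorial.
const pipeline = [buildCompoundStage("apartment", -73.98, 40.75, 10000)];
```

The resulting pipeline array is what would be passed to the driver's collection.aggregate() call inside the endpoint.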
Now, you can run the Node.js application and send the following payload to the endpoint using a POST request:
Given the data that is in the sample_airbnb dataset, we should end up with results around New York. However, the data we get back is likely more than we need. To limit the response, we can update our aggregation pipeline to not only search, but to project the fields we want in our response.
Modify the code to look like the following:
To be clear about what was added to the above code, take note of the $project stage.
In the above $project stage, we are saying we only want the address fields returned. We are also interested in the scoring data that came back from our search. By default, the scoring data would not be present in our results. This data might be useful for determining the quality of the match.
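A projection stage of this kind might look like the following sketch. Including name alongside address is an assumption, made so the front end has text to display; the helper withProjection is hypothetical.

```javascript
// $project stage that trims each result and exposes the search score.
// { $meta: "searchScore" } asks Atlas Search for the relevance score.
// Including "name" is an assumption made for this sketch.
const projectStage = {
    $project: {
        score: { $meta: "searchScore" },
        name: 1,
        address: 1
    }
};

// Hypothetical helper: appends the projection after any $search stage.
function withProjection(searchStage) {
    return [searchStage, projectStage];
}
```

The $search stage has to stay first in the pipeline, which is why the projection is appended rather than prepended.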
With the back end out of the way, let's focus on the front end.
Create another project directory that will represent the frontend application. Within that directory, create an index.html file with the following markup:
The markup above is boilerplate for getting started with jQuery, with the exception of the <div> container that has the input field. The id of the input field is going to be important for when we work with jQuery.
So, let's have a look at the autocomplete logic for the front end.
In the above code, we are using the autocomplete function for jQuery on the bnb input element. The source of our data to show on the screen will come from our API endpoint. As characters are entered into the field, a POST request is made with the expected JSON payload. The results are then formatted to how jQuery expects them to be, in this case having an id field within an object.
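The formatting step could be sketched as a small mapping function, assuming the jQuery UI Autocomplete widget, which accepts objects with label and value properties. The helper name toAutocompleteItems and the assumption that each API result carries name and _id fields are illustrative.

```javascript
// Hypothetical helper: converts documents returned by the API into the
// objects handed to the jQuery UI autocomplete response callback.
// label is what the menu displays and value is what fills the input;
// carrying _id through as id is an assumption about the API response.
function toAutocompleteItems(listings) {
    return listings.map((listing) => ({
        label: listing.name,
        value: listing.name,
        id: listing._id
    }));
}
```

Inside the widget's source function, the mapped array would be passed to the response callback once the POST request completes.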
Because we want a narrow scope for this example, we won't be looking at the logic for when an element is selected from the returned autocomplete results. However, just having the source field will allow us to visually show autocomplete results as we type them.
To run this example, you'll need to serve the back end and front end separately. For the back end, navigate into the project directory with your command line and execute the following:
The above command should start serving the API on port 3000. To serve the front end, you'll either need Python or a compatible tool or package like serve, which is available through NPM.
If Python is available, you can execute the following from within your frontend project directory:
The above command will serve the front end on port 8000.
Of course, what I listed for serving your applications was meant for local development and testing. You'll have to use your best judgment when deploying your applications to production.
You just saw how to use the compound operator for MongoDB Atlas Search to search based on text as well as within a geospatial area, in this case a circle. Why might this be valuable? Imagine needing to search for a hotel or restaurant near your location or within walking distance rather than returning all possible matches based on your text input. The compound operator returns results only if they match all of the clauses you require.
Questions or comments on this tutorial? Head to the MongoDB Community Forums and let's chat!