
Cross Cluster Search Using Atlas Search and Data Federation

Pavel Duchovny • 3 min read • Published Jul 22, 2022 • Updated Jul 22, 2022
The document model is the best way to work with data, and it’s the main factor driving MongoDB's popularity. The document model is also helping MongoDB innovate its own solutions to power the world's most sophisticated data requirements.
Data federation allows you to form federated database instances that span multiple data sources, such as different Atlas clusters, AWS S3 buckets, and other HTTPS sources. Now, one application or service can work with its individual cluster, with dedicated resources and data compliance, while queries can run on a union of the datasets. This is great for analytics, global view dashboards, and many other use cases in distributed systems.
Atlas Search is also an emerging product that allows applications to build relevance-based search powered by Lucene directly on their MongoDB collections. While both products are amazing on their own, they can work together to form a multi-cluster, robust text search to solve challenges that were hard to solve beforehand.

Example use case

Plotting attributes on a map based on geo coordinates is a common need for many applications. Complex code needs to be added if we want to merge different search sources into one data set based on the relevance or other score factors within a single request.
With Atlas federated queries run against Atlas Search indexes, this task becomes as easy as firing one query.
In my use case, I have two clusters: cluster-airbnb (Airbnb data) and cluster-whatscooking (restaurant data). For most parts of my applications, both data sets have nothing really in common and are therefore kept in different clusters for each application.
[Figure: Atlas clusters for each application]
However, if I am interested in plotting the locations of restaurants and Airbnbs (and maybe shops, later) around the user, I would normally have to merge the datasets and build a search index on top of the merged data.

With federated queries, everything becomes easier

As mentioned above, the two applications are running on two separate Atlas clusters due to their independent microservice nature. They can even be placed on different clouds and regions, like in this picture.
The restaurants data is stored in a collection named “restaurants” that follows a common model, such as grades/menu/location.
The Airbnb application uses a different data model for its Airbnb data, such as bookings/apartment details/location.
The power of the document model and federated queries is that those data sets can become one if we create a federated database instance and group them under a “virtual collection” called “pointsOfInterest.”
[Figure: Atlas Query Federation setup]
The data sets can now be queried as if we have a collection named “pointsOfInterest” unioning the two.
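As a rough sketch of what such a setup could look like, the storage configuration of a federated database instance maps named stores (here, the two Atlas clusters) to virtual collections. The snippet below writes it as a Python dictionary. The store names, the federated database name, the project ID placeholder, and the Airbnb database and collection names are all assumptions made for illustration; only the cluster names, “restaurants,” and “pointsOfInterest” come from this article.

```python
# Sketch of a Data Federation storage configuration as a Python dict.
# Store names, "<PROJECT_ID>", the "federated" database name, and the
# Airbnb database/collection names are hypothetical placeholders.
STORAGE_CONFIG = {
    "stores": [
        {"name": "airbnbStore", "provider": "atlas",
         "clusterName": "cluster-airbnb", "projectId": "<PROJECT_ID>"},
        {"name": "whatscookingStore", "provider": "atlas",
         "clusterName": "cluster-whatscooking", "projectId": "<PROJECT_ID>"},
    ],
    "databases": [
        {
            "name": "federated",  # hypothetical virtual database name
            "collections": [
                {
                    "name": "pointsOfInterest",  # the virtual collection
                    "dataSources": [
                        {"storeName": "airbnbStore",
                         "database": "airbnb", "collection": "listings"},
                        {"storeName": "whatscookingStore",
                         "database": "whatscooking", "collection": "restaurants"},
                    ],
                }
            ],
        }
    ],
}


def clusters_behind(config, virtual_collection):
    """Return the cluster names that feed a given virtual collection."""
    store_to_cluster = {s["name"]: s["clusterName"] for s in config["stores"]}
    return [
        store_to_cluster[src["storeName"]]
        for db in config["databases"]
        for coll in db["collections"]
        if coll["name"] == virtual_collection
        for src in coll["dataSources"]
    ]


print(clusters_behind(STORAGE_CONFIG, "pointsOfInterest"))
# prints ['cluster-airbnb', 'cluster-whatscooking']
```

The helper `clusters_behind` is not part of any MongoDB API; it is only there to show that the virtual collection resolves to both clusters.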

Let's add Atlas Search to the mix

Since the collections are located on Atlas, we can easily use Atlas Search to index each one individually. It’s also quite likely that we have already done so, as our underlying applications require search capabilities for restaurants and Airbnb facilities.
[Figure: Atlas Search index setup]
However, if we make sure that the index names are identical (for example, “default”) and that the key fields used for special searches, such as geo, share the same name (e.g., “location”), we can run federated search queries on “pointsOfInterest.” This works because federated queries are propagated to each individual data source that makes up the virtual collection. With Atlas Search, this is surprisingly powerful, as we get results with the search scores correctly merged across all of our data sets. This means that if points of interest from a geo search are close to my location, we will get Airbnbs and restaurants correctly ordered by distance. What’s even cooler is that Atlas Data Federation intelligently “pushes down” as much of a query as possible, so the search operation runs locally on each cluster and the union is done in the federation layer, making the operation as efficient as possible.
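To make the index requirement concrete, here is a minimal sketch of an Atlas Search index definition that could be created under the same name on both clusters, with “location” mapped as a geo field. The use of dynamic mappings for the remaining fields is an assumption, and the `federation_ready` helper is purely illustrative (it is not an Atlas API); it only checks the two conditions described above.

```python
# One index definition, created under the same name on both clusters.
INDEX_NAME = "default"
INDEX_DEFINITION = {
    "mappings": {
        "dynamic": True,  # assumption: index the other fields dynamically
        "fields": {"location": {"type": "geo"}},
    }
}


def federation_ready(indexes, geo_field="location"):
    """Check that all clusters share one index name and a geo-mapped field.

    `indexes` maps a cluster name to {"name": ..., "definition": ...}.
    Hypothetical helper for illustration only.
    """
    names = {idx["name"] for idx in indexes.values()}
    geo_ok = all(
        idx["definition"]["mappings"]
        .get("fields", {})
        .get(geo_field, {})
        .get("type") == "geo"
        for idx in indexes.values()
    )
    return len(names) == 1 and geo_ok


indexes = {
    "cluster-airbnb": {"name": INDEX_NAME, "definition": INDEX_DEFINITION},
    "cluster-whatscooking": {"name": INDEX_NAME, "definition": INDEX_DEFINITION},
}
print(federation_ready(indexes))  # prints True
```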
[Figure: Compass geo query]
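A geo query of this kind could be sketched as the aggregation pipeline below, built as plain Python data. It is not necessarily the exact query shown in the screenshot: it assumes the Atlas Search `near` operator, whose documented score for a result is `pivot / (pivot + distance)`, so closer points score higher regardless of which cluster they come from. The coordinates, the pivot value, the limit, and the projected `name` field are illustrative assumptions.

```python
def nearby_pipeline(lng, lat, pivot_m=1000, limit=20,
                    index="default", path="location"):
    """Build a $search pipeline that ranks documents by distance to a point."""
    return [
        {"$search": {
            "index": index,  # same index name on every cluster
            "near": {
                "path": path,  # same geo field on every cluster
                "origin": {"type": "Point", "coordinates": [lng, lat]},
                "pivot": pivot_m,  # distance in meters at which score is 0.5
            },
        }},
        {"$limit": limit},
        # "name" is an assumed field; the score comes from search metadata.
        {"$project": {"name": 1, "location": 1,
                      "score": {"$meta": "searchScore"}}},
    ]


def near_score(distance_m, pivot_m):
    """Score formula documented for the `near` operator."""
    return pivot_m / (pivot_m + distance_m)


pipeline = nearby_pipeline(-73.98, 40.75)  # illustrative coordinates
print(pipeline[0]["$search"]["near"]["origin"])
print(near_score(1000, 1000))  # prints 0.5
```

With PyMongo, passing this list to `aggregate()` on the “pointsOfInterest” collection of a client connected to the federated database instance would run the search against both clusters in one request.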

Finally, let's chart it up

We can take the query we just ran in Compass and export it to MongoDB Charts, our native charting offering that can connect directly to a federated database instance, and plot the data on a map.


With new products come new power and possibilities. Joining the forces of Data Federation and Atlas Search allows creators to easily form applications like never before. Start innovating today with MongoDB Atlas.
