New to MongoDB Atlas — Global Clusters Enable Low-Latency Reads and Writes from Anywhere

Leo Zheng and Jesse Krasnostein

#Releases #Cloud #Atlas

The ability to replicate data across any number of cloud regions was introduced to MongoDB Atlas, the fully managed service for the database, last fall. This granted Atlas customers two key benefits. For those with geographically distributed applications, this functionality allowed them to leverage local replicas of their data to reduce read latency and provide a fast, responsive customer experience on a global scale. It also meant that an Atlas cluster could be easily configured to fail over to another region during cloud infrastructure outages, giving customers multi-region fault tolerance in just a few clicks.

But what about improving write latency and addressing increasingly demanding regulations, many of which have data residency requirements? In the past, users could address these challenges in a couple of ways. If they wanted to continue using a fully managed MongoDB service, they could deploy separate databases in each region. Unfortunately, this often resulted in added operational and application complexity. They could also build and manage a geographically distributed database deployment themselves and satisfy these requirements using MongoDB’s zone sharding capabilities.

Today we’re excited to introduce Global Clusters to MongoDB Atlas. This new feature makes it possible for anyone to effortlessly deploy and manage a single database that addresses all the aforementioned requirements. Global Clusters allow organizations with distributed applications to geographically partition a fully managed deployment in a few clicks, and control the distribution and placement of their data with sophisticated policies that can be easily generated and changed.

Improving app performance by reducing read and write latency

With Global Clusters, geographically distributed applications can write to (and of course, read from) local partitions of an Atlas deployment called zones. This new Global Writes capability allows you to associate and direct data to a specific zone, keeping it in close proximity to nearby application instances and end users. In its simplest configuration, an Atlas zone contains a 3-node replica set distributed across the availability zones of its preferred cloud region. This configuration can be adjusted depending on your requirements. For example, you can turn the 3-node replica set into multiple shards to address increases in local write throughput. You can also distribute the secondaries within a zone into other cloud regions to enable fast, responsive read access to that data from anywhere.

The illustration above represents a simple Global Cluster in Atlas with two zones. For simplicity’s sake, we’ve labeled them blue and red. The blue zone uses a cloud region in Virginia as the preferred region, while the red zone uses one in London. Local application instances will write to and read from the MongoDB primaries located in the respective cloud regions, ensuring low latency read and write access. Each zone also features a read-only replica of its data located in the cloud region of the other one. This ensures that users in North America will have fast, responsive read access to data generated in Europe, and vice versa.

Satisfying data residency for regulatory requirements

By allowing developers to easily direct the movement of data at the document level, Global Clusters provide a foundational building block that helps organizations achieve compliance with regulations containing data residency requirements. Data is associated with a zone and pinned to that zone unless otherwise configured.

The illustration below represents an Atlas Global Cluster with 3 zones: blue, red, and orange. The configuration of the blue and red zones is very similar to what we already covered. Local application instances read from and write to nearby primaries located in the preferred regions, Virginia and London, and each zone includes a read-only replica in the preferred cloud region of every other zone for serving fast, global reads. What’s different is the orange zone, which serves Germany. Unlike data generated in North America and the UK, data generated in and around Germany is not replicated globally; instead, it remains pinned to the preferred cloud region located in Frankfurt.
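To make the residency idea concrete, here is a minimal Python sketch that models the three zones as plain data. The Zone class, its field names, and the exact placement of read-only replicas are our own illustrative assumptions (we read "every other zone" literally for blue and red); none of this is an Atlas API. The point is only that a zone with no read-only regions keeps its data in a single place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical model of one Global Cluster zone (not an Atlas API object)."""
    name: str
    preferred_region: str
    read_only_regions: tuple = ()

    def regions_holding_data(self):
        """Every region that stores a copy of this zone's data."""
        return {self.preferred_region, *self.read_only_regions}


# Assumed layout mirroring the blue, red, and orange zones in the text.
ZONES = [
    Zone("blue", "Virginia", read_only_regions=("London", "Frankfurt")),
    Zone("red", "London", read_only_regions=("Virginia", "Frankfurt")),
    Zone("orange", "Frankfurt"),  # no read-only replicas: data stays put
]


def is_pinned(zone):
    """True when the zone's data never leaves its preferred region."""
    return zone.regions_holding_data() == {zone.preferred_region}


for zone in ZONES:
    print(zone.name, sorted(zone.regions_holding_data()), is_pinned(zone))
```

In this model only the orange zone reports as pinned, which matches the Germany example: its data exists in Frankfurt and nowhere else.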

Deploying your first Global Cluster

Now let’s walk through how easy it is to set up a Global Cluster with MongoDB Atlas.

In the Atlas UI, when you go to create a cluster, you’ll notice a new accordion labelled Global Cluster Configuration. If you click into this and enable “Global Writes”, you’ll find two easy-to-use and customizable templates. Global Performance provides reasonable read and write latency to the majority of the global population, and Excellent Global Performance provides excellent read and write latency to the majority of the global population. Both options are available across AWS, Google Cloud Platform, and Microsoft Azure.

You can also configure your own zones. Let’s walk through the setup of a Global Cluster using the Global Performance template on AWS. After selecting the Global Performance template, you’ll see that the Americas are mapped to the North Virginia region, EMEA is mapped to Frankfurt, and APAC is mapped to Singapore.

As your business requirements change over time, you are able to switch to the Excellent Global Performance template or fully customize your existing template.

Customizing your Global Cluster

Say you wanted to move your EMEA zone from Frankfurt to London. You can do so in just a few clicks. If you scroll down in the Create Cluster Dialog, you’ll see the Zone configuration component (pictured below). Select the zone you want to edit and simply update the preferred cloud region.

Once you’re happy with the configuration, you can verify your changes in the latency map and then proceed to deploy the cluster.

After your Global Cluster has been deployed, you’ll find that it looks just like any other Atlas cluster. If you click into the connect experience, you’ll see a simple and concise connection string that you can use in all of your geographically distributed application instances.

Configuring data for a Global Cluster

Now that your Global Cluster is deployed, let's have a look at the Atlas Data Explorer, where you can create a new database and collection. Atlas will walk you through this process, including the creation of an appropriate compound shard key — the mechanism used to determine how documents are mapped to different zones.

This shard key must contain the location field as its first field. The second field should be a well-distributed identifier, such as userId. Full details on key selection can be found in the MongoDB Atlas docs.
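As a rough illustration, the compound shard key described above can be written down as an ordered key specification. This is a sketch only: the helper name is hypothetical, userId is simply the example identifier used in this post, and Atlas creates the key for you in the Data Explorer, so nothing here needs to be run against a cluster.

```python
def build_global_shard_key(second_field):
    """Return an ordered shard key spec with location first.

    Hypothetical helper for illustration. The value 1 marks an ascending
    field, following MongoDB's usual key-pattern notation.
    """
    if not second_field or second_field == "location":
        raise ValueError("second field must be a distinct, non-empty field name")
    # Python dicts preserve insertion order, so location stays first.
    return {"location": 1, second_field: 1}


shard_key = build_global_shard_key("userId")
print(shard_key)  # {'location': 1, 'userId': 1}
```

The ordering matters because the leading location field is what lets documents be grouped by geography, while the second field spreads documents that share a location.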

To help show what documents might look like in your database, we’ve added a few sample documents to a collection in the Data Explorer. As you can see above, we’ve included a field called location containing an ISO 3166-1 alpha-2 country code ("US", "DE", "IN") or a supported ISO 3166-2 subdivision code ("US-DC", "DE-BE", "IN-DL"), as well as a field called userId, which acts as our well-distributed identifier. This ensures that location affinity is baked into each document.
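Because every document needs a well-formed location value, an application might check the shape of that field before writing. The sketch below is one way to do so in Python, under stated assumptions: the function name and sample documents are made up, and the pattern checks only the general layout of a country or subdivision code, not whether Atlas actually supports that code.

```python
import re

# Shape check only: two uppercase letters, optionally followed by a hyphen
# and one to three uppercase letters or digits (the ISO 3166-2 layout).
LOCATION_PATTERN = re.compile(r"[A-Z]{2}(-[A-Z0-9]{1,3})?")


def has_valid_location_shape(document):
    """Check that a document carries a plausibly formatted location code."""
    location = document.get("location")
    return isinstance(location, str) and bool(LOCATION_PATTERN.fullmatch(location))


# Hypothetical sample documents; the values are invented for illustration.
samples = [
    {"userId": 1001, "location": "US"},
    {"userId": 1002, "location": "DE-BE"},
    {"userId": 1003, "location": "usa"},   # wrong case and length
    {"userId": 1004},                      # location missing entirely
]

for doc in samples:
    print(doc["userId"], has_valid_location_shape(doc))
```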

In the background, MongoDB Atlas will have automatically placed each of these documents in its respective zone. The document corresponding to Anna Bell will live in North Virginia and the document corresponding to John Doe will live in Singapore. Assuming we have application instances deployed in Singapore and North Virginia, both will use the same MongoDB connection string to connect to the cluster. When Anna Bell connects to our application from the US, she will automatically be working with data kept in close proximity to her. Similarly, when John Doe connects from Australia, he will be writing to the Singapore region.
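Conceptually, the placement described above boils down to a lookup from a document's location value to the region of its zone. The Python sketch below illustrates that idea and nothing more: the mapping table is a tiny hypothetical excerpt, the location values assigned to the two users are assumed, and the fallback from a subdivision code to its country prefix is a simplification of ours rather than documented Atlas behavior.

```python
# Hypothetical, heavily abbreviated mapping from location code to zone region,
# loosely following the Global Performance template on AWS.
LOCATION_TO_REGION = {
    "US": "North Virginia",
    "DE": "Frankfurt",
    "AU": "Singapore",
}


def region_for(document):
    """Resolve a document's location to the region of its zone.

    Tries the exact code first, then falls back to the country prefix
    of a subdivision code such as "US-DC" (an assumed simplification).
    """
    location = document["location"]
    if location in LOCATION_TO_REGION:
        return LOCATION_TO_REGION[location]
    country = location.split("-", 1)[0]
    return LOCATION_TO_REGION[country]


# Assumed location values for the two users mentioned in the text.
anna = {"name": "Anna Bell", "userId": 1, "location": "US"}
john = {"name": "John Doe", "userId": 2, "location": "AU"}

for doc in (anna, john):
    print(doc["name"], "->", region_for(doc))
```

The application code never performs this lookup itself; it writes through the shared connection string and the cluster handles placement.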

Adding a zone to your Global Cluster

Now let’s say that you start to see massive adoption of your application in India and you want to improve the performance for local users. At any time, you can return to your cluster configuration, click “Add a Zone”, and select Mumbai as the preferred cloud region for the new zone.

The global latency map will update, showing us the new zone and an updated view of the countries that map to it. When we deploy the changes, the documents that are tagged with the relevant ISO country codes will be migrated gracefully to the new zone, without downtime.
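One way to picture the effect of adding a zone is to compare the location mapping before and after the change and see which documents get a new home. The Python sketch below does exactly that with made-up data: the mappings, the assumption that India previously mapped to Singapore, and the helper name are all illustrative. Atlas performs the real migration for you.

```python
def documents_to_move(documents, old_mapping, new_mapping):
    """Return the documents whose target region changes between two mappings."""
    return [
        doc for doc in documents
        if old_mapping[doc["location"]] != new_mapping[doc["location"]]
    ]


# Hypothetical mappings: India is assumed to start in the Singapore zone.
old_mapping = {"US": "North Virginia", "AU": "Singapore", "IN": "Singapore"}
new_mapping = {**old_mapping, "IN": "Mumbai"}

documents = [
    {"userId": 1, "location": "US"},
    {"userId": 2, "location": "IN"},
    {"userId": 3, "location": "AU"},
]

moved = documents_to_move(documents, old_mapping, new_mapping)
print(moved)  # [{'userId': 2, 'location': 'IN'}]
```

Only the document tagged "IN" changes region; documents for other countries are untouched by the new zone.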

Scaling write throughput in a single zone

As we mentioned earlier in this post, it’s possible to scale out a single zone to address increases in local write throughput. Simply scroll to “Zone Configuration”, click on “Additional Options”, and increase the number of shards. By adding a second shard to a zone, you can roughly double that zone’s write throughput, provided writes are spread evenly across the shard key.

Low-latency reads of data originating from other zones

We also referenced the ability to distribute read-only replicas of data from a zone into the preferred cloud regions of other zones, providing users with low-latency read access to data originating from other regions. This is easy to configure in MongoDB Atlas. In “Zone Configuration”, select “Add secondary and read-only regions”. Under “Deploy read-only replicas”, select “Add a node” and choose the region where you’d like your read-only replica to live.

For Global Clusters, Atlas provides a shortcut for creating read-only replicas of each zone in every other zone. Under “Zone configuration summary”, simply select the “Configure local reads in every zone” button.

MongoDB Atlas Global Clusters are very powerful, making it possible for practically any developer or organization to easily deploy, manage, and scale a distributed database layer optimized for low-latency reads and writes anywhere in the world. We're very excited to see what you build with this new functionality.

Global Clusters are available today on Amazon Web Services, Google Cloud Platform, and Microsoft Azure for clusters M30 and larger.