Five Minute MongoDB: MongoDB Atlas Global Clusters

Dj Walker-Morgan

#Atlas

The world is a big place, and that in itself is a problem for the TCP packets that carry your queries and results back and forth over the internet. Because those packets are bounded by the speed of light, among other factors, talking to a database on the other side of the world can be sluggish.

That's why MongoDB Atlas's Global Clusters exist. If you can't change the speed of light, the solution is to use a distributed architecture to move the data closer to the user. That makes your application respond faster all around the world.

How does it work?

Well, let's start with a quick refresher on MongoDB sharding. With sharding, you can split your data between the MongoDB replica sets that make up a cluster, based on a shard key. The shard key is a field (or fields) chosen because its values spread the documents stored in the database across the replica sets. This is the key to MongoDB's ability to scale horizontally.
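To make the refresher concrete, here is a minimal sketch of the idea in plain Python. It is not how MongoDB implements sharding (MongoDB assigns ranges of shard key values to shards rather than taking a simple modulo); the functions `shard_for` and `distribute`, and the `user_id` field, are hypothetical names used only for illustration.

```python
import hashlib


def shard_for(key_value, num_shards):
    """Return a shard index for a shard key value using a stable hash.

    Simplification: a hash-and-modulo scheme stands in for MongoDB's
    range-based assignment of key values to shards.
    """
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(documents, key_field, num_shards):
    """Group documents by the shard their shard key value maps to."""
    shards = [[] for _ in range(num_shards)]
    for doc in documents:
        shards[shard_for(doc[key_field], num_shards)].append(doc)
    return shards


# Hypothetical data: 1000 documents keyed on a made-up "user_id" field.
docs = [{"user_id": i, "name": f"user-{i}"} for i in range(1000)]
shards = distribute(docs, "user_id", 3)
print([len(s) for s in shards])  # document count per shard
```

The point to take away is that a given key value always lands on the same shard, and a well-chosen key spreads documents across all of them.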

Now, what Global Clusters does at the simplest level is to prefix that shard key with a location, making a compound shard key that carries location information. It's cleverer than that, though, as you don't actually say which replica set. Think of the world divided up by ISO 3166 country codes or subdivision codes. These are used to make up a region-based map of the world. When you create a replica set within a Global Cluster, it'll be associated with a set of those codes.

When you create a new document, it has to have a location field containing the appropriate country or subdivision code. As that document is written, the write request is routed, using that location, to the nearest replica set you've got to that location. Put the location into a query and that query is routed the same way. There are lots of nuances to how this works, too many for a brief article like this, but you can read more in the Global Clusters documentation.
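The routing idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Atlas implementation: the zone names, the `ZONE_LOCATIONS` map, and the `zone_for` and `route_write` helpers are all hypothetical, and the sketch simply falls back from a subdivision code to its country code, which real mappings need not do.

```python
# Hypothetical zone map: each zone claims a set of ISO 3166-1 alpha-2
# country codes. Real mappings are configured in Atlas, not hard-coded.
ZONE_LOCATIONS = {
    "americas": {"US", "CA", "BR"},
    "europe": {"GB", "DE", "FR"},
    "asia_pacific": {"JP", "AU", "IN"},
}


def zone_for(location):
    """Resolve a country code ("DE") or subdivision code ("US-CA") to a zone."""
    country = location.split("-")[0]  # assumption: "US-CA" follows "US"
    for zone, codes in ZONE_LOCATIONS.items():
        if country in codes:
            return zone
    raise ValueError(f"no zone configured for location {location!r}")


def route_write(doc, storage):
    """Store a document in the zone that owns its location; return the zone."""
    if "location" not in doc:
        raise ValueError("document needs a 'location' field")
    zone = zone_for(doc["location"])
    storage.setdefault(zone, []).append(doc)
    return zone


storage = {}
print(route_write({"location": "DE", "order_id": 1}, storage))
print(route_write({"location": "US-CA", "order_id": 2}, storage))
```

Notice that the caller never names a replica set. It only supplies a location, and the map decides where the document lives.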

Figure: Global Clusters in motion

How do you cover the world?

How do you know where to put your replica sets, though? That's where the MongoDB Atlas user interface magic comes in, with an interactive latency map that shows your current coverage in terms of latency between other regions and your replica sets. Click a region to see what impact a new replica set there will have. If it answers your latency issue, just click to create the replica set and let MongoDB sharding take care of migrating the appropriate documents to it.
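The "what if I add a region?" question the latency map answers can be sketched as a small calculation. Everything here is an assumption for illustration: the latency figures are made up, the region labels are not real Atlas region identifiers, and `coverage` and `impact_of_adding` are hypothetical helpers.

```python
# Made-up round-trip times in milliseconds, for illustration only.
# Outer keys are user locations; inner keys are hypothetical region labels.
LATENCY_MS = {
    "US": {"us-east": 20, "eu-west": 90, "ap-southeast": 220},
    "DE": {"us-east": 95, "eu-west": 15, "ap-southeast": 180},
    "AU": {"us-east": 210, "eu-west": 280, "ap-southeast": 40},
}


def coverage(deployed):
    """Best-case latency from each user location to any deployed region."""
    return {
        user: min(times[region] for region in deployed)
        for user, times in LATENCY_MS.items()
    }


def impact_of_adding(deployed, candidate):
    """Milliseconds saved per user location if `candidate` were added."""
    before = coverage(deployed)
    after = coverage(list(deployed) + [candidate])
    return {user: before[user] - after[user] for user in before}


current = ["us-east"]
print(coverage(current))
print(impact_of_adding(current, "ap-southeast"))
```

With these invented numbers, adding the third region only helps users in "AU", which is exactly the kind of answer you would want before committing to a new replica set.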

Got a particularly busy region? You can easily add more replica sets to take the load in that region. Then all you need to do is let the sharding distribute the load across those replica sets.

Asking the world

Dividing the planet up like this may leave you wondering what happens when you do want to make a query that spans many regions and therefore many replica sets. Sharding has you covered again with a well-understood mechanism that distributes the query to the relevant replica sets and gathers the results. And if latency or load is a concern, you can place read-only replicas of other regions' data within selected regions.
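The difference between a query that carries a location and one that doesn't can be sketched like this. It is a toy model in plain Python, not MongoDB's query router: the zone names, the sample documents, and the `find` helper are hypothetical.

```python
# Hypothetical mapping and data, for illustration only.
ZONE_OF_LOCATION = {"US": "americas", "DE": "europe", "JP": "asia_pacific"}

SHARDS = {
    "americas": [{"_id": 1, "location": "US", "total": 30}],
    "europe": [
        {"_id": 2, "location": "DE", "total": 75},
        {"_id": 3, "location": "DE", "total": 10},
    ],
    "asia_pacific": [{"_id": 4, "location": "JP", "total": 50}],
}


def find(predicate, location=None):
    """Return (zones contacted, matching documents sorted by _id).

    With a location, only the owning zone is contacted (a targeted query).
    Without one, every zone is asked and the results are merged.
    """
    if location is not None:
        zones = [ZONE_OF_LOCATION[location]]
    else:
        zones = list(SHARDS)
    matches = []
    for zone in zones:
        for doc in SHARDS[zone]:
            if location is not None and doc["location"] != location:
                continue
            if predicate(doc):
                matches.append(doc)
    return zones, sorted(matches, key=lambda doc: doc["_id"])


print(find(lambda d: d["total"] >= 50))                 # asks every zone
print(find(lambda d: d["total"] >= 50, location="DE"))  # asks one zone
```

Both calls return correct results. The one that includes the location simply does less work, because only one zone has to answer.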

Unique to MongoDB Atlas

The most surprising thing about Global Clusters is that they are, as far as we know, unique to MongoDB Atlas. Other databases may offer the building blocks for constructing a globally distributed network of databases with routable writes and queries, but MongoDB Atlas's Global Clusters is already deployed, running and remarkably simple to put to work. It's just one feature of MongoDB Atlas which makes it the obvious option for running MongoDB in the cloud.