Efficient Sync Solutions: Cluster-to-Cluster Sync and Live Migration to Atlas
Modern business contexts raise increasingly complex challenges. These range from minimizing downtime during migrations and adopting efficient tools for moving from relational to non-relational databases, to implementing resilient architectures that ensure high availability and scaling horizontally so that large amounts of data can be managed and queried efficiently.
Two of the main challenges, which will be covered in this article, are:
- The need to create resilient IT infrastructures that can ensure business continuity or minimal downtime even in critical situations, such as the loss of a data center.
- Conducting migrations from one infrastructure to another without compromising operations.
It is in this context that MongoDB stands out by offering innovative solutions such as MongoSync and live migrate.
MongoDB Atlas, with its capabilities and remarkable flexibility, offers two distinct approaches to implementing business continuity strategies. These two strategies are:
- Creating a cluster with a geographic distribution of nodes.
- Implementing two clusters in different regions synchronized via MongoSync.
In this section, we will explore the second approach in more detail.
"The mongosync binary is the primary process used in Cluster-to-Cluster Sync. mongosync migrates data from one cluster to another and can keep the clusters in continuous sync."
This tool performs the following operations:
- It migrates data from one cluster to another.
- It keeps the clusters in continuous sync.
Let's make this more concrete with an example:
- Initially, the situation looks like this for the production cluster and the disaster recovery cluster:
The current state of the main cluster would look like this:
The backup cluster used for disaster recovery is still empty:
Before proceeding, it is essential to install the mongosync binary. If you have not already done so, you can download it from the downloads page. The commands described below have been tested on CentOS 7.
Let's proceed with the configuration of mongosync by defining a configuration file and a service:
You can copy and paste the current configuration into this file using the appropriate connection strings. You can also test with two Atlas clusters, which must be M10 level or higher. For more details on how to get the connection strings from your Atlas cluster, you can consult the documentation.
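As a minimal sketch of such a configuration file, the snippet below writes a mongosync.conf in the current directory. The cluster0, cluster1, logPath, and verbosity settings are mongosync configuration options; the connection strings and the log directory are placeholders chosen for illustration and must be replaced with your own values.

```shell
# Write a minimal mongosync configuration file in the current directory.
# Assumption: the connection strings and the log path below are placeholders.
cat > mongosync.conf <<'EOF'
cluster0: "mongodb+srv://<user>:<password>@<source-cluster-host>/"
cluster1: "mongodb+srv://<user>:<password>@<destination-cluster-host>/"
logPath: "/var/log/mongosync"
verbosity: "INFO"
EOF
```

Here cluster0 is assumed to be the production cluster and cluster1 the disaster recovery cluster; the direction of the sync is only decided later, when the synchronization is started.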
Generally, this step is performed on a Linux machine by system administrators. Although the step is optional, it is recommended to implement it in a production environment.
Next, you will be able to create a service named mongosync.service.
This is what your service file should look like.
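The following is one possible shape for the unit file, not a definitive one. The service account (mongosync), the binary location (/usr/local/bin/mongosync), and the configuration path (/etc/mongosync.conf) are all assumptions made for this sketch; adjust them to match your installation.

```shell
# Draft the unit file locally first.
# Assumptions: the user, group, binary path, and config path are illustrative.
cat > mongosync.service <<'EOF'
[Unit]
Description=Cluster-to-Cluster Sync (mongosync)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=mongosync
Group=mongosync
ExecStart=/usr/local/bin/mongosync --config /etc/mongosync.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Then, with root privileges, copy it into place:
#   sudo cp mongosync.service /etc/systemd/system/mongosync.service
```

Whichever account runs the service needs write access to the directory configured as logPath.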
Reload all unit files:
Now, we can start the service:
We can also check whether the service has been started correctly:
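The three systemd steps above (reload, start, status check) can be sketched as follows. They are wrapped in a hypothetical helper function, start_mongosync_service, so that nothing runs until you call it; the unit name mongosync.service comes from the previous step, and the use of sudo is an assumption about your environment.

```shell
# Hypothetical helper grouping the three systemd steps described above.
start_mongosync_service() {
  sudo systemctl daemon-reload                   # reload all unit files
  sudo systemctl start mongosync.service         # start the service
  systemctl status mongosync.service --no-pager  # check that it is active
}

# Usage (requires systemd and the unit file installed):
#   start_mongosync_service
```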
Output:
If you choose not to create a service, you can start the process directly:
mongosync --config mongosync.conf
After starting the service, verify that it is in the idle state:
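One way to perform this check is to query the progress endpoint of the mongosync HTTP API. The sketch below assumes mongosync is listening on its default port (27182) on the local machine; extract_state is a hypothetical helper that pulls the state field out of the JSON response with sed, in case jq is not installed.

```shell
# Assumption: mongosync listens on its default port, 27182, on this host.
MONGOSYNC_API="${MONGOSYNC_API:-http://localhost:27182}"

# Query the progress endpoint of the mongosync HTTP API.
mongosync_progress() {
  curl -sS -X GET "$MONGOSYNC_API/api/v1/progress"
}

# Hypothetical helper: print the value of the "state" field from the JSON on stdin.
extract_state() {
  sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p'
}

# Usage (requires a running mongosync):
#   mongosync_progress | extract_state
```

Before the synchronization has been started, the reported state should be IDLE.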
Output:
We can run the synchronization:
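A minimal sketch of the start request, again assuming the default API address. The source and destination values refer to the cluster0 and cluster1 entries of the configuration file; the mapping of cluster0 to production is an assumption of this example.

```shell
# Assumption: mongosync listens on its default port, 27182, on this host.
MONGOSYNC_API="${MONGOSYNC_API:-http://localhost:27182}"

# cluster0 (assumed to be production) is the source; cluster1 (DR) is the destination.
START_PAYLOAD='{"source": "cluster0", "destination": "cluster1"}'

# Ask mongosync to begin synchronizing from source to destination.
start_sync() {
  curl -sS -X POST "$MONGOSYNC_API/api/v1/start" --data "$START_PAYLOAD"
}

# Usage (requires a running mongosync in the idle state):
#   start_sync
```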
Output:
We can also keep track of the synchronization status:
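As a rough sketch of how the status could be tracked from a script, the loop below polls the same progress endpoint until the response reports canCommit as true. The helper names and the default 10-second interval are choices made for this example; note that the loop keeps retrying if mongosync is unreachable.

```shell
# Assumption: mongosync listens on its default port, 27182, on this host.
MONGOSYNC_API="${MONGOSYNC_API:-http://localhost:27182}"

# Hypothetical helper: succeed if the JSON on stdin contains "canCommit": true.
can_commit() {
  grep -q '"canCommit"[[:space:]]*:[[:space:]]*true'
}

# Poll the progress endpoint every N seconds (default 10) until canCommit is true.
wait_until_can_commit() {
  interval="${1:-10}"
  until curl -sS -X GET "$MONGOSYNC_API/api/v1/progress" | can_commit; do
    sleep "$interval"
  done
  echo "canCommit is true"
}

# Usage (requires a running sync):
#   wait_until_can_commit 30
```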
Output:
At this point, the DR environment is aligned with the production environment and will stay synchronized as further operations are performed:
And our second cluster is now in sync with the following data.
Armed with what we've discussed so far, we could ask a last question like:
Is it possible to take advantage of the disaster recovery environment in some way, or should we just let it synchronize?
By making the appropriate mongosync configurations (for example, by setting the "buildIndexes" option to "never" and omitting the "enableUserWriteBlocking" parameter, which is set to false by default), we can take advantage of the fact that users and roles are not synchronized to create read-only users on the destination. Because these users cannot write to the destination, consistency between the origin and destination clusters is preserved, and the disaster recovery environment can be used to create the indexes needed to optimize slow queries identified in the production environment.
Live migrate is a tool that allows users to perform migrations to MongoDB Atlas. More specifically, as the official documentation puts it, it is a process that uses mongosync as the underlying data migration tool, enabling faster live migrations with less downtime if both the source and destination clusters are running MongoDB 6.0.8 or later.
So, what is the added value of this tool compared to mongosync?
It brings two advantages:
- You can avoid the need to provision and configure a server to host mongosync.
As we have seen, these tools, with proper care, allow us to achieve our goals while also providing us with a certain flexibility.
Regardless of which solution you use for migration or synchronization, you can contact MongoDB support, who will help you identify the best strategy for the task.