Keeping Data in Sync Anywhere with Cluster-to-Cluster Sync

Cesar Rojas and Alan Zheng


For over a decade, MongoDB users have been deploying clusters for some of their most important workloads. Our customers run MongoDB in a variety of environments, and three of them stand out:

Globally distributed cloud clusters (Atlas and self-managed): Enterprises have been successfully running cloud-based applications, in multiple zones and regions, for 10-plus years. More recently, the deployment of globally distributed multi-cloud data clusters has provided tremendous value and flexibility for modern applications. The last two years of the pandemic resulted in an accelerated proliferation of cloud data clusters to support new application services and workloads.

On-premises clusters: Many leading companies and government institutions remain reliant on their on-premises systems for various reasons, including regulatory compliance, data governance, existing line-of-business application integrations, or legacy investments.

Edge clusters: Organizations also distribute workloads to edge systems to bring enterprise applications closer to data sources, such as local edge servers ingesting sensor data from IoT devices. This proximity to data at its source can deliver substantial business benefits, including improved response times and faster insights.

Keeping hybrid data clusters in sync is challenging

Because applications evolve and their data originates in many different places, maintaining data stores in hybrid environments (that is, distributing data between different environments or between multiple clusters in a single environment) can be challenging. As application owners innovate and expand to new data environments, a big part of their success will depend on effective data synchronization between their clusters. Cluster data synchronization requires:

  • Support for globally distributed hybrid data clusters. All cluster data must be synchronized between different types of clusters.

  • Continuous synchronization. Support for a nonstop stream of data that flows across cluster deployments and is accessible to apps connecting to any of those deployments.

  • Resumability. The ability to pause and resume data synchronization from where you left off.

The need for a hybrid, inter-cluster data sync

By default, a MongoDB cluster allows you to natively distribute and synchronize data globally within a single cluster. We automate this intra-cluster movement of data using replica sets and sharded clusters. These two configurations let you replicate data across multiple zones, geographical regions, and even multi-cloud configurations.

But there are occasions when users want to go beyond a single MongoDB cluster and synchronize data to a separate cluster (inter-cluster synchronization) for use cases such as:

  • Migrating to MongoDB Atlas

  • Creating separate development and production environments

  • Supporting DevOps strategies (e.g., blue-green deployments)

  • Deploying dedicated analytics environments

  • Meeting locality requirements for auditing and compliance

  • Maintaining preparedness for a stressed exit (e.g., reverse cloud migration)

  • Moving data to the edge

Introducing Cluster-to-Cluster Sync

We designed Cluster-to-Cluster Synchronization to solve the challenges of inter-cluster data synchronization. It provides you with continuous unidirectional data synchronization of two MongoDB clusters (source to destination) in the same or hybrid environments.

Diagram of Cluster-to-Cluster Sync

With Cluster-to-Cluster Sync, you have full control of your synchronization process by deciding when to start, stop, pause, resume, or reverse the direction of synchronization. You can also monitor the progress of the synchronization in real time.
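
As a rough illustration of what this control looks like in practice, the sketch below prepares HTTP requests for the API that the mongosync tool exposes. Nothing in it comes from this article: the default port (27182), the `/api/v1` endpoint names (`start`, `pause`, `resume`, `commit`, `reverse`, `progress`), and the `cluster0`/`cluster1` labels are taken from the mongosync documentation as we understand it and should be verified against the version you run. The helper names `build_request`, `start_request`, and `send` are hypothetical.

```python
import json
import urllib.request

# Assumption: mongosync is running locally and serving its HTTP API on
# its default port. Adjust the host and port for your deployment.
BASE_URL = "http://localhost:27182/api/v1"


def build_request(action, payload=None):
    """Build (but do not send) a request for one mongosync API endpoint."""
    if action == "progress":
        # Progress is read-only, so it is a plain GET with no body.
        return urllib.request.Request(f"{BASE_URL}/progress", method="GET")
    # Control actions are POSTs with a JSON body (empty object if no options).
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{action}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_request():
    """Request that begins syncing from cluster0 to cluster1."""
    # cluster0 and cluster1 are the labels attached to the two connection
    # strings when mongosync was launched.
    return build_request("start", {"source": "cluster0", "destination": "cluster1"})


def send(request):
    """Send a prepared request and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a mongosync process running, `send(start_request())` would begin synchronization, `send(build_request("pause"))` and `send(build_request("resume"))` would pause and resume it, and `send(build_request("progress"))` would return the current state. Reversing direction may have additional preconditions, so check the documentation before relying on `build_request("reverse")`.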

Availability

Cluster-to-Cluster Sync is now generally available as part of MongoDB 6.0. Currently, Cluster-to-Cluster Sync is compatible only with source and destination clusters that are both running MongoDB 6.0 or later.

What's next?

To get started with Cluster-to-Cluster Sync, you need mongosync, a downloadable and self-hosted tool that enables data movement between two MongoDB clusters.
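
For orientation, here is a minimal sketch of how a mongosync launch command might be assembled. It assumes the `--cluster0` and `--cluster1` connection-string options described in the mongosync documentation; the function name `mongosync_command` and the example hostnames are placeholders, not values from this article.

```python
import shlex


def mongosync_command(cluster0_uri, cluster1_uri, binary="mongosync"):
    """Return the argument list for launching mongosync between two clusters."""
    # cluster0 and cluster1 are only labels here; which one acts as the
    # source is chosen later, when synchronization is started.
    return [binary, "--cluster0", cluster0_uri, "--cluster1", cluster1_uri]


# Placeholder connection strings; substitute your own.
argv = mongosync_command(
    "mongodb://source.example.net:27017",
    "mongodb://destination.example.net:27017",
)

# Print the command in copy-and-paste form for a shell.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run` would launch the tool from Python. Keeping the command as a list rather than a single string avoids shell-quoting problems with connection strings that contain special characters.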

Get started today: