Database migration from standalone machine to cluster

Hi all,

Currently I have a production, live, standalone MongoDB server connected to an Express/Node.js application. I have now set up a replica set (3 machines). How do I migrate the data from this live DB to my replica set without any loss of data?

Hi @Stuart_S and welcome to the MongoDB Community forum!!

There could be two different ways to migrate from a standalone to a replica set.

Case 1: If you have no data in your database and only have a deployment:

The steps are as follows:

  1. Shut down the standalone database.
  2. Restart the process using --replSet in your deployment.
  3. Add secondaries to the primary mongod process.
  4. Connect your application to the primary of the replica set (see the sketch after this list).
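As a rough illustration only (the host names, dbPath and the replica set name `rs0` below are my own assumptions, not from the documentation), the conversion looks roughly like this:

```sh
# Hypothetical sketch of Case 1; host names, dbPath and rs0 are assumptions.
mongosh --eval 'db.adminCommand({ shutdown: 1 })'               # 1. shut down the standalone
mongod --replSet rs0 --dbPath /data/db --bind_ip_all            # 2. restart with --replSet
mongosh --eval 'rs.initiate()'                                  #    initiate the replica set
mongosh --eval 'rs.add("host2:27017"); rs.add("host3:27017")'   # 3. add the secondaries
# 4. point the application at the replica set connection string, e.g.
#    mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0
```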

Case 2: If you have a large collection in your database in the standalone deployment:

The MongoDB tool mongoexport will export a collection into a format of your choice (JSON or CSV), which you can then load with mongoimport.
Alternatively, you can also use mongodump and mongorestore for the process.

However, please note that mongoexport and mongoimport do not export or import indexes. Hence you would be required to recreate the indexes yourself if you use the former method.
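For example, a minimal mongodump/mongorestore sketch (the host names, replica set name and dump path below are assumptions, not from the docs) could look like:

```sh
# Hypothetical sketch: dump the standalone, then restore into the replica set.
# mongodump preserves index definitions, unlike mongoexport/mongoimport.
mongodump --uri="mongodb://old-host:27017" --out=/backup/dump
mongorestore --uri="mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0" /backup/dump
```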

Also, please refer to the documentation on how to convert a standalone to a replica set for further information.

Let us know if you have any further queries.

Best Regards
Aasawari


But this page says the instance can just be restarted with a replica set name; there is no mention of data import/export at all.

Thanks @Aasawari for the answer.
My situation is Case 2, but for live data, and data that will keep coming in continuously, which is better: mongoexport or mongodump?

Hi @Stuart_S

For live data migration, you need to convert the standalone deployment to a single-node replica set and then pull the data into Atlas.

Please note that, as per the documentation, you cannot use M0/M2/M5 shared-tier clusters as the source or destination for live migration.

To add more details to the above information,

  1. If your data size is large, you can resync a replica set member to speed up the process (a rough sketch follows this list).
  2. If you wish to move the data to Atlas, you can follow the documentation for that.
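A minimal resync sketch, assuming the secondary being resynced runs on port 27018 with dbPath /data/rs2 and a config file at /etc/mongod-rs2.conf (all of these names are illustrative):

```sh
# Hypothetical sketch of resyncing one secondary by forcing an initial sync.
mongosh --port 27018 --eval 'db.adminCommand({ shutdown: 1 })'   # stop the member
rm -rf /data/rs2/*                                               # wipe its data files
mongod --config /etc/mongod-rs2.conf                             # restart; the member performs an automatic initial sync
```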

Please note that the process can potentially be disruptive, so the recommendation is to test it against your workload before performing it on the production environment.

Let us know if you have further questions.

Best Regards
Aasawari

Another alternative:
Just export the data to JSON, the indexes as JSON, and so on, and then upload the JSON documents to the new MongoDB cluster.
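A rough sketch of the index part, assuming mongosh, an example collection called `orders`, and illustrative host names (none of this is from the original post):

```sh
# Hypothetical sketch: capture index definitions as JSON, then recreate them
# on the new cluster. The URIs and the "orders" collection are assumptions.
mongosh "mongodb://old-host:27017/mydb" --quiet \
  --eval 'EJSON.stringify(db.orders.getIndexes(), null, 2)' > orders_indexes.json

mongosh "mongodb://new-cluster:27017/mydb" --quiet --eval '
  const fs = require("fs");
  const specs = EJSON.parse(fs.readFileSync("orders_indexes.json", "utf8"))
    .filter(ix => ix.name !== "_id_")        // the _id index is created automatically
    .map(({ v, ns, ...spec }) => spec);      // drop metadata fields createIndexes does not need
  if (specs.length) db.runCommand({ createIndexes: "orders", indexes: specs });
'
```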

If it’s a large amount of data, just export to JSON files in batches of appropriate sizes, and then import the JSON files into the new DB. I literally migrated a 600GB 8-node cluster running 4.2 on premises to a 10-node cluster running 6.0 for a friend’s engineering firm doing this.
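For the batching, one possible approach (the collection name, batch size and URIs below are purely illustrative) is to page through the collection with mongoexport and load each file with mongoimport:

```sh
# Hypothetical batching sketch: export a large collection in fixed-size chunks
# ordered by _id, then import each chunk into the new cluster.
BATCH=500000
for i in 0 1 2 3; do
  mongoexport --uri="mongodb://old-host:27017/mydb" --collection=orders \
    --sort='{"_id": 1}' --skip=$((i * BATCH)) --limit=$BATCH \
    --out="orders_batch_$i.json"
done

for i in 0 1 2 3; do
  mongoimport --uri="mongodb://new-cluster:27017/mydb" --collection=orders \
    --file="orders_batch_$i.json"
done
```

Skip/limit paging gets slow on very large collections; ranging over `_id` with `--query` is usually faster, but the idea is the same.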

Zero downtime, zero shutdowns, zero data lost; it was practically brainless. After confirming all the aggregations, indexes, queries, etc. were all there, we connected the servers to the new cluster and nuked the Kubernetes containers running the 4.2. 100% successful migration in 20GB batches, and it only took 3 hours.

You don’t have to get extreme and complicated with processes that can just be easy if you want them to be.

EDIT:
The 3 hours wasn’t because it takes 3 hours to export 600GB of data and configs; that only took 40 minutes. It took 3 hours because we had to troubleshoot Kubernetes and Docker issues, and because the replication needed to happen at a backup location in Europe, so we had to wait for the data to upload to that cluster as well over a slow VPN. Otherwise the entire process wouldn’t have taken even 1.5 hours, if that.

EDIT:
And with this exact method, I’m literally volunteering my time to oversee a 6.3TB migration this weekend. We’re going to export the JSON files to an external SSD and upload the JSON from those same SSDs to the new clusters. Upgrade/migration: going from several 4.0 and 4.2 deployments to several different 6.0 clusters (breaking things up into smaller 3-node sharded clusters instead of giant monolithic clusters). Zero downtime planned, zero production disruption, and then we’re nuking the entire old server rack, pending whether or not I can take the rack and old Dell PowerEdges for my lab.

Don’t overthink things and don’t push for complications; if there’s an easy, very safe method that meets your needs and you’re willing to do it, then well, just do it. But this is just another alternative, @Stuart_S.

@Stuart_S

Other options:
GraphQL API (my personal favorite, and it’s stupid easy): if you’re on premises, install the Apollo GraphQL server alongside your on-premises MongoDB cluster and you can route Apollo GraphQL to Atlas (unless the Atlas GraphQL API is broken). Or if it’s Atlas to Atlas, GraphQL to GraphQL, by building an API with it to just send and receive the data.

Atlas Functions HTTP service (REST API)

There are a bunch of other options to do this, just listing a couple of more for you.

@Brock
Could you please explain how to export and then upload the data?
Do you mean the mongoexport and mongoimport tools?
Are there any compatibility issues when exporting a 4.4 database to 6.0?
Thanks.