How to copy a Mongo database to another, the fast way

Hello everyone,

I am working on a Node.js/Express/Mongo app.

I have 2 environments:

  • production environment, which connects to a ‘production’ Mongo database

  • development environment, which connects to a ‘development’ Mongo database

Both databases ‘production’ and ‘development’ are hosted in Mongo Cloud Atlas.

Note: today, the two databases are in the very same cluster, but I could create a second cluster if needed to solve my issue.

I regularly have to fix bugs, and every time it happens, I want to copy the data from the production database to the development database, so that I can reproduce in my dev environment the issue seen in production.

My question is: how can I copy the production database to the dev database (replacing the data in the dev database) in the fastest way possible?

Today, I am exporting/importing the collections one by one (via JSON files), which is obviously not satisfactory.

I also managed to write a script that I run on Mongo Playground in VSCode.

But I have a lot of collections, some of them are a bit “fat”, so the script is extremely long (=> not an acceptable solution).

I suppose I am running into a very common problem, and that there should be a standard solution here.

How should I handle this?

If you are using Atlas, have you tried to talk to Mongo about this? We have the same routine, where we replace Dev from Prod on a regular basis. I believe it’s done at a server level using the toolset that’s part of the Atlas infrastructure.
We have a pretty large replicated cluster and it does not take that long to do the refresh.

At the very least I’d not use JSON, as you could lose data typing. Use mongodump and mongorestore instead, unless you have a very good reason not to.

Sorry I don’t have more detail. Our DBA team deals with this, but I know they do not use dump/restore; they do it at a server level and without custom tooling, or so I believe.

Hello John_Sewell, Thank you for your reply,

I wouldn’t mind using mongodump, but only if I see a difference in execution time for the transfer.

But a solution that doesn’t require me to temporarily store my data locally would be a better solution for me, I think.

You mentioned talking to Mongo about this. How can I contact them?

Since the two databases are in the very same cluster, you may use the aggregation framework on all the collections from one database with an $out into the other database.


Oops, missed that. Yes that would be very quick, just an $out to copy data from one database to another if they are both on the same cluster! Then the data does not need to leave the server.
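
For anyone landing here later, this is a minimal sketch of the $out approach with the Node.js driver. It assumes both databases are on the same cluster and that the server is MongoDB 4.4 or newer, since the { db, coll } form of $out needs that. The function name copyDatabase and the database names are my own placeholders, not something from the posts above.

```javascript
// Copy every regular collection from one database to another on the same
// cluster. `client` is expected to be a connected MongoClient from the
// official `mongodb` Node.js driver (assumption: server version >= 4.4).
async function copyDatabase(client, sourceDbName, targetDbName) {
  const source = client.db(sourceDbName);
  const collections = await source.listCollections().toArray();
  const copied = [];

  for (const { name, type } of collections) {
    // Skip views, time series and internal collections; $out cannot
    // sensibly recreate those from a plain copy.
    if (type !== 'collection' || name.startsWith('system.')) continue;

    await source
      .collection(name)
      .aggregate([{ $out: { db: targetDbName, coll: name } }])
      .toArray(); // draining the cursor is what actually runs the pipeline

    copied.push(name);
  }
  return copied;
}

// Usage sketch (requires `npm install mongodb`; the URI is a placeholder):
//   const { MongoClient } = require('mongodb');
//   const client = await MongoClient.connect(process.env.MONGODB_URI);
//   await copyDatabase(client, 'production', 'development');
//   await client.close();
```

$out replaces each target collection with the result, so no separate drop step is needed for collections that exist in production. Collections that exist only in development are left untouched by this sketch.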

Hello,

I was able to try mongodump with mongorestore, and it took me about 2 minutes to transfer the data. The problem is that I haven’t yet found a way to avoid dropping the data in my development environment before I can update it.

Then I followed your other suggestion with $out, and it took me much less time to transfer the data. However, I don’t think my boss will like this method. It forces me to write a script. Unless there’s another way of using $out?

What a stupid boss, if he does not like that you automate a process with a script.

What a stupid boss, if he prefers that you use a manual operation for a repetitive task.

What a stupid boss, if he does not like the fact that you are selecting the best solution for your current setup and use-case.

He must have reached his highest level on the Peter Principle.

Please calm down, I could be wrong about what he told me too. It seemed to me that he wanted something other than a script, if that’s the best solution then I obviously want to believe him and he’ll believe it too. In any case, I lack knowledge of MongoDB and that’s why I’ve come to you. But please, be respectful.

I am respectful to you. I am not with your boss if he does not let you implement the best solution.

The fastest way would be to use Atlas Cloud Backups and restore a backup to your Development Cluster. I’m not sure why you are keeping your production and development data in the same Cluster, unless you are not concerned about performance.


mongorestore can drop the target collections before restoring (its --drop option), but if both are on the same cluster then it just makes sense to use the $out option.

There are also index-related options that you may want to take a look at, such as mongorestore’s --noIndexRestore.

If you’re doing a drop and restore, that’s going to be a script anyway surely? You’re not going to type in the command manually each time you want to do this? You could just put the mongosh call with the command to $out between databases in a one line batch / shell file.

If you’re possibly going to split up the prod and dev then it makes sense to get something in place that could work in the future and not have to be re-engineered.

Also be wary of user setup and permissions if you have the prod and dev environments on the same cluster… seems to be a recipe for something bad to happen!


Another option to consider if you go the mongodump/mongorestore route, in order to avoid temporarily storing the data locally:

Use the option --archive without a filename to send the mongodump output to stdout, and then pipe it to mongorestore with --archive again to read the dump from stdin.
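
Here is a rough sketch of that pipe as a Node.js script, in case a script is acceptable after all. It assumes mongodump and mongorestore (MongoDB Database Tools) are on the PATH and that the URIs do not name a conflicting database. The function names buildArgs and refreshDev are hypothetical, and the --drop, --nsFrom and --nsTo options are my additions to replace the dev collections and rename the database during the restore.

```javascript
const { spawn } = require('child_process');

// Build the argument lists for both tools. All values are placeholders
// supplied by the caller; nothing is written to local disk.
function buildArgs({ sourceUri, targetUri, sourceDb, targetDb }) {
  return {
    // --archive without a filename makes mongodump write to stdout
    dump: [`--uri=${sourceUri}`, `--db=${sourceDb}`, '--archive'],
    // --archive without a filename makes mongorestore read from stdin
    restore: [
      `--uri=${targetUri}`,
      '--archive',
      `--nsFrom=${sourceDb}.*`,
      `--nsTo=${targetDb}.*`,
      '--drop', // drop each target collection before restoring it
    ],
  };
}

// Resolve when a child process exits with code 0, reject otherwise.
function exited(proc, name) {
  return new Promise((resolve, reject) => {
    proc.on('error', reject);
    proc.on('close', (code) =>
      code === 0 ? resolve() : reject(new Error(`${name} exited with code ${code}`))
    );
  });
}

// Stream the dump straight into the restore.
function refreshDev(options) {
  const { dump, restore } = buildArgs(options);
  const dumpProc = spawn('mongodump', dump, { stdio: ['ignore', 'pipe', 'inherit'] });
  const restoreProc = spawn('mongorestore', restore, { stdio: ['pipe', 'inherit', 'inherit'] });
  // Ignore broken-pipe errors here; the exit codes report the real failure.
  restoreProc.stdin.on('error', () => {});
  dumpProc.stdout.pipe(restoreProc.stdin);
  return Promise.all([exited(dumpProc, 'mongodump'), exited(restoreProc, 'mongorestore')]);
}
```

The data still travels through the machine running the script, so this will usually be slower than $out on a shared cluster, but it keeps working if prod and dev are later split into separate clusters.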
