production environment, which connects to a ‘production’ Mongo database
development environment, which connects to a ‘development’ Mongo database
Both the ‘production’ and ‘development’ databases are hosted in MongoDB Atlas.
Note: today the two databases are in the very same cluster, but I could create a second cluster if that is needed to solve my issue.
I regularly have to fix bugs, and every time this happens I want to copy the data from the production database to the development database, so that I can reproduce in my dev environment the issue seen in production.
My question is: how can I copy the production database to the dev database (replacing the data in the dev database) as fast as possible?
Today I export/import (via JSON files) the collections one by one, which is obviously not satisfactory.
I also managed to write a script that I run as a MongoDB Playground in VS Code.
But I have a lot of collections, some of them fairly large, so the script takes extremely long to run (not an acceptable solution).
I suppose I am running into a very common problem, and that there should be a standard solution here.
If you’re using Atlas, have you tried talking to MongoDB about this? We have the same routine, where we refresh Dev from Prod on a regular basis. I believe it’s done at a server level, using the toolset that’s part of the Atlas infrastructure.
We have a pretty large replicated cluster and it does not take that long to do the refresh.
At the very least, I’d not use JSON, as you could lose data typing; use mongodump and mongorestore instead, unless you have a very good reason not to.
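In case it’s useful, here is a rough sketch of what that could look like (the cluster URI, credentials and dump path are placeholders, so adjust them to your setup):

  # dump the 'production' database to a local folder
  mongodump --uri="mongodb+srv://myUser:myPassword@mycluster.example.mongodb.net" --db=production --out=./prod-dump

  # restore it into the 'development' database, renaming the namespaces on the way
  mongorestore --uri="mongodb+srv://myUser:myPassword@mycluster.example.mongodb.net" --nsFrom="production.*" --nsTo="development.*" ./prod-dump

Unlike a JSON export/import, the dump keeps the BSON types intact.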
Sorry I don’t have more detail; our DBA team deals with this, but I know they do not use dump/restore. They do it at a server level and without custom tooling, or so I believe.
Oops, missed that. Yes, that would be very quick: just an $out to copy data from one database to another if they are both on the same cluster! Then the data does not need to leave the server.
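As a rough sketch (the collection name and connection string are made up, and the db/coll form of $out needs MongoDB 4.4 or newer):

  # run against the 'production' database; replaces development.orders with the contents of production.orders
  mongosh "mongodb+srv://myUser:myPassword@mycluster.example.mongodb.net/production" --quiet --eval '
    db.orders.aggregate([ { $out: { db: "development", coll: "orders" } } ])
  '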
I was able to try mongodump with mongorestore, and it took me about 2 minutes to transfer the data. The problem is that I haven’t yet found a way to avoid having to drop the data in my development environment first before I can load the new data.
Then I followed your other suggestion with $out, and it took me much less time to transfer the data. However, I don’t think my boss will like this method. It forces me to write a script. Unless there’s another way of using $out?
Please calm down, I could be wrong about what he told me too. It seemed to me that he wanted something other than a script; if that is the best solution then I’m happy to believe you, and he will too. In any case, I lack knowledge of MongoDB, and that’s why I’ve come to you. But please, be respectful.
The fastest way would be to use Atlas Cloud Backups and restore a backup to your Development Cluster. I’m not sure why you are keeping your production and development data in the same Cluster, unless you are not concerned about performance.
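If you do end up with separate clusters, the Atlas CLI can drive that kind of restore from a script. A rough, unverified sketch (project ID, cluster names and snapshot ID are placeholders; double-check the exact commands and flags against the Atlas CLI help for your version):

  # list the recent cloud backup snapshots of the production cluster
  atlas backups snapshots list ProdCluster --projectId <projectId>

  # restore a chosen snapshot onto the development cluster
  atlas backups restores start automated --clusterName ProdCluster --snapshotId <snapshotId> --targetClusterName DevCluster --targetProjectId <projectId>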
Restore can drop the target of the restore, but if both are on the same cluster then it just makes sense to use the $out option.
There are also options to do with indexes that you may want to take a look at.
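For example, something along these lines is what I mean (the URI, credentials and dump path are placeholders; check the flag names against your tool versions):

  # --drop removes each existing dev collection before restoring into it,
  # and --noIndexRestore skips rebuilding the indexes stored in the dump
  mongorestore --uri="mongodb+srv://myUser:myPassword@mycluster.example.mongodb.net" --nsFrom="production.*" --nsTo="development.*" --drop --noIndexRestore ./prod-dump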
If you’re doing a drop and restore, that’s going to be a script anyway, surely? You’re not going to type in the command manually each time you want to do this. You could just put the mongosh call with the command to $out between databases in a one-line batch/shell file.
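For instance, something like this (an untested sketch: the URI and credentials are placeholders, and the db/coll form of $out assumes MongoDB 4.4+):

  #!/bin/sh
  # copy-prod-to-dev.sh: overwrite every 'development' collection with its 'production' counterpart
  mongosh "mongodb+srv://myUser:myPassword@mycluster.example.mongodb.net/production" --quiet --eval '
    db.getCollectionNames().forEach(function (name) {
      // $out replaces the target collection; collections that only exist in development are left as they are
      db.getCollection(name).aggregate([ { $out: { db: "development", coll: name } } ]);
    });
  '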
If you’re possibly going to split up Prod and Dev later, then it makes sense to get something in place now that could still work in the future and not have to be re-engineered.
Also be wary of user setup and permissions if you have the prod and dev environments on the same cluster… it seems like a recipe for something bad to happen!