How to restore data into a sharded cluster?

Allwyn_Jesu · May 26, 2022, 4:37pm

Hi,

I have taken a data dump from a standalone instance using mongodump. Now I want to restore that data into a sharded cluster.

Please note that the data I will restore already contains the shard key. For the restore, should the mongorestore command point to one of the routers or to the individual shards? What are the best practices to follow for mongodump and mongorestore?

Thanks

tapiocaPENGUIN · May 26, 2022, 9:21pm

You will want to restore to a mongos.

mongorestore --host <host> --port <port> -u <username> --authenticationDatabase admin /path/to/file

The data won't be sharded when it is restored. It will all sit on a single shard (the database's primary shard), so you will need to enable sharding once the data is restored.
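
A minimal sketch of that follow-up step, assuming a hypothetical namespace mydb.orders and shard key customerId (neither comes from this thread). The function only prints the mongosh commands so you can review them before running them against a mongos:

```shell
#!/bin/sh
# Hypothetical namespace and shard key; substitute your own.
NS="mydb.orders"
SHARD_KEY="customerId"

# Print the mongosh commands that shard an already-restored collection.
# A populated collection needs an index that starts with the shard key
# before sh.shardCollection() will accept it, hence the createIndex() call.
shard_after_restore_js() {
  ns="$1"; key="$2"
  db="${ns%%.*}"     # text before the first dot: database name
  coll="${ns#*.}"    # text after the first dot: collection name
  printf 'sh.enableSharding("%s");\n' "$db"
  printf 'db.getSiblingDB("%s").getCollection("%s").createIndex({ "%s": 1 });\n' "$db" "$coll" "$key"
  printf 'sh.shardCollection("%s", { "%s": 1 });\n' "$ns" "$key"
}

shard_after_restore_js "$NS" "$SHARD_KEY"

# To apply the commands, pass them to a mongos (not to an individual shard):
#   mongosh --host <host> --port <port> -u <username> --authenticationDatabase admin \
#     --eval "$(shard_after_restore_js "$NS" "$SHARD_KEY")"
```

Once the collection is sharded, the balancer migrates chunks to the other shards in the background, which can take a while for a large data set.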

Stennie_X · May 26, 2022, 11:16pm

Hi @Allwyn_Jesu,

You definitely want to restore into a sharded cluster via mongos as @tapiocaPENGUIN suggested, but there are a few further details to be aware of.

More specifically: you should always mongorestore data into a sharded cluster using mongos so the cluster metadata is properly maintained. Inserting directly to a shard (bypassing mongos) is likely to cause operational issues.

Sharding information isn’t part of the metadata when you mongodump data from a sharded cluster.

However, mongorestore uses the sharding options already defined on the target collection. You can therefore shard the collection on your chosen key before restoring, which avoids some of the unnecessary rebalancing that would happen if the collection were sharded after all the data had been inserted.
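
As a rough sketch of that order of operations (shard the empty collection first, then restore through mongos), assuming a hypothetical namespace mydb.orders and shard key customerId. Both functions only print what would be run:

```shell
#!/bin/sh
# Hypothetical namespace and shard key; substitute your own.
NS="mydb.orders"
SHARD_KEY="customerId"

# Step 1: mongosh commands that shard the still-empty target collection.
pre_shard_js() {
  printf 'sh.enableSharding("%s");\n' "${1%%.*}"
  printf 'sh.shardCollection("%s", { "%s": 1 });\n' "$1" "$2"
}

# Step 2: the mongorestore invocation, pointed at a mongos.
# Arguments: host, port, username, dump path.
restore_cmd() {
  printf 'mongorestore --host %s --port %s -u %s --authenticationDatabase admin %s\n' \
    "$1" "$2" "$3" "$4"
}

pre_shard_js "$NS" "$SHARD_KEY"
restore_cmd "<host>" "<port>" "<username>" "/path/to/file"
```

Run the step 1 output in mongosh while connected to the mongos, then run the step 2 command with your real connection details.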

If you already know the distribution of shard key values and plan to mongorestore into an empty collection, you can also save some time by Pre-Splitting Chunks in a Sharded Cluster.
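
A small sketch of generating pre-split commands, assuming a numeric shard key and split points you have chosen from the known distribution. The namespace, key name, and values below are hypothetical. The collection must already be sharded and empty before the printed commands are run through mongos:

```shell
#!/bin/sh
# Print one sh.splitAt() call per split point.
# Arguments: namespace, shard key field, then any number of numeric split points.
presplit_js() {
  ns="$1"; key="$2"; shift 2
  for point in "$@"; do
    printf 'sh.splitAt("%s", { "%s": %s });\n' "$ns" "$key" "$point"
  done
}

# Hypothetical example: three split points produce four chunks.
presplit_js "mydb.orders" "customerId" 25000 50000 75000
```

String or compound shard keys would need the split point documents quoted or built differently.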

Regards,
Stennie
