replSet initial sync drop all databases

pruthvi_reddy · February 7, 2023, 10:17pm

We are trying to initialize a replica set from a snapshot file; all members started from the same data files.

I made sure the oplog on all three nodes was at the same level, and then ran:
rs.initiate(
  {
    _id: "rs.ucld",
    version: 1,
    members: [
      { _id: 0, host: "1a01.internal.us:27017" },
      { _id: 1, host: "1b01.internal.us:27017" },
      { _id: 2, host: "1c01.internal.us:27017" }
    ]
  }
)

Question: The data files were the same and the oplog record count was identical across the three nodes, so why, when we initiate the replica set, does the primary drop all data on the other two nodes and try to sync them? We can't go that route: our data files total 600GB, and the sync keeps failing midway even when we let it run. We are trying to avoid a new "Initial Sync" by using snapshot files.

These are the logs we received:
2023-02-07T21:41:22.128Z I NETWORK [conn18] received client metadata from 172.16.20.29:59406 conn18: { driver: { name: "NetworkInterfaceTL", version: "4.0.28" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2023-02-07T21:41:22.131Z I REPL [replexec-1] Member 1c01.internal.us:27017 is now in state STARTUP2
2023-02-07T21:41:22.132Z I ACCESS [conn18] Successfully authenticated as principal __system on local from client 172.16.20.29:59406
2023-02-07T21:41:22.132Z I REPL [replication-0] Starting initial sync (attempt 1 of 10)
2023-02-07T21:41:22.133Z I STORAGE [replication-0] Finishing collection drop for local.temp_oplog_buffer (5d742075-16fd-4f64-97f2-0f3e04d51200).
2023-02-07T21:41:22.136Z I STORAGE [replication-0] createCollection: local.temp_oplog_buffer with generated UUID: 254e5557-4968-41b7-90ed-b27c139d3a90
2023-02-07T21:41:23.143Z I REPL [replication-1] sync source candidate: 1c01.internal.us:27017
2023-02-07T21:41:26.423Z I REPL [replication-1] Initial syncer oplog truncation finished in: 3280ms
2023-02-07T21:41:26.423Z I STORAGE [replication-1] dropAllDatabasesExceptLocal 6

Tarun_Gaur · February 14, 2023, 6:59pm

Hello @pruthvi_reddy ,

I noticed from the logs you shared that you are running MongoDB v4.0, which has been out of support since April 2022. The oldest supported series is v4.2, so I suggest upgrading from v4.0 to v4.2. Please upgrade the cluster as a whole, not just the affected nodes, and take backups before starting the procedure.

As per the documentation, Restore a Replica Set from MongoDB Backups:

You cannot restore a single data set to three new mongod instances and then create a replica set. If you copy the data set to each mongod instance and then create the replica set, MongoDB will force the secondaries to perform an initial sync.

The procedures in the documentation describe the correct and efficient ways to deploy a restored replica set.
I recommend going through the documentation and following the advised steps.
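
As a rough sketch of the order those steps follow, as I understand the linked procedure: the replica set is initiated with only the restored node as a member, and the other nodes are added afterwards instead of being listed in the first rs.initiate() call. The host names and set name below are taken from your post; the helper function names are hypothetical and exist only for illustration. This assumes the restored node has had its old local database dropped and has been started with --replSet, and that before the second step the primary was shut down cleanly, its data directory was copied to the other two nodes, and all three mongod processes were started again.

```javascript
// Illustrative sketch only: helper names are hypothetical, hosts and set name
// are the ones from the question.
var SET_NAME = "rs.ucld";
var RESTORED_HOST = "1a01.internal.us:27017";
var OTHER_HOSTS = ["1b01.internal.us:27017", "1c01.internal.us:27017"];

// Build a config that contains ONLY the node holding the restored snapshot.
function singleMemberConfig(setName, host) {
  return { _id: setName, members: [{ _id: 0, host: host }] };
}

// Step 1: run in the mongo shell while connected to the restored node.
function initiateRestoredNode() {
  return rs.initiate(singleMemberConfig(SET_NAME, RESTORED_HOST));
}

// Step 2: run in the mongo shell on the primary, only after the other nodes
// have been started from a copy of the primary's data directory.
function addRemainingMembers() {
  return OTHER_HOSTS.map(function (host) {
    return rs.add(host);
  });
}
```

In the mongo shell you would call initiateRestoredNode() once, confirm with rs.status() that the node has become PRIMARY, do the file copy, and only then call addRemainingMembers(). If the other nodes are added with empty data directories instead, they will perform an initial sync, which is the other option the documentation describes.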

Regards,
Tarun
