Hello all,
I am a little stuck with a production database that I want to “secure” and “update”.
The current (crazy) situation is the following (don’t ask me why…):
- Database version: 3.6 (Yes I know…)
- Database size: 650 GB (950 GB on disk)
- Server structure:
  - 1 primary (correctly sized)
  - 1 secondary (a little smaller)
  - 1 tiny arbiter
- Data:
  - 1 database with… 40k+ collections and rising (a design issue on the app side, which creates one collection per IoT device)
- Not enough space on the primary to make a full copy of the data files
- The server cannot be stopped for long maintenance (max 30 minutes)
My first issue is that I cannot sync the secondary with the primary:
- The initial sync always fails and restarts from scratch (state STARTUP2).
- The speed is ~20 Mb/s (about 4.5 days for a full sync)
- From what I can see, index building takes an extremely long time
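For reference, here is a rough sanity check of that 4.5-day figure. It is only a sketch: the function name is made up for illustration, and it assumes decimal units (1 GB = 10^9 bytes) and a constant transfer rate. It tries both readings of the speed (megabits and megabytes per second) against both sizes quoted above.

```python
SECONDS_PER_DAY = 86_400


def sync_days(size_gb: float, rate: float, rate_in_bits: bool) -> float:
    """Days needed to copy size_gb gigabytes at `rate` mega(bits|bytes) per second.

    Assumes decimal units and a steady rate, which a real initial sync
    will not have (index builds and oplog application add time).
    """
    size_bytes = size_gb * 1e9
    bytes_per_second = rate * 1e6 / (8 if rate_in_bits else 1)
    return size_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for size_gb in (650, 950):
        for rate_in_bits in (True, False):
            unit = "Mb/s" if rate_in_bits else "MB/s"
            days = sync_days(size_gb, 20, rate_in_bits)
            print(f"{size_gb} GB at 20 {unit}: {days:.1f} days")
```

Only the combination of 950 GB and 20 megabits per second lands near the quoted figure (about 4.4 days). At 20 megabytes per second either size would copy in well under a day.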
My next steps will be:
- Add a second replica
- Set up a new staging/test environment
- Upgrade to 4.0 → 4.2 → 4.4 → 5.0 → (6.0) (the app-side connectors need to be updated)
- Rework the app to organize the collections better (~20 collections)
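To make the upgrade sequence explicit, here is a small sketch. It assumes that MongoDB must be upgraded one major release at a time, so 4.4 sits between 4.2 and 5.0. The release list and the function name are illustrative only.

```python
# Major release series in order; each upgrade is assumed to step
# through the next entry without skipping any.
RELEASES = ["3.6", "4.0", "4.2", "4.4", "5.0", "6.0"]


def upgrade_path(current: str, target: str) -> list[str]:
    """Return the releases to install, in order, to go from current to target."""
    start = RELEASES.index(current)
    end = RELEASES.index(target)
    if end < start:
        raise ValueError("target release is older than the current one")
    return RELEASES[start + 1 : end + 1]


if __name__ == "__main__":
    print(" → ".join(upgrade_path("3.6", "6.0")))
```

Each step would also need the feature compatibility version raised, and a healthy replica set before moving on, so the sync problem above has to be solved first.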
Do you have any idea how I can proceed, or how I can help the sync complete?
Error when it occurs:
E REPL [replication-27] Initial sync attempt failed -- attempts left: 9 cause: NetworkInterfaceExceededTimeLimit: error fetching oplog during initial sync :: caused by :: error in fetcher batch callback: Operation timed out
Thanks a lot for your help.