I run a MongoDB sharded cluster, v4.4.6-ent. The data size is about 80 GB. The cluster previously had 2 shards. After adding 2 more (4 shards now), it has been rebalancing data for more than 3 days and is still going, which has significantly increased the 90th-percentile latency of our online service.
Output of db.adminCommand( { balancerStatus: 1 } ):
> db.adminCommand( { balancerStatus: 1 } )
{ mode: 'full',
inBalancerRound: true,
numBalancerRounds: 321397,
ok: 1,
...
It seems that the cluster is still rebalancing data. Looking at the cluster CPU usage over the past few days:
The CPU usage can be divided into two phases:
- Moving chunks on 8/8 for 10 hours
- (Not sure) Rebalancing data/index in the cluster?
So I have several questions:
- What does the cluster do after moving chunks?
- How long will it take to balance the data? Or how can I see the current progress (the command above only shows that a balancer round is still in progress)?
- How can I minimize the impact of rebalancing on the online service after adding new shards?
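On the progress question, the only rough gauge I know of is comparing chunk counts per shard, since as far as I understand the 4.4 balancer evens out the number of chunks per collection. Below is a minimal sketch of that idea, not an official progress metric. The function name chunkSpread and the namespace "mydb.mycoll" are placeholders I made up. The aggregation in the comment assumes the 4.4 layout of config.chunks, where each chunk document has ns and shard fields.

```javascript
// Sketch: summarize how evenly chunks are spread across shards.
// Input: an array of { _id: <shard name>, chunks: <count> }, for example from
// this mongo shell query ("mydb.mycoll" is a placeholder namespace):
//   db.getSiblingDB("config").chunks.aggregate([
//     { $match: { ns: "mydb.mycoll" } },
//     { $group: { _id: "$shard", chunks: { $sum: 1 } } }
//   ]).toArray()
// Note: a shard that holds no chunks yet does not appear in that result,
// so a newly added, still-empty shard has to be added by hand with chunks: 0.
function chunkSpread(perShard) {
  if (perShard.length === 0) {
    return { total: 0, max: 0, min: 0, spread: 0, idealPerShard: 0 };
  }
  const counts = perShard.map(s => s.chunks);
  const total = counts.reduce((a, b) => a + b, 0);
  const max = Math.max(...counts);
  const min = Math.min(...counts);
  return {
    total: total,
    max: max,
    min: min,
    spread: max - min,                      // gap between fullest and emptiest shard
    idealPerShard: total / perShard.length  // count per shard if perfectly even
  };
}
```

If the spread keeps shrinking between two runs, migrations are presumably still making progress. If it stays flat while inBalancerRound is true, something else may be going on.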