Hello,
We have a two-shard cluster in our setup, and our shard key is {"a": 1, "b": 1}. We scheduled chunk balancing for one hour daily during less busy traffic hours. We want to measure the write throughput of the cluster.
To achieve this, we are inserting documents with GUIDs for fields “a” and “b.” However, we’ve encountered an issue where MongoDB is creating new chunks on only one shard. I can understand that until the first rebalance one shard might hold all the ranges, and because of that all writes go to a single shard. However, even after a substantial amount of data and enough chunks, and despite the rebalances, MongoDB continues to create new chunks on only one shard.
During the rebalancing process, MongoDB ensures that the two shards have an equal number of chunks, which seems to be working correctly. However, the problem persists as new chunks are still directed to a single shard, causing uneven data distribution.
We are puzzled by this behaviour, as it goes against our expectation of achieving a balanced distribution of chunks across both shards. We would appreciate any insights or suggestions to resolve this issue and achieve the desired even data distribution in our sharded cluster.
When does MongoDB create a new chunk? I know it splits a chunk when it exceeds a certain size, but I’ve noticed an increased chunk count while inserting data. What criteria trigger the creation of a new chunk, and how does MongoDB decide on which shard to create the chunk?
In my scenario, all the new chunks are being created in replica set-2, creating a performance bottleneck on that specific node.
I’d appreciate any insights or explanations to understand the chunk creation process better and address the performance issue. Thank you.
Outside of the maintenance window, can you run the following command? It checks whether the balancer is enabled. If it isn’t enabled, it will not balance the chunks:
sh.getBalancerState()
I’m assuming you will find that replica set-2 is the primary shard for the database. If you run sh.status(), the “databases” section shows the primary shard for each database. I believe that when the balancer is disabled the data is written to the primary shard, but I can’t find documentation to support this.
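To see how the chunks are actually spread, you could tally them per shard. This is only a rough sketch: countChunksByShard is a hypothetical helper name, and it assumes each chunk document has a "shard" field, as the documents in config.chunks do. The mongosh lines are left as comments because they need a live mongos connection. Without a filter they count chunks for every sharded collection, not just yours.

```javascript
// Hypothetical helper: tally chunk documents per shard.
// Assumes each element has a `shard` field, like documents in config.chunks.
function countChunksByShard(chunks) {
  const counts = {};
  for (const chunk of chunks) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// In mongosh, connected to a mongos (not executed here):
//   const chunks = db.getSiblingDB("config").chunks.find({}, { shard: 1 }).toArray();
//   printjson(countChunksByShard(chunks));

// Stand-alone example with made-up data:
const sample = [{ shard: "rs1" }, { shard: "rs2" }, { shard: "rs2" }];
console.log(countChunksByShard(sample)); // { rs1: 1, rs2: 2 }
```

Running this before and after the balancing window should show whether the counts diverge between windows and get evened out afterwards.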
I believe this is the expected outcome, because until the balancer runs the chunks won’t be even.
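On the question of where a new chunk ends up: a split does not move data, so both halves stay on the shard that owned the original chunk, and only a balancer migration moves a chunk to another shard. Here is a toy model of that, not MongoDB's real code. It uses a single string key instead of your compound {a, b} key, and findChunk, splitChunk, the shard names and the ranges are all made up for illustration.

```javascript
// Simplified, hypothetical model: one string key instead of the compound {a, b} key.
// Each chunk covers the range [min, max) and lives on exactly one shard.
function findChunk(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

// Splitting replaces one chunk with two halves that stay on the same shard.
function splitChunk(chunks, chunk, splitKey) {
  const i = chunks.indexOf(chunk);
  chunks.splice(
    i,
    1,
    { min: chunk.min, max: splitKey, shard: chunk.shard },
    { min: splitKey, max: chunk.max, shard: chunk.shard }
  );
}

const chunks = [
  { min: "", max: "8", shard: "rs1" },
  { min: "8", max: "\uffff", shard: "rs2" },
];

const target = findChunk(chunks, "c3f1"); // this key falls in the rs2 range
splitChunk(chunks, target, "c");
console.log(chunks.map((c) => c.shard)); // [ 'rs1', 'rs2', 'rs2' ]
```

So which shard gains new chunks between balancing windows depends on which ranges the inserted keys fall into.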