I want to configure ranged sharding, but I don’t have a suitable key with a good distribution.
So I plan to use the _id field, which stores yyyyMMddHmmss + an 11-digit random number, as the shard key.
For keys that increase monotonically, I have read that hashed sharding should be considered.
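
To illustrate the concern with monotonically increasing keys, here is a minimal Python sketch, not MongoDB code. It assumes a plain counter as a stand-in for the yyyyMMddHmmss prefix, four chunks, and MD5 as a stand-in hash; all of these names and values are hypothetical, and MongoDB's real hash function and chunk layout differ. It only compares where new inserts land when the chunk is chosen by key order versus by hash value.

```python
import hashlib
import random
from bisect import bisect_right
from collections import Counter

NUM_CHUNKS = 4  # hypothetical chunk count, for illustration only


def make_key(tick, rng):
    # A counter stands in for the timestamp prefix; the suffix is an 11-digit random number.
    return f"{tick:014d}{rng.randrange(10**11):011d}"


def range_chunk(key, split_points):
    # Ranged sharding: the chunk is chosen by where the key sorts.
    return bisect_right(split_points, key)


def hashed_chunk(key):
    # Hashed sharding: the chunk is chosen by where the key's hash falls.
    # MD5 is only a stand-in here, not MongoDB's actual hash function.
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return (h * NUM_CHUNKS) >> 64


rng = random.Random(0)

# Chunk boundaries derived from "old" data (ticks 0..999).
old_keys = sorted(make_key(t, rng) for t in range(1000))
split_points = [old_keys[i * len(old_keys) // NUM_CHUNKS] for i in range(1, NUM_CHUNKS)]

# "New" inserts all carry later timestamps (ticks 1000..1999).
new_keys = [make_key(t, rng) for t in range(1000, 2000)]

range_hits = Counter(range_chunk(k, split_points) for k in new_keys)
hash_hits = Counter(hashed_chunk(k) for k in new_keys)

print("ranged:", dict(range_hits))
print("hashed:", dict(hash_hits))
```

In this sketch every new key sorts after the last split point, so the ranged version sends all new inserts to the final chunk, while the hashed version spreads them over all four. The random suffix does not help the ranged case because the timestamp prefix dominates the sort order.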

I am curious how much load adding a new shard node causes when the collection uses hashed sharding.
The cluster currently consists of shard1 and shard2.
My assumption is that adding shard3 would cause heavy load, because hash keys would have to be reissued for all data. How much load should I expect?
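
To put a rough number on this: as far as I can tell from the MongoDB documentation, hashed sharding still stores data in chunks that each cover a range of hash values, so adding a shard should mean migrating some chunks to the new shard rather than recomputing hashes for every document. The following is a minimal Python sketch under that assumption. The 60 equal-sized chunks and the greedy `rebalance` helper are hypothetical and far simpler than the real balancer, which works on data size since 6.0.3 and throttles migrations.

```python
from collections import Counter


def rebalance(owner, shards):
    """Greedily move chunks from the fullest shard to the emptiest one
    until chunk counts differ by at most 1. Returns the number of moves."""
    moved = 0
    while True:
        counts = Counter({s: 0 for s in shards})
        counts.update(owner.values())
        fullest = max(shards, key=lambda s: counts[s])
        emptiest = min(shards, key=lambda s: counts[s])
        if counts[fullest] - counts[emptiest] <= 1:
            return moved
        # Pick any chunk on the fullest shard and hand it to the emptiest shard.
        chunk = next(c for c, s in owner.items() if s == fullest)
        owner[chunk] = emptiest
        moved += 1


# 60 equal-sized chunks (hash ranges) split evenly over shard1 and shard2.
owner = {chunk: ("shard1" if chunk % 2 == 0 else "shard2") for chunk in range(60)}

# shard3 joins empty; only chunk ownership changes, no key is rehashed.
moved = rebalance(owner, ["shard1", "shard2", "shard3"])
final_counts = Counter(owner.values())
moved_fraction = moved / len(owner)

print(moved, dict(final_counts), round(moved_fraction, 3))
```

Under these assumptions about one third of the data moves to shard3 and the rest stays where it is. The actual load would depend on data volume, network, and how the balancer schedules migrations, none of which this sketch models.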

I tested ranged sharding on MongoDB 6.0.12 with yyyyMMddHmmss + an 11-digit random number as the shard key.
shard1 holds a single large chunk, but the data itself is evenly distributed between the two shards.
The chunk counts look unbalanced while the data sizes look balanced, so I wonder whether this configuration will cause any problems.
I have heard that starting with version 6.0.3 the balancer works on data size instead of the number of chunks, and that larger chunks are fine as long as the data size per shard is uniform for the collection. Is it safe to use this configuration?
The distribution output from the test is as follows.
Shard shard1 at shard1/10.162.29.28:27201,10.162.29.31:27201
{
  data: '3.54GiB',
  docs: 5111802,
  chunks: 1,
  'estimated data per chunk': '3.54GiB',
  'estimated docs per chunk': 5111802
}
---
Shard shard2 at shard2/10.162.29.28:27202,10.162.29.31:27202
{
  data: '3.24GiB',
  docs: 4686028,
  chunks: 26,
  'estimated data per chunk': '127.8MiB',
  'estimated docs per chunk': 180231
}
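
The numbers above can be checked against the data-size rule with a small Python sketch. It assumes the threshold described in the MongoDB documentation for 6.0.3 and later (a collection counts as balanced when the data size difference between shards is less than three times the range size) and the default range size of 128 MB, treated here as MiB. Whether this cluster uses the default range size is an assumption, and `is_balanced` is a hypothetical helper, not a MongoDB API.

```python
MIB_PER_GIB = 1024


def is_balanced(shard_sizes_mib, range_size_mib=128, factor=3):
    """Return (spread, threshold, balanced) for per-shard data sizes in MiB.

    range_size_mib and factor are assumed defaults, not values read from the cluster.
    """
    spread = max(shard_sizes_mib.values()) - min(shard_sizes_mib.values())
    threshold = factor * range_size_mib
    return spread, threshold, spread < threshold


# Data sizes taken from the output above.
sizes = {
    "shard1": 3.54 * MIB_PER_GIB,
    "shard2": 3.24 * MIB_PER_GIB,
}

spread, threshold, balanced = is_balanced(sizes)
print(f"spread={spread:.1f} MiB, threshold={threshold} MiB, balanced={balanced}")
```

With these assumptions the difference between the shards is roughly 307 MiB, which is below the 384 MiB threshold, so the collection would count as balanced even though the chunk counts are 1 and 26.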