MongoDB is not distributing data evenly among shards

I have a database with 800k objects, and I set up 13 shard servers so the data can be accessed quickly. I assigned a letter to each object for use in sharding, for example shards: 'a' for the first object, shards: 'b' for the second object, and so on. The letters are spread evenly across the objects: 50k objects have shards: 'a', 50k objects have shards: 'b', and so on. I used the shards field as a hashed shard key, hoping to distribute the objects as evenly as possible across the 13 shard servers.

I sharded the collection with sh.shardCollection("test.testCollection", { "shards": "hashed" }), but the data went to only two of the 13 shard servers. Even between those two the distribution was uneven, with approximately 72% on one server and 28% on the other. I want the data to be evenly distributed among all 13 shard servers. Can you help me with this?

Shard a at a/127.0.0.1:21000,127.0.0.1:21001,127.0.0.1:21002
{
  data: '125.57MiB',
  docs: 227420,
  chunks: 1,
  'estimated data per chunk': '125.57MiB',
  'estimated docs per chunk': 227420
}
Shard k at k/127.0.0.1:23070,127.0.0.1:23071,127.0.0.1:23072
{
  data: '326.31MiB',
  docs: 576209,
  chunks: 1,
  'estimated data per chunk': '326.31MiB',
  'estimated docs per chunk': 576209
}

Object sample:

{
  "_id": {
    "$oid": "63dd7324289226c918818c55"
  },
  "Title": "",
  "Product": {
    "web1": {
      "Harry Potter and the Chamber of Secrets: 2/7 (Harry Potter 2)": {
        "Price": 15,
        "Url": "https://www.amazon.com/Harry-Potter-Chamber-Secrets-Book/dp/B017V4IPPO/ref=sr_1_2?crid=GCT8C7Z3Q4SE&keywords=Harry+Potter+and+the+Chamber+of+Secrets&qid=1676836656&sprefix=harry+potter+and+the+chamber+of+secrets%2Caps%2C230&sr=8-2",
        "Time": {
          "$date": {
            "$numberLong": "1676669514749"
          }
        }
      }
    }
  },
  "Category": [
    "Book",
    "Fantasy"
  ],
  "Time": {
    "$date": {
      "$numberLong": "1676669514749"
    }
  },
  "shards": "h"
}

Edit: The host has a Ryzen 9 5950X processor, 96 GB of RAM, and 3x SN850 SSDs.

I was able to speed up the process by adding the shard key to aggregate queries that use $text. However, I still haven't found out why the shards are not evenly distributed. My shard keys seem to be working correctly. I need someone who knows why the objects are not evenly distributed between the shards.

Hi @Dogan_Can_GAZAZ and welcome to the MongoDB community forum!!

A hashed shard key in a MongoDB sharded cluster can help achieve an even distribution of data among the shards when the values of the shard key field increase monotonically, that is, for fields whose values change at a steady rate, such as ObjectId values or timestamps.
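To illustrate why the choice of field matters for a hashed key, here is a toy Node.js simulation. It is only a sketch under stated assumptions and does not reproduce MongoDB's real hash function or chunk placement: `hash64` is a stand-in built from MD5, the hash space is assumed to be cut into 13 equal ranges (one per shard), the 16 letters are inferred from 800k objects at 50k per letter, and the document count is scaled down to 80,000 to keep the run short. It compares a key with 16 distinct values against a key that is unique per document.

```javascript
// Toy simulation only: NOT MongoDB's actual hashing or chunk assignment.
const crypto = require("crypto");

const NUM_SHARDS = 13;   // number of shards from the question
const NUM_DOCS = 80000;  // scaled-down stand-in for the 800k objects

// Stand-in hash (assumption): first 8 bytes of MD5 as an unsigned 64-bit integer.
function hash64(value) {
  const digest = crypto.createHash("md5").update(String(value)).digest();
  return digest.readBigUInt64BE(0);
}

// Assumption: the 64-bit hash space is split into NUM_SHARDS equal ranges,
// and each range belongs to one shard.
function shardFor(value) {
  const rangeSize = (1n << 64n) / BigInt(NUM_SHARDS);
  const index = Number(hash64(value) / rangeSize);
  return Math.min(index, NUM_SHARDS - 1); // clamp the few hashes past the last boundary
}

// Count how many documents land on each shard.
function distribution(keys) {
  const counts = new Array(NUM_SHARDS).fill(0);
  for (const key of keys) counts[shardFor(key)] += 1;
  return counts;
}

// Low-cardinality key: 16 letters, the same number of documents per letter.
const letters = "abcdefghijklmnop".split("");
const lowCardinality = Array.from({ length: NUM_DOCS }, (_, i) => letters[i % letters.length]);

// High-cardinality key: a hypothetical unique value per document.
const highCardinality = Array.from({ length: NUM_DOCS }, (_, i) => `doc-${i}`);

const lowCounts = distribution(lowCardinality);
const highCounts = distribution(highCardinality);

console.log("16 distinct key values:", lowCounts);
console.log("unique key values:     ", highCounts);
```

In this model every document with the same letter hashes to the same value, so all documents for a letter land on one shard, and with 16 letters over 13 shards at least one shard must hold two or more letters. The unique key spreads out far more evenly. This shows a general property of low-cardinality keys; it does not by itself explain why only two shards received data in the deployment above.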

To understand the issue better, it would help if you could share some information about the deployment:

  1. The output of sh.status()
  2. The field selected as the hashed shard key
  3. The reason for selecting that field as the hashed shard key
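One way to collect this information in a single step is sketched below. It is meant to be pasted into mongosh, and `gatherShardingInfo` is a hypothetical helper name, not part of MongoDB. It relies on the `sh.status()` and `getShardDistribution()` shell helpers and a `$group` aggregation, and assumes that in mongosh the two helpers return their results as values.

```javascript
// Sketch for mongosh. `db` and `sh` are the globals that mongosh provides;
// they are passed in as parameters so the helper does not depend on shell state.
function gatherShardingInfo(db, sh, collectionName, keyField) {
  const coll = db.getCollection(collectionName);
  return {
    // Cluster-wide view: shards, balancer state, chunk ranges per collection.
    status: sh.status(),
    // Per-shard data size, document count, and chunk count for this collection.
    distribution: coll.getShardDistribution(),
    // Number of documents per distinct value of the shard key field.
    keyCounts: coll
      .aggregate([
        { $group: { _id: "$" + keyField, count: { $sum: 1 } } },
        { $sort: { _id: 1 } }
      ])
      .toArray()
  };
}
```

With the names from the question, the call in mongosh would be gatherShardingInfo(db.getSiblingDB("test"), sh, "testCollection", "shards"). The keyCounts part shows how many distinct shard key values exist, which helps with points 2 and 3.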

Best Regards
Aasawari