Sharding cluster error

An error appears in the mongod system log:

{"t":{"$date":"2021-12-06T07:30:26.676+00:00"},"s":"W", "c":"SHARDING", "id":23777, "ctx":"MoveChunk","msg":"Error while doing moveChunk","attr":{"error":"OperationFailed: Data transfer error: ExceededTimeLimit: Failed to delete orphaned .colletion range [{ field: \"value1\" }, { field: \"value2\" }) :: caused by :: operation exceeded time limit"}}

I don't know what caused this problem. My uncertain guess is that there are too many documents corresponding to the same shard key.
The number of documents corresponding to value1 is 150,000, while value2 has only 6.

Can someone provide some explanation or guidance? Thanks.

Hi @jiang_xu1 and welcome to the MongoDB community :muscle: !

Do you have a jumbo chunk issue on this collection? What’s the shard key? How many chunks do you have in this collection and how big is it in total?

Cheers,
Maxime.

Hello, thanks for the reply.
I built a cluster with two shards (shard-A, shard-B), sharded "collectionA" using the non-unique shard key "field1", and observed the above error:

"Failed to delete orphaned databaseA.collectionA range [{ field1: "value1" }, { field1: "value2" })"

CollectionA status:

chunkMetadata: [{ shard: 'shard-A', nChunks: 129 }, { shard: 'shard-B', nChunks: 129 }],
count: 16891443,
size: 18.04GB,
storageSize: 8.74GB,
totalIndexSize: 3.77GB,
totalSize: 12.51GB

mongo version: 5.0.4

My sharding command: sh.shardCollection("database.collectionA", {"field1": 1});

In fact, the value stored in field1 is an address on the blockchain. Considering that it is already a hash value, I used a ranged shard key; later I realized that an active address generates a lot of records while an inactive address has fewer.

The addresses mentioned in the mongod error log are not all addresses with many records.
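
The per-address document counts can be checked with an aggregation along these lines (just a sketch, using the collection and field names above):

db.collectionA.aggregate([
  { $group: { _id: "$field1", docs: { $sum: 1 } } },   // count documents per shard key value
  { $sort: { docs: -1 } },                             // most frequent addresses first
  { $limit: 10 }
], { allowDiskUse: true });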

Hi @jiang_xu1,

An ideal chunk size is smaller than 64 MB.

Looks like you have 129 * 2 = 258 chunks, so the maximum amount of data in this collection should be about 258 * 64 MB = 16,512 MB. Usually chunks are actually way smaller than 64 MB, as they are split into smaller chunks regularly. So I suspect that your shard key isn't a good one and you have a jumbo chunk issue: the balancer cannot split the chunks any further because too many documents share the same shard key value, and they all end up in the same big chunk.

A sh.status(true) should confirm that you have jumbo chunks in your cluster.
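
If the sh.status(true) output is too verbose, the jumbo flag can also be checked directly in the config database (a sketch, run from mongosh connected to the mongos):

db.getSiblingDB("config").chunks.find({ jumbo: true })   // lists chunks the balancer has flagged as jumbo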

2 ways to fix this: refine the current shard key by adding a suffix field (refineCollectionShardKey), or reshard the collection onto a better key entirely (reshardCollection, new in MongoDB 5.0); see the command sketch after the list below.

An ideal shard key:

  • has a BIG (infinite is good!) cardinality
  • is evenly distributed
  • doesn’t grow monotonically
  • is used in most of your queries

See doc: Choose a shard key.
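
For example, something like this (a sketch only, run through mongos; someQueriedField is a hypothetical field used for illustration and would have to exist in your documents and queries):

// Option 1: refine the existing key with a high-cardinality suffix.
// A supporting index on the new key must exist first.
db.getSiblingDB("databaseA").collectionA.createIndex({ field1: 1, _id: 1 });
db.adminCommand({
  refineCollectionShardKey: "databaseA.collectionA",
  key: { field1: 1, _id: 1 }
});

// Option 2 (MongoDB 5.0+): reshard onto a different key that matches your queries.
db.adminCommand({
  reshardCollection: "databaseA.collectionA",
  key: { someQueriedField: 1, field1: 1 }   // someQueriedField is hypothetical
});

Note that refining the shard key only affects how future chunks are split (existing data is not redistributed immediately), while resharding rewrites and redistributes the whole collection.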

While I'm at it: this collection is relatively small and shouldn't require sharding. Usually we recommend sharding when the cluster contains more than 2 TB of data, and there is no need to shard every collection on the cluster.
Small collections like this one can live on a single shard (== the collection's primary shard) and shouldn't need the support of a sharded cluster.

Cheers,
Maxime.


It is true that the shard key I selected is not a reasonable one, but I did not find the jumbo flag in the output of the sh.status(true) command.

The reason I use sharding is that I am storing all of the blockchain data, and the current data is only part of it.