ABot
(Anon Bot)
February 8, 2025, 12:44pm
Hello,
I have been facing an issue for days now with the balancing of a GridFS collection.
As you can see in the screenshot, the storage is not balanced correctly between shards over time:
I recreated my database to reset the sharding because I suspected a problem had disabled sharding, but the same problem occurs.
Now I am checking again from the shell, and it looks like nothing is sharded:
mongos> db.getSiblingDB("config").chunks.find({ ns: "n_storage.fs.chunks" }).count()
0
mongos> use config
switched to db config
mongos> db.collections.find({ "_id": "n_storage.fs.chunks" })
{ "_id" : "n_storage.fs.chunks", "lastmodEpoch" : ObjectId("67a6a46f6cdb3b52cc4282d4"), "lastmod" : ISODate("2025-02-08T00:25:19.765Z"), "timestamp" : Timestamp(1738974319, 12), "uuid" : UUID("b8166ae7-5a88-4024-8893-0ab8c3973218"), "key" : { "files_id" : 1, "n" : 1 }, "unique" : false, "chunksAlreadySplitForDowngrade" : false, "noBalance" : false }
mongos>
I am confused by this, because the data seems to be split, but sharding does not seem to be performed correctly.
Thanks for your help
Hi Anon,
Could you share the output of:
db.aggregate([
{ $shardedDataDistribution: { } },
{ $match: { "ns": "n_storage.fs.chunks" } }
])
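A side note on the chunks count returning 0: starting in MongoDB 5.0, config.chunks references collections by their uuid rather than by ns, so a filter on ns matches nothing even when the collection is sharded. Assuming a 5.0+ cluster, a check along these lines should return a non-zero count (coll here is just a local variable for illustration):

// Look up the collection's uuid in config.collections,
// then count its chunks by uuid instead of ns
var coll = db.getSiblingDB("config").collections.findOne({ _id: "n_storage.fs.chunks" })
db.getSiblingDB("config").chunks.find({ uuid: coll.uuid }).count()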
ABot
(Anon Bot)
February 9, 2025, 4:59pm
Here is the output:
mongos> db.aggregate([ { $shardedDataDistribution: { } }, { $match: { "ns": "n_storage.fs.chunks" } } ]).pretty()
{
    "ns" : "n_storage.fs.chunks",
    "shards" : [
        {
            "shardName" : "cold_shard_7",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 53378,
            "ownedSizeBytes" : NumberLong("11952508516"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_1",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 53410,
            "ownedSizeBytes" : NumberLong("11942422590"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_3",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 53395,
            "ownedSizeBytes" : NumberLong("12000846620"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_4",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 53352,
            "ownedSizeBytes" : NumberLong("12024046944"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_6",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 52786,
            "ownedSizeBytes" : NumberLong("11947160952"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_0",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 53390,
            "ownedSizeBytes" : NumberLong("12012803390"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_2",
            "numOrphanedDocs" : 0,
            "numOwnedDocuments" : 53918,
            "ownedSizeBytes" : NumberLong("12055094276"),
            "orphanedSizeBytes" : 0
        },
        {
            "shardName" : "cold_shard_5",
            "numOrphanedDocs" : 30391,
            "numOwnedDocuments" : 719018,
            "ownedSizeBytes" : NumberLong("168458008202"),
            "orphanedSizeBytes" : NumberLong("7120276999")
        }
    ]
}
mongos>
It looks like the collection is still being balanced.
Starting from MongoDB 6.0, the balancer aims to distribute a collection equally across the shards in terms of data size rather than number of chunks (read more here).
The shard cold_shard_5 has more data than the other shards (~145 GB more) as well as orphaned documents that will get deleted.
If you periodically check the output of $shardedDataDistribution, you should see numOwnedDocuments and ownedSizeBytes for cold_shard_5 decrease over time until all the shards hold approximately the same amount of data.
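For reference, a minimal sketch of such a periodic check, run from mongos (it reuses the same aggregation as above, so no new fields are assumed):

// Print per-shard document and byte counts for the collection;
// re-run periodically and watch cold_shard_5 shrink.
db.aggregate([
    { $shardedDataDistribution: {} },
    { $match: { ns: "n_storage.fs.chunks" } }
]).forEach(dist =>
    dist.shards.forEach(s =>
        print(s.shardName, s.numOwnedDocuments, s.ownedSizeBytes)
    )
)

You can also ask the balancer directly whether it still considers the collection imbalanced with sh.balancerCollectionStatus("n_storage.fs.chunks") (available since MongoDB 4.4).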
ABot
(Anon Bot)
February 10, 2025, 8:33am
Thanks a lot!
After your message I disabled the backend API that inserts into my GridFS, and the collection balanced faster after that. I guess the migrations work better when the collection is not being used.
Now everything is balanced correctly:
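A quick way to double-check the end state from mongos (a sketch; sh.balancerCollectionStatus is available from MongoDB 4.4):

// Should report balancerCompliant: true once migrations are done
sh.balancerCollectionStatus("n_storage.fs.chunks")
// The balancer itself can also be inspected from mongos
db.adminCommand({ balancerStatus: 1 })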