Hello there. I created a cluster with two shards. Or so I thought. In my design, the first server hosts the mongos router, shard1 (3 nodes), and the config servers. The second server hosts only shard2 (3 nodes as well). After configuring the shards, I enabled sharding for every database. When I look at the output of sh.status(), I see the output below:
...
{
  database: {
    _id: 'wins_emission',
    primary: 'shard1rs',
    partitioned: false,
    version: {
      uuid: UUID('aebf94cf-6069-41ba-9a91-f91a944071b1'),
      timestamp: Timestamp({ t: 1711952615, i: 3000 }),
      lastMod: 1
    }
  },
  collections: {}
},
{
  database: {
    _id: 'wins_healthcheck',
    primary: 'shard2rs',
    partitioned: false,
    version: {
      uuid: UUID('663cb5f7-b7b3-4f40-9f52-2c3d1969fb65'),
      timestamp: Timestamp({ t: 1711952305, i: 4 }),
      lastMod: 1
    }
  },
...
I understood this to mean that the databases would be distributed among the shards, so I was expecting the data on the nodes not to be the same. For example, the notifications collection has 17.7k docs, and I'm expecting these docs to be split among the nodes: shard1's first node holding 4k, shard1's second node holding 4k, etc. But it isn't working like that. Every node in every shard has the same 17.7k docs.
I may have misunderstood this. I then tried sharding at the collection level for the notifications collection: I created a hashed shard key and executed the sh.shardCollection(xxx) command. Now my first shard has 4.7k docs across its nodes, whereas shard2 holds 12.9k. This made me think of these questions:
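For reference, what I ran was roughly the following from mongosh connected to the router (the database name for notifications is illustrative here, and I used _id as the hashed key):

```javascript
// Enable sharding on the database that holds the collection
sh.enableSharding("wins_emission")

// Create a hashed index on the field to be used as the shard key
db.notifications.createIndex({ _id: "hashed" })

// Shard the collection on that hashed key
sh.shardCollection("wins_emission.notifications", { _id: "hashed" })
```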
1- Do I need to shard every collection to make use of a sharded cluster?
2- Should I shard every collection, or only the ones that will hold big data, like logs?
3- Why do all the nodes in a shard have the same number of docs? Aren't they supposed to distribute the data among themselves?
Any help is appreciated.