Hi all.
Apologies if this is already covered. I’ve looked back quite a way in the sharding section, but no-one seems to be having the same problem as me, other than one person who said they fixed it by adding a user.
I’m doing some experimental work on a sharded, replicated Mongo cluster. It’s in a Docker swarm (as that’s what I’m using on my dev system). I’m using a shared key file, so all of the containers/services are running with the same internal authentication.
I’m running mongo:latest, as of 10th March 2025.
I’ve got 3 config servers, 2 shards of 3 replicas each, and 3 mongos instances, using swarm’s service replication.
I’ve set it up initially with 2 shards of 3 replicas, and it just works. It’s fantastic. I insert data, and it’s routed to one shard or the other based on the hash of a unique key.
For a test, I put in 40 records, and they ended up split pretty much 50/50 between the two shards. I also put in 1M records, and the split was almost exactly 50/50.
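For reference, the sharding setup itself is nothing exotic; it’s roughly along these lines (the database, collection and key names here are placeholders, not my real ones):
// run via mongos; "testdb", "records" and "uniqueKey" are placeholder names
sh.enableSharding("testdb")
sh.shardCollection("testdb.records", { uniqueKey: "hashed" })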
I’m now in the process of seeing if I can add a third shard, so that if I need to extend the performance of the system later, I don’t need to tear anything down and build it all up again.
My process is to modify my compose file to copy the 3 replicas into a new block called shard3 (so I copy shard1-a, shard1-b and shard1-c in the compose file as shard3-a, shard3-b and shard3-c).
The structure is identical, with the same shared key file; the only difference is that every reference to “shard1” becomes “shard3”.
When I bring the swarm up with the 3 new mongo containers, I exec into the first one and initialise the three of them as a replica set. This all works fine: rs.status() shows 1 PRIMARY and 2 SECONDARY.
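Roughly what I run inside shard3-a is this (the hostnames match the service names in my compose file):
rs.initiate({
  _id: "shard3_rs",
  members: [
    { _id: 0, host: "shard3-a:27017" },
    { _id: 1, host: "shard3-b:27017" },
    { _id: 2, host: "shard3-c:27017" }
  ]
})
rs.status()   // 1 PRIMARY, 2 SECONDARY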
I then go to the mongos instance and add the new shard:
sh.addShard("shard3_rs/shard3-a:27017,shard3-b:27017,shard3-c:27017");
If I then list the shards using
use config
db.shards.find({})
all 3 of the shards are there.
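I assume the equivalent check via the admin command is this, run against the admin database on the mongos:
db.adminCommand({ listShards: 1 })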
At this point, nothing happens.
If I add more records, they all end up in shards 1 & 2.
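(I’m checking where documents land with something like this, again with the placeholder names from above:)
use testdb
db.records.getShardDistribution()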
There are 2 things of note.
When I do
sh.status()
I get
balancer
{
'Currently enabled': 'yes',
'Currently running': 'no',
'Failed balancer rounds in last 5 attempts': 0,
'Migration Results for the last 24 hours': 'No recent migrations'
}
Nothing I do seems to change 'Currently running' to 'yes'.
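The sorts of commands I’ve been poking at from mongos are along these lines (I’m assuming these are the right ones to be using):
sh.getBalancerState()    // reports whether the balancer is enabled
sh.isBalancerRunning()   // reports whether a balancing round is in progress
sh.startBalancer()       // explicitly ask for the balancer to start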
The other thing is that I’m getting
{"t":{"$date":"2025-03-10T17:38:13.084+00:00"},"s":"I", "c":"ACCESS", "id":5286307, "svc":"-", "ctx":"conn142","msg":"Failed to authenticate","attr":{"client":"10.0.1.4:55416","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Authentication session abandoned, client has likely disconnected","result":337,"metrics":{"conversation_duration":{"micros":320009905,"summary":{}}},"extraInfo":{}}}
The important bit is “Failed to authenticate”
I’m guessing this is the core of my problem: the mongos isn’t able to fully or correctly talk to the new shard because there’s a credentials problem.
My issue is that I’ve not specified credentials anywhere (other than the MONGO_INITDB_ROOT_USERNAME/MONGO_INITDB_ROOT_PASSWORD environment variables that define the root username and password, which don’t seem to have any effect in clustered mode).
Why would I need to add a user to the new shard? There’s no sign of any users that I can see, either in mongos or in the shard replica sets.
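(By “no sign of any users” I mean that a check along these lines comes back empty on both:)
use admin
db.getUsers()   // empty on the mongos and on the shard members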
Assuming I have to create a user, which database, collection and/or replica set would I need to add it to, and where would I refer to the user? Presumably in the addShard connection string?
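If a user is needed, is something like this the right shape? (Run in mongosh on the new shard’s primary via the localhost exception; the username and password are obviously just placeholders.)
use admin
db.createUser({
  user: "clusteradmin",   // placeholder
  pwd: "changeme",        // placeholder
  roles: [ { role: "root", db: "admin" } ]
})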
I’ve backtracked: torn everything down, started afresh using the same compose file, and initialised everything from scratch, and the system works fine with 3 shards. The 40 records are spread across the 3 shards, not 2.
If I then try to add a 4th shard (because why not!), it behaves the same as when adding a third shard to two: the shard added after the system is already up and running never gets any data migrated to it.
If anyone can point out what I’m doing wrong, or give me some pointers, I’d greatly appreciate it.
Is what I’m trying to do actually possible? The Mongo docs say to just run sh.addShard().
Thanks.