Adding a new shard to an existing sharded (& replicated) database

Hi all.

Apologies if this is already covered. I’ve looked back quite a way in the sharding section, but no one seems to have had the same problem as me, other than one person who said they fixed it by adding a user.

I’m doing some experimental work on a sharded, replicated Mongo cluster. It’s in a Docker swarm (as that’s what I’m using on my dev system). I’m using a shared key file, so all of the containers/services are running the same authentication.

I’m running mongo:latest, as of 10th March 2025.

I’ve got 3 config servers, 2 shards of 3 replicas each, and 3 mongos instances, using swarm’s replication functionality.

I’ve set it up initially with 2 shards of 3 replicas, and it just works. It’s fantastic. I insert data, and it goes to one shard or the other based on the hash of a unique key.
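
For reference, the collection is sharded on a hashed key, set up roughly like this (the database, collection and key names here are just placeholders for my test setup):

sh.enableSharding("test")
sh.shardCollection("test.mydata", { myKey: "hashed" })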

For a test, I’ve put in 40 records. They ended up pretty much 50/50 per shard. I also put in 1M records, and they were almost exactly 50/50.

I’m now in the process of seeing if I can add a third shard, so that if I need to extend the performance of the system, I don’t need to tear anything down and build it up again.

My process is to modify my compose file to copy the 3 replicas into a new block called shard3 (so I copy shard1-a, shard1-b, shard1-c in the compose file as shard3-a, 3-b, 3-c).

The structure is identical. Same shared key file. The only difference is that every reference to “shard1” becomes “shard3”.

When I bring the swarm up with the 3 new mongo containers, I exec into the first one and add the three of them to a new replica set. This all works fine. rs.status() shows 1 PRIMARY and 2 SECONDARY.
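
For reference, the replica set initiation I run in that first container looks roughly like this (the replica set name and hostnames are the ones from my compose file):

rs.initiate({
  _id: "shard3_rs",
  members: [
    { _id: 0, host: "shard3-a:27017" },
    { _id: 1, host: "shard3-b:27017" },
    { _id: 2, host: "shard3-c:27017" }
  ]
})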

I then go to the mongos instance and add the new shard:

sh.addShard("shard3_rs/shard3-a:27017,shard3-b:27017,shard3-c:27017");

If I then list the shards using

use config
db.shards.find({})

all 3 of the shards are there.

At this point, nothing happens.

If I add more records, they all end up in shards 1 & 2.

There are 2 things of note.

When I do

sh.status()

I get

balancer
{
  'Currently enabled': 'yes',
  'Currently running': 'no',
  'Failed balancer rounds in last 5 attempts': 0,
  'Migration Results for the last 24 hours': 'No recent migrations'
}

Nothing I do seems to change 'Currently running' to 'yes'.

The other thing is that I’m getting

{"t":{"$date":"2025-03-10T17:38:13.084+00:00"},"s":"I", "c":"ACCESS", "id":5286307, "svc":"-", "ctx":"conn142","msg":"Failed to authenticate","attr":{"client":"10.0.1.4:55416","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Authentication session abandoned, client has likely disconnected","result":337,"metrics":{"conversation_duration":{"micros":320009905,"summary":{}}},"extraInfo":{}}}

The important bit is “Failed to authenticate”

I’m guessing this is the core of my problem: the mongos is not able to connect (correctly, or only partially) to the new shard because there’s a credentials problem.

My issue is that I’ve not specified credentials anywhere (other than the mongo environment variables that define the root username and password, which don’t seem to work in clustered mode).

Why would I need to add a user to the new shard? There’s no sign of any users that I can see in mongos or in the sharded replicas.

Assuming I have to create a user, which database, collection and/or replica set would I need to add it to, and where would I refer to the user? Presumably in the addShard connection string?

I’ve backtracked: torn everything down, started afresh using the same compose file, and initialised everything from scratch, and the system works fine with 3 shards. The 40 records are spread across the 3 shards, not 2.

If I then try to add a 4th shard (because why not!), it does the same as when adding 1 shard to the existing 2 - the shard added after the system is set up never receives any data.

If anyone can point out what I’m doing wrong, or give me some pointers, I’d greatly appreciate it.

Is what I’m trying to do actually possible? The mongo docs say to just do sh.addShard().

Thanks.

Hi Pete

The balancer will kick in only once the collection passes a threshold. By default, in order to start balancing, two shards must have a data size difference for a given collection of at least 384MB (three times the default 128MB chunk size).

You can check the data distribution with the $shardedDataDistribution aggregation stage.
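
For example, run from mongos against the admin database:

use admin
db.aggregate([ { $shardedDataDistribution: { } } ])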

Note that you can see the list of shards in your cluster with the listShards command.
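
For example, from mongos:

db.adminCommand( { listShards: 1 } )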

Additionally, when you are adding new shards you can leverage resharding (on the same shard key, using the forceRedistribution option) to redistribute the data without downtime.
Unlike the balancer, resharding performs data migrations in parallel, resulting in faster data distribution.
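
As a sketch (the namespace and key names are placeholders for your own sharded collection, and forceRedistribution requires a recent MongoDB version), resharding on the same key from mongos looks like this:

db.adminCommand({
  reshardCollection: "test.mydata",
  key: { myKey: "hashed" },
  forceRedistribution: true
})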

In general:

  • keep the balancer running in order to keep the data equally distributed across the shards
  • consider resharding when you foresee a change to your data distribution (e.g. changing the shard key, adding/removing shards, enabling zones)

Thank you so much. That’s given me the answer.

The amount of data (even 1M documents) is only ~40MB, so that’s why the third shard never got involved.

I used forceRedistribution (overriding numInitialChunks, setting it to 5 as I’ve only got 40 documents for my tests), and the 3rd shard woke up and the data was redistributed.
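
For anyone following along, the command I ran was along these lines (the database, collection and key names are placeholders for my test data):

db.adminCommand({
  reshardCollection: "test.mydata",
  key: { myKey: "hashed" },
  forceRedistribution: true,
  numInitialChunks: 5
})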

Thanks again for your help - you’re a lifesaver.

Pete.


Adding to my previous message

Another option you have is to decrease the default chunk size; for example, setting it to 1MB will lower the threshold to 3MB.
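
For example, the cluster-wide chunk size (in MB) can be changed from mongos via the config database:

use config
db.settings.updateOne(
  { _id: "chunksize" },
  { $set: { value: 1 } },
  { upsert: true }
)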

I advise keeping the default chunk size for collections in the order of GB, in order to avoid frequent data moves.
