Balancing performance after adding a shard

Hello,
can a single-core performance of primary config server be a bottleneck of chunk migration after adding an additional shard?

Context:
I have a 4-shard (3 nodes each, plus 3 config server) 4.4.1 cluster. Each node has about 3TB compressed (zstd) data, ~15TB uncompressed. I have recently added a new (5th) shard due to DB growth. After a few hours I calculated that balancing will take long weeks or even months. I also found that mongod process on primary config server uses about 1 full CPU core (on a 4-core) system and seems to be a bottleneck since I/O (including network) and CPU usage on other nodes is rather low. Is the balancer process single-threaded or it just doesn’t want to use the whole CPU? Can I speed it up by adding more CPU cores?
Is there any other way to speed the balancing up?

Without that I may have disk space issues on existing nodes since I won’t be able to release disk space fast enough.

Thanks
Andrzej

Bump, after almost 2 years and adding new shards I always have the same problem of balancer performance but it hits me even harder (the performance is even lower).
Is there ANY way to speed up the balancer?

Hey Andrzej,

The balancer is currently single-threaded in order to reduce the impact of chunk migrations on ongoing workloads. With that said, depending on your machine sizes, you can be bottlenecked on RAM, CPU or IOPS. We are actively working on speeding up the addShard process and will get back to you once the improvements are delivered.

Thanks for your input!

Garaudy