Sorry for hopping in. Not my area, but knowing this might help others: are the documents all under constant change, like user data? Or are they accumulated data, collected daily or so?
Also, knowing the server version along with the error message might help too. That, and whether all shards have the same CPU/disk/network capabilities (twins) or some have lower capacity.
By the way, lowering the chunk size might mean more trips for you, but it may also solve the problem.