Connecting to multiple mongos, but the load is imbalanced

Hi,

I set up a sharded cluster and mainly interact with it through mongos using the go-mongo-driver. There are 6 query routers:

  • a:27016
  • b:27016
  • c:27016
  • d:27016
  • e:27016
  • f:27016

I build the connection string with the hosts in the order above. It seems a:27016 consumes more CPU than the other mongos instances (around 50%), which suggests a:27016 is handling more requests and the load is imbalanced.
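Roughly, the client is created like this (a simplified sketch only: real credentials and options are omitted, and the host names are the placeholders above):

```go
package main

import (
	"context"
	"time"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	// Seed list contains all six mongos routers, in the order a through f.
	uri := "mongodb://a:27016,b:27016,c:27016,d:27016,e:27016,f:27016"

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// mongo-go-driver v1.x style connect.
	client, err := mongo.Connect(ctx, options.Client().ApplyURI(uri))
	if err != nil {
		panic(err)
	}
	defer client.Disconnect(ctx)

	// Queries are issued elsewhere in the real application.
}
```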

I read through the server selection algorithm and it all looks reasonable. This is my understanding:

When the server kind is Mongos:

  • ReadPrefSelector returns all candidates.
  • LatencySelector filters by latency, but I think in most cases it returns all candidates as well.
  • Topology.SelectServer picks two servers from the candidates at random, then chooses the one with the lower in-progress operation count (see the sketch below).
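
To check that I'm reading it right, here is a rough sketch of how I understand that last step. This is not the driver's actual code, just the shape of the "two random picks, lower operation count wins" logic with a simplified candidate type:

```go
package main

import (
	"fmt"
	"math/rand"
)

// candidate is a simplified stand-in for a server description;
// the real driver tracks far more state than this.
type candidate struct {
	addr           string
	operationCount int64
}

// selectServer sketches the final step as I understand it:
// pick two distinct candidates at random and keep the one with
// the lower in-progress operation count ("power of two choices").
func selectServer(candidates []candidate) candidate {
	if len(candidates) == 1 {
		return candidates[0]
	}
	i := rand.Intn(len(candidates))
	j := rand.Intn(len(candidates) - 1)
	if j >= i {
		j++ // shift so the second pick is distinct from the first
	}
	if candidates[i].operationCount <= candidates[j].operationCount {
		return candidates[i]
	}
	return candidates[j]
}

func main() {
	// Assume all six mongos pass the read-preference and latency filters.
	candidates := []candidate{
		{"a:27016", 120}, {"b:27016", 80}, {"c:27016", 95},
		{"d:27016", 80}, {"e:27016", 110}, {"f:27016", 90},
	}
	fmt.Println("selected:", selectServer(candidates).addr)
}
```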

Have I misconfigured something, or is there a bug that leads to this imbalance?

Hi @Jay_Chung, welcome to the community!

Could you share your connection code and the connection URI string? You can change the server names if you need to, just as you did in the post, but seeing the URI string might help.

However, from Read Preference for Sharded Clusters:

If there is more than one mongos instance in the connection seed list, the driver determines which mongos is the “closest” (i.e. the member with the lowest average network round-trip-time) and calculates the latency window by adding the average round-trip-time of this “closest” mongos instance and the localThresholdMS. The driver will load balance randomly across the mongos instances that fall within the latency window.
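
To make that concrete, here is a small sketch of the latency-window filter. The 15 ms value is the driver's default for localThresholdMS, but the round-trip times are made-up numbers purely for illustration:

```go
package main

import "fmt"

func main() {
	// Hypothetical average round-trip times in milliseconds; the numbers
	// are made up purely to illustrate the latency-window filter.
	rtt := map[string]float64{
		"a:27016": 3, "b:27016": 4, "c:27016": 5,
		"d:27016": 6, "e:27016": 25, "f:27016": 40,
	}

	const localThresholdMS = 15.0 // driver default for localThresholdMS

	// The "closest" mongos is the one with the lowest average RTT.
	closest := -1.0
	for _, v := range rtt {
		if closest < 0 || v < closest {
			closest = v
		}
	}
	window := closest + localThresholdMS // upper bound of the latency window

	// Only mongos instances inside the window are eligible for selection.
	for host, v := range rtt {
		if v <= window {
			fmt.Printf("%s (%.0f ms) is inside the %.0f ms window\n", host, v, window)
		} else {
			fmt.Printf("%s (%.0f ms) is outside the window and won't be selected\n", host, v)
		}
	}
}
```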

I think it should randomly choose a mongos that falls within the latency window. I have a test in mind: what if you reverse the order of the mongos in your URI string (e.g. f,e,d,c,b,a) and see whether server f now becomes the busiest one, or whether it is still server a? This would make an interesting data point for the investigation.
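
For example, keeping everything else in your URI identical and only flipping the host order (sketch only, any options you use are omitted here):

```
mongodb://f:27016,e:27016,d:27016,c:27016,b:27016,a:27016
```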

Best regards
Kevin
