Connect timeout in C# driver during load tests

Currently using:

  • .NET MongoDB.Driver v2.13.1

  • Mongo cluster (replica set), with primary (v4.4.6) and two secondaries (v4.4.8).

  • Connection string is:

mongodb://{login}:{password}@master.local:27017, worker-1.local:27017, worker-2.local:27017/?replicaSet=rs0&minPoolSize=20&maxPoolSize=1000&connectTimeoutMS=10000

We are seeing odd behaviour from our .NET Core 3.1 API (which uses MongoDB) during load tests. The exception that troubles us is:

A timeout occurred after 30000ms selecting a server using CompositeServerSelector
{
	Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector,
	LatencyLimitingServerSelector
	{
		AllowedLatencyRange = 00:00:00.0150000
	},
	OperationsCountServerSelector
}
Client view of cluster state is
{ 
	ClusterId : "1",
	ConnectionMode : "ReplicaSet",
	Type : "ReplicaSet",
	State : "Disconnected",
	Servers : [
	{
		ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/ worker-1.local:27017" }",
		EndPoint: "Unspecified/ worker-1.local:27017",
		ReasonChanged: "ServerInitialDescription",
		State: "Disconnected",
		ServerVersion: ,
		TopologyVersion: ,
		Type: "Unknown",
		LastHeartbeatTimestamp: null,
		LastUpdateTimestamp: "2021-09-16T11:20:03.0135378Z" 
	},
	{ 
		ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/ worker-2.local:27017" }",
		EndPoint: "Unspecified/ worker-2.local:27017",
		ReasonChanged: "ServerInitialDescription",
		State: "Disconnected",
		ServerVersion: ,
		TopologyVersion: ,
		Type: "Unknown",
		LastHeartbeatTimestamp: null,
		LastUpdateTimestamp: "2021-09-16T11:20:03.0141756Z"
	},
	{	
		ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/master.local:27017" }",
		EndPoint: "Unspecified/master.local:27017",
		ReasonChanged: "ServerInitialDescription",
		State: "Disconnected",
		ServerVersion: ,
		TopologyVersion: ,
		Type: "Unknown",
		LastHeartbeatTimestamp: null,
		LastUpdateTimestamp: "2021-09-16T11:20:03.0118832Z"
	}
		]
}.

The error occurs in 5% of cases when the ramp-up period for virtual users is long (60 sec), but in 99% of cases when the ramp-up is 0 or short (e.g. 5 sec).
For the first minute or two of the test, the .NET API throws these exceptions for some requests, and other requests take quite a while to execute.

Any information about the cause would be helpful.

Hi Markus, this isn’t a solution but I’ve also been having similar issues, so I thought I’d share my findings.
Version: MongoDB.Driver v2.12.3.0 with .NET Core 3.1

In our case we tested on several machines and found the issue harder to reproduce, but once it occurs, the affected host is unable to reconnect at all. Our own tests (and other logs) show that it is not a general connectivity problem, because the other, unaffected hosts stay connected to the database.

We found an open Jira ticket; both my logs and yours are the same as, or very similar to, the ones shared throughout its comments, which also mention the CompositeServerSelector:

https://jira.mongodb.org/browse/CSHARP-1895?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&showAll=true

That ticket seems to be linked to another one, a bugfix that appears to be planned for FY21Q4:
https://jira.mongodb.org/browse/CSHARP-2490

I think this bugfix is probably what we are both waiting for. In the meantime, we have a band-aid fix of just restarting the server when it happens (not often), or even just not using MongoDB.
Unfortunately I don’t know if there’s any other fix to avoid this issue until then. :cry:
Has anyone else had similar issues?

I have been able to work around this bug by raising the minimum number of workerThreads and completionPortThreads in the API's thread pool:

ThreadPool.SetMinThreads(xx, xx);

I am not happy with this solution, but the load tests are now passing with no timeouts.
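
For anyone who wants to try the same workaround, here is a minimal sketch of how the call could be wired in at startup. One plausible reading of why it helps is that the .NET thread pool adds threads only gradually once it is past its configured minimum, so a burst of requests with no ramp-up can leave work queued for a long time. The class name ThreadPoolWarmup, the method name RaiseMinThreads, and the value 200 in the usage note are placeholders of mine, not values from the posts above; the right numbers have to come from your own load tests.

```csharp
using System;
using System.Threading;

public static class ThreadPoolWarmup
{
    // Raises the thread pool minimums to at least the requested values.
    // It never lowers a minimum that is already higher than requested.
    // Returns the minimums in effect after the call.
    public static (int Worker, int CompletionPort) RaiseMinThreads(int minWorker, int minCompletionPort)
    {
        ThreadPool.GetMinThreads(out int currentWorker, out int currentCompletionPort);

        int worker = Math.Max(currentWorker, minWorker);
        int completionPort = Math.Max(currentCompletionPort, minCompletionPort);

        // SetMinThreads returns false if the values are rejected,
        // for example when they exceed the configured maximums.
        if (!ThreadPool.SetMinThreads(worker, completionPort))
        {
            throw new InvalidOperationException(
                $"Thread pool rejected minimums {worker}/{completionPort}.");
        }

        // Read the values back so the caller can log what was actually applied.
        ThreadPool.GetMinThreads(out int appliedWorker, out int appliedCompletionPort);
        return (appliedWorker, appliedCompletionPort);
    }
}
```

It would be called once, as early as possible (for example at the top of Main, before the web host is built), with something like ThreadPoolWarmup.RaiseMinThreads(200, 200). Taking the maximum of the current and requested values is a deliberate choice so the call cannot accidentally lower a minimum that was already raised elsewhere, for instance through runtime configuration.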