Bug in c# Driver 2.11.0

After many hours trying to figure out the issue i have found that the C# Drivers 2.11.0 have an issue. Depending on what other third party libraries have been used before using the driver they will just throw timeouts exception. Reverting the Driver to 2.10.4 fix the issue. So it’s something that have been added between both version. Right now i have found that if you open a MongoClient at the begining of the application (first line in the program.cs) it will work even after calling the third party dll’s that cause mongo to fail. If instead you call any of these third party dll then try to make any call to mongo you will always have timeout error. I have more details on the operation i was doing over at stackoverflow https://stackoverflow.com/q/63305491/2748412

Is there anyways to use allowDiskUse on a find with version 2.10.4

Hi @CyberFranck_N_A and welcome to the forums,

The allowDiskUse option is specific for MongoDB Aggregation framework only. The option enables writing to temporary files. When set to true it allows aggregation stages to write data to the _tmp subdirectory in the dbPath directory.

This is not related to a specific MongoDB driver or version, this is the behaviour of the server.

Looking at the error stack trace that you posted on StackOverflow:

A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : “1”, ConnectionMode : “Automatic”, Type : “Unknown”, State : “Disconnected”, Servers : [{ ServerId: “{ ClusterId : 1, EndPoint : “127.0.0.1:27017” }”, EndPoint: “127.0.0.1:27017”, ReasonChanged: “ServerInitialDescription”, State: “Disconnected”, ServerVersion: , TopologyVersion: , Type: “Unknown”, LastHeartbeatTimestamp: null, LastUpdateTimestamp: “2020-08-07T16:00:54.4780565Z” }] }.

Also from the server log:

{“t”:{"$date":“2020-08-07T12:31:05.334-04:00”},“s”:“I”, “c”:"-", “id”:20883, “ctx”:“conn249”,“msg”:“Interrupted operation as its client disconnected”,“attr”:{“opId”:4183920}}
{“t”:{"$date":“2020-08-07T12:31:05.334-04:00”},“s”:“I”, “c”:“NETWORK”, “id”:22989, “ctx”:“conn249”,“msg”:“Error sending response to client. Ending connection from remote”,“attr”:{“error”:{“code”:6,“codeName”:“HostUnreachable”,“errmsg”:“Connection reset by peer”},“remote”:“127.0.0.1:61403”,“connectionId”:249}}

From a brief look, it is likely that the client was disconnected from the server. Is there something on the network layer that is interrupting the connection ?

If you’re still encountering this issue, and able to reproduce it consistently, could you please provide:

  • MongoDB server version
  • Operating System and version
  • A minimal reproducible code example
  • Any other relevant information, i.e. setup etc.

Regards,
Wan.

Error is reproducible 100% of the time.

Switching driver from 2.11.0 to 2.10.4 fix the issue.

MongoDB server version 4.4.0
Windows 10 latest version (brand new pc new install)

example code based on the stack overflow old code that does work :

 // get the client
 var client = new MongoClient("ConnectionStringHere");

// get the database
var db = client.GetDatabase("SomeDatabase");

// create the filter for collection names
var filter = new BsonDocument("name", "SomeCollection");
var options = new ListCollectionNamesOptions { Filter = filter };

// check if the filter return at least 1 record
var containAtLEastOneRecord = db.ListCollectionNames(options).Any();

// declare a third party DLL object
var test = ThirdPartyDll.SomeClass();

here the similar example but this does not work

// declare a third party DLL object
var test = ThirdPartyDll.SomeClass();

 // get the client
 var client = new MongoClient("ConnectionStringHere");

// get the database
var db = client.GetDatabase("SomeDatabase");

// create the filter for collection names
var filter = new BsonDocument("name", "SomeCollection");
var options = new ListCollectionNamesOptions { Filter = filter };

// check if the filter return at least 1 record
var containAtLEastOneRecord = db.ListCollectionNames(options).Any();

Noticed the ONLY difference is that i called another third party dll before yours and it doesn’t work with 2.11.0 but if i switch to 2.10.4 both code works. I do use over 250 third party DLL and only a single one cause that issue if i instantiate it before yours. I have contacted their support and as their dll do not crash and they say it’s up to you to fix your issue. I assumed there must be culture info issue between same dll or something like it that is not standard in your code in 2.11.0 that is correct in 2.10.4

1 Like

Hi @CyberFranck_N_A,

Thanks for the extra information. Would you be able to share the name of this third party DLL ? Essentially an attempt to create a reproducible test case for others.

Also, do you get any warning or error message during build time ?

Regards,
Wan

The third party DLL is called Eyeshot. It’s a proprietary CAD engine. There is no error or warning regarding this at build time in the list. The only one i get is the runtime exception

A timeout occured after 30000ms selecting a server using CompositeServerSelector…

Hi @CyberFranck_N_A,

Thanks for that information. Unfortunately it’s challenging to debug an issue without a reproducible test case.

Could you try setting up MongoDB on a different machine, and connecting to it remotely ? You could try to spin up your own LAN server, or just spin-up free-shared-tier MongoDB Atlas. Just guessing, but I would like to know whether your machine is overloaded on runtime which prevents the local server to responds adequately.

Regards,
Wan.

the DB is not local. I have ordered a brand new PC with windows 10 installed and Mongo community on it and that’s the only thing installed. After test are complete we will create a dedicated VM running on Windows Server. My test are done from my PC connecting to that remote computer. Also i do have the same issue while trying to connect to my local database too. The server is running a Ryzen 9 3950X 16 cores and local i am running the threadripper 3970X 32 cores so being overloaded i don’t think that’s the problem here.

Hi,
I’m spreadly facing the same issue connecting to Atlas on DEV and STAGING environment, i.e. different clients, different Atlas clusters. Cannot say it’s because of a specific 3rd party dll, I’m still investigating.
I’ll try to downgrade the driver to the 2.10.4 on the DEV one and try to reproduce.
For both of my envs I’ve a WebAPI app on Azure Germany which connects to Atlas cluster, both as M0 Sandbox (General) deployed on AWS / Frankfurt (eu-central-1).
PROD is still running with the 2.10.2 with no problems. Also, the code is not changed around the driver usage.

So i might not be just due to the DLL loading order even if i can actually reproduce this every single time by changing the order only ?

Today I downgraded the DEV and the STAGING environments to 2.10.4 and later I was debugging a branched codebase running the driver version 2.11.0 getting the timeout too.
I downgraded the driver in the branched as well to 2.10.4 and I’m not facing the issue anymore.
Unfortunately I have no so much time to investigate into this deeper, but I’m 100% sure the code around the driver wasn’t changed and the only difference is the driver version. I’ll wait the next version to check if the issue is solved.

I have the same problem with just a simple WinForms application trying to enumerate DB’s on the local server.
The problem was intermittent and was solved when I downgraded to driver version 2.10.4.
The only thing I did on the server is define an admin user with password protection on the DB.

Hi @Oren_Lev, and welcome to the forum,

Although you may see a similar error message, the cause may be entirely different. For example, both a third party DLL interference versus network interference could cause a CompositeServerSelector issue (amongst other things).

If the issue that you are encountering is reproducible, please open a new discussion thread/topic. Please provide:

  • Example WinForms application
  • MongoDB server version
  • Architecture of the application/server. i.e. remote, Atlas, etc.

Best regards,
Wan.

1 Like

We are using 2.11.5 and we got this issue as well. It doesn’t happen very often.

A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : “1”, ConnectionMode : “Automatic”, Type : “Sharded”, State : “Disconnected”, Servers : [{ ServerId: “{ ClusterId : 1, EndPoint : “192.168.225.110:27017” }”, EndPoint: “192.168.225.110:27017”, ReasonChanged: “InvalidatedBecause:ChannelException:MongoDB.Driver.MongoConnectionException: An exception occurred while receiving a message from the server. —> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. —> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
at System.Net.Sockets.Socket.EndReceive(IAsyncResult asyncResult)
at System.Net.Sockets.NetworkStream.EndRead(IAsyncResult asyncResult)
.
.
.
— End of stack trace from previous location where exception was thrown —
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at MongoDB.Driver.Core.Servers.Server.ServerChannel.d__33`1.MoveNext()”, State: “Disconnected”, ServerVersion: , TopologyVersion: , Type: “Unknown”, LastHeartbeatTimestamp: null, LastUpdateTimestamp: “2021-01-20T16:05:37.3963307Z” }] }.

Hi, Sam,

Thank you for reaching out to us about this issue. Without the full stack trace, it is difficult to diagnose the issue definitively. We did however recently discover and fix a race condition in our SDAM implementation, CSHARP-3302. A fix for CSHARP-3302 has been released in .NET/C# Driver 2.11.6. Please try upgrading and letting us know if this resolves your sporadic server selection timeout issues.

Thanks,
James

2 Likes

Hi,
I have the same issue here even after using the 2.11.6 driver.
The connection string am using to create the Mongo Client is the following:

   var database = new MongoClient("mongodb+srv://#userName:#password@cluster0.eqam6.gcp.mongodb.net/database_name?retryWrites=true&w=majority&connect=replicaSet"); 

and the stack trace of the error is the following:

[MESSAGE] --- A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "ReplicaSet", Type : "ReplicaSet", State : "Disconnected", Servers : [{ ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/cluster0-shard-00-00.eqam6.gcp.mongodb.net:27017" }", EndPoint: "Unspecified/cluster0-shard-00-00.eqam6.gcp.mongodb.net:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. ---> System.TimeoutException: Timed out connecting to Unspecified/cluster0-shard-00-00.eqam6.gcp.mongodb.net:27017. Timeout was 00:00:30.
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<ConnectAsync>d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<CreateStreamAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.SslStreamFactory.<CreateStreamAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.BinaryConnection.<OpenHelperAsync>d__51.MoveNext()
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.<OpenHelperAsync>d__51.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Servers.ServerMonitor.<InitializeConnectionAsync>d__32.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at MongoDB.Driver.Core.Servers.ServerMonitor.<HeartbeatAsync>d__34.MoveNext()", LastHeartbeatTimestamp: "2021-02-23T09:14:35.3589542Z", LastUpdateTimestamp: "2021-02-23T09:14:35.3589542Z" }, { ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/cluster0-shard-00-01.eqam6.gcp.mongodb.net:27017" }", EndPoint: "Unspecified/cluster0-shard-00-01.eqam6.gcp.mongodb.net:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. ---> System.TimeoutException: Timed out connecting to Unspecified/cluster0-shard-00-01.eqam6.gcp.mongodb.net:27017. Timeout was 00:00:30.
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<ConnectAsync>d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<CreateStreamAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.SslStreamFactory.<CreateStreamAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.BinaryConnection.<OpenHelperAsync>d__51.MoveNext()
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.<OpenHelperAsync>d__51.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Servers.ServerMonitor.<InitializeConnectionAsync>d__32.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at MongoDB.Driver.Core.Servers.ServerMonitor.<HeartbeatAsync>d__34.MoveNext()", LastHeartbeatTimestamp: "2021-02-23T09:14:35.4419744Z", LastUpdateTimestamp: "2021-02-23T09:14:35.4419744Z" }, { ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/cluster0-shard-00-02.eqam6.gcp.mongodb.net:27017" }", EndPoint: "Unspecified/cluster0-shard-00-02.eqam6.gcp.mongodb.net:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. ---> System.TimeoutException: Timed out connecting to Unspecified/cluster0-shard-00-02.eqam6.gcp.mongodb.net:27017. Timeout was 00:00:30.
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<ConnectAsync>d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<CreateStreamAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.SslStreamFactory.<CreateStreamAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Connections.BinaryConnection.<OpenHelperAsync>d__51.MoveNext()
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.<OpenHelperAsync>d__51.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MongoDB.Driver.Core.Servers.ServerMonitor.<InitializeConnectionAsync>d__32.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at MongoDB.Driver.Core.Servers.ServerMonitor.<HeartbeatAsync>d__34.MoveNext()", LastHeartbeatTimestamp: "2021-02-23T09:14:35.3579464Z", LastUpdateTimestamp: "2021-02-23T09:14:35.3579464Z" }] }.

if you need any help to reproduce the issue please let me know.
Any help is appreciated.

I’ve been watching this thread for a couple months as we are experiencing the same issue. We are running three instances of mongo, each in an EC2 instance inside the same VPC and subnet, configured as a replica set. We have been receiving the same CompositeServerSelector error in a few different flavors, and upgrading to the 2.11.6 CSHARP driver version did not improve these timeouts.

We were hit a rash of these in production last week and applied a minPoolSize=20 to the most active mongoDB API clients, and that seems to have improved it slightly, but not eliminated the problem. This newer connection string is: mongodb://username:password@mongo0:27017,mongo1:27017,mongo2:27017/admin?authSource=admin&replicaSet=rs0&minPoolSize=20&maxPoolSize=500

The stack trace of the errors we are seeing:

[22:19:41 ERR] A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "ReplicaSet", Type : "ReplicaSet", State : "Disconnected", Servers : [{ ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/mongo0:27017" }", EndPoint: "Unspecified/mongo0:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while receiving a message from the server.
 ---> System.TimeoutException: The operation has timed out.
   at MongoDB.Driver.Core.Misc.StreamExtensionMethods.ReadAsync(Stream stream, Byte[] buffer, Int32 offset, Int32 count, TimeSpan timeout, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Misc.StreamExtensionMethods.ReadBytesAsync(Stream stream, Byte[] buffer, Int32 offset, Int32 count, TimeSpan timeout, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.BinaryConnection.ReceiveBufferAsync(CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.ReceiveMessageAsync(Int32 responseTo, IMessageEncoderSelector encoderSelector, MessageEncoderSettings messageEncoderSettings, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.WireProtocol.CommandUsingCommandMessageWireProtocol`1.ExecuteAsync(IConnection connection, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.IsMasterHelper.GetResultAsync(IConnection connection, CommandWireProtocol`1 isMasterProtocol, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Servers.ServerMonitor.GetIsMasterResultAsync(IConnection connection, CommandWireProtocol`1 isMasterProtocol, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Servers.ServerMonitor.HeartbeatAsync(CancellationToken cancellationToken)", LastHeartbeatTimestamp: "2021-03-14T22:18:27.4390847Z", LastUpdateTimestamp: "2021-03-14T22:18:27.4390850Z" }, { ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/mongo1:27017" }", EndPoint: "Unspecified/mongo1:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
 ---> MongoDB.Driver.MongoConnectionException: An exception occurred while receiving a message from the server.
 ---> System.TimeoutException: The operation has timed out.
   at MongoDB.Driver.Core.Misc.StreamExtensionMethods.ReadAsync(Stream stream, Byte[] buffer, Int32 offset, Int32 count, TimeSpan timeout, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Misc.StreamExtensionMethods.ReadBytesAsync(Stream stream, Byte[] buffer, Int32 offset, Int32 count, TimeSpan timeout, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.BinaryConnection.ReceiveBufferAsync(CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.ReceiveMessageAsync(Int32 responseTo, IMessageEncoderSelector encoderSelector, MessageEncoderSettings messageEncoderSettings, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.WireProtocol.CommandUsingQueryMessageWireProtocol`1.ExecuteAsync(IConnection connection, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.IsMasterHelper.GetResultAsync(IConnection connection, CommandWireProtocol`1 isMasterProtocol, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.ConnectionInitializer.InitializeConnectionAsync(IConnection connection, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelperAsync(CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelperAsync(CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Servers.ServerMonitor.InitializeConnectionAsync(CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Servers.ServerMonitor.HeartbeatAsync(CancellationToken cancellationToken)", LastHeartbeatTimestamp: "2021-03-14T22:18:27.4350973Z", LastUpdateTimestamp: "2021-03-14T22:18:27.4350976Z" }, { ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/mongo2:27017" }", EndPoint: "Unspecified/mongo2:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
 ---> System.TimeoutException: Timed out connecting to 172.31.128.202:27017. Timeout was 00:00:30.
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.ConnectAsync(Socket socket, EndPoint endPoint, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.CreateStreamAsync(EndPoint endPoint, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelperAsync(CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelperAsync(CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Servers.ServerMonitor.InitializeConnectionAsync(CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Servers.ServerMonitor.HeartbeatAsync(CancellationToken cancellationToken)", LastHeartbeatTimestamp: "2021-03-14T22:19:25.9946242Z", LastUpdateTimestamp: "2021-03-14T22:19:25.9946246Z" }] }.

We are not experiencing primary stepdowns, but may be seeing widely varying load conditions that are affecting the RTT.

From the primary:

	"connections" : {
		"current" : 1142,
		"available" : 50058,
		"totalCreated" : 7653175,
		"active" : 3
	},
	"electionMetrics" : {
		"stepUpCmd" : {
			"called" : 0,
			"successful" : 0
		},
		"priorityTakeover" : {
			"called" : 1,
			"successful" : 1
		},
		"catchUpTakeover" : {
			"called" : 0,
			"successful" : 0
		},
		"electionTimeout" : {
			"called" : 0,
			"successful" : 0
		},
		"freezeTimeout" : {
			"called" : 0,
			"successful" : 0
		},
		"numStepDownsCausedByHigherTerm" : 0,
		"numCatchUps" : 0,
		"numCatchUpsSucceeded" : 0,
		"numCatchUpsAlreadyCaughtUp" : 1,
		"numCatchUpsSkipped" : 0,
		"numCatchUpsTimedOut" : 0,
		"numCatchUpsFailedWithError" : 0,
		"numCatchUpsFailedWithNewTerm" : 0,
		"numCatchUpsFailedWithReplSetAbortPrimaryCatchUpCmd" : 0,
		"averageCatchUpOps" : 0
	},

As far I can tell with multiple dev reviews, we are correctly using the singleton instantiation of the MongoClient and rely on it to manage the connection pools when needed.

When the mongo pool hangs, the API response times spike, then monotonically decrease until the API container that is using the MongoClient is recycled because of connection timeouts.

My next thought is to start playing with localThreshold to increase the window above the default 15ms to make sure that one or more servers are available in extreme events. Is there a way to get the RTT to servers as seen from the mongo driver and dump those in the case of these timeouts to see if they are limiting?

Hi, Nadeem and Roger,

Thank you to each of you for reaching out to us. We have reviewed the log messages provided by both of you and we do not see evidence of CSHARP-3302. CSHARP-3302 was fixed in MongoDB .NET/C# driver v2.11.6 and v2.12.0. As well CSHARP-3302 is evidenced by heartbeats becoming stuck (LastHeartbeatTimestamp and LastUpdateTimestamp not updating) while one member shows that it is in the middle of processing a heartbeat.

Server selection timeouts can occur for a wide variety of reasons and is most often the result of network connectivity issues between your client application and the cluster. In both the log lines provided, we see that the client application is unable to reach any of the cluster members due to network timeouts.

We recommend verifying network connectivity and configuration, especially firewalls and connection limits and similar issues that could prevent your client application from successfully connecting with your cluster.

Sincerely,
James

1 Like

James,

Thanks for the review. While I cannot rule out a network issue, our instances themselves report no intra-node dropouts, changes of primary, or timeouts. All of our applications are run with internal load balancers and EC2 instances on the same VPC as the Mongo ec2 instances. We do have some heavy .AggregateAsync<TModel>(pipeline, new AggregateOptions { BatchSize = 8000, AllowDiskUse = true })) operations, and continue to optimize things there.

That said, we just saw another error that was thrown at the start of the contemporaneous CompositeServerSelector timeouts that may provide additional insights, as it seems strictly within the driver as a NullRef

System.NullReferenceException: Object reference not set to an instance of an object.
  Module "MongoDB.Driver.Core.Servers.ServerMonitor", in CancelCurrentCheck
  Module "MongoDB.Driver.Core.Servers.Server", in HandleBeforeHandshakeCompletesException
  Module "MongoDB.Driver.Core.Servers.Server", in GetChannelAsync
  Module "MongoDB.Driver.Core.Operations.RetryableReadContext", in InitializeAsync
  Module "MongoDB.Driver.Core.Operations.RetryableReadContext", in CreateAsync
  Module "MongoDB.Driver.Core.Operations.AggregateOperation`1", in ExecuteAsync
  Module "MongoDB.Driver.OperationExecutor+<ExecuteReadOperationAsync>d__3`1", in MoveNext
  Module "MongoDB.Driver.MongoCollectionImpl`1+<ExecuteReadOperationAsync>d__98`1", in MoveNext
  Module "MongoDB.Driver.MongoCollectionImpl`1+<AggregateAsync>d__22`1", in MoveNext
  Module "MongoDB.Driver.MongoCollectionImpl`1+<UsingImplicitSessionAsync>d__106`1", in MoveNext

We typically have between 1400-4000 connections open to the db, and have followed all the thread/socket/heap/ulimit settings pointers to make sure instances aren’t having socket starvation.

Thanks again,
Roger

Hi Roger, thanks for reporting this issue. It was fixed in 2.12.0, and tracked here: https://jira.mongodb.org/browse/CSHARP-3436.

2 Likes