Interrupted due to server monitor timeout

I do some fairly long processing from Node and I regularly get an error message that says:
Name: PoolClearedOnNetworkError
message: Connection to <…> interrupted due to server monitor timeout

I’ve seen these since I upgraded to an M10 instance. I am running Node on a local machine, connecting to Atlas.

I’ve seen some suggestions for different error messages for GCP

Any advice?

Hey @michael_hyman1,

The PoolClearedError can occur due to intermittent network outages that cause the driver to lose connectivity.

Essentially, this error happens when the driver believes the server associated with a connection pool is no longer available. The pool gets cleared and closed so that operations waiting for a connection can retry on a different server.

To address this:

  • Enable retryability in your application if you haven’t already. This allows operations to retry seamlessly on another server.
  • Enable SDAM (Server Discovery and Monitoring) monitoring to understand why servers are being marked unknown.
  • Similarly, enable connection-level monitoring to see network errors and events.
  • Check the cluster’s logs for elections, failovers, and step-downs that could be disrupting connectivity.
  • Ensure the cluster itself is healthy and members are communicating properly.
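For the two monitoring bullets above, here is a minimal sketch of what the logging could look like with the Node driver. It assumes a driver version that emits SDAM and connection pool events directly on the MongoClient; `attachMonitoring` and `log` are hypothetical names for illustration, and the fields logged are just the ones that seem most useful here.

```javascript
// Sketch: log SDAM and connection pool events from a MongoClient.
// `attachMonitoring` and `log` are hypothetical names, not driver APIs.
function attachMonitoring(client, log = console.log) {
  // SDAM: a monitor check against a server failed.
  client.on('serverHeartbeatFailed', (event) => {
    log('serverHeartbeatFailed', {
      connectionId: event.connectionId,
      failure: event.failure && event.failure.message,
    });
  });

  // SDAM: the driver changed its view of a server (e.g. to Unknown).
  client.on('serverDescriptionChanged', (event) => {
    log('serverDescriptionChanged', {
      address: event.address,
      from: event.previousDescription && event.previousDescription.type,
      to: event.newDescription && event.newDescription.type,
    });
  });

  // Connection pool: the pool for a server was cleared.
  client.on('connectionPoolCleared', (event) => {
    log('connectionPoolCleared', { address: event.address });
  });

  // Connection pool: an operation could not check out a connection.
  client.on('connectionCheckOutFailed', (event) => {
    log('connectionCheckOutFailed', {
      address: event.address,
      reason: event.reason,
    });
  });

  return client;
}
```

You would call `attachMonitoring(client)` once, right after constructing the client and before `client.connect()`. The timestamps of these log lines can then be lined up against the cluster logs to see whether a heartbeat failure coincides with an election or step-down.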

Let me know if you have any other questions!


Thanks. I have put in the SDAM and connection-level monitoring and will see what I find. Retry is already on.
I reduced the size of my bulk writes, but that doesn’t seem to have had any impact on this. It happens when I’m doing very long runs (> 30 minutes), although I do a lot of reads and writes throughout that period. I will see if the new logs show anything.

I did a bunch of refactoring of the connection pool and also added handling for some of these messages. I just did a 90-minute run without trouble, so I’m hoping the problem is behind me.


It still occurs. This is happening during reads of very long files. Somewhere in the run I get a serverHeartbeatFailed for two connectionIds, then a serverDescriptionChanged, a mix of pool changes, and finally the PoolClearedOnNetworkError that halts everything.

Does this mean I need to check the connection status before every read?
Do I need to switch to something like Mongoose?

This happens when I am around 2M records into iterating through a table using for await (const row of cursor). I’m not sure how to recover from it, and the behavior is inconsistent. It starts with a serverHeartbeatFailed and then a connectionPoolCleared. Since I’m in the midst of iterating through a collection, how do I recover? Do I need to artificially introduce a monotonically increasing page value to go through in chunks?
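
One possible way to recover mid-iteration, without adding an artificial page value, is to iterate in `_id` order and remember the last `_id` that was fully handled, then reopen the cursor just past it when the error hits. This is only a sketch under assumptions: `iterateResumable`, `handleRow`, `maxRetries`, and `delayMs` are hypothetical names, and the list of error names is a guess that should be adjusted to whatever your driver version actually reports.

```javascript
// Error names treated as safe to resume after. Assumption: adjust this
// list to the names your driver version actually reports.
const RESUMABLE = new Set([
  'PoolClearedOnNetworkError',
  'MongoPoolClearedError',
  'MongoNetworkError',
  'MongoNetworkTimeoutError',
]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: iterate a collection in _id order and, on a
// resumable error, reopen the cursor after the last fully handled row.
async function iterateResumable(
  collection,
  handleRow,
  { maxRetries = 5, delayMs = 1000 } = {}
) {
  let lastId; // undefined until the first row has been handled
  let retries = 0;
  for (;;) {
    const filter = lastId === undefined ? {} : { _id: { $gt: lastId } };
    try {
      const cursor = collection.find(filter).sort({ _id: 1 });
      for await (const row of cursor) {
        await handleRow(row);
        lastId = row._id;
        retries = 0; // progress was made, so reset the retry budget
      }
      return; // cursor exhausted, all rows handled
    } catch (err) {
      if (!RESUMABLE.has(err.name) || retries >= maxRetries) throw err;
      retries += 1;
      await sleep(delayMs * retries); // simple linear backoff
    }
  }
}
```

Usage would be something like `await iterateResumable(db.collection('rows'), async (row) => { /* process row */ })`. Sorting on `_id` uses the default `_id` index, so no extra field is needed. Note that if `handleRow` itself throws one of these errors (for example, during a write), that row is handled again on resume, so the handler should be idempotent.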