Failed to refresh session cache, but activeSessionCount of mongos is reset every 5 minutes (logicalSessionRefreshMillis)

peterkim · May 17, 2023, 6:45am

Test Env >>

mongo version 4.2.23
sharded cluster
- Primary : 1
- Secondary : 5
- Arbiter : 1
- all members.votes = 1

I shut down 3 secondary members. As a result, a majority of votes is still available (4 of 7 voting members are up), but the majority of data-bearing members required for majority read/write concern is not (only 3 data-bearing members are up).
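The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, not MongoDB code: the `Member` class and every function name below are hypothetical, and the sketch assumes that majority write concern needs acknowledgment from the smaller of (a) the majority of voting members and (b) the number of data-bearing voting members.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical model of one replica set member (not a MongoDB type)."""
    name: str
    votes: int = 1
    arbiter: bool = False
    up: bool = True


def vote_majority(members):
    # Strict majority of all configured votes, whether or not the member is up.
    total_votes = sum(m.votes for m in members)
    return total_votes // 2 + 1


def can_elect_primary(members):
    # A primary can be elected (or kept) only if the reachable votes form a majority.
    up_votes = sum(m.votes for m in members if m.up)
    return up_votes >= vote_majority(members)


def write_majority(members):
    # Assumption: majority writes need min(vote majority, data-bearing voters) acks.
    data_bearing_voters = sum(1 for m in members if m.votes > 0 and not m.arbiter)
    return min(vote_majority(members), data_bearing_voters)


def can_satisfy_majority_write(members):
    # Arbiters hold no data, so they cannot acknowledge a write.
    up_data_bearing = sum(
        1 for m in members if m.up and m.votes > 0 and not m.arbiter
    )
    return up_data_bearing >= write_majority(members)


def thread_topology(secondaries_down):
    # 1 primary, 5 secondaries, 1 arbiter, all with votes = 1, as in the test env.
    members = [Member("primary")]
    members += [
        Member(f"secondary{i}", up=(i >= secondaries_down)) for i in range(5)
    ]
    members.append(Member("arbiter", arbiter=True))
    return members


members = thread_topology(secondaries_down=3)
print(can_elect_primary(members))           # True: 4 of 7 votes are up
print(can_satisfy_majority_write(members))  # False: 3 data-bearing members up, 4 needed
```

Under these assumptions, the set keeps its primary with 3 secondaries down, while any write that waits for majority acknowledgment times out.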

After shutting down the 3 secondaries, I see a message like the following in the mongos log every 5 minutes:
"…[LogicalSessionCacheRefresh] Failed to refresh session cache: WriteConcernFailed: waiting for replication timed out; Error details: { wtimeout: true} at testreplicaset "

So I expected that the activeSessionCount of mongos would not be reset and would keep increasing,
but it was reset every 5 minutes.

Is this normal?

Kushagra_Kesav · May 22, 2023, 9:40am

Hello @peterkim,

Welcome to the MongoDB Community forums

In my reproduction, I simulated a replica set deployment consisting of 1 primary node, 5 secondary nodes (data-bearing members), and 1 arbiter. When I shut down 3 of the 5 secondary nodes, write operations remained unaffected, since a majority of members was still available: 1 primary, 2 secondaries (data-bearing members), and 1 arbiter. However, when I shut down one more secondary node, there was no remaining primary node, so write operations could no longer be performed. To read more, please refer to the MongoDB documentation for Write Concern for Replica Sets.

This error is part of the regular logical session routine. By default, MongoDB periodically (every 5 minutes) persists the cached session information to the config.system.sessions collection. The error indicates that there was no quorum to satisfy the write concern required by the job that maintains the logical session mechanism.

Could you please give an example of the scenario that you have in mind?

In addition, I recommend updating your MongoDB version to the latest release. It’s worth noting that MongoDB 4.2 is no longer supported, and upgrading to a newer version can provide improved stability, bug fixes, and additional features. You can refer to the EOL Support Policies for more information on MongoDB versions and their support status.

Best regards,
Kushagra

peterkim · May 23, 2023, 7:54am

Your answer >>
“In my reproduction, I simulated a replica set deployment consisting of 1 primary node, 5 secondary nodes (data-bearing members), and 1 arbiter. When I shut down 3 of the 5 secondary nodes, write operations remained unaffected, since a majority of members was still available: 1 primary, 2 secondaries (data-bearing members), and 1 arbiter. However, when I shut down one more secondary node, there was no remaining primary node, so write operations could no longer be performed.”

My answer >>
Yes, when a majority of votes is available, a primary exists and the cluster works normally. But when a majority of data-bearing members is not available, writes with majority write concern fail. So I expected the LogicalSessionCacheRefresh write, which uses majority write concern, to fail for lack of data-bearing members. As expected, with the 3 secondaries shut down, I saw the error “Failed to refresh session cache: WriteConcernFailed: waiting for replication timed out” in the mongos log. However, the activeSessionCount of mongos was still reset every 5 minutes. I expected it to keep increasing, because the session cache refresh failed with the error above. Is my description too unclear to convey the situation?