Hi Kevin,
I enabled the loglevel and found almost every 5 mins there is a response time increase and which makes some of our queries to become timedout. Looks like starting from 4.2, mongo Added MajorityService and in every 5mins LogicalSessionRefresh its doing a w:majority with timeout of 15000. During the same time the response (in mongotop) goes very high (from 200ms to 8s).
2022-11-19T08:13:11.923+0000 D2 COMMAND [conn424] run command config.$cmd { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:11.970+0000 D2 REPL [conn424] Waiting for write concern. OpTime: { ts: Timestamp(1668845591, 1583), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }
2022-11-19T08:13:11.974+0000 I COMMAND [conn424] command config.$cmd command: update { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:7137 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 183 } }, ReplicationStateTransition: { acquireCount: { w: 366 } }, Global: { acquireCount: { r: 183, w: 183 } }, Database: { acquireCount: { w: 183 } }, Collection: { acquireCount: { w: 183 } }, Mutex: { acquireCount: { r: 366 } } } flowControl:{ acquireCount: 183, timeAcquiringMicros: 65 } storage:{} protocol:op_msg 51ms
2022-11-19T08:13:11.976+0000 D2 COMMAND [conn424] run command config.$cmd { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:11.988+0000 D2 REPL [conn424] Waiting for write concern. OpTime: { ts: Timestamp(1668845591, 1588), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }
2022-11-19T08:13:11.989+0000 I COMMAND [conn424] command config.$cmd command: delete { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:230 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 244 } }, ReplicationStateTransition: { acquireCount: { w: 488 } }, Global: { acquireCount: { r: 244, w: 244 } }, Database: { acquireCount: { w: 244 } }, Collection: { acquireCount: { w: 244 } }, Mutex: { acquireCount: { r: 246 } } } flowControl:{ acquireCount: 244, timeAcquiringMicros: 57 } storage:{} protocol:op_msg 12ms
2022-11-19T08:13:45.030+0000 D2 COMMAND [LogicalSessionCacheRefresh] run command config.$cmd { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:45.088+0000 D2 REPL **[LogicalSessionCacheRefresh] Waiting for write concern. OpTime: { ts: Timestamp(1668845625, 368), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }**
2022-11-19T08:13:45.094+0000 I COMMAND [LogicalSessionCacheRefresh] command config.$cmd command: update { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:3160 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 226 } }, ReplicationStateTransition: { acquireCount: { w: 452 } }, Global: { acquireCount: { r: 226, w: 226 } }, Database: { acquireCount: { w: 226 } }, Collection: { acquireCount: { w: 226 } }, Mutex: { acquireCount: { r: 452 } } } flowControl:{ acquireCount: 226, timeAcquiringMicros: 69 } storage:{} protocol:op_msg 63ms
2022-11-19T08:13:45.094+0000 D2 COMMAND [LogicalSessionCacheRefresh] run command config.$cmd { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:45.127+0000 D2 REPL [LogicalSessionCacheRefresh] **Waiting for write concern. OpTime: { ts: Timestamp(1668845625, 457), t: 48 }, write concern: { w: "majority", wtimeout: 15000** }
2022-11-19T08:13:45.130+0000 I COMMAND [LogicalSessionCacheRefresh] command config.$cmd command: delete { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:230 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 387 } }, ReplicationStateTransition: { acquireCount: { w: 774 } }, Global: { acquireCount: { r: 387, w: 387 } }, Database: { acquireCount: { w: 387 } }, Collection: { acquireCount: { w: 387 } }, Mutex: { acquireCount: { r: 459 } } } flowControl:{ acquireCount: 613, timeAcquiringMicros: 152 } storage:{} protocol:op_msg 35ms```
disabling the disableLogicalSessionCacheRefresh doesnt have any improvement.
Having the same clients but only mongoDB downgraded to 4.0.27 from 4.2 is giving the expected performance. So for sure, there is something in 4.2 which is impacting the performance. We tried changing the FCV to 4.0 on a 4.2 mongo but didnt provide any better results.