MongoDB documentation about writeMajorityCount is not very clear

Hi Team,

We are using mongo 4.2.21.

Starting from 4.2, MongoDB added a calculated writeMajorityCount. The documentation explains how it is calculated by MongoDB automatically and what would happen if a majority is not available.

But what is missing is:

  1. Which internal MongoDB operations will be impacted by this writeMajorityCount derivation that MongoDB does by itself?
  2. My client application doesn't set any writeConcern, so I'm sure my application code will not be affected. But the db.dropDatabase() command gets stuck when one of the members is down.
  3. So I need to know which other internal MongoDB commands will be affected. Will REPL have any issue if the majority is not available? We see high CPU usage since we moved to 4.2, with no changes except the MongoDB upgrade, so we need to know whether this plays a role.

Also, is there a way I can change this parameter? It is needed because we don't want to enable write majority.

Thanks,
Venkataraman

Hi @venkataraman_r welcome back!

The main reason for writeMajorityCount is that MongoDB is mainly designed to work as a replica set. The way to ensure writes won't be rolled back is to have them propagated to a majority of voting nodes. However, with arbiters this is tricky, since an arbiter is a voting node with no data. See Implicit Default Write Concern for the formula when arbiters are involved.
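As a rough sketch (not the server's actual source), the voting-majority calculation and the implicit-default formula from that page can be expressed in the shell like this, where cfg is the rs.conf() document:

```javascript
// Rough sketch, not the server's implementation: the majority of voting members.
function votingMajority(cfg) {
  const voting = cfg.members.filter(m => m.votes > 0).length;
  return Math.floor(voting / 2) + 1;
}

// The implicit default write concern rule from the linked page (applied from
// MongoDB 5.0 onwards): with arbiters present, fall back to w: 1 when the
// data-bearing voting members alone cannot form a voting majority.
function implicitDefaultWriteConcern(cfg) {
  const voting = cfg.members.filter(m => m.votes > 0);
  const arbiters = voting.filter(m => m.arbiterOnly).length;
  const dataBearing = voting.length - arbiters;
  return (arbiters > 0 && dataBearing <= votingMajority(cfg))
    ? { w: 1 }
    : { w: "majority" };
}

// Example: 4 data-bearing voting members + 1 arbiter = 5 voting members,
// so the majority is floor(5 / 2) + 1 = 3 (the writeMajorityCount you'd see).
votingMajority(rs.conf());
```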

  1. Which internal MongoDB operations will be impacted by this writeMajorityCount derivation that MongoDB does by itself?

dropDatabase() is one command that defaults to majority write concern. However, I don't think there's a list of commands requiring majority write concern. Note that you can specify a write concern for db.dropDatabase(), but if you don't, it defaults to majority.
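For example (the w:1 and wtimeout values here are purely for illustration):

```javascript
// Override dropDatabase()'s default majority write concern;
// w: 1 waits only for the primary's acknowledgment.
db.dropDatabase({ w: 1, wtimeout: 5000 })

// Equivalent form using the raw command:
db.runCommand({ dropDatabase: 1, writeConcern: { w: 1, wtimeout: 5000 } })
```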

  2. My client application doesn't set any writeConcern, so I'm sure my application code will not be affected. But the db.dropDatabase() command gets stuck when one of the members is down.

I assume you're using a PSA setup? This is expected, since if the other data-bearing member is down, the arbiter cannot acknowledge the write, so the command will just wait for a secondary acknowledgment that never arrives. The page Mitigate Performance Issues with PSA Replica Set has more details on how to overcome this.
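One mitigation described on that page is to remove the down member's vote and priority so the voting majority shrinks accordingly. A minimal sketch (the member index 1 is hypothetical; use the index of the member that is actually down):

```javascript
// Sketch of the documented PSA mitigation: reconfigure the down
// data-bearing member so it no longer counts toward the voting majority.
cfg = rs.conf()
cfg.members[1].votes = 0     // index 1 is hypothetical
cfg.members[1].priority = 0
rs.reconfig(cfg)
```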

  3. So I need to know which other internal MongoDB commands will be affected. Will REPL have any issue if the majority is not available? We see high CPU usage since we moved to 4.2, with no changes except the MongoDB upgrade, so we need to know whether this plays a role.

I think the increased CPU is tangential to the write concern issue you're seeing. However, the first port of call is usually ensuring you're following the production notes for the optimal setup. The blog post Performance Best Practices: Hardware and OS Configuration might also be of interest to you.

we don't want to enable write majority.

Write concern majority becomes the default in MongoDB 5.0, since it provides much greater assurance that your writes won't be rolled back. However, this depends on your use case. I understand that some use cases might not need the majority assurance, since it's OK if some data gets rolled back and you need high write speed.

However, another advantage of write concern majority is that it provides a measure of control over data going into the replica set. Without it, it's easy to overwhelm the servers with writes that they cannot replicate fast enough, leading to issues such as a secondary falling off the oplog.

Having said that, if you understand all the risks of forgoing write concern majority, you can specify a write concern setting on most commands to deviate from the default.
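As a sketch, a per-command writeConcern field looks like this (the collection name "events" is made up):

```javascript
// Per-command override: this insert only waits for the primary,
// regardless of the deployment's default write concern.
db.runCommand({
  insert: "events",
  documents: [{ msg: "hello" }],
  writeConcern: { w: 1 }
})
```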

Hope this is useful!

Best regards
Kevin

Hi Kevin,

Thank you for your reply.

#2. Yes, we are using a PSA-style setup with 4 data-bearing members and 1 arbiter. We carefully reviewed the performance and decided to disable enableMajorityReadConcern and keep a default of w:1. We don't want durability but higher throughput.
But the problem I notice from 4.2 is this: MongoDB provides rs.conf().settings.getLastErrorDefaults to set the default write concern and timeout, so we expect MongoDB to honor that value. Instead, MongoDB does its own computation and creates this writeMajorityCount. What do you think about this discrepancy?

#3. Yes, we follow the production checklist. But my question is: will REPL suffer from, or contribute to, high CPU?
Does REPL wait until the oplog is written to the majority of the members due to this new writeMajorityCount?

As I said, starting from MongoDB 4.2, there is no server option to disable write concern majority. We set the following, but MongoDB still calculates it as 3:
```
"getLastErrorDefaults" : {
    "w" : 1,
    "wtimeout" : 0
},
```
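For reference, a value like this ends up in the config through the usual rs.reconfig() approach (a sketch):

```javascript
cfg = rs.conf()
cfg.settings.getLastErrorDefaults = { w: 1, wtimeout: 0 }
rs.reconfig(cfg)
```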

I believe those defaults are mainly for write operations, and dropDatabase() is not really a normal write operation. I think those are treated slightly differently, and you can always set the write concern setting if you want to ensure that dropDatabase() is called with w:1.

However, if this is not what you think should happen, please provide feedback via the MongoDB Feedback Engine so this can be explored by the product team.

Does REPL wait until the oplog is written to the majority of the members

Please correct me if I'm wrong here, but what I understand as REPL is the Read-Eval-Print Loop for interactive discovery, a feature of most scripting languages like Python or Node. Do you mean doing CRUD operations in mongosh or similar? That depends on the write concern setting for that particular write. If you set it to w:1 it should not wait for replication. If you set it to w:majority then it will wait until the majority have replicated the write in their journal.
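For example, a quick sketch in the shell (collection and documents are made up):

```javascript
// w: 1 — acknowledged by the primary alone; no wait for replication.
db.test.insertOne({ x: 1 }, { writeConcern: { w: 1 } })

// w: "majority" — returns only once a majority of data-bearing voting
// members have replicated the write (or wtimeout expires).
db.test.insertOne({ x: 2 }, { writeConcern: { w: "majority", wtimeout: 5000 } })
```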

As I said, starting from MongoDB 4.2, there is no server option to disable write concern majority.

Actually, the default write concern was changed to majority in MongoDB 5.0, so 4.2 and 4.4 still default to w:1.

Also, in MongoDB 4.4 there is a new command, setDefaultRWConcern, to set this cluster-wide.
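A sketch of its usage (MongoDB 4.4+, run against the admin database):

```javascript
// Set the cluster-wide default write concern to w: 1.
db.adminCommand({
  setDefaultRWConcern: 1,
  defaultWriteConcern: { w: 1 }
})
```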

We set the following, but MongoDB still calculates it as 3.

If you mean the value of writeMajorityCount, I believe that’s only informational and does not reflect the actual write concern setting. In your case, it is informing you that for a majority write to be acknowledged, you need at least 3 nodes.
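You can read this informational value from the isMaster response (the field was added in 4.2.1):

```javascript
// Reports the number of members needed to satisfy w: "majority".
db.isMaster().writeMajorityCount   // e.g. 3 for 4 data-bearing members + 1 arbiter
```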

Best regards
Kevin

No, by REPL I meant replication.

Hi Kevin,

As I told you, we never send any queries with w:majority from our clients. But as you can see in the following log, an operation done on system.sessions by MongoDB itself uses w:majority and takes a long time. So it looks like w:majority is used in various places inside MongoDB as well, which impacts performance from time to time. Please check the r/w acquire counts.

This is from the primary:
```
/var/log/mongodb-27951.log:2022-11-16T04:34:27.802+0000 I COMMAND [conn8648] command config.$cmd command: update { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:3160 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 30 } }, ReplicationStateTransition: { acquireCount: { w: 30 } }, Global: { acquireCount: { w: 30 } }, Database: { acquireCount: { w: 30 } }, Collection: { acquireCount: { w: 30 } }, Mutex: { acquireCount: { r: 60 } } } flowControl:{ acquireCount: 30, timeAcquiringMicros: 7 } storage:{} protocol:op_msg 679ms
```

and this is from a secondary of a different replica set:

```
/var/log/mongodb-27958.log:2022-11-16T04:34:27.547+0000 I COMMAND [conn8893] command config.$cmd command: update { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:1899 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 17 } }, ReplicationStateTransition: { acquireCount: { w: 17 } }, Global: { acquireCount: { w: 17 } }, Database: { acquireCount: { w: 17 } }, Collection: { acquireCount: { w: 17 } }, Mutex: { acquireCount: { r: 34 } } } flowControl:{ acquireCount: 17, timeAcquiringMicros: 2 } storage:{} protocol:op_msg 1612ms
```

Let’s go back a little:

  1. What exactly is the performance issue you’re seeing?
  2. Do you see it all the time, or only when the replica set is in a degraded state?
  3. How did you conclude that this write into config.system.sessions is the culprit?
  4. Can you provide some telemetry during this performance issue: the output of mongostat and iostat, and any slow queries in the logs? (One way to surface slow queries is sketched below.)
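For the slow-query part, one way to surface more of them in the logs is to lower the slow-operation threshold (the 100 ms value is just an example):

```javascript
// Level 0 keeps the profiler off but logs every operation slower
// than the given threshold (in ms) to the mongod log.
db.setProfilingLevel(0, 100)
```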

And let me go back to your earlier question:

Will REPL have any issue if the majority is not available?

REPL being replication, then yes: if a majority is not available, replication will be hindered. You need a primary to be able to write, and using arbiters has a known performance impact when the replica set is in a degraded state.

Best regards
Kevin

After we migrated from 4.0 to 4.2, we see higher response times, and from time to time we see queries taking more than 500 ms, which results in performance degradation in our application. Note that there is no change on the client side (we use the 3.12.9 Java sync driver) and no changes in the DB schema. We are seeing IDHACK queries also taking more time. We tried setting the FCV to 4.0, which gives somewhat better results, but still not equal to 4.0 performance.
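For reference, the FCV change was done with the usual admin command:

```javascript
// Downgrade the feature compatibility version on a 4.2 binary.
db.adminCommand({ setFeatureCompatibilityVersion: "4.0" })
// Verify the current value:
db.adminCommand({ getParameter: 1, featureCompatibilityVersion: 1 })
```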

I'm not saying it's the only culprit. While analysing the slow queries, this one also gets logged. When we reviewed the 4.2 changes, writeMajorityCount was one of the additions, so I'm checking whether these w:majority queries can contribute to the performance issue we are experiencing.

I will provide them by tomorrow.

Hi Kevin,

I increased the log level and found that almost every 5 minutes there is a response time increase, which makes some of our queries time out. It looks like starting from 4.2 MongoDB added a MajorityService, and every 5 minutes the LogicalSessionCacheRefresh does a w:majority write with a wtimeout of 15000. During the same time the response time (in mongotop) goes very high (from 200 ms to 8 s).

```
2022-11-19T08:13:11.923+0000 D2 COMMAND  [conn424] run command config.$cmd { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:11.970+0000 D2 REPL     [conn424] Waiting for write concern. OpTime: { ts: Timestamp(1668845591, 1583), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }
2022-11-19T08:13:11.974+0000 I  COMMAND  [conn424] command config.$cmd command: update { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:7137 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 183 } }, ReplicationStateTransition: { acquireCount: { w: 366 } }, Global: { acquireCount: { r: 183, w: 183 } }, Database: { acquireCount: { w: 183 } }, Collection: { acquireCount: { w: 183 } }, Mutex: { acquireCount: { r: 366 } } } flowControl:{ acquireCount: 183, timeAcquiringMicros: 65 } storage:{} protocol:op_msg 51ms
2022-11-19T08:13:11.976+0000 D2 COMMAND  [conn424] run command config.$cmd { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:11.988+0000 D2 REPL     [conn424] Waiting for write concern. OpTime: { ts: Timestamp(1668845591, 1588), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }
2022-11-19T08:13:11.989+0000 I  COMMAND  [conn424] command config.$cmd command: delete { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:230 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 244 } }, ReplicationStateTransition: { acquireCount: { w: 488 } }, Global: { acquireCount: { r: 244, w: 244 } }, Database: { acquireCount: { w: 244 } }, Collection: { acquireCount: { w: 244 } }, Mutex: { acquireCount: { r: 246 } } } flowControl:{ acquireCount: 244, timeAcquiringMicros: 57 } storage:{} protocol:op_msg 12ms
2022-11-19T08:13:45.030+0000 D2 COMMAND  [LogicalSessionCacheRefresh] run command config.$cmd { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:45.088+0000 D2 REPL     [LogicalSessionCacheRefresh] Waiting for write concern. OpTime: { ts: Timestamp(1668845625, 368), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }
2022-11-19T08:13:45.094+0000 I  COMMAND  [LogicalSessionCacheRefresh] command config.$cmd command: update { update: "system.sessions", ordered: false, allowImplicitCollectionCreation: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:3160 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 226 } }, ReplicationStateTransition: { acquireCount: { w: 452 } }, Global: { acquireCount: { r: 226, w: 226 } }, Database: { acquireCount: { w: 226 } }, Collection: { acquireCount: { w: 226 } }, Mutex: { acquireCount: { r: 452 } } } flowControl:{ acquireCount: 226, timeAcquiringMicros: 69 } storage:{} protocol:op_msg 63ms
2022-11-19T08:13:45.094+0000 D2 COMMAND  [LogicalSessionCacheRefresh] run command config.$cmd { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" }
2022-11-19T08:13:45.127+0000 D2 REPL     [LogicalSessionCacheRefresh] Waiting for write concern. OpTime: { ts: Timestamp(1668845625, 457), t: 48 }, write concern: { w: "majority", wtimeout: 15000 }
2022-11-19T08:13:45.130+0000 I  COMMAND  [LogicalSessionCacheRefresh] command config.$cmd command: delete { delete: "system.sessions", ordered: false, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } numYields:0 reslen:230 locks:{ ParallelBatchWriterMode: { acquireCount: { r: 387 } }, ReplicationStateTransition: { acquireCount: { w: 774 } }, Global: { acquireCount: { r: 387, w: 387 } }, Database: { acquireCount: { w: 387 } }, Collection: { acquireCount: { w: 387 } }, Mutex: { acquireCount: { r: 459 } } } flowControl:{ acquireCount: 613, timeAcquiringMicros: 152 } storage:{} protocol:op_msg 35ms
```

Disabling the refresh via disableLogicalSessionCacheRefresh doesn't bring any improvement.
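(Both of these are startup-time parameters set via --setParameter on mongod; a sketch for confirming the values the server is actually running with:)

```javascript
db.adminCommand({ getParameter: 1, disableLogicalSessionCacheRefresh: 1 })
db.adminCommand({ getParameter: 1, logicalSessionRefreshMillis: 1 })
```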

With the same clients but MongoDB downgraded from 4.2 to 4.0.27, we get the expected performance. So for sure there is something in 4.2 that is impacting performance. We tried changing the FCV to 4.0 on a 4.2 mongod, but it didn't provide any better results.

Hi @venkataraman_r

I think the log snippet you posted is a symptom rather than a cause. I believe those are internal commands for server sessions and I don’t think they are the main cause of the slowdowns.

During the same time the response time (in mongotop) goes very high (from 200 ms to 8 s).

Could you post the output of mongostat and mongotop, plus some logs, during this period? If you can provide the output of mongostat and mongotop on both the 4.0 and 4.2 deployments, we might be able to see the difference between them.

With the same clients but MongoDB downgraded from 4.2 to 4.0.27, we get the expected performance.

Just to make sure we're on the same page: are you doing the upgrades and downgrades on the same hardware, or is it something like a blue/green deployment where the 4.0 deployment is on one set of hardware and the 4.2 deployment on another?

Best regards
Kevin