Hi, @Filippo_Del_Tedesco,
Welcome to the MongoDB Community Forums. I understand that you have a question about how you should handle `MongoWriteConcernException`. I wish that I had a simple answer or code snippet to share, but unfortunately it really depends on the durability guarantees that your application requires. Let me explain the mental model and hopefully that will assist in your decisions.
Writes to MongoDB clusters are routed to the primary, where the write is performed on the collection and recorded in the oplog. (I’ll skip over journalling, checkpoints, and related details. These mechanisms ensure that acknowledged writes to a single node are committed to disk even in the event of a failure.) The oplog entries are replicated to the secondaries, where they are also applied. The oplog is a serialized stream of write operations, ordered in time. By applying a write concern such as `w: 2`, you are requesting that the primary only acknowledge the write (with a response of `ok: 1`) after at least two cluster nodes have applied it (e.g. the primary and at least one secondary). If the secondaries are slow to replicate the oplog entries, the write concern can time out, resulting in a `MongoWriteConcernException`. The write still happened on the primary, but none of the secondaries were able to replicate it within the time limit. If the write had failed on the primary, you would have received a different error such as `MongoDuplicateKeyException`.
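As a rough illustration, here is how that could look with the Java sync driver. The connection string, database/collection names, and the 2-second `wtimeout` below are placeholder assumptions, not anything from your application:

```java
import com.mongodb.MongoWriteConcernException;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.concurrent.TimeUnit;

public class WriteConcernTimeoutExample {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            // Ask for acknowledgement from two members, waiting at most 2 seconds.
            MongoCollection<Document> orders = client
                    .getDatabase("shop")
                    .getCollection("orders")
                    .withWriteConcern(WriteConcern.W2.withWTimeout(2, TimeUnit.SECONDS));

            try {
                orders.insertOne(new Document("orderId", 1001).append("status", "new"));
            } catch (MongoWriteConcernException e) {
                // The write was applied on the primary, but a second member did not
                // acknowledge it within 2 seconds. It may still replicate later.
                System.err.println("Write concern not satisfied: " + e.getWriteConcernError());
            }
        }
    }
}
```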
The oplog entry is still on the primary. The secondaries might have even replicated said oplog entry but haven’t applied it yet. Assuming that no cluster members crash, that oplog entry will eventually get applied.
But what happens if the primary crashes? In that case the remaining members will elect a new primary. Which member gets elected is influenced by which secondary has the most recent data (i.e. which oplog entries it has applied).
This is where things become complicated. Let’s say you have performed writes `w1` and `w2` on the primary (node-a) and both returned a `MongoWriteConcernException`. The NIC on node-a then fails. node-b replicated `w1` (but only after the `MongoWriteConcernException` happened), but not `w2`. node-c has neither write. node-b is elected the new primary (because it has applied more of the oplog). node-a’s NIC is replaced and it comes back online. When it rejoins, it realizes that `w2` was never replicated before it crashed and rolls that change back before resuming as a secondary.
In the absence of node or network failures, writes that result in a `MongoWriteConcernException` will eventually be replicated to the other nodes - just not within the time limit of the write acknowledgement. But in the face of node or network failures, you can’t be certain.
How do you handle this in your application? If you must guarantee that writes succeed even in the face of failures, wrap your writes in a transaction and don’t special-case `MongoWriteConcernException`. If a write returns a failure, roll it back and try again.
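A minimal sketch of that approach with the Java driver’s `withTransaction` helper (the collection names and documents are made up for illustration; `withTransaction` commits with the write concern you give it and retries the body on transient errors):

```java
import com.mongodb.TransactionOptions;
import com.mongodb.WriteConcern;
import com.mongodb.client.ClientSession;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class TransactionalWriteExample {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> orders =
                    client.getDatabase("shop").getCollection("orders");

            // Commit with a majority write concern so the transaction survives
            // a primary failover once it has been acknowledged.
            TransactionOptions txnOptions = TransactionOptions.builder()
                    .writeConcern(WriteConcern.MAJORITY)
                    .build();

            try (ClientSession session = client.startSession()) {
                // If the transaction fails, none of its writes are visible,
                // so the application can simply run the whole body again.
                session.withTransaction(() -> {
                    orders.insertOne(session,
                            new Document("orderId", 1001).append("status", "new"));
                    orders.updateOne(session,
                            new Document("orderId", 1000),
                            new Document("$set", new Document("status", "shipped")));
                    return null;
                }, txnOptions);
            }
        }
    }
}
```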
If your application can tolerate the occasional lost write when the primary fails, then you can rely on the eventual consistency guarantees provided by oplog replication. The result will be better performance at the cost of slightly lower durability guarantees.
A middle ground could be special-casing verification logic in your `MongoWriteConcernException` handler. You would have to think through what compensation logic you would implement if the write existed on the primary but had not yet replicated to the secondary. You would also need to handle the situation where the write doesn’t exist on the new primary (because that oplog entry didn’t get replicated before the original primary lost connectivity) but your app node crashes while performing the compensating write. Did the compensating write succeed before the app node crashed? Did it fail? How can you tell when the app recovers?
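Purely as a sketch of the kind of verification you would be signing up for (not a complete solution - the helper, filter, and field names are invented, and a majority read only tells you whether the write is majority-committed at that instant):

```java
import com.mongodb.MongoWriteConcernException;
import com.mongodb.ReadConcern;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

public class WriteConcernVerification {
    // Hypothetical helper: decide what to do after a write-concern timeout.
    static void insertWithVerification(MongoCollection<Document> orders, Document newOrder) {
        try {
            orders.insertOne(newOrder);
        } catch (MongoWriteConcernException e) {
            // The document is on the primary but may not have replicated yet.
            // A majority read tells us whether it has been majority-committed
            // right now; it cannot tell us whether it will survive a failover
            // that happens a moment later.
            Document committed = orders
                    .withReadConcern(ReadConcern.MAJORITY)
                    .find(Filters.eq("orderId", newOrder.get("orderId")))
                    .first();

            if (committed == null) {
                // Not majority-committed yet. The application has to choose:
                // retry the insert (and risk a duplicate if the original write
                // later replicates), or record the order as unconfirmed and
                // reconcile it later.
                System.err.println("Order " + newOrder.get("orderId") + " not majority-committed yet");
            }
        }
    }
}
```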
Hopefully this explanation provides you the mental model that you need to make the correct design choices for your application.
Sincerely,
James