Hi,
I am extremely grateful for your detailed answer and explanation, and at the same time very sorry for my long silence afterwards. Life has been very busy and confusing until very recently; apologies.
Unfortunately, despite the answer being great, I am still puzzled by what I see in MongoDB. Specifically, I’d like to share the following scenario with you, and perhaps you can spot where my understanding goes wrong.
SETUP
I am working with a cluster of 3 nodes, as reported by this (trimmed) invocation of rs.status(), run on the primary.
m# rs.status()
---------------
members: [
{
_id: 3,
name: '172.31.12.230:27017',
health: 1,
state: 1,
stateStr: 'PRIMARY',
...
_id: 4,
name: '172.31.5.205:27017',
health: 1,
state: 2,
stateStr: 'SECONDARY',
...
_id: 5,
name: '172.31.32.216:27017',
health: 1,
state: 2,
stateStr: 'SECONDARY',
]
---------------
CONFUSING RUN
The confusing run I’d like to share with you goes as follows:
- connect to the primary via mongosh, running it locally on that node
- declare a document
m# const doc = {nname: "test", value: 1};
- change the db for health and safety
m# use adatabasenamenotfoundinprod;
- create a session
m# const session = db.getMongo().startSession();
- get a collection that doesn’t exist
m# col = session.getDatabase(db.getName()).collection0823_1304;
- count the documents in the collection, get 0 as expected
m# col.countDocuments({});
0
- start a transaction with a legitimate write concern (since we have 3 nodes)
m# session.startTransaction({writeConcern: {w: 3, wtimeout: 100}});
- turn off one of the secondary nodes deliberately, by running the following command on the node itself
mSECONDARY# db.adminCommand({shutdown: 1});
- confirm one secondary is gone by running rs.status() on the primary, getting this (super trimmed) content for the node that received the shutdown command
m# rs.status();
_id: 5,
name: '172.31.32.216:27017',
health: 0,
state: 8,
stateStr: '(not reachable/healthy)',
uptime: 0,
- try inserting the document in the primary
m# col.insertOne(doc);
{
acknowledged: true,
insertedId: ObjectId('68a99eb853e989d3df6b140b')
}
- try to commit the transaction, get the (familiar, slightly trimmed in this case) exception
m# session.commitTransaction();
Uncaught:
MongoWriteConcernError[WriteConcernFailed]: waiting for replication timed out
Additional information: {
wtimeout: true,
writeConcern: { w: 3, wtimeout: 100, provenance: 'clientSupplied' }
}
Result: {
writeConcernError: {
code: 64,
codeName: 'WriteConcernFailed',
errmsg: 'waiting for replication timed out',
errInfo: {
wtimeout: true,
writeConcern: { w: 3, wtimeout: 100, provenance: 'clientSupplied' }
}
},
ok: 1,
...
}
- try to abort the transaction, get another exception
m# session.abortTransaction();
MongoTransactionError: Cannot call abortTransaction after calling commitTransaction
- check the collection’s content; the insert has apparently happened
m# col.countDocuments({});
1
- end the session
m# session.endSession();
- count the documents outside the session
m# db.collection0823_1304.countDocuments({});
1
- quit the console, then reconnect
m# quit()
b# mongosh
- count again, see 1
m# use adatabasenamenotfoundinprod;
m# db.collection0823_1304.countDocuments({});
1
- connect to the secondary that remained alive all the way, see that indeed the collection yields 1
ms# use adatabasenamenotfoundinprod;
switched to db adatabasenamenotfoundinprod
ms# db.collection0823_1304.countDocuments({});
1
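In case it helps to reproduce this, the run above can be condensed into a single mongosh function. This is only a sketch: the name reproduceRun and its parameters are mine, and it assumes that one secondary has already been shut down by hand before it is called (in my run I shut it down after startTransaction and before the insert).

```javascript
// Condensed sketch of the run above, meant for mongosh connected to the primary.
// ASSUMPTION: one secondary was already shut down manually before calling this.
// reproduceRun, conn, dbName and collName are illustrative names, not an API.
function reproduceRun(conn, dbName, collName) {
  const doc = { nname: "test", value: 1 };
  const session = conn.startSession();
  const col = session.getDatabase(dbName).getCollection(collName);

  // Count before the transaction (0 in my run).
  const before = col.countDocuments({});

  // Same write concern as in my run: all 3 nodes, 100 ms timeout.
  session.startTransaction({ writeConcern: { w: 3, wtimeout: 100 } });
  col.insertOne(doc);

  // Record the commit error instead of letting it escape.
  let commitError = null;
  try {
    session.commitTransaction();
  } catch (err) {
    commitError = err.message;
  }

  // Count after the (failed) commit, still through the session.
  const after = col.countDocuments({});
  session.endSession();
  return { before, after, commitError };
}
```

In mongosh I would call it as reproduceRun(db.getMongo(), "adatabasenamenotfoundinprod", "collection0823_1304") and compare the returned before/after counts with the commit error.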
If I extract a couple of sentences from your reply, i.e.
If you must guarantee that writes succeed even in the face of failures, write your writes in a transaction and don’t special case MongoWriteConcernException. If a write returns a failure, roll it back and try again.
I do wonder: how would this work in the case I just shared? I think steps 10-11 behave exactly as recommended (I just commit, and roll back in case of failure), but that strategy doesn’t seem to work and, to make it more confusing, the effects of the transaction seem to survive the whole run.
Apologies for the rather long reply. Hopefully you get to see the world through my eyes, and you can tell me what I am still missing.
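To make the question concrete, this is roughly how I understood the "roll it back and try again" advice, written as a mongosh sketch. The function name, the work callback and the shape of the return value are my own placeholders, not anything from your reply or from the driver.

```javascript
// Sketch of the "commit, roll back on failure, retry" strategy as I understood it.
// runWithRollbackAndRetry, work and the returned object are hypothetical names.
// `session` is a mongosh session; `work(session)` does the writes in the transaction.
function runWithRollbackAndRetry(session, work, txnOptions, maxAttempts) {
  const errors = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    session.startTransaction(txnOptions);
    try {
      work(session);
      session.commitTransaction();
      return { status: "committed", attempts: attempt, errors };
    } catch (err) {
      errors.push(err.message);
      try {
        // Roll back before trying again.
        session.abortTransaction();
      } catch (abortErr) {
        // This is where my run ends up: the abort is refused after a commit
        // attempt, so I cannot tell from here whether the write was applied.
        errors.push(abortErr.message);
        return { status: "unknown", attempts: attempt, errors };
      }
    }
  }
  return { status: "gave up", attempts: maxAttempts, errors };
}
```

With the commit failing as in my run, this returns status "unknown" after a single attempt, carrying both error messages, and there is nothing left for me to roll back or retry.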
Regards,
Filippo