Crash Consistency for a batch update query

Anand_Kayande · August 23, 2023, 6:13am

Can someone tell me if MongoDB provides this type of crash consistency - Suppose there are 3 shards - A, B and C, in a cluster. And we are executing a big batch “atomic” update query that will update few rows on each of the shard - A, B and C. Now, suppose update on shard A is done, but before B and C are executed, both those nodes are crashed. Now, when all 3 nodes comes back, will they have updates available on A only without B and C updates? Or how? In general, how this scenario will be handled?

Kobe_W · August 24, 2023, 4:33am

I’m not sure how mongodb implements cross shard transaction internally. But if it claims (which it is) the transaction across shards are atomic meaning all or nothing , it has to ensure that eventually all happen or nothing happens.

In terms of when the state will be in sync, no idea.

You can try reading concept of two/three phase commit.

Anand_Kayande · August 29, 2023, 8:42am

Thanks @Kobe_W. Another question on similar lines - Suppose MongoDB is having a big “batch atomic transaction” - updating 3 shards A, B and C. Now, after A is updated and before B and C are updated, if both shards B and C are down, what would happen? It will roll back updates on A or what? How MongoDB handles such scenario? Any official document supporting it?

Kobe_W · August 29, 2023, 8:04pm

If the transaction should succeed, then the transaction coordinator should ensure once B and C are back online, the change will be made.

If the transaction should be aborted, then the transaction coordinator should ensure once B and C are back online, any changes are reverted.

cross node transaction is difficult. Nodes can crash at any time, and it may take a long time for those nodes to be come in sync again. (e.g. imagine what happens if B and C are never back online? or back after 1 year?).

Anand_Kayande · August 30, 2023, 5:33am

Thanks again @Kobe_W. That sounds perfect.

In existing MongoDB implementation i.e. 4.2 onwards for its SSPL version, how this scenario is handled?

I mean, did they follow part 1 of what you mentioned (i.e. If the transaction should succeed) or part 2 (i.e. if transaction should be aborted)?

Basically, I want to know if MongoDB maintain the “atomicity” (all or none) of distributed batch transaction across the shards? I am asking for MongoDB SSPL and not for Atlas. For Atlas, I imagine they maintain the atomicity.

Anand_Kayande · August 30, 2023, 5:35am

Secondly, how do I test this scenario in-house? Though it sounds simple, it is not to reproduce. Any thoughts?

Any official document from MongoDB on this? I tried, but couldn’t find any.

Top1_seo_N_A · September 17, 2023, 3:42pm

I’m not sure how cross-shard transactions are implemented internally by mongodb. However, if it asserts—as it does—that the transaction between shards is atomic, file meaning all or nothing, then it must make sure that eventually all occur or nothing occurs.
I have no idea when the state will be synchronized.
Try reading up on the two- or three-phase commit paradigm.