MongoDB community backup using snapshots with journal on a separate disk/partition

Pawel_Gietki · January 25, 2024, 9:10am

Hi,

Let’s assume we have a sharded cluster with NO replicas (3 shards, primaries only).
Each shard has a separate disk for:

data: 5 TB (/dev/sdb),
journal: 20 GB (/dev/sdc),
logs: 20 GB (/dev/sdd).

I want to back up each shard with snapshots from the cloud provider (GCP or AWS, it doesn’t matter which), using its scheduled snapshot mechanism. Each disk of each shard is snapshotted individually, so the snapshots of one shard may be up to an hour apart, e.g. data disk at 12:00 and journal at 12:59, or the other way round (the timing depends entirely on the cloud provider).

Now let’s assume that the cluster is always online and heavily used (many writes), so I cannot lock it (i.e. I cannot invoke db.fsyncLock()).

The MongoDB documentation states:

If the journal and data files are on the same logical volume, you can use a single point-in-time snapshot to capture a consistent copy of the data files.
If the journal and data files are on different file systems, you must use db.fsyncLock() and db.fsyncUnlock() to ensure that the data files do not change, providing consistency for the purposes of creating backups.
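
For readers who have not used it, the lock/snapshot/unlock pattern the quoted passage describes can be sketched in Python. This is only an illustration: `run_admin_command` and `take_snapshots` are hypothetical hooks supplied by the caller (with PyMongo, `client.admin.command` could serve as the first), and the two dictionaries are the `fsync` (with `lock: true`) and `fsyncUnlock` database commands that sit behind db.fsyncLock() and db.fsyncUnlock().

```python
from typing import Any, Callable, Mapping


def snapshot_while_locked(
    run_admin_command: Callable[[Mapping[str, Any]], Any],
    take_snapshots: Callable[[], None],
) -> None:
    """Flush and block writes, snapshot the disks, then always unlock."""
    # Database command behind db.fsyncLock(): flush to disk and block writes.
    run_admin_command({"fsync": 1, "lock": True})
    try:
        # Hypothetical hook: start the data and journal disk snapshots here.
        take_snapshots()
    finally:
        # Database command behind db.fsyncUnlock(); runs even if a snapshot fails.
        run_admin_command({"fsyncUnlock": 1})


if __name__ == "__main__":
    # Dry run: record the steps instead of talking to a real server.
    issued = []
    snapshot_while_locked(
        issued.append,
        lambda: issued.append("snapshot /dev/sdb and /dev/sdc"),
    )
    for step in issued:
        print(step)
```

The `try`/`finally` is the point of the sketch: writes stay blocked for as long as the lock is held, so the unlock should be issued even when a snapshot call fails.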

So in my specific scenario (no replicas + cannot lock the DB) the above condition for locking the database cannot be fulfilled.

The question is: will this scenario even work as a backup method? Is it possible to restore from such a backup (with data and journal on separate disks, snapshotted at different times)? Will MongoDB even start?
Important assumption: I can live with inconsistency between the data and the journal during the last hour (the window in which the cloud provider was creating the snapshots). I only need the data to be consistent as of the moment before the first snapshot started.

In other words: what does the MongoDB manual mean by:

… you must use db.fsyncLock() and db.fsyncUnlock() to ensure that the data files do not change, providing consistency for the purposes of creating backups.

If any more info is needed, please say.

All the best folks!

chris · January 26, 2024, 12:27pm

No. You may as well throw away the journal at this point; the data volume will be consistent as of the last checkpoint.

However, multiple shards snapshotted at different times will have irreconcilable inconsistencies. Without the recommended deployment of replica sets for the config servers and shards, there are no secondaries that could be fsyncLocked, so this must be done on the primaries.

Additionally, the balancer must be stopped before the backup:
https://www.mongodb.com/docs/manual/administration/backup-sharded-clusters/
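
As a rough illustration of how the pieces in this reply fit together, the Python sketch below only builds an ordered list of steps; it does not talk to a cluster. The host names are hypothetical, the ordering is one reading of the advice above (balancer off, lock, snapshot, unlock, balancer on), and the exact procedure varies by MongoDB version, so the linked documentation page remains the authority. `balancerStop` and `balancerStart` are the admin commands issued through mongos.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (target, action)


def backup_plan(config_server: str, shards: List[str]) -> List[Step]:
    """Return the ordered steps for a locked snapshot backup of the cluster."""
    nodes = [config_server, *shards]  # config server first, then the shards
    plan: List[Step] = [("mongos", "balancerStop")]
    plan += [(node, "fsync with lock") for node in nodes]
    plan += [(node, "snapshot disks") for node in nodes]
    plan += [(node, "fsyncUnlock") for node in nodes]
    plan.append(("mongos", "balancerStart"))
    return plan


if __name__ == "__main__":
    # Hypothetical host names for a 3-shard cluster.
    for target, action in backup_plan("cfg0", ["shard0", "shard1", "shard2"]):
        print(f"{target}: {action}")
```

Every node is locked before the first snapshot starts and unlocked only after the last one, which is what keeps all components at the same point in time.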

Sameer_Kattel · March 19, 2025, 11:39am

Hi @chris
I am in a similar situation.
If we just stop the balancer, what kind of irreconcilable inconsistencies can arise if fsyncLock is bypassed?

Also, if the journal and data are on the same partition, is it OK to skip fsyncLock before snapshotting the disk?

Hi @Pawel_Gietki, did you end up doing fsyncLock before snapshotting?

chris · March 22, 2025, 5:22pm

In this topic, the shards and config server were single-member replica sets.
The shard snapshots could be up to an hour apart, which inherently produces an inconsistent backup of the cluster. If the balancer is not disabled, there is a further difference between the config server and the shards.

A restore of those backups would have each component at a different point on the timeline.
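
To put a number on "a different point on the timeline", here is a minimal Python sketch. The timestamps are made up, modelled on the 12:00 / 12:59 example from the first post, and `snapshot_skew` is a hypothetical helper, not part of any MongoDB tooling.

```python
from datetime import datetime, timedelta
from typing import Mapping


def snapshot_skew(snapshot_times: Mapping[str, datetime]) -> timedelta:
    """Return the gap between the earliest and the latest snapshot."""
    return max(snapshot_times.values()) - min(snapshot_times.values())


# Hypothetical snapshot times for two shards with separate data/journal disks.
SNAPSHOT_TIMES = {
    "shard0/data": datetime(2024, 1, 25, 12, 0),
    "shard0/journal": datetime(2024, 1, 25, 12, 59),
    "shard1/data": datetime(2024, 1, 25, 12, 20),
    "shard1/journal": datetime(2024, 1, 25, 12, 5),
}

if __name__ == "__main__":
    print(snapshot_skew(SNAPSHOT_TIMES))  # 0:59:00
```

Any write that lands inside that window can be present in one component's snapshot and absent from another's.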

For a sharded cluster, fsyncLock is still required to prevent writes to the cluster.

With a self-managed deployment, Cloud Manager and Ops Manager provide backups without needing to stop the balancer or writes. This comes at a cost, however.

Sameer_Kattel · March 24, 2025, 7:15am

Thanks for the reply @chris !

I am still failing to understand the inconsistency.

Provided the balancer is stopped, no data migration is happening, and all snapshots are taken at nearly the same time, what kind of inconsistency can arise if we don’t fsyncLock?
Data loss is acceptable. The only requirement is that one should be able to spin up a working mongo with restored data from snapshot.

chris · March 24, 2025, 9:34pm

Yup. No problem then.

Sameer_Kattel · March 25, 2025, 1:36am

When I said data loss, I did not mean complete data loss, only the loss of data that might not have been fully synced to all replicas.

Thanks again for the prompt response!