In a recent webinar, Andrew Erlichson, MongoDB VP of Education & Cloud Services, delivered a presentation comparing and contrasting different MongoDB backup strategies. We thought we’d continue on that topic by comparing these options in depth on the MMS blog.
There are three main approaches for backing up MongoDB:
- mongodump, a utility bundled with the MongoDB database
- Filesystem snapshots, such as those provided by Linux LVM or AWS EBS
- MongoDB Management Service (MMS), a fully managed cloud service that provides continuous backup for MongoDB (also available as on-prem software with a MongoDB subscription)
Let’s look at each of these strategies in further detail.
Backup Strategy #1: mongodump
mongodump is a tool bundled with MongoDB that performs a backup of the data in MongoDB. mongodump may be used to dump an entire database, collection, or result of a query. mongodump can produce a consistent snapshot of the data by dumping the oplog. You can then use the mongorestore utility to restore the data to a new or existing database. mongorestore will import content from the BSON database dumps produced by mongodump and replay the oplog.
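As a rough illustration of the dump-and-restore flow described above, the following Python sketch only builds the command lines; it does not run them. The helper names, host, and paths are hypothetical. The flags used (`--host`, `--out`, `--db`, `--collection`, `--query`, `--oplog`, `--oplogReplay`) are standard mongodump and mongorestore options, but check them against the documentation for your tool version. The sketch assumes that `--oplog` applies only to a full-instance dump, so it rejects combining it with a database, collection, or query filter.

```python
import shlex
from typing import List, Optional


def mongodump_args(out_dir: str, host: str = "localhost:27017",
                   db: Optional[str] = None, collection: Optional[str] = None,
                   query: Optional[str] = None, oplog: bool = False) -> List[str]:
    """Build a mongodump command line (hypothetical helper, nothing is executed)."""
    # Assumption: --oplog is only valid when dumping the whole instance.
    if oplog and (db or collection or query):
        raise ValueError("--oplog cannot be combined with db/collection/query filters")
    if collection and not db:
        raise ValueError("a collection filter requires a database")
    if query and not collection:
        raise ValueError("a query filter requires a collection")
    args = ["mongodump", "--host", host, "--out", out_dir]
    if db:
        args += ["--db", db]
    if collection:
        args += ["--collection", collection]
    if query:
        args += ["--query", query]
    if oplog:
        args.append("--oplog")
    return args


def mongorestore_args(dump_dir: str, host: str = "localhost:27017",
                      oplog_replay: bool = False) -> List[str]:
    """Build a mongorestore command line for a directory produced by mongodump."""
    args = ["mongorestore", "--host", host]
    if oplog_replay:
        # Replay the oplog captured by `mongodump --oplog` after loading the BSON files.
        args.append("--oplogReplay")
    args.append(dump_dir)
    return args


if __name__ == "__main__":
    for cmd in (mongodump_args("/backup/full", oplog=True),
                mongorestore_args("/backup/full", oplog_replay=True)):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The returned lists can be passed to `subprocess.run(args, check=True)` once the paths and hosts match your environment.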
mongodump is a straightforward approach and has the benefit of producing backups that can be filtered based on your specific needs. While mongodump is sufficient for small deployments, it is not appropriate for larger systems because it places too much load on the database to be a truly scalable solution. It is not an incremental approach, so it requires a complete dump at each snapshot point, which is resource-intensive. As your system grows, you should evaluate lower-impact solutions such as filesystem snapshots or MMS.
In addition, while the complexity of deploying mongodump for small configurations is fairly low, the complexity of deploying mongodump in large sharded systems can be significant.
Backup Strategy #2: Copying the Underlying Files
You can back up MongoDB by copying the underlying files that the database processes use to store data. To obtain a consistent snapshot of the database, you must either stop all writes to the database and use standard file system copy tools, or create a snapshot of the entire file system, if your volume manager supports it.
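The first option, stopping writes and then copying the files, could be sketched as follows. This is a minimal sketch under several assumptions: `client` is expected to behave like a PyMongo `MongoClient`, the server is assumed to support the `fsync` command with `lock` and the `fsyncUnlock` command (availability depends on the server version), and the helper names and paths are hypothetical.

```python
import shutil
from contextlib import contextmanager


@contextmanager
def writes_blocked(client):
    """Flush pending writes to disk and block new writes inside the with-block."""
    # Assumption: `client.admin.command` sends a database command to the
    # admin database, as a PyMongo MongoClient does.
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the copy fails.
        client.admin.command("fsyncUnlock")


def copy_data_files(client, dbpath: str, dest: str) -> None:
    """Copy the MongoDB data directory to `dest` while writes are blocked."""
    with writes_blocked(client):
        # `dest` must not exist yet; copytree creates it.
        shutil.copytree(dbpath, dest)
```

Releasing the lock in a `finally` clause matters here: a failed copy should not leave the database refusing writes.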
For example, Linux LVM quickly and efficiently creates a consistent snapshot of the file system that can be copied for backup and restore purposes. To ensure that the snapshot is logically consistent, you must have journaling enabled within MongoDB.
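To make the LVM approach concrete, here is a small Python sketch that assembles the shell commands for one snapshot cycle: create the snapshot, archive it, and remove it. It only builds the command strings. The volume group, logical volume, snapshot name, size, and archive path are hypothetical and must be adapted to your layout. The snapshot size is the space reserved for changes made while the snapshot exists, not the size of the data.

```python
import shlex
from typing import List


def lvm_snapshot_plan(volume_group: str, logical_volume: str, snapshot_name: str,
                      snapshot_size: str, archive_path: str) -> List[str]:
    """Return shell commands that snapshot, archive, and drop an LVM volume."""
    origin = f"/dev/{volume_group}/{logical_volume}"
    snapshot = f"/dev/{volume_group}/{snapshot_name}"
    return [
        # 1. Create a copy-on-write snapshot of the volume holding the data files.
        "lvcreate --size {} --snapshot --name {} {}".format(
            shlex.quote(snapshot_size), shlex.quote(snapshot_name), shlex.quote(origin)),
        # 2. Stream the snapshot into a compressed archive for off-host storage.
        "dd if={} | gzip > {}".format(shlex.quote(snapshot), shlex.quote(archive_path)),
        # 3. Remove the snapshot so it stops consuming copy-on-write space.
        "lvremove -f {}".format(shlex.quote(snapshot)),
    ]


if __name__ == "__main__":
    for command in lvm_snapshot_plan("vg0", "mongodb", "mdb-snap01",
                                     "100M", "/backup/mdb-snap01.gz"):
        print(command)
```

Step 2 contains a pipe and a redirect, so it has to be run through a shell rather than passed as an argument list.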
Because backups are taken at the storage level, filesystem snapshots can be a more efficient approach than mongodump for taking full backups and restoring them. However, unlike mongodump, snapshots are a coarser approach: you don’t have the flexibility to target specific databases or collections in your backup. This may result in large backup files, which in turn may result in long-running backup operations.
Implementing filesystem snapshots requires ongoing maintenance as your system evolves and becomes more complex. Coordinating backups across multiple replica sets, particularly in a sharded system, requires devops expertise to ensure consistency across the various components.
Backup Strategy #3: MongoDB Management Service (MMS)
MongoDB Management Service provides continuous, online backup for MongoDB as a fully managed service. You install the Backup Agent in your environment, which conducts an initial sync to MongoDB’s secure and redundant datacenters. After the initial sync, the Backup Agent streams encrypted and compressed MongoDB oplog data to MMS so that you have a continuous backup.
By default, MMS takes snapshots every 6 hours and oplog data is retained for 24 hours. The snapshot schedule and retention policy can be configured to meet your requirements. You also have the flexibility to exclude non-mission-critical databases and collections.
For replica sets, a custom, point-in-time snapshot can be restored for any moment in the last 24 hours. For sharded systems, MMS produces a consistent snapshot of the cluster every 6 hours. You also have the option of using checkpoints for more granular restores of sharded clusters.
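The default numbers above (a snapshot every 6 hours, 24 hours of oplog) can be turned into a small worked example. The Python sketch below is a conceptual model only, not the MMS API: it assumes a point-in-time restore starts from the latest snapshot at or before the requested moment and then replays oplog entries up to that moment. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Default from the text above; configurable in practice.
OPLOG_RETENTION = timedelta(hours=24)


def plan_point_in_time_restore(snapshots: List[datetime], target: datetime,
                               now: datetime) -> Tuple[datetime, timedelta]:
    """Pick the base snapshot and the span of oplog to replay on top of it."""
    if target > now:
        raise ValueError("target is in the future")
    if now - target > OPLOG_RETENTION:
        raise ValueError("target is older than the retained oplog window")
    candidates = [s for s in snapshots if s <= target]
    if not candidates:
        raise ValueError("no snapshot exists at or before the target")
    base = max(candidates)          # latest snapshot not after the target
    return base, target - base      # oplog span to replay after loading it


if __name__ == "__main__":
    now = datetime(2014, 1, 2, 12, 0)
    # Snapshots every 6 hours over the last day.
    snapshots = [now - timedelta(hours=h) for h in (24, 18, 12, 6, 0)]
    target = now - timedelta(hours=7, minutes=30)
    base, replay = plan_point_in_time_restore(snapshots, target, now)
    print("base snapshot:", base, "oplog to replay:", replay)
```

With a 6-hour snapshot interval, the oplog span to replay in this model is always shorter than 6 hours.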
Because MMS reads only the oplog, the ongoing performance impact is minimal, similar to that of adding an additional node to the replica set.
In addition to the cloud-based service, MMS is available as on-prem software as part of a MongoDB Standard or Enterprise Subscription.
To get a more in-depth review of MongoDB backup strategies, read Backup and its Role in Disaster Recovery.