Enhancing Atlas Online Archive With Data Expiration and Scheduled Archiving

Atlas's Online Archive feature allows you to set archiving rules to move data that is not frequently accessed from your Atlas cluster to a MongoDB-managed object store. It also allows you to query both your Atlas and Online Archive data in a unified manner without having to worry about the tier in which the data resides.

We are enhancing this feature with two new capabilities, both in preview: data expiration and scheduled archiving.

Data expiration: Online Archive makes it easy to tier data out of a live database into an object store, but what if you want to set a second threshold to delete data from the archive entirely? Perhaps you don’t want to store data indefinitely due to costs, or maybe you have a compliance requirement for data expiration that means you need to ensure deletion on a schedule.

Previously, the only way to remove data from the archive was to delete the archive completely, which is rarely what you want. With the new data expiration feature, you can specify how many days data should remain in the online archive before it is deleted. The expiration period can be as short as seven days or as long as 9,125 days (about 25 years). You can set it through either the Atlas UI or the Admin API, and you can edit expiration rules after creation if needed.
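As a rough illustration of those limits, the sketch below checks an expiration value on the client side before it is sent anywhere. The helper name `build_expiration_rule` is hypothetical, and the payload keys `dataExpirationRule` and `expireAfterDays` are assumptions modeled on Admin API naming; check the Online Archive API reference for the exact request shape.

```python
MIN_EXPIRE_DAYS = 7      # lowest expiration stated in this post
MAX_EXPIRE_DAYS = 9125   # highest expiration stated in this post (25 * 365)


def build_expiration_rule(expire_after_days: int) -> dict:
    """Return a payload fragment for a data expiration rule.

    The key names are assumptions for illustration, not a confirmed schema.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(expire_after_days, bool) or not isinstance(expire_after_days, int):
        raise TypeError("expire_after_days must be an integer number of days")
    if not MIN_EXPIRE_DAYS <= expire_after_days <= MAX_EXPIRE_DAYS:
        raise ValueError(
            f"expire_after_days must be between {MIN_EXPIRE_DAYS} "
            f"and {MAX_EXPIRE_DAYS}, got {expire_after_days}"
        )
    return {"dataExpirationRule": {"expireAfterDays": expire_after_days}}


if __name__ == "__main__":
    # Keep archived data for one year before it is deleted.
    print(build_expiration_rule(365))
```

Rejecting out-of-range values locally is only a convenience; the service remains the authority on which values it accepts.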

Note that once data is deleted from the archive, it cannot be recovered, so define your rules carefully.

Archiving with the data expiration feature

Scheduled archiving: Previously, the archiving process ran every five minutes. Though in most cases this is acceptable, some customers were concerned that this process could affect cluster performance when, for example, a cluster is running close to capacity during a specific time period. If an archiving run overlaps with this period, it may overload the cluster and lead to stability issues. These customers requested the ability to schedule archiving during an off-peak time, when their clusters have spare capacity. With this requirement in mind, we are thrilled to introduce the scheduled archiving feature.

You can configure the scheduled window by setting rules. The window can repeat every day, every week, or every month, depending on your preference. To ensure that the archive process can work through any backlog that has accrued, the window must be at least two hours long.
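The window rules can be sketched as a small client-side check. Everything named here is an assumption for illustration: the helpers `window_minutes` and `build_schedule` are hypothetical, the payload keys (`schedule`, `type`, `startHour`, `dayOfWeek`, and so on) are modeled on Admin API naming rather than confirmed, and the sketch assumes a window may cross midnight.

```python
from datetime import time
from typing import Optional

MIN_WINDOW_MINUTES = 120  # two-hour minimum stated in this post
FREQUENCIES = {"DAILY", "WEEKLY", "MONTHLY"}


def window_minutes(start: time, end: time) -> int:
    """Length of the window in minutes, allowing it to wrap past midnight."""
    start_m = start.hour * 60 + start.minute
    end_m = end.hour * 60 + end.minute
    return (end_m - start_m) % (24 * 60)


def build_schedule(
    frequency: str,
    start: time,
    end: time,
    day_of_week: Optional[int] = None,
    day_of_month: Optional[int] = None,
) -> dict:
    """Return a payload fragment for a scheduled archiving window.

    The key names are assumptions for illustration, not a confirmed schema.
    """
    if frequency not in FREQUENCIES:
        raise ValueError(f"frequency must be one of {sorted(FREQUENCIES)}")
    length = window_minutes(start, end)
    if length < MIN_WINDOW_MINUTES:
        raise ValueError(
            f"window is {length} minutes; at least {MIN_WINDOW_MINUTES} required"
        )
    schedule = {
        "type": frequency,
        "startHour": start.hour,
        "startMinute": start.minute,
        "endHour": end.hour,
        "endMinute": end.minute,
    }
    # Weekly and monthly windows also need to know which day they fall on.
    if frequency == "WEEKLY":
        if day_of_week is None:
            raise ValueError("day_of_week is required for a WEEKLY schedule")
        schedule["dayOfWeek"] = day_of_week
    elif frequency == "MONTHLY":
        if day_of_month is None:
            raise ValueError("day_of_month is required for a MONTHLY schedule")
        schedule["dayOfMonth"] = day_of_month
    return {"schedule": schedule}


if __name__ == "__main__":
    # A weekly three-hour window starting at 01:00.
    print(build_schedule("WEEKLY", time(1, 0), time(4, 0), day_of_week=7))
```

Computing the length modulo 24 hours lets a window such as 23:00 to 01:30 count as 150 minutes. A window whose start and end are identical is treated as zero minutes and rejected.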

A scheduled archiving window set to repeat every week

You can also edit an existing archive rule to change when data is archived and when it is deleted.

Together, data expiration and scheduled archiving give Atlas customers more control over how long archived data is retained and over when archiving load reaches their clusters. Both are in preview mode and will be generally available soon. See the documentation for additional information about data expiration and scheduled archiving.