Archiving data to "MongoDB Online Archive" or an "S3 archive": which is better for performance and cost?

MongoDB provides options to archive data to cloud object storage. We can use MongoDB’s native “Online Archive” feature, or archive our data to our own S3 storage and access it through federated queries. I would like to know which archiving approach is better in terms of query read speed and cost.

Hi @Vivek_Paramasivam1, and welcome to the MongoDB community forums!

MongoDB Online Archive is a fully managed service that moves infrequently accessed data from your Atlas cluster to a MongoDB-managed, read-only Federated Database Instance on cloud object storage. If you want a managed service for archiving from Atlas to S3, Online Archive is the option to use because:

  • It automatically handles the transfer of data from Atlas to S3.
  • Atlas encrypts your archived data in S3 using SSE.

If you want to use your own S3 buckets, you can configure Atlas Data Federation to access the data in them. More information is available in the Atlas Data Federation Overview documentation.
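To give a rough idea of what configuring Data Federation against your own bucket involves, here is a minimal sketch of a storage configuration written as a Python dict. The overall shape (a list of stores plus databases whose collections point at those stores) follows the Data Federation storage configuration format as I understand it, so please verify the field names against the current documentation. The bucket, region, store, database, and collection names are all hypothetical placeholders, and the helper functions are only there for illustration.

```python
import json

# Hypothetical placeholders: none of these names come from this thread.
storage_config = {
    "stores": [
        {
            "name": "archiveStore",         # label used to refer to the bucket
            "provider": "s3",
            "region": "us-east-1",          # assumed region of the bucket
            "bucket": "my-archive-bucket",  # assumed bucket name
            "prefix": "",
            "delimiter": "/",
        }
    ],
    "databases": [
        {
            "name": "archive",
            "collections": [
                {
                    "name": "orders",
                    # Map every file under /orders/ to this virtual collection.
                    "dataSources": [
                        {"storeName": "archiveStore", "path": "/orders/*"}
                    ],
                }
            ],
        }
    ],
}


def store_names(config):
    """Return the set of store names defined in the configuration."""
    return {store["name"] for store in config["stores"]}


def unresolved_data_sources(config):
    """List (database, collection, storeName) triples that reference
    a store which is not defined under "stores"."""
    known = store_names(config)
    missing = []
    for db in config["databases"]:
        for coll in db["collections"]:
            for source in coll["dataSources"]:
                if source["storeName"] not in known:
                    missing.append((db["name"], coll["name"], source["storeName"]))
    return missing


# Serialise the dict to JSON, e.g. to paste into a configuration editor.
config_json = json.dumps(storage_config, indent=2)
print(config_json)
```

The small check at the end is just a sanity helper: it catches a collection that points at a store name you forgot to define, which is an easy mistake once the configuration grows.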

However, Online Archive comes with a few limitations, which are described in the Online Archive limitations section of the documentation.

One performance consideration: if you activate Online Archive for an AWS cluster, the cloud object storage is in the same AWS region as your cluster. By comparison, Atlas Data Federation provides an elastic pool of agents in the region nearest to your data, where it processes the data for your queries. From a cost point of view, the Atlas Data Federation Regions documentation notes:

To prevent excessive charges on your bill, create your Atlas Data Federation in the same AWS region as your S3 data source.
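As a small illustration of that advice, the sketch below flags S3 stores whose region differs from the region chosen for the federated database instance. It is only a local sanity check on a list of store definitions, not an Atlas API call. The function name, the store names, and the region-normalisation rule (treating `US_EAST_1` and `us-east-1` as the same region) are my own assumptions.

```python
def normalize_region(region):
    """Normalise a region label so that 'US_EAST_1' and 'us-east-1' compare
    equal. This normalisation rule is an assumption made for this sketch."""
    return region.strip().lower().replace("_", "-")


def cross_region_stores(federation_region, stores):
    """Return the names of S3 stores that are not in the federation region."""
    target = normalize_region(federation_region)
    return [
        store["name"]
        for store in stores
        if store.get("provider") == "s3"
        and normalize_region(store.get("region", "")) != target
    ]


# Hypothetical store definitions, used only for illustration.
example_stores = [
    {"name": "ordersStore", "provider": "s3", "region": "us-east-1"},
    {"name": "logsStore", "provider": "s3", "region": "eu-west-1"},
]

mismatched = cross_region_stores("US_EAST_1", example_stores)
print(mismatched)
```

Any store this reports would be read across regions, which is the situation the documentation warns can increase charges.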

The documentation below should help you understand the pricing:

  1. Online Archive Costs
  2. Data Federation Costs

Let us know if you have any further questions.

Regards
Aasawari


Hey @Vivek_Paramasivam1, everything Aasawari said is a great start in how to evaluate the two options.

That said, there is a lot of nuance in deciding between the two. Generally speaking, Data Federation lets you fine-tune many parameters for your specific use case, which can get you the best possible performance. However, it can become very complex and needs to be managed over time.

Online Archive, on the other hand, is a fully managed solution, and we have some really exciting improvements coming in the next few months that will drastically improve query performance.

Let me know if you’d like to discuss this further, and feel free to drop a meeting on my calendar here if that’s easiest: Calendly - Benjamin Flast

Best,
Ben
