Atlas archiving

Hello everyone, our team currently uses Atlas as our main database, and we have a lot of stale data. I am considering different options and have some questions:

  1. Do I understand correctly that the Online Archive approach will be cheaper than moving the stale data to a cold cluster?
  2. We already have a huge amount of stale data; would it be better to run a one-off process that archives it first and then set up a scheduled online archive?
  3. Does Online Archive use S3 Glacier, or does it use standard S3 storage?

Hi @Bohdan_Chystiakov ,

Great questions! Here are the answers:

  1. Yes, Online Archive is easier and cheaper: archives are fully managed, so you don’t need to configure and maintain separate cloud object storage. You can create date-based or custom archival rules that run indefinitely, so you don’t have to manually migrate or delete data. In addition, your archives are online and easily accessible alongside your Atlas cluster data.
  2. Great question! For the first-time archival it is better to let the archive run for longer stretches: if you can dedicate exclusive time, let it run around the clock first, and only then set up scheduled archives. On a first run you typically want to archive a massive chunk of data, and the Online Archive speed limit is 2 GB per 5 minutes, which works out to roughly 0.5 TB per day. So if you have 2 TB of data to archive, letting the archive run continuously will take about 4 days. After the first-time archival, your daily archival workload can be scheduled to run during off-peak hours with the “Scheduled archiving” feature.
  3. Online Archive uses standard S3 storage, not S3 Glacier.
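
The duration estimate in answer 2 can be sketched as a quick back-of-the-envelope calculation, assuming the throughput limit quoted above (2 GB every 5 minutes) holds continuously:

```python
# Estimate how long a first-time Online Archive run takes, assuming
# the ~2 GB per 5 minutes throughput limit mentioned above.

GB_PER_WINDOW = 2    # GB archived per throughput window
WINDOW_MINUTES = 5   # length of one window in minutes

def days_to_archive(total_gb: float) -> float:
    """Days of continuous archiving needed for `total_gb` of data."""
    gb_per_day = GB_PER_WINDOW * (24 * 60 / WINDOW_MINUTES)  # ~576 GB/day
    return total_gb / gb_per_day

# Example from the answer: roughly 2 TB (2048 GB) of stale data.
print(round(days_to_archive(2048), 1))  # ~3.6 days, i.e. about 4 days
```

This is only a lower bound: real runs also depend on document sizes, index usage of the archival query, and cluster load.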
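
As a rough illustration of the date-based rules and scheduled windows from answer 1, here is a hypothetical sketch of an archive definition in the shape the Atlas Admin API accepts; the database, collection, field names, and window times are all made up for the example, so check the current Online Archive API reference before using anything like this:

```python
# Hypothetical Online Archive definition: archive documents older than
# 90 days, running only in a daily off-peak window. All names below
# (appdb, events, createdAt, the window hours) are illustrative.

archive_rule = {
    "dbName": "appdb",             # hypothetical database name
    "collName": "events",          # hypothetical collection name
    "criteria": {
        "type": "DATE",
        "dateField": "createdAt",  # hypothetical date field on each document
        "dateFormat": "ISODATE",
        "expireAfterDays": 90,     # archive documents older than 90 days
    },
    # Off-peak daily window for the "Scheduled archiving" feature.
    "schedule": {
        "type": "DAILY",
        "startHour": 1, "startMinute": 0,
        "endHour": 5, "endMinute": 0,
    },
}

print(archive_rule["criteria"]["expireAfterDays"])  # 90
```

Once a rule like this is in place, the archive runs indefinitely, which is what makes the scheduled approach cheaper to operate than hand-rolled migration to a cold cluster.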