How to aggregate timeseries data to enable archiving

Hello

I’d like to archive timeseries data after some time to save storage. However, I would also like to keep using the data for analytics.
This would mean aggregating the timeseries data into a further collection and then removing the raw data, e.g. with the expireAfterSeconds: 604801 parameter (in this example the raw data is removed automatically after roughly one full week).
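To illustrate, here is a rough sketch of what I have in mind in mongosh. The collection, field, and rollup names are just placeholders, and $dateTrunc needs MongoDB 5.0+:

```javascript
// Raw timeseries collection whose documents expire after roughly one week.
db.createCollection("measurements_raw", {
  timeseries: { timeField: "ts", metaField: "sensorId", granularity: "minutes" },
  expireAfterSeconds: 604801
});

// Pre-compute hourly rollups into a separate, regular collection
// before the raw documents expire.
db.measurements_raw.aggregate([
  // Only look at the last 24 hours in this run.
  { $match: { ts: { $gte: new Date(Date.now() - 24 * 3600 * 1000) } } },
  { $group: {
      _id: { sensorId: "$sensorId", hour: { $dateTrunc: { date: "$ts", unit: "hour" } } },
      avgValue: { $avg: "$value" },
      maxValue: { $max: "$value" },
      count: { $sum: 1 }
  } },
  // Re-running this pipeline over a past window replaces the existing rollups,
  // so late-arriving documents could be folded in by recomputing that window.
  { $merge: { into: "measurements_hourly", whenMatched: "replace", whenNotMatched: "insert" } }
]);
```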

Are there any concepts, ideas, or showcases around how to implement such a (pre-compute) pattern with timeseries? What would be the best trigger for the computation? And how do we deal with that trigger when there are late arrivals, i.e. documents inserted with timestamps some time in the past?

I would be happy to get into a conversation here.

Regards,
Michael

Hi @michael_hoeller,

This seems to be the classic use case for Online Archive in Atlas, and then using a federated connection string to access the data in both the cluster and the MongoDB archive storage.

Another option is to use the preImage option in a delete trigger for the TTL deletes.

So each TTL delete will allow you to write the data to your archive destination and precompute as needed.
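A rough sketch of what such a trigger function could look like: an Atlas App Services database trigger on delete events with Document Preimage enabled. The linked service, database, and collection names below are placeholders, and this assumes preimages are actually captured for those deletes:

```javascript
// Atlas App Services database trigger function (delete events).
exports = async function (changeEvent) {
  // Only populated when "Document Preimage" is enabled on the trigger.
  const deletedDoc = changeEvent.fullDocumentBeforeChange;
  if (!deletedDoc) {
    return; // nothing to archive
  }

  const archive = context.services
    .get("mongodb-atlas")               // linked data source (default name)
    .db("iot_archive")                  // placeholder archive database
    .collection("measurements_archived");

  // Copy the expired document to the archive destination.
  // Pre-aggregation (e.g. updating hourly rollups) could also happen here.
  await archive.insertOne(deletedDoc);
};
```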

See more here:

Thanks
Pavel

