Schedule Aggregation pipeline MV Refresh

Rai_Deepak · September 13, 2022, 7:38am

We have a requirement to use an “aggregation pipeline” that operates over 2 or 3 “source collections” and, in its final stage, writes the results to a “Target Collection”.
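For readers following along, a pipeline of this shape might look like the sketch below. It is only an illustration: the collection names (`orders`, `customers`, `target_collection`) and field names are hypothetical and do not come from the thread. It uses `$lookup` to pull in a second source collection and `$merge` (available from MongoDB 4.2) as the final stage to write into the target collection.

```python
from typing import Any, Dict, List


def build_refresh_pipeline(target: str = "target_collection") -> List[Dict[str, Any]]:
    """Build a pipeline to run against a hypothetical 'orders' source collection."""
    return [
        # Join each order with its customer from a second source collection.
        {"$lookup": {
            "from": "customers",          # hypothetical second source collection
            "localField": "customerId",
            "foreignField": "_id",
            "as": "customer",
        }},
        {"$unwind": "$customer"},
        # Aggregate per customer.
        {"$group": {
            "_id": "$customerId",
            "customerName": {"$first": "$customer.name"},
            "orderCount": {"$sum": 1},
            "totalAmount": {"$sum": "$amount"},
        }},
        # Final stage: insert new documents and replace existing ones in the target.
        {"$merge": {
            "into": target,
            "on": "_id",
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ]
```

With PyMongo this would be run as `db.orders.aggregate(build_refresh_pipeline())`. Note that `$merge` only inserts or updates; documents already in the target whose source rows have disappeared are not deleted by it.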
Now, the data in the source collections can change on a daily basis, so we want the data in the “Target Collection” to be refreshed periodically as well. The article “On-Demand Materialized Views” in the MongoDB Manual mentions that we can use a function to run the pipeline again, but it gives no information on how this refresh can be scheduled. So we have the following questions:

Could you kindly suggest how we can schedule the refresh of the Target Collection on a periodic basis using a standard MongoDB instance? We don't have an Atlas instance.
If some document inserts/updates to the Target Collection are rejected during the aggregation pipeline run, how can we get a summary of this information at the end of the run?

MongoDB version : 4.2.17
Hosted on: AWS EC2
Config: Self-managed 3-node replica set

Thanks in advance.

Rai_Deepak · September 14, 2022, 10:43am

Can anyone provide guidance?

Asya_Kamsky · September 14, 2022, 2:27pm

You could write a script and then invoke it with a cron job (assuming Linux OS).
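As a rough illustration of the script-plus-cron idea, here is a minimal sketch using PyMongo. Everything named in it is an assumption made for the example: the file name `refresh_mv.py`, the `MONGO_URI` environment variable, the database and collection names, and the simplified pipeline. The crontab line in the comment shows one way it could be scheduled daily at 02:00.

```python
#!/usr/bin/env python3
# refresh_mv.py (hypothetical file name)
# Example crontab entry, daily at 02:00 (MONGO_URI must be set for the job):
#   0 2 * * * /usr/bin/python3 /opt/scripts/refresh_mv.py >> /tmp/refresh_mv.log 2>&1
import os
import sys

# Assumed names, not taken from the thread.
MONGO_URI = os.environ.get("MONGO_URI")  # e.g. a replica-set connection string
DB_NAME = "mydb"
SOURCE = "orders"
TARGET = "target_collection"

PIPELINE = [
    {"$group": {"_id": "$customerId", "totalAmount": {"$sum": "$amount"}}},
    {"$merge": {"into": TARGET, "whenMatched": "replace", "whenNotMatched": "insert"}},
]


def refresh(db) -> None:
    """Run the pipeline; $merge writes server-side, so the cursor is empty."""
    list(db[SOURCE].aggregate(PIPELINE))


def main() -> None:
    if not MONGO_URI:
        print("MONGO_URI is not set; skipping refresh", file=sys.stderr)
        return
    from pymongo import MongoClient  # third-party: pip install pymongo

    client = MongoClient(MONGO_URI)
    try:
        # An uncaught exception here ends the process with a non-zero exit code,
        # and the traceback lands in the cron log.
        refresh(client[DB_NAME])
        print("refresh completed")
    finally:
        client.close()


if __name__ == "__main__":
    main()
```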

It depends on what you mean by “rejected”. If there is an error when inserting or updating, the aggregation will fail with an error message. If the aggregation does not error, then all the documents were processed according to the pipeline's directives. Unfortunately, we don’t provide a summary of how many documents were processed, except potentially in the mongod logs.
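Since the server does not return a count, one possible workaround is to stamp each output document with a per-run marker and count the marked documents in the target afterwards. The sketch below is an assumption-laden illustration, not a built-in feature: the `refreshRunId` field and the `refresh_with_summary` helper are hypothetical, and `source` and `target` are expected to be PyMongo collection objects. Writes completed by `$merge` before an error are not rolled back, so the count is taken after a failure as well.

```python
import uuid
from typing import Any, Dict, List


def refresh_with_summary(source, target, pipeline: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Run a pipeline ending in $merge and report how many target documents it wrote."""
    if not pipeline or "$merge" not in pipeline[-1]:
        raise ValueError("pipeline must end with a $merge stage")

    run_id = str(uuid.uuid4())
    # Stamp every output document just before the $merge stage (hypothetical field).
    stamped = pipeline[:-1] + [{"$addFields": {"refreshRunId": run_id}}, pipeline[-1]]

    summary: Dict[str, Any] = {"runId": run_id, "ok": False, "written": 0, "error": None}
    try:
        list(source.aggregate(stamped))
        summary["ok"] = True
    except Exception as exc:  # PyMongo raises OperationFailure for server-side errors
        summary["error"] = str(exc)

    # Documents inserted or replaced by this run carry its marker,
    # including partial writes left behind by a failed run.
    summary["written"] = target.count_documents({"refreshRunId": run_id})
    return summary
```

The count only covers documents that `$merge` actually wrote; with `whenMatched: "keepExisting"`, for example, matched documents are left untouched and therefore not counted.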

Asya

Rai_Deepak · September 18, 2022, 3:18pm

We tried this option; however, due to organization policies we do not have permission to schedule a cron job on the server. Is there any alternative to cron jobs? The MongoDB instance is hosted on AWS EC2, so is there any AWS service you are aware of that might help us with scheduling? Thanks in advance.

Asya_Kamsky · September 19, 2022, 9:05pm

The cron job does not have to run on the server that mongod is running on. You can literally run it on any server, in the cloud or on prem. It just needs to be able to connect to the cluster.

Asya

Rai_Deepak · September 30, 2022, 9:16am

Unfortunately, we do not have permission to create cron jobs on any server. Are there any other ways to do this?

Asya_Kamsky · September 30, 2022, 1:39pm

Have you looked at Atlas Triggers?

I just realized you already said you’re hosted on AWS EC2. Maybe there is an equivalent to Atlas Triggers in AWS…
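One AWS option that could play a similar role, offered here as a suggestion rather than something confirmed in the thread, is an AWS Lambda function invoked by an Amazon EventBridge schedule (for example `cron(0 2 * * ? *)` for 02:00 UTC daily). The sketch below assumes PyMongo is bundled with the function, that a `MONGO_URI` environment variable is configured, and that the function has network access to the EC2 hosts (typically by running inside the same VPC). The database, collection, and pipeline names are hypothetical.

```python
import os
from typing import Any, Dict

# Assumed names, not taken from the thread.
DB_NAME = "mydb"
SOURCE = "orders"
TARGET = "target_collection"

PIPELINE = [
    {"$group": {"_id": "$customerId", "totalAmount": {"$sum": "$amount"}}},
    {"$merge": {"into": TARGET, "whenMatched": "replace", "whenNotMatched": "insert"}},
]


def run_refresh(client) -> Dict[str, Any]:
    """Run the pipeline; $merge writes server-side, so the cursor is empty."""
    list(client[DB_NAME][SOURCE].aggregate(PIPELINE))
    return {"status": "refreshed", "target": TARGET}


def lambda_handler(event, context):
    """Entry point for a Lambda function triggered by an EventBridge schedule."""
    from pymongo import MongoClient  # third-party; must be packaged with the function

    client = MongoClient(os.environ["MONGO_URI"])
    try:
        return run_refresh(client)
    finally:
        client.close()
```

If the refresh fails, the exception propagates out of the handler, so the invocation is recorded as failed and can be alerted on.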
