Triggers failing when cluster suspended overnight

John_Sewell · June 30, 2023, 10:58am

We suspend some of our Atlas clusters overnight when not in use as they are quite large and there is no point keeping some of the testing environment active when all the developers are offline.
The issue is that the triggers fail when this happens. We have the advanced option set to Auto Resume the trigger, but I believe this only covers the case where the resume token fails, as opposed to the whole data connection going away.

Currently we need to manually resume the trigger after the cluster has been resumed in the morning.

What’s the best solution to this? Get the DBA script to pause the triggers before the cluster is suspended and then resume it after? Is there an alternative setting that could be set to cope with this a touch more gracefully?

Many thanks,

John

Tyler_Kaye · June 30, 2023, 12:10pm

Hi, triggers under the hood are just a Change Stream that we operate for you. Therefore, when you pause your cluster, the trigger attempts to connect to the cluster and open a change stream (it retries this for a while) and ultimately errors and enters the failed state.

We have some customers that do similar things to you and the best solution is to do the following:

1. Disable all triggers
2. Pause the clusters
3. When you are ready, resume the clusters
4. Enable the triggers

You can hit the App Services Admin API to pause/resume triggers. Please see here: MongoDB Atlas App Services Admin API
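
As a rough illustration of the disable/enable steps, here is a minimal Python sketch using only the standard library. Everything in it is an assumption to check against the Admin API reference rather than a confirmed recipe: the base URL, the `/groups/{groupId}/apps/{appId}/triggers` paths, the `_id` and `disabled` fields on a trigger, and the bearer `token` (assumed to come from the Admin API login endpoint). The helper names are hypothetical. Pausing and resuming the cluster is left to the existing DBA script.

```python
import json
import urllib.request

# Assumed base URL for the App Services Admin API; confirm against the current docs.
BASE_URL = "https://services.cloud.mongodb.com/api/admin/v3.0"


def triggers_url(group_id, app_id, trigger_id=None):
    """Build the (assumed) triggers endpoint URL for a project and app."""
    url = f"{BASE_URL}/groups/{group_id}/apps/{app_id}/triggers"
    return f"{url}/{trigger_id}" if trigger_id else url


def with_disabled(trigger, disabled):
    """Return a copy of a trigger config with its `disabled` flag set."""
    updated = dict(trigger)
    updated["disabled"] = disabled
    return updated


def call(method, url, token, body=None):
    """Send an authenticated JSON request and return the decoded response, if any."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else None


def set_all_triggers(group_id, app_id, token, disabled):
    """Disable (True) or enable (False) every trigger in the app."""
    for trigger in call("GET", triggers_url(group_id, app_id), token):
        # Assumes the update endpoint accepts the full trigger config via PUT.
        call(
            "PUT",
            triggers_url(group_id, app_id, trigger["_id"]),
            token,
            with_disabled(trigger, disabled),
        )
```

A pause script would call `set_all_triggers(group_id, app_id, token, True)` before pausing the cluster, and the morning script would call it with `False` once the cluster is back up.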

Best,
Tyler

John_Sewell · June 30, 2023, 12:56pm

Thanks Tyler, I suspected that was the approach to take. Our DBA team already use the API for pausing the cluster so I’ll get them to add that call to their scripts.
