Triggers not working with Global Deployment

Are you using the same region and provider for both? Or are both global? Or is one global and the other regional?

From the behavior, it looks like your faster one is regional and the one that is delayed is global. Or they are in two different regions or on two different providers.

@Try_Catch_Do_Nothing are both on the same region/global setting, and the same provider?

Are you using the same region and provider for both? Or are both global? Or is one global and the other regional?

I am currently using AWS Oregon for cluster and app in all environments.

So both of them are on US West 2, if I’m understanding correctly?

So from Tyler’s question: “Lastly, I am a touch confused still because it sounds like when you removed the match expression from the trigger, it worked. Is that correct?”

And are both of them the same cluster tier? Is one by chance an M10 and the other an M5, etc.? Or are both free-tier M0s?

Because if the number following the M is a single digit, they are on shared-tier clusters, meaning their resources are shared with neighbors. And each one can end up on a different server rack with a differing number of neighbor clusters, all sharing the same resources.

So one may be experiencing what we used to call the “noisy neighbor” effect, causing delays, while the other doesn’t go through this.

If that is the case, this can literally just be an issue of shared-tier performance vs. dedicated-tier performance.

This is just a copy and paste of what I used to share with customers who used shared tier environments:

These behaviors are common for Realm clients, as a lot of Realm customers often go with shared tiers or M10s.

Something worth sharing with the customer:

Information about shared tier clusters

  • Shared tier clusters share resources between all tenants on the same tier.
  • Resources provided to each tenant of course increase with each tier level.
  • M0 has the smallest amount of resources in the shared tiers, while M5 has the most.
  • The shared-tier instances theoretically do gain better throughput via networking and the writing service as you go higher in tier, due to lower tenant density.
  • All tenants on the same shared tier cluster share the same networking, writing, and other resources.

Cluster Recommendations
MongoDB does not recommend shared-tier clusters for full production environments. We encourage at the least an M10 dedicated cluster for a production environment with extremely low traffic, but an M20 would be the better starting point. Our overall official recommendation is an M30 cluster for production environments.

Notice
We cannot guarantee stability for a live production environment on a shared-tier cluster for the reasons mentioned above, as these clusters are intended for educational and development environments. As the shared cluster gains tenant density, the available resources beyond the dedicated RAM and CPU allotted to the specific tenant on creation will be spread more thinly.

Correct.
So with DEV, the app is already “deployed” and only updates are pushed. For Test, with every commit on a pull request, the app/DB is dynamically created and then destroyed (the latter if tests are successful). So maybe that is why I’m not seeing the same behavior in DEV (?)

My next question is, is there a programmatic way to check when a trigger is actually “ready”?

This may not be due to any of the above; it may be due to you deploying to a shared-tier cluster and comparing its performance to when you deploy to your M10.

You could maybe put a delay on the larger tier cluster and let things propagate on the lesser tier cluster, if this is the kind of environment you have going on.

Because shared tiers are really just meant to be testing environments, it may be better to test all of this in a same-tier-to-same-tier setup for your TEST and DEV environments.

The delay still happens with an M10 (TEST). I think you have it backwards.

I think the conclusion here is that in some cases there is a several-minute delay in trigger startup, even when the app/DB are in the same region.
I just need to know how to code for that and know when the trigger is ready.
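
For reference, one pragmatic way to code for that is an end-to-end smoke test rather than an official readiness flag (I’m not aware of one): write a sentinel document into the watched collection and wait until the trigger’s observable side effect appears. The sketch below assumes the trigger’s function writes to an output collection; the database/collection names and the MONGODB_URI environment variable are placeholders, not details from this thread.

```python
# Minimal readiness-probe sketch, assuming the trigger's function writes a
# document to an output collection for every insert it sees. All names here
# are placeholders for illustration only.
import os
import time
import uuid

from pymongo import MongoClient


def wait_for_trigger(uri: str, timeout_s: int = 300, poll_s: int = 5) -> bool:
    client = MongoClient(uri)
    db = client["app_db"]                  # assumed database name
    watched = db["watched_collection"]     # collection the trigger listens on (assumed)
    output = db["trigger_output"]          # collection the trigger function writes to (assumed)

    markers = []
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # Insert a fresh sentinel each pass; once the trigger is registered,
        # its function should echo one of these markers into the output collection.
        marker = str(uuid.uuid4())
        markers.append(marker)
        watched.insert_one({"readiness_probe": marker})
        time.sleep(poll_s)
        if output.find_one({"readiness_probe": {"$in": markers}}) is not None:
            return True
    return False


if __name__ == "__main__":
    ready = wait_for_trigger(os.environ["MONGODB_URI"])
    print("trigger is firing" if ready else "trigger still not firing after timeout")
```

In a CI pipeline this would run between the deploy step and the test step, with a timeout comfortably longer than the delay being observed, and the probe documents cleaned up afterwards.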

Ok thank you for clarifying, I didn’t realize I had it backwards.

So the shared tier is performing faster/propagating the trigger faster than the higher tier?

If that’s the case, I would encourage signing up for the free trial of Developer Support, opening a support ticket, and requesting they open a HELP ticket to further investigate what’s going on. Then you have a paper trail for this situation and you can get dedicated engineering support to determine if there’s actually something more going on.

And in regards to Tyler’s question, could you verify this?
“Lastly, I am a touch confused still because it sounds like when you removed the match expression from the trigger, it worked. Is that correct?”

@Try_Catch_Do_Nothing

So the shared tier is performing faster/propagating the trigger faster than the higher tier?

I am going to test both right now. I will delete the app in DEV (which is shared) and deploy to DEV and TEST, and you can check on your side how long the trigger takes to start up.

OK, same result.
In DEV (using the Shared environment) and with a NEW deployment, the trigger did kick off as expected and the tests passed.
In TEST (using M10 cluster), the trigger did not kick off when the tests ran.

So for whatever reason, there seems to be more of a delay in the TEST project…

I don’t work for MDB anymore, I just help out here and there these days.

Coming across stuff like this brings up a lot of interesting questions, because if it’s not the expressions, it sounds like something else may be happening.

But in either case, I’d open a support ticket with the following:

  • Atlas cluster links
    - Trigger links
  • The functions attached to the triggers
    - The behavior being seen
  • The GitHub logic deploying to the clusters (for repro purposes)
  • Any logs being found/generated
  • A copy and paste of @Tyler_Kaye’s questions with your answers to them
  • What you’ve tried already
  • A request for an RCA if necessary

Getting all of this together in a ticket will give the Weekend Realm TSE a clearer picture of the situation, and then by requesting the Realm TSE open a HELP ticket, you’ll allow the backend team and the TSE to run a pincer movement on your problem here and do a deeper analysis of why this is happening.

You can then report back here and explain the oddity, and also help improve the product for triggers/functions and maybe identify where there could be improvements in documentation or recommended best practices for using GitHub with Triggers/Functions.

@Try_Catch_Do_Nothing I don’t know who works weekends in AMER hours anymore (it used to be me), but the more you provide to reproduce and rebuild what you’re doing, the less guesswork they have to do, and the more precise the solution and RCA that comes out of this will be…

I agree. I think it is likely that going through the proper support channels is the best course of action here. I do see it taking about 2.5 minutes to start up. I see an error happening on startup that you can link to in whatever case you open. It seems that the following error occurs:

unable to register orphaned trigger

This causes the trigger to go through a full extra retry cycle, which is causing your delay. It then reboots and is back to normal afterwards.

@Tyler_Kaye I went ahead and submitted a support ticket.
Off hand, do you know what the error message means? Is this something you’ve seen before?
Thanks

Update for all:
This behavior is confirmed to be a bug in how triggers are set up with draft deployments in a new app:

When a new app is created using realm-cli push, it starts out in draft mode. Right now, we are creating some of the Trigger event elements before the app is actually deployed. Even though the app is not live, the Trigger-related events attempt to process anyway, which results in those events (and the Triggers associated with them) ending up in a failed state. After a few minutes, the events are retried and the Triggers become operational.
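
As a stopgap until that bug is fixed, one hedged workaround (a sketch under assumptions, not an official fix) is to gate the CI test stage on the readiness probe sketched earlier in the thread rather than on a fixed sleep, since the retry window can vary:

```python
# Hypothetical CI gate: do not start the test stage until the newly pushed
# app's trigger is actually processing events. `readiness_probe` is the
# module from the earlier sketch (hypothetical name); the 10-minute ceiling
# is a guess that comfortably covers the few-minute retry cycle described above.
import os
import sys

from readiness_probe import wait_for_trigger

if not wait_for_trigger(os.environ["MONGODB_URI"], timeout_s=600):
    sys.exit("Trigger never became operational after deploy; aborting tests")
```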
