TLS certificates changed - brought down our app

Atlas automatically changed the TLS certificates for our clusters. They are now signed by the ISRG Root X1 certificate. This change brought our app down for 20 hours while we tried to figure out what was happening.

My first thought: What the f*** was Atlas thinking by making this change?! Something this important, something that causes actual downtime in our production environment, should come with multiple emails, notifications, pop-up boxes when we log in, even phone calls and text messages.

The damage caused by Atlas is done. I feel deeply disappointed in Atlas. I will lick my wounds and carry on.

Now my second thought: How can I avoid this in the future? And, telling me to scour the Atlas support DB constantly is not an option.

-Frank Cohen, CEO, Clever Moe


Hi Frank,

I’m so sorry to hear you experienced an outage related to this change. We’ve sent a series of communications about this, including a method to move to the new CA ahead of time, but we are investigating potential gaps in our operational communications on this topic. We’re working on a thorough post-mortem.

But taking a step back, you’re right: emails are clearly not enough here. Out of curiosity, which programming language driver are you using? I believe we may need to target some communities more aggressively than others, namely those more susceptible to risk because of the trust store their language or framework relies on, and we do have an understanding of the language framework used per cluster.

Please accept my apologies. If you don’t mind, I will be in touch with you directly via email this week to learn more about your experience.