Safe Software Deployment: Overcoming the Fear and Loathing of Pushing to Prod

Mark Porter

Over the course of my career, I’ve had the privilege of deploying many different types of software. I’ve shipped CDs. I’ve pushed customer software over the web. I’ve updated database instances and control planes. And I’ve live-updated large, running, mission-critical systems.

I call this a privilege because getting software into the hands of end users is what software engineers love most. But deployments are not all fun and games. And while each deployment presents its own unique challenges, there is one thing they all have in common: fear.

Those of you responsible for significant software deployments know exactly what I’m talking about. You work, you prepare, you test. But when the day finally comes for your software to set sail, you are left hoping and praying it proves seaworthy on the Ocean of Production. In most companies, production is so different from your development and staging environments, that it’s almost impossible to know whether the code that worked in staging is going to succeed in production. Yet one thing is certain: if your software fails, everybody is going to know about it. Hence the fear.

When it comes to understanding the effects of fear on the developer, I think Frank Herbert, author of the epic science-fiction saga Dune, said it best: “Fear is the mind-killer.” Fear undermines experimentation and the entrepreneurial spirit. It discourages risk-taking and leads to bad habits, like avoiding deployment for months. And worst of all, fear slows down the innovation process. (See my post on the Innovation Tax many organizations are paying, and don’t know it.)

Pushing to production is undeniably scary. But over the last 30 years, working with my peers, I’ve developed a few methods for creating the conditions for safe, confident deployments. And my next four blogs in this series will unpack each of them in turn:

· The 180 Rule - Enabling fast, automated, easily reversible deployments

· Z Deployments - Limiting downtime from failed rollbacks

· The Goldilocks Gauge - Making the size and frequency of deployments just right

. Through the Looking Glass - Ensuring alignment between Dev, Stage, and Prod environments

These methodologies aren’t perfect and they won’t guarantee you a bug-free deployment. But they’re the best practices I’ve seen. And they help create a culture of confidence within an engineering team, which is the foundation of meaningful innovation.

To get started, my next blog will explain the “180 Rule” to help you reduce outage minutes in production. In the meantime, feel free to share your own tips and techniques for safe deployments with@MarkLovesTech.