Safe Software Deployments: Z Deployments

Mark Porter
October 27, 2021 | Updated: November 27, 2025

If you’ve gotten this far in my Safe Software Deployment series, you know how scary deployment day can be. Sleepless nights. Knots in the stomach. Cold sweats. These are the symptoms of uncertainty. And three decades of experience have taught me that all the positive thinking in the world won’t ensure a bug-free deployment.

That’s why I’ve developed a number of techniques that can consistently help teams minimize fear and achieve safe software deployment. In the last post, we discussed the 180 Rule. The purpose of this post is to explain how you can use “Z Deployments” to mitigate both fear and downtime. In future posts, we’ll look at both the Goldilocks Gauge and Through the Looking Glass.

Z Deployments are more than a catchy name. This is all about failed rollbacks, which in my experience are the biggest source of downtime in any software deployment pipeline. Now, we all try our best to eliminate the need for rollbacks in the first place - but when they do happen, we want them to be successful. However, in most companies, rollbacks are only tested in Prod, not in the prior stages of the pipeline. Even if you use the 180 Rule, which encourages quick and automated rollbacks, you don’t have any more certainty that they will work. This is where Z Deployments come in.

With a Z Deployment, the goal is to make rollbacks just as predictable and reliable as your normal “roll forward” software deployments. I call this technique a Z Deployment, because if you chart out the process, it looks like a Z. But you can also think of Z Deployments as akin to pressing “Command Z” on your keyboard: undo. Fast, simple, no drama. Here’s how it works.

Roll your code forward from development into staging. In staging, do your canary testing.
Then roll back into development. Do your canary testing again. If it doesn’t work, then you just proved that your rollback code was faulty in some way.
Roll your code forward into staging again, and do your full testing.
If it’s successful, roll your code forward into production.

Of course, this only works if your staging environment is clean and your team trusts it. I’ll get into this more in a future post called “Through the Looking Glass.” But the bottom line is that developers need to know that things will work in production; including any needed rollbacks. And the only way to do that is to test rollbacks in staging. Your version of canary tests and full tests might be different - in a perfect world you’d run full tests three full times, but often build systems aren’t set up to do that quickly enough.

Too often, staging is not clean. But generally, when developers deploy to staging, their added functionality tends to work. Everyone else is using staging, and their functionality is working, too. This is the “Happy Path” - where engineers test that their new thing works. That sounds great. But what else happens? Adjacent things get broken.

Often when you roll back, you’re not necessarily returning to your system’s original state, either for your own software change or for the adjacent software components. Your rollback code has to undo all the state changes your deployment to staging (or prod) may have made. Otherwise, the staging environment becomes polluted, and the results in staging won’t match the results in production. Developers lose faith in staging, and deployment again becomes a terrifying ordeal.

I used to work with someone who was absolutely obsessive about staging. He ran testing, and he refused to have a long-term staging environment. Instead, his team blew away staging every month and rebuilt it from scratch. Did I like this? Absolutely. Did it work? Yes. Developers trusted staging, which meant that deployments to prod were less scary.

The next step of safe software deployment is to embrace the Goldilocks Gauge, which helps make deployments routine and even boring – in a good way. It also makes both the 180 Rule and Z Deployments easier to execute, and it’s a necessity for teams working toward continuous development. In the meantime, feel free to share your own techniques for safe deployments at @MarkLovesTech.

← Previous

Real-time Applications Made Simple with MongoDB and Redpanda

MongoDB has a long history of advocating for simplicity and focusing on making developers more agile and productive. MongoDB first disrupted the database market with the document model, storing data records as BSON (binary representation of JSON documents). This approach to working with data enables developers to easily store and query their data as they use it naturally within their applications. As your data changes, you simply add an attribute to your documents and move on to the next ticket. There is no need to waste time altering tables and constraints when the needs of your application change. MongoDB is always on the lookout for more ways to make life easier for developers, such as addressing the challenges of working with streaming data. With streaming data, it may take armies of highly skilled operational personnel to build and maintain a production-ready platform (like Apache Kafka). Developers then have to integrate their applications with these streaming data platforms resulting in complex application architectures. It’s exciting to see technologies like Redpanda seeking to improve developer productivity for working with streaming data. For those unfamiliar with Redpanda, it is a Kafka API compatible streaming platform that works with the entire Kafka ecosystem, such as Kafka-Connect and popular Kafka drivers : librdkafka , kafka-python , and the Apache Kafka Java Client . Redpanda is written in C++ and leverages the RAFT protocol, which makes Apache ZooKeeper irrelevant. Also, its thread-per-core architecture and JVM-free implementation enable performance improvements over other data streaming platforms. On a side note, MongoDB also implements a protocol similar to RAFT for its replica set cluster primary and secondary elections and management. Both MongoDB and Redpanda share a common goal of simplicity and making complex tasks trivial for the developer. So we decided to show you how to pull together a simple streaming application using both technologies. The example application (found in this GitHub repository ) considers the scenario where stock ticker data is written to a Redpanda and consumed by MongoDB. Once you have the example running, a “stock generator” creates a list of 10 fictitious companies and starts writing ticker data to a Redpanda topic. Kafka Connect service listens for data coming into this topic and “sinks” the data to the MongoDB cluster. Once landed in MongoDB, the application issues an aggregation query to determine the moving averages of the stock securities and updates the UI. MongoDB consumes the ticker data and calculates the average stock price trends using the aggregation framework . Once you have downloaded the repository, a docker-compose script includes a Node server, Redpanda deployment, Kafka Connect service, and a MongoDB instance. The Kafka Connect image includes the Dockerfile-MongoConnect file to install the MongoDB Connector for Apache Kafka . The Dockerfile-Nodesvr is included in the nodesvr image and it copies the web app code & installs the necessary files via NPM. There is a run.sh script file that will launch the docker-compose script to launch the containers. To start the demo, simply run this script file via sh run.sh and upon success, you will see a list of the servers and their ports: The following services are running: MongoDB server on port 27017 Redpanda on 8082 (Redpanda proxy on 8083) Kafka Connect on 8083 Node Server on 4000 is hosting the API and homepage Status of kafka connectors: sh status.sh To tear down the environment and stop these services: docker-compose down -v Once started, navigate to localhost:4000 in a browser and click the “Start” button. After a few seconds, you will see the sample stock data from 10 fictitious companies with the moving average price. Get started with MongoDB and Redpanda This example showcases the simplicity of moving data through the Redpanda streaming platform and into MongoDB for processing. Check out these resources to learn more: Introduction to Redpanda MongoDB + Redpanda Example Application GitHub repository Learn more about the MongoDB Connector for Apache Kafka Ask questions on the MongoDB Developer Community forums Sign up for MongoDB Atlas to get your free tier cluster

October 26, 2021

Next →

That’s a Wrap: MongoDB’s 2025 in Review & 2026 Predictions

It’s nearly the end of the year—again! That means it’s time for an end-of-year blog post that expresses disbelief at the passage of time. Which, as the saying goes, flies when you’re having fun. And definitely when you’re as busy as MongoDB was in 2025. It was a big year for the company—and more importantly, for the tens of thousands of customers and millions of developers who rely on MongoDB’s modern data platform for their most mission-critical workloads. At MongoDB, everything we do starts with our obsession with customers and their needs, and if there’s a theme to MongoDB’s 2025, it was (and will continue to be) enabling customer innovation and helping them succeed in the AI era. So here are a few highlights of how MongoDB acted on behalf of customers in 2025. From the acquisition of Voyage AI to customer success across industries, a lot happened in 2025. Let’s go!* *Read to the end for 2026 thoughts. 2025: The (MongoDB) year that was Voyage AI, modernization, and search In February, MongoDB announced the acquisition of Voyage AI, a pioneer in embedding and reranking models, to enhance the accuracy of AI applications. Integrating Voyage AI's advanced retrieval technology with MongoDB’s modern, AI-ready data platform addresses a critical challenge: LLM model hallucinations caused by a lack of context. By improving retrieval accuracy for specialized domains like finance and law, the integration enables businesses to deploy AI for mission-critical use cases. To learn more, see the MongoDB Voyage AI page. Then, in September, we launched MongoDB AMP, an AI-powered Application Modernization Platform. AMP is designed to accelerate the transformation of legacy applications through a combination of AI-powered tooling, a proven delivery framework, and expert guidance (tools, techniques, and talent) to help enterprises reduce technical debt and modernize 2-3 times faster. Want more? Sure you do! Check out this short video. MongoDB also announced the addition of search and vector search capabilities to MongoDB Community Edition and MongoDB Enterprise Server. This allows developers to build and test AI-native applications, including those using retrieval-augmented generation (RAG), in local or on-premises environments. Previously exclusive to MongoDB Atlas, these features enable secure, hybrid deployments where sensitive data can remain on-premises while still leveraging advanced search tools. Here’s a (slightly less short) video about search and vector search on Enterprise Server. Growing and scaling with MongoDB As noted, everything we do at MongoDB starts with our obsession with customers. 2025 was another banner year for customer success and innovation—we were inspired by what organizations of every shape and size, across industries and geographies, built with MongoDB in 2025. Here are just two of the many stories our customers shared in 2025; much more can be found in my colleague Katie Palmer’s blog series, Innovating with MongoDB. Factory By combining the Atlas modern data platform with Voyage AI’s high-performance embeddings, the AI-native startup Factory—which uses AI agents called Droids to accelerate software development lifecycles for organizations—consolidated its fragmented tech stack. This enabled superior code retrieval, simplified operations, and provided the scalability needed to process billions of tokens daily. McKesson McKesson, a global pharmaceutical distributor, replaced its monolithic legacy infrastructure with MongoDB Atlas to meet strict drug tracing mandates. By adopting our modern cloud data platform, McKesson scaled its operations 300x, managing tracking data for 1.2 billion containers annually without latency, and ensuring compliance and patient safety while reducing developer complexity. For more, check out the video of McKesson at MongoDB.local NYC from September. From niche NoSQL to enterprise powerhouse As senior MongoDB engineer and Technical Fellow Ashish Kumar put it earlier this year, “through a sustained and deliberate engineering effort,” MongoDB has gone from a (seemingly) niche NoSQL solution to a trusted enterprise standard, and now delivers “the high availability, tunable consistency, ACID transactions, and robust security that enterprises demand.” A new era of leadership The face of MongoDB has also changed—our CFO, Mike Berry, joined the company in April, and Dev Ittycheria stepped down as CEO in November, after more than 11 years leading the company (including its 2017 IPO). In a LinkedIn post about his role, new MongoDB CEO CJ Desai noted that the company is “at the forefront of a new data revolution, unlocking the next wave of productivity and intelligence.” “Having spent my career building and scaling technology platforms, I’ve always been drawn to companies defined by clarity of vision, relentless organic innovation, and a customer-first culture. MongoDB exemplifies all three,” said Desai. We couldn’t agree more. Onward! Reading the 2026 tea leaves So what might 2026 bring (for MongoDB and tech at large)? Here are a handful of our leaders’ predictions: “As much as people want to talk about Artificial General Intelligence (AGI), we’re still in the phase where most AI use cases automate redundant tasks but benefit from human-in-the-loop checks. Organizations that use AI to complete work that historically is a drain on human resources—but then uses people to carefully verify what AI builds, apply governance frameworks, and maintain accountability across the data lifecycle—will be more successful.” —Pete Johnson, Field CTO, AI, MongoDB “After years of inflated expectations and unsustainable spending, the AI industry is trapped in a bubble where companies reflexively attempt to deploy LLMs at every problem, driving up costs with minimal to no return. Businesses that break free from this spending cycle are the ones that understand the need to ground LLM responses in factual data and learn from prior mistakes. We believe the best way to do this will be with highly accurate embedding models and rerankers for reliable data retrieval.” —Frank Liu, Staff Product Manager, MongoDB "In 2026, cloud independence will evolve from strategic preference to existential imperative across enterprises of every scale. The outages and disruptions of recent years have exposed a fundamental truth: in an always-on digital economy—where commerce, mobility, governance, and even public safety depend on uninterrupted access to cloud services—single-provider reliance is no longer a calculated risk, but a systemic vulnerability. Compounding this is the inexorable rise of data sovereignty. Regulatory regimes worldwide now demand precise jurisdictional control over data residency, rendering rigid cloud commitments incompatible with compliance at global scale. The defining competitive advantage will belong to organizations that transcend fragile prevention theater and engineer true infrastructural resilience: architectures inherently portable, data frictionlessly mobile, and operations autonomously sustained across heterogeneous clouds through AI-orchestrated redundancy. In short, the winners will not merely mitigate downtime—they will design systems that render the concept obsolete." —Ben Cefalo, SVP, Head of Core Products, MongoDB Happy holidays and happy New Year, everyone!

December 22, 2025