Mark Loves Tech
Musings from Mark Porter, CTO
December 14, 2021
Safe Software Deployments: Through the Looking Glass
We’ve covered a lot of ground in this Safe Software Deployment series, from the 180 Rule to Z Deployments to the Goldilocks Gauge. But there is an elephant in the room. Or should I say, a jabberwock.

In Lewis Carroll’s novel Through the Looking Glass, Alice discovers that the mirror above her mantel is not a mirror at all, but a doorway to another world in which things work very differently. When developers push software from staging to production, they often have a similar experience. Though they want to believe that staging and production are the same, they discover that staging is not a mirror at all, and production is another world, in which things work very differently. And out of that distortion come bugs and outages.

The bottom line is this: Staging ≠ Production. And it never will. There are simply too many variables between these two environments to ever achieve exact alignment. The environments can differ in hardware: CPU cores, threads per core, cache size, microcode; bus architectures; memory size; firmware. Or in software and configuration: OS versions; compilers; libraries; network traffic profiles. Or in network topology: edge caching; DNS and directory services. And of course, no matter how diligently you test in staging, customer workloads exercise the software in different ways.

But the major reason they are different is that both environments have had a different set of software deployed to them over time, with a different set of configuration parameters - and a different combination of patches, hacks, and rollbacks.

That staging is a mirror of production is one of the great delusions of software development. Developers often tell this lie to themselves, to overcome their fear of pushing to prod. Worse, developers often know the truth, but don’t really know how to explain this ambiguity to their management chain, leading to inevitable trust issues when deployments fail. So what can we do about it?
First, accept reality. Modern, distributed software systems are nonlinear, and all test environments are simulacra. It’s particularly important to help managers understand that even if it were possible to create exact duplicates of production – which it’s not – it would be practically and financially unjustifiable.

Second, approach testing like an actuary. The leap from staging to production is essentially an exercise in probability. You should know your architecture, operational characteristics, and costs well enough to prioritize tests and reduce risk. You may even want to create two or more test environments that are deliberately different to reduce the odds of failure. And you should continue to run tests after the release, so you can surface bugs before your customers do.

Third, if you can, treat both production and staging like cattle, not pets. If you have an enlightened software organization, one that believes in best practices, set up your systems so that you can blow away and recreate both your production and staging environments at regular intervals. This will reset much of the deviation and environment drift that builds up over time.

Finally, you have to be able to do automated rollbacks (the 180 Rule) that work reliably (Z Deployments) and that are the right size for optimum efficiency and safety (the Goldilocks Gauge). ;-)

I saved this column for last because completely eliminating the differences between staging and production is not a solvable problem. And frankly, you shouldn’t even try. But you don’t want to get caught flat-footed either, so you need a system of best practices that greatly reduces risk and fosters confidence. In other words, a system of Safe Software Deployment that helps you overcome the fear of pushing to prod. And most importantly, everyone in your organization needs to have a common understanding of the problem, from the top down.
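The “cattle, not pets” advice can be sketched as a scheduled check that flags environments due for teardown and recreation. This is a minimal illustration with hypothetical names and a hypothetical monthly cadence, not a prescribed implementation; the actual teardown and rebuild would call into your own infrastructure-as-code tooling:

```python
from datetime import datetime, timedelta

# Hypothetical cadence: rebuild every 30 days to reset accumulated drift.
REBUILD_INTERVAL = timedelta(days=30)

def environments_due_for_rebuild(last_rebuilt: dict, now: datetime) -> list:
    """Return the names of environments whose age exceeds the rebuild interval.

    last_rebuilt maps environment name -> datetime of its last recreation.
    The caller would then blow each flagged environment away and recreate
    it from scratch, resetting the drift that builds up over time.
    """
    return sorted(
        name for name, rebuilt_at in last_rebuilt.items()
        if now - rebuilt_at >= REBUILD_INTERVAL
    )
```

For example, if staging was last rebuilt 45 days ago and production 10 days ago, only staging comes back as due. The interval is a knob: shorter intervals mean less drift but more rebuild overhead.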
So feel free to print this post out, slide it under the door of your manager, and slink away like the Cheshire Cat. Have another technique for managing the divergence between staging and production? Share it with me at @MarkLovesTech.
Engineering, Done DIRT Cheap: How an Outdated Data Architecture Becomes a Tax on Innovation
In March 2021, I wrote about The Innovation Tax: the idea that clunky processes and outdated technologies make it harder for engineering teams to produce excellent tech that delights customers. In the months since then, my thinking has evolved even further. I couldn’t have guessed how many technology leaders would immediately recognize these problems in their own organizations and share their own deep frustrations with me. This article puts that evolved thinking together with the massive feedback that piece received. It will give you actionable ways to decrease your tax burden — and who wouldn’t want that?

The innovation tax, like income tax, is real. Of course, it saps morale (with resulting attrition and churn), but it also has other financial and opportunity costs. Taxed organizations see their pace of innovation suffer as people and resources are locked into maintaining rather than innovating. We named this tax DIRT. Why? Well, it’s rooted in data (D), because it so often springs from the difficulty of using legacy databases to support modern applications that require access to real-time data to create rich user experiences. It affects innovation (I), because your teams have little time to innovate if they’re constantly trying to figure out how to support a complex and rickety architecture. It’s recurring (R), because it’s not as if you pay the tax (T) once and get it over with. Quite the opposite. DIRT makes each new project ever more difficult because it introduces so many components, frameworks, and protocols that need to be managed by different teams of people.

In retrospect, it’s clear that technology leaders would recognize this tax and immediately grasp the degree to which it’s caused -- or cured -- by their data architecture. Data is sticky, strategic, heavy, intricate -- and the core of the modern digital company. Modern applications have much more sophisticated data requirements than the applications we were building only 10 years ago.
Obviously, there is more data, but it’s more complicated than that: Companies are expected to react more quickly and more cleverly to all of the signals in that data. Legacy technologies, including rigid, inefficient, hard-to-program, single-model relational databases, just don’t cut it. In over 300 CxO conversations I've had since joining MongoDB in 2020, fewer than a handful of CTOs disputed this statement.

When your tech stack can’t handle the demands of new applications, engineering teams will often bolt on single-purpose niche databases to do the job (think time series, text, graph, etc.). Then they’ll build a series of pipelines to move data back and forth. And everything will get slow and complicated — and even political. Time to polish up that LinkedIn profile.

If this were rare, it wouldn’t be such a big deal. But large enterprises can have hundreds or thousands of applications, each with its own sources of data and its own pipelines. Over time, as data stores and pipelines multiply, an organization’s data architecture starts to look like a plate of spaghetti. Soon you’re operating and maintaining an entire middleware layer of ETL, ELT, and streaming. The variety of technologies, each with its own frameworks, protocols, and sometimes languages, makes it harder for developers to collaborate. It makes it extremely difficult to scale, because every architecture is bespoke and brittle. Developers spend their precious “flow” hours doing integration work instead of building new applications and features that the business needs and customers will love. Enterprise architects often end up spending their time on all the wrong things.

It’s clear to me that most customers are ready for a new approach to data architecture. One of the best parts of my job is listening to and learning from other CxOs.
Since the pandemic made it impossible to do that in person, MongoDB moved these discussions online, inviting technology leaders to hash out some of their biggest problems 1:1 and in groups with me. In one of those sessions, a CTO commented, “Technical debt should be carried on your CFO's balance sheet.” Even on Zoom, the power of that statement was clear. We also started looking at slide decks about data architecture from some of the best-known venture capital firms. Certainly VCs must position each of their portfolio companies as a critical player in the data architecture of the future. But the overall vision was not compelling. One technology leader said, “When I look at 20 net-new technologies I need to learn, it’s terrifying.” Others commented that just looking at these architecture diagrams was a little off-putting, because they knew their own organization’s data architecture was at least that complicated already. They knew they needed to simplify their data architecture, but more than one admitted to postponing this work -- indefinitely -- because it was just too daunting. I recently met with a major health care company whose executives think it’s just barely possible, but they are bravely diving in anyway, knowing that they must do it and that they’ll learn along the way as they tear down their monoliths. In many cases, the innovation tax manifests as the inability to even consider new technology because the underlying architecture is too complex and difficult to maintain, much less understand and transform. This is why a lot of senior people at enterprise companies are sitting with their fingers in the transformation dike, waiting for retirement -- they think they can’t modernize. It won’t surprise you that we also saw how MongoDB, as a general purpose database able to handle all types of data at speed and scale, could help solve this problem. Let me be clear. 
I’ve been working on or with databases for my entire 35-year career, and I joined MongoDB for a reason: I believe we can build the database and application-building environment that I’ve wanted to create and use for at least 30 of those years. Our vision of MongoDB goes beyond our namesake database to a broader, more versatile data platform that allows you to accelerate and simplify how you build any type of application. It represents significant progress toward our larger goal, which remains the same as ever: to make data stunningly easy to work with. We want to see data become an enabler of innovation, not a blocker. And we want to finally allow technology teams to start to untangle their sprawl and get rid of their DIRT.

Where to start? It’s good to have a better understanding of just how DIRT might be holding your teams back. Do your developers have trouble collaborating because the development environment is so fragmented? Do schema changes take longer to roll out than the application changes they’re designed to support? Do you have trouble building 360-degree views of your customers? And if so, why? These are all good places to start digging in the DIRT.

You might also take a hard look at your applications and data sources, as well as what it would take to move your data onto a new data platform. That could mean identifying the objects in your applications and all the applications that interact with them. You could then assign a complexity score to each one based on factors such as properties, methods, and collections. Now take a step back and identify each application that connects to each of those objects and rank it based on how mission-critical it is, how many people rely on it, how many tasks it has to perform, and the complexity of those tasks.
Once you have a better handle on all this complexity, you’ll be better positioned to create a plan to move off your legacy systems, perhaps starting with the least complex and least integrated data sources. Of course, your metrics and your mileage will vary, but the point is to start. I don’t pretend any of this is easy. Like many of you, I’ve spent most of my career working on problems just like these. But that also means I know progress when I see it, and the beginning of a way for organizations to start to clean up their DIRT. I’ll be continuing to write about these challenges and hopefully add some perspective. If you’re curious to learn more about DIRT, you can download our white paper. As always, I’m eager to have you tweet your alignment, lack thereof, or other thoughts at @MarkLovesTech. You can also reach out to me on marklovestech.com, where you will find a compilation of my latest musings related to MongoDB and otherwise.
Safe Software Deployments: Z Deployments
If you’ve gotten this far in my Safe Software Deployment series, you know how scary deployment day can be. Sleepless nights. Knots in the stomach. Cold sweats. These are the symptoms of uncertainty. And three decades of experience have taught me that all the positive thinking in the world won’t ensure a bug-free deployment. That’s why I’ve developed a number of techniques that can consistently help teams minimize fear and achieve safe software deployment. In the last post, we discussed the 180 Rule. The purpose of this post is to explain how you can use “Z Deployments” to mitigate both fear and downtime. In future posts, we’ll look at both the Goldilocks Gauge and Through the Looking Glass.

Z Deployments are more than a catchy name. This is all about failed rollbacks, which in my experience are the biggest source of downtime in any software deployment pipeline. Now, we all try our best to eliminate the need for rollbacks in the first place - but when they do happen, we want them to be successful. However, in most companies, rollbacks are only tested in prod, not in the prior stages of the pipeline. Even if you use the 180 Rule, which encourages quick and automated rollbacks, you don’t have any more certainty that they will work. This is where Z Deployments come in.

With a Z Deployment, the goal is to make rollbacks just as predictable and reliable as your normal “roll forward” software deployments. I call this technique a Z Deployment because if you chart out the process, it looks like a Z. But you can also think of Z Deployments as akin to pressing “Command-Z” on your keyboard: undo. Fast, simple, no drama.

Here’s how it works. Roll your code forward from development into staging. In staging, do your canary testing. Then roll back into development. Do your canary testing again. If it doesn’t work, then you just proved that your rollback code was faulty in some way. Roll your code forward into staging again, and do your full testing.
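The sequence above can be sketched as a single function. The four callables are assumed hooks into your own pipeline (the names and signatures are illustrative); the canary run after the rollback is the step that proves the rollback code itself works:

```python
def z_deployment(roll_forward, roll_back, canary_test, full_test) -> str:
    """Sketch of a Z Deployment from development toward production.

    roll_forward(env) / roll_back(env) move the new code into or out of
    an environment; canary_test(env) and full_test(env) return True on
    success. All names are hypothetical stand-ins for real pipeline hooks.
    """
    roll_forward("staging")
    if not canary_test("staging"):
        return "abort: new code failed canary in staging"
    roll_back("staging")            # top stroke of the Z: undo the deploy
    if not canary_test("staging"):
        return "abort: rollback left staging broken -- fix the rollback code"
    roll_forward("staging")         # bottom stroke of the Z: redeploy
    if not full_test("staging"):
        return "abort: full test failed in staging"
    roll_forward("production")
    return "deployed to production"
```

If every step succeeds, the function walks the Z and finishes in production; a faulty rollback is caught in staging, where it is cheap, instead of during a production incident.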
If it’s successful, roll your code forward into production. Of course, this only works if your staging environment is clean and your team trusts it. I’ll get into this more in a future post called “Through the Looking Glass.” But the bottom line is that developers need to know that things will work in production, including any needed rollbacks. And the only way to do that is to test rollbacks in staging. Your version of canary tests and full tests might be different - in a perfect world you’d run full tests three full times, but often build systems aren’t set up to do that quickly enough.

Too often, staging is not clean. But generally, when developers deploy to staging, their added functionality tends to work. Everyone else is using staging, and their functionality is working, too. This is the “Happy Path” - where engineers test that their new thing works. That sounds great. But what else happens? Adjacent things get broken. Often when you roll back, you’re not necessarily returning to your system’s original state, either for your own software change or for the adjacent software components. Your rollback code has to undo all the state changes your deployment to staging (or prod) may have made. Otherwise, the staging environment becomes polluted, and the results in staging won’t match the results in production. Developers lose faith in staging, and deployment again becomes a terrifying ordeal.

I used to work with someone who was absolutely obsessive about staging. He ran testing, and he refused to have a long-term staging environment. Instead, his team blew away staging every month and rebuilt it from scratch. Did I like this? Absolutely. Did it work? Yes. Developers trusted staging, which meant that deployments to prod were less scary.

The next step of safe software deployment is to embrace the Goldilocks Gauge, which helps make deployments routine and even boring – in a good way.
It also makes both the 180 Rule and Z Deployments easier to execute, and it’s a necessity for teams working toward continuous deployment. In the meantime, feel free to share your own techniques for safe deployments at @MarkLovesTech.
Safe Software Deployments: The 180 Rule
In my last post, I talked about the anxiety developers feel when they deploy software, and the negative impact that fear has on innovation. Today, I’m offering the first of four methods I’ve used to help teams overcome that fear: the 180 Rule. Developers need to be able to get software into production, and if it doesn’t work, back it out of production as quickly as possible and return the system to its prior working state. If they have confidence that they can detect problems and fix them, they can feel more confident about deploying.

All deployments have the same overall stages:

Deployment: You roll the software from staging to production, either in pieces -- by directing more and more transactions to it -- or by flipping a switch. This involves getting binaries or configuration files reliably to production and having the system start using them.

Monitoring: How does the system behave under live load? Do we have signals that the software is behaving correctly and performantly? It’s essential that this monitoring focuses more on the existing functionality than just the “Happy Path” of the new functionality. In other words, did we damage the system through the rollout?

Rollback: If there is any hint that the system is not working correctly, the change needs to be quickly rolled back from production. In a sense, a rollback is a kind of deployment, because you’re making another change to the live system: returning it to a prior state.

The “180” in the name of the rule has a double meaning. Of course, we’re referring here to the “180 degree” about-face of a rollback. But it’s also a reference to an achievable goal of any deployment. I believe that any environment should be able to deploy software to production and roll it back if it doesn’t work in three minutes, or 180 seconds.
This gives 60 seconds to roll binaries to the fleet and point your customers to them, 60 seconds to see if the transaction load or your canaries show problems, and then 60 seconds to roll back the binaries or configurations if needed. Of course, in your industry or for your product, you might need this to be shorter. But the bottom line is that a failed software deployment should not live in production for more than three minutes.

Developers follow these three stages all the time, and they often do it manually. I know what you’re thinking: “How can any human being deploy, monitor, and roll back software that fast?” And that is the hidden beauty of the 180 Rule. The only way to meet this requirement is by automating the process. Instead of making the decisions ourselves, we must teach the computers how to gather the information and make the decisions themselves. Sadly, this is a fundamental change for many companies. But it’s a necessary change. Because the alternative is hoping things will work while fearing that they will not. And that makes developers loath to deploy software.

Sure, there are a lot of tools out there that help with deployments. But this is not an off-the-shelf, set-it-and-forget-it scenario. You, as the developer, must provide those tools with the right metrics to monitor and the right scripts to both deploy the software and possibly roll it back. The 180 Rule does not specify which tools to use. Instead, it forces developers to create rigorous scripts and metrics, and ensure they can reliably detect and fix problems quickly.

There’s a gotcha that many of you are thinking of: the 180 Rule is not applicable if the deployment is not reversible. For example, deploying a refactored relational schema can be a big problem, because a new schema might introduce information loss that prevents a rollback. Or the deployment might delete some old config files that aren’t used by the new software.
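The automated deploy-monitor-rollback loop at the heart of the 180 Rule can be sketched as below. In a real system the health checks would read live metrics against a 60-second monitoring budget; here, an iterable of booleans stands in for that stream of canary checks, and all names are illustrative:

```python
def run_180(deploy, rollback, health_checks) -> str:
    """Simulated 180 Rule: deploy, watch health signals, and roll back
    automatically on the first bad one.

    deploy() and rollback() represent your automated scripts;
    health_checks is an iterable of booleans standing in for the
    canary/metric checks sampled during the monitoring window.
    """
    deploy()
    for healthy in health_checks:
        if not healthy:
            rollback()   # automated -- no human in the decision loop
            return "rolled back"
    return "promoted"
```

The machine, not a person, makes the rollback call; that is the only way the whole cycle fits inside a 180-second budget.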
I’ll talk more about how to avoid wicked problems like these in my subsequent posts. But for now, I’m interested to hear what you think of The 180 Rule, and whether you’re using any similar heuristics in your approach to safe deployment.
Safe Software Deployments: Overcoming the Fear and Loathing of Pushing to Prod
Over the course of my career, I’ve had the privilege of deploying many different types of software. I’ve shipped CDs. I’ve pushed customer software over the web. I’ve updated database instances and control planes. And I’ve live-updated large, running, mission-critical systems. I call this a privilege because getting software into the hands of end users is what software engineers love most. But deployments are not all fun and games. And while each deployment presents its own unique challenges, there is one thing they all have in common: fear.

Those of you responsible for significant software deployments know exactly what I’m talking about. You work, you prepare, you test. But when the day finally comes for your software to set sail, you are left hoping and praying it proves seaworthy on the Ocean of Production. In most companies, production is so different from your development and staging environments that it’s almost impossible to know whether the code that worked in staging is going to succeed in production. Yet one thing is certain: if your software fails, everybody is going to know about it. Hence the fear.

When it comes to understanding the effects of fear on the developer, I think Frank Herbert, author of the epic science-fiction saga Dune, said it best: “Fear is the mind-killer.” Fear undermines experimentation and the entrepreneurial spirit. It discourages risk-taking and leads to bad habits, like avoiding deployment for months. And worst of all, fear slows down the innovation process. (See my post on the Innovation Tax many organizations are paying, and don’t know it.) Pushing to production is undeniably scary. But over the last 30 years, working with my peers, I’ve developed a few methods for creating the conditions for safe, confident deployments.
And my next four blogs in this series will unpack each of them in turn:

The 180 Rule - Enabling fast, automated, easily reversible deployments

Z Deployments - Limiting downtime from failed rollbacks

The Goldilocks Gauge - Making the size and frequency of deployments just right

Through the Looking Glass - Ensuring alignment between Dev, Stage, and Prod environments

These methodologies aren’t perfect, and they won’t guarantee you a bug-free deployment. But they’re the best practices I’ve seen. And they help create a culture of confidence within an engineering team, which is the foundation of meaningful innovation. To get started, my next blog will explain the “180 Rule” to help you reduce outage minutes in production. In the meantime, feel free to share your own tips and techniques for safe deployments with @MarkLovesTech.
MongoDB & Bosch: Discussion on AIoT
For more than a decade, the digital transformation of industry has been focused on the technologies that make up the Internet of Things (IoT). As AI and machine learning mature, a new field has appeared that combines these trends: AIoT, the Artificial Intelligence of Things, which applies AI to the data collected by IoT devices. Among the firms pioneering this space is the engineering and industrial giant Bosch, which has long been a leader in IoT. The move to AIoT has allowed Bosch to create smart products that either have intelligence built in, or have “swarm intelligence” in their back end that collects data used to improve the products.

In April 2021, Mark Porter, CTO at MongoDB, and Dirk Slama, VP of Co-innovation and IT/IoT Alliances at Bosch, sat down to discuss AIoT. They touched on what MongoDB and Bosch are working on around AIoT, and where they see the future of AIoT heading. Bosch’s new focus on AIoT has sharpened its need for a flexible, modern data platform such as MongoDB. IoT devices collect enormous amounts of data; as Bosch adds sensors and new types of data to its products, MongoDB has allowed it to adapt quickly without having to go through a schema redesign whenever it needs to implement a change to its products.

As part of their efforts to advance AIoT technology, Bosch and other companies recently created the AIoT User Group, an initiative open to anyone. The group’s goal is to bring end users working on AIoT business and use cases together with technology experts to share best practices around AIoT solutions. This co-creation approach allows for the rapid use of best practices to try out and develop new ideas and technologies. Porter and Slama’s conversation covered many AIoT topics — and a glimpse at the technology’s next steps. For instance, Slama wants to see agility added to AIoT without losing control.
In AIoT, there are many features that must be perfect on day one; but there are also a lot of features where you want to continuously improve system performance, which requires an agile approach. For Mark Porter and Dirk Slama's full conversation, check out the video below!
The Rise of the Strategic Developer
The work of developers is sometimes seen as tactical in nature. In other words, developers are not often asked to produce strategy. Rather, they are expected to execute against strategy, manifesting digital experiences that are defined by the “business.” But that is changing. With the automation of many time-consuming tasks -- from database administration to coding itself -- developers are now able to spend more time on higher value work, like understanding market needs or identifying strategic problems to solve. And just as the value of their work increases, so too does the value of their opinions. As a result, many developers are evolving, from coders with their heads down in the corporate trenches to highly strategic visionaries of the digital experiences that define brands.

“I think the very definition of ‘developer’ is expanding,” says Stephen “Stennie” Steneker, an engineering manager on the Developer Relations team at MongoDB. “It’s not just programmers anymore. It’s anyone who builds something.” Stennie notes that the learning curve needed to build something is flattening. Fast. He points to an emerging category of low code tools like Zapier, which allows people to stitch web apps together without having to write scripts or set up APIs. “People with no formal software engineering experience can build complex automated workflows to solve business problems. That’s a strategic developer.”

Many other traditional developer tasks are being automated as well. At MongoDB, for example, we pride ourselves on removing the most time-consuming, low-value work of database administration. And of course, services like GitHub Copilot are automating the act of coding itself. So what does this all mean for developers? A few things: First, move to higher ground.
In describing one of the potential outcomes of GitHub Copilot, Microsoft CTO Kevin Scott said, “It may very well be one of those things that makes programming itself more approachable.” When the barriers to entry for a particular line of work start falling, standing still is not an option. It’s time to up your strategic game by offering insight and suggestions on new digital experiences that advance the objectives of the business.

Second, accept more responsibility. A strategic developer is someone who can conceive, articulate, and execute an idea. That also means you are accountable for the success or failure of that idea. And as Stennie reminded me, “There are more ways than ever before to measure the success of a developer’s work.”

And third, never stop skilling. Developers with narrow or limited skill sets will never add strategic value, and they will always be vulnerable to replacement. Like software itself, developers need to constantly evolve and improve, expanding both hard and soft skills.

How do you see the role of the developer evolving? Any advice for those who aspire to more strategic roles within their organizations? Reach out and let me know what you think at @MarkLovesTech.
4 Common Misperceptions about MongoDB
One year ago, in the middle of the pandemic, Dev Ittycheria, the CEO of MongoDB, brought me on as Chief Technology Officer. Frankly, I thought I knew everything about databases and MongoDB. After all, I’d been in the database business for 32 years already. I’d been on MongoDB’s Board of Directors and used the products extensively. And of course I’d done my due diligence, met the leadership team, and analyzed earnings reports and product roadmaps.

Even with all that knowledge, this past year as MongoDB’s CTO has taught me that many of my preconceived notions were just plain wrong. This made me wonder how many other people might also have the wrong impression about this company. And this blog is my attempt to set those perceptions straight by sharing my four major revelations of the last year.

My first revelation is that MongoDB is not trying to become this generation’s relational database. For years I assumed that MongoDB basically wanted to be a better, more modern version of Oracle when it grew up. In other words, compete with the huge footprint of Oracle and other commercial RDBMSs that have been the industry archetype for so long. I was way off. The whole point of MongoDB is to leave all those forms of archaic, legacy database technology in the historical dust. This was never supposed to be an evolution, but instead a revolution.

Our founders not only envisioned the world's fastest and most scalable persistent store, but also one that would be programmed and operated differently. The combination of embedded documents and structures, automatic high availability, and almost-infinite distribution capability adds up to a fundamentally different way of working with data, building applications, and running those applications in production. Oracle, SQL Server, and the rest still hang their hats on E.F. Codd’s 51-year-old vision of rows and columns.
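The contrast between embedded documents and rows-and-columns is easiest to see side by side. Here is a minimal sketch in Python; the collection, field, and table names are invented for illustration and are not taken from MongoDB's documentation or the post:

```python
# Hedged illustration: one MongoDB-style document can embed what a
# relational schema spreads across several joined tables. All names here
# are hypothetical.

customer_doc = {
    "_id": 1001,
    "name": "Ada Lovelace",
    "addresses": [                       # would be a separate ADDRESSES table
        {"type": "home", "city": "London"},
        {"type": "work", "city": "Oxford"},
    ],
    "orders": [                          # would be an ORDERS table plus a join
        {"order_id": "A-17", "total": 42.50},
        {"order_id": "A-18", "total": 10.00},
    ],
}

# The relational equivalent: three tables linked by the customer id.
customers_rows = [(1001, "Ada Lovelace")]
addresses_rows = [(1001, "home", "London"), (1001, "work", "Oxford")]
orders_rows = [(1001, "A-17", 42.50), (1001, "A-18", 10.00)]

# With the document model, "everything about this customer" is one read,
# and derived values come straight off the embedded structure.
def order_total(doc):
    return sum(o["total"] for o in doc["orders"])

print(order_total(customer_doc))  # 52.5
```

The same question asked of the row form requires filtering `orders_rows` by customer id first, which is the join work the document model folds away.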
To obtain high availability and distribution of data, you need add-ons, options packages, baling wire, and duct tape. And you need a lot of database administrators. Not cheap. Even after all that, you’re still trailing the technological edge.

This is how wrong I was. Our durable competitive advantages over these legacy data stores make competing with those products almost irrelevant. We instead focus on the modern needs of modern developers building modern applications. These developers need to create their own competitive advantage through language-native development, reliable deployments to production, and lightning-fast iteration. And the world is noticing; just check out the falling slope of Oracle and SQL Server and the rising slope of MongoDB on the db-engines website.

Which brings me to my second revelation: MongoDB was built for developers, by developers. I always knew that MongoDB was exceedingly fast and easy to program against. One time while I was bored in a meeting (yes, it happens here as well!), I built an Atlas database, loaded it with 350MB of data, downloaded and learned our Compass data discovery tool, our built-in analytics aggregation pipelines, and our Charts package, and embedded live charts in a web page. This took me all of 19 minutes, end to end.

To build something like that for engineers, it just has to be built by engineers, ones who are free to focus on all the rough edges that creep into products as features are added. I was first exposed to software planning and management over 40 years ago, and my LinkedIn profile shows a pretty diverse tour around the industry. Now, one year in, I can emphatically state that engineering and product at MongoDB are both different and better than at any company I’ve ever had the privilege to work at.
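The aggregation pipelines mentioned above are ordinary lists of declarative stages. Below is a toy, pure-Python analogue of two common stages, `$match` and `$group`, just to show their shape; against a live Atlas cluster you would pass the same `pipeline` list to a driver call such as pymongo's `collection.aggregate()`. The sample documents and field names are invented for this sketch, which only handles these two stage types:

```python
# Toy in-memory evaluation of a MongoDB-style aggregation pipeline.
# Illustrative only: real pipelines run server-side via the driver.
from collections import defaultdict

docs = [
    {"item": "abc", "qty": 10, "region": "EU"},
    {"item": "xyz", "qty": 5,  "region": "EU"},
    {"item": "abc", "qty": 7,  "region": "US"},
]

# The pipeline, shaped as it would be for collection.aggregate(pipeline).
pipeline = [
    {"$match": {"region": "EU"}},                              # filter docs
    {"$group": {"_id": "$item", "total": {"$sum": "$qty"}}},   # sum per item
]

def run(pipeline, docs):
    """Evaluate only $match (exact equality) and $group (single $sum)."""
    for stage in pipeline:
        if "$match" in stage:
            cond = stage["$match"]
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in cond.items())]
        elif "$group" in stage:
            spec = stage["$group"]
            key_field = spec["_id"].lstrip("$")
            sum_field = spec["total"]["$sum"].lstrip("$")
            totals = defaultdict(int)
            for d in docs:
                totals[d[key_field]] += d[sum_field]
            docs = [{"_id": k, "total": v} for k, v in totals.items()]
    return docs

print(run(pipeline, docs))
# [{'_id': 'abc', 'total': 10}, {'_id': 'xyz', 'total': 5}]
```

The point of the declarative shape is that the same stage list works unchanged whether it is evaluated by a toy interpreter like this or pushed down to the database server.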
Our executive leadership gives engineering and product broad brushstrokes of goals and desired outcomes, and then we work together to come up with detailed roadmaps, updated quarterly, that meet those goals in the way we think best, with no micromanagement. And we’re not afraid of 3-5 year projects, either. For example, multi-cloud was more than three years in the making.

Also unlike any other company I’ve been at, we embrace the creation and repayment of tech debt, rather than sweeping it under the rug. We do this by giving our product and engineering teams huge amounts of context, delivered with candor and openness. And one more essential thing: we have an empowered program management team that improves processes (including killing them) as fast as we create them.

In short, we paint the targets for our teams and let them decide how and when to shoot. They even design the arrows and bows. It’s true bottom-up engineering. Our engineers feel valued and understood. And that, in turn, empowers them to develop features that make our customers feel valued and understood, like a unified query language, or real-time analytics and charting directly in the console, or multi-region/multi-cloud clusters where all the networking cruft is taken care of for you.

And this brings me to my third revelation: MongoDB is built for even the most demanding mission-critical applications. Fast? Yes. Easy? Of course. But mission-critical? That’s not how I saw MongoDB when I used Version 2 for a massive student data project 10 years ago. While it was the only possible datastore we could have chosen for the amount of data and the speed of ingestion and processing needed, it was pretty hard to set up and use in a 24 x 365 environment. MongoDB had gotten ahead of itself in the early 2010s. There was a gap between our capabilities and the expectations of the market. And it was painful. Other databases had had more than 30 years to solidify their systems and operations. We’d had five.
But with Version 3 we added a new storage engine, full ACID transactions, and search. We built on it with Version 4. And then again with Version 5, released this week at our .Live conference. I knew about all this progress intellectually, of course, when I joined, but not viscerally. I came to realize that the security, durability, availability, scalability, and operability our platform offers (in addition, of course, to all the features that developers love) were ideal for architecting fast-moving enterprise applications.

And I found the proof in our customer list. It reads like a Who’s Who of major global banks, retailers, and telecommunications companies, running core systems like payments, IoT applications, content management, and real-time analytics. They use our database, data lake, analytics, search, and mobile products across their entire businesses, in every major cloud, on-premises, and on their laptops.

And that leads me to my fourth and final revelation: MongoDB is no longer just a database. Of course, the database is still the core. But MongoDB now provides an enterprise-class, mission-critical data platform. A cohesive, integrated suite of offerings capable of managing modern data requirements across even the most sprawling digital estates, and scaling to meet the level of any company’s ambition, without sacrificing speed or security.

Since the day I was first introduced to MongoDB’s products, I’ve had tremendous respect and admiration for the teams and their work. After all, I’m a developer, first and foremost. And it always felt like they “got” me. But had I known then what I know now, I would have jumped on this train a long time ago. In fact, I might have camped out on their doorstep with my resume in hand. And who knows? Maybe a bunch of people reading this will do just that, and have their own revelations about how fulfilling and exciting it can be to be at a great company, with a great culture, producing great products.
I’ll write another letter a year from now, and let you know how it’s going then. In the meantime, please reach out to me here, or at @MarkLovesTech.
The Foundations of IT Modernization
Lately, I’ve been thinking a lot about the term “modernization.” It’s a broad term that means different things to different people. To some, modernization means migrating legacy systems to the cloud. To others, it means rewriting applications, containerizing, or embracing microservices architecture. And to others still, modernization is synonymous with an equally amorphous (and ubiquitous) term: digital transformation.

However you define it, modernization is all the rage right now. IDC says these investments are growing at a compound annual growth rate of 15.5%, and will reach $6.8 trillion by 2023 (yeah, trillion, with a ‘t’). This frenzy of spending on technology, services, and skills is intended to bring aging systems and business processes up to date. In many cases, these investments are urgent and necessary, as companies of all shapes and sizes, in every industry, must accelerate their pace of innovation in order to survive.

But the work of modernization is complex, costly, and technically challenging. It’s like renovating every room in a sprawling estate while you’re still living in it. It’s hard to even know where to start. To that end, I can’t help but think about the words of the CIO of a $30 billion insurance company who had already been on a modernization journey for years. He said: “We tried everything to accelerate innovation...but, in the end, it was our data platform that was holding us back.”

In other words, they were spending millions to fix up their estate, adding radiant heat, smart speakers, and a state-of-the-art home theater. But they were building on top of foundations that were first poured when disco was new. (I’m looking at you, relational databases, first conceived and implemented in the 1970s.)

In the digital economy, companies succeed or fail based on how fast they innovate. More often than not, that innovation takes the form of software and services, which in turn create value by storing, manipulating, and querying data.
And what do you use to store, manipulate, and query all that data? Your developer data platform. Years ago, that just meant ‘database with some scripts around it.’ Those days are gone. Now, MongoDB Atlas has to supply speed, governance, security, availability, and more.

So let’s get back to my modernization metaphor. You can’t build new, solid things on top of creaky, unstable, old things. We all know the old things I’m talking about: databases that make you structure your data in a way that isn’t natural, languages written to be so precise to the computer that they are inscrutable to developers, ‘roach motel’ storage systems that don’t store things in modern, open formats.

So if you want to modernize your infrastructure, or modernize your applications, or modernize the way you build software, shouldn’t you first modernize your data platform? Sure, it’s hard to renovate your house. But this is where you live. And if you want that house to last, make sure it’s built on solid foundations.

What does that mean? It means that it’s not enough to design and build the right apps. If you want to be truly modern, look at how you input and output your data, how you query, manipulate, and store it, and how you program against it. Get those things right, and you dramatically increase your pace of innovation. No matter where you are in your own modernization journey, it’s not too late to do this. Don’t believe it? Hit me back on Twitter at @MarkLovesTech and I’ll show you how.
The Difference Between R and D
I used to believe that Research and Development (R&D) departments should work in lockstep with Product teams so that they can stay focused on commercially viable innovations. After all, not every innovation has a market, and not every business has the resources to bet on future markets. All of that changed when I met Dr. Michael Cahill, the head of MongoDB Labs in Australia.

Michael came to MongoDB through our 2014 acquisition of WiredTiger, an open source storage engine company he co-founded with Keith Bostic. He holds a PhD in Computer Science and has a history of breakthrough innovation. He also has an enlightened point of view on the role of research in any technology company. “Researchers need time and space to pursue the theoretical,” he told me. “We want them to come up with crazy ideas, with much longer time horizons.”

Michael is referring to the fundamentally different mindsets required of researchers versus developers. Our developers are focused on new products or features that can make an impact in the next 3-4 quarters. Our researchers are thinking about solving problems that have the potential to reshape entire markets for decades. Big difference.

Funding this kind of innovation is challenging for the MBA set, and measuring the ROI of basic research is notoriously difficult. Progress can seem slow and difficult to quantify. Our researchers occupy a space that straddles art and science, industry and academia. They spend a lot of time reading, thinking, and tinkering. Ideas are freely shared, cultivated, iterated, and sometimes abandoned. This is the price of disruptive innovation.

In fact, MongoDB would never exist if our founders had set out to simply improve upon relational databases. Instead, they wanted to invent an entirely new way to manage data. It was an ambitious idea that required long-term thinking. An idea that, despite MongoDB’s current success, is still only in its infancy.
An idea so humongous, Michael Cahill may have even called it “crazy.”

Don’t get me wrong. The work of MongoDB Labs is firmly grounded in MongoDB’s core strategy: to constantly improve the way data is stored, accessed, secured, and processed. Document databases are only the first act of this play. And I’m certain the next act is being written as we speak, by Michael and his team.

Have a different approach to R&D? Think my ideas are “crazy”? Let me hear about it on Twitter at @MarkLovesTech.