Lessons from the Delta Outage: See Who is Learning from Costly Airline Disruptions

Before the dust of the massive disruption to Delta Air Lines’ global service had fully settled, Reuters published a somewhat pessimistic look at the IT infrastructure of today’s top airlines.

As reported around the world, Delta suffered a power loss in its Atlanta data center, causing service to be cancelled across its network of over 15,000 daily flights. In the technology community, the fact that a seemingly insignificant occurrence (which turned out to be a small fire) could trigger total system failure was an unpleasant reminder of how fragile these aging systems can be.

However, this was not an isolated occurrence. Just weeks ago, Southwest Airlines was left with no alternative but to cancel over 2,000 flights in the wake of a network failure. In 2015, a malfunctioning router halted United Continental flights for two hours.

News outlets reporting on these outages are quick to blame aging infrastructure. Reuters reports that most major US airlines rely on Transaction Processing Facility (TPF), an IBM mainframe operating system created in the 1960s, for core processes like reservations. While this is certainly the case, the lack of infrastructure investment is not the negligence these reports may imply, but a symptom of the “razor-thin” profit margins airlines have faced over the past decades.

More recently, however, falling oil prices have proven to be a boon for those margins. With more cash on hand, airlines will start looking to invest in wholesale technology overhauls, rather than smaller projects with a more immediate return on investment, like passenger mobile apps, check-in kiosks, and luggage tracking.

Signs of this overhaul are already visible. After decades of outsourcing its core reservations and passenger services system, Delta brought its core infrastructure in-house only in 2014. Early this summer, Southwest Airlines announced it was overhauling its reservations system by partnering with Amadeus, an IT solutions provider for the travel industry and a MongoDB customer.

China Eastern Airlines, the second largest airline in China and the world’s 8th largest carrier by passenger volume, recently replatformed its fare calculation engine with MongoDB. Originally built on an outdated relational database, the old system struggled to provide up-to-date fare data to newer customer-facing systems. Not only is the airline now able to serve thousands more customers every second and deliver near real-time information to passengers at scale, it has also achieved fault resilience by using MongoDB replica sets, which replicate the database across multiple nodes to eliminate a single point of failure.
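To make the replica set idea concrete, here is a minimal toy sketch in plain Python. It is not MongoDB code and does not reflect China Eastern's system: the class, node names, and fare value are all hypothetical, and the "election" simply promotes the first live node, whereas a real replica set elects a new primary by majority vote. It only illustrates why keeping copies on several nodes removes the single point of failure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    alive: bool = True
    data: dict = field(default_factory=dict)


class ToyReplicaSet:
    """Toy model only: every write is copied to all live nodes."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.primary = self.nodes[0]

    def _ensure_primary(self):
        # Simplification (assumption): promote the first live node.
        # Real replica sets hold a majority-based election instead.
        if not self.primary.alive:
            live = [n for n in self.nodes if n.alive]
            if not live:
                raise RuntimeError("no live nodes left")
            self.primary = live[0]

    def write(self, key, value):
        self._ensure_primary()
        for node in self.nodes:
            if node.alive:
                node.data[key] = value

    def read(self, key):
        self._ensure_primary()
        return self.primary.data[key]


fares = ToyReplicaSet(["node-a", "node-b", "node-c"])
fares.write("PVG-JFK", 820)        # hypothetical route and fare
fares.nodes[0].alive = False       # simulate losing the primary
print(fares.read("PVG-JFK"))       # still readable from another node
print(fares.primary.name)
```

In this sketch the fare stays readable after the first node is lost because a copy already lives on the other two.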

However, massive system overhauls have been stiffly resisted by some out of fear that upgrades will break more modern systems built on top of the 1960s technologies, or that maintenance work will take mission-critical systems offline for too long. Newer database technologies that were designed for distributed, always-on deployments may provide the answer to these concerns.

Rather than rebuilding platforms from scratch, some in the industry are opting to solve the issues beleaguering legacy systems by designing parallel systems using more modern technologies. The 5th largest airline in the world recently achieved a “single view of their customer” by extracting 100TB of data from their legacy systems into a single data lake, and loading it into MongoDB using a common data model. This allowed for a consistent schema for analytics and advanced personalization to draw from, all while leaving legacy systems intact.
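As a rough illustration of what a "common data model" can mean in practice, the sketch below maps records from two imagined legacy sources into one customer document per ID. Every field name, source layout, and sample value here is an assumption made for the example; the airline's actual schema and pipeline are not described in the source. The resulting dictionaries are the kind of documents that could then be loaded into a document database.

```python
def from_reservations(row):
    """Map a (hypothetical) legacy reservations row to the common model."""
    return {
        "customer_id": row["PAX_ID"],
        "name": row["PAX_NAME"].title(),
        "bookings": [row["PNR"]],
    }


def from_loyalty(row):
    """Map a (hypothetical) loyalty-program row to the common model."""
    return {
        "customer_id": row["member_ref"],
        "loyalty": {"tier": row["tier"], "miles": row["miles"]},
    }


def merge_single_view(records):
    """Fold normalized records into one document per customer."""
    view = {}
    for rec in records:
        doc = view.setdefault(
            rec["customer_id"],
            {"customer_id": rec["customer_id"], "bookings": []},
        )
        for key, value in rec.items():
            if key == "bookings":
                doc["bookings"].extend(value)   # accumulate, do not overwrite
            else:
                doc[key] = value
    return view


records = [
    from_reservations({"PAX_ID": "C1", "PAX_NAME": "ADA LOVELACE", "PNR": "ABC123"}),
    from_loyalty({"member_ref": "C1", "tier": "gold", "miles": 42000}),
]
single_view = merge_single_view(records)
print(single_view["C1"])
```

The legacy rows are only read, never modified, which mirrors the point that the source systems stay intact while analytics and personalization draw from the merged view.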

The fact remains, as any frequent flyer will tell you, that flight schedules can be derailed despite an air carrier’s best efforts. The world’s 3rd largest airline is now better poised to handle these unexpected disruptions after rebuilding its seat re-accommodation app on MongoDB. With its legacy Oracle application, it could take 5 minutes to provide rebooked seats to passengers from cancelled flights. MongoDB’s flexible data model allowed for additional passenger information to be stored in the database, eliminating the need for feeds from other systems. Now, seat reassignments take seconds.
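The benefit of storing additional passenger information alongside the booking can be sketched in a few lines. This is a hypothetical example, not the airline's application: the flight numbers, field names, and the simple preference-matching rule are all assumptions. The point is only that when the passenger document already carries what the rebooking logic needs, no lookup against another system is required.

```python
# Hypothetical passenger document with preferences stored in the same record.
passenger = {
    "name": "Ada Lovelace",
    "cancelled_flight": "XX100",
    "seat_preference": "aisle",
    "loyalty_tier": "gold",
}

# Hypothetical open seats on a replacement flight.
open_seats = [
    {"flight": "XX102", "seat": "14B", "type": "middle"},
    {"flight": "XX102", "seat": "14C", "type": "aisle"},
]


def pick_seat(passenger, open_seats):
    """Return the first seat matching the stored preference, else any seat."""
    for seat in open_seats:
        if seat["type"] == passenger["seat_preference"]:
            return seat
    return open_seats[0] if open_seats else None


print(pick_seat(passenger, open_seats))
```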

In a highly competitive industry prone to all manner of existential threats (from severe weather to terrorism to data center failures), it appears that the cost-benefit analyses that have for so long been stacked against technology upgrades may be slowly shifting to favor more modern, agile, and fault-tolerant systems.


To learn more about how a database can make a difference in your business, download our white paper:

Quantifying Business Advantage: The Value of Database Selection