The Changing Of The Technology Guard: NoSQL + Hadoop

Matt Asay


Big Data truly is prompting a changing of the technology guard. In an excellent article today, The Wall Street Journal notes that Hadoop is "challenging tech heavyweights like Oracle and Teradata [whose] core database technology is too expensive and ill-suited for typical big data tasks." This follows my own observation that repeated earnings misses among legacy technology vendors indicate that real, tectonic shifts are underway.

In other words, NoSQL and Hadoop are the new normal.

What the Journal missed, however, was the right emphasis. As fantastic as Hadoop is, it's only one part of the Big Data story. And not necessarily the most significant part.

For example, the Journal writes:

Traditional databases organize easy-to-categorize information. Customer records or ATM transactions, for example, arrive in a predefined format that is easy to process and analyze. These so-called relational databases are the kind offered by Oracle and Teradata among others, and the market for them runs to an estimated $30 billion a year, according to IDC estimates.

The Internet, though, is messy. Companies now also have to make sense of and store the mass of data being generated from tweets, Web-surfing logs and Internet-connected machines. Hadoop is a cheap technology to make that possible, and it was born of Google technologies detailed in academic papers.

The article is dead-on in most respects, except for the market that Hadoop truly tackles. Of the $30 billion database market, Hadoop addresses just a quarter: the OLAP (online analytical processing) market. The much larger market is the traditional OLTP (online transaction processing) market, and this is the home of NoSQL databases like MongoDB.

Perhaps unsurprisingly, then, MongoDB has the fastest-growing Big Data community and is the second-hottest job trend, behind only HTML5. Big Data, after all, isn't merely about analytics. It's primarily about operational databases that can help enterprises put their data to work in real time.