The data boom and NoSQL

Reading through Mary Meeker’s excellent 2012 KPCB Internet Trends Year-End Update, I was reminded of how critical NoSQL databases are to the present and future of application development. Even the most casual perusal of Meeker’s data indicates a critical need for new data stores that can handle Gartner’s 3 V’s of Big Data: velocity, volume, and variety.

Importantly, as noted previously, such “V’s” aren’t restricted to some niche category of application. Going forward into the post-transactional future, the vast majority of applications will be better suited to a NoSQL database like MongoDB than to a legacy RDBMS.

A few selections from Meeker’s slide deck explain why.

First, the types of devices churning out copious quantities of data are proliferating. Mobile devices powered by Apple’s iOS and Google’s Android now surpass Windows personal computers. This translates into huge new data sources, all of which must be stored somewhere:

Speaking of legacy, look at what has happened to communications, and how fast it happened:

I remember back when it was cool to own a Motorola brick. Very few did, given the expense. But the point is not really that mobile phones eventually reached a price point at which they could compete with and then dominate the lowly landline, but rather how fast it happened. Rest assured, if mobile phones could unseat the landline in under 20 years, after the landline dominated for 125 years, the next wave will almost certainly take considerably less than 20 years to trounce the mobile phone.

In this shift to mobile, and in subsequent shifts to other communication media, the variety, velocity, and volume of data will change dramatically. An RDBMS is simply incapable of handling such changes.

How much data are we talking about? Meeker gives an answer:

Smart enterprises are turning to MongoDB to future-proof their applications. Rather than relying on the rigid, fixed schema that the RDBMS world requires, savvy developers are turning to über-flexible document databases, which do not impose one.
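To make "flexible schema" concrete, here is a minimal sketch in Python. It is an illustration only: plain dictionaries stand in for documents in a single collection, no MongoDB driver is involved, and the field names and the helpers `find` and `with_field` are hypothetical, invented for this example.

```python
# Illustrative only: plain dicts stand in for documents in one collection.
# No database is involved; all field names here are hypothetical.
articles = [
    {"_id": 1, "title": "Election results", "body": "Full text of the story"},
    {
        "_id": 2,
        "title": "Live blog",
        "body": "Rolling updates",
        # Only this document carries reader comments.
        "comments": [{"user": "ann", "text": "Great coverage"}],
    },
    # This document has no body at all, but adds images and tags.
    {"_id": 3, "title": "Photo essay", "images": ["a.jpg", "b.jpg"], "tags": ["culture"]},
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


def with_field(collection, field):
    """Return documents that contain the given field at all."""
    return [doc for doc in collection if field in doc]


# Documents of different shapes can be queried side by side.
commented = with_field(articles, "comments")
live_blog = find(articles, title="Live blog")

# The set of fields in use is discovered from the data, not declared up front.
fields_seen = sorted({key for doc in articles for key in doc})
```

Adding a new kind of content here means writing a document with new fields; nothing that already exists has to be migrated first. A real document database layers indexing, persistence, and a far richer query language on top of this basic idea.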

This is what The Guardian learned. The venerable UK news organization couldn’t adapt its business to embrace rich, dynamic content with its old-world relational database. Embracing MongoDB not only enabled The Guardian to embed user engagement into its services, it also allows the organization to easily change its data model over time as business needs shift.

The European Organisation for Nuclear Research, or CERN, for its part, relies on MongoDB to aggregate data from a complex array of different sources. Because it depends on dynamic typing of stored metadata, CERN couldn’t rely on an RDBMS with a fixed schema:

Given the number of different data sources, types and providers that DAS connects to, it is imperative that the system itself is data agnostic and allows us to query and aggregate the meta-data information in customisable way.

As data sources proliferate for all organizations, the volume, velocity, and variety of data will increase, sometimes exponentially. An RDBMS will likely prove useful for some tasks, but for the applications that really drive one’s business? Those are going to be NoSQL and, more often than not, MongoDB.

- Posted by Matt Asay, vice president of Corporate Strategy.
