Only 12% of enterprises have a Big Data project? Not likely

According to a new SAS survey of 339 data management professionals, only 12% of those surveyed have a Big Data project up and running, and a scant 16% more are in the planning or testing phases.

Given that Big Data regularly receives Big Hype, what gives?

Yes, it could be that Big Data is simply a fad, but here at 10gen we don't think so. MongoDB is regularly named one of the top two Big Data technologies, along with Hadoop.

At 10gen we see Big Data projects all the time, but importantly, they're often not called "Big Data." Instead, we see a range of data-driven applications, from Reverb to City of Chicago to The Guardian.

In some cases, these involve large volumes of data. But not always. Often these "Big Data" applications depend upon a high velocity of unstructured, varied data. That's Big Data, too.

As Kenneth Cukier and Viktor Mayer-Schönberger write in their excellent book Big Data, "Big Data" refers not to the absolute volume of one's data set, but rather to how much of one's data makes it into the data set. Sampling doesn't count:

When we talk about big data, we mean "big" less in absolute than in relative terms: relative to the comprehensive set of data.

In sum, many enterprises actually have a range of Big Data projects already underway, applications that are driving serious value for these companies. They just don't think of them as "Big Data" projects because they don't involve petabytes of data. 

Which means, of course, they're right in the mainstream of Big Data. SAP surveys and others have found that the average size of a Big Data project is in the tens of terabytes, not the petabytes or zettabytes.

Want to know a convenient way to uncover the Big Data projects in your company? Ask around for data-driven applications, irrespective of volume of data. Invariably you'll find a number of projects that require real-time analysis and/or processing of a variety of data sources. Then the next time SAS comes calling to see if you have a Big Data project, you can happily tell them "Yes. Many."