Most companies use analytics. Many can act on data from months, weeks, or even days ago. But few can respond to changes minute by minute, let alone second by second, because:
They're stuck in a mosaic of ETL processes and Excel integrations.
They can’t analyze semi-structured, unstructured, and geospatial data.
The shape of the data changes faster than their systems can cope with.
With MongoDB, you can analyze any data in place and in real time, faster and at lower cost.
Analytics falls along a spectrum. On one end of the spectrum sit batch analytical applications, which are used for complex, long-running analyses. They tend to have slower response times (up to minutes, hours, or days) and lower requirements for availability. Examples of batch analytics include Hadoop-based workloads.
On the other end of the spectrum sit real-time analytical applications, which provide lighter-weight analytics very quickly. Latency is low (sub-second) and availability requirements are high (e.g., 99.99%). MongoDB is typically used for real-time analytics. Example applications include:
Can’t Stay Ahead. You need to account for many types of data, including unstructured and semi-structured data. And new sources present themselves unpredictably. Relational databases aren’t capable of handling this, which leaves you hamstrung.
Can’t Scale. You need to analyze terabytes or petabytes of data. You need sub-second response times. That’s a lot more than a single server can handle. Relational databases weren’t designed for this.
Batch. Batch processes are the right approach for some jobs. But in many cases, you need to analyze rapidly changing, multi-structured data in real time. You don’t have the luxury of lengthy ETL processes that cleanse data for later analysis.
Do the Impossible. MongoDB can incorporate any kind of data – any structure, any format, any source – no matter how often it changes. Your analytical engines can be comprehensive and real-time.
Scale Big. MongoDB is built to scale out on commodity hardware, in your data center or in the cloud. And without complex hardware or extra software. This shouldn’t be hard, and with MongoDB, it isn’t.
Real Time. MongoDB can analyze data of any structure directly within the database, giving you results in real time, and without expensive data warehouse loads.
Most databases make you choose between a flexible data model, low latency at scale, and powerful access. But increasingly you need all three at the same time.
Rigid Schemas. You should be able to analyze unstructured, semi-structured, and polymorphic data. And it should be easy to add new data. But this data doesn’t belong in relational rows and columns. Plus, relational schemas are hard to change incrementally, especially without impacting performance or taking the database offline.
Scaling Problems. Relational databases were designed for single-server configurations, not for horizontal scale-out. They were meant to serve hundreds of ops per second, not hundreds of thousands. Even with a lot of engineering hours, custom sharding layers, and caches, scaling an RDBMS is hard at best and impossible at worst.
Takes Too Long. Analyzing data in real time requires a break from the familiar ETL and data warehouse approach. You don’t have time for lengthy load schedules, or to build new query models. You need to run aggregation queries against variably structured data. And you should be able to do so in place, in real time.
Organizations are using MongoDB for analytics because it lets them store any kind of data, analyze it in real time, and change the schema as they go.
New Data. MongoDB’s document model enables you to store and process data of any structure: events, time series data, geospatial coordinates, text and binary data, and anything else. You can adapt the structure of a document’s schema just by adding new fields, making it simple to bring in new data as it becomes available.
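To make the idea of a flexible document schema concrete, here is a minimal sketch in plain Python. It models documents as dictionaries and does not talk to a database. The field names (type, user, ts, amount, location) and the sample values are assumptions made up for illustration. The point is only that a newer document can carry an extra field while queries over the older fields keep working.

```python
from collections import Counter

# Hypothetical event documents; all field names and values are illustrative.
events = [
    {"type": "click", "user": "u1", "ts": 1700000000},
    {"type": "purchase", "user": "u2", "ts": 1700000060, "amount": 19.99},
    # A newer document adds a GeoJSON-style location field (longitude, latitude).
    # The older documents are left untouched.
    {"type": "click", "user": "u3", "ts": 1700000120,
     "location": {"type": "Point", "coordinates": [-73.99, 40.73]}},
]


def count_by_type(docs):
    """Count documents per 'type', regardless of which other fields exist."""
    return Counter(doc["type"] for doc in docs)


def with_location(docs):
    """Return only the documents that carry the optional 'location' field."""
    return [doc for doc in docs if "location" in doc]


counts = count_by_type(events)
located = with_location(events)
```

Here count_by_type still works after the location field appears, and with_location simply skips documents that predate it.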
Horizontal Scalability. MongoDB’s automatic sharding distributes data across fleets of commodity servers, with complete application transparency. With multiple options for scaling – including range-based, hash-based and location-aware sharding – MongoDB can support thousands of nodes, petabytes of data, and hundreds of thousands of ops per second without requiring you to build custom partitioning and caching layers.
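The difference between range-based and hash-based partitioning can be sketched in a few lines of plain Python. This is a conceptual illustration only, not MongoDB's actual routing, hashing, or chunk-balancing logic. The shard count, the range boundaries, and the function names range_shard and hash_shard are all assumptions.

```python
import bisect
import hashlib

# Hypothetical setup: three shards, with range boundaries on a string key.
NUM_SHARDS = 3
RANGE_BOUNDS = ["h", "p"]  # shard 0: key < "h"; shard 1: "h" <= key < "p"; shard 2: the rest


def range_shard(key: str) -> int:
    """Range-based: neighbouring keys land on the same shard."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


def hash_shard(key: str) -> int:
    """Hash-based: keys are spread by a hash of the key, not by their order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```

Range-based placement keeps adjacent keys together, which suits range queries. Hash-based placement trades that locality for a more even spread of writes.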
Powerful Analytics, In Place, In Real Time. With rich index and query support – including secondary, geospatial and text search indexes – as well as the aggregation framework and native MapReduce, MongoDB can run complex ad-hoc analytics and reporting in place.
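As a rough illustration of an in-place aggregation, the sketch below writes a pipeline in the aggregation framework's stage syntax ($match, $group, $sort) as a Python list. With a driver such as PyMongo, a list like this would typically be passed to a collection's aggregate method. So that the example runs without a server, it also includes a plain-Python function that computes the same result over a few sample documents. The collection contents, the field names (type, region, amount), and the helper totals_by_region are hypothetical.

```python
from collections import defaultdict

# Aggregation pipeline: total purchase amount per region, largest first.
pipeline = [
    {"$match": {"type": "purchase"}},
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]

# Hypothetical sample documents; note that the click event has no 'amount' field.
orders = [
    {"type": "purchase", "region": "EU", "amount": 20},
    {"type": "purchase", "region": "US", "amount": 35},
    {"type": "click", "region": "EU"},
    {"type": "purchase", "region": "EU", "amount": 5},
]


def totals_by_region(docs):
    """Plain-Python equivalent of the pipeline above, for illustration only."""
    totals = defaultdict(int)
    for doc in docs:
        if doc.get("type") == "purchase":          # $match
            totals[doc["region"]] += doc["amount"]  # $group with $sum
    rows = [{"_id": region, "total": total} for region, total in totals.items()]
    return sorted(rows, key=lambda row: row["total"], reverse=True)  # $sort


result = totals_by_region(orders)
```

The database runs the equivalent work next to the data, so nothing has to be exported to a separate warehouse before it can be summarized.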