Playback: How Cisco Deploys MongoDB For High Volume Time Series Data Streams


Playback is where the MongoDB blog brings you selected talks from around the world and around the industry. Here, we continue showcasing great talks from MongoDB World 2018 by people running MongoDB in production and at scale.

The Network Assurance Engine team at Cisco Systems has a lot of experience with time series data and MongoDB. Larger network fabrics generate 12 million time series data points every hour, and all of those data points need to be analyzed to ensure that the network and its applications are working correctly. In their MongoDB World 2018 talk, MongoDB for High Volume Time Series Data Streams, Gabriel Ng and Tom Monk show how they took on that data challenge using MongoDB as their time series database.

With hundreds of millions of event documents collected, the analysis process needs access to at least several hours of contextual data, and at that scale the indexes can exceed the available RAM. As an added constraint, the team needed to stay within a small resource footprint for their database while retaining fast write throughput and the ability to view older data without degrading performance.

This led the team to focus on optimizing their data model for time series data. By making the timestamp part of their collection index, they could safely keep only the recently used parts of the indexes in memory. Further optimizations came from date partitioning the time series data and working with MongoDB support to get the best performance from the underlying operating system.
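A minimal sketch of those two ideas in Python, assuming a pymongo-style index specification; the field names (`ts`, `device_id`) and the per-day partition naming scheme are illustrative assumptions, not the team's actual schema:

```python
from datetime import datetime, timezone

# Hypothetical helper for date partitioning: route each event to a
# per-day collection so old partitions can be dropped or archived cheaply.
def partition_name(base: str, ts: datetime) -> str:
    """Return a per-day collection name like 'events_2018_06_26'."""
    return f"{base}_{ts.strftime('%Y_%m_%d')}"

# Compound index in pymongo's (field, direction) form. Leading with the
# timestamp means recent inserts and queries touch only the most recent
# region of the index B-tree, so only that "hot" portion needs to stay
# resident in RAM.
ASCENDING = 1
EVENT_INDEX = [("ts", ASCENDING), ("device_id", ASCENDING)]

if __name__ == "__main__":
    ts = datetime(2018, 6, 26, 12, 0, tzinfo=timezone.utc)
    print(partition_name("events", ts))
```

Against a live deployment, the index spec would be passed to `create_index()` on each day's collection; the helper above only decides which partition an event lands in.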

The talk dives into these and other elements of enabling Cisco's Network Assurance Engine team to analyze terabytes of time series data by making use of MongoDB's flexibility and versatility when it comes to indexing data.