Nuxeo Achieves 11-Billion-Object Benchmark on AWS in Partnership with MongoDB Atlas

Pushing the Limits to Demonstrate Nuxeo's Ability to Manage the Largest, Most Complex Content Management Requirements

Case Study


Executive Summary

Having already proven that its Content Services Platform could handle nearly unlimited workloads, Nuxeo set out to smash its established benchmark of 1 billion documents. The challenge: load 11 billion objects while maintaining the highest levels of system performance. MongoDB, Nuxeo’s data platform partner, helped make it possible with powerful tools capable of managing hundreds of metadata tags per object across billions of objects. In addition, Nuxeo leveraged the AWS cloud infrastructure and Amazon Elasticsearch Service – both key components of its Nuxeo Cloud offering – to achieve elastic scalability and the highest levels of indexing and search performance.

Customers Expect Content Management Without Limitations

Global companies use Nuxeo to build applications that manage enormous volumes of digital content including scanned images, documents, PDFs, and even rich-media assets like high-resolution photos and video. Nuxeo supports high levels of complexity including customization, layers of security, and sophisticated metadata as well as complicated workflows and business processes. This enables Nuxeo customers to put their content to work solving complex business problems and delivering unique content-enabled solutions.

David Woolston, VP Business Development at Nuxeo, describes Nuxeo as a content services platform on steroids. “One thing that drives Nuxeo’s success is our ability to scale up to handle the most complex workloads and largest repositories of documents,” explained Woolston. The secret is their relationship with AWS and MongoDB Atlas, the global cloud database service.

As a disruptive player in the content services market, Nuxeo has differentiated itself through technology innovation. The Nuxeo Platform operates on the AWS Cloud, which allows for unparalleled flexibility and scalability. Nuxeo also uses the managed database service, MongoDB Atlas. Atlas is capable of managing hundreds of metadata tags across literally billions of Nuxeo objects, storing them securely and making them easily digestible and queryable in JSON-like documents. This means the Nuxeo team can focus on building new content services and platform capabilities rather than managing a database.
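To make the idea of metadata stored in JSON-like documents more concrete, here is a minimal sketch in plain Python. It is an illustration only: the field names (customer, region, tags, and so on) are hypothetical and not taken from Nuxeo's schema, and the matches helper mimics a simple equality filter rather than the MongoDB query engine.

```python
import json

# Hypothetical content object; every field name here is an illustrative assumption.
invoice = {
    "id": "doc-0001",
    "type": "Invoice",
    "metadata": {
        "customer": "ACME",
        "region": "EMEA",
        "year": 2019,
        "tags": ["scanned", "pdf"],
    },
}


def get_path(doc, dotted_path):
    """Walk a dotted path such as 'metadata.region'; return None if it is missing."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, query):
    """Equality filter over dotted paths; a list field matches if it contains the value."""
    for path, expected in query.items():
        actual = get_path(doc, path)
        if isinstance(actual, list):
            if expected not in actual:
                return False
        elif actual != expected:
            return False
    return True


# The object serializes directly to JSON, nested metadata included.
print(json.dumps(invoice, indent=2))
print(matches(invoice, {"metadata.region": "EMEA", "metadata.tags": "pdf"}))
```

Because each object carries its own nested metadata, adding a new tag does not require a schema change for every other object, which is one reason a document model suits content with many optional fields.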

Challenge Accepted

“The biggest companies in the world are coming to us with their largest workloads and saying, ‘we know you can handle this.’ We believe our technology can scale almost endlessly with MongoDB Atlas and AWS,” said Woolston. “With a 1-billion-object benchmark already completed, we really wanted to push the limits and prove it out with an 11-billion-object benchmark.”

The idea was to test Nuxeo from an optimal application and configuration perspective, not to simply throw money at more hardware.

“In order to solve our customers’ complex challenges, Nuxeo provides an extremely robust platform. The deployment includes Elasticsearch, MongoDB Atlas, and all the bells and whistles of the Nuxeo Platform itself,” said Joe Quinto, Senior Program Manager for Nuxeo Cloud. “We wanted to push the boundaries of every element – stress as many components as we could. Effectively managing 11 billion documents was our yardstick. The goal was not just to hit the ceiling, but to break through it.”

Dynamic Testing with a Two-Phase Approach

Nuxeo adopted a two-phase approach for its benchmarking exercise. In the first phase, the Nuxeo team used a single Nuxeo repository configured with MongoDB Atlas and Elasticsearch. The point of the exercise was to test the practical limits of a single-repository approach and also to illustrate the inherent advantages of a NoSQL solution like MongoDB. In the first phase, the team was able to successfully scale a single Nuxeo repository to 3 billion objects with no database sharding, a feat that’s virtually impossible with SQL-based technologies.

In the second phase of the project, the Nuxeo team employed a multi-repository approach, using multiple instances of MongoDB Atlas and the Amazon Elasticsearch Service, as well as MongoDB sharding, to scale efficiently to over 11 billion objects. For both phases of the benchmark the team used an actual Nuxeo Cloud deployment. This was not a highly orchestrated lab exercise.
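The case study does not describe how objects were distributed across repositories or shards. As a rough sketch of one common technique, the Python below assumes hash-based routing: each object ID is hashed and mapped to one of a fixed set of repositories. The repository names and the route function are hypothetical; MongoDB itself supports hashed shard keys, but nothing here reflects Nuxeo's actual configuration.

```python
import hashlib
from collections import Counter

# Hypothetical repository names, for illustration only.
REPOSITORIES = ["repo-a", "repo-b", "repo-c", "repo-d"]


def route(object_id, repositories=REPOSITORIES):
    """Pick a repository deterministically from a hash of the object ID."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer bucket key.
    bucket = int.from_bytes(digest[:8], "big") % len(repositories)
    return repositories[bucket]


# Route a batch of synthetic IDs and count how many land in each repository.
counts = Counter(route(f"doc-{i}") for i in range(10_000))
for name in REPOSITORIES:
    print(name, counts[name])
```

Hashing spreads objects roughly evenly without a central lookup table, at the cost of moving data when the number of repositories changes. Range-based or tenant-based routing are alternative designs with different trade-offs.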

In both phases of the benchmarking project, the team continuously tested to ensure that the Nuxeo Platform and underlying technologies would scale and perform at a level that would meet enterprise customers’ high expectations. The team monitored metrics to determine if, as the repository grew, users could continue to import new objects and metadata at an extremely high rate of ingestion. Since this was a real-world exercise, Nuxeo employed its default Ingestion Pipeline throughout the benchmarking project, complete with metadata import and full-text indexing.

Nuxeo also employed automated testing to address common user activities for content management. This included search (both database queries and full-text searches) and navigation as well as create, read, update, and delete (CRUD) actions.

Over three months, as Nuxeo progressed toward its 11-billion-object benchmark and continued to load new objects and data into the system, MongoDB Atlas managed the database. While the test was running, the MongoDB team monitored data access patterns and gave advice on best practices for such a large number of documents. It was important to identify the points where capacity on the platform needed to increase before moving to higher tiers.

Using more than 100 metrics, MongoDB was able to identify tipping points and suggest ways to improve the environment before going on to the next level. “The Nuxeo team was prepared with a solid plan and they let us know what they wanted to achieve,” said Diego Burstyn, Sr. Solutions Architect at MongoDB. “Our role was to correlate that with what we were seeing under the hood.”

Pushing the Limits While Maintaining Response Times and Throughput

The key success indicators the Nuxeo team was looking for were response times and throughput. It was important to see that the application was responding as expected and that enormous amounts of data could be loaded in a reasonable amount of time.

A critical outcome for the project was to be able to provide Nuxeo customers with real-world guidance and best practices for scaling up the Nuxeo Platform along with key services, like MongoDB Atlas. The team categorized its learnings in three specific areas: elasticity, steps, and real-life usability.

Bragging Rights and Lessons Learned

The 11-billion-object benchmark test proved Nuxeo’s ability to scale with AWS Cloud elasticity. Flexible scalability allowed the team to expand capacity when a limit was reached, or for a temporary need like re-indexing data. By implementing the test in deliberate steps, the team learned which infrastructure and configuration adjustments were needed to maintain optimal performance at extremely high volumes. Testing in an actual environment, with real documents, demonstrated how the Nuxeo Platform can handle enormous document volumes in real-life applications.

“Traditional content management systems can’t scale the way we can,” said Woolston. “But beyond that, the power of MongoDB Atlas combined with the complete toolset and support we get from AWS allows us to scale in the most efficient and intelligent way possible.”

Hitting the 11-billion-object benchmark was more than a matter of bragging rights. It provided tangible proof of the unique value proposition Nuxeo offers through its partnership with MongoDB Atlas and AWS. This exercise delivered an understanding of how these different components scale and the best practices to do this efficiently. With these learnings and meaningful data, Nuxeo customers are better prepared to scale up, either in Nuxeo Cloud or in their own cloud environments.

Get Started Today

Try MongoDB Atlas on AWS: Redeem promo code NuxeoAtlas100 for $100 in Atlas Credits

Schedule a demo of the Nuxeo Platform