Nuxeo Achieves 11-Billion-Object Benchmark on AWS in Partnership with MongoDB Atlas
Pushing the Limits to Demonstrate Nuxeo's Ability to Manage the Largest, Most Complex Content Management Requirements
Case Study
Executive Summary
Having already proven that their Content Services Platform could
handle nearly unlimited workloads, Nuxeo set out to smash their
established benchmark of 1 billion documents. The challenge: load
11 billion objects while maintaining the highest levels of system
performance. MongoDB, Nuxeo’s data platform partner, helped make
it possible with powerful tools capable of managing hundreds of
metadata tags per object across billions of objects. In addition, Nuxeo
leveraged the AWS cloud infrastructure and Amazon Elasticsearch
Service – both key components of its Nuxeo Cloud offering – to
achieve elastic scalability and the highest levels of indexing and
search performance.
Customers Expect Content Management Without Limitations
Global companies use Nuxeo to build applications that manage
enormous volumes of digital content including scanned images,
documents, PDFs, and even rich-media assets like high-resolution
photos and video. Nuxeo supports high levels of complexity including
customization, layers of security, and sophisticated metadata as well
as complicated workflows and business processes. This enables Nuxeo
customers to put their content to work solving complex business
problems and delivering unique content-enabled solutions.
David Woolston, VP Business Development at Nuxeo, describes Nuxeo
as a content services platform on steroids. “One thing that drives
Nuxeo’s success is our ability to scale up to handle the most complex
workloads and largest repositories of documents,” explained Woolston.
The secret is their relationship with AWS and MongoDB Atlas, the
global cloud database service.
As a disruptive player in the content services market, Nuxeo has
differentiated itself through technology innovation. The Nuxeo
Platform operates on the AWS Cloud, which allows for unparalleled
flexibility and scalability. Nuxeo also uses the managed database
service, MongoDB Atlas. Atlas is capable of managing hundreds of
metadata tags across literally billions of Nuxeo objects, storing them
securely and making them easily digestible and queryable in JSON-like
documents. This means the Nuxeo team can focus on building new
content services and platform capabilities rather than managing
a database.
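To make the idea of metadata stored in JSON-like documents concrete, here is a minimal sketch in plain Python. The field names, the sample values, and the `matches` helper are hypothetical illustrations; they are not Nuxeo's schema or the MongoDB query API.

```python
import json

# Hypothetical content object: field names are illustrative only,
# not Nuxeo's actual document schema.
document = {
    "_id": "doc-0001",
    "type": "Invoice",
    "title": "Q3 supplier invoice",
    "metadata": {
        "customer": "ACME",
        "region": "EMEA",
        "tags": ["finance", "scanned"],
    },
}


def matches(doc, criteria):
    """Return True if every dotted-path criterion equals the value in doc."""
    for path, expected in criteria.items():
        value = doc
        for key in path.split("."):
            # A missing key or a non-dict parent means the path does not exist.
            if not isinstance(value, dict) or key not in value:
                return False
            value = value[key]
        if value != expected:
            return False
    return True


# The object serializes directly to JSON, nested metadata included.
print(json.dumps(document, indent=2))
print(matches(document, {"metadata.region": "EMEA"}))
```

Because each object carries its own nested metadata, adding a new tag does not require changing a fixed table layout, which is one reason a document model suits content with hundreds of tags per object.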
Challenge Accepted
“The biggest companies in the world are coming to us with their
largest workloads and saying, ‘we know you can handle this.’ We
believe our technology can scale almost endlessly with MongoDB Atlas
and AWS,” said Woolston. “With a 1-billion-object benchmark already
completed, we really wanted to push the limits and prove it out with
an 11-billion-object benchmark.”
The idea was to test Nuxeo from an optimal application and
configuration perspective, not to simply throw money at
more hardware.
“In order to solve our customers’ complex challenges, Nuxeo provides
an extremely robust platform. The deployment includes Elasticsearch,
MongoDB Atlas, and all the bells and whistles of the Nuxeo Platform
itself,” said Joe Quinto, Senior Program Manager for Nuxeo Cloud. “We
wanted to push the boundaries of every element – stress as many
components as we could. Effectively managing 11 billion documents
was our yardstick. The goal was not just to hit the ceiling, but to break
through it.”
Dynamic Testing with a Two-Phase Approach
Nuxeo adopted a two-phase approach for its benchmarking exercise.
In the first phase, the Nuxeo team used a single Nuxeo repository
configured with MongoDB Atlas and Elasticsearch. The point of the
exercise was to test the practical limits of a single-repository approach
and also to illustrate the inherent advantages of a NoSQL solution like
MongoDB. In the first phase, the team was able to successfully scale a
single Nuxeo repository to 3 billion objects with no database sharding,
a feat that’s virtually impossible with SQL-based technologies.
In the second phase of the project, the Nuxeo team employed a
multi-repository approach and made use of multiple instances of
MongoDB Atlas and the Amazon Elasticsearch Service as well as
MongoDB sharding to efficiently scale to over 11 billion objects. For
both phases of the benchmark the team used an actual Nuxeo Cloud
deployment. This was not a highly orchestrated lab exercise.
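The general idea of spreading objects across several repositories or shards can be sketched as follows. This is an illustration of hash-based routing under stated assumptions, not a description of how Nuxeo or MongoDB Atlas actually distributed data in the benchmark; the repository names and the `route` function are hypothetical.

```python
import hashlib


def route(object_id, repositories):
    """Pick a repository for an object using a stable hash of its ID."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then map it onto the repository list.
    index = int.from_bytes(digest[:8], "big") % len(repositories)
    return repositories[index]


# Hypothetical repository names, for illustration only.
repos = ["repo-a", "repo-b", "repo-c", "repo-d"]

# Count how many of 10,000 sample IDs land in each repository.
counts = {name: 0 for name in repos}
for i in range(10_000):
    counts[route(f"doc-{i}", repos)] += 1

print(counts)
```

A stable hash sends the same object ID to the same repository every time and tends to spread objects roughly evenly, so no single repository has to hold the full volume.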
In both phases of the benchmarking project, the team continuously
tested to ensure that the Nuxeo Platform and underlying technologies
would scale and perform at a level that would meet enterprise
customers’ high expectations. The team monitored metrics to
determine if, as the repository grew, users could continue to import
new objects and metadata at an extremely high rate of ingestion.
Since this was a real-world exercise, Nuxeo employed its default
Ingestion Pipeline throughout the benchmarking project, complete
with metadata import and full-text indexing.
Nuxeo also employed automated testing to address common user
activities for content management. This included search (both
database queries and full-text searches) and navigation as well as
create, read, update, and delete (CRUD) actions.
Over three months, as Nuxeo was progressing toward their 11-billion-object benchmark and continuing to load new objects and data
into the system, MongoDB Atlas was managing the database. While
the test was running, the MongoDB team monitored data access
patterns and gave advice on best practices for such a large number of
documents. It was important to identify points where they needed to
increase capacity on the platform before moving to higher tiers.
Using more than 100 metrics, MongoDB was able to identify tipping
points and suggest ways to improve the environment before going on
to the next level. “The Nuxeo team was prepared with a solid plan and
they let us know what they wanted to achieve,” said Diego Burstyn, Sr.
Solutions Architect at MongoDB. “Our role was to correlate that with
what we were seeing under the hood.”
Pushing the Limits While Maintaining Response Times and Throughput
As previously mentioned, the key success indicators the Nuxeo team
was looking for were response times and throughput. It was important
to see that the application was responding as expected, and enormous
amounts of data could be loaded in a reasonable amount of time.
A critical outcome for the project was to be able to provide Nuxeo
customers with real-world guidance and best practices for scaling up
the Nuxeo Platform along with key services, like MongoDB Atlas. The
team categorized its learnings in three specific areas: elasticity, steps,
and real-life usability.
Bragging Rights and Lessons Learned
The 11-billion-object benchmark test proved Nuxeo’s ability to scale
with AWS Cloud elasticity. Flexible scalability allowed the team to
expand volume when a capacity limit was reached, or for a temporary
need like re-indexing data. By implementing the test with deliberate
steps, the team learned about infrastructure and configuration
adjustments that needed to be made to maintain optimal performance
at extremely high volumes. Testing in an actual environment, with
real documents, demonstrated how the Nuxeo Platform can handle
enormous document volumes in real-life applications.
“Traditional content management systems
can’t scale the way we can,” said Woolston.
“But beyond that, the power of MongoDB
Atlas combined with the complete toolset and
support we get from AWS allows us to scale in
the most efficient and intelligent way possible.”
Hitting the 11-billion-object benchmark was more than a matter
of bragging rights. It provided tangible proof of the unique value
proposition Nuxeo offers through its partnership with MongoDB Atlas
and AWS. This exercise delivered an understanding of how these
different components scale and the best practices to do this efficiently.
With these learnings and meaningful data, Nuxeo customers are
better prepared to scale up, either in Nuxeo Cloud or in their own
cloud environments.
Get Started Today
Try MongoDB Atlas on AWS:
Redeem promo code NuxeoAtlas100 for $100 in Atlas Credits
Schedule a demo of the Nuxeo Platform
October 7, 2020