MongoDB Design Reviews Help Customers Achieve Transformative Results
The pressure to deliver flawless software can weigh heavily on developers' minds and cause teams to second-guess their processes. While no amount of preparation can guarantee success, we've found that a design review conducted by members of the MongoDB Developer Relations team can go a long way toward ensuring that best practices have been followed and that optimizations are in place to help the team deliver confidently. Design reviews are hour-long sessions where we partner with our customers to help them fine-tune their data models for specific projects or use cases. They give our customers a jump start in the early stages of application design, when the development team is new to MongoDB and trying to understand how best to model their data to achieve their goals. A design review is a valuable enablement session that uses the development team’s own workload as a case study to illustrate performant and efficient MongoDB design. We also help customers explore the art of the possible and put them on the right path toward achieving their desired outcomes. Participants leave these sessions with the knowledge and confidence to evolve their designs independently.
The principle underlying these reviews is domain-driven design, an indispensable concept in software engineering. Design isn't merely a box to tick; it's a daily routine for developers. Design reviews are more than academic exercises; they have tangible goals. A primary aim is to enable and educate developers on a global scale, helping them transition away from legacy systems like Oracle. It's about supporting developers, helping them overcome obstacles, and providing critical education and training. Mastery of the tools is essential, and our sessions dig deep into access patterns and schema optimizations for performance.
At its core, a design review is a catalyst for transformation. It's a collaborative endeavor, merging expertise and fostering an environment where innovation thrives. It's not just about reviewing: when our guidance and expertise are combined with developer innovation and talent, the journey from envisioning a robust data model to implementing it becomes a shared success. During the session, our experts look at the workload's data-related functional requirements — like data entities and, in particular, reads and writes — along with non-functional requirements like growth rates, performance, and scalability. With these insights in hand, we can recommend target document schemas that help developers achieve the goals they established before committing their first lines of code. A properly designed document schema is fundamental to performant and cost-efficient operations; getting the schema wrong is often the number one reason projects fail. Design reviews help customers avoid the time and effort lost to a poor schema.
Design reviews in practice
Not long ago, we were approached by a customer in financial services who wanted us to conduct a design review for an application they were building in MongoDB Atlas. The application was designed to give regional account managers a comprehensive view of aggregated performance data. Specifically, it aimed to provide insights into individual stock performance within a customer's portfolio across a specified time frame within a designated region.
When we talked to them, the customer highlighted an issue with their aggregation pipeline: it was taking between 20 and 40 seconds to complete, while their SLA demanded a response time of under two seconds.
Most design reviews involve a couple of steps to assess and diagnose the problem. The first is assessing the workload. During this step, a few of the things we look at include the following (a short scripted sketch of this kind of assessment appears after the list):
- Number of collections
- The documents in the collections
- How many records the documents contain
- How frequently data is being written or updated in the collections
- What hours of the day see the most activity
- How much storage is being consumed
- Whether and how old data is being purged from collections
- The cluster size the customer is running in MongoDB
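Much of this assessment can be scripted against the cluster itself. Below is a minimal sketch using PyMongo; the connection string and database name are placeholders rather than details from an actual engagement, and collStats is the standard server command for per-collection storage statistics.

```python
# Minimal workload-assessment sketch (PyMongo). The connection string and
# database name are illustrative placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://user:pass@cluster0.example.mongodb.net")
db = client["trading"]

for name in db.list_collection_names():
    stats = db.command("collStats", name)  # per-collection storage statistics
    print(
        f"{name}: {stats['count']} documents, "
        f"{stats['storageSize'] / 1024 ** 2:.1f} MB on disk, "
        f"avg document size {stats.get('avgObjSize', 0)} bytes"
    )
```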
Once we had performed this assessment for our finserv customer, we had a better understanding of the nature and scale of the workload. The next step was examining the structure of the aggregation pipeline. We found that the pipeline included a few unnecessary steps, such as breaking the data apart and then reassembling it through various $unwind and $group stages. The MongoDB DevRel experts suggested operating on the arrays directly, reducing the pipeline to just two essential steps: first, finding the right data, and then looking up the necessary information. Eliminating the $group stage reduced the response time to 19 seconds — a significant improvement but still short of the target.
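The customer's actual pipeline isn't reproduced in this post, but the following sketch (continuing the PyMongo connection above, with hypothetical collection and field names like stockActivity, trades, and symbol) illustrates the shape of that simplification:

```python
# Hypothetical before/after illustration of the simplification described above.
from datetime import datetime, timezone

start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 2, 1, tzinfo=timezone.utc)

# Before: $unwind breaks each document apart and $group reassembles it,
# touching every array element just to reshape the data.
before = [
    {"$match": {"region": "EMEA", "ts": {"$gte": start, "$lt": end}}},
    {"$unwind": "$trades"},
    {"$group": {"_id": "$symbol", "trades": {"$push": "$trades"}}},
    {"$lookup": {"from": "stockInfo", "localField": "_id",
                 "foreignField": "symbol", "as": "info"}},
]

# After: keep the arrays intact and reduce the pipeline to two essential
# stages: find the right data, then look up the related information.
after = [
    {"$match": {"region": "EMEA", "ts": {"$gte": start, "$lt": end}}},
    {"$lookup": {"from": "stockInfo", "localField": "symbol",
                 "foreignField": "symbol", "as": "info"}},
]
results = list(db.stockActivity.aggregate(after))
```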
In the next step of the design review, the MongoDB DevRel team set out to determine which schema design patterns could be applied to optimize pipeline performance. In this particular case, a high volume of stock activity documents was being written to the database every minute, but users were querying only a limited number of times per day. With this in mind, our DevRel team decided to apply the computed design pattern.
The computed pattern is ideal when data needs to be computed repeatedly in an application. By pre-calculating and saving commonly requested values, it avoids repeating the same calculation each time the data is requested. For our finserv customer, we were able to pre-calculate the trading volume and the opening, closing, high, and low prices for each stock. These values were then stored in a new collection that the $lookup stage could access. This brought the response time down to 1,800 ms — below our two-second target SLA, but our DevRel team wasn't finished. They performed additional optimizations, including using the extended reference pattern to embed region data in the pre-computed stock activity so that all the related data could be retrieved with a single query, avoiding a $lookup-based join altogether. After the team finished their optimizations, the final test execution of the pipeline came in at 377 ms — a 60x improvement in the performance of the aggregation pipeline and more than four times faster than the application's target response time.
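To make both patterns concrete, here is a sketch (again with hypothetical names such as stockDailySummary, price, and quantity, continuing the PyMongo examples above) that rolls each day's activity up into a summary collection and carries the region fields along in each summary document:

```python
# Computed pattern: pre-calculate each stock's daily volume and
# open/high/low/close prices once, instead of on every read.
from datetime import datetime, timezone

day_start = datetime(2024, 1, 2, tzinfo=timezone.utc)
day_end = datetime(2024, 1, 3, tzinfo=timezone.utc)

db.stockActivity.aggregate([
    {"$match": {"ts": {"$gte": day_start, "$lt": day_end}}},
    {"$sort": {"ts": 1}},  # so $first/$last yield the opening/closing prices
    {"$group": {
        "_id": "$symbol",
        "volume": {"$sum": "$quantity"},
        "open": {"$first": "$price"},
        "close": {"$last": "$price"},
        "high": {"$max": "$price"},
        "low": {"$min": "$price"},
        # Extended reference pattern: embed the region data in the summary
        # so reads need neither a $lookup nor a second query.
        "region": {"$first": "$region"},
    }},
    # Persist the computed values into the summary collection.
    {"$merge": {"into": "stockDailySummary", "whenMatched": "replace"}},
])

# Account-manager dashboards now read the small pre-computed collection.
summaries = db.stockDailySummary.find({"region": "EMEA"})
```

Because the summaries are written once per period but read many times, this trades a little extra work at write time for much cheaper reads, which is exactly where the write-heavy, read-light profile described above pays off.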
Read the complete story, including a step-by-step breakdown with code examples of how we helped one of our financial services customers achieve a 60x performance improvement.
If you'd like to learn more about MongoDB data modeling and aggregation pipelines, we recommend the following resources:
- Daniel Coupal and Ken Alger’s excellent series of blog posts on MongoDB schema patterns
- Daniel Coupal and Lauren Schaefer’s equally excellent series of blog posts on MongoDB anti-patterns
- Paul Done’s ebook, Practical MongoDB Aggregations
- MongoDB University Course, "M320 - MongoDB Data Modeling"
If you're interested in a Design Review, please contact your account representative.