Building with Patterns: A Summary

Daniel Coupal and Ken W. Alger
April 26, 2019 | Updated: April 24, 2025

This post is also available in: Deutsch, Français, Español, Português, 日本人.

As we wrap up the Building with Patterns series, it’s a good opportunity to recap the problems the patterns that have been covered solve and highlight some of the benefits and trade-offs each pattern has. The most frequent question that is asked about schema design patterns, is “I’m designing an application to do X, how do I model the data?” As we hope you have discovered over the course of this blog series, there are a lot of things to take into consideration to answer that. However, we’ve included a Sample Use Case chart that we’ve found helpful to at least provide some initial guidance on data modeling patterns for generic use cases.

Sample Use Cases

The chart below is a guideline for what we’ve found after years of experience working with our customers of what schema design patterns are used in a variety of applications. This is not a “set in stone” set of rules about which design pattern can be used for a particular type of application. Ensure you look at the ones that are frequently used in your use case. However, don't discard the other ones, they may still apply. How you design your application’s data schema is very dependent on your data access patterns.

Use Cases vs Patterns Matrix

Design Pattern Summaries

Approximation

The Approximation Pattern is useful when expensive calculations are frequently done and when the precision of those calculations is not the highest priority.

Pros

Fewer writes to the database.
Maintain statistically valid numbers.

Cons

Exact numbers aren’t being represented.
Implementation must be done in the application.

Attribute

The Attribute Pattern is useful for problems that are based around having big documents with many similar fields but there is a subset of fields that share common characteristics and we want to sort or query on that subset of fields. When the fields we need to sort on are only found in a small subset of documents. Or when both of those conditions are met within the documents.

Pros

Fewer indexes are needed.
Queries become simpler to write and are generally faster.

Computed

When there are very read intensive data access patterns and that data needs to be repeatedly computed by the application, the Computed Pattern is a great option to explore.

Pros

Reduction in CPU workload for frequent computations.
Queries become simpler to write and are generally faster.

Cons

It may be difficult to identify the need for this pattern.
Applying or overusing the pattern should be avoided unless needed.

Document Versioning

When you are faced with the need to maintain previous versions of documents in MongoDB, the Document Versioning pattern is a possible solution.

Pros

Easy to implement, even on existing systems.
No performance impact on queries on the latest revision.

Cons

Doubles the number of writes.
Queries need to target the correct collection.

Extended Reference

You will find the Extended Reference pattern most useful when your application is experiencing lots of JOIN operations to bring together frequently accessed data.

Pros

Improves performance when there are a lot of JOIN operations.
Faster reads and a reduction in the overall number of JOINs.

Cons

Data duplication.

Outlier

Do you find that there are a few queries or documents that don’t fit into the rest of your typical data patterns? Are these exceptions driving your application solution? If so, the Outlier Pattern is a wonderful solution to this situation.

Pros

Prevents a few documents or queries from determining an application’s solution.
Queries are tailored for “typical” use cases, but outliers are still addressed.

Cons

Often tailored for specific queries, therefore ad hoc queries may not perform well.
Much of this pattern is done with application code.

Pre-allocation

When you know your document structure and your application simply needs to fill it with data, the Pre-Allocation Pattern is the right choice.

Pros

Design simplification when the document structure is known in advance.

Cons

Simplicity versus performance.

Polymorphic

The Polymorphic Pattern is the solution when there are a variety of documents that have more similarities than differences and the documents need to be kept in a single collection.

Pros

Easy to implement.
Queries can run across a single collection.

Schema Versioning

Just about every application can benefit from the Schema Versioning Pattern as changes to the data schema frequently occur in an application’s lifetime. This pattern allows for previous and current versions of documents to exist side by side in a collection.

Pros

No downtime needed.
Control of schema migration.
Reduced future technical debt.

Cons

Might need two indexes for the same field during migration.

Subset

The Subset Pattern solves the problem of having the working set exceed the capacity of RAM due to large documents that have much of the data in the document not being used by the application.

Pros

Reduction in the overall size of the working set.
Shorter disk access time for the most frequently used data.

Cons

We must manage the subset.
Pulling in additional data requires additional trips to the database.

Tree

When data is of a hierarchical structure and is frequently queried, the Tree Pattern is the design pattern to implement.

Pros

Increased performance by avoiding multiple JOIN operations.

Cons

Updates to the graph need to be managed in the application.

Conclusion

As we hope you have seen in this series, the MongoDB document model provides a lot of flexibility in how you model data. That flexibility is incredibly powerful but that power needs to be harnessed in terms of your application’s data access patterns. Remember that schema design in MongoDB has a tremendous impact on the performance of your application. We’ve found that performance issues can frequently be traced to poor schema design.

Keep in mind that to further enhance the power of the document model, these schema design patterns can be used together, when and if it makes sense. For example, Schema Versioning can be used in conjunction with any of the other patterns as your application evolves. With the twelve schema design patterns that have been covered, you have the tools and knowledge needed to harness the power of the document model’s flexibility.

← Previous

MongoDB’s Official Brew Tap Now Open And Flowing

We know macOS users love using Homebrew, aka Brew, "the missing package manager for macOS". Its made life so much simpler installing both open source and freely available applications - it lets anyone create a tap to make their software available. That is why we are very happy to announce that we now have our own official MongoDB Tap which makes it simpler than ever to install the latest MongoDB.

April 25, 2019

Next →

That’s a Wrap: MongoDB’s 2025 in Review & 2026 Predictions

It’s nearly the end of the year—again! That means it’s time for an end-of-year blog post that expresses disbelief at the passage of time. Which, as the saying goes, flies when you’re having fun. And definitely when you’re as busy as MongoDB was in 2025. It was a big year for the company—and more importantly, for the tens of thousands of customers and millions of developers who rely on MongoDB’s modern data platform for their most mission-critical workloads. At MongoDB, everything we do starts with our obsession with customers and their needs, and if there’s a theme to MongoDB’s 2025, it was (and will continue to be) enabling customer innovation and helping them succeed in the AI era. So here are a few highlights of how MongoDB acted on behalf of customers in 2025. From the acquisition of Voyage AI to customer success across industries, a lot happened in 2025. Let’s go!* *Read to the end for 2026 thoughts. 2025: The (MongoDB) year that was Voyage AI, modernization, and search In February, MongoDB announced the acquisition of Voyage AI, a pioneer in embedding and reranking models, to enhance the accuracy of AI applications. Integrating Voyage AI's advanced retrieval technology with MongoDB’s modern, AI-ready data platform addresses a critical challenge: LLM model hallucinations caused by a lack of context. By improving retrieval accuracy for specialized domains like finance and law, the integration enables businesses to deploy AI for mission-critical use cases. To learn more, see the MongoDB Voyage AI page. Then, in September, we launched MongoDB AMP, an AI-powered Application Modernization Platform. AMP is designed to accelerate the transformation of legacy applications through a combination of AI-powered tooling, a proven delivery framework, and expert guidance (tools, techniques, and talent) to help enterprises reduce technical debt and modernize 2-3 times faster. Want more? Sure you do! Check out this short video. MongoDB also announced the addition of search and vector search capabilities to MongoDB Community Edition and MongoDB Enterprise Server. This allows developers to build and test AI-native applications, including those using retrieval-augmented generation (RAG), in local or on-premises environments. Previously exclusive to MongoDB Atlas, these features enable secure, hybrid deployments where sensitive data can remain on-premises while still leveraging advanced search tools. Here’s a (slightly less short) video about search and vector search on Enterprise Server. Growing and scaling with MongoDB As noted, everything we do at MongoDB starts with our obsession with customers. 2025 was another banner year for customer success and innovation—we were inspired by what organizations of every shape and size, across industries and geographies, built with MongoDB in 2025. Here are just two of the many stories our customers shared in 2025; much more can be found in my colleague Katie Palmer’s blog series, Innovating with MongoDB. Factory By combining the Atlas modern data platform with Voyage AI’s high-performance embeddings, the AI-native startup Factory—which uses AI agents called Droids to accelerate software development lifecycles for organizations—consolidated its fragmented tech stack. This enabled superior code retrieval, simplified operations, and provided the scalability needed to process billions of tokens daily. McKesson McKesson, a global pharmaceutical distributor, replaced its monolithic legacy infrastructure with MongoDB Atlas to meet strict drug tracing mandates. By adopting our modern cloud data platform, McKesson scaled its operations 300x, managing tracking data for 1.2 billion containers annually without latency, and ensuring compliance and patient safety while reducing developer complexity. For more, check out the video of McKesson at MongoDB.local NYC from September. From niche NoSQL to enterprise powerhouse As senior MongoDB engineer and Technical Fellow Ashish Kumar put it earlier this year, “through a sustained and deliberate engineering effort,” MongoDB has gone from a (seemingly) niche NoSQL solution to a trusted enterprise standard, and now delivers “the high availability, tunable consistency, ACID transactions, and robust security that enterprises demand.” A new era of leadership The face of MongoDB has also changed—our CFO, Mike Berry, joined the company in April, and Dev Ittycheria stepped down as CEO in November, after more than 11 years leading the company (including its 2017 IPO). In a LinkedIn post about his role, new MongoDB CEO CJ Desai noted that the company is “at the forefront of a new data revolution, unlocking the next wave of productivity and intelligence.” “Having spent my career building and scaling technology platforms, I’ve always been drawn to companies defined by clarity of vision, relentless organic innovation, and a customer-first culture. MongoDB exemplifies all three,” said Desai. We couldn’t agree more. Onward! Reading the 2026 tea leaves So what might 2026 bring (for MongoDB and tech at large)? Here are a handful of our leaders’ predictions: “As much as people want to talk about Artificial General Intelligence (AGI), we’re still in the phase where most AI use cases automate redundant tasks but benefit from human-in-the-loop checks. Organizations that use AI to complete work that historically is a drain on human resources—but then uses people to carefully verify what AI builds, apply governance frameworks, and maintain accountability across the data lifecycle—will be more successful.” —Pete Johnson, Field CTO, AI, MongoDB “After years of inflated expectations and unsustainable spending, the AI industry is trapped in a bubble where companies reflexively attempt to deploy LLMs at every problem, driving up costs with minimal to no return. Businesses that break free from this spending cycle are the ones that understand the need to ground LLM responses in factual data and learn from prior mistakes. We believe the best way to do this will be with highly accurate embedding models and rerankers for reliable data retrieval.” —Frank Liu, Staff Product Manager, MongoDB "In 2026, cloud independence will evolve from strategic preference to existential imperative across enterprises of every scale. The outages and disruptions of recent years have exposed a fundamental truth: in an always-on digital economy—where commerce, mobility, governance, and even public safety depend on uninterrupted access to cloud services—single-provider reliance is no longer a calculated risk, but a systemic vulnerability. Compounding this is the inexorable rise of data sovereignty. Regulatory regimes worldwide now demand precise jurisdictional control over data residency, rendering rigid cloud commitments incompatible with compliance at global scale. The defining competitive advantage will belong to organizations that transcend fragile prevention theater and engineer true infrastructural resilience: architectures inherently portable, data frictionlessly mobile, and operations autonomously sustained across heterogeneous clouds through AI-orchestrated redundancy. In short, the winners will not merely mitigate downtime—they will design systems that render the concept obsolete." —Ben Cefalo, SVP, Head of Core Products, MongoDB Happy holidays and happy New Year, everyone!

December 22, 2025