Paging with the Bucket Pattern - Part 2

Justin LaBreck
August 5, 2019

In the previous part of Paging with the Bucket Pattern, we reviewed the powerful bucket pattern and ways of leveraging it for easy and efficient paging. This part will dive deeper into paging and target easy ways of creating buckets.

We left off discussing ways to quickly generate a list of stock trades using a single document with multiple trades in an array. A page displaying 20 trades per page can store 1000 trades in 50 documents, simply fetching the document that corresponds to page being displayed. Displaying the 21st trade involves fetching the 2nd document.

The diligent reader may notice skip appeared again. We already discussed the drawbacks of using skip. But recall that the slowness associated with skipping documents only manifests when the server must find a large number of documents to return the requested limit. Using the bucket pattern reduces the number of documents that exist for a single customer significantly. Skipping significantly fewer documents containing multiple trades will not experience the same performance issues as skipping a large number of single trade documents.

But what if the user wants to see 100 trades per page? Storing only 20 objects per bucket is fairly limiting. The solution is simple. Store ten buckets of 100 objects instead. Nothing says each bucket can't contain more objects than necessary! In fact, store all 1000 documents in a single bucket. That way, the exact same bucket can be used forty times for displays of 25 trades, twenty times for displays of 50 trades, ten times for displays of 100 trades, etc.

The next logical question is, "why not simply store everything in a single document?" There are several reasons. First, having a single massive document takes up a lot of memory. That's memory that must be allocated on the server to fetch the document and on the client when the document is received. Second, large documents are slower going across the network. Third, MongoDB imposes a (quite intentional) 16MB document size cap. Pick a logical document size and stick with it. Use buckets of 100, 500, or 1000. Larger buckets like 10,000 may make sense for some use cases with small objects, but anything larger may be excessive.

Great! We've found a reasonable bucket size of 1000 trades and can display a history from any page quickly and efficiently.

But how is data inserted into these buckets? It's a lot less work than you may think. Exploring an example will make this easy.

Let's say a new stock purchase occurred and we want to record a history of that trade. Here is the insert statement:

db.history.updateOne({ "_id": /^7000000_/, "count": { $lt: 1000 } },
    { 
    	"$push": { 
            "history": {
            	"type": "buy",
            	"ticker": "MDB",
            	"qty": 25,
            	"date": ISODate("2018-11-02T11:43:10")
    		} },
    	"$inc": { "count": 1 },
    	"$setOnInsert": { "_id": "7000000_1541184190" }
    },
    { upsert: true })

There's a lot to unpack in this statement, so let's work through it one piece at a time.

> db.history.updateOne({ "_id": /^7000000_/, "count": { $lt: 1000 } },

The history collection contains each of our buckets. We're using updateOne on the history collection. It's important to stress that we are using an update statement, not an insert statement, and we're only updating one document. This is because one or more buckets may already exist for customerId 7000000. We want to insert a new trade into a single existing bucket. We'll discuss what happens if the customer does not have an existing bucket in just a moment.

The count field insures that a single bucket never exceeds 1000 trades. The first bucket found that contains 999 objects or fewer will be updated. Buckets that contain 1000 or more trades will be ignored because they do not satisfy the query criteria.

"$push": { 
"history": {
        "type": "buy",
        "ticker": "MDB",
        "qty": 25,
        "date": ISODate("2018-11-02T11:43:10")
} }

The $push operator adds an element to the end of an array. Trade history is represented in the history field so appending a new trade to this field is the correct action to take.

"$inc": { "count": 1 }

The count field must be maintained atomically with an accurate trade count. By using the $inc operator, the bucket count field increases by one for every trade that gets inserted.

    "$setOnInsert": { "_id": "7000000_1541184190" } },
{ upsert: true })

Finally, the upsert modifier and $setOnInsert operator play in concert to cover "does not exist" scenarios.

The query criteria will prevent a document from updating under two conditions: no bucket exists for the customerId (the first part of _id); and no existing buckets contain less than 1000 trades. Using upsert creates a new document when the query criteria fails to update an existing document. In other words, the updateOne statement will create a new bucket if no bucket exists to be updated.

The $setOnInsert operator sets _id when a new bucket gets created. Recall that _id is a compound/concatenated string containing the customerId and time of the first trade in the bucket. The application constructs this field as part of the updateOne operation. First, customerId is already known since it's part of the search criteria. Second, the first trade time of a new bucket is always the trade being inserted. Therefore, constructing _id given two knowns is simple. Take the customerId and date from the current trade and concatenate them to construct the $setOnInsert value. The operator (and string) is ignored if no new bucket needs creating.

The final document looks exactly like the one presented above for display. Voila! Paging using a single document and single update statement.

There are limitations with this approach. For example, what happens if you need to remove a trade from one of the buckets? There will be a gap, which may cause the next updateOne statement to update the wrong document. MongoDB version 4.2 avoids this restriction by allowing expressive update instructions (see the recent Coming in MongoDB 4.2 article on the subject). This methodology also dictates sort order. New trades are always appended to the end of sort order (in this case time order). Sorting on a different field requires using aggregation with the $unwind operator. But even with these limitations, a large number of use cases can be solved with this efficient use of the bucket pattern.

We've explored two options in this pair of articles. Both options provide pros and cons, like all engineering challenges. We discussed modifying the query portion of a find statement to "skip" documents, and we discussed using the bucket pattern to pre-page results. Individual use cases will determine which method works best.

Remember, it's OK to limit the user interface for a better user experience. Pick a method based on the user's needs and design appropriately. Your options are many and with either approach, performance will be excellent and time to release quick-and-easy!

← Previous

Paging with the Bucket Pattern - Part 1

How to get faster pagination for your applications by using the Bucket Pattern.

August 1, 2019

Next →

MongoDB Announces Leadership Transition

Dev Ittycheria, President and Chief Executive Officer, shared the following message with MongoDB employees this morning. This is the hardest email I have ever had to write to all of you. If you have not seen the announcement, I have decided to retire as CEO. Effective November 10, 2025, Chirantan “CJ” Desai will become the new CEO of MongoDB. This was not an easy decision for me. The process to get to this point has been deeply emotional, as I care profoundly about MongoDB and the people who have made the company what it is today. This news may come as a surprise, and for some, perhaps even a shock. That’s natural. Leadership transitions can evoke a range of reactions. I want to share why this is happening, and why it’s the right thing for MongoDB. Every personnel change, including the most senior leadership changes, involves two key decisions: first, recognizing that it is the right time for change, and second, selecting the best person to replace the person leaving. This email is intended to explain both decisions. Earlier this year, as part of our regular succession planning process, the Board and I discussed my long-term commitment. They asked if I would continue as CEO for another five years. After many conversations with my family and the Board, I realized I could not make that commitment. Some CEOs see their title as their identity. I do not. My core responsibility is to serve in the company's best interests. The company is primed for a new leader. One with a fresh perspective, grounded in experience and skills needed to guide MongoDB through its next evolution as a company, what we call MongoDB 3.0. Consequently, I informed the Board that I would commit to two more years to help find a successor. That began the search process for a suitable successor. To our surprise and delight, what we thought would easily take 12 to 24 months happened much faster than anyone expected. After engaging with multiple qualified candidates, we found the right successor in CJ. CJ is uniquely qualified for this role. CJ brings the rare growth-at-scale experience that will help continue to build MongoDB into an iconic technology company. At ServiceNow, he was the only executive to work directly with three of its highly regarded public company CEOs and played a pivotal role in organically scaling the company from just over $1 billion to more than $10 billion in revenue. Only a handful of independent software companies have ever reached that milestone. CJ helped transform ServiceNow from a product company to a platform company, scaled engineering, drove go-to-market excellence, and engaged deeply with investors. More recently, as President of Product and Engineering at Cloudflare, he helped fuel strong growth and stock performance. CJ also possesses the personal qualities needed to succeed as CEO. He is humble, eager to learn, and wants to draw on the perspectives of the people at MongoDB and other stakeholders to inform his thinking. This blend of experience, judgment, and character gives me full confidence that he is well-equipped to lead MongoDB through its next phase of growth. I often think of MongoDB’s journey as a long and extraordinary expedition. For the past eleven years, I have had the privilege of serving as its guide, helping chart the course, rally the team, and climb together through both calm and challenging terrain. Along the way, we have reached remarkable summits and proven what is possible through relentless innovation, persistence, and teamwork. Now it is time for a new guide to lead the next stage of the ascent and take MongoDB to even greater heights. CJ is the right leader to take MongoDB to the next summit. MongoDB is on a strong footing, with a clear strategy, an exceptional leadership team, a product platform that is more relevant than ever, and a business that is executing well. The rise of AI and the explosion of data-intensive applications play directly to MongoDB’s strengths. Our technology sits at the center of how modern applications are built and how organizations will harness data to power intelligent, adaptive systems. I am confident MongoDB is perfectly positioned to capture this next wave of innovation. As for me, I am not running away from MongoDB or leaving to join another company as CEO. I will remain on the Board and work closely with CJ to ensure a seamless transition. Over the years, this role has demanded an enormous amount of focus and energy; as a result, there are many things I’ve missed doing along the way. I’m looking forward to being more present for those moments — from simple time with my family to experiences and travel we’ve long put off. I plan to hold on to my MongoDB stock, as I firmly believe in the people and the opportunity, knowing that MongoDB’s best days are ahead of it. Yes, change can be unsettling. I’m sure you will have many questions about this change, such as why now, why CJ is the best person to lead the company, and what this means for you. We will hold an all-hands meeting tomorrow at 10:30AM ET to discuss this transition, introduce CJ and take your questions. That being said, I want to emphasize that the right change at the right time is how great companies get stronger. Just as a championship team refreshes its roster to stay competitive, MongoDB is bringing in new leadership, including other recent C-suite leaders who came before CJ, to drive our next phase of growth. This is not an ending; it’s the founding of a new moment. I am incredibly proud of what we have built together and genuinely excited about what lies ahead with CJ leading us forward. I also want to thank each of you for making this journey so meaningful. Words cannot fully capture my gratitude for your passion, creativity, and belief in building something truly special. I have often said that I want MongoDB to be an inflection point in people’s careers, a place where they can grow, take risks, and do the best work of their lives. I can say without hesitation that it has been exactly that for me. The skills I have developed, the experiences I have gained, and the relationships I have formed here have shaped me more than any other chapter in my professional life. I will carry them with me always, and will continue to cheer for and support MongoDB every step of the way. --Dev

November 3, 2025