Paging with the Bucket Pattern - Part 1

Justin LaBreck
August 1, 2019 | Updated: August 5, 2019

Have you ever noticed that moving through pages slows down as you view higher and higher numbered pages? Application designers work with pages frequently yet this problem persists. So what's the solution? Use a flexible, easy-to-use data model. MongoDB provides powerful ways of modeling data that make paging fast and efficient. Today, we’re going to explore paging through large amounts of data quickly and easily.

First, we need to understand the problem. Any time a full data set doesn't fit on a screen, paging is necessary. Most developers limit a display to 20, 50, or 100 items before requiring a "next" button. The most common implementation of paging uses sort, skip, and limit at the database level. But "skip and limit" has a problem.

Why do page loads slow down as the page number grows? It's a common problem caused by the way skip and limit works. Imagine a web page of a customer's stock trades that displays 1000 of the most recent trades per page. Querying a history collection to generate the trades list works like this:

db.history.find({ customerId: 7000000 })
            .sort({ date: 1 })
            .skip(0)
            .limit(1000)

The database quickly uses the index { customerId: 1, date: 1 } to find 1000 documents and returns 1000 documents. Everything is sorted by date. Simple enough.

The next page works similarly, except instead of skipping 0, we skip 1000. The database happily finds 2000 documents and returns 1000. Wait ... the database finds 2000? Yes, it finds 2000 documents to return 1000. That's how skip and limit works. Imagine viewing page 5000. That's skip 4,999,000 and limit 1000. The database must find 5,000,000 documents to return 1000 documents. No wonder it takes so long! There is a better way.
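To make the cost concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical (they are not part of any MongoDB API), and the sketch assumes pages are numbered starting at 1.

```javascript
// Hypothetical helpers illustrating skip-and-limit cost; not a MongoDB API.
// Assumes pages are numbered starting at 1.
function skipForPage(page, pageSize) {
  return (page - 1) * pageSize;
}

// Documents the server walks through: everything skipped plus the page itself.
function documentsExamined(page, pageSize) {
  return skipForPage(page, pageSize) + pageSize;
}

console.log(documentsExamined(1, 1000));   // 1000
console.log(documentsExamined(2, 1000));   // 2000
console.log(documentsExamined(100, 1000)); // 100000
```

The work grows linearly with the page number even though the number of documents returned never changes.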

Skipping documents is time consuming, so the fix is to avoid skipping altogether. Remember that first page we loaded? We retrieved 1000 results and prepared them for display. We had to iterate through 1000 documents, and each one has a date. Conveniently, we also sorted by date. By holding on to the last date we displayed (via session variable or query string, for example), we can modify the query so that no skipping is required.

db.history.find({
    customerId: 7000000,
    date: { $gt: ISODate("2018-07-21T12:01:35") }
})
    .sort({ date: 1 })
    .limit(1000)

This second query involves no skipping and efficiently uses our index. That's an improvement! But we still have a problem.
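As a rough sketch of how an application might carry the last date forward, the helper below builds the filter, sort, and limit for the next page. The name nextPageQuery is hypothetical, not a driver function. The sketch also assumes that no two trades for the same customer share a timestamp; if they can, a tie-breaker field would be needed to avoid dropping trades at page boundaries.

```javascript
// Hypothetical helper: builds the arguments for the "no skip" query.
// lastDate is the date of the last trade shown on the previous page,
// or null when loading the first page.
function nextPageQuery(customerId, lastDate, pageSize) {
  const filter = { customerId: customerId };
  if (lastDate !== null) {
    // Resume strictly after the last trade already displayed.
    filter.date = { $gt: lastDate };
  }
  return { filter: filter, sort: { date: 1 }, limit: pageSize };
}

const q = nextPageQuery(7000000, new Date("2018-07-21T12:01:35Z"), 1000);
// In the shell, this would be used roughly as:
//   db.history.find(q.filter).sort(q.sort).limit(q.limit)
console.log(JSON.stringify(q));
```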

Viewing page 5000 is significantly faster using this method, but we have no way to jump directly from page 1 to page 5000. Why? This method modifies the query itself to find results quickly. It requires tracking the last result from the previous page to modify the query. Displaying the documents on page 5000 requires loading the last document from page 4999, which requires loading the last document from page 4998, which requires loading the last document from page 4997, and so on.

The whole point of using another method is to load a page without first loading everything before it. This solution requires keeping track of the last document viewed to find the next set of documents. It only works if we don't provide the user with an option for jumping to specific pages.

There is a better way. We can use the bucket pattern.

First, a quick recap of the bucket pattern. Bucketing is most useful whenever a list of similar things all relate to a central entity. Capturing data points over time falls into this category. And importantly, most data sets that require pagination can use this pattern.

Our previous example worked with a collection that looks like this:

{
    "ticker" : "MDB", 
    "customerId": 7000000,
    "type" : "buy", 
    "qty" : 419, 
    "date" : ISODate("2018-10-26T15:47:03.434Z") 
},
{ 
    "ticker" : "MDB",
    "customerId": 7000000, 
    "type" : "sell", 
    "qty" : 29, 
    "date" : ISODate("2018-10-30T09:32:57.765Z") 
 }

And here is that same data set using the bucket pattern:

{
  "_id": "7000000_1540568823",
  "customerId": 7000000,
  "count": 2,
  "history": [
    {
      "type": "buy",
      "ticker": "MDB",
      "qty": 419,
      "date": ISODate("2018-10-26T15:47:03.434Z")
    },
    {
      "type": "sell",
      "ticker": "MDB",
      "qty": 29,
      "date": ISODate("2018-10-30T09:32:57.765Z")
    }
  ]
}

Using the bucket pattern, two trade documents condense into a single document using an array of trades. The array contains two objects because the original design had two documents. Fields that duplicate across the original two documents condense into the root of our single document (i.e. customerId). Other, unique fields appear as part of the history array.

There are lots of benefits to this schema design pattern but let's focus on the benefits to paging. Also note the addition of a count field. It represents how many trades appear in the history array. The count field becomes important later.

How does all this relate to paging? The most efficient way to gather information for display is to store information as it's needed for display. MongoDB excels at this by allowing you to store data exactly as needed in rich and complex ways. For paging, data is needed in buckets of 20, 50, 100, etc. and the bucket pattern allows us to represent each page as a single document.

Let's look at this same concept another way. When rendering a page using the old "find with skip and limit" method, each page load iterates through multiple documents. Displaying 20 trades per page requires iterating 20 times over a cursor, retrieving 20 documents from the server. With the bucket approach to paging, each page load requires only a single document to generate the entire page!

Now let's look more closely at how the information is stored for display.

Notice the value stored in _id. In our example, _id is a compound value. It's a string that concatenates the customerId and the first trade time in seconds (since the epoch). There is a reason for this.

Our web page is designed to display the stock trade history made by a single customer. Creating a compound _id starting with customerId effectively "groups" all of a customer's buckets together under a common prefix. Using a regular expression anchored to the start of the string, we can quickly find our first complete set of results:

db.history.find({ "_id": /^7000000_/ })
	.sort({ _id: 1 })
	.limit(1)

A single document will be returned. The document contains a history array with multiple stock trades ready to display!
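The compound _id described above is simple to construct. The helper below is a hypothetical sketch (bucketId is an illustrative name); the format is inferred from the sample bucket document shown earlier.

```javascript
// Hypothetical helper: builds the compound _id used in the example bucket.
// Format inferred from the sample: "<customerId>_<epoch seconds of first trade>".
function bucketId(customerId, firstTradeDate) {
  const epochSeconds = Math.floor(firstTradeDate.getTime() / 1000);
  return customerId + "_" + epochSeconds;
}

console.log(bucketId(7000000, new Date("2018-10-26T15:47:03.434Z")));
// prints "7000000_1540568823"
```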

Now imagine there are more than two trades. Let's look at an example with 1000 trades. How does this pattern work?

Going back to the idea that data should be stored as it is needed for display, each bucket should have enough trades to render a complete page. If the display is designed to show 20 trades per page, then store fifty bucket documents, each containing 20 trades (1000 trades / 20 per page = 50 documents).
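Here is a minimal in-memory sketch of that split. It is only an illustration of the shape of the data, not how buckets would be created in production. The function name bucketTrades and the synthetic trades are assumptions made for the example.

```javascript
// Hypothetical sketch: group one customer's trades into page-sized buckets.
// Assumes `trades` are flat documents shaped like the earlier example
// and that all of them belong to the same customer.
function bucketTrades(customerId, trades, pageSize) {
  // Sort oldest first so each bucket holds a contiguous date range.
  const sorted = trades.slice().sort((a, b) => a.date - b.date);
  const buckets = [];
  for (let i = 0; i < sorted.length; i += pageSize) {
    const page = sorted.slice(i, i + pageSize);
    const firstSeconds = Math.floor(page[0].date.getTime() / 1000);
    buckets.push({
      _id: customerId + "_" + firstSeconds,
      customerId: customerId,
      count: page.length,
      // customerId lives at the bucket root, so drop it from each entry.
      history: page.map(({ customerId: _omit, ...rest }) => rest),
    });
  }
  return buckets;
}

// 1000 synthetic trades, one minute apart, purely for illustration.
const start = Date.parse("2018-10-26T15:47:03Z");
const trades = [];
for (let i = 0; i < 1000; i++) {
  trades.push({
    ticker: "MDB", customerId: 7000000, type: "buy", qty: 1,
    date: new Date(start + i * 60000),
  });
}

const buckets = bucketTrades(7000000, trades, 20);
console.log(buckets.length);   // 50
console.log(buckets[0].count); // 20
```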

To display page one, fetch the first bucket from the server. To display page two, skip the first bucket using .skip(1) and fetch the second bucket from the server. To display page three, skip two buckets using .skip(2) and fetch the third. To jump to page forty, skip 39 buckets and fetch the fortieth. It's that easy!
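Put together, a page lookup might look something like the sketch below. The helper name bucketPageQuery is hypothetical. The sketch assumes pages are numbered from 1 and that every bucket holds exactly one page of trades.

```javascript
// Hypothetical helper: describes the query that fetches one page (one bucket).
// Assumes pages are numbered from 1 and each bucket holds exactly one page.
function bucketPageQuery(customerId, page) {
  return {
    // Prefix match on the compound _id, as in the regular expression above.
    filter: { _id: new RegExp("^" + customerId + "_") },
    sort: { _id: 1 },
    skip: page - 1, // skips whole buckets, not individual trades
    limit: 1,
  };
}

const p40 = bucketPageQuery(7000000, 40);
// In the shell, roughly:
//   db.history.find(p40.filter).sort(p40.sort).skip(p40.skip).limit(p40.limit)
console.log(p40.skip); // 39
```

Jumping to page forty now skips 39 bucket documents rather than the 780 individual trades that skip and limit would have to pass over at 20 trades per page.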

Done, right? Not exactly.

In the next part, we'll examine how to create and optimize this pattern for even more efficient and powerful pagination.
