On Selecting a Shard Key for MongoDB

William Zola

One of the killer features of MongoDB is its built-in sharding capability. This feature lets you distribute your data, and your database workload, over multiple commodity-scale machines.
While sharding is built into MongoDB, there are still a lot of things that you have to get right in order to have a successful installation. One of the trickiest is picking a good shard key.

Why is picking a good shard key so tricky and so important? A number of reasons:

  • If you pick the wrong shard key, you can totally trash the performance of your cluster
  • Sharding a collection is a one-way parachute jump; if you get it wrong, you’ll need to migrate the data to a new collection with the right shard strategy.
  • Picking the right shard key is more of an art than a science; there are 5 different considerations, and it may not be possible to satisfy them all

Nonetheless, there are some basic principles involved in picking a good shard key, and I'll go over them now.

Recommended Background

I'm going to assume that you know how sharding works in MongoDB, and have at least a basic understanding of what a shard key is. If not, you'll need to review the documentation, and ideally sit through a beginning and advanced presentation before reading on.

The Perfect Shard Key

If you think about it, the perfect shard key would have the following characteristics:

  • All inserts, updates, and deletes would each be distributed uniformly across all of the shards in the cluster
  • All queries would be uniformly distributed across all of the shards in the cluster
  • All operations would only target the shards of interest: an update or delete would never be sent to a shard which didn't own the data being modified
  • Similarly, a query would never be sent to a shard which holds none of the data being queried

If your shard key fails to do one of these things, then the following Bad Things could happen:

Poor Write Scalability

If your write load (inserts, updates, and deletes) isn't uniformly distributed across your shards, then you could end up with a hot shard. Ideally, if you have a 4-shard cluster, you want each shard to take 25% of the write load. That way, your cluster can handle four times the write load that a single replica set could handle. If your shard key ends up directing all of your write load to a single shard, then you end up not having scaled up your write capacity at all. If your shard key ends up directing 75% of your write load to a single shard, and only 25% to the remaining 3 shards, then you've severely limited the benefits of sharding.
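To put rough numbers on this, here is a minimal sketch of the arithmetic. It assumes that every shard can absorb the same write rate and that the cluster saturates as soon as its busiest shard does; the helper name writeScaleFactor is made up for illustration.

```javascript
// Hypothetical helper: how many times the write capacity of a single shard
// a cluster can sustain, given the fraction of writes each shard receives.
// Assumes identical shards and that the busiest shard is the bottleneck.
function writeScaleFactor(writeFractions) {
  const total = writeFractions.reduce((sum, f) => sum + f, 0);
  if (Math.abs(total - 1) > 1e-9) {
    throw new Error("write fractions must sum to 1");
  }
  // The cluster saturates when the hottest shard does.
  return 1 / Math.max(...writeFractions);
}

// Perfectly uniform 4-shard cluster: four times a single shard.
const uniform = writeScaleFactor([0.25, 0.25, 0.25, 0.25]);
// Everything lands on one shard: no gain at all.
const oneShard = writeScaleFactor([1, 0, 0, 0]);
// 75% on one shard, the remaining 25% spread over the other three.
const skewed = writeScaleFactor([0.75, 0.25 / 3, 0.25 / 3, 0.25 / 3]);

console.log(uniform, oneShard, skewed.toFixed(2));
```

Under these assumptions the 75/25 split buys only about a third more write capacity than a single shard, even though the cluster has four times the hardware.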

Poor Read Scalability

Similarly, if your read load from your find() operations isn't uniformly distributed across your shards, then you could end up with a hot shard for queries, and for the exact same reasons. Even though a hot shard for queries ends up having less of an impact than having a hot shard for writes, it's still less than optimal.

There's another, more subtle, way that a poor shard key can limit read scalability. Ideally, the mongos process would be able to target the query to only the shards which had data. If the mongos cannot target the query, it will have to run a scatter/gather query, in which the query is broadcast to all the shards, and they all report back which data they have. While a scatter/gather query is low-impact on the shards which have no data, it still has some impact. The more shards you have, the more important it is to avoid scatter/gather queries: the impact of scatter/gather queries on a cluster with 50 shards is going to be much higher than on a cluster with 2 shards.

Tradeoffs

Alas, there is no such thing as the perfect shard key. There are criteria and considerations, but you may not be able to pick a shard key that is optimal for all of the operations that you'll perform on your collection. As with most things in MongoDB, you'll have to tune your shard key to the expected use case for your application. Is your application read-heavy? Write-heavy? What are your most common queries? What are your most common writes? You'll always need to make tradeoffs. The critical factor, and the one that you can't do without, is to have a shard key that matches your workload.

Shard Key Considerations

With that said, there are five criteria for a good shard key. They are:

  • Cardinality
  • Write Distribution
  • Read Distribution
  • Read Targeting
  • Read Locality

These are discussed in the documentation, but here are my comments on each:

Cardinality

You need to pick a shard key which can be subdivided into small ranges. If you don't do this, MongoDB will be forced to put too many documents in a single chunk. When this happens, you will end up with "jumbo" chunks in your cluster: this will impact performance and manageability of your cluster.

Consider an application that stores logs from multiple machines in a distributed system. Suppose I choose to shard the logs collection by the machine's hostname. That choice means that all logs for a given machine go into the same chunk. Because my shard key is the machine's hostname, I have locked myself into having at most one chunk per machine. If a machine can be expected to generate more than a chunk's worth of logs (64 MB by default at the time of writing), MongoDB will not be able to split the chunk. A much better shard key would be a compound shard key, using the machine's hostname along with a timestamp with one-second granularity: MongoDB will always be able to split the chunk, and will always be able to split it at a reasonable split point.
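A small sketch can make the difference concrete. It counts the distinct shard key values in a handful of made-up log documents, on the premise that a chunk can only be split between two different key values. The hostnames, field names, and the helper distinctKeyValues are all invented for this illustration.

```javascript
// Illustrative log documents: 2 machines, one entry per second for 5 seconds.
// Hostnames and field names are made up for this sketch.
const logs = [];
for (const hostname of ["web-01", "web-02"]) {
  for (let second = 0; second < 5; second++) {
    logs.push({ hostname, ts: 1700000000 + second, msg: "disk check ok" });
  }
}

// Count distinct shard key values. A chunk can only be split between two
// different key values, so this bounds how finely the data can be divided.
function distinctKeyValues(docs, keyFields) {
  const seen = new Set(
    docs.map((doc) => JSON.stringify(keyFields.map((field) => doc[field])))
  );
  return seen.size;
}

const byHostname = distinctKeyValues(logs, ["hostname"]);
const byHostnameAndTime = distinctKeyValues(logs, ["hostname", "ts"]);
console.log(byHostname, byHostnameAndTime);
```

With the hostname alone, the number of possible chunks never grows past the number of machines, no matter how many log entries arrive. With the compound key, every additional second of logging adds a potential split point.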

Write Distribution

As discussed above, you want your write load to be uniformly distributed over the shards in the cluster. With range-based sharding, a monotonically increasing shard key (such as the date or an ObjectId) will guarantee that all inserts go into the chunk at the top of the key range, and therefore into a single shard, thus creating a hot shard and limiting your ability to scale write load. There are other ways that you can create a hot shard, but using a monotonically increasing shard key is by far the most common mistake I've seen. If your write load is primarily updates, rather than inserts, you'll want to make sure that those are uniformly distributed across the shards as well.
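The effect is easy to reproduce in a toy model. The sketch below routes 1000 inserts with ever-increasing timestamps in two ways: by range on the raw timestamp, and by range on a hash of it. The chunk boundaries are invented, and the hash is a simple stand-in (a Knuth-style multiplicative hash), not the function MongoDB actually uses for hashed shard keys.

```javascript
const SHARDS = 4;

// Range sharding on a timestamp: each shard owns one chunk, and the last
// chunk extends to "infinity" (MaxKey in MongoDB terms). The boundaries are
// illustrative values that were chosen from older data.
const lowerBounds = [-Infinity, 1000, 2000, 3000]; // chunk i lives on shard i

function shardForRangeKey(ts) {
  let shard = 0;
  for (let i = 0; i < lowerBounds.length; i++) {
    if (ts >= lowerBounds[i]) shard = i;
  }
  return shard;
}

// Stand-in hash (Knuth multiplicative), NOT MongoDB's real hash function.
// The top two bits of the 32-bit hash select one of the 4 shards, i.e. each
// shard owns one quarter of the hash space.
function shardForHashedKey(ts) {
  const hash = Math.imul(ts, 2654435761) >>> 0;
  return hash >>> 30;
}

function countInserts(route) {
  const counts = new Array(SHARDS).fill(0);
  // 1000 new documents with ever-increasing timestamps.
  for (let ts = 5000; ts < 6000; ts++) counts[route(ts)]++;
  return counts;
}

const rangeCounts = countInserts(shardForRangeKey);
const hashedCounts = countInserts(shardForHashedKey);
console.log("range :", rangeCounts);
console.log("hashed:", hashedCounts);
```

In the range-based case every new timestamp is larger than the last boundary, so the final shard takes all 1000 inserts. Hashing scatters consecutive values across the hash space, so each shard sees a share of the writes.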

Read Distribution

Similarly, you want your read load to be uniformly distributed over the shards in your cluster. How you do this depends on your specific application's anticipated read patterns. For example, consider a blogging application which is sharded by timestamp of article creation, and where your most common query is "Show me the last 20 articles created". This shard key will result in a hot shard for inserts as well as a hot shard for reads. A better shard key would be a compound key where the first field is the 2-digit month number (e.g. May is 05, June is 06), followed by a high-granularity field like author-id or hash. The coarse month prefix is used to search for articles created in the current month, while the high-granularity field provides the necessary cardinality to split and distribute chunks.

Read Targeting

As discussed above, the mongos query router can perform either a targeted query (query only one shard) or a scatter/gather query (query all of the shards). The only way for the 'mongos' to be able to target a single shard is to have the shard key present in the query. Therefore, you need to pick a shard key that will be available for use in the common queries while the application is running. If you pick a synthetic shard key, and your application can't use it during typical queries, all of your queries will become scatter/gather, thus limiting your ability to scale read load.
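The routing decision can be sketched as a toy model. This is not how mongos is implemented; it only shows the rule described above. The chunk map, shard names, and the helper shardsForQuery are invented, and only exact-match predicates on the shard key are treated as targetable.

```javascript
// Toy chunk map for a collection range-sharded on { userid: 1 }.
// Shard names and boundaries are invented for illustration.
const chunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: "\uffff", shard: "shard2" },
];

// Return the shards a router would have to contact for a query.
// Simplification: only exact-match predicates on the shard key are targeted.
function shardsForQuery(query) {
  if (typeof query.userid === "string") {
    const owner = chunks.find(
      (c) => query.userid >= c.min && query.userid < c.max
    );
    return [owner.shard];
  }
  // No shard key in the query: every shard must be asked (scatter/gather).
  return [...new Set(chunks.map((c) => c.shard))];
}

const targeted = shardsForQuery({ userid: "asya" });
const broadcast = shardsForQuery({ title: "On Selecting a Shard Key" });
console.log(targeted, broadcast);
```

The query that carries the shard key touches one shard; the query on title alone has to be sent to all three, and that fan-out grows with every shard added to the cluster.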

Read Locality

This criterion only applies if you're doing range queries; for example, "show me the last 10 articles posted by this user", or "show me the latest 10 comments on this posting", or even "show me all the articles posted last January". Note that any query with a sort and a limit is a range query.

If you're doing range queries, you still want them to be targeted to a single shard, for all the reasons I explained above for "Read Targeting". In turn, this means you want the shard key to be such that all of the documents within the range are on the same shard.

The way you typically do this is with a compound shard key. For example, your "articles" collection might be sharded by { userid:1, time_posted:1 }. If a particular user doesn't post that many articles, they'll all be on a single shard (based on the {userid:1} portion of the shard key), so your range query (something like find({userid: 'Asya'}).sort({time_posted:-1}).limit(10) ) will only target the shard which has "Asya"'s postings.

On the other hand, if "Asya" is a prolific poster, and there are hundreds of chunks with her postings in them, then the {time_posted:1} portion of the shard key will keep consecutive postings together on the same shard. Your query for the latest 10 postings will therefore only have to query one, or at most two, shards.
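Here is a toy model of that layout, again a sketch rather than a description of mongos internals. The chunk boundaries, shard names, and helper functions are invented. One assumption is worth calling out: the "recent postings" query is modelled as a key range with an explicit lower bound on time_posted.

```javascript
// Compare two compound shard key values [userid, time_posted] field by field.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Toy chunk map for { userid: 1, time_posted: 1 }. Each chunk covers
// [min, max). "Asya" is prolific, so her postings span three chunks.
const MIN_TIME = -Infinity;
const MAX_TIME = Infinity;
const chunks = [
  { min: ["", MIN_TIME], max: ["Asya", 100], shard: "shard0" },
  { min: ["Asya", 100], max: ["Asya", 200], shard: "shard1" },
  { min: ["Asya", 200], max: ["Bob", MIN_TIME], shard: "shard2" },
  { min: ["Bob", MIN_TIME], max: ["\uffff", MAX_TIME], shard: "shard0" },
];

// Shards whose chunks overlap the inclusive key range [lo, hi].
function shardsForRange(lo, hi) {
  const hit = chunks.filter(
    (c) => compareKeys(c.min, hi) <= 0 && compareKeys(lo, c.max) < 0
  );
  return [...new Set(hit.map((c) => c.shard))];
}

// All of Asya's postings: touches every shard that holds one of her chunks.
const allPosts = shardsForRange(["Asya", MIN_TIME], ["Asya", MAX_TIME]);
// Only her postings from time 250 onward: a single chunk, a single shard.
const latest = shardsForRange(["Asya", 250], ["Asya", MAX_TIME]);
console.log(allPosts, latest);
```

Because the chunks are ordered by time_posted within each userid, a bounded slice of one user's timeline maps to a contiguous run of chunks, which is what keeps the number of shards involved small.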

Common Design Patterns

There are two design patterns that I think work well for shard key selection. The first is using a hashed shard key, based on a field that is present in most queries.

Hashed shard keys can often be a good option: out of the 5 criteria, the only one they don't provide is Read Locality. If your application doesn't use range queries, they may be ideal.

Two important things to note about hashed shard keys: the underlying field that they're based on must provide enough cardinality, and the underlying field must be present in most queries in order to allow for Read Targeting.

Consider the example of a massively multiplayer online game which uses MongoDB to persist user state between gaming sessions. My application describes a user's state in an individual document per user, and I declare a hashed shard key on the _id field. The _id field is particularly well suited for a hashed shard key, as it is the primary key MongoDB uses for identifying a single document. As such, it is both a required field and unique within a collection. Hashing the _id field works great for this pattern since I predominantly look up an individual's game state by the user's id.

The other useful design pattern is a compound shard key, composed of a low-cardinality ("chunky") first part, and a high-cardinality second part, often a monotonically increasing one. The {userid:1, time_posted:1} shard key from above is an instance of this pattern. If there are enough distinct values in the first part (at least twice the number of shards) you'll get good write and read distribution; the high-cardinality second part gets you good cardinality and read locality.

As with the hashed shard key, you need to have at least the first portion of the shard key present in queries in order to get some level of Read Targeting. Ideally, you'd have both portions of the key present in most queries, but it turns out that you can often get most of the benefit even if you only have the first portion.

Tradeoffs, Tradeoffs, and More Tradeoffs

The most important thing to remember is that it may be impossible to create the perfect shard key. For one thing, these five criteria I listed are typically mutually incompatible: it's very rare to be able to get good write distribution, read distribution, and read locality all with a single shard key.

For another thing, your application may have multiple query patterns: a shard key that is perfectly tuned for one type of query may be sub-optimal for another type of query. For example, if you shard an "articles" collection by {userid:1, time_posted:1}, then queries for postings by a single user will be targeted queries, but queries for all recent postings made by all users will necessarily be scatter/gather.

To further complicate things, different overall application workloads will call for you to select different shard keys. By arbitrarily specifying different types of read/write/update/sort loads, I can make up use cases where each one of the shard key criteria I listed does not affect performance. (The one exception is cardinality: cardinality is always important.) Here are some example workloads where you can ignore one or more of these criteria.

For example: if your workload is 95% inserts and only 5% queries then you really really care about write distribution, care somewhat about cardinality, and the other factors barely matter at all.

To take another example: if you have a cluster, and the workload is 90% read, 9.9% updates, and 0.1% inserts, it Really Doesn't Matter if you have a monotonically increasing shard key as long as the 'update' write load is uniformly distributed across the shard key range: your insert load won't be heavy enough to create a hot shard on its own.

For a final example: if your application never does range queries, or does them only rarely, then there's no point in considering Read Locality.

As such, the only reasonable way to approach MongoDB shard key selection is the way that you approach any other part of MongoDB schema design: you have to carefully consider the requirements arising from all of the different operations your application will perform. Once you have a good idea of the most important requirements, you structure your schema and your shard key to make sure that the important operations are optimized, and the other operations are possible, and reasonably efficient.

Summary (aka -- TL;DR)

Shard key selection requires thought. The key factors you have to consider are:

  • Cardinality
  • Write distribution
  • Read distribution
  • Read targeting
  • Read locality

You may not be able to come up with a shard key that works perfectly for all of your use cases: instead, you must consider all of your operations carefully, make sure that the important ones have been optimized, and that the other ones are reasonably efficient.

Good luck -- and may you never have to re-shard a production system!

