The following checklist, along with the Operations Checklist, provides recommendations to help you avoid issues in your production MongoDB deployment.
Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. See Data Modeling Concepts for more information.
Determine the set of collections that you will need and the indexes required to support your queries. With the exception of the _id index, you must create all indexes explicitly: MongoDB does not automatically create any indexes other than _id.
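One way to keep explicit index creation repeatable is to declare the indexes next to the application code and apply them at deploy time. The following is a minimal sketch, not a prescribed method. The collection names, field names, and the `ensure_indexes` helper are hypothetical. It assumes `db` behaves like a PyMongo `Database`, where `db[name]` returns a collection that exposes `create_index(keys, **options)`.

```python
# Hypothetical collections and fields, for illustration only.
# Index directions follow MongoDB's convention: 1 = ascending, -1 = descending.
INDEX_PLAN = {
    "orders": [
        {"keys": [("customer_id", 1), ("created_at", -1)], "options": {}},
        {"keys": [("order_number", 1)], "options": {"unique": True}},
    ],
    "customers": [
        {"keys": [("email", 1)], "options": {"unique": True}},
    ],
}


def ensure_indexes(db, plan=INDEX_PLAN):
    """Create every index in the plan and return the reported index names.

    `db` is assumed to behave like a PyMongo Database object.
    The _id index is not listed because MongoDB creates it automatically.
    """
    created = []
    for collection_name, indexes in plan.items():
        collection = db[collection_name]
        for index in indexes:
            name = collection.create_index(index["keys"], **index["options"])
            created.append(name)
    return created
```

Creating an index that already exists with the same definition has no effect, so a helper like this can typically run on every deployment.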
Ensure that your schema design supports your deployment type: if you are planning to use sharded clusters for horizontal scaling, design your schema to include a strong shard key. The shard key affects read and write performance by determining how MongoDB partitions data. See: Impacts of Shard Keys on Cluster Operations for information about what qualities a shard key should possess. You cannot change the shard key once it is set.
Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically, best performance can be achieved when such indexed arrays have fewer than 1000 elements.
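One possible way to keep an indexed array bounded is to spread its elements across several documents, each capped at a fixed length. The sketch below illustrates that idea only. The document shape (`parent_id`, `seq`, `items`) and the `split_into_buckets` helper are assumptions made for this example, and the 1000-element cap reflects the guideline above rather than a server limit.

```python
MAX_ARRAY_LENGTH = 1000  # guideline from the checklist, not a hard server limit


def split_into_buckets(parent_id, items, max_len=MAX_ARRAY_LENGTH):
    """Return bucket documents that each hold at most max_len items.

    Field names are hypothetical; adapt them to your own schema.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [
        {"parent_id": parent_id, "seq": seq, "items": items[start:start + max_len]}
        for seq, start in enumerate(range(0, len(items), max_len))
    ]
```

With this layout, an index on the `items` field of the bucket collection never covers an array longer than `max_len`, at the cost of reading several documents to reassemble the full list.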
Use an odd number of voting members to ensure that elections proceed successfully. You can have up to 7 voting members. If you have an even number of voting members, and constraints, such as cost, prohibit adding another secondary to be a voting member, you can add an arbiter to ensure an odd number of votes. For additional considerations when using an arbiter for a 3-member replica set (P-S-A), see Replica Set Arbiter.
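The arithmetic behind the odd-number recommendation can be sketched as follows. An election requires a majority of the voting members, so adding a member to make the count even raises the majority without raising the number of failures the set can tolerate. The `voting_summary` helper is hypothetical and exists only to illustrate the calculation.

```python
MAX_VOTING_MEMBERS = 7  # limit stated in the checklist


def voting_summary(voting_members):
    """Summarize election arithmetic for a given number of voting members."""
    if not 1 <= voting_members <= MAX_VOTING_MEMBERS:
        raise ValueError("voting members must be between 1 and 7")
    majority = voting_members // 2 + 1
    if voting_members % 2 == 0:
        advice = "even count: add a voting secondary or an arbiter"
    else:
        advice = "ok"
    return {
        "voting_members": voting_members,
        "majority": majority,
        # Members that can be lost while a majority can still be formed.
        "fault_tolerance": voting_members - majority,
        "advice": advice,
    }
```

For example, three and four voting members both tolerate the loss of only one member, which is why the fourth vote adds cost without adding resilience.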
Ensure that your shard key distributes the load evenly on your shards. See: Shard Keys for more information.
Use targeted operations for workloads that need to scale with the number of shards.
"available"read concern can return orphaned documents from secondary members since it does not check for updated chunk metadata. However, if the return of orphaned documents is immaterial to an application, the
"available"read concern provides the lowest latency reads possible among the various read concerns.
Pre-split and manually balance chunks when inserting large data sets into a new non-hashed sharded collection. Pre-splitting and manually balancing the chunks distributes the insert load among the shards, which improves performance for the initial load.
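As a rough sketch of what pre-splitting can look like, the helper below builds `split` and `moveChunk` command documents for an integer shard key with a known value range. It only constructs the documents and does not contact a cluster. The `presplit_plan` function, the even spacing of split points, and the round-robin shard assignment are assumptions made for this example; real data is rarely uniform, so choose split points from the actual key distribution.

```python
def presplit_plan(namespace, shard_key, low, high, chunk_count, shards):
    """Build split and moveChunk command documents for an integer shard key.

    Split points are spaced evenly across [low, high). Each chunk that begins
    at a split point is assigned to a shard in round-robin order.
    """
    if chunk_count < 2 or not shards:
        raise ValueError("need at least 2 chunks and 1 shard")
    if high - low < chunk_count:
        raise ValueError("key range is too small for the requested chunk count")
    width = high - low
    split_points = [low + width * i // chunk_count for i in range(1, chunk_count)]
    splits = [{"split": namespace, "middle": {shard_key: p}} for p in split_points]
    moves = [
        # A chunk includes its lower bound, so the split point locates it.
        {"moveChunk": namespace, "find": {shard_key: p}, "to": shards[i % len(shards)]}
        for i, p in enumerate(split_points, start=1)
    ]
    return splits, moves
```

Each document would then be run as an admin command against a `mongos`, for example with PyMongo's `client.admin.command(doc)`. A production script would also skip moves for chunks that already reside on the target shard.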
Make use of connection pooling. Most MongoDB drivers support connection pooling. Adjust the connection pool size to suit your use case, beginning at 110-115% of the typical number of concurrent database requests.
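The starting point suggested above can be computed directly. This sketch uses integer arithmetic to avoid floating-point rounding surprises; the `initial_pool_size_range` helper is hypothetical, and the result is only a first value to tune from measurements.

```python
def initial_pool_size_range(typical_concurrent_requests):
    """Return (low, high): 110% and 115% of the typical load, rounded up."""
    if typical_concurrent_requests < 1:
        raise ValueError("typical_concurrent_requests must be at least 1")
    # Adding 99 before the floor division rounds up to the next whole connection.
    low = (typical_concurrent_requests * 110 + 99) // 100
    high = (typical_concurrent_requests * 115 + 99) // 100
    return low, high
```

With PyMongo, for instance, a value from this range could be passed as the `maxPoolSize` option when constructing `MongoClient`.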
Ensure that your applications handle transient write and read errors during replica set elections.
Ensure that your applications handle failed requests and retry them if applicable. Drivers do not automatically retry failed requests.
Use exponential backoff logic for database request retries.
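The retry guidance above can be sketched as a small wrapper that doubles the wait after each failed attempt, up to a cap. This is a minimal illustration under stated assumptions, not a driver feature: `retry_with_backoff` and its parameters are hypothetical, and `ConnectionError` stands in for whatever transient exception types your driver raises (with PyMongo, `pymongo.errors.AutoReconnect` is a likely candidate). Only retry operations that are safe to repeat.

```python
import time


def retry_with_backoff(operation, retryable=(ConnectionError,), max_attempts=5,
                       base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on retryable errors with exponential backoff.

    The delay before retry n (counting from 0) is base_delay * 2**n seconds,
    capped at max_delay. The last failure is re-raised to the caller.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The `sleep` parameter is injectable so that the wrapper can be exercised in tests without actually waiting.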