The following checklist, along with the Operations Checklist, provides recommendations to help you avoid issues in your production MongoDB deployment.
Ensure that your replica set includes at least three data-bearing voting members and that your write operations use the "w: majority" write concern. Three data-bearing voting members are required for replica-set wide data durability.
Ensure that all instances use journaling.
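One way to apply the majority write concern to every write from an application is to set it in the connection string. The following is a minimal sketch, assuming a hypothetical three-member replica set named rs0 on hosts db1, db2, and db3 and a database named app. The helper build_uri is illustrative and not part of any driver; replicaSet and w are standard connection string options.

```python
from urllib.parse import urlencode


def build_uri(hosts, replica_set, database):
    """Build a MongoDB connection string that requests w: majority."""
    options = {
        "replicaSet": replica_set,  # name of the replica set to connect to
        "w": "majority",            # acknowledge writes only once a majority has them
    }
    return f"mongodb://{','.join(hosts)}/{database}?{urlencode(options)}"


# Hypothetical hosts and names, for illustration only.
uri = build_uri(["db1:27017", "db2:27017", "db3:27017"], "rs0", "app")
print(uri)
```

Setting the write concern once in the connection string keeps individual write calls from silently falling back to a weaker default.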
Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. See Data Modeling Concepts for more information.
Determine the set of collections that you will need and the indexes required to support your queries. With the exception of the _id index, you must create all indexes explicitly: MongoDB does not automatically create any indexes other than _id.
Ensure that your schema design supports your deployment type: if you are planning to use sharded clusters for horizontal scaling, design your schema to include a strong shard key. While you can change your shard key later, it is important to carefully consider your shard key choice to avoid scalability and performance issues.
Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically, best performance can be achieved when such indexed arrays have fewer than 1000 elements.
Consider the document size limits when designing your schema. The BSON Document Size limit is 16MB per document. If you require larger documents, use GridFS.
Use an odd number of voting members to ensure that elections proceed successfully. You can have up to 7 voting members. If you have an even number of voting members, and constraints, such as cost, prohibit adding another secondary to be a voting member, you can add an arbiter to ensure an odd number of votes. For additional considerations when using an arbiter for a 3-member replica set (P-S-A), see Replica Set Arbiter.
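The voting-member guidance above can be checked mechanically against a replica set configuration document. The following is a minimal sketch, assuming the configuration has the shape of a replica set config (a members list whose entries may carry a votes field). The function name check_voting_members and the warning strings are hypothetical.

```python
def check_voting_members(config):
    """Return a list of warnings about the voting members in a config document."""
    # A member votes unless its "votes" field is 0; a missing field counts as 1.
    voters = [m for m in config["members"] if m.get("votes", 1) > 0]
    warnings = []
    if len(voters) % 2 == 0:
        warnings.append("even number of voting members")
    if len(voters) > 7:
        warnings.append("more than 7 voting members")
    return warnings


# Four voting members: an even count, so a warning is reported.
config = {"members": [{"_id": i, "votes": 1} for i in range(4)]}
print(check_voting_members(config))

# Adding an arbiter as a fifth voter makes the count odd again.
config["members"].append({"_id": 4, "votes": 1, "arbiterOnly": True})
print(check_voting_members(config))
```

An arbiter counts as a voter here because it votes in elections even though it holds no data.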
For the following MongoDB versions, pv1 increases the likelihood of w:1 rollbacks compared to pv0 (no longer supported in MongoDB 4.0+) for replica sets with arbiters:
MongoDB 3.2.11 or earlier
Ensure that your secondaries remain up-to-date by using monitoring tools and by specifying appropriate write concern.
Do not use secondary reads to scale overall read throughput. See: Can I use more replica nodes to scale for an overview of read scaling. For information about secondary reads, see: Read Preference.
Ensure that your shard key distributes the load evenly on your shards. See: Shard Keys for more information.
Use targeted operations for workloads that need to scale with the number of shards.
For MongoDB 3.4 and earlier, read from the primary nodes for non-targeted or broadcast queries as these queries may be sensitive to stale or orphaned data.
For MongoDB 3.6 and later, secondaries no longer return orphaned data unless using read concern "available" (which is the default read concern for reads against secondaries when not associated with causally consistent sessions).
Starting in MongoDB 3.6, all members of the shard replica set maintain chunk metadata, allowing them to filter out orphans when not using "available". As such, non-targeted or broadcast queries that are not using "available" can be safely run on any member and will not return orphaned data.
The "available" read concern can return orphaned documents from secondary members since it does not check for updated chunk metadata. However, if the return of orphaned documents is immaterial to an application, the "available" read concern provides the lowest latency reads possible among the various read concerns.
Pre-split and manually balance chunks when inserting large data sets into a new non-hashed sharded collection. Pre-splitting and manually balancing enables the insert load to be distributed among the shards, increasing performance for the initial load.
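Pre-splitting amounts to choosing chunk boundaries up front and assigning the resulting chunks to shards. The following is a minimal sketch that only builds the split and moveChunk command documents. It assumes an integer shard key whose values are spread roughly evenly between known bounds. The namespace app.events, the key user_id, the shard names, and the function presplit_commands are all hypothetical.

```python
def presplit_commands(namespace, key, low, high, shards, chunks_per_shard=2):
    """Return split and moveChunk command documents for a new, empty collection.

    Assumes an integer shard key spread roughly evenly between low and high.
    """
    total_chunks = len(shards) * chunks_per_shard
    width = high - low
    # Evenly spaced boundaries; integer arithmetic avoids float rounding.
    points = [low + (width * i) // total_chunks for i in range(1, total_chunks)]
    commands = []
    for i, point in enumerate(points):
        # Split so that a chunk begins at this boundary.
        commands.append({"split": namespace, "middle": {key: point}})
        # Send the chunk that starts at this boundary to the shards in turn.
        # The first chunk (below the first boundary) stays where it already is.
        target = shards[i % len(shards)]
        commands.append({"moveChunk": namespace, "find": {key: point}, "to": target})
    return commands


# Hypothetical namespace, key, and shard names, for illustration only.
for command in presplit_commands("app.events", "user_id", 0, 100, ["shardA", "shardB"]):
    print(command)
```

Each document would then be run as an admin command through a mongos. The even spacing is only as good as the assumption about how key values are distributed, so real boundaries should come from a sample of the data being loaded.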
Make use of connection pooling. Most MongoDB drivers support connection pooling. Adjust the connection pool size to suit your use case, beginning at 110-115% of the typical number of concurrent database requests.
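The 110-115% starting point is simple arithmetic on the typical number of concurrent requests. The following is a minimal sketch; the function name starting_pool_size is hypothetical, and the resulting value would typically be supplied through the driver's pool size setting, such as the maxPoolSize connection string option.

```python
def starting_pool_size(typical_concurrent_requests):
    """Return the (low, high) starting range: 110% and 115%, rounded up."""
    def scaled(percent):
        # Integer ceiling division avoids floating-point rounding surprises.
        return -(-typical_concurrent_requests * percent // 100)
    return scaled(110), scaled(115)


low, high = starting_pool_size(200)
print(low, high)  # 220 230
```

Treat the result as a first setting to measure against, not a final value.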
Ensure that your applications handle transient write and read errors during replica set elections.
Ensure that your applications handle failed requests and retry them if applicable. Drivers do not automatically retry failed requests.
Use exponential backoff logic for database request retries.
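Exponential backoff doubles the wait between successive attempts, up to a cap. The following is a minimal, driver-independent sketch. The function retry_with_backoff and its parameters are hypothetical, and the built-in ConnectionError stands in for whatever transient error classes your driver raises (with PyMongo, for example, pymongo.errors.AutoReconnect).

```python
import time


def retry_with_backoff(operation, retryable=(ConnectionError,), max_attempts=5,
                       base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on retryable errors with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base_delay * 1, 2, 4, ... and are capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

Only wrap operations that are safe to repeat. Retrying a non-idempotent write can apply it twice if the first attempt succeeded but its acknowledgment was lost.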
Use cursor.maxTimeMS() for reads and wtimeout for writes if you need to cap execution time for database operations.