
Comprehensive Guide to Optimising MongoDB Performance

Srinivas Mutyala • 7 min read • Published Jul 09, 2024 • Updated Jul 09, 2024
MongoDB is celebrated for its high performance and scalability, making it a popular choice among NoSQL databases. However, to fully leverage its potential, fine-tuning your MongoDB deployment is essential. This guide outlines various strategies and best practices for enhancing MongoDB performance, covering everything from identifying bottlenecks to optimizing queries and hardware.

Understanding your workload

Before diving into performance tuning, it's crucial to understand your workload. MongoDB's performance can vary significantly based on whether your application is read-heavy, write-heavy, or a balanced mix. Use tools such as the MongoDB Atlas Query Profiler or the open-source mongostat utility to analyze your database operations and gain insights into your workload.
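One rough way to put a number on "read-heavy" versus "write-heavy" is to compare the operation counters reported under db.serverStatus().opcounters. The sketch below works on a plain object shaped like that output. The sample numbers and the 70% threshold are illustrative assumptions, not MongoDB guidance.

```javascript
// Classify a workload from an object shaped like db.serverStatus().opcounters.
// The 0.7 default threshold is an arbitrary assumption for illustration.
function classifyWorkload(opcounters, threshold = 0.7) {
  const reads = opcounters.query + opcounters.getmore;
  const writes = opcounters.insert + opcounters.update + opcounters.delete;
  const total = reads + writes;
  if (total === 0) return { label: "idle", readShare: 0 };

  const readShare = reads / total;
  let label = "balanced";
  if (readShare >= threshold) label = "read-heavy";
  else if (readShare <= 1 - threshold) label = "write-heavy";
  return { label, readShare };
}

// Hand-written sample values, not real server output.
const sampleOpcounters = { insert: 1200, query: 8000, update: 600, delete: 200, getmore: 0, command: 500 };
console.log(classifyWorkload(sampleOpcounters)); // { label: 'read-heavy', readShare: 0.8 }
```

The counters are cumulative since the server started, so in practice you would take two samples and classify the difference between them.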

Indexing for performance

Effective indexing is one of the most impactful ways to enhance query performance in MongoDB. Here are key practices:
  • Create relevant indexes: Tailor indexes to match your application's query patterns. Use the explain() method to understand query behavior and optimize accordingly.
db.collection.find({ field: value }).explain("executionStats")
You can also get this information from MongoDB Compass, which presents it visually, as shown below.
[Figure: MongoDB Compass showing the query { state: 'st1' } run against the 'addrInfo' collection in the 'IndexDB' database.]
[Figure: MongoDB Compass Explain Plan for the same query: an IXSCAN on the 'state_1_city_1' index followed by a FETCH. The Query Performance Summary reports 1 document returned, 1 document examined, 1 index key examined, 0 ms execution time, and no in-memory sort.]
  • Avoid over-indexing: While indexes improve query speed, they can hinder write operations and consume additional disk space. Regularly review and remove unused or unnecessary indexes.
db.collection.dropIndex("indexName")
  • Use compound indexes: For queries involving multiple fields, compound indexes can significantly boost performance.
db.collection.createIndex({ field1: 1, field2: -1 })
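To make the explain("executionStats") output easier to act on, the following sketch condenses it into a few numbers. It assumes a simple plan with a single chain of inputStage entries, and the sample object is hand-written to mirror the Compass result described above. Plans with several input stages or other plan formats would need extra handling.

```javascript
// Walk a winning plan and collect its stage names, outermost first.
function planStages(stage) {
  const names = [];
  for (let s = stage; s; s = s.inputStage) names.push(s.stage);
  return names;
}

// Condense an explain("executionStats") result into a short summary.
function summarizeExplain(explain) {
  const stats = explain.executionStats;
  const stages = planStages(explain.queryPlanner.winningPlan);
  return {
    stages,
    usesCollectionScan: stages.includes("COLLSCAN"),
    // Documents examined per document returned; values near 1 suggest a selective index.
    docsExaminedPerReturned: stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned,
    keysExamined: stats.totalKeysExamined,
    executionTimeMillis: stats.executionTimeMillis,
  };
}

// Hand-written sample, not real server output.
const sampleExplain = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "state_1_city_1" } },
  },
  executionStats: { nReturned: 1, totalKeysExamined: 1, totalDocsExamined: 1, executionTimeMillis: 0 },
};
const explainSummary = summarizeExplain(sampleExplain);
console.log(explainSummary);
```

A COLLSCAN stage, or a docsExaminedPerReturned value far above 1, is usually the signal that an index is missing or does not match the query.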

Optimizing query patterns

Optimizing your query patterns is crucial for reducing execution time and resource usage:
  • Projection: Use projection to limit the fields returned by your queries, minimizing data transfer and processing load. If the auto-generated _id field is not used by the application, exclude it explicitly with _id: 0.
db.collection.find({ field: value }, { field1: 1, field2: 1, _id: 0 })
  • Aggregation framework: Leverage MongoDB's aggregation framework for complex data processing. Ensure aggregations utilize indexed fields where possible.
db.collection.aggregate([ { $match: { field: value } }, { $group: { _id: "$field", total: { $sum: "$amount" } } } ])
  • Avoid $where: The $where operator runs JavaScript against the documents it examines, so it can be slow and resource-intensive. Use it sparingly and only when necessary. Where possible, prefer $expr with aggregation operators that do not run JavaScript (that is, operators other than $function and $accumulator). If you must write a custom JavaScript expression, $function is preferred over $where.
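As a small illustration of the last point, here is the same comparison written both ways, together with a projection. The field names spent and budget and the collection name orders are hypothetical.

```javascript
// Slower form: $where evaluates a JavaScript expression on the server.
const whereFilter = { $where: "this.spent > this.budget" };

// Preferred form: $expr with the $gt aggregation operator, no JavaScript involved.
const exprFilter = { $expr: { $gt: ["$spent", "$budget"] } };

// Inclusion projection that also suppresses the auto-generated _id field.
const projection = { spent: 1, budget: 1, _id: 0 };

// In mongosh (not executed here; `orders` is a hypothetical collection):
//   db.orders.find(exprFilter, projection)
console.log(JSON.stringify(exprFilter));
```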

Hardware considerations

The hardware on which MongoDB runs plays a crucial role in its performance:
  • RAM: MongoDB relies heavily on RAM to hold the working set, meaning the data and indexes that are accessed frequently. If your working set exceeds your available RAM, consider upgrading your memory.
  • Storage: Utilize SSDs for storage to enhance I/O throughput and data access speeds.
  • Network: Ensure your network bandwidth and latency are sufficient, especially in distributed deployments.
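For the RAM point, a quick back-of-the-envelope check can help. By default, the WiredTiger internal cache is the larger of 50% of (RAM minus 1 GB) and 256 MB. The sketch below applies that rule, treating GB and MB as binary units, which is an assumption. The working set size is an estimate you would have to supply yourself.

```javascript
const GIB = 1024 ** 3;
const MIB = 1024 ** 2;

// Default WiredTiger internal cache size: the larger of
// 50% of (RAM - 1 GB) and 256 MB.
function defaultWiredTigerCacheBytes(ramBytes) {
  return Math.max(0.5 * (ramBytes - GIB), 256 * MIB);
}

// True when an estimated working set fits in the default cache.
function workingSetFits(workingSetBytes, ramBytes) {
  return workingSetBytes <= defaultWiredTigerCacheBytes(ramBytes);
}

console.log(defaultWiredTigerCacheBytes(16 * GIB) / GIB); // 7.5
console.log(workingSetFits(10 * GIB, 16 * GIB)); // false
```

The operating system's filesystem cache also holds data outside this internal cache, so treat the result as a conservative indicator rather than a hard limit.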

Replication and sharding

MongoDB supports replication and sharding to improve availability and scalability:
  • Replication: This ensures data redundancy and high availability. Configure read preference settings to route read operations across replica set members effectively.
rs.initiate()
The following read preference modes are available and can be configured at the application level:
  • primary: Reads from the primary only
  • primaryPreferred: Reads from the primary if available, otherwise from a secondary
  • secondary: Reads from a secondary only
  • secondaryPreferred: Reads from a secondary if available, otherwise from the primary
  • nearest: Reads from the nearest node based on network latency and operational health
Example: Setting read preferences in application code (Node.js)
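A minimal sketch of how this might look with the official Node.js driver follows. The helper withReadPreference is a hypothetical name introduced here. The driver calls appear only in comments so that the snippet runs without a database, and the uri variable in those comments stands for your own connection string.

```javascript
// The five read preference modes listed above.
const READ_PREFERENCE_MODES = ["primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"];

// Build a driver options object carrying a read preference.
// Throws on an unknown mode so typos fail fast.
function withReadPreference(mode, extraOptions = {}) {
  if (!READ_PREFERENCE_MODES.includes(mode)) {
    throw new RangeError(`unknown read preference mode: ${mode}`);
  }
  return { ...extraOptions, readPreference: mode };
}

// Options for a reporting query that may be served by a secondary.
const reportingOptions = withReadPreference("secondaryPreferred", { maxTimeMS: 2000 });
console.log(reportingOptions);

// With the official `mongodb` package (not executed here):
//   const { MongoClient } = require("mongodb");
//   const client = new MongoClient(uri, withReadPreference("primaryPreferred"));
//   const coll = client.db("mydatabase").collection("mycollection");
//   const docs = await coll.find({}, reportingOptions).toArray();
```

Setting the mode on the client gives a default for every operation, while passing it to an individual find() overrides that default for one query.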
  • Sharding: This distributes data across multiple servers and is crucial for managing large datasets and high-throughput operations. Choose a shard key that evenly distributes data and query load.
sh.enableSharding("mydatabase")
sh.shardCollection("mydatabase.mycollection", { shardKey: 1 })
Choosing a shard key in MongoDB can significantly impact performance depending on whether your workload is read-heavy or write-heavy. Here are some guidelines for selecting a shard key based on your workload:
Read-heavy workloads
Shard key selection: Choose a shard key that evenly distributes read operations across shards.
Considerations: Use a high-cardinality field that ensures even distribution of reads. Avoid shard keys that can cause hot spots where most reads target a single shard.
Example: Use a user ID if user-related queries are common.
sh.shardCollection("mydatabase.mycollection", { userID: 1 })
Write-heavy workloads
Shard key selection: Choose a shard key that balances the write load across shards.
Considerations: Use a field with many distinct values so that writes spread evenly across shards. Avoid monotonically increasing keys (e.g., timestamps), as they direct all new writes to a single shard, which becomes a bottleneck.
Example: Use a hashed shard key to distribute writes evenly if no field provides a naturally even distribution.
sh.shardCollection("mydatabase.mycollection", { hashedField: "hashed" })
Additional considerations
Monitor and adjust: Continuously monitor the performance and adjust shard keys if needed.
Indexing: Ensure indexes are aligned with the shard key for optimal query performance.
By selecting the appropriate shard key and considering the nature of your workload, you can optimize your MongoDB deployment for both read and write operations.
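The difference between a monotonically increasing key and a hashed key can be seen in a small simulation. This is only an illustration: the range boundaries are made up, and MD5 from Node's crypto module stands in for MongoDB's own hashing function, which is not reproduced here.

```javascript
const crypto = require("node:crypto");

const SHARD_COUNT = 4;

// Ranged sharding: each shard owns a contiguous key range.
// These upper bounds are hypothetical.
const RANGE_UPPER_BOUNDS = [1000, 2000, 3000, Infinity];
function shardForRangedKey(key) {
  return RANGE_UPPER_BOUNDS.findIndex((upper) => key < upper);
}

// Hashed sharding stand-in: hash the key, then map the hash to a shard.
function shardForHashedKey(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % SHARD_COUNT;
}

// Count how many keys land on each shard under a given placement rule.
function countPerShard(keys, pickShard) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const key of keys) counts[pickShard(key)] += 1;
  return counts;
}

// New documents whose keys keep growing, like timestamps or counters.
const newKeys = Array.from({ length: 1000 }, (_, i) => 5000 + i);
const rangedCounts = countPerShard(newKeys, shardForRangedKey);
const hashedCounts = countPerShard(newKeys, shardForHashedKey);
console.log("ranged:", rangedCounts); // [ 0, 0, 0, 1000 ]: every insert hits the last shard
console.log("hashed:", hashedCounts); // spread across all four shards
```

The trade-off is that hashing scatters adjacent key values, so range queries on a hashed shard key generally have to contact every shard.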

Performance monitoring and maintenance

Regular monitoring and maintenance are vital for sustained performance:
  • Monitoring tools: Utilize MongoDB Atlas, mongostat, and mongotop to monitor database performance and resource usage.
mongostat --host <host>
mongotop --host <host>
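  • Routine maintenance: Periodically compact collections to reclaim disk space, and confirm that the balancer is keeping data evenly distributed across shards. Note that db.repairDatabase() was removed in MongoDB 4.2; use the compact command instead, and reserve mongod --repair for last-resort recovery of a damaged deployment.
db.runCommand({ compact: "collectionName" })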

Read/write concerns

The choice of write concern can influence both the performance and the durability of the data.

Performance

A lower write concern (e.g., w: 0) can enhance performance by reducing the latency of the write operation. However, it risks data durability.
Impact on latency
Lower write concern (e.g., w: 0):
Latency reduction:
  • The client does not wait for any acknowledgment from the server.
  • The operation is sent to the server and considered complete from the client's perspective.
  • There is no network round-trip latency as there is no need for the server to respond.
Trade-off:
  • There's an increased risk of data loss since the client receives no confirmation of write success.
  • It's suitable for non-critical data or scenarios where high write throughput is needed with minimal latency.
Higher write concern (e.g., w: 1 or w: "majority"):
Latency increase:
  • The client waits for acknowledgment from the server.
  • For w: 1, it waits for acknowledgment from the primary node.
  • For w: "majority", it waits for acknowledgment from a majority of replica set members.
  • Network round-trip latency and server processing time add to the overall latency.
Trade-off:
  • Data durability and consistency are enhanced.
  • With w: "majority", the write is replicated to and acknowledged by a majority of members before the client proceeds.
db.collection.insertOne({ field: "value" }, { writeConcern: { w: 1 } })
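One way to apply this trade-off consistently is to map each kind of data to a write concern in one place. The sketch below does that. The three tier names and the 5-second timeout are hypothetical choices for illustration.

```javascript
// Pick a write concern document for a given durability need.
// The tier names are hypothetical labels used only in this sketch.
function writeConcernFor(tier) {
  switch (tier) {
    case "fire-and-forget":
      // No acknowledgment: lowest latency, no confirmation of success.
      return { w: 0 };
    case "acknowledged":
      // Acknowledged by the primary only.
      return { w: 1 };
    case "durable":
      // Wait for a majority and for the on-disk journal, but stop waiting
      // after 5 seconds (a timeout does not undo the write itself).
      return { w: "majority", j: true, wtimeout: 5000 };
    default:
      throw new RangeError(`unknown tier: ${tier}`);
  }
}

console.log(writeConcernFor("durable"));

// In mongosh (not executed here):
//   db.collection.insertOne({ field: "value" }, { writeConcern: writeConcernFor("durable") })
```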

Read preferences

The choice of read preference can influence both the performance and the availability of the data.
Performance: Distributing read operations to secondary members can enhance performance by reducing the load on the primary, at the cost of possibly reading slightly stale data. To route reads this way, set the read preference in MongoDB. Here are examples of how to configure read preferences:

MongoDB Shell

db.getMongo().setReadPref("secondaryPreferred")
Connection URI
mongodb://host1,host2,host3/?readPreference=secondaryPreferred
Application code example (Node.js)
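The following is a minimal sketch of application code that assembles the connection URI shown above and hands it to the driver. The function name buildMongoUri is hypothetical, and the driver calls are left as comments so the snippet runs on its own.

```javascript
// Assemble a replica set connection string with a read preference.
function buildMongoUri(hosts, readPreference) {
  if (hosts.length === 0) throw new RangeError("at least one host is required");
  return `mongodb://${hosts.join(",")}/?readPreference=${readPreference}`;
}

const uri = buildMongoUri(["host1", "host2", "host3"], "secondaryPreferred");
console.log(uri); // mongodb://host1,host2,host3/?readPreference=secondaryPreferred

// With the official `mongodb` package (not executed here):
//   const { MongoClient } = require("mongodb");
//   const client = new MongoClient(uri);
//   await client.connect();
```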
By setting the read preference to secondaryPreferred, you direct read operations to secondary members when they are available, reducing the load on the primary node and enhancing overall performance.
Checks to identify the common reasons for performance issues:
  • Run mongotop and mongostat, and check which namespace is causing the issue.
  • System level: check replication from the primary. Is there any lag, and how large is the oplog window?
  • Application level: check for any batch loads running at the application level.
  • Are there any slow queries (check with currentOp())?
  • Are there proper indexes?
  • Sharded cluster: are the majority of the queries using the shard key?
  • WiredTiger (WT) cache: is it under pressure? Are pages being evicted?
  • Do you see write contention?
  • Open files: check the limit with ulimit -a; it should be high, on the order of 65000.
  • Check whether the mongod process alone causes server load or any other processes.
  • top or htop: Monitor CPU and memory usage of mongod and other processes.
  • ps and grep: Run ps aux | grep mongod to view mongod resource usage.
  • iostat: Use iostat -x 1 10 to check disk I/O metrics.
  • vmstat: Run vmstat 1 10 for overall system performance snapshots.
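For the replication lag check in the list above, the numbers can be derived from the member list that rs.status() returns. The sketch below computes the lag of each secondary from a plain object shaped like that output. The sample hosts and timestamps are made up.

```javascript
// Compute the lag, in seconds, of each secondary behind the primary
// from an object shaped like the result of rs.status().
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no primary, e.g. during an election
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({ name: m.name, lagSeconds: (primary.optimeDate - m.optimeDate) / 1000 }));
}

// Hand-written sample, not real server output.
const sampleStatus = {
  members: [
    { name: "host1:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-07-09T10:00:30Z") },
    { name: "host2:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-07-09T10:00:28Z") },
    { name: "host3:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-07-09T10:00:00Z") },
  ],
};
const lagReport = replicationLagSeconds(sampleStatus);
console.log(lagReport); // host2 is 2 s behind, host3 is 30 s behind
```

In mongosh, rs.printSecondaryReplicationInfo() reports similar figures directly, and rs.printReplicationInfo() shows the oplog window.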
[Figure: Write contention in MongoDB. Left: a single db.col.updateOne() creates a new document version (v1 to v2) under MVCC (Multi-Version Concurrency Control). Right: several conflicting updates to the same document (v2-1, v2-2, v2-3) fail because of write contention. The diagram recommends revising the schema design to avoid write contention and notes the optimistic concurrency protocol of the WiredTiger (WT) storage engine.]
Write contention in MongoDB can be identified by the following indicators:
Queued operations: Use mongostat to watch the queued (qrw) and active (arw) reader and writer counts. Sustained write queues indicate contention. Current mongostat versions report these queue lengths rather than lock percentages.
Slow write operations: Check for slow write operations using db.currentOp(), which may indicate contention.
Frequent write conflicts: Review logs for messages about write conflicts or rejections, and track the writeConflicts counter under db.serverStatus().metrics.operation.
Increased latency: Observe increased latency in write-heavy operations or applications.
Example command to monitor queued operations: mongostat --host <hostname>
Designing the schema properly, such as using appropriate indexes and avoiding hotspots with distributed writes, can help mitigate write contention.
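To track write conflicts over time, one option is to sample the writeConflicts counter under db.serverStatus().metrics.operation twice and compute a rate. The sketch below assumes the two samples are plain objects with numeric counters. In mongosh the values may arrive as a Long type and need converting first. The sample values are made up.

```javascript
// Write conflicts per second between two serverStatus-shaped samples.
function writeConflictsPerSecond(before, after, elapsedSeconds) {
  const delta = after.metrics.operation.writeConflicts - before.metrics.operation.writeConflicts;
  return delta / elapsedSeconds;
}

// Hand-written samples taken 60 seconds apart, not real server output.
const sampleBefore = { metrics: { operation: { writeConflicts: 120 } } };
const sampleAfter = { metrics: { operation: { writeConflicts: 420 } } };
console.log(writeConflictsPerSecond(sampleBefore, sampleAfter, 60)); // 5
```

A rate that climbs with write load points to many operations updating the same documents, which is the pattern that a schema revision aims to remove.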

Conclusion

Achieving optimal MongoDB performance involves a comprehensive approach, including query optimization, proper indexing, sufficient hardware resources, and continuous monitoring. By implementing the strategies outlined in this guide, you can significantly enhance the efficiency and responsiveness of your MongoDB deployment, ensuring it meets the demands of your applications.