Troubleshoot Slow Queries in Production

This page covers common causes and resolutions for slow queries. If you need additional support after going through the following sections, contact Technical Support.

To confirm your deployment is encountering issues with slow queries, check the following:

Compare current query latency to historical baselines or service level objectives. Identify whether the slowness is constant or occurs only during specific load patterns, such as load spikes, batch jobs, or maintenance windows.
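As a rough illustration of comparing latency against a baseline, the following sketch computes a 95th-percentile latency from two sets of samples. The function names, the choice of p95, and the 1.5x tolerance are assumptions made for this example, not MongoDB recommendations.

```python
from statistics import quantiles


def p95(latencies_ms):
    """Return the 95th percentile of a list of latency samples in milliseconds."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(latencies_ms, n=100)[94]


def exceeds_baseline(current_ms, baseline_ms, tolerance=1.5):
    """Return True if the current p95 is more than `tolerance` times the baseline p95."""
    return p95(current_ms) > tolerance * p95(baseline_ms)
```

Running the same comparison separately for peak and off-peak windows can help show whether the slowness is constant or tied to a load pattern.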

Use the database profiler and slow query logs to identify specific operations by namespace, pattern, and latency.
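The following sketch shows one way to group profiler entries by namespace, operation type, and plan summary once you have read them from the system.profile collection. It assumes each entry is a dictionary with the ns, op, millis, and planSummary fields; the helper name and the 100 ms threshold are illustrative choices.

```python
from collections import defaultdict


def summarize_slow_ops(profile_docs, slow_ms=100):
    """Group profiler entries at or above `slow_ms` by (ns, op, planSummary)."""
    groups = defaultdict(lambda: {"count": 0, "total_ms": 0, "max_ms": 0})
    for doc in profile_docs:
        millis = doc.get("millis", 0)
        if millis < slow_ms:
            continue  # skip operations below the threshold
        key = (doc.get("ns"), doc.get("op"), doc.get("planSummary"))
        group = groups[key]
        group["count"] += 1
        group["total_ms"] += millis
        group["max_ms"] = max(group["max_ms"], millis)
    # Order by total time so the most expensive patterns come first.
    return sorted(groups.items(), key=lambda item: item[1]["total_ms"], reverse=True)
```

Sorting by total time rather than maximum time surfaces frequent, moderately slow operations as well as rare, very slow ones.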

For Atlas, use the Performance Advisor and the Query Profiler to find queries with high execution time.

Confirm cluster and node health. Check for issues with:

  • Primary/secondary state

  • Replication lag

  • Frequent elections

  • Node availability

  • Failovers

  • Node restarts

  • Storage errors that occur at the same time as slow query periods

Inspect CPU, memory, disk I/O, and disk utilization on each node, confirming there is no sustained saturation. For details, see Monitoring a Self-Managed MongoDB Deployment.

On Atlas, review metrics dashboards for spikes in CPU, IOPS, connections, and page faults around the time of slow queries.

Determine whether any recent changes occurred at the same time as the slow queries, such as:

  • Releasing an application

  • Index changes

  • Schema migrations

  • Deployment resizing

  • Parameter changes

The following sections describe common causes of slow queries and how to resolve them.

The following indexing issues can cause slow queries.

Queries that perform collection scans or use non-selective indexes can cause high latency under production load. To identify queries that don't use the expected index, run explain with the "executionStats" or "allPlansExecution" verbosity mode. "executionStats" returns execution metrics for the winning plan, and "allPlansExecution" also returns metrics for the candidate plans evaluated during plan selection. Create or refine indexes for the fields used to filter and sort query results.
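As a minimal sketch of reading explain output, the helper below inspects a document shaped like the result of explain("executionStats") for a find operation. It assumes the winning plan is a tree of stage documents linked through inputStage, inputStages, or queryPlan; the exact shape varies by server version and query engine. The function names and the examined-to-returned threshold of 10 are hypothetical.

```python
def plan_stages(plan):
    """Yield the stage names found in a winning plan tree."""
    if not isinstance(plan, dict):
        return
    if "stage" in plan:
        yield plan["stage"]
    # Single-child links; "queryPlan" is assumed to wrap the tree on some versions.
    for key in ("queryPlan", "inputStage"):
        yield from plan_stages(plan.get(key))
    # Multi-child links, for example under an OR stage.
    for child in plan.get("inputStages", []):
        yield from plan_stages(child)


def diagnose(explain_doc, max_ratio=10):
    """Return a list of likely problems found in explain output."""
    stats = explain_doc["executionStats"]
    stages = list(plan_stages(explain_doc["queryPlanner"]["winningPlan"]))
    findings = []
    if "COLLSCAN" in stages:
        findings.append("collection scan")
    if "SORT" in stages:
        findings.append("in-memory sort")
    # Many documents examined per document returned suggests a non-selective plan.
    if stats["totalDocsExamined"] > max_ratio * max(stats["nReturned"], 1):
        findings.append("low selectivity")
    return findings
```

An empty result does not prove that a query is efficient. It only means none of these three simple checks fired.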

Queries with sort() that can't use an index require an in-memory sort. This is especially slow on large result sets. You can improve performance by:

  • Creating compound indexes that match the query filter and sort patterns

  • Reducing the result set size before sort
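To illustrate a compound index that matches both the filter and the sort, the sketch below builds an index key list with equality-filter fields first, followed by the sort fields. This ordering is an assumption based on the common equality-then-sort guideline, and the helper and field names are hypothetical.

```python
def compound_index_keys(equality_fields, sort_spec):
    """Build index keys: equality-filter fields first, then sort fields.

    `sort_spec` is a list of (field, direction) pairs, where direction is 1 or -1.
    """
    keys = [(field, 1) for field in equality_fields]
    # Keep each sort field's direction so the index can return sorted results.
    keys += [(field, direction) for field, direction in sort_spec
             if field not in equality_fields]
    return keys


# Hypothetical query: filter on status, sort by created_at descending.
keys = compound_index_keys(["status"], [("created_at", -1)])
```

With PyMongo, a key list in this form can be passed to collection.create_index(keys). Confirm with explain that the resulting plan no longer contains a SORT stage.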

Aggregation pipelines should use $match, $sort, or other selective stages early, to filter the dataset and avoid processing large volumes of data in memory. If your pipeline has late $match or $sort stages, move them earlier whenever possible.

You can also improve performance by creating indexes on fields used in early pipeline stages.
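The sketch below contrasts a pipeline with a late $match against one that filters first, and includes a small helper that reports which stages run before the first $match. The collection and field names are hypothetical. Moving a $match ahead of another stage is only safe when the filter uses fields that exist before that stage runs.

```python
def stages_before_first_match(pipeline):
    """Return the names of stages that run before the first $match stage."""
    names = []
    for stage in pipeline:
        name = next(iter(stage))  # each stage document has a single operator key
        if name == "$match":
            return names
        names.append(name)
    return names  # the pipeline has no $match stage


lookup = {"$lookup": {"from": "customers", "localField": "customer_id",
                      "foreignField": "_id", "as": "customer"}}
match = {"$match": {"status": "shipped"}}

late_match = [lookup, {"$unwind": "$customer"}, match]
early_match = [match, lookup, {"$unwind": "$customer"}]
```

Here the filter on status applies to the source documents, so placing it first reduces how many documents the $lookup stage processes.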

The following schema and query design issues can cause slow queries.

Very large documents, unbounded arrays, and highly nested structures increase I/O and CPU per operation. You can increase performance by identifying collections with unusually large documents or wide arrays and updating the schema to use bucketing or referencing where possible.

Queries that scan large collections or time ranges are slower than those that enforce tight filters and limits. You can improve query performance by:

  • Adding selective filters and indexing queried fields

  • Using pagination patterns instead of large skip() and limit() combinations

  • Narrowing time windows for time-series data, or adjusting the granularity of the underlying collection if it's too fine
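One pagination pattern that avoids large skip() values is range-based paging on an indexed, ordered field. The sketch below builds the query parts for the next page from the last _id seen on the previous page. The helper name, the use of _id as the paging key, and the page size are assumptions for illustration.

```python
def next_page_query(last_id=None, page_size=50, base_filter=None):
    """Build filter, sort, and limit for the page that follows `last_id`."""
    query = dict(base_filter or {})
    if last_id is not None:
        # Resume after the last document of the previous page instead of skipping.
        query["_id"] = {"$gt": last_id}
    return {"filter": query, "sort": [("_id", 1)], "limit": page_size}


first_page = next_page_query(base_filter={"status": "open"})
second_page = next_page_query(last_id=1050, base_filter={"status": "open"})
```

With PyMongo, these parts could be applied as collection.find(q["filter"]).sort(q["sort"]).limit(q["limit"]). Because each page starts from an index position, the cost stays roughly constant as the page number grows.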

Some query operators and patterns prevent MongoDB from using indexes efficiently:

  • $regex with a leading wildcard

  • $nin

  • Very large $in lists

  • Excessive $or branches

  • $ne and other negation conditions that lead to inefficient index scans

To improve performance, rewrite predicates to be index-friendly. Consider anchored regular expressions and precomputed fields where possible.
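As an example of an index-friendly rewrite, the sketch below builds an anchored, case-sensitive prefix filter in place of a pattern with a leading wildcard. The helper and field names are hypothetical, and the example assumes that a prefix match is acceptable for the application.

```python
import re


def prefix_filter(field, prefix):
    """Build a filter that matches values of `field` starting with `prefix`."""
    # "^" anchors the pattern at the start of the string; re.escape keeps any
    # special characters in the prefix literal.
    return {field: {"$regex": "^" + re.escape(prefix)}}


by_name = prefix_filter("name", "Acme")
```

If the application needs case-insensitive or substring matching, a precomputed field, such as a lowercased copy of the value, may allow the same anchored approach.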

Slow queries might be caused by contention for underlying hardware resources under production load. Check if there's a correlation between query latency spikes and metrics for CPU, IOPS, and cache utilization. Consider vertical and/or horizontal scaling or workload redistribution if queries are already optimized.

Long-running operations can block or interfere with other queries. For example:

  • Large index builds

  • Collection scans

  • Heavy writes

Use $currentOp to identify blocking or long-running operations. Consider scheduling heavy operations, such as index builds on large collections, during maintenance windows.
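The following sketch builds an aggregation pipeline that uses $currentOp to list active operations running longer than a threshold. The helper name and the 10-second default are assumptions. The $currentOp stage must be the first stage and must run against the admin database.

```python
def long_running_ops_pipeline(min_secs=10):
    """Build a pipeline that lists active operations running at least `min_secs`."""
    return [
        # Include operations from all users and skip idle connections.
        {"$currentOp": {"allUsers": True, "idleConnections": False}},
        {"$match": {"active": True, "secs_running": {"$gte": min_secs}}},
        {"$project": {"opid": 1, "op": 1, "ns": 1, "secs_running": 1}},
        {"$sort": {"secs_running": -1}},
    ]


pipeline = long_running_ops_pipeline(min_secs=30)
```

With PyMongo, the pipeline could be run as client.admin.aggregate(pipeline). Listing operations for all users typically requires elevated privileges.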

Queries routed to secondaries with replication lag or less capable hardware can cause slower responses. Review driver read preferences and write concerns. Ensure latency-critical queries are directed to appropriate nodes.
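As a small sketch of directing reads to appropriate nodes, the helper below builds connection string options for two hypothetical workload classes. Latency-critical reads go to the primary, and other reads may use secondaries that are within a staleness bound. The split into two classes and the 120-second value are assumptions for illustration.

```python
from urllib.parse import urlencode


def read_options(latency_critical):
    """Return connection string options for a workload class."""
    if latency_critical:
        return urlencode({"readPreference": "primary"})
    # Allow secondaries, but exclude members that lag by more than 120 seconds.
    return urlencode({"readPreference": "secondaryPreferred",
                      "maxStalenessSeconds": 120})
```

The returned string can be appended after the "?" in a connection string. Drivers also allow setting a read preference per operation, which may suit mixed workloads better.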

After you apply a fix, verify that query performance has improved:

  • Re-run representative queries and compare latency against previous results and desired goals.

  • Confirm that explain("executionStats") shows reduced work, such as fewer documents examined and improved index usage.

  • Review profiler and slow query logs to verify either:

    • Previous slow queries no longer appear.

    • Query times and resource usage have decreased.

  • For Atlas deployments, confirm that query- and cluster-level metrics have returned to normal ranges after changes.

If you require additional support, collect the following information:

  • Sample slow queries with full filter and any applicable projection, sort, and options, including:

    • Approximate frequency

    • Expected latency vs. observed latency

  • explain("executionStats") output for representative slow queries

  • Relevant log excerpts and profiler samples covering the period where slowness occurs

  • Recent changes to:

    • Schema or indexes

    • Deployment size, tier, or topology

    • Application release or query patterns

  • Cluster metrics or host-level stats showing CPU, memory, disk I/O, and connection usage around the time of slow queries

  • Details about the deployment environment:

    • Atlas vs. self-managed

    • Hardware profile

    • Sharding configuration

    • Replication configuration
