Review Available Metrics
You can review the following metrics to monitor your clusters. Every hardware metric also has a corresponding metric or individual chart for its maximum values.
Important
The metrics available depend on your user role and cluster type.
Note
Currently, Serverless instance metrics don't support any third-party services (for example, Datadog).
Metric | Description |
---|---|
Asserts | Displays the following information:
Monitor asserts to track how many errors occur while trying to read or write data. Check the server logs to identify the source of any errors. |
Avg Object Size | Displays the average object size across all collections in the database. Monitor object size to track the size of your objects and better understand your database space. |
Cache Activity | Displays the following information:
Monitor the MongoDB cache, which stores frequently accessed data in memory to service queries faster. |
Cache Ratio | Displays cache fill ratio and dirty fill ratio metrics. Cache Fill Ratio measures how well a cache can serve requests. It is calculated by dividing the number of bytes currently in the cache by the maximum number of bytes configured, represented as a percentage. A high cache fill ratio indicates that most data requests are being served from memory, leading to faster query performance and reduced disk I/O. Dirty Fill Ratio represents the proportion of dirty bytes, which are pages modified in memory but not yet written to disk, relative to the total cache. A high dirty fill ratio suggests that a significant amount of data is waiting to be flushed to disk, which can impact performance. Use this metric when monitoring write-heavy workloads to ensure data durability. |
Cache Usage | Displays the following information:
These metrics include both indexes and data from the working set. Sustained high cache usage indicates the RAM is too small for your workloads. Optimize your queries to avoid frequent disk reads. If write operations make cache usage high, throttle them. |
Catalog | Displays the following information:
Monitor catalog counts to prevent an excessive number of databases, collections, views, or indexes from causing startup failures when you upgrade a cluster tier. |
Collections | Displays the number of collections in the database. Monitor collections to determine restart times, continuous backup performance, and stability. |
Connections (Serverless instance/replica set) or connection (sharded cluster) | Displays the total number of active connections to the cluster. Monitor connections to determine whether the current connection limits are sufficient. If necessary, upgrade the cluster tier. |
Cursors | Displays the following information:
Monitor cursors to close unnecessary cursors and reduce the timeout configuration in the application. |
DB Storage | Displays the following information:
Atlas retrieves database metrics every 20 minutes by default but adjusts frequency when necessary to reduce the impact on database performance. Monitor storage space to determine whether to use disk auto-scaling or manually increase the disk size. You can also monitor this metric to verify backup billing. |
Disk IOPS | Displays input/output operations per second (IOPS). Monitor whether disk IOPS approaches the maximum provisioned IOPS to determine whether the cluster can handle future workloads. |
Disk Latency | Displays the following information:
Monitor disk latency to track the efficiency of reading from and writing to disk. |
Disk Queue Depth | Displays the average length of the queue of requests issued to the disk partition that MongoDB uses. Monitor disk queue depth to identify potential issues and bottlenecks. |
Disk Space Free | Displays the total amount of free space remaining on disk. Monitor free disk space to determine whether to use disk auto-scaling or manually increase the disk size. |
Disk Space Percent Free | Displays the total amount of free space remaining on disk as a percentage of the total disk space. Monitor the percentage of free disk space to determine whether to use disk auto-scaling or manually increase the disk size. |
Disk Space Used | Displays the total space on disk used. Monitor the used disk space to determine whether to use disk auto-scaling or manually increase the disk size. |
Disk Throughput | Displays the disk read and write throughput metrics. Disk Read Throughput reflects the rate at which data is read from disk in megabytes per second, indicating how efficiently the database retrieves data that is not cached in memory. Disk Write Throughput measures the rate at which data is written to disk in megabytes per second, reflecting the database's ability to handle write operations and persist data to storage efficiently. |
Document Metrics | Displays the following information:
Monitor document metrics to measure the work MongoDB completes. |
Execution Time | Displays the average time in seconds for the following metrics:
Monitor execution time for an increase in read operations to optimize queries and indexes. |
Index Size | Displays the total size of all indexes in the database. This metric includes the overhead incurred by indexes on top of the actual document data on which the indexes are based. Monitor the index size to manage your indexes. To learn more, see Indexing Strategies. |
Indexes | Displays the total number of indexes in the database. Monitor indexes to manage them. To learn more, see Indexing Strategies. |
Max Disk IOPS | Displays the following maximum disk IOPS values over the time period specified by the metric granularity:
Monitor whether disk IOPS approaches the maximum provisioned IOPS. Determine whether the cluster can handle future workloads. |
Max Disk Queue Depth | Displays the maximum disk queue depth values over the time period specified by the metric granularity. Disk queue depth is the average length of the queue of requests issued to the disk partition that MongoDB uses. Monitor disk queue depth to identify potential issues and bottlenecks. |
Max Normalized System CPU | Displays the maximum CPU usage values of all processes on the node, scaled to a range of 0-100% by dividing by the number of CPU cores. Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
Max Process CPU | Displays the following maximum process CPU values over the time period specified by the metric granularity:
Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
Max System CPU | Displays the maximum CPU usage values of all processes on the node. Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
Max System Memory | Displays the maximum system memory values in bytes. Monitor memory to determine whether to upgrade to a higher cluster tier.
This metric is based on |
Memory | Displays the total consumption of memory in megabytes at a particular point in time:
Monitor memory to determine whether to upgrade to a higher cluster tier. This metric represents the average value over the time period specified by the metric granularity. |
Network | Displays the following information:
|
Normalized Process CPU | Displays the following information:
Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
Normalized System CPU | Displays the CPU usage of all processes on the node, scaled to a range of 0-100% by dividing by the number of CPU cores. Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
Objects | Displays the number of objects in the database. Monitor this metric to better understand your database space. |
Opcounters | Displays the number of the following operations per second run on a MongoDB process since the process last started:
Monitor MongoDB operations to validate performance issues related to high workloads. Confirm the type of operations responsible for the load. |
Opcounters - Repl | Displays the following information:
Monitor MongoDB operations to validate performance issues related to high workloads. Confirm the type of operations responsible for the load. |
Operation Execution Time | Displays the average time in milliseconds to execute the following operations:
Monitor execution time for an increase in read operations to optimize queries and indexes. Determine whether you need to upgrade your cluster tier. |
Oplog GB/Hour | Displays the average rate of the uncompressed oplog data in gigabytes that the primary generates per hour. Monitor oplog data to determine whether you have to increase the oplog size. |
Page Faults | Displays the average rate of page faults on this process per second over the selected sample period. In non-Windows environments, this applies to hard page faults only. Monitor page faults to determine whether to increase your memory. |
Process CPU | Displays the following information:
Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
Query Executor | Displays the following information:
Monitor the query executor to determine whether you have any inefficient queries. |
Query Targeting | Displays the efficiency of read operations run on MongoDB:
Monitor query targeting to determine read efficiency and optimize queries and indexes. The change streams cursors that the Atlas Search
process ( |
Queues | Displays the following information:
Monitor lock queues to optimize queries. |
Read/Write Units | Displays the following information:
Monitor read and write units to help optimize queries and indexes. |
Replication Headroom | Displays the difference between the primary's replication oplog window and the secondary's replication lag. Monitor replication headroom to determine whether the secondary might fall off the oplog. |
Replication Lag | Displays the approximate number of seconds the secondary is behind the primary in write application. Monitor replication lag to determine whether the secondary might fall off the oplog. |
Replication Oplog Window | Displays the estimated average number of hours of database operations available in the primary's replication oplog, based on oplog churn. If replication lag on a secondary node exceeds the replication oplog window, and replication headroom reaches zero, a full resync is required for that node to become healthy again. Monitor the replication oplog window, together with replication headroom, to determine whether the secondary may soon require a full resync. The replication oplog window often helps to determine in advance the resilience of secondaries to planned and unplanned outages. |
Scan and Order | Displays the number of operations per second that returned results requiring an in-memory sort. Monitor this metric to identify whether your queries need indexes. |
Shard Data Size | Displays the amount of storage space in bytes that your stored data uses on each shard. You can access this chart only for sharded clusters with MongoDB 6.0+. Monitor this metric to verify whether you have balanced shards. |
Shard Document Count | Displays the number of documents on each shard. You can access this chart only for sharded clusters with MongoDB 6.0+. Monitor this metric to verify whether you have balanced shards. |
System CPU | Displays the CPU usage of all processes on the node. Monitor CPU usage to determine whether data is retrieved from disk instead of memory. If you are unable to see the usage that triggered the alert, zoom in on the Normalized System CPU chart by clicking and dragging your mouse over the period of interest. With a higher-resolution view you may be able to identify acute spikes in CPU usage that weren't visible in the overview. |
System Memory | Displays the following information:
Monitor memory to determine whether to upgrade to a higher cluster tier. This metric represents the average value over the time period specified by the metric granularity. |
System Network | Displays the following information:
Monitor network metrics to track network performance. |
Tickets Available | Displays the following information:
Monitor the tickets available to see when read and write requests queue. For clusters running on MongoDB version 7.0 and later, don't use the number of tickets as a metric for overload alerts. Starting in MongoDB version 7.0, Atlas dynamically adjusts the number of tickets. Instead, use the number of queued readers and writers as an overload metric. |
Views | Displays the number of views in the database. Monitor views to help optimize your database. |
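
The Cache Ratio row describes both ratios as a byte count divided by a total and shown as a percentage. The following Python sketch illustrates that arithmetic only. The function name, parameter names, and sample byte counts are hypothetical, and it assumes that "the total cache" in the dirty fill ratio means the maximum number of bytes configured; it is not how Atlas itself computes the chart.

```python
def cache_ratios(bytes_in_cache: int, dirty_bytes: int, max_bytes_configured: int) -> tuple[float, float]:
    """Return (cache fill ratio, dirty fill ratio) as percentages.

    Assumption: both ratios use the configured maximum cache size
    as the denominator.
    """
    if max_bytes_configured <= 0:
        raise ValueError("max_bytes_configured must be positive")
    fill_pct = 100.0 * bytes_in_cache / max_bytes_configured
    dirty_pct = 100.0 * dirty_bytes / max_bytes_configured
    return fill_pct, dirty_pct


GIB = 1024 ** 3

# Hypothetical sample values: 6 GiB cached, 1 GiB dirty, 8 GiB configured.
fill_pct, dirty_pct = cache_ratios(
    bytes_in_cache=6 * GIB,
    dirty_bytes=1 * GIB,
    max_bytes_configured=8 * GIB,
)
print(f"cache fill: {fill_pct:.1f}%  dirty fill: {dirty_pct:.1f}%")
```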
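
The Normalized System CPU and Max Normalized System CPU rows describe scaling CPU usage to a 0-100% range by dividing by the number of CPU cores. A minimal sketch of that scaling follows. The function name and the sample figures are hypothetical, and the sketch assumes the raw value is a percentage summed across cores.

```python
def normalize_cpu(raw_cpu_percent: float, cpu_cores: int) -> float:
    """Scale a CPU percentage summed across cores to a 0-100% range."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return raw_cpu_percent / cpu_cores


# Hypothetical node: 4 cores reporting 260% combined CPU usage.
normalized = normalize_cpu(raw_cpu_percent=260.0, cpu_cores=4)
print(f"normalized CPU: {normalized:.1f}%")
```

Normalizing makes values comparable across cluster tiers with different core counts, because the same threshold applies regardless of how many cores the node has.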
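
The Replication Headroom, Replication Lag, and Replication Oplog Window rows relate to each other through a simple subtraction: headroom is the oplog window minus the lag. The sketch below shows that relationship with the units from the table (window in hours, lag in seconds). The function names and sample values are hypothetical, and converting both to seconds is an assumption made here for illustration.

```python
SECONDS_PER_HOUR = 3600.0


def replication_headroom_seconds(oplog_window_hours: float, replication_lag_seconds: float) -> float:
    """Return the primary's oplog window minus the secondary's lag, in seconds."""
    return oplog_window_hours * SECONDS_PER_HOUR - replication_lag_seconds


def may_need_full_resync(oplog_window_hours: float, replication_lag_seconds: float) -> bool:
    """True when headroom has reached zero, meaning lag is at or beyond the oplog window."""
    return replication_headroom_seconds(oplog_window_hours, replication_lag_seconds) <= 0


# Hypothetical secondary: 24-hour oplog window, 120 seconds behind the primary.
headroom = replication_headroom_seconds(oplog_window_hours=24, replication_lag_seconds=120)
print(f"headroom: {headroom:.0f} s, resync risk: {may_need_full_resync(24, 120)}")
```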