Monitoring Database Systems¶
Monitoring is a critical component of all database administration. A firm grasp of MongoDB’s reporting will allow you to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB’s normal operational parameters will allow you to diagnose issues as you encounter them, rather than waiting for a failure.
This document provides an overview of the available tools and data provided by MongoDB as well as an introduction to diagnostic strategies, and suggestions for monitoring instances in MongoDB’s replica sets and sharded clusters.
Note
MMS (MongoDB Management Service) is a hosted monitoring service which collects and aggregates data to provide insight into the performance and operation of MongoDB deployments. See the MMS documentation for more information.
Monitoring Tools¶
There are two primary methods for collecting data regarding the state of a running MongoDB instance. First, there is a set of tools distributed with MongoDB that provide real-time reporting of activity on the database. Second, several database commands return statistics regarding the current database state with greater fidelity. Each method lets you collect data that answers a different set of questions, and each is useful in different contexts.
This section provides an overview of these utilities and statistics, along with an example of the kinds of questions that each method is most suited to help you address.
Utilities¶
The MongoDB distribution includes a number of utilities that return statistics about instances’ performance and activity quickly. These are typically most useful for diagnosing issues and assessing normal operation.
mongotop¶
mongotop tracks and reports the current read and write activity of a MongoDB instance. mongotop provides per-collection visibility into use. Use mongotop to verify that activity and use match expectations. See the mongotop manual for details.
mongostat¶
mongostat captures and returns counters of database operations. mongostat reports operations on a per-type (e.g. insert, query, update, delete) basis. This format makes it easy to understand the distribution of load on the server. Use mongostat to understand the distribution of operation types and to inform capacity planning. See the mongostat manual for details.
REST Interface¶
MongoDB provides a REST interface that exposes diagnostic and monitoring information in a simple web page. Enable this by setting rest to true, and access this page via the localhost interface using the port numbered 1000 more than the database port. In default configurations the REST interface is accessible on 28017. For example, to access the REST interface on a locally running mongod instance: http://localhost:28017
Statistics¶
MongoDB provides a number of commands that return statistics about the state of the MongoDB instance. These data may provide finer granularity regarding the state of the MongoDB instance than the tools above. Consider using their output in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the activity of your instance.
serverStatus¶
Access serverStatus data by way of the serverStatus command. This document contains a general overview of the state of the database, including disk usage, memory use, connections, journaling, and index access. The command returns quickly and does not impact MongoDB performance.
While this output contains a (nearly) complete account of the state of a MongoDB instance, in most cases you will not run this command directly. Nevertheless, all administrators should be familiar with the data provided by serverStatus.
replSetGetStatus¶
View the replSetGetStatus data with the replSetGetStatus command (rs.status() from the shell). The document returned by this command reflects the state and configuration of the replica set. Use this data to ensure that replication is properly configured, and to check the connections between the current host and the members of the replica set.
dbStats¶
The dbStats data is accessible by way of the dbStats command (db.stats() from the shell). This command returns a document that reflects the amount of storage used and data contained in the database, as well as object, collection, and index counters. Use this data to check and track the state and storage of a specific database. This output also allows you to compare utilization between databases and to determine average document size in a database.
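As a rough sketch of that comparison, the following JavaScript takes documents shaped like db.stats() output and derives an average document size per database. The helper name averageDocumentSizes and the sample numbers are hypothetical; the code only assumes the db, objects, dataSize, and storageSize fields of the dbStats output.

```javascript
// Sketch only: averageDocumentSizes() is a hypothetical helper, not a MongoDB API.
// Each input is a document shaped like the output of db.stats().
function averageDocumentSizes(statsDocs) {
  return statsDocs.map(function (s) {
    return {
      db: s.db,
      // dataSize is reported in bytes; guard against empty databases.
      avgBytes: s.objects > 0 ? s.dataSize / s.objects : 0,
      storageBytes: s.storageSize
    };
  });
}

// Made-up sample values for illustration only.
var sampleStats = [
  { db: "app", objects: 1000, dataSize: 512000, storageSize: 1048576 },
  { db: "logs", objects: 0, dataSize: 0, storageSize: 8192 }
];
var sizes = averageDocumentSizes(sampleStats);
```

In the mongo shell, the input could be built from live data, for example by collecting db.getSiblingDB(name).stats() for each database of interest.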
collStats¶
The collStats data is accessible using the collStats command (db.collection.stats() for a single collection, or db.printCollectionStats() for every collection in the current database, from the shell). It provides statistics that resemble dbStats on the collection level: this includes a count of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about the indexes.
Introspection Tools¶
In addition to status reporting, MongoDB provides a number of introspection tools that you can use to diagnose and analyze performance and operational conditions. Consider the following documentation:
Third Party Tools¶
A number of third party monitoring tools have support for MongoDB, either directly, or through their own plugins.
Self Hosted Monitoring Tools¶
These are monitoring tools that you must install, configure and maintain on your own servers, usually open source.
Tool | Plugin | Description |
---|---|---|
Ganglia | mongodb-ganglia | Python script to report operations per second, memory usage, btree statistics, master/slave status and current connections. |
Ganglia | gmond_python_modules | Parses output from the serverStatus and replSetGetStatus commands. |
Motop | None | Realtime monitoring tool for several MongoDB servers. Shows current operations ordered by durations every second. |
mtop | None | A top like tool. |
Munin | mongo-munin | Retrieves server statistics. |
Munin | mongomon | Retrieves collection statistics (sizes, index sizes, and each (configured) collection count for one DB). |
Munin | munin-plugins Ubuntu PPA | Some additional munin plugins not in the main distribution. |
Nagios | nagios-plugin-mongodb | A simple Nagios check script, written in Python. |
Also consider dex, an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes to make indexing recommendations.
Hosted (SaaS) Monitoring Tools¶
These are monitoring tools provided as a hosted service, usually on a subscription billing basis.
Name | Notes |
---|---|
MongoDB Management Service | |
Scout | Several plugins including: MongoDB Monitoring, MongoDB Slow Queries and MongoDB Replica Set Monitoring. |
Server Density | Dashboard for MongoDB, MongoDB specific alerts, replication failover timeline and iPhone, iPad and Android mobile apps. |
Process Logging¶
During normal operation, mongod and mongos instances report information that reflects current operation to standard output, or a log file. The following runtime settings control these options.
- quiet. Limits the amount of information written to the log or output.
- verbose. Increases the amount of information written to the log or output. You can also specify this as v (as in -v). Set multiple v, as in vvvv = True, for higher levels of verbosity. You can also change the verbosity of a running mongod or mongos instance with the setParameter command.
- logpath. Enables logging to a file, rather than standard output. Specify the full path to the log file to this setting.
- logappend. Adds information to a log file instead of overwriting the file.
Note
You can specify these configuration options as command-line arguments to mongod or mongos.
Additionally, the following database commands affect logging:
- getLog. Displays recent messages from the mongod process log.
- logRotate. Rotates the log files for mongod processes only. See Rotate Log Files.
Diagnosing Performance Issues¶
Degraded performance in MongoDB can be the result of an array of causes, and is typically a function of the relationship among the quantity of data stored in the database, the amount of system RAM, the number of connections to the database, and the amount of time the database spends in a lock state.
In some cases performance issues may be transient and related to traffic load, data access patterns, or the availability of hardware on the host system for virtualized environments. Some users also experience performance limitations as a result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns. In other situations, performance issues may indicate that the database is operating at capacity and that it is time to add capacity.
Locks¶
MongoDB uses a locking system to ensure consistency. However, if certain operations are long-running, or a queue forms, performance slows as requests and operations wait for the lock. Because lock-related slowdowns can be intermittent, look to the data in the globalLock section of the serverStatus response to assess whether the lock has been limiting your performance. If globalLock.currentQueue.total is consistently high, then there is a chance that a large number of requests are waiting for a lock. This indicates a possible concurrency issue that might affect performance.
If globalLock.totalTime is high in context of uptime then the database has existed in a lock state for a significant amount of time. If globalLock.ratio is also high, MongoDB has likely been processing a large number of long-running queries. Long queries are often the result of a number of factors: ineffective use of indexes, non-optimal schema design, poor query structure, system architecture issues, or insufficient RAM resulting in page faults and disk reads.
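Because a single high reading may only be a transient spike, one way to test for a "consistently high" queue is to compare several serverStatus samples taken over time. The following is a minimal sketch under that assumption; the helper name isQueueConsistentlyHigh, the threshold of 20, and the sample values are all hypothetical.

```javascript
// Sketch only: isQueueConsistentlyHigh() and its threshold are hypothetical.
// samples: serverStatus documents collected at intervals by the caller.
function isQueueConsistentlyHigh(samples, threshold) {
  if (samples.length === 0) return false; // no data, no conclusion
  return samples.every(function (s) {
    return s.globalLock.currentQueue.total >= threshold;
  });
}

// Made-up samples: one with a single spike, one with a sustained queue.
var spiky = [
  { globalLock: { currentQueue: { total: 40, readers: 30, writers: 10 } } },
  { globalLock: { currentQueue: { total: 0, readers: 0, writers: 0 } } }
];
var sustained = [
  { globalLock: { currentQueue: { total: 40, readers: 30, writers: 10 } } },
  { globalLock: { currentQueue: { total: 55, readers: 35, writers: 20 } } }
];
var spikyResult = isQueueConsistentlyHigh(spiky, 20);
var sustainedResult = isQueueConsistentlyHigh(sustained, 20);
```

A suitable threshold depends on the workload, so treat the value here as a placeholder to tune against your own baseline.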
Memory Usage¶
Because MongoDB uses memory mapped files to store data, given a data set of sufficient size, the MongoDB process will allocate all memory available on the system for its use. Because of the way operating systems function, the amount of allocated RAM is not a useful reflection of MongoDB’s state.
While this is part of the design, and affords MongoDB superior performance, the memory mapped files make it difficult to determine if the amount of RAM is sufficient for the data set. Consider memory usage statuses to better understand MongoDB’s memory utilization. Check the resident memory use (i.e. mem.resident): if this exceeds the amount of system memory and there’s a significant amount of data on disk that isn’t in RAM, you may have exceeded the capacity of your system.
Also check the amount of mapped memory (i.e. mem.mapped). If this value is greater than the amount of system memory, some operations will incur page faults and require disk access to read data from virtual memory, with deleterious effects on performance.
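The two checks above can be sketched as a small JavaScript function over a serverStatus document. The helper name memoryPressure is hypothetical, the amount of system RAM must be supplied by the caller because serverStatus is not assumed to report it here, and the sample numbers are invented. The mem values are treated as megabytes.

```javascript
// Sketch only: memoryPressure() is a hypothetical helper, not a MongoDB API.
// status: a serverStatus document; systemRamMB: host RAM in megabytes,
// supplied by the caller.
function memoryPressure(status, systemRamMB) {
  return {
    residentMB: status.mem.resident,
    mappedMB: status.mem.mapped,
    // True when the mapped data files are larger than physical memory,
    // so some reads may have to go to disk.
    mappedExceedsRam: status.mem.mapped > systemRamMB,
    residentShareOfRam: status.mem.resident / systemRamMB
  };
}

// Made-up sample: 8 GB host, 20 GB of mapped data.
var memReport = memoryPressure(
  { mem: { resident: 6144, virtual: 40000, mapped: 20480 } },
  8192
);
```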
Page Faults¶
Page faults represent the number of times that MongoDB requires data not located in physical memory, and must read from virtual memory. To check for page faults, see the extra_info.page_faults value in the output of the serverStatus command. This data is only available on Linux systems.
Alone, page faults are minor and complete quickly; however, in aggregate, large numbers of page faults typically indicate that MongoDB is reading too much data from disk, which can have a number of underlying causes. In many situations, MongoDB’s read locks will “yield” after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read into memory. This approach improves concurrency, and in high volume systems this also improves overall throughput.
If possible, increasing the amount of RAM accessible to MongoDB may help reduce the number of page faults. If this is not possible, you may want to consider deploying a sharded cluster and/or adding one or more shards to your deployment to distribute load among mongod instances.
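Since extra_info.page_faults is a running counter, it is the rate of change rather than the absolute value that tends to be informative. The following sketch derives a rate from two serverStatus samples, using the uptime field (in seconds) as the clock. The helper name pageFaultsPerSecond and the sample values are hypothetical.

```javascript
// Sketch only: pageFaultsPerSecond() is a hypothetical helper.
// earlier, later: two serverStatus documents from the same mongod process.
function pageFaultsPerSecond(earlier, later) {
  var seconds = later.uptime - earlier.uptime;
  // A non-positive interval means a restart or a repeated sample,
  // so no meaningful rate can be computed.
  if (seconds <= 0) return null;
  var faults = later.extra_info.page_faults - earlier.extra_info.page_faults;
  return faults / seconds;
}

// Made-up samples taken 60 seconds apart.
var t0 = { uptime: 1000, extra_info: { page_faults: 5000 } };
var t1 = { uptime: 1060, extra_info: { page_faults: 5300 } };
var faultRate = pageFaultsPerSecond(t0, t1);
```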
Number of Connections¶
In some cases, the number of connections between the application layer (i.e. clients) and the database can overwhelm the ability of the server to handle requests, which can produce performance irregularities. Check the following fields in the serverStatus document:
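- globalLock.activeClients contains a counter of the total number of clients with active operations in progress or queued.
- connections is a container for the following two fields:
  - current, the total number of current client connections to the instance.
  - available, the number of unused connections still available.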
Note
Unless limited by system-wide limits, MongoDB has a hard connection limit of 20 thousand connections. You can modify system limits using the ulimit command, or by editing your system’s /etc/sysctl file.
If requests are high because there are many concurrent application requests, the database may have trouble keeping up with demand. If this is the case, then you will need to increase the capacity of your deployment. For read-heavy applications, increase the size of your replica set and distribute read operations to secondary members. For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod instances.
Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently. Extremely high numbers of connections, particularly without a corresponding workload, are often indicative of a driver or other configuration error.
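To watch for such spikes, a monitoring script might track how much of the connection budget is in use. This is a minimal sketch, assuming the connections section of serverStatus exposes current and available counters; the helper name connectionUsage and the sample values are hypothetical.

```javascript
// Sketch only: connectionUsage() is a hypothetical helper.
// status: a serverStatus document with a connections section that is
// assumed to contain "current" and "available" counters.
function connectionUsage(status) {
  var current = status.connections.current;
  var available = status.connections.available;
  var limit = current + available; // effective ceiling seen by this instance
  return {
    current: current,
    limit: limit,
    fractionUsed: limit > 0 ? current / limit : 0
  };
}

// Made-up sample values.
var usage = connectionUsage({ connections: { current: 200, available: 600 } });
```

Comparing fractionUsed with the count in globalLock.activeClients can help distinguish a busy server from one holding many idle connections.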
Database Profiling¶
MongoDB contains a database profiling system that can help identify inefficient queries and operations. Enable the profiler by setting the profile value using the following command in the mongo shell:
db.setProfilingLevel(1)
See the documentation of db.setProfilingLevel() for more information about this command.
Note
Because the database profiler can affect performance, only enable profiling for strategic intervals and as minimally as possible on production systems.
You may enable profiling on a per-mongod basis. This setting will not propagate across a replica set or sharded cluster.
The following profiling levels are available:
Level | Setting |
---|---|
0 | Off. No profiling. |
1 | On. Only includes slow operations. |
2 | On. Includes all operations. |
See the output of the profiler in the system.profile collection of your database. You can specify the slowms setting to set a threshold above which the profiler considers operations “slow” and thus included in the level 1 profiling data. You may configure slowms at runtime, as an argument to the db.setProfilingLevel() operation.
Additionally, mongod records all “slow” queries to its log, as defined by slowms. The data in system.profile does not persist between mongod restarts.
You can view the profiler’s output by issuing the show profile command in the mongo shell, or with the following operation:
db.system.profile.find( { millis : { $gt : 100 } } )
This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (i.e. 100) is above the slowms threshold.
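Once profiler entries have been retrieved, a script can reduce them to the slowest operation per namespace. The following is a rough sketch that assumes each entry carries the ns, op, and millis fields; the helper name slowestByNamespace and the sample entries are hypothetical.

```javascript
// Sketch only: slowestByNamespace() is a hypothetical helper.
// entries: documents shaped like those in system.profile (ns, op, millis).
function slowestByNamespace(entries, thresholdMs) {
  var worst = {};
  entries.forEach(function (e) {
    if (e.millis <= thresholdMs) return; // ignore operations under the threshold
    if (!(e.ns in worst) || e.millis > worst[e.ns].millis) {
      worst[e.ns] = { op: e.op, millis: e.millis };
    }
  });
  return worst;
}

// Made-up profiler entries.
var profileEntries = [
  { ns: "app.users", op: "query", millis: 250 },
  { ns: "app.users", op: "update", millis: 120 },
  { ns: "app.orders", op: "query", millis: 40 }
];
var worstOps = slowestByNamespace(profileEntries, 100);
```

In the mongo shell, the entries could come from db.system.profile.find().toArray().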
See also
Optimization Strategies for MongoDB Applications addresses strategies that may improve the performance of your database queries and operations.
Replication and Monitoring¶
The primary administrative concern that requires monitoring with replica sets, beyond the requirements for any MongoDB instance, is “replication lag.” This refers to the amount of time that it takes a write operation on the primary to replicate to a secondary. Some very small delay period may be acceptable; however, as replication lag grows, two significant problems emerge:
- First, operations that have occurred in the period of lag are not replicated to one or more secondaries. If you’re using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
- Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. In normal circumstances this is uncommon given the typical size of the oplog, but it’s an issue to be aware of.
For causes of replication lag, see Replication Lag.
Replication issues are most often the result of network connectivity issues between members or the result of a primary that does not have the resources to support application and replication traffic. To check the status of a replica, use the replSetGetStatus command or the following helper in the shell:
rs.status()
See the Replica Set Status Reference document for a more in-depth overview of this output. In general, watch the value of optimeDate. Pay particular attention to the difference in time between the primary and the secondary members.
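The optimeDate comparison can be sketched as a function over a document shaped like the rs.status() output. It assumes each entry in the members array has name, stateStr, and (for data-bearing members) optimeDate fields; the helper name replicationLagSeconds, the host names, and the dates are hypothetical.

```javascript
// Sketch only: replicationLagSeconds() is a hypothetical helper.
// status: a document shaped like the output of rs.status().
function replicationLagSeconds(status) {
  var primary = status.members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  })[0];
  if (!primary) return null; // no primary visible, so lag is undefined

  var lag = {};
  status.members.forEach(function (m) {
    // Only secondaries are compared; arbiters hold no data.
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  });
  return lag;
}

// Made-up replica set status.
var rsStatus = {
  set: "rs0",
  members: [
    { name: "a.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2013-01-01T00:00:30Z") },
    { name: "b.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2013-01-01T00:00:18Z") },
    { name: "c.example.net:27017", stateStr: "ARBITER" }
  ]
};
var lagBySecondary = replicationLagSeconds(rsStatus);
```

On an idle primary the difference can look large even when replication is healthy, so it is worth judging the result alongside write activity.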
The size of the operation log is only configurable during the first run using the --oplogSize argument to the mongod command, or preferably the oplogSize setting in the MongoDB configuration file. If you do not specify this on the command line before running with the --replSet option, mongod will create a default sized oplog.
By default the oplog is 5% of total available disk space on 64-bit systems.
Sharding and Monitoring¶
In most cases the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB instances. Additionally, clusters require monitoring to ensure that data is effectively distributed among nodes and that sharding operations are functioning appropriately.
See also
See the Sharding page for more information.
Config Servers¶
The config database provides a map of documents to shards. The cluster updates this map as chunks move between shards. When a configuration server becomes inaccessible, some sharding operations like moving chunks and starting mongos instances become unavailable. However, clusters remain accessible from already-running mongos instances.
Because inaccessible configuration servers can have a serious impact on the availability of a sharded cluster, you should monitor the configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart.
Balancing and Chunk Distribution¶
The most effective sharded cluster deployments require that chunks are evenly balanced among the shards. MongoDB has a background balancer process that distributes data such that chunks are always optimally distributed among the shards.
Issue the db.printShardingStatus() or sh.status() command to the mongos by way of the mongo shell. This returns an overview of the entire cluster including the database name, and a list of the chunks.
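For scripted checks, chunk balance can also be estimated from chunk metadata. The sketch below assumes chunk documents shaped like those in the chunks collection of the config database, each with ns and shard fields. That layout is an assumption that may not hold for every MongoDB version, and the helper names and sample data are hypothetical.

```javascript
// Sketch only: chunksPerShard() and chunkSpread() are hypothetical helpers.
// chunks: documents assumed to resemble config.chunks entries (ns, shard).
function chunksPerShard(chunks, ns) {
  var counts = {};
  chunks.forEach(function (c) {
    if (c.ns === ns) counts[c.shard] = (counts[c.shard] || 0) + 1;
  });
  return counts; // shards holding no chunks for ns do not appear
}

// Difference between the most and least loaded shards that hold chunks.
function chunkSpread(counts) {
  var values = Object.keys(counts).map(function (k) { return counts[k]; });
  if (values.length === 0) return 0;
  return Math.max.apply(null, values) - Math.min.apply(null, values);
}

// Made-up chunk metadata.
var chunkDocs = [
  { ns: "app.users", shard: "shard0000" },
  { ns: "app.users", shard: "shard0000" },
  { ns: "app.users", shard: "shard0000" },
  { ns: "app.users", shard: "shard0001" },
  { ns: "app.orders", shard: "shard0001" }
];
var userCounts = chunksPerShard(chunkDocs, "app.users");
var userSpread = chunkSpread(userCounts);
```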
Stale Locks¶
In nearly every case, all locks used by the balancer are automatically released when they become stale. However, because any long lasting lock can block future balancing, it’s important to ensure that all locks are legitimate. To check the lock status of the database, connect to a mongos instance using the mongo shell. Issue the following command sequence to switch to the config database and display all outstanding locks on the shard database:
use config
db.locks.find()
For active deployments, the above query might return a useful result set. The balancing process, which originates on a randomly selected mongos, takes a special “balancer” lock that prevents other balancing activity from transpiring. Use the following command, also to the config database, to check the status of the “balancer” lock.
db.locks.find( { _id : "balancer" } )
If this lock exists, make sure that the balancer process is actively using this lock.
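One way to flag candidates for a closer look is to filter lock documents by age. The sketch below assumes lock documents with _id, state, and when fields, and assumes that a state of 0 means the lock is not held. The helper name longHeldLocks, the 30 minute cutoff, and the sample data are hypothetical.

```javascript
// Sketch only: longHeldLocks() and the age cutoff are hypothetical.
// locks: documents assumed to resemble config.locks entries (_id, state, when).
// Assumption: state 0 means the lock is not currently held.
function longHeldLocks(locks, now, maxAgeMinutes) {
  return locks
    .filter(function (l) {
      var ageMinutes = (now.getTime() - l.when.getTime()) / 60000;
      return l.state !== 0 && ageMinutes > maxAgeMinutes;
    })
    .map(function (l) { return l._id; });
}

// Made-up lock documents.
var checkTime = new Date("2013-01-01T12:00:00Z");
var lockDocs = [
  { _id: "balancer", state: 2, when: new Date("2013-01-01T10:00:00Z") },
  { _id: "app.users", state: 2, when: new Date("2013-01-01T11:55:00Z") },
  { _id: "app.orders", state: 0, when: new Date("2013-01-01T08:00:00Z") }
];
var suspects = longHeldLocks(lockDocs, checkTime, 30);
```

A lock on this list is not necessarily stale; it only marks where to confirm that the owning process is still active.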