Manage Sharded Cluster Health with Health Managers

This document describes how to use Health Managers to monitor and manage sharded cluster health issues.

A Health Manager runs health checks on a health manager facet at a specified intensity level. Health Manager checks run at specified time intervals. A Health Manager can be configured to move a failing mongos out of a cluster automatically. Progress Monitor ensures that Health Manager checks do not become stuck or unresponsive.

The following table shows the available Health Manager facets:

Facet        | What the Health Observer Checks
configServer | Cluster health issues related to connectivity to the config server.
dns          | Cluster health issues related to DNS availability and functionality.
ldap         | Cluster health issues related to LDAP availability and functionality.

The following table shows the available Health Manager intensity levels:

Intensity Level | Description
critical        | The Health Manager on this facet is enabled and has the ability to move the failing mongos out of the cluster if an error occurs. The Health Manager waits the amount of time specified by activeFaultDurationSecs before stopping and moving the mongos out of the cluster automatically.
non-critical    | The Health Manager on this facet is enabled and logs errors, but the mongos remains in the cluster if errors are encountered.
off             | The Health Manager on this facet is disabled. The mongos does not perform any health checks on this facet. This is the default intensity level.

Active Fault Duration

When a failure is detected and the Health Manager intensity level is set to critical, the Health Manager waits the amount of time specified by activeFaultDurationSecs before stopping and moving the mongos out of the cluster automatically.
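
The interaction between the intensity level and activeFaultDurationSecs can be sketched as a small decision function. This is only an illustrative model of the behavior described above, not MongoDB's implementation: the function name, the returned labels, and the choice to treat the waiting period as over once the fault age reaches the configured duration are all assumptions.

```javascript
// Illustrative model only; describeFaultOutcome and its return labels are
// hypothetical names, not part of MongoDB.
function describeFaultOutcome(intensity, faultAgeSecs, activeFaultDurationSecs) {
  switch (intensity) {
    case "off":
      // Facet disabled: the mongos runs no health checks on it.
      return "not-checked";
    case "non-critical":
      // Errors are logged, but the mongos stays in the cluster.
      return "logged";
    case "critical":
      // Assumption: the waiting period ends once the fault age reaches the limit.
      return faultAgeSecs >= activeFaultDurationSecs ? "mongos-removed" : "waiting";
    default:
      throw new Error(`Unknown intensity level: ${intensity}`);
  }
}
```

Under this model, with activeFaultDurationSecs set to 300, a critical fault that has lasted 120 seconds is still within the waiting period, while the same fault on a non-critical facet is only ever logged.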

Progress Monitor

Progress Monitor runs tests to ensure that Health Manager checks do not become stuck or unresponsive. Progress Monitor runs these tests at the interval specified by interval. If a health check begins but does not complete within the timeout given by deadline, Progress Monitor stops the mongos and removes it from the cluster.

Field    | Description | Units
interval | How often to ensure Health Managers are not stuck or unresponsive. | Milliseconds
deadline | Timeout before automatically failing the mongos if a Health Manager check is not making progress. | Seconds
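
Because interval is given in milliseconds and deadline in seconds, it is easy to mix the two up. The sketch below models the deadline test with explicit unit conversion. It is an assumption-laden illustration rather than MongoDB's code: isCheckStuck and passesPerDeadline are hypothetical helpers, and timestamps are assumed to be in milliseconds.

```javascript
// Illustrative model only; both helpers are hypothetical.
// progressMonitor mixes units: interval is in milliseconds, deadline in seconds.
function isCheckStuck(checkStartedAtMs, nowMs, progressMonitor) {
  const deadlineMs = progressMonitor.deadline * 1000; // seconds -> milliseconds
  return nowMs - checkStartedAtMs > deadlineMs;
}

// Number of Progress Monitor passes that fit inside one deadline window.
function passesPerDeadline(progressMonitor) {
  const deadlineMs = progressMonitor.deadline * 1000;
  return Math.floor(deadlineMs / progressMonitor.interval);
}
```

For example, with an interval of 1000 and a deadline of 300, the deadline window is 300,000 milliseconds, so a check that started at time 0 is not considered stuck at exactly 300,000 ms under this model.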

Examples

The following examples show how Health Managers can be configured. For information on Health Manager parameters, see Health Manager Parameters.

For example, to set the dns Health Manager facet to the critical intensity level, issue the following at startup:

mongos --setParameter 'healthMonitoringIntensities={ values:[ { type:"dns", intensity: "critical"} ] }'

Or, if using the setParameter command in a mongosh session connected to a running mongos:

db.adminCommand(
   {
      setParameter: 1,
      healthMonitoringIntensities: { values: [ { type: "dns", intensity: "critical" } ] }
   }
)

Parameters set with setParameter do not persist across restarts. See the setParameter page for details.

To make this setting persistent, set healthMonitoringIntensities in your mongos config file using the setParameter option as in the following example:

setParameter:
   healthMonitoringIntensities: "{ values:[ { type:\"dns\", intensity: \"critical\"} ] }"

healthMonitoringIntensities accepts an array of documents, values. Each document in values takes two fields:

  • type, the Health Manager facet

  • intensity, the intensity level

See healthMonitoringIntensities for details.
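
When several facets need to be set at once, the values array can be assembled programmatically. The following is a minimal sketch, assuming the facet and intensity names listed in the tables above are the only valid ones; buildIntensitiesCommand is a hypothetical helper, not a MongoDB or mongosh function.

```javascript
// Hypothetical helper: builds the setParameter command document for
// healthMonitoringIntensities from a { facet: intensity } map.
function buildIntensitiesCommand(settings) {
  // Names taken from the facet and intensity tables in this document.
  const facets = ["configServer", "dns", "ldap"];
  const intensities = ["critical", "non-critical", "off"];

  const values = Object.entries(settings).map(([type, intensity]) => {
    if (!facets.includes(type)) {
      throw new Error(`Unknown Health Manager facet: ${type}`);
    }
    if (!intensities.includes(intensity)) {
      throw new Error(`Unknown intensity level: ${intensity}`);
    }
    return { type, intensity };
  });

  return { setParameter: 1, healthMonitoringIntensities: { values } };
}

// In a mongosh session connected to a running mongos, the result could be
// passed to db.adminCommand:
// db.adminCommand(buildIntensitiesCommand({ dns: "critical", ldap: "non-critical" }))
```

Validating the names before sending the command catches typos such as "noncritical" on the client side instead of relying on the server's error message.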

For example, to set the ldap Health Manager facet to run health checks every 30 seconds, issue the following at startup:

mongos --setParameter 'healthMonitoringIntervals={ values:[ { type:"ldap", interval: "30000"} ] }'

Or, if using the setParameter command in a mongosh session connected to a running mongos:

db.adminCommand(
   {
      setParameter: 1,
      healthMonitoringIntervals: { values: [ { type: "ldap", interval: "30000" } ] }
   }
)

Parameters set with setParameter do not persist across restarts. See the setParameter page for details.

To make this setting persistent, set healthMonitoringIntervals in your mongos config file using the setParameter option as in the following example:

setParameter:
   healthMonitoringIntervals: "{ values: [{type: \"ldap\", interval: 30000}] }"

healthMonitoringIntervals accepts an array of documents, values. Each document in values takes two fields:

  • type, the Health Manager facet

  • interval, the time interval it runs at, in milliseconds

See healthMonitoringIntervals for details.
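
Since interval is expressed in milliseconds, a small conversion step helps avoid off-by-1000 mistakes. The sketch below builds the command document from intervals given in seconds. buildIntervalsCommand is a hypothetical helper, and it emits interval as a number, as in the config file example; the startup and mongosh examples above quote the value as a string, so adjust if your deployment expects that form.

```javascript
// Hypothetical helper: builds the setParameter command document for
// healthMonitoringIntervals from a { facet: seconds } map.
function buildIntervalsCommand(secondsByFacet) {
  // Facet names taken from the facet table in this document.
  const facets = ["configServer", "dns", "ldap"];

  const values = Object.entries(secondsByFacet).map(([type, seconds]) => {
    if (!facets.includes(type)) {
      throw new Error(`Unknown Health Manager facet: ${type}`);
    }
    if (!Number.isFinite(seconds) || seconds <= 0) {
      throw new Error(`Interval for ${type} must be a positive number of seconds`);
    }
    // Convert seconds to the milliseconds the parameter expects.
    return { type, interval: Math.round(seconds * 1000) };
  });

  return { setParameter: 1, healthMonitoringIntervals: { values } };
}

// In a mongosh session connected to a running mongos:
// db.adminCommand(buildIntervalsCommand({ ldap: 30 }))
```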

For example, to set the duration from failure to crash to five minutes, issue the following at startup:

mongos --setParameter activeFaultDurationSecs=300

Or, if using the setParameter command in a mongosh session connected to a running mongos:

db.adminCommand(
   {
      setParameter: 1,
      activeFaultDurationSecs: 300
   }
)

Parameters set with setParameter do not persist across restarts. See the setParameter page for details.

To make this setting persistent, set activeFaultDurationSecs in your mongos config file using the setParameter option as in the following example:

setParameter:
   activeFaultDurationSecs: 300

See activeFaultDurationSecs for details.

To set the Progress Monitor interval to 1000 milliseconds and the deadline to 300 seconds, issue the following at startup:

mongos --setParameter 'progressMonitor={"interval": 1000, "deadline": 300}'

Or, if using the setParameter command in a mongosh session connected to a running mongos:

db.adminCommand(
   {
      setParameter: 1,
      progressMonitor: { interval: 1000, deadline: 300 }
   }
)

Parameters set with setParameter do not persist across restarts. See the setParameter page for details.

To make this setting persistent, set progressMonitor in your mongos config file using the setParameter option as in the following example:

setParameter:
   progressMonitor: "{ interval: 1000, deadline: 300 }"

See progressMonitor for details.
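
If startup arguments are generated by a script, the progressMonitor value can be rendered from its two fields. This is a minimal sketch: progressMonitorStartupArg is a hypothetical helper, and the requirement that both values be positive integers is an assumption rather than a documented constraint.

```javascript
// Hypothetical helper: renders the value passed to `mongos --setParameter`
// for progressMonitor. intervalMs is in milliseconds, deadlineSecs in seconds.
function progressMonitorStartupArg(intervalMs, deadlineSecs) {
  if (!Number.isInteger(intervalMs) || intervalMs <= 0) {
    throw new Error("interval must be a positive integer (milliseconds)");
  }
  if (!Number.isInteger(deadlineSecs) || deadlineSecs <= 0) {
    throw new Error("deadline must be a positive integer (seconds)");
  }
  const doc = { interval: intervalMs, deadline: deadlineSecs };
  return "progressMonitor=" + JSON.stringify(doc);
}

// progressMonitorStartupArg(1000, 300)
// -> progressMonitor={"interval":1000,"deadline":300}
```

The returned string is intended to be single-quoted on the command line, in the same way as the startup example above.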
