
Integrate with Prometheus

On this page

  • Prerequisites
  • Procedure
  • Example Configurations
  • Performance Metrics Available to Prometheus
  • MongoDB Metric Labels
  • MongoDB Information Metrics
  • Hardware Metrics
  • Hardware Metric Labels

Prometheus collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts when it observes specific conditions.

Our integration allows you to configure Cloud Manager to send metric data about your deployment to your Prometheus instance.

Prerequisites

  • Prometheus integration is available in automation-managed clusters that use MongoDB Agent 12.0.15.7646 or later. MongoDB Agent 12.0.15.7646 was released with Cloud Manager 6.0.7.

  • Have a working Prometheus instance. To set one up, see the Prometheus Installation Guide.

  • (Optional) Use Grafana to visualize your Prometheus metrics.

Procedure

To integrate Cloud Manager with Prometheus:

1
  1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

  2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

  3. Next to the Projects menu, expand the Options menu, then click Integrations.

    The Project Integrations page displays.

2
3

Prometheus authentication credentials are designed specifically for the Prometheus integration in Cloud Manager. Cloud Manager uses these credentials only to access the Prometheus discovery endpoint and to scrape Prometheus metrics from Cloud Manager nodes. They grant no permissions beyond accessing and collecting monitoring data.

Important

Copy your username and password to a secure location. You can't access the password after you leave this screen.

4

Tip

The default value, 0.0.0.0:9216, serves metrics on port 9216 on all IPv4 addresses of the local machine.

5

If you enable this setting, Cloud Manager ensures that your Prometheus instance uses HTTPS to scrape metrics.

| Field | Description |
| --- | --- |
| TLS Certificate Key File Path | Path to the PEM file that contains the certificate and key required to start an HTTPS Prometheus scraping endpoint. You are responsible for issuing and renewing the TLS Certificate Key File, and for checking in the automation agent logs that the endpoint started correctly. |
| TLS Certificate Key File Password | Required if the certificate key file is encrypted. |
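
Because the file must hold both the certificate and the key, a quick pre-flight check can catch a missing block before the endpoint starts. The following is a minimal sketch of a hypothetical helper (`inspect_pem` is not part of Cloud Manager or the MongoDB Agent). It only looks for the standard PEM header lines; it does not validate the certificate or decrypt the key.

```python
import re

# Hypothetical pre-flight helper; not part of Cloud Manager or the MongoDB Agent.
CERT_HEADER = re.compile(r"-----BEGIN CERTIFICATE-----")
# Matches "PRIVATE KEY" as well as "RSA PRIVATE KEY", "EC PRIVATE KEY",
# and "ENCRYPTED PRIVATE KEY".
KEY_HEADER = re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")
# PKCS#8 encrypted keys use a dedicated header; traditional OpenSSL
# encrypted keys carry a "Proc-Type: 4,ENCRYPTED" line instead.
ENCRYPTED_MARK = re.compile(r"-----BEGIN ENCRYPTED PRIVATE KEY-----|Proc-Type: 4,ENCRYPTED")


def inspect_pem(text: str) -> dict:
    """Report which PEM blocks appear in the given file contents."""
    return {
        "has_certificate": bool(CERT_HEADER.search(text)),
        "has_private_key": bool(KEY_HEADER.search(text)),
        # If True, the TLS Certificate Key File Password field is needed.
        "key_is_encrypted": bool(ENCRYPTED_MARK.search(text)),
    }
```

To use it, read the PEM file as text and pass the contents to `inspect_pem`; all three values should match what you entered in the fields above.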
6
HTTP Service Discovery

This method requires Prometheus v2.28 or later. It generates the scrape_config part of your configuration file to discover targets over an HTTP endpoint.

File Service Discovery

This method allows Prometheus to read YAML or JSON documents to configure the targets to scrape.

You are responsible for providing the targets by making a request to the Discovery API and storing its results in a targets.json file.

To make the request, substitute the placeholder text in one of the following examples or create your own script in another language.

# Requests the current scrape targets from the Discovery API.
# The --user option sends the username and password from the
# previous step as HTTP basic authentication.
# Replace {GROUP-ID} with your project ID.
curl --header 'Accept: application/json' \
     --user <username>:<password> \
     --request GET "https://cloud.mongodb.com/prometheus/v1.0/groups/{GROUP-ID}/discovery"

If you need to install the requests library, see the Requests Installation Guide.

import json
import time

import requests

# This script requests the current scrape targets from the
# Discovery API, authenticating with the configured username
# and password, and writes them to the targets.json file
# that Prometheus reads.
#
# Note: Replace <username> and <password> with the values
# from the previous step, and {GROUP-ID} with your project ID.
basic_auth_user = "<username>"
basic_auth_password = "<password>"
discovery_api_url = "https://cloud.mongodb.com/prometheus/v1.0/groups/{GROUP-ID}/discovery"

# The script updates your targets.json file every
# minute, if it successfully retrieves targets.
#
# Note: Replace <path-to-targets.json> with the
# path to your targets.json file.
starttime = time.time()
while True:
    r = requests.get(discovery_api_url, auth=(basic_auth_user, basic_auth_password))
    if r.status_code == 200:
        with open('<path-to-targets.json>', 'w') as f:
            json.dump(r.json(), f)
    # Sleep until the next whole minute since the script started.
    time.sleep(60.0 - ((time.time() - starttime) % 60.0))

To learn more about the Discovery API, see Return the Latest Targets for Prometheus.
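
If Prometheus reads targets.json while a script is still writing it, it can see a truncated document. One way to reduce that risk is to validate the response and then replace the file in a single rename. The following is a minimal sketch under two assumptions: the Discovery API returns the standard Prometheus target-group list (a JSON list of objects with a `targets` list and an optional `labels` map), and the helper names `is_target_group_list` and `write_targets_atomically` are hypothetical.

```python
import json
import os
import tempfile


def is_target_group_list(doc) -> bool:
    """Check the shape Prometheus expects: a list of {"targets": [...], "labels": {...}}."""
    if not isinstance(doc, list):
        return False
    for group in doc:
        if not isinstance(group, dict):
            return False
        targets = group.get("targets")
        if not isinstance(targets, list) or not all(isinstance(t, str) for t in targets):
            return False
        if not isinstance(group.get("labels", {}), dict):
            return False
    return True


def write_targets_atomically(doc, path: str) -> None:
    """Write to a temporary file in the same directory, then rename it over the target."""
    if not is_target_group_list(doc):
        raise ValueError("response is not a list of target groups")
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(doc, f)
        # mkstemp creates the file with mode 0600; widen it so a Prometheus
        # process running as another user can read it (adjust to your policy).
        os.chmod(tmp_path, 0o644)
        # The rename is atomic on POSIX when both paths are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In the polling script, a call such as `write_targets_atomically(r.json(), '<path-to-targets.json>')` could stand in for the `open` and `json.dump` lines.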

7
8
  1. Copy the generated snippet into the scrape_configs section of your configuration file and substitute the placeholder text.

    For an example of the configuration file in either method, see Example Configurations.

  2. Restart your Prometheus instance.

  3. In your Prometheus instance, click Status in the top navigation bar, and click Targets to see the metrics of your deployment.
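
The same check can be scripted against the Prometheus HTTP API, whose `/api/v1/targets` endpoint reports each active target with a `health` field. The following is a minimal sketch: `summarize_target_health` is a hypothetical helper, and the job name is whatever you set as `job_name` in your configuration file.

```python
from collections import Counter


def summarize_target_health(api_response: dict, job_name: str) -> Counter:
    """Count the active targets of one job by health ("up", "down", or "unknown")."""
    active = api_response.get("data", {}).get("activeTargets", [])
    return Counter(
        target.get("health", "unknown")
        for target in active
        if target.get("labels", {}).get("job") == job_name
    )


# Example usage, assuming Prometheus listens on its default port on this host
# and the requests library is installed:
#
#   import requests
#   response = requests.get("http://localhost:9090/api/v1/targets").json()
#   print(summarize_target_health(response, "CM-Testing-mongo-metrics"))
```

A non-zero count for "down" usually means the scrape credentials, the scheme, or the listening address needs another look.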

Example Configurations

The following shows examples of the configuration file when you use the HTTP Service Discovery or File Service Discovery method.

The configuration file in both methods contains the following fields:

| Field | Description |
| --- | --- |
| scrape_interval | Time that indicates how frequently to scrape targets. This setting supports a minimum time of 10s. |
| job_name | Human-readable label assigned to scraped metrics. |
| metrics_path | HTTP resource path that indicates where to fetch metrics from targets. |
| scheme | Protocol scheme that Prometheus uses for scrape requests, either http or https. If you configure https, you must specify tlsPemPath. |
| basic_auth | Authorization header to use on every scrape request. |
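
Because scrape_interval has a 10s minimum, it can help to check a value before you put it in the configuration file. The following is a minimal sketch that parses Prometheus-style durations such as 10s or 1m30s. `parse_duration` and `check_scrape_interval` are hypothetical helpers, not part of Prometheus or Cloud Manager.

```python
import re

# Units in the order Prometheus accepts them, expressed in milliseconds.
_UNITS_MS = (
    ("y", 365 * 24 * 3600 * 1000),
    ("w", 7 * 24 * 3600 * 1000),
    ("d", 24 * 3600 * 1000),
    ("h", 3600 * 1000),
    ("m", 60 * 1000),
    ("s", 1000),
    ("ms", 1),
)
_DURATION = re.compile("".join(rf"(?:(\d+){unit})?" for unit, _ in _UNITS_MS))


def parse_duration(text: str) -> float:
    """Convert a duration such as "15s" or "1m30s" to seconds."""
    if text == "0":
        return 0.0
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    total_ms = sum(
        int(value) * factor
        for value, (_, factor) in zip(match.groups(), _UNITS_MS)
        if value is not None
    )
    return total_ms / 1000


def check_scrape_interval(text: str, minimum_seconds: float = 10.0) -> float:
    """Return the interval in seconds, or raise if it is below the minimum."""
    seconds = parse_duration(text)
    if seconds < minimum_seconds:
        raise ValueError(f"scrape_interval {text!r} is below the {minimum_seconds:g}s minimum")
    return seconds
```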

The HTTP Service Discovery method also contains the http_sd_configs field with the following sub-fields:

| Field | Description |
| --- | --- |
| url | URL from which Prometheus fetches the targets. |
| refresh_interval | Time that indicates when to re-query the endpoint. |
| basic_auth | Credentials to use for authenticating to the API server. |

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "CM-Testing-mongo-metrics"
    scrape_interval: 10s
    metrics_path: /metrics
    scheme: https
    basic_auth:
      username: prom_user_61e6e34e93eac1632d39f457
      password: V7hTyLfkjwiWQbv
    http_sd_configs:
      - url: https://cloud.mongodb.com/prometheus/v1.0/groups/61e6e34e93eac1632d39f457/discovery
        refresh_interval: 60s
        basic_auth:
          username: prom_user_61e6e34e93eac1632d39f457
          password: V7hTyLfkjwiWQbv

The File Service Discovery method also contains the file_sd_configs field with the following sub-field:

| Field | Description |
| --- | --- |
| files | List that contains the files from which to extract the metrics scraping targets. |

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "CM-Testing-mongo-metrics"
    scrape_interval: 10s
    metrics_path: /metrics
    scheme: https
    basic_auth:
      username: prom_user_61e6e34e93eac1632d39f457
      password: V7hTyLfkjwiWQbv
    file_sd_configs:
      - files:
          - /usr/local/etc/targets.json

Performance Metrics Available to Prometheus

The following metrics are available when you use the Prometheus integration with your MongoDB deployment:

MongoDB Metric Labels

Each MongoDB metric contains the following labels:

| Label | Description |
| --- | --- |
| group_id | Unique hexadecimal digit string that identifies the project. |
| org_id | Unique hexadecimal digit string that identifies the organization. |
| cl_role | Human-readable label that defines the cluster role. |
| cl_name | Human-readable label that identifies the cluster. |
| rs_nm | Human-readable label that identifies the replica set. |
| rs_state | Number that indicates the replica set state. |
| process_port | Port on which the process runs. |

MongoDB Information Metrics

mongodb_info is a gauge that always has the value of 1. This metric contains all the MongoDB Metric Labels and also the following labels:

| Label | Description |
| --- | --- |
| mongodb_version | String that represents the major, minor, and patch versions. |
| replica_state_name | String that indicates the replica set member status. |
| process_type | String that indicates the process running. Its values can be mongod, mongos, or config. |
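
Because mongodb_info always has the value 1, its usefulness lies in its labels: counting series by label shows, for example, how many processes run each version. The following is a minimal sketch that builds an instant-query URL for the Prometheus HTTP API and tallies the result. The helper names and the version strings in the usage comment are hypothetical.

```python
from collections import Counter
from urllib.parse import urlencode


def instant_query_url(prometheus_base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus /api/v1/query endpoint."""
    return f"{prometheus_base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def processes_per_version(query_response: dict) -> Counter:
    """Tally the mongodb_info series in an instant-query response by mongodb_version."""
    series = query_response.get("data", {}).get("result", [])
    return Counter(s.get("metric", {}).get("mongodb_version", "unknown") for s in series)


# Example usage, assuming Prometheus listens on its default port on this host
# and the requests library is installed:
#
#   import requests
#   url = instant_query_url("http://localhost:9090", "mongodb_info")
#   print(processes_per_version(requests.get(url).json()))
#   # Possible shape of the output: Counter({"6.0.5": 3, "5.0.14": 1})
```

The same tally can be expressed directly in PromQL as `count by (mongodb_version) (mongodb_info)`.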

Hardware Metrics

Note

You can also view descriptions of each hardware metric in the Prometheus expression browser.

| Name | Operating System | Type | Description |
| --- | --- | --- | --- |
| hardware_system_cpu_nice | Unix, Darwin | Counter | Time spent in user mode with low priority. |
| hardware_system_cpu_io_wait | Unix | Counter | Time waiting for I/O to complete. |
| hardware_system_cpu_irq | Unix | Counter | Time spent servicing interrupts. |
| hardware_system_cpu_soft_irq | Unix | Counter | Time spent servicing softirq's. |
| hardware_system_cpu_steal | Unix | Counter | Time spent in other operating systems when running in a virtual environment. |
| hardware_system_cpu_guest | Unix | Counter | Time spent running a virtual CPU for the guest operating systems under the control of the Linux kernel. |
| hardware_system_cpu_guest_nice | Unix | Counter | Time spent running a guest with an adjusted niceness. |
| hardware_system_cpu_kernel_milliseconds | All | Counter | Time spent in system mode. |
| hardware_system_cpu_user_milliseconds | All | Counter | Time spent in user mode. |
| hardware_disk_metrics_weighted_time_io | Unix | Counter | Weighted time spent doing I/O's. |
| hardware_disk_metrics_physical_write_count | Unix | Counter | Number of physical write I/O's processed. |
| hardware_disk_metrics_physical_read_count | Unix | Counter | Number of physical read I/O's processed. |
| hardware_disk_metrics_total_time | Unix | Counter | Total time this block device is active. |
| hardware_disk_metrics_idle_time | Windows | Counter | Time spent in the idle task. |
| hardware_disk_metrics_disk_space_free_bytes | All | Gauge | Disk space available in the mounted file system. |
| hardware_disk_metrics_disk_space_used_bytes | All | Gauge | Disk space used in the mounted file system. |
| hardware_disk_metrics_read_count | All | Counter | Number of read I/O's processed. |
| hardware_disk_metrics_read_time_milliseconds | All | Counter | Total wait time for read requests. |
| hardware_disk_metrics_write_count | All | Counter | Number of write I/O's processed. |
| hardware_disk_metrics_write_time_milliseconds | All | Counter | Total wait time for write requests. |
| hardware_process_cpu_children_user | Unix | Counter | Amount of time scheduled in user mode for this process to wait for children. |
| hardware_process_cpu_children_kernel | Unix | Counter | Amount of time scheduled in kernel mode for this process to wait for children. |
| hardware_process_cpu_kernel_milliseconds | All | Counter | Amount of time scheduled in kernel mode for this process. |
| hardware_process_cpu_user_milliseconds | All | Counter | Amount of time scheduled in user mode for this process. |
| hardware_system_vm_page_swap_in | Unix | Counter | Number of pages the system has swapped in from disk. |
| hardware_system_vm_page_swap_out | Unix | Counter | Number of pages the system has swapped out to disk. |
| hardware_system_memory_mem_total | Unix | Gauge | Total usable RAM (physical RAM minus a few reserved bits and the kernel binary code). |
| hardware_system_memory_mem_free | Unix | Gauge | Sum of LowFree + HighFree. |
| hardware_system_memory_mem_available | Unix | Gauge | An estimate of how much memory is available for starting new applications, without swapping. |
| hardware_system_memory_buffers | Unix | Gauge | Temporary storage for raw disk blocks that shouldn't get tremendously large. |
| hardware_system_memory_cached | Unix | Gauge | In-memory cache for files read from the disk. This doesn't include SwapCached. |
| hardware_system_memory_swap_total | Unix | Gauge | Total amount of swap space available. |
| hardware_system_memory_swap_free | Unix | Gauge | Total amount of swap space unused. |
| hardware_system_memory_shared_mem | Unix | Gauge | Amount of memory consumed in file systems whose contents reside in virtual memory. |
| hardware_system_memory_swap_free_kilobytes | All | Gauge | Total amount of swap space unused. |
| hardware_system_memory_swap_total_kilobytes | All | Gauge | Total amount of swap space available. |
| hardware_platform_num_logical_cpus | All | Gauge | Number of logical CPUs usable by the current process. |
| hardware_system_network_eth0_bytes_in_bytes | All | Counter | Number of bytes of data received by the interface. |
| hardware_system_network_eth0_bytes_out_bytes | All | Counter | Number of bytes of data transmitted by the interface. |
| hardware_system_network_lo_bytes_in_bytes | All | Counter | Number of bytes of data received by the interface. |
| hardware_system_network_lo_bytes_out_bytes | All | Counter | Number of bytes of data transmitted by the interface. |
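
Gauges and counters are read differently: a gauge such as hardware_disk_metrics_disk_space_used_bytes is meaningful as a single sample, while a counter such as hardware_disk_metrics_read_count is meaningful only as a change over time. The following is a minimal sketch of both calculations on raw sample values. The helper names are hypothetical, and the reset handling mirrors the usual convention of treating a drop in a counter as a restart from zero.

```python
def disk_used_fraction(used_bytes: float, free_bytes: float) -> float:
    """Fraction of the file system in use, from the two disk-space gauges."""
    total = used_bytes + free_bytes
    if total <= 0:
        raise ValueError("used_bytes + free_bytes must be positive")
    return used_bytes / total


def counter_rate(prev_value: float, prev_ts: float, curr_value: float, curr_ts: float) -> float:
    """Per-second increase of a counter between two samples (timestamps in seconds)."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards, so assume it was reset to zero
        # and count only what accumulated since the reset.
        delta = curr_value
    return delta / elapsed
```

In PromQL, the first calculation corresponds to dividing the used-bytes gauge by the sum of the used-bytes and free-bytes gauges, and the second to applying `rate()` to the counter over a time range.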

Hardware Metric Labels

Each hardware metric contains the following labels:

| Label | Description |
| --- | --- |
| group_id | Unique hexadecimal digit string that identifies the project. |
| org_id | Unique hexadecimal digit string that identifies the organization. |
| process_port | Port on which the process runs. |
| disk_name | Human-readable label that identifies the disk. |
