Integrate with Prometheus
- OAuth 2.0 authentication for programmatic access to Cloud Manager is available as a Preview feature.
- The feature and the corresponding documentation might change at any time during the Preview period. To use OAuth 2.0 authentication, create a service account to use in your requests to the Cloud Manager Public API.
Prometheus collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts when it observes specific conditions.
Our integration allows you to configure Cloud Manager to send metric data about your deployment to your Prometheus instance.
Prerequisites
Prometheus integration is available in Automation-managed clusters that use MongoDB Agent 12.0.15.7646 or later. MongoDB Agent 12.0.15.7646 is released with Cloud Manager 6.0.7.
Have a working Prometheus instance. To set up a working instance, see the Prometheus Installation Guide.
(Optional) Use Grafana to visualize your Prometheus metrics.
Procedure
To integrate Cloud Manager with Prometheus:
In MongoDB Cloud Manager, go to the Project Integrations page.
If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it's not already displayed, select your desired project from the Projects menu in the navigation bar.
Next to the Projects menu, expand the Options menu, then click Integrations.
The Project Integrations page displays.
Enter your preferred username and password.
Prometheus authentication credentials are designed specifically for the Prometheus integration in Cloud Manager. Cloud Manager uses these credentials only to access the Prometheus discovery endpoint and to scrape Prometheus metrics from Cloud Manager nodes. The credentials carry no permissions beyond accessing and collecting monitoring data.
Important
Copy your username and password to a secure location. You can't access the password after you leave this screen.
(Optional) Encrypt all Prometheus metrics.
If you enable this setting, Cloud Manager ensures that your Prometheus instance uses https to scrape metrics.
Field | Description |
---|---|
TLS Certificate Key File Path | You are responsible for the following: |
TLS Certificate Key File Password | Required if the certificate key file is encrypted. |
Select your preferred service discovery method.
Discovery Method | Description |
---|---|
HTTP Service Discovery | This method requires Prometheus v2.28 or later. It generates the scrape_config part of your configuration file to discover targets over an HTTP endpoint. |
File Service Discovery | This method allows Prometheus to read YAML or JSON documents to configure the targets to scrape from. You are responsible for providing the targets by making a request to the Discovery API and storing its results in a file. To make the request, substitute the placeholder text in one of the following tabs or create your own script in another language. If you need to install the To learn more about the Discovery API, see Return the Latest Targets for Prometheus. |
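For the File Service Discovery method, the request-and-store step might look like the following Python sketch. It is not the official script: the URL pattern is taken from the Example Configurations on this page, HTTP basic authentication with your Prometheus integration credentials is an assumption, and the function names and output path are hypothetical.

```python
import base64
import json
import os
import tempfile
import urllib.request

# URL pattern taken from the example configuration on this page.
DISCOVERY_URL = "https://cloud.mongodb.com/prometheus/v1.0/groups/{group_id}/discovery"


def build_discovery_request(group_id, username, password):
    """Build a GET request for the discovery endpoint (basic auth is an assumption)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(DISCOVERY_URL.format(group_id=group_id))
    request.add_header("Authorization", f"Basic {token}")
    return request


def write_targets(targets, path):
    """Write the targets atomically so Prometheus never reads a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(targets, handle, indent=2)
    # mkstemp creates the file with mode 0600; widen it so the Prometheus user can read it.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)


def refresh_targets(group_id, username, password, path):
    """Fetch the latest targets and store them where file_sd_configs points."""
    request = build_discovery_request(group_id, username, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        targets = json.load(response)
    write_targets(targets, path)
```

You could call `refresh_targets` on a schedule, for example from cron, with the same file path that you list under `file_sd_configs`. Writing to a temporary file and renaming it avoids exposing a partially written file to Prometheus.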
View Your Cluster Metrics on Prometheus.
Copy the generated snippet into the scrape_configs section of your configuration file and substitute the placeholder text.
For an example of the configuration file in either method, see Example Configurations.
Restart your Prometheus instance.
In your Prometheus instance, click Status in the top navigation bar, and then click Targets to see the metrics of your deployment.
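If you prefer to check target health from a script instead of the Targets page, the Prometheus HTTP API exposes the same information at `/api/v1/targets`. The following is a minimal sketch: the server address `http://localhost:9090` and the function names are assumptions, and the job name must match the `job_name` in your own configuration.

```python
import json
import urllib.request


def summarize_target_health(payload, job_name):
    """Return {scrape_url: health} for the active targets of one scrape job."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected API status: {payload.get('status')!r}")
    summary = {}
    for target in payload["data"]["activeTargets"]:
        if target["labels"].get("job") == job_name:
            summary[target["scrapeUrl"]] = target["health"]
    return summary


def fetch_targets(prometheus_url="http://localhost:9090"):
    """Call GET /api/v1/targets on a Prometheus server (the address is an assumption)."""
    with urllib.request.urlopen(f"{prometheus_url}/api/v1/targets", timeout=10) as response:
        return json.load(response)
```

For example, `summarize_target_health(fetch_targets(), "CM-Testing-mongo-metrics")` returns one entry per discovered target, each with a health value such as `up` or `down`.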
Example Configurations
The following shows examples of the configuration file when you use the HTTP Service Discovery or File Service Discovery method.
The configuration file in both methods contains the following fields:
Field | Description |
---|---|
scrape_interval | Time that indicates how frequently to scrape targets. This setting supports a minimum time of 10s. |
job_name | Human-readable label assigned to scraped metrics. |
metrics_path | HTTP resource path that indicates where to fetch metrics from targets. |
scheme | Your Prometheus protocol scheme configured for requests, either http or https. If you configure https, you must specify tlsPemPath. |
basic_auth | Authorization header to use on every scrape request. |
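Because scrape_interval supports a minimum of 10s, it can help to check a value before you restart Prometheus. The following sketch parses Prometheus-style duration strings such as `10s` or `1m30s` and compares them against that minimum. The function names are hypothetical, and the duration grammar reflects my understanding of the Prometheus configuration format rather than anything stated on this page.

```python
import re

# Units must appear in this order: y, w, d, h, m, s, ms.
_DURATION = re.compile(
    r"^(?:(\d+)y)?(?:(\d+)w)?(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?(?:(\d+)ms)?$"
)
_UNIT_MILLISECONDS = (
    365 * 86_400_000,  # y
    7 * 86_400_000,    # w
    86_400_000,        # d
    3_600_000,         # h
    60_000,            # m
    1_000,             # s
    1,                 # ms
)


def parse_duration_ms(text):
    """Convert a duration string such as '1m30s' to milliseconds."""
    if text == "0":
        return 0
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    return sum(
        int(value) * unit
        for value, unit in zip(match.groups(), _UNIT_MILLISECONDS)
        if value is not None
    )


def meets_minimum_interval(scrape_interval, minimum="10s"):
    """Return True if scrape_interval is at least the documented minimum."""
    return parse_duration_ms(scrape_interval) >= parse_duration_ms(minimum)
```

With these helpers, `meets_minimum_interval("15s")` is true and `meets_minimum_interval("5s")` is false.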
HTTP Service Discovery
The HTTP Service Discovery method also contains the http_sd_configs field with the following sub-fields:
Field | Description |
---|---|
url | URL from which Prometheus fetches the targets. |
refresh_interval | Time that indicates when to re-query the endpoint. |
basic_auth | Credentials to use for authenticating to the API server. |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: "CM-Testing-mongo-metrics"
        scrape_interval: 10s
        metrics_path: /metrics
        scheme: https
        basic_auth:
          username: prom_user_61e6e34e93eac1632d39f457
          password: V7hTyLfkjwiWQbv
        http_sd_configs:
          - url: https://cloud.mongodb.com/prometheus/v1.0/groups/61e6e34e93eac1632d39f457/discovery
            refresh_interval: 60s
            basic_auth:
              username: prom_user_61e6e34e93eac1632d39f457
              password: V7hTyLfkjwiWQbv
File Service Discovery
The File Service Discovery method also contains the file_sd_configs field with the following sub-field:
Field | Description |
---|---|
files | List that contains the files from which to extract the metrics scraping targets. |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: "CM-Testing-mongo-metrics"
        scrape_interval: 10s
        metrics_path: /metrics
        scheme: https
        basic_auth:
          username: prom_user_61e6e34e93eac1632d39f457
          password: V7hTyLfkjwiWQbv
        file_sd_configs:
          - files:
              - /usr/local/etc/targets.json
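With File Service Discovery, Prometheus expects a JSON targets file to hold a list of target groups, each with a `targets` list of strings and an optional `labels` object whose values are strings. The following sketch checks that shape before you point `file_sd_configs` at a file. The function name is hypothetical, and the sketch assumes that the file contains JSON rather than YAML.

```python
import json


def find_target_group_problems(document):
    """Return a list of human-readable problems; an empty list means the shape is valid."""
    if not isinstance(document, list):
        return ["top level must be a JSON list of target groups"]
    problems = []
    for index, group in enumerate(document):
        if not isinstance(group, dict):
            problems.append(f"group {index}: must be an object")
            continue
        targets = group.get("targets")
        if not isinstance(targets, list) or not all(isinstance(t, str) for t in targets):
            problems.append(f"group {index}: 'targets' must be a list of strings")
        labels = group.get("labels", {})
        if not isinstance(labels, dict) or not all(isinstance(v, str) for v in labels.values()):
            problems.append(f"group {index}: 'labels' must map names to string values")
    return problems


def check_targets_file(path):
    """Load a targets file and report any structural problems."""
    with open(path, encoding="utf-8") as handle:
        return find_target_group_problems(json.load(handle))
```

For example, `check_targets_file("/usr/local/etc/targets.json")` returns an empty list when the file has the expected structure.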
Performance Metrics Available to Prometheus
The following metrics are available when you use the Prometheus integration with your Cloud Manager deployment:
serverStatus metrics
replSetStatus metrics
MongoDB Metric Labels
Each MongoDB metric contains the following labels:
Label | Description |
---|---|
group_id | Unique hexadecimal digit string that identifies the project. |
org_id | Unique hexadecimal digit string that identifies the organization. |
cl_role | Human-readable label that defines the cluster role. |
cl_name | Human-readable label that identifies the cluster. |
rs_nm | Human-readable label that identifies the replica set. |
rs_state | Number that indicates the replica set state. |
process_port | Port on which the process runs. |
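These labels let you narrow a query to one cluster, replica set, or process. As an illustration, the following hypothetical helper builds a PromQL selector from label values and escapes them so that quotes and backslashes do not break the query. The metric name and label values in the usage note are examples only.

```python
def build_selector(metric, **labels):
    """Build a PromQL selector such as metric{label="value"} with escaped values."""

    def escape(value):
        # PromQL string literals use backslash escapes for \, " and newlines.
        return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    matchers = ",".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return f"{metric}{{{matchers}}}" if matchers else metric
```

For example, `build_selector("mongodb_info", cl_name="Cluster0", rs_nm="rs0")` produces `mongodb_info{cl_name="Cluster0",rs_nm="rs0"}`, which you can paste into the Prometheus expression browser.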
MongoDB Information Metrics
mongodb_info is a gauge that always has the value of 1. This metric contains all the MongoDB Metric Labels and also the following labels:
Label | Description |
---|---|
mongodb_version | String that represents the major, minor, and patch versions. |
replica_state_name | String that indicates the replica set member status. |
process_type | String that indicates the process running. Its values can be mongod, mongos, or config. |
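Because mongodb_info always has the value 1, its usefulness lies in its labels. The following sketch groups processes by version from the JSON that the Prometheus HTTP API returns for an instant query on `mongodb_info` (`GET /api/v1/query?query=mongodb_info`). The function name is hypothetical, and the version strings in the usage note are made-up examples.

```python
from collections import defaultdict


def processes_by_version(query_response):
    """Map each mongodb_version to a sorted list of (process_type, process_port)."""
    versions = defaultdict(list)
    for sample in query_response["data"]["result"]:
        labels = sample["metric"]
        versions[labels["mongodb_version"]].append(
            (labels.get("process_type", ""), labels.get("process_port", ""))
        )
    return {version: sorted(members) for version, members in versions.items()}
```

A result with more than one key, for example both `6.0.5` and `5.0.9`, indicates that the deployment runs mixed versions, which can be useful to confirm during a rolling upgrade.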
Hardware Metrics
Note
You can also view descriptions of each hardware metric in the Prometheus expression browser.
Name | Operating System | Type | Description |
---|---|---|---|
hardware_system_cpu_nice | Unix, Darwin | Counter | Time spent in user mode with low priority. |
hardware_system_cpu_io_wait | Unix | Counter | Time waiting for I/O to complete. |
hardware_system_cpu_irq | Unix | Counter | Time spent servicing interrupts. |
hardware_system_cpu_soft_irq | Unix | Counter | Time spent servicing softirqs. |
hardware_system_cpu_steal | Unix | Counter | Time spent in other operating systems when running in a virtual environment. |
hardware_system_cpu_guest | Unix | Counter | Time spent running a virtual CPU for the guest operating systems under the control of the Linux kernel. |
hardware_system_cpu_guest_nice | Unix | Counter | Time spent running a guest with an adjusted niceness. |
hardware_system_cpu_kernel_milliseconds | All | Counter | Time spent in system mode. |
hardware_system_cpu_user_milliseconds | All | Counter | Time spent in user mode. |
hardware_disk_metrics_weighted_time_io | Unix | Counter | Weighted time spent doing I/Os. |
hardware_disk_metrics_physical_write_count | Unix | Counter | Number of physical write I/Os processed. |
hardware_disk_metrics_physical_read_count | Unix | Counter | Number of physical read I/Os processed. |
hardware_disk_metrics_total_time | Unix | Counter | Total time this block device is active. |
hardware_disk_metrics_idle_time | Windows | Counter | Time spent in the idle task. |
hardware_disk_metrics_disk_space_free_bytes | All | Gauge | Disk space available in the mounted file system. |
hardware_disk_metrics_disk_space_used_bytes | All | Gauge | Disk space used in the mounted file system. |
hardware_disk_metrics_read_count | All | Counter | Number of read I/Os processed. |
hardware_disk_metrics_read_time_milliseconds | All | Counter | Total wait time for read requests. |
hardware_disk_metrics_write_count | All | Counter | Number of write I/Os processed. |
hardware_disk_metrics_write_time_milliseconds | All | Counter | Total wait time for write requests. |
hardware_process_cpu_children_user | Unix | Counter | Amount of time scheduled in user mode for this process to wait for children. |
hardware_process_cpu_children_kernel | Unix | Counter | Amount of time scheduled in kernel mode for this process to wait for children. |
hardware_process_cpu_kernel_milliseconds | All | Counter | Amount of time scheduled in kernel mode for this process. |
hardware_process_cpu_user_milliseconds | All | Counter | Amount of time scheduled in user mode for this process. |
hardware_system_vm_page_swap_in | Unix | Counter | Number of pages the system has swapped in from disk. |
hardware_system_vm_page_swap_out | Unix | Counter | Number of pages the system has swapped out to disk. |
hardware_system_memory_mem_total | Unix | Gauge | Total usable RAM (physical RAM minus a few reserved bits and the kernel binary code). |
hardware_system_memory_mem_free | Unix | Gauge | Sum of LowFree + HighFree. |
hardware_system_memory_mem_available | Unix | Gauge | An estimate of how much memory is available for starting new applications, without swapping. |
hardware_system_memory_buffers | Unix | Gauge | Temporary storage for raw disk blocks that shouldn't get tremendously large. |
hardware_system_memory_cached | Unix | Gauge | In-memory cache for files read from the disk. This doesn't include SwapCached. |
hardware_system_memory_swap_total | Unix | Gauge | Total amount of swap space available. |
hardware_system_memory_swap_free | Unix | Gauge | Total amount of swap space unused. |
hardware_system_memory_shared_mem | Unix | Gauge | Amount of memory consumed in file systems whose contents reside in virtual memory. |
hardware_system_memory_swap_free_kilobytes | All | Gauge | Total amount of swap space unused. |
hardware_system_memory_swap_total_kilobytes | All | Gauge | Total amount of swap space available. |
hardware_platform_num_logical_cpus | All | Gauge | Number of logical CPUs usable by the current process. |
hardware_system_network_eth0_bytes_in_bytes | All | Counter | Number of bytes of data received by the interface. |
hardware_system_network_eth0_bytes_out_bytes | All | Counter | Number of bytes of data transmitted by the interface. |
hardware_system_network_lo_bytes_in_bytes | All | Counter | Number of bytes of data received by the interface. |
hardware_system_network_lo_bytes_out_bytes | All | Counter | Number of bytes of data transmitted by the interface. |
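The Type column matters when you interpret these metrics: a Counter only increases, so you usually look at its rate of change, while a Gauge is read directly. The following sketch shows both cases with hypothetical helper functions. It assumes that used plus free disk space approximates total capacity, which ignores any space the file system reserves.

```python
def counter_rate(previous, current, elapsed_seconds):
    """Per-second increase between two counter samples, treating a drop as a reset."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # A counter that goes down has been reset (for example, after a restart),
    # so the current value is taken as the increase since the reset.
    increase = current - previous if current >= previous else current
    return increase / elapsed_seconds


def disk_used_percent(used_bytes, free_bytes):
    """Percentage of disk in use, computed from the two disk space gauges."""
    total = used_bytes + free_bytes
    if total == 0:
        return 0.0
    return 100.0 * used_bytes / total
```

For example, two samples of hardware_disk_metrics_read_count taken 60 seconds apart can be passed to `counter_rate` to estimate reads per second, and the values of hardware_disk_metrics_disk_space_used_bytes and hardware_disk_metrics_disk_space_free_bytes can be passed to `disk_used_percent`. In Prometheus itself, the `rate()` function performs the counter calculation for you.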
Hardware Metric Labels
Each hardware metric contains the following labels:
Label | Description |
---|---|
group_id | Unique hexadecimal digit string that identifies the project. |
org_id | Unique hexadecimal digit string that identifies the organization. |
process_port | Port on which the process runs. |
disk_name | Human-readable label that identifies the disk. |