The mongot process exposes Prometheus metrics that describe its runtime health and performance across core areas of operation. This reference page describes key metrics that are relevant to day-to-day monitoring and troubleshooting. For the complete metric set, scrape the mongot Prometheus metrics endpoint at http://<mongot-host>:9946/metrics.
View Metrics
To view the raw metrics that mongot exposes, send an HTTP GET request to the following mongot Prometheus metrics endpoint:
http://<mongot-host>:9946/metrics
In this endpoint:
- <mongot-host> is the hostname or IP address of the mongot process.
- 9946 is the default port for the metrics endpoint. To configure the metrics endpoint port, see the metrics.address setting in the mongot configuration file.
- /metrics is the path for the metrics endpoint.
The /metrics endpoint returns metrics in plain Prometheus text format. To monitor mongot metrics over time, configure your Prometheus instance to scrape this endpoint.
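As a rough illustration of the request described above, the following Python sketch builds the endpoint URL and reads the plain-text response. The function names, the timeout, and the comment-line filtering are assumptions made for this example. They are not part of mongot.

```python
from urllib.request import urlopen

DEFAULT_METRICS_PORT = 9946  # default metrics port stated on this page


def build_metrics_url(host: str, port: int = DEFAULT_METRICS_PORT) -> str:
    """Return the metrics endpoint URL for a mongot host (hypothetical helper)."""
    return f"http://{host}:{port}/metrics"


def fetch_metrics(host: str, port: int = DEFAULT_METRICS_PORT,
                  timeout: float = 5.0) -> str:
    """Send an HTTP GET to the endpoint and return the plain-text body."""
    with urlopen(build_metrics_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_lines(body: str) -> list[str]:
    """Drop blank lines and '#' comment lines (HELP/TYPE) from the output."""
    return [line for line in body.splitlines()
            if line.strip() and not line.startswith("#")]
```

For example, metric_lines(fetch_metrics("localhost")) would return only the sample lines, assuming a mongot process is reachable on the default port.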
Important
The /metrics endpoint requires no authentication by default. For production deployments, restrict access at the network layer.
Metric Naming Conventions
mongot metric names use a consistent naming pattern:
- All metric names start with the mongot_ prefix.
- Metric names generally follow the pattern mongot_<area>_<measurement>[_<unit>], where:
  - <area> indicates the subsystem or component the metric belongs to, such as process, jvm, replication, or index.
  - <measurement> indicates what is being measured, such as cpu_usage, heap_memory, or index_size.
  - <unit> (optional) indicates either the unit that the metric is measured in, such as seconds, bytes, or ms, or the type of counter the metric represents, such as total, events, or operations.
Note
Some metric name suffixes don't reflect the actual reported unit for the metric. For example, mongot_index_stats_query_latency_seconds has the suffix _seconds, but mongot reports the metric in milliseconds, as indicated by the timeUnit=milliseconds label in the metric output. To confirm the unit for each metric, check the Unit value in the metric reference tables below.
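The naming pattern can be sketched as a small Python helper that splits a metric name into its parts. This is a heuristic that assumes the first token after the prefix is the area and that only the suffixes listed above count as units. The helper name and the returned dictionary shape are hypothetical.

```python
# Suffixes taken from the examples on this page; the real set may be larger.
KNOWN_SUFFIXES = ("seconds", "bytes", "ms", "total", "events", "operations")


def split_metric_name(name: str) -> dict:
    """Split mongot_<area>_<measurement>[_<unit>] into its parts (heuristic)."""
    prefix = "mongot_"
    if not name.startswith(prefix):
        raise ValueError(f"not a mongot metric: {name!r}")
    # Assumption: the area is the first underscore-separated token.
    area, _, rest = name[len(prefix):].partition("_")
    unit = None
    for suffix in KNOWN_SUFFIXES:
        if rest.endswith("_" + suffix):
            rest = rest[: -len(suffix) - 1]  # strip "_<suffix>"
            unit = suffix
            break
    return {"area": area, "measurement": rest, "unit": unit}
```

Because a suffix does not always match the reported unit, treat the unit value returned here as a hint only.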
In addition to the metric name, mongot metrics can include labels (also called dimensions). Labels distinguish multiple time series that share the same base metric name. For example, a metric might use labels to identify a state, status, index type, quantile, or a specific index.
For some metrics, you must interpret the metric as the combination of the metric name and its labels, not by the metric name alone. For example, mongot_replication_mongodb_indexManagerState uses the state label to expose one series for each replication state, such as STEADY_STATE or FAILED. Exactly one of those labeled series has the value 1 at a time. Per-index metrics similarly use labels such as generationId_logString and indexId_logString to distinguish one index from another.
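To make the one-hot encoding concrete, the following Python sketch parses a few sample lines and returns the state whose series has the value 1. The sample text is hypothetical, and the regular expressions cover only the simple name{labels} value form, not the full Prometheus text format.

```python
import re

# Simplified matcher for lines such as: name{key="value",...} 1.0
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str):
    """Return (metric name, labels dict, float value) for one sample line."""
    match = SAMPLE_RE.match(line)
    if match is None:
        raise ValueError(f"unparseable sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


def active_state(text: str,
                 metric: str = "mongot_replication_mongodb_indexManagerState"):
    """Return the state label of the series whose value is 1, if any."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        name, labels, value = parse_sample(line)
        if name == metric and value == 1.0:
            return labels.get("state")
    return None


# Hypothetical scrape output, for illustration only.
SAMPLE = (
    'mongot_replication_mongodb_indexManagerState{state="STEADY_STATE"} 1.0\n'
    'mongot_replication_mongodb_indexManagerState{state="FAILED"} 0.0\n'
)
```

With the hypothetical sample above, active_state(SAMPLE) returns STEADY_STATE.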
For distribution metrics, the suffix of the metric name indicates the Prometheus series type:
- Histograms expose _bucket, _count, _sum, and _max.
- Summaries expose _count, _sum, and _max. Some summaries also include quantile labels such as {quantile="0.5"}.
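One way to apply the suffix rule above is to check which series exist for a base metric name. The Python sketch below is a minimal illustration. The function name is hypothetical, and the series sets in the usage note are example inputs, not captured output.

```python
def classify_distribution(base: str, series_names: set[str]) -> str:
    """Guess the series type of a distribution metric from its suffixes."""
    def has(suffix: str) -> bool:
        return base + suffix in series_names

    if has("_bucket"):
        # Only histograms expose _bucket series.
        return "histogram"
    if has("_count") and has("_sum"):
        return "summary"
    return "unknown"
```

For example, a base name with _bucket, _count, and _sum series is classified as a histogram, while a base name with only _count, _sum, and _max series is classified as a summary.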
Common Metric Labels
The following table describes common labels that appear in mongot metrics.
Label Name | Metric Scope | Possible Values |
|---|---|---|
| All executor pools |
|
| Cross-cutting |
|
| Most | Internal opaque IDs (the per-index identifiers that the logs use) |
| Many index metrics |
|
| Indexing and initial-sync metrics |
|
| Index size and document metrics |
|
| Latency summary metrics |
|
| Summary metrics |
|
|
|
|
|
|
|
|
|
|
Key Metric Groups
Process and JVM Metrics
Use process and JVM metrics to confirm that mongot is running normally and to identify heap or garbage collection pressure.
Process
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | seconds | The uptime of the Java Virtual Machine. |
| Gauge | Unix seconds | Start time of the process, in seconds since the Unix epoch. |
| Counter | nanoseconds | The "cpu time" used by the Java Virtual Machine process. Use |
| Gauge | 0-1 | The "recent cpu usage" for the Java Virtual Machine process. |
JVM Memory
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | bytes | The amount of used memory. Labels: |
| Gauge | bytes | The amount of memory committed for the Java virtual machine to use. |
| Gauge | bytes | The maximum memory that can be used. For heap, |
| Gauge | count | NIO buffer pool counts. Labels: |
| Gauge | bytes | Memory the JVM uses for NIO buffer pools. |
| Gauge | bytes | NIO buffer pool capacity. |
JVM Garbage Collection
Metric | Type | Unit | Description |
|---|---|---|---|
| Summary | seconds | Time spent in GC pause. No quantile labels. Use |
| Summary | seconds | Time spent in concurrent GC phase. |
| Gauge | bytes | Size of long-lived heap memory pool after reclamation. The "live heap" to watch for memory pressure. |
| Gauge | bytes | Max size of long-lived heap memory pool. |
| Counter | bytes | Increase in young heap pool size between GCs. |
| Counter | bytes | Promotions from young into old generation. |
System Metrics
Use system metrics to monitor host-level CPU, disk, memory, paging, and network conditions that can affect mongot.
CPU
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Processors available to the JVM. |
| Gauge | 0-1 | Recent system CPU usage. |
| Gauge | unitless | OS 1-minute load average. |
Disk
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | bytes | Free disk space on the |
| Gauge | bytes | Total disk space on the |
| Gauge | bytes | Free and total disk space across the file system (different scope than |
| Gauge | bytes | Bytes read from disk per device. Label: |
| Gauge | bytes | Bytes written per device. |
| Gauge | count | Read I/O count per device. Use |
| Gauge | count | Write I/O count per device. |
| Gauge | count | Disk queue length (I/Os in progress) per device. |
| Gauge | milliseconds | Time spent reading or writing per device. |
Memory
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | bytes | Total physical memory on the host. |
| Gauge | bytes | Physical memory available. |
| Gauge | bytes | Physical memory in use. |
| Gauge | bytes | Total physical and virtual memory in use. |
| Gauge | bytes | Swap state. |
| Gauge | count | Swap in/out activity. |
| Gauge | count | Number of memory mappings (relevant for Lucene mmap counts). |
| Gauge | bytes | System page size. |
Page Faults
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Major page faults. Use this metric with the storage class advisory threshold. |
| Gauge | count | Minor page faults. |
Network
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | bytes | Bytes received and sent per interface ( |
| Gauge | count | Packets received and sent. |
| Gauge | count | Error, drop, and collision counters. |
| Gauge | bits/sec | Negotiated interface speed. |
Replication Metrics
Use replication metrics to determine whether mongot is healthy, syncing normally, and staying caught up with mongod.
Overall State
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | 0/1 |
|
| Gauge | 0/1 |
|
| Counter | count | State transitions. Labels: |
Session Refresher
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Active sessions. |
| Counter | count | Total session refreshes. |
| Counter | count | Failed refreshes. |
| Summary | seconds | Refresh duration distribution. |
Optime Updater
Metric | Type | Unit | Description |
|---|---|---|---|
| Counter | count | Optime-update errors. |
| Various | N/A | Executor metrics for the optime updater. |
Per-index Metrics
mongot emits the following metrics per index and includes generationId_logString and indexId_logString labels to identify the specific index. Filter by those labels to inspect a specific index, or aggregate across labels to understand fleet-wide behavior.
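The difference between filtering by label and aggregating across labels can be sketched in Python as follows. The sample values and the idx-a, idx-b, gen-1, and gen-2 identifiers are hypothetical, and the helper names are illustrative only.

```python
# Hypothetical (labels, value) samples for one per-index metric,
# for example mongot_index_stats_indexing_replicationLagMs.
SAMPLES = [
    ({"indexId_logString": "idx-a", "generationId_logString": "gen-1"}, 120.0),
    ({"indexId_logString": "idx-b", "generationId_logString": "gen-1"}, 4500.0),
    ({"indexId_logString": "idx-b", "generationId_logString": "gen-2"}, 300.0),
]


def filter_by_index(samples, index_id):
    """Keep only the series that belong to one index."""
    return [(labels, value) for labels, value in samples
            if labels.get("indexId_logString") == index_id]


def max_by_index(samples):
    """Aggregate across the other labels: the largest value per index."""
    worst = {}
    for labels, value in samples:
        key = labels.get("indexId_logString", "")
        worst[key] = max(worst.get(key, value), value)
    return worst


def fleet_max(samples):
    """Aggregate across every label: the largest value fleet-wide."""
    return max(value for _, value in samples)
```

In PromQL terms, filter_by_index corresponds to a label matcher, max_by_index to max by (indexId_logString), and fleet_max to a plain max().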
Index Status and Size
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | 0/1 | Per-index status. One-hot encoded across the |
| Gauge | number | On-disk index format version. For example, |
| Gauge | bytes | Total on-disk size of the index. |
| Gauge | bytes | Largest single file in the index. |
| Gauge | count | Number of Lucene segment files. |
| Gauge | count | Lucene documents in the index. |
| Gauge | count | Maximum Lucene document ID (includes deleted-not-merged). |
| Gauge | count | Number of indexed Lucene fields. |
| Gauge | count | Number of Lucene segments. |
| Gauge | bytes | Estimated required memory for the index. |
Indexing Metrics
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | milliseconds | Replication lag per index, in milliseconds. The unit is in the metric name ( |
| Gauge | BSON Timestamp | Last applied replication optime (numeric encoding). |
| Gauge | BSON Timestamp | Cap on advance, set by |
| Counter | count | Indexing operation counts. Label: |
| Counter | bytes | Total bytes processed by indexing. |
| Counter | count | Vector fields indexed. |
| Summary | seconds ( | Per-index commit durations. |
| Summary | seconds | Batch duration distribution. |
| Counter | count | Oversized change-stream events. Label: |
| Counter | count | Documents rejected for invalid geometry. |
| Counter | count | Truncated sortable strings. |
| Counter | count | Exceptions during initial sync. |
| Counter | count | Steady-state exceptions. |
| Gauge | count | Consecutive initial-sync resync exceptions for this index. |
Query Metrics
Metric | Type | Unit | Description |
|---|---|---|---|
| Counter | count | Total queries issued against the index. |
| Counter | count | Total hits returned. |
| Counter | count | Queries that failed. |
| Counter | count | Specific failure-class counters. |
| Summary | seconds ( | Search batch latency. This is the headline query-latency metric. |
| Summary | seconds ( | Latency inside Lucene's TopDocs search. Use this metric to distinguish Lucene-internal latency from total latency. |
| Summary | seconds ( | Vector search result latency. |
| Summary | seconds ( | Vector search latency phases. |
| Summary | seconds ( | Facets state-refresh latency. |
| Summary | bytes / count | Per-batch payload size and document count. |
| Histogram | count ( | Distribution of |
| Histogram | count | Vector candidates per query, bucketed by quantization. |
| Counter / Summary | count |
|
| Counter | count | Result batches with score ties. |
| Counter | count | Queries that benefited from index sort optimization. |
| Counter | count | Limit-extraction optimizations triggered. |
| Counter | count | Phantom-searcher cleanups. |
| Summary | ratio | Deleted-document ratio in returned results. |
| Counter | count | Batches that made no forward progress. |
| Counter | count | Vector-specific counters. |
| Counter | count | Per-query-feature usage. Label: |
| Counter | count | Failed |
Per-index Replication Breakdown
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | 0/1 | The canonical replication-state signal. |
| Summary | bytes / count | Steady-state batch sizes. |
| Summary | seconds | Steady-state decoding duration. |
| Summary | seconds | Steady-state |
| Summary | - | Initial-sync change-stream phase metrics (mirrors steady state). |
| Summary | - | Initial-sync collection-scan phase metrics. |
Lucene Refresh Latency
Metric | Type | Unit | Description |
|---|---|---|---|
| Summary | seconds ( | Lucene IndexReader refresh latency. |
Command Metrics
mongot accepts the following set of named commands from mongod:
- buildinfo
- getMore
- hello
- isMaster
- ismaster
- killCursors
- manageSearchIndex
- ping
- planShardedSearch
- search
- vectorSearch
For each command, mongot exposes the following metrics, where <name> is a placeholder for the command name:
Pattern | Type | Description |
|---|---|---|
| Counter | Failure count for the command. |
| Summary | End-to-end latency including serialization. |
| Summary | Serialization latency (subset; not all commands). |
Tip
Monitor Search and Vector Search Latency Across Indexes
mongot_command_searchCommandTotalLatency_seconds and mongot_command_vectorSearchCommandTotalLatency_seconds are the primary metrics for monitoring $search and $vectorSearch latency. Each metric aggregates latency for all search or vectorSearch commands across all indexes.
Indexing Scheduler and Dispatcher Metrics
Use indexing scheduler and dispatcher metrics to identify backlog, saturation, and slow work in replication and indexing pipelines.
Steady-state Change-stream Pipeline
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Batches currently being applied. Label: |
| Summary | seconds | Duration distribution for in-flight batches. |
| Summary | seconds |
|
| Gauge | count |
|
| Gauge | count | Scheduled |
| Summary | seconds |
|
| Summary | seconds | Pre-processing duration per batch. |
| Counter | count | Total change-stream events observed. Label: |
| Counter | count | Events that |
| Gauge | 0/1 | Dispatcher status. Labels: |
| Counter | count | Events skipped due to missing metadata. |
| Counter | count | Unexpected batch failures. |
| Counter | count | Rescheduled embedding getMores. This metric is only available when you configure Automated Embedding. |
| Counter | count | Failed change-stream mode sampling attempts. |
Indexing Work Scheduler
Indexing work scheduler metrics monitor the queueing and execution of indexing batches.
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Scheduler queue depth. |
| Counter | count | Enqueue and dequeue counts. |
| Summary | count | Distribution of batch sizes. Label: |
| Summary | seconds | Batch durations. |
| Summary | seconds | Scheduling overhead. |
Decoding Work Scheduler
Decoding work scheduler metrics monitor the queueing and execution of change-stream batch decoding.
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Scheduler queue depth. |
| Counter | count | Enqueue and dequeue counts. |
| Summary | count | Distribution of batch sizes. Label: |
| Summary | seconds | Batch durations. |
| Summary | seconds | Scheduling overhead. |
Initial Sync, Lifecycle, and Config Metrics
Use these metrics to track index startup work, recovery, and catalog state.
Initial Sync
Note
Some mongot metrics are phase-specific and populate only when the corresponding code path is active. For example, steady-state replication metrics, such as mongot_index_stats_indexing_replicationLagMs and the mongot_index_stats_replication_steadyState_* series, do not populate while an index is in initial sync. Conversely, initial-sync-specific metrics, such as mongot_initialsync_* and mongot_index_stats_replication_initialSync_*, are only relevant while initial sync is running or has run.
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Queued initial syncs. Label: |
| Counter | count | Embedding initial syncs that were requeued. This metric is only available when you configure Automated Embedding. |
| Gauge | count | Initial syncs currently in progress. Label: |
| Gauge | count | Initial syncs queued at the dispatcher. |
| Gauge | count | In-progress syncs that resumed from a checkpoint. |
| Gauge | 0/1 | Active collection-scan mode. Label: |
| Summary | seconds | Completed sync duration distribution. |
| Summary | seconds | Ongoing sync duration. |
| Gauge | seconds | Min, max, and sum of in-progress initial sync durations. |
| Counter | count | Indexes dropped because their on-disk segments could not be read. |
| Counter | count | Indexes recovered after unreadable segments. Label: |
Lifecycle
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Indexes currently in the initialized state. |
| Summary | seconds | Initialization durations. |
| Counter | count | Index downloads that failed. |
| Counter | count | Index drops that failed. |
| Counter | count | Index initializations that failed. |
Config State
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Indexes currently in the catalog. Labels: |
| Gauge | count | Indexes being phased out. |
| Gauge | count | Staged but not yet active indexes. |
| Gauge | count | Feature-version-4-specific equivalents. |
Cursors and Index Factory
Use these metrics to monitor open cursor state and to detect indexes that mongot dropped or recovered because their on-disk segments were unreadable.
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Currently tracked open cursors. |
| Counter | count | Indexes dropped because their segments were unreadable. |
| Counter | count | Recoveries after unreadable segments. Label: |
Lucene Merge
Use these metrics to monitor Lucene segment merge activity, including the number and size of merges in progress, merge input and output sizes, merge durations, and merges discarded by the disk-utilization merge policy.
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Active merges. Label: |
| Gauge | count | Documents currently being merged. |
| Counter | count | Total merges executed since startup. |
| Counter | count | Segments folded by merges. |
| Summary | bytes | Distribution of merge input sizes. |
| Summary | bytes | Distribution of merge output sizes. |
| Summary | count | Documents-per-merge distribution. |
| Summary | seconds | Merge duration distribution. |
| Counter | count | Merges discarded by the disk-utilization-aware policy. |
MongoDB Client Connection Pool Metrics
mongot opens multiple named connection pools to mongod, and labels each pool with a clientName label that identifies the role of each pool. The following table lists possible clientName label values and their corresponding role:
clientName | Purpose |
|---|---|
| Steady-state change-stream replication. |
| Initial sync and session refresh. The |
| Internal metadata service. |
| Optime polling. |
| Database metadata lookups. |
| Server-info lookups. |
| Lease manager. |
| Automated embedding writes. This connection pool only appears when you configure Automated Embedding. |
The following table lists the available metrics for mongot connection pools:
Metric | Type | Unit | Description |
|---|---|---|---|
| Gauge | count | Currently open connections in the pool. |
| Gauge | count | Connections currently checked out. |
| Gauge | count | Configured max pool size. |
| Gauge | count | Configured min pool size. |
| Counter | count | Successful native OpenSSL link attempts. |
| Counter | count | Failed native OpenSSL link attempts. |
Synonyms
Use these metrics to monitor synonym synchronization activity, including collection scans, scan and sync durations, queue depth, and exceptions encountered during synonym sync.
Metric | Type | Unit | Description |
|---|---|---|---|
| Counter | count | Total collection scans performed for synonyms. |
| Counter | count | Synonym scans triggered by change-stream events. |
| Summary | seconds | Scan duration distribution. |
| Summary | seconds | Sync duration distribution. |
| Gauge | count | Current synonym sync queue depth. |
| Counter | count | Synonym sync exceptions. |
Executor Pools
Use these metrics to monitor the named executor pools that mongot uses to run background work. Each pool exposes the same set of sub-metrics, prefixed with the pool name, so you can track thread activity, pool sizing, queue depth, task throughput, and per-task execution time across all pools.
The following table lists the sub-metrics that every executor pool exposes, where <pool> is the pool name prefix. All executor-pool sub-metrics carry the label name="executorMetrics".
Sub-Metric Suffix | Type | Description |
|---|---|---|
| Gauge | Threads currently executing tasks. |
| Gauge | Pool sizing. |
| Gauge | Tasks waiting for a thread. This is the saturation signal. |
| Gauge | Remaining queue capacity. |
| Counter | Tasks completed since startup. |
| Summary | Time threads spent idle between tasks. |
| Summary | Per-task execution time. |
| Counter | Scheduled task counts (for scheduling pools). |
The following table lists the prefixes for all available named executor pools and their respective purposes:
Executor Pool Prefix | What it runs |
|---|---|
| Blob-store lifecycle work. |
| Blocking gRPC server worker threads. |
| Change-stream mode selection. |
| Change-stream sync dispatching (one of the busiest in steady state). |
| Config-monitor polling. |
| Decoding pipeline workers. |
| Disk-monitor polling. |
| gRPC health check timer. |
| Idle cursor reaping. |
| Index commit operations. |
| Per-index lifecycle work. |
| Lucene IndexReader refreshes. |
| Indexing pipeline workers (the busiest indexing pool in steady state). |
| Indexing-lifecycle work. |
| Automated-embedding indexing path. This executor pool only appears when you configure Automated Embedding. |
| Init-time lifecycle work. |
| Materialized-view tracking and lifecycle. These metrics are only available when Automated Embedding or other materialized-view-backed features are configured. |
| Optime updater (background). |
| Session refresher. |
| System metrics updater. |
Tip
Watch Saturation Across All Executor Pools
To monitor saturation across all executor pools, run the following PromQL query:
max by (pool) (
  label_replace(
    {__name__=~"mongot_.+_executor_queued_tasks"},
    "pool", "$1", "__name__", "mongot_(.+)_executor_queued_tasks"
  )
)
This query returns the queued-task count for each executor pool.
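For readers less familiar with label_replace, the following Python sketch mirrors what the query does: it extracts the pool name from each matching metric name with the same regular expression and keeps the largest queued-task value per pool. The poolA and poolB names in the usage note are placeholders, not real mongot pool prefixes.

```python
import re

# Same pattern the PromQL query passes to label_replace.
QUEUED_RE = re.compile(r"mongot_(.+)_executor_queued_tasks")


def queued_tasks_by_pool(samples):
    """Map each pool name to its largest queued-task value.

    samples is an iterable of (metric_name, value) pairs.
    """
    result = {}
    for name, value in samples:
        match = QUEUED_RE.fullmatch(name)  # PromQL regexes are fully anchored
        if match is None:
            continue  # not an executor queued-tasks series
        pool = match.group(1)
        result[pool] = max(result.get(pool, value), value)
    return result
```

For example, passing two poolA samples with values 3 and 7 and one poolB sample with value 0 yields 7 for poolA and 0 for poolB.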
Prometheus Server Self-Metrics
The following metric is available for the embedded Prometheus server in mongot:
Metric | Type | Unit | Description |
|---|---|---|---|
| Summary | seconds ( | How long |
Configuration-Specific Metrics
The following metric families appear in the /metrics output only when you enable specific features.
Metric Family | Description | Availability in Self-Managed mongot |
|---|---|---|
| Metrics related to Automated Embedding. For example, | Appear only when you configure Automated Embedding. |
| Failure count for the FTDC executor. | Appears only when you enable the |
PromQL Examples
Most latency metrics in this catalog are summaries, not histograms, so use their published quantile labels directly when they exist. A smaller number of metrics, such as mongot_index_stats_query_limitPerQuery and mongot_index_stats_query_numCandidatesPerQuery, are histograms and expose _bucket series.
# Replication state
max by (state) (mongot_replication_mongodb_indexManagerState == 1)

# Maximum replication lag across all indexes, converted to seconds
max(mongot_index_stats_indexing_replicationLagMs) / 1000

# Index count by status
count by (status) (mongot_index_stats_indexStatusCode == 1)

# Search query p99 latency across all indexes
max(mongot_index_stats_query_searchResultBatchLatencies_seconds{quantile="0.99"})

# Worst recent GC pause
max(mongot_jvm_gc_pause_seconds_max)

# Average GC pause over 5 minutes
rate(mongot_jvm_gc_pause_seconds_sum[5m]) / rate(mongot_jvm_gc_pause_seconds_count[5m])

# Free disk space on dataPath as a fraction (multiply by 100 for a percentage)
mongot_system_disk_space_data_path_free_bytes / mongot_system_disk_space_data_path_total_bytes

# Major page fault rate
rate(mongot_system_process_majorPageFaults_operations[5m])

# Steady-state and initial sync exceptions over 15 minutes
sum(rate(mongot_index_stats_indexing_steadyStateExceptions_total[15m]))
sum(rate(mongot_index_stats_indexing_initialSyncExceptions_total[15m]))
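The average GC pause query divides the rate of _sum by the rate of _count. Because both rates cover the same window, the result is roughly the change in _sum divided by the change in _count. The Python sketch below shows that arithmetic for two scrapes. It ignores counter resets and the extrapolation Prometheus applies, and the function names and the numbers in the usage note are hypothetical.

```python
def average_per_event(sum_start: float, count_start: float,
                      sum_end: float, count_end: float):
    """Average value per event between two scrapes of a summary.

    Returns None when no events occurred in the window.
    """
    delta_count = count_end - count_start
    if delta_count <= 0:
        return None  # no new events (or a counter reset) in the window
    return (sum_end - sum_start) / delta_count


def lag_ms_to_seconds(lag_ms: float) -> float:
    """Convert a replicationLagMs reading to seconds, as the lag query does."""
    return lag_ms / 1000
```

For example, if _sum grows from 10.0 to 13.0 seconds while _count grows from 100 to 130, the average pause over the window is 3.0 / 30, or 0.1 seconds.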