View Metrics and Troubleshoot Resource Issues

The Atlas Kubernetes Operator (AKO) binary exposes standard controller-runtime metrics at http://localhost:8080/metrics. The endpoint reports the following:

  • Total number of reconciliation errors and successful reconciles per controller.

  • Length of reconcile queues per controller.

  • Reconciliation latency.

  • Standard resource metrics such as CPU, memory usage, and file descriptor usage.

  • Go runtime metrics such as the number of goroutines and GC duration.

To learn more, see Controller Metrics.

A common problem is an AtlasProject resource that is not in a Ready state. The same problem can occur with any Atlas Kubernetes Operator resource type. It shows the following symptoms:

  • The resource is not in a Ready state.

  • A high error rate.

To monitor the error rate, you can create a query to calculate the reconciliation error rate for the AtlasProject controller as a percentage over the last minute. This metric helps in identifying and monitoring the health and stability of the AtlasProject controller. A high or rising error percentage indicates issues in the reconciliation process.

To calculate the error rate, use the following Prometheus query:

100 * rate(controller_runtime_reconcile_errors_total{controller="AtlasProject"}[1m]) / rate(controller_runtime_reconcile_total{controller="AtlasProject"}[1m])
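The arithmetic behind this query can be sketched in Python. The function below is a simplified approximation, not Prometheus's actual `rate()` implementation: it takes two readings of each counter at the start and end of the same window, ignores counter resets and extrapolation, and returns the error percentage. Because both rates cover the same window, the window length cancels out of the ratio. The function name and the sample numbers are hypothetical.

```python
def error_rate_percent(errors_start, errors_end, total_start, total_end):
    """Approximate the error percentage over one window.

    Each argument is a raw counter reading. Returns None when no
    reconciliations happened in the window, where the PromQL division
    would have no meaningful result.
    """
    error_delta = errors_end - errors_start
    total_delta = total_end - total_start
    if total_delta <= 0:
        return None
    return 100.0 * error_delta / total_delta


# 3 new errors out of 60 new reconciliations in the window.
print(error_rate_percent(2, 5, 40, 100))  # prints 5.0
```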

Check the resource status condition for further details:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: ....

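One way to read this condition from a script is sketched below in Python. It assumes the resource has already been loaded into a dictionary, for example from the JSON output of `kubectl get`. The helper names are hypothetical, and the `reason` and `message` values in the sample are placeholders, not real operator output.

```python
def ready_condition(resource):
    """Return the condition with type "Ready", or None if it is absent."""
    conditions = resource.get("status", {}).get("conditions", [])
    for condition in conditions:
        if condition.get("type") == "Ready":
            return condition
    return None


def is_ready(resource):
    """True only when a Ready condition exists and its status is "True"."""
    condition = ready_condition(resource)
    return condition is not None and condition.get("status") == "True"


# Hypothetical resource; reason and message are placeholder values.
project = {
    "kind": "AtlasProject",
    "status": {
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "reason": "ExampleReason",
                "message": "example message",
            }
        ]
    },
}

if not is_ready(project):
    condition = ready_condition(project)
    print(condition["reason"], condition["message"])
```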
  1. Verify Resource Status:

    • Check the status condition message for more detailed information.

    • If the AtlasProject is not ready, proceed with the next troubleshooting steps.

  2. Check Connection Secret:

    • Ensure the connection secret referenced by spec.connectionSecretRef.name is correctly labeled with atlas.mongodb.com/type=credentials.

  3. Investigate Logs:

    • Review logs for the AtlasProject controller for any potential errors or failed reconciliation attempts.
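The connection secret check in step 2 can be sketched as a small Python helper. It assumes the secret has been loaded into a dictionary, for example from `kubectl get secret <name> -o json`. The function name and the sample secret name are hypothetical; the label key and value are the ones named in step 2.

```python
CREDENTIALS_LABEL_KEY = "atlas.mongodb.com/type"
CREDENTIALS_LABEL_VALUE = "credentials"


def has_credentials_label(secret):
    """Check that a Secret carries the atlas.mongodb.com/type=credentials label."""
    labels = secret.get("metadata", {}).get("labels") or {}
    return labels.get(CREDENTIALS_LABEL_KEY) == CREDENTIALS_LABEL_VALUE


# Hypothetical secrets: one labeled correctly, one with no labels at all.
labeled = {
    "kind": "Secret",
    "metadata": {
        "name": "example-atlas-key",
        "labels": {"atlas.mongodb.com/type": "credentials"},
    },
}
unlabeled = {"kind": "Secret", "metadata": {"name": "example-atlas-key"}}

print(has_credentials_label(labeled))    # prints True
print(has_credentials_label(unlabeled))  # prints False
```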
