Recover the Kubernetes Operator and Ops Manager for Multi-Cluster AppDB Deployments

On this page

  • Prerequisites
  • Considerations
  • Procedure

If you host an Ops Manager resource in the same Kubernetes cluster as the Kubernetes Operator and have the Application Database (AppDB) deployed on selected member clusters in your multi-Kubernetes-cluster deployment, you can manually recover the Kubernetes Operator and Ops Manager in the event that the cluster fails.

To learn more about deploying Ops Manager on a central cluster and the Application Database across member clusters, see Using Ops Manager with Multi-Kubernetes-Cluster Deployments.

Prerequisites

Before you can recover the Kubernetes Operator and Ops Manager, ensure that you meet the following requirements:

  • Configure backups for your Ops Manager and Application Database resources, including any ConfigMaps and secrets created by the Kubernetes Operator, so that you can restore Ops Manager to its previous running state. For an example of exporting these resources, see the sketch after this list. To learn more, see Backup.

  • The Application Database must have at least three healthy nodes remaining after failure of the Kubernetes Operator's cluster.

  • The healthy clusters in your multi-Kubernetes-cluster deployment must contain a sufficient number of members to elect a primary node. To learn more, see Application Database Architecture.
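
For example, a minimal sketch of exporting these resources while the deployment is still healthy might look like the following. The resource name om, the namespace mongodb, and the output file names are placeholders; substitute your own values.

# Export the Ops Manager-related secrets and ConfigMaps to files so that
# they can be re-created on a new cluster during recovery.
kubectl get secret om-db-om-password --namespace mongodb -o yaml > om-db-om-password.yaml
kubectl get configmap om-db-cluster-mapping --namespace mongodb -o yaml > om-db-cluster-mapping.yaml
# Repeat for each secret and ConfigMap listed in the procedure below, and
# export the Ops Manager resource specification itself:
kubectl get om om --namespace mongodb -o yaml > ops-manager.yaml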

Considerations

Because the Kubernetes Operator doesn't support forcing a replica set reconfiguration, the healthy Kubernetes clusters must contain a sufficient number of Application Database members to elect a primary node for this manual recovery process. A majority of the Application Database members must be available to elect a primary. To learn more, see Replica Set Deployment Architectures.

If possible, use an odd number of member Kubernetes clusters. Proper distribution of your Application Database members can help to maximize the likelihood that the remaining replica set members can form a majority during an outage. To learn more, see Replica Sets Distributed Across Two or More Data Centers.
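
In a multi-Kubernetes-cluster Application Database deployment, this distribution is expressed in the clusterSpecList of the applicationDatabase section of the MongoDBOpsManager resource. The following is a minimal, hypothetical sketch of a 2-2-1 distribution across three member clusters; the resource name, namespace, and cluster names are placeholders, and other required Ops Manager settings are omitted.

apiVersion: mongodb.com/v1
kind: MongoDBOpsManager
metadata:
  name: om
  namespace: mongodb
spec:
  # ... other Ops Manager settings ...
  applicationDatabase:
    topology: MultiCluster
    clusterSpecList:
      - clusterName: cluster-1.example.com
        members: 2
      - clusterName: cluster-2.example.com
        members: 2
      - clusterName: cluster-3.example.com
        members: 1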

Consider the following examples:

  • If you distribute five Application Database members across three member clusters as 2-2-1, the loss of any single cluster leaves at least three of the five members available, which is enough to elect a primary.

  • If you distribute five members across two member clusters as 3-2, the loss of the cluster that holds three members leaves only two of the five available, which cannot form a majority or elect a primary.
Procedure

To recover the Kubernetes Operator and Ops Manager, restore the Ops Manager resource on a new Kubernetes cluster:

1

Follow the instructions to install the Kubernetes Operator in a new Kubernetes cluster.
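
One way to install the Kubernetes Operator is through its Helm chart. The following is a minimal sketch, assuming the MongoDB Helm repository and a mongodb namespace; adjust it to match the installation method and options you used originally.

# Install the Enterprise Kubernetes Operator into the new central cluster.
helm repo add mongodb https://mongodb.github.io/helm-charts
helm repo update
helm install enterprise-operator mongodb/enterprise-operator \
  --namespace mongodb \
  --create-namespace \
  --kube-context "$MDB_CENTRAL_CLUSTER_FULL_NAME"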

Note

If you plan to reuse a member cluster, ensure that the appropriate service account and role exist. The names of these values can overlap between the central cluster and member clusters, but their permissions differ.

To see the appropriate role required for the Kubernetes Operator, refer to the sample in the public repository.

2

Copy the object specification for the failed Ops Manager resource and retrieve the following resources, replacing the placeholder text with your specific Ops Manager resource name and namespace.

Secrets:

  • <om-name>-db-om-password

  • <om-name>-db-agent-password

  • <om-name>-db-keyfile

  • <om-name>-db-om-user-scram-credentials

  • <om-namespace>-<om-name>-admin-key

  • <om-name>-admin-secret

  • <om-name>-gen-key

  • TLS certificate secrets (optional)

ConfigMaps:

  • <om-name>-db-cluster-mapping

  • <om-name>-db-member-spec

  • Custom CA for TLS certificates (optional)

OpsManager:

  • <om-name>

Then, paste the specification that you copied into a new file and configure the new resource by using the preceding values. To learn more, see Deploy an Ops Manager Resource.
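
For example, assuming you exported the resources listed above to YAML files (as in the earlier backup sketch), you might re-create them on the new central cluster before applying the Ops Manager specification. The file names are placeholders.

# Re-create the backed-up secrets and ConfigMaps on the new central cluster.
kubectl apply -f om-db-om-password.yaml \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" --namespace "mongodb"
kubectl apply -f om-db-cluster-mapping.yaml \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" --namespace "mongodb"
# Repeat for each remaining secret and ConfigMap listed above.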

3

Use the following command to apply the updated resource:

kubectl apply \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" \
  --namespace "mongodb" \
  -f https://raw.githubusercontent.com/mongodb/mongodb-enterprise-kubernetes/master/samples/ops-manager/ops-manager-external.yaml

To check the status of your Ops Manager resource, use the following command:

kubectl get om -o yaml -w

Once the central cluster reaches a Running state, you can re-scale the Application Database to your desired distribution of member clusters.
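
For example, you might adjust the clusterSpecList in your Ops Manager specification (as in the earlier sketch) to redistribute the Application Database members across the healthy member clusters, and then re-apply the file. The file name here is a placeholder.

# Re-apply the Ops Manager resource after updating
# spec.applicationDatabase.clusterSpecList with the new distribution.
kubectl apply -f ops-manager.yaml \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" \
  --namespace "mongodb"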

4

To host your MongoDB resource or MongoDBMultiCluster resource on the new Kubernetes Operator instance, apply the following resources to the new cluster, as shown in the sketch after this list:

  • The ConfigMap used to create the initial project.

  • The secrets used in the previous Kubernetes Operator instance.

  • The MongoDB or MongoDBMultiCluster custom resource in its last known state on the source cluster, including any annotations that the Kubernetes Operator added during its lifecycle.
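
The following is a minimal sketch of applying these resources, assuming they were exported to YAML files with the names shown; substitute your own file names.

# Re-create the project ConfigMap, the Ops Manager API credentials secret,
# and the database custom resource on the new cluster.
kubectl apply -f my-project-configmap.yaml \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" --namespace "mongodb"
kubectl apply -f my-credentials-secret.yaml \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" --namespace "mongodb"
kubectl apply -f my-replica-set.yaml \
  --context "$MDB_CENTRAL_CLUSTER_FULL_NAME" --namespace "mongodb"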

Note

If you deployed a MongoDB resource and not a MongoDBMultiCluster resource and wish to migrate the failed Kubernetes cluster's data to the new cluster, you must complete the following additional steps:

  1. Create a new MongoDB resource on the new cluster.

  2. Migrate the data to the new resource by Backing Up and Restoring the data in Ops Manager.

If you deployed a MongoDBMultiCluster resource and the failed cluster hosted any of that resource's replica set members, you must re-scale the resource that you applied across the remaining healthy clusters.
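
For example, a hypothetical MongoDBMultiCluster specification re-scaled onto the remaining healthy clusters might look like the following; the resource name, version, project references, cluster names, and member counts are placeholders.

apiVersion: mongodb.com/v1
kind: MongoDBMultiCluster
metadata:
  name: multi-replica-set
  namespace: mongodb
spec:
  type: ReplicaSet
  version: 6.0.5-ent
  credentials: my-credentials
  opsManager:
    configMapRef:
      name: my-project
  # The failed cluster's entry is removed and its members are
  # redistributed across the healthy clusters.
  clusterSpecList:
    - clusterName: cluster-2.example.com
      members: 3
    - clusterName: cluster-3.example.com
      members: 2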
