Troubleshoot Deployments with Multiple Kubernetes Clusters

To troubleshoot your multi-Kubernetes-cluster deployments, use the procedures in this section.

Recover from a Kubernetes Cluster Failure

This procedure uses the same cluster names as the Prerequisites. It assumes that the cluster MDB_CLUSTER_1, which holds MongoDB nodes, has gone down and that you have provisioned a new cluster named MDB_CLUSTER_4 to hold the replacement MongoDB nodes. To recover, run the MongoDB kubectl plugin with the updated list of member clusters, and then edit the MongoDBMultiCluster resource spec on the central cluster.
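As an illustration of what "the updated list of member clusters" means, the following shell sketch derives it from the previous list by dropping the failed cluster and appending its replacement. The function name updated_member_clusters is hypothetical and is not part of the MongoDB kubectl plugin; it only builds the comma-separated string that the plugin expects.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the MongoDB kubectl plugin):
# remove the failed cluster from a comma-separated list of member
# clusters and append the replacement cluster at the end.
updated_member_clusters() {
  local list="$1" failed="$2" replacement="$3"
  local result="" name
  local IFS=','                      # split the list on commas
  for name in $list; do
    # Skip the failed cluster; keep every other member in order.
    [ "$name" = "$failed" ] && continue
    result="${result:+$result,}$name"
  done
  printf '%s\n' "${result:+$result,}$replacement"
}

# Example call (commented out so that sourcing this file has no side effects):
# updated_member_clusters \
#   "${MDB_CLUSTER_1_FULL_NAME},${MDB_CLUSTER_2_FULL_NAME},${MDB_CLUSTER_3_FULL_NAME}" \
#   "${MDB_CLUSTER_1_FULL_NAME}" "${MDB_CLUSTER_4_FULL_NAME}"
```

The resulting string lists the surviving clusters first and the replacement last, which is the same order used for the member clusters in the recovery command in this procedure.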

To reconfigure the multi-Kubernetes-cluster deployment after a cluster failure, replace the failed Kubernetes cluster with the newly provisioned cluster as follows:

  1. Run the MongoDB kubectl plugin with the recover parameter and the new cluster MDB_CLUSTER_4 specified in the --member-clusters option. This enables the Kubernetes Operator to communicate with the new cluster and schedule MongoDB nodes on it. In the following example, --member-clusters contains ${MDB_CLUSTER_4_FULL_NAME}.

    kubectl mongodb multicluster recover \
      --central-cluster="${MDB_CENTRAL_CLUSTER_FULL_NAME}" \
      --member-clusters="${MDB_CLUSTER_2_FULL_NAME},${MDB_CLUSTER_3_FULL_NAME},${MDB_CLUSTER_4_FULL_NAME}" \
      --member-cluster-namespace="mongodb" \
      --central-cluster-namespace="mongodb" \
      --operator-name=mongodb-enterprise-operator-multi-cluster \
  2. On the central cluster, locate and edit the MongoDBMultiCluster resource spec to add the new cluster name to the clusterSpecList and remove the failed Kubernetes cluster from this list. The resulting list of cluster names should be similar to the following example:

      - clusterName: ${MDB_CLUSTER_4_FULL_NAME}
        members: 3
      - clusterName: ${MDB_CLUSTER_2_FULL_NAME}
        members: 2
      - clusterName: ${MDB_CLUSTER_3_FULL_NAME}
        members: 3
  3. Restart the Kubernetes Operator Pod. After the restart, the Kubernetes Operator should reconcile the MongoDB deployment on MDB_CLUSTER_4, the cluster that you created to replace the failed MDB_CLUSTER_1. To learn more about resource reconciliation, see Deployment Architecture and Diagrams.
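One possible way to carry out the restart in step 3 and then check the result is sketched below. It rests on several assumptions: that the Kubernetes Operator runs as a Deployment whose name matches the --operator-name value used in step 1, that the central cluster's kubeconfig context is named after ${MDB_CENTRAL_CLUSTER_FULL_NAME}, and that the resources live in the mongodb namespace. The function names restart_operator and show_multicluster_resources are hypothetical wrappers around standard kubectl commands.

```shell
#!/usr/bin/env bash
# Sketch only: the context, namespace, and Deployment name passed to these
# functions are assumptions and must match your own environment.

# Restart the operator Pod by rolling the Deployment that manages it.
restart_operator() {
  local context="$1" namespace="$2" deployment="$3"
  kubectl --context "$context" --namespace "$namespace" \
    rollout restart "deployment/$deployment"
}

# List the MongoDBMultiCluster resources on the central cluster so that
# you can inspect their status after the operator has restarted.
show_multicluster_resources() {
  local context="$1" namespace="$2"
  kubectl --context "$context" --namespace "$namespace" get mongodbmulticluster
}

# Example calls (commented out so that sourcing this file has no side effects):
# restart_operator "${MDB_CENTRAL_CLUSTER_FULL_NAME}" mongodb \
#   mongodb-enterprise-operator-multi-cluster
# show_multicluster_resources "${MDB_CENTRAL_CLUSTER_FULL_NAME}" mongodb
```

Rolling the Deployment replaces the Pod without deleting it by hand. If your operator is installed differently, deleting the operator Pod and letting its controller recreate it has the same effect.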

Also see ConfigMap Name mongodb-enterprise-operator-member-list is Hard-Coded.