MongoDB Ops Manager shows Primary Member of MongoDB Replicaset is not available although it is!

Hi everybody,

I have a k3s cluster with six nodes on which I deployed the MongoDB Enterprise Kubernetes Operator. The Operator is working fine and generally behaves as expected.

My problem is that MongoDB Ops Manager says the primary member of my replica set is unavailable:

Checking the “Servers” tab in MongoDB Ops Manager tells otherwise:

As you can see, the MongoDB members are working as expected with all the features enabled. Also, the “Metrics” tab shows all the metrics I am interested in. These findings imply that the MongoDB Automation Agents are working correctly too. So I conclude that all components, that is mongod and the MongoDB Agent, are healthy, which is confirmed by the logs of the MongoDB Operator:

Is anybody familiar with this issue? If so, can anybody suggest one or two options for solving it?

Thanks in advance.

Well, I restarted the deployment and that solved the problem.

But I don’t think this is a long-term solution, especially in enterprise environments. If anybody out there is familiar with this situation, please let me know. I appreciate any hint.

Hi @Marco_80669

Glad you have found the solution to the issue.

I don’t think this is a long term solution. Especially in enterprise environments.

I agree this is not an ideal situation in a mission-critical enterprise environment. Frequently, these kinds of issues are caused by the environment, and specialized one-on-one support is usually needed to resolve them. Since Ops Manager is part of the Enterprise Advanced subscription, you will have access to support and can contact them when this type of issue surfaces.

If you’re evaluating Ops Manager and would like to know more, please feel free to send a DM to me so I can connect you to the right people.

Best regards
Kevin

Hi Kevin,

Thanks for replying to my issue. Do you have any hints on which environmental factors might cause this? We aim to understand our environment deeply, so we’d like to do some analysis ourselves to better explain what might be the reason for this.

We’d appreciate any hint. Each is of great value for us.

Thanks in advance.

That’s impossible to say without exact knowledge of the infrastructure and deployment methods. However, in a very general sense, I would say it can be caused by the Ops Manager installation itself (i.e. it was not installed properly), or perhaps by network issues. My first suggestion is to check with support, since they’ll have more experience troubleshooting Ops Manager deployments. If you can provide them with the patterns you observe when these issues occur, that would be one of the first steps as well.
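
To make "observed patterns" a bit more concrete, here is a minimal sketch (not an official tool, and not something from Ops Manager itself) that reduces a `replSetGetStatus`-style document to a small record you could log over time and compare with what Ops Manager displays. The function name, the host names, and the sample document are all hypothetical; only the `set`, `members`, `name`, `stateStr`, and `health` fields are taken from the shape of the `replSetGetStatus` output.

```python
from datetime import datetime, timezone


def summarize_rs_status(status):
    """Reduce a replSetGetStatus-style document to the fields worth logging."""
    members = status.get("members", [])
    # A member reports stateStr "PRIMARY" when it currently holds the primary role.
    primaries = [m["name"] for m in members if m.get("stateStr") == "PRIMARY"]
    # health is 1 when the member is reachable from the reporting node, else 0.
    unhealthy = [m["name"] for m in members if m.get("health") != 1]
    return {
        "observed_at": datetime.now(timezone.utc).isoformat(),
        "set": status.get("set"),
        "primary": primaries[0] if primaries else None,
        "unhealthy": unhealthy,
    }


# Hypothetical sample document, for illustration only.
sample = {
    "set": "my-replica-set",
    "members": [
        {"name": "mongo-0.example:27017", "stateStr": "PRIMARY", "health": 1},
        {"name": "mongo-1.example:27017", "stateStr": "SECONDARY", "health": 1},
        {"name": "mongo-2.example:27017", "stateStr": "(not reachable/healthy)", "health": 0},
    ],
}

summary = summarize_rs_status(sample)
print(summary)
```

If you use a driver such as PyMongo, the input could come from `client.admin.command("replSetGetStatus")`. Running this on a schedule and keeping the records would show whether the replica set itself ever lost its primary at the times Ops Manager reported it as unavailable.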

Best regards
Kevin

