I’m looking for some assistance with my mongo replica-set setup on EKS.
Currently, I have two pods (primary and secondary DB instances) along with the operator. To connect from a local computer to any of the instances, I use a VPN and then specify the corresponding pod’s endpoint in the mongo URI. However, I’m facing two issues with this approach.
Firstly, the endpoints keep changing whenever there’s a node restart or similar events, which requires regular maintenance to update the connection information.
Secondly, connecting to the desired instance isn’t automatic. I have to manually specify the primary instance’s endpoint in the URI to connect to it. I’m looking for an alternative solution, something like using the readPreference parameter.
Regarding the first issue, I’ve deployed an internal K8s Load Balancer. Now the connection can be made using the Load Balancer’s ExternalIP in the connection string. However, the problem with this is that the Load Balancer doesn’t know which instance is the primary or secondary, so it connects to either of the two regardless of the specified parameters.
With all this considered, I’m currently stuck. I’d greatly appreciate any suggestions or observations you may have.
Hardcoding a pod's IP address is not recommended in a production environment, because pod IPs are subject to change whenever a pod restarts.
To avoid manually updating the connection address, you can give the pods DNS names and use those names to establish the connection. You can read more in the Kubernetes DNS for Services and Pods documentation.
In addition, could you help me understand which resource type you are using: a K8s Deployment or a K8s StatefulSet? The latter is useful for applications that need stable network identities.
In my experience, pods in a StatefulSet have unique network identities that are retained across pod restarts.
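As a rough illustration of what those stable identities look like, here is a minimal Python sketch that builds the per-pod DNS names for a StatefulSet exposed through a headless Service. The names mongodb, mongodb-svc and the mongodb namespace are placeholders I made up, and cluster.local is only the default cluster domain, so please substitute your own values.

```python
def pod_dns_names(statefulset, service, namespace, replicas,
                  cluster_domain="cluster.local"):
    """Return the stable DNS name of each pod in a StatefulSet.

    Pods are named <statefulset>-<ordinal>. Behind a headless Service,
    each pod resolves at <pod>.<service>.<namespace>.svc.<cluster-domain>.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical names, for illustration only.
for host in pod_dns_names("mongodb", "mongodb-svc", "mongodb", replicas=2):
    print(host)
```

These names are served by the cluster's DNS, so whether they resolve from a machine connected over the VPN depends on how DNS is set up on your side.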
If I understand correctly, I think you can use an official MongoDB driver to connect to the replica set directly instead of going through a load balancer. Note that official drivers need to connect to all members of a replica set and monitor their status, as per the Server Discovery and Monitoring spec that all official drivers implement. This is also how the driver finds out which member is currently the primary.
Assuming all members of the replica set are up and running, setting readPreference would then let you read data from the desired node (writes always go to the primary).
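To make this more concrete, here is a small sketch of assembling a connection string that lists both members and sets the replicaSet and readPreference options. The host names and the replica set name rs0 are assumptions on my part, and authentication and TLS options are left out, so treat it as a starting point rather than a finished configuration.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary", port=27017):
    """Build a mongodb:// URI that seeds the driver with every member."""
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    options = urlencode({
        "replicaSet": replica_set,
        # One of: primary, primaryPreferred, secondary,
        # secondaryPreferred, nearest.
        "readPreference": read_preference,
    })
    return f"mongodb://{seeds}/?{options}"


# Hypothetical host names and replica set name, for illustration only.
uri = replica_set_uri(
    ["mongodb-0.mongodb-svc.mongodb.svc.cluster.local",
     "mongodb-1.mongodb-svc.mongodb.svc.cluster.local"],
    replica_set="rs0",
    read_preference="secondaryPreferred",
)
print(uri)
```

With PyMongo, for example, a string like this can be passed to pymongo.MongoClient(uri). The driver then works out which member is the primary on its own, so you no longer need to point the URI at the primary by hand.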
Finally, could you also confirm whether the production environment has one primary pod, one secondary pod, and one load balancer routing between the replica set’s primary and secondary nodes?
If yes, please note that deploying an even number of members in a replica set is not a recommended configuration, as per the Deploy a Replica Set page.
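The reasoning behind that recommendation comes down to simple arithmetic: electing a primary requires a majority of voting members. The sketch below assumes every member is a voting member and shows how many members can be lost before no primary can be elected.

```python
def majority(voting_members):
    """Votes needed to elect a primary: more than half of the voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """Members that can be unavailable while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (2, 3, 4, 5):
    print(f"{n} members: majority={majority(n)}, fault tolerance={fault_tolerance(n)}")
```

With two members the fault tolerance is zero, so losing either pod leaves the set without a primary. A third member (or an arbiter) raises the tolerance to one, while a fourth adds no further tolerance.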
Please feel free to reach out if you have further questions.