Application randomly throws 'Failed to look up SRV record "_mongodb._tcp.somedomain-pri.abcde.mongodb.net": No such file or directory'

Hi, we have an application running in Azure AKS which connects to a MongoDB Atlas database through a VNet-peered network. Things work well for the most part, but from time to time (anywhere from once a week to every two days) the application gets stuck in a bad state, and the only way out is to restart the failing pod.

When the application gets into the “bad place”, every attempt at a database connection fails with:

Failed to look up SRV record "_mongodb._tcp.somedomain-pri.abcde.mongodb.net": No such file or directory

This does not look like an infrastructure DNS problem, because the database connection works most of the time, and as soon as the failing pod is restarted it works again.

Our stack is PHP 8, Symfony + Doctrine ODM 2, mongodb extension 1.9.1, and MongoDB 4.4 on Atlas.

Any ideas on why this could be happening?

Thanks in advance

Well, as a workaround, using the regular mongodb:// connection string without SRV (and explicitly including all the nodes to connect to) seems to work fine.

That is, replacing this

mongodb+srv://xyz.foobar.mongodb.net

with this

mongodb://abcd-shard-00-00-pri.foobar.mongodb.net:27017,abcd-shard-00-01-pri.foobar.mongodb.net:27017,abcd-shard-00-02-pri.foobar.mongodb.net:27017/dbname?ssl=true&replicaSet=atlas-ijklm-shard-0&authSource=admin&retryWrites=true&w=majority

The long connection string can be found in the Atlas UI, under the Connect menu, when you select a driver version old enough that it does not yet support mongodb+srv://.
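If you would rather assemble the long connection string yourself than copy it from the Atlas UI, the shape is simple enough to sketch. The snippet below is only an illustration in Python, not part of our PHP stack; the function name build_seedlist_uri is hypothetical, and the host names, replica set name, and options are the placeholder values from the example above, not real ones.

```python
from urllib.parse import urlencode


def build_seedlist_uri(hosts, database, options, port=27017):
    """Build a non-SRV mongodb:// URI from an explicit list of hosts.

    Hypothetical helper: it only formats a string, it does not
    contact DNS or the database.
    """
    # Every host gets an explicit port, joined with commas.
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    # Options become the query string, in the order they were given.
    query = urlencode(options)
    return f"mongodb://{seeds}/{database}?{query}"


# Placeholder hosts and options, taken from the example above.
hosts = [
    "abcd-shard-00-00-pri.foobar.mongodb.net",
    "abcd-shard-00-01-pri.foobar.mongodb.net",
    "abcd-shard-00-02-pri.foobar.mongodb.net",
]
options = {
    "ssl": "true",
    "replicaSet": "atlas-ijklm-shard-0",
    "authSource": "admin",
    "retryWrites": "true",
    "w": "majority",
}

uri = build_seedlist_uri(hosts, "dbname", options)
print(uri)
```

With mongodb+srv:// the driver obtains the host list through a DNS SRV lookup, so listing the hosts explicitly is what takes that lookup (the step that was failing) out of the connection path. The trade-off is that the string has to be updated by hand if the cluster's host names ever change.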