Hard-to-explain timeout errors

Ernest_Mishkin · May 23, 2024, 6:23am

Hello,

We’re running a fairly small (in terms of traffic and volume) Python app on Heroku.
It uses pymongo 4.6.3 with mongoengine 0.28.2 on top to connect to an M10 Atlas cluster.

We’re facing a tough problem: episodes of queries timing out that recur at least daily. All the server metrics look healthy and show hardly any load, and Query Insights shows maximum latencies in the tens of ms. Yet we start seeing errors like these pop up:

pymongo.errors.ExecutionTimeout: operation would exceed time limit, remaining timeout:-1.55924 <= network round trip time:0.00538  (configured timeouts: timeoutMS: 10000.0ms, connectTimeoutMS: 3000.0ms), full error: {'ok': 0, 'errmsg': 'operation would exceed time limit, remaining timeout:-1.55924 <= network round trip time:0.00538  (configured timeouts: timeoutMS: 10000.0ms, connectTimeoutMS: 3000.0ms)', 'code': 50}

which are eventually followed by

pymongo.errors.NetworkTimeout: duckbill-prod-a9f81d6-shard-00-02.j73pa.mongodb.net:27017: timed out (configured timeouts: timeoutMS: 10000.0ms, connectTimeoutMS: 3000.0ms)

and/or

pymongo.errors.WaitQueueTimeoutError: Timed out while checking out a connection from connection pool. maxPoolSize: 100, timeout: 10.0

We had a few support cases open, but the response is always:

1. these are “network issues” (duh), and
2. set up VPC peering (we’d love to, but it’s super expensive on Heroku)

Here are the MongoClient connection options for reference:

        w='majority',
        read_preference=ReadPreference.PRIMARY_PREFERRED,
        connectTimeoutMS=3_000,
        socketTimeoutMS=20_000,
        timeoutMS=10_000,
        maxIdleTimeMS=600_000,
        minPoolSize=5

Perhaps somebody here is also running on Heroku? Have you ever faced similar issues?
Any general / directional troubleshooting recommendations?

Thank you!

Shane · July 29, 2025, 4:24pm

These errors occur when a pymongo operation (such as find_one() or insert_one()) exceeds the configured timeoutMS. When these errors occur, do they coincide with server maintenance events such as rolling restarts or the election of a new primary?
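For what it's worth, the numbers in the first error message can be read directly. Below is a minimal sketch, assuming that "remaining timeout" is reported in seconds and counts down from timeoutMS. The helper name elapsed_before_send is made up for illustration and is not part of pymongo.

```python
import re

# Matches the two numbers printed in the "would exceed time limit" message.
_PATTERN = re.compile(
    r"remaining timeout:(?P<remaining>-?\d+(?:\.\d+)?)\s*<=\s*"
    r"network round trip time:(?P<rtt>-?\d+(?:\.\d+)?)"
)


def elapsed_before_send(errmsg: str, timeout_ms: float) -> float:
    """Return the seconds already used up when the error was raised.

    Hypothetical helper: assumes the 'remaining timeout' field is in
    seconds and is measured against the configured timeoutMS.
    """
    match = _PATTERN.search(errmsg)
    if match is None:
        raise ValueError("message does not contain the expected timeout fields")
    remaining = float(match.group("remaining"))
    return timeout_ms / 1000.0 - remaining


# Message text copied from the error quoted in the original post.
msg = (
    "operation would exceed time limit, remaining timeout:-1.55924 <= "
    "network round trip time:0.00538"
)
print(f"{elapsed_before_send(msg, 10_000):.2f}s elapsed")
```

If that reading is right, roughly 11.56 s had already passed against a 10 s budget, while the measured round trip was only about 5 ms. That would suggest the time went somewhere before the command reached the server (for example, waiting for a connection or for server selection), which would be consistent with the healthy server-side latencies.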

It would be great if you could enable debug logging in pymongo so we can trace the lifecycle of one of these failing operations. Note that logging was added in pymongo 4.7.0 (see the "Logging" page in the PyMongo driver docs), so you would need to upgrade from 4.6.3 first.
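Something along these lines should be enough to turn it on. This is a rough sketch using only the standard logging module. The function name enable_pymongo_debug_logging and the log format are illustrative choices, and the logger names ("pymongo.command", "pymongo.connection", "pymongo.serverSelection") are the ones used by pymongo 4.7 and later.

```python
import logging

# Component loggers used by pymongo 4.7+.
PYMONGO_LOGGERS = (
    "pymongo.command",
    "pymongo.connection",
    "pymongo.serverSelection",
)


def enable_pymongo_debug_logging(level: int = logging.DEBUG) -> None:
    """Send pymongo's component logs to stderr (illustrative helper)."""
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    # Attach one handler to the parent logger; records from the
    # component loggers reach it through normal propagation.
    logging.getLogger("pymongo").addHandler(handler)
    for name in PYMONGO_LOGGERS:
        logging.getLogger(name).setLevel(level)


# Call once at startup, before creating the MongoClient.
enable_pymongo_debug_logging()
```

Command logging at DEBUG level can be verbose, so it may be worth enabling only "pymongo.connection" and "pymongo.serverSelection" at first, since the pool checkout and server selection steps are the likeliest places for time to be lost before a command is sent.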

Mustafa_Mahdi · July 31, 2025, 6:52am

Thanks for sharing. Timeout issues despite healthy metrics can be frustrating. Curious to see whether it’s network latency or something in the connection pooling.