Hello,
I have created an AWS Lambda function using Python 3.9 and pymongo 4.3.3. It should count documents in two collections and send the counts to Slack. The problem is that the first count_documents call sometimes hangs for 30 seconds and then the function fails with:
[ERROR] ServerSelectionTimeoutError: cluster0-shard-00-00-....mongodb.net:27017: timed out,cluster0-shard-00-01-....mongodb.net:27017: timed out,cluster0-shard-00-02-....mongodb.net:27017: timed out, Timeout: 30s, Topology Description: <TopologyDescription id: 645dcca1e91881ffaf45360b, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('cluster0-shard-00-00-....mongodb.net', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('cluster0-shard-00-00-....mongodb.net:27017: timed out')>, <ServerDescription ('cluster0-shard-00-01-....mongodb.net', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('cluster0-shard-00-01-....mongodb.net:27017: timed out')>, <ServerDescription ('cluster0-shard-00-02-....mongodb.net', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('cluster0-shard-00-02-....mongodb.net:27017: timed out')>]>
I'm setting
client = MongoClient(uri, readPreference='secondaryPreferred')
outside the handler function. The URI has the format
mongodb+srv://<user>:<pass>@cluster0-....mongodb.net/db
(The db part is probably unnecessary, but this is the URI I use in another app, and I read it from Parameter Store.)
I then run client.db.collection.count_documents() with the filter:
"createdAt": {
"$gte": datetime(year=yesterday.year, month=yesterday.month, day=yesterday.day),
"$lt": datetime(year=today.year, month=today.month, day=today.day)
},
to get all documents for yesterday.
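For reference, the date range above can be built roughly as follows. This is a minimal sketch rather than my exact code: the helper names yesterday_filter and count_yesterday are made up for illustration, and it assumes createdAt is stored as naive datetimes in the same time zone as the "now" value passed in.

```python
from datetime import datetime, timedelta


def yesterday_filter(now):
    """Build a filter matching createdAt in [yesterday 00:00, today 00:00)."""
    # Truncate "now" to midnight to get the exclusive upper bound.
    today = datetime(year=now.year, month=now.month, day=now.day)
    # Subtracting one day handles month and year boundaries correctly.
    yesterday = today - timedelta(days=1)
    return {"createdAt": {"$gte": yesterday, "$lt": today}}


def count_yesterday(collection, now):
    """Count yesterday's documents in one collection.

    `collection` is any object exposing pymongo's count_documents(filter),
    for example client.db.collection.
    """
    return collection.count_documents(yesterday_filter(now))
```

Inside the handler this would be called once per collection, for example count_yesterday(client.db.collection, datetime.now()).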
The Lambda function gets as far as the first count_documents call, hangs for the default server selection timeout of 30 seconds, and then exits with the error above. At other times, the function works correctly.
I’ve set the function to be invoked on a cron schedule, and AWS Lambda retries twice by default, but even that isn’t always enough. It failed last night, and when I invoked it manually this morning, it only succeeded on the 19th try. At other times it works on the first try without a problem. This makes me believe it is not a networking issue. The Lambda resides in a VPC, and the NAT gateway IP is allowed in MongoDB Atlas.