Hello,
This is a follow-up to my previous topic.
We are still having issues with the connection to MongoDB from our GCP Cloud Run service.
Stack:
- GCP Cloud Run
- Connection set up via VPC Network Peering (requests from Cloud Run to private IPs are routed through the VPC connector)
- Node.js (v18) with mongodb driver (v4.10.0)
- Connection string "mongodb+srv://user:password@cluster-pri.xxxxx.mongodb.net"
- MongoDB Atlas on version 6.0.3 (provider GCP), M30
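For anyone wanting to check which addresses the `-pri` hostname actually resolves to from inside Cloud Run, here is a minimal sketch. It assumes the usual `mongodb+srv://` behaviour of looking up an SRV record under `_mongodb._tcp.<host>`. The helper names `srvRecordName` and `resolveSeedList` are hypothetical and not part of the driver.

```javascript
// Hypothetical helper: derive the SRV record name for a mongodb+srv:// URI.
function srvRecordName(uri) {
  // Skip optional "user:password@" credentials, then capture the hostname.
  const match = /^mongodb\+srv:\/\/(?:[^@\/]*@)?([^\/?:]+)/.exec(uri);
  if (!match) {
    throw new Error('not a mongodb+srv:// connection string');
  }
  return `_mongodb._tcp.${match[1]}`;
}

// Hypothetical helper: resolve the SRV record, then each target host to IPv4.
async function resolveSeedList(uri) {
  const { promises: dns } = await import('node:dns');
  const records = await dns.resolveSrv(srvRecordName(uri));
  return Promise.all(
    records.map(async ({ name, port }) => ({
      host: name,
      port,
      addresses: await dns.resolve4(name),
    }))
  );
}
```

Calling `resolveSeedList(process.env.MONGODB_URI)` at startup and logging the result would show whether every host resolves to a private address (the `MONGODB_URI` variable name is an assumption).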
The GCP support team has verified that the configuration on their side is correct, so the issue seems to be on the MongoDB side.
THE ISSUE
Multiple times per day, we get many errors on our server about the connection with MongoDB. Here are some examples:
VARIANT 1
MongoNetworkTimeoutError: connection timed out
at connectionFailureError (/app/node_modules/mongodb/lib/cmap/connect.js:389:20)
at TLSSocket.<anonymous> (/app/node_modules/mongodb/lib/cmap/connect.js:310:22)
at Object.onceWrapper (node:events:627:28)
at TLSSocket.emit (node:events:513:28)
at TLSSocket.emit (node:domain:489:12)
at Socket._onTimeout (node:net:568:8)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7)
VARIANT 2
MongoServerSelectionError: connection <monitor> to 192.168.248.2:27017 timed out
at Timeout._onTimeout (/app/node_modules/mongodb/lib/sdam/topology.js:285:38)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7)
VARIANT 3
MongoServerSelectionError: connection 1 to 35.233.114.132:27017 closed
at .listOnTimeout ( node:internal/timers:564 )
at process.processTimers ( node:internal/timers:507 )
VARIANT 4
MongoNetworkError: connection 96 to 192.168.248.3:27017 closed
at .TLSSocket.emit ( node:events:513 )
at .TLSSocket.emit ( node:domain:489 )
at undefined. ( node:net:313 )
at .TCP.done ( node:_tls_wrap:587 )
VARIANT 5
PoolClearedError [MongoPoolClearedError]: Connection pool for production-shard-00-01-pri.xxxxx.mongodb.net:27017 was cleared because another operation failed with: "connection <monitor> to 192.168.248.3:27017 timed out"
at .Server.emit ( events.js:400 )
at .Server.emit ( domain.js:475 )
at .Monitor.emit ( events.js:400 )
VARIANT 6
MongoPoolClearedError: Connection pool for production-shard-00-01-pri.xxxxx.mongodb.net:27017 was cleared because another operation failed with: "connection <monitor> to 192.168.248.3:27017 timed out"
at .Server.emit ( node:events:513 )
at .Server.emit ( node:domain:489 )
at .Monitor.emit ( node:events:513 )
VARIANT 7
PoolClearedOnNetworkError: Connection to production-shard-00-02-pri.xxxxx.mongodb.net:27017 interrupted due to server monitor timeout
The issues also happen on our QA and development environments, but much less frequently because usage there is much lower.
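To correlate these errors with what the driver sees, the monitoring events can be logged. This is a sketch only: `attachDiagnostics` is a hypothetical helper, and it assumes the v4 driver emits the SDAM event `serverHeartbeatFailed` and the CMAP events `connectionPoolCleared` and `connectionClosed` on the `MongoClient` instance.

```javascript
// Hypothetical helper: log monitor and pool events from a MongoClient.
// `client` is assumed to be a MongoClient (any object with .on() works).
function attachDiagnostics(client, log = console.warn) {
  // Fired when a monitor heartbeat to one server fails or times out.
  client.on('serverHeartbeatFailed', (event) => {
    log('heartbeat failed', {
      host: event.connectionId,
      durationMS: event.duration,
      error: String(event.failure),
    });
  });
  // Fired when the pool for one server is cleared.
  client.on('connectionPoolCleared', (event) => {
    log('pool cleared', { host: event.address });
  });
  // Fired for each pooled connection that is closed, with the reason.
  client.on('connectionClosed', (event) => {
    log('connection closed', { host: event.address, reason: event.reason });
  });
  return client;
}
```

Usage would be `attachDiagnostics(new MongoClient(uri))` before calling `connect()`, so that the timestamps of failed heartbeats can be compared against the errors above.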
In short
The connection between Cloud Run and MongoDB is unstable: connections are closed, connections time out, or the connection pool is cleared.
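As a stopgap, these error classes can be treated as transient and retried in application code. The sketch below assumes matching on the `name` values seen in the logs above. `isTransientMongoError` and `withRetry` are hypothetical helpers, and the attempt count and delay are illustrative values, not tested settings.

```javascript
// Error names taken from the variants in this post.
const TRANSIENT_NAMES = new Set([
  'MongoNetworkTimeoutError',
  'MongoNetworkError',
  'MongoServerSelectionError',
  'MongoPoolClearedError',
  'PoolClearedOnNetworkError',
]);

// True when the error looks like one of the connection failures above.
function isTransientMongoError(err) {
  return Boolean(err) && TRANSIENT_NAMES.has(err.name);
}

// Retry an async operation with exponential backoff on transient errors.
async function withRetry(operation, { attempts = 3, baseDelayMS = 200 } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= attempts || !isTransientMongoError(err)) throw err;
      const delayMS = baseDelayMS * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMS));
    }
  }
}
```

This only masks the symptom, and it is only safe for idempotent operations such as reads, for example `withRetry(() => collection.findOne({ _id: id }))`.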
The IP address of the GCP VPC network (subnet) is whitelisted on the MongoDB side.
What we tried
- Using both the node:18-alpine and node:18 Docker images
- Using a Cloud NAT to get a fixed IP and go over public internet: this was even more unstable (see original post)
- On our DEV environment, I lowered the minimum TLS protocol version from "TLS 1.2 and above" to "TLS 1.0 and above". It is unclear whether this improved the situation. I have not yet tried this on our production environment because MongoDB strongly advises using 1.1+ or 1.2+.
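The driver's timeout and pool options are another set of knobs that could be varied in further tests. The sketch below only illustrates which `MongoClient` options are involved: `buildClientOptions` is a hypothetical helper and every value is an illustrative assumption, not a setting from our deployment.

```javascript
// Hypothetical helper: build MongoClient options, with per-test overrides.
// All values below are illustrative assumptions.
function buildClientOptions(overrides = {}) {
  return {
    serverSelectionTimeoutMS: 30000, // how long to wait for a usable server
    connectTimeoutMS: 30000,         // TCP/TLS connection establishment timeout
    heartbeatFrequencyMS: 10000,     // interval between monitor checks
    maxIdleTimeMS: 60000,            // close pooled connections idle this long
    maxPoolSize: 20,                 // upper bound on connections per server
    minPoolSize: 0,                  // connections kept open per server
    ...overrides,
  };
}

// Usage sketch (requires the mongodb package; MONGODB_URI is an assumed name):
// const { MongoClient } = require('mongodb');
// const client = new MongoClient(
//   process.env.MONGODB_URI,
//   buildClientOptions({ maxIdleTimeMS: 30000 })
// );
```

Changing one option at a time per environment would make it easier to see whether any of them affects how often the errors occur.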
Any help or ideas would be appreciated!
Thanks!