No connection to MongoDB from Replit

I have been getting this connection error from Replit since the weekend.

const serverSelectionError = new ServerSelectionError();
                               ^

MongooseServerSelectionError: getaddrinfo ENOTFOUND ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net
    at NativeConnection.Connection.openUri (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/lib/connection.js:824:32)
    at /home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/lib/index.js:412:10
    at /home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/lib/helpers/promiseOrCallback.js:41:5
    at new Promise (<anonymous>)
    at promiseOrCallback (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/lib/helpers/promiseOrCallback.js:40:10)
    at Mongoose._promiseOrCallback (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/lib/index.js:1265:10)
    at Mongoose.connect (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/lib/index.js:411:20)
    at Object.<anonymous> (/home/runner/boilerplate-project-exercisetracker/index.js:9:10)
    at Module._compile (node:internal/modules/cjs/loader:1105:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1159:10) {
  reason: TopologyDescription {
    type: 'ReplicaSetNoPrimary',
    servers: Map(3) {
      'ac-otj2fk4-shard-00-00.jqf3ndr.mongodb.net:27017' => ServerDescription {
        address: 'ac-otj2fk4-shard-00-00.jqf3ndr.mongodb.net:27017',
        type: 'RSSecondary',
        hosts: [
          'ac-otj2fk4-shard-00-00.jqf3ndr.mongodb.net:27017',
          'ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net:27017',
          'ac-otj2fk4-shard-00-02.jqf3ndr.mongodb.net:27017'
        ],
        passives: [],
        arbiters: [],
        tags: {
          provider: 'AWS',
          workloadType: 'OPERATIONAL',
          nodeType: 'ELECTABLE',
          region: 'US_EAST_1'
        },
        minWireVersion: 0,
        maxWireVersion: 13,
        roundTripTime: 217.72000000000003,
        lastUpdateTime: 6177723,
        lastWriteDate: 2022-11-29T08:19:43.000Z,
        error: null,
        topologyVersion: {
          processId: ObjectId { [Symbol(id)]: [Buffer [Uint8Array]] },
          counter: 5
        },
        setName: 'atlas-u7gv3z-shard-0',
        setVersion: 11,
        electionId: null,
        logicalSessionTimeoutMinutes: 30,
        primary: 'ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net:27017',
        me: 'ac-otj2fk4-shard-00-00.jqf3ndr.mongodb.net:27017',
        '$clusterTime': {
          clusterTime: Timestamp { low: 10, high: 1669709983, unsigned: true },
          signature: { hash: [Binary], keyId: [Long] }
        }
      },
      'ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net:27017' => ServerDescription {
        address: 'ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net:27017',
        type: 'Unknown',
        hosts: [],
        passives: [],
        arbiters: [],
        tags: {},
        minWireVersion: 0,
        maxWireVersion: 0,
        roundTripTime: -1,
        lastUpdateTime: 6186787,
        lastWriteDate: 0,
        error: MongoNetworkError: getaddrinfo ENOTFOUND ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net
            at connectionFailureError (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/node_modules/mongodb/lib/cmap/connect.js:387:20)
            at TLSSocket.<anonymous> (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/node_modules/mongodb/lib/cmap/connect.js:310:22)
            at Object.onceWrapper (node:events:642:26)
            at TLSSocket.emit (node:events:527:28)
            at emitErrorNT (node:internal/streams/destroy:157:8)
            at emitErrorCloseNT (node:internal/streams/destroy:122:3)
            at processTicksAndRejections (node:internal/process/task_queues:83:21) {
          cause: Error: getaddrinfo ENOTFOUND ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net
              at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
            errno: -3008,
            code: 'ENOTFOUND',
            syscall: 'getaddrinfo',
            hostname: 'ac-otj2fk4-shard-00-01.jqf3ndr.mongodb.net'
          },
          [Symbol(errorLabels)]: Set(1) { 'ResetPool' }
        },
        topologyVersion: null,
        setName: null,
        setVersion: null,
        electionId: null,
        logicalSessionTimeoutMinutes: null,
        primary: null,
        me: null,
        '$clusterTime': null
      },
      'ac-otj2fk4-shard-00-02.jqf3ndr.mongodb.net:27017' => ServerDescription {
        address: 'ac-otj2fk4-shard-00-02.jqf3ndr.mongodb.net:27017',
        type: 'Unknown',
        hosts: [],
        passives: [],
        arbiters: [],
        tags: {},
        minWireVersion: 0,
        maxWireVersion: 0,
        roundTripTime: -1,
        lastUpdateTime: 6186661,
        lastWriteDate: 0,
        error: MongoNetworkError: getaddrinfo ENOTFOUND ac-otj2fk4-shard-00-02.jqf3ndr.mongodb.net
            at connectionFailureError (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/node_modules/mongodb/lib/cmap/connect.js:387:20)
            at TLSSocket.<anonymous> (/home/runner/boilerplate-project-exercisetracker/node_modules/mongoose/node_modules/mongodb/lib/cmap/connect.js:310:22)
            at Object.onceWrapper (node:events:642:26)
            at TLSSocket.emit (node:events:527:28)
            at emitErrorNT (node:internal/streams/destroy:157:8)
            at emitErrorCloseNT (node:internal/streams/destroy:122:3)
            at processTicksAndRejections (node:internal/process/task_queues:83:21) {
          cause: Error: getaddrinfo ENOTFOUND ac-otj2fk4-shard-00-02.jqf3ndr.mongodb.net
              at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
            errno: -3008,
            code: 'ENOTFOUND',
            syscall: 'getaddrinfo',
            hostname: 'ac-otj2fk4-shard-00-02.jqf3ndr.mongodb.net'
          },
          [Symbol(errorLabels)]: Set(1) { 'ResetPool' }

I am glad you made this a separate post with the error details. Although the error family is the same, MongooseServerSelectionError, yours is different from the one in the other post: getaddrinfo ENOTFOUND. This error comes up when the DNS server your app’s host uses cannot resolve the hostname.
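To check this from inside the repl, here is a minimal sketch that resolves a host with dns.lookup, which goes through getaddrinfo just like the failing call in the trace. The function names and the CHECK_HOSTS environment variable are made up for this example.

```javascript
const dns = require('dns').promises;

// True when an error is a DNS resolution failure like the one in the trace.
function isDnsNotFound(err) {
  return Boolean(err) && err.code === 'ENOTFOUND' && err.syscall === 'getaddrinfo';
}

// Resolve one host with dns.lookup, which uses getaddrinfo under the hood.
async function checkHost(hostname) {
  try {
    const { address } = await dns.lookup(hostname);
    return { hostname, ok: true, address };
  } catch (err) {
    return { hostname, ok: false, dnsNotFound: isDnsNotFound(err), code: err.code };
  }
}

// CHECK_HOSTS is a hypothetical comma-separated env var, used only in this sketch.
if (process.env.CHECK_HOSTS) {
  Promise.all(process.env.CHECK_HOSTS.split(',').map(checkHost))
    .then((results) => console.log(results));
}
```

If the shard hostnames from the error come back with dnsNotFound: true while the same names resolve from your own PC, that points at the container's resolver rather than at the cluster.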

However, I suspect the cause is the same: the container in which your app starts has the problem. Assuming your app is also a free one, you can try the temporary solution I offered there (8th answer in that post).

Also, run this command in your repl’s shell to test the outbound port and check the IP address it reports: curl http://portquiz.net:27017/
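If curl is not handy, roughly the same port test can be done from Node. This is only a sketch: canConnect and the PORT_CHECK environment variable are names invented here, and it only checks that a TCP connection opens, nothing MongoDB-specific.

```javascript
const net = require('net');

// Resolve to true if a TCP connection to host:port opens within timeoutMs.
function canConnect(host, port, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once('connect', () => finish(true));
    socket.once('error', () => finish(false));
  });
}

// PORT_CHECK is a hypothetical env var, e.g. PORT_CHECK=portquiz.net:27017
if (process.env.PORT_CHECK) {
  const [host, port] = process.env.PORT_CHECK.split(':');
  canConnect(host, Number(port)).then((ok) =>
    console.log(`${host}:${port} ${ok ? 'reachable' : 'not reachable'}`));
}
```

Note that a hostname that cannot be resolved also ends up as "not reachable" here, so run the DNS check separately to tell the two cases apart.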

I also suggest sending a bug report about this problem from within the repl (help button on the bottom-left) to make the Replit team aware of the situation.


I sent a bug report yesterday as you suggested. It looks like the problem has gone away, at least for now.
Thank you for your swift response and for helping out.

They responded today, saying they are aware of the problem :wink:

Hey there,
Sorry to hear you’re having issues with MongoDB!
We are aware of this issue and are working on a fix. We will follow up as soon as we have an update!


Is there a bug tracker issue we can follow or any announcement about the downtime? This would seem to violate the SLA, as my hosting provider cannot connect, but nothing is being reported at https://status.cloud.mongodb.com/.

I did some debugging and it seems like a firewall issue on your end. All versions of MongoDB’s Node.js driver are affected.

@Dave_Powers, this problem is not on the MongoDB side. It is a new bug in some Replit containers, probably arising from one of the cloud providers they use. Unfortunately, there is no estimate of how long this will bug us.
If it is a free repl, you can try your luck by reloading it as many times as it takes, hoping it starts in a container that can connect to MongoDB. I can’t say the same for powered repls, as I don’t know how to stop/restart one (free ones stop when you leave the page).

They seem to think it’s a bug on your end, based on Discord discussions. What exactly is happening?

From the error it sounds like MongoDB is looking for a specific DNS record but can’t find it. It doesn’t seem to be looking for an A record, is it perhaps a AAAA/TXT? Could this be caused by an overly optimized DNS server like Cloudflare’s 1.1.1.1? Are there any known workarounds to get our apps up again?

Replit uses Google Cloud Platform for hardware, with NixOS containers. NixOS is not the issue; I run the same version on my desktop. You can restart always-on containers with kill 1.

Hi @Ray_Foss, it is not just about restarting the repl; you need to keep doing that until you hit a working IP range. I am not aware of the discussions on Discord. Can you link us there?

By the way, we opened this post after another discussion about a very similar problem (just a small difference in the error message). The Replit team responded to my bug report saying they are aware. You can find the link to the other discussion, and their response, in my answers above.

Found the workaround, thank you. The connection is actively being blocked based on IP address, as every Replit container runs the same software… Therefore the likely primary source of the issue is the Atlas firewall/fail2ban configuration. To test this theory, we could ssh from Replit to an SSH server with a known working IPv4 address and reverse port forward port 22. We can then use that as a dynamic proxy for a local MongoDB connection.

Kubernetes nodes rarely have stable IP addresses in general, so this should also sporadically affect Kubernetes users on GCP… Or anyone unfortunate enough to get a banned IP.

Long term, giving Atlas IPv6 support should make it easier to avoid this situation… That way one rogue customer plugin/container/wasm module/function on your Kubernetes Node doesn’t get the whole Node banned.

Would moving our Atlas cluster from AWS to GCP help?

@Ray_Foss I am not sure that is true after using “allow access everywhere”. Do you have time to test your theory, especially if you hit the buggy IP addresses frequently? Mine was an (un)lucky shot to get one of those addresses.

  • Create 3 Atlas clusters, each on a different cloud provider.
  • Use the following script to test connections to them all.
    • It is very simple. You can even use the MongoDB Node.js driver instead of mongoose, or try them both for fun :slight_smile:
    • Save it to a file and run it in the shell (the “shell”, not the “console”): node mongoconnets.js. You can edit it so it tests all 3 URLs at once.
    • If all fail, it is on Replit’s side.
    • If 1-2 fail, it is on the database’s cloud provider’s side.
const mongoose = require('mongoose');
// Connection string stored as a repl secret named "mongoUrl"
const mySecret = process.env['mongoUrl'];

const initialDbConnection = async () => {
  try {
    console.log("connecting?");
    await mongoose.connect(mySecret, {
      useNewUrlParser: true,
      useUnifiedTopology: true
    });
    console.log("db connected");
    // Uncomment to close the connection once the test succeeds
    // await mongoose.disconnect()
    // console.log("test complete. db disconnected")
  } catch (error) {
    console.error(error);
  }
};

// Print a timestamp at start and every 5 seconds to show how long the attempt takes
console.log(new Date());
initialDbConnection();
setInterval(() => console.log(new Date()), 5000);
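For testing all 3 URLs at once, something along these lines could work. It is a sketch, not a tested tool: the secret names MONGO_URL_AWS, MONGO_URL_GCP and MONGO_URL_AZURE are assumptions, as are the helper names, and diagnose only encodes the rule of thumb from the list above.

```javascript
// Interpret results with the rule of thumb: all fail -> Replit, some fail -> provider.
// `results` maps a label (e.g. cloud provider) to true (connected) or false (failed).
function diagnose(results) {
  const outcomes = Object.values(results);
  const failures = outcomes.filter((ok) => !ok).length;
  if (outcomes.length === 0) return 'no clusters tested';
  if (failures === 0) return 'all connected';
  if (failures === outcomes.length) return 'all failed: likely on the Replit side';
  return 'some failed: likely on the database cloud provider side';
}

// Try each URI in turn with an injected connect function, so the same loop
// works with mongoose or the plain MongoDB driver.
async function testAll(uris, tryConnect) {
  const results = {};
  for (const [label, uri] of Object.entries(uris)) {
    try {
      await tryConnect(uri);
      results[label] = true;
    } catch (err) {
      console.error(`${label}: ${err.message}`);
      results[label] = false;
    }
  }
  return results;
}

// Hypothetical secret names; only the ones that are set get tested.
const candidates = {
  aws: process.env.MONGO_URL_AWS,
  gcp: process.env.MONGO_URL_GCP,
  azure: process.env.MONGO_URL_AZURE,
};
const uris = Object.fromEntries(Object.entries(candidates).filter(([, uri]) => uri));

if (Object.keys(uris).length > 0) {
  const mongoose = require('mongoose');
  const tryConnect = async (uri) => {
    try {
      await mongoose.connect(uri, { serverSelectionTimeoutMS: 10000 });
    } finally {
      await mongoose.disconnect(); // always release the default connection
    }
  };
  testAll(uris, tryConnect).then((r) => console.log(r, diagnose(r)));
}
```

The connect function is passed in on purpose, so swapping mongoose for the plain driver only means changing tryConnect.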

PS: Tunneling would work no matter what you do, because that is what tunneling does :wink: It is like connecting from your own PC (remember, I tried side-by-side with Compass); it would just work. So it is not a good test or production method.

That seems like the most efficient way to potentially deal with this problem. I was using the MongoDB driver directly. This test would only rule out the cloud provider’s firewall, not any additional Atlas DoS attack protection, fail2ban config or firewall. I’ll still give this a try as it’s a quick test.

By reverse port forwarding port 22, I meant tunneling to the Replit container so that connections from Compass would go through the container using a socks proxy.

Good news, it has been fixed in the last 12 hours or so
Replit reproduction that checks AWS and GCP
Screenshot from 2022-12-03 10-58-55

Bad news, I have no idea what happened. The nature of it lends credence to it not being a DoS issue, but a staged release issue whereby 50% of the nodes had a bad DNS config.


It appears the problem is still occurring sporadically. Is it as if there is a maximum number of connections per IP on a global basis? I certainly have enough connections available for my database.

Can Atlas handle 100 projects connecting to different databases from the same IP?

Absolutely, but if you are using the free tier or one of the shared tiers, you get what you pay for: a cluster that is affected by what other people and applications are doing.

I have seen some posts about people doing performance assessments on the free/shared tiers. It sure slows you down if you are on the same shared tier.

But so far, nothing shows the issue is on the Atlas side. It might be Replit. The error ENOTFOUND is strictly DNS related. The resolution is not done by Atlas. DNS is highly distributed and cached. I did not look at the TTL values, but if one side receives ENOTFOUND, then the resolver on that erring side looks like the culprit.

By the way, are you using the SRV-style URI or the old style where the replica set hosts are individually specified?

The following

is a big misunderstanding of DNS. Atlas is supplying the DNS information. Your application, using the driver, is not able to find the DNS information. Your side is not able to find the information. DNS is the pillar of the internet that allows everything to work with names rather than numbers. When we query on our side with a reliable resolver it works.

One quick fix is to use Google’s free DNS servers, 8.8.8.8 and 8.8.4.4. I just do not know how to bypass Replit’s DNS resolvers to use Google’s.
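For diagnosis at least, Node can query Google’s servers directly without touching the container’s DNS settings. A sketch, with invented helper names and a hypothetical CHECK_HOST environment variable; it does not change which resolver the MongoDB driver uses, since connections go through dns.lookup and the system configuration.

```javascript
const { Resolver } = require('dns').promises;

// A resolver pinned to Google's public DNS servers. This affects only queries
// made through this object, not dns.lookup, so the driver itself is unchanged.
function googleResolver() {
  const resolver = new Resolver();
  resolver.setServers(['8.8.8.8', '8.8.4.4']);
  return resolver;
}

// Resolve the same name through the container's DNS servers and through Google's.
async function compare(hostname) {
  const system = new Resolver(); // uses the container's configured DNS servers
  const google = googleResolver();
  // Return the A records on success, or the error code (e.g. 'ENOTFOUND') on failure.
  const attempt = (r) => r.resolve4(hostname).then((a) => a, (e) => e.code);
  return { system: await attempt(system), google: await attempt(google) };
}

// CHECK_HOST is a hypothetical env var, e.g. one of the shard hostnames from the error.
if (process.env.CHECK_HOST) {
  compare(process.env.CHECK_HOST).then((result) => console.log(result));
}
```

If the system side returns an error code while the Google side returns addresses, the container’s resolver is the likely culprit.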

The other fix, if you really think that Atlas is the issue, is to manage your own mongod servers.

I always used paid stuff.

I’ve traced the source of the issue… bad DNS servers. If your container was created with the bad DNS server, there is no remedy but to delete it and start over.

A quick test is checking what happens when you run dig google.com. If you only get one response like Cloudflare likes to give, you have the bad unreliable DNS configuration. I’m very familiar with DNS… DNS can be used as a database and for auto configuration. The mongodb driver actually gets its replica information from a TXT record.


@Ray_Foss, how often does your app fall into this problem? As I said, mine was just a bit of luck to get one of those hosts and see this error. May I ask you to try this: when you hit a host with the problem, try to connect with the mongodb:// URL format instead of mongodb+srv://. I wonder if this has any relevance.
(From the same page on Atlas where you get the SRV string, select the oldest driver version.)

Nice to know the following.

It actually confirms

It gets part of it. The list of hosts actually comes from SRV records.
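To see both record types for yourself, here is a small sketch that does the lookups a mongodb+srv:// connection string relies on: an SRV query on _mongodb._tcp.<hostname> for the host list and a TXT query on the hostname for extra options. The helper names are made up, and it reads the same mongoUrl secret as the test script above.

```javascript
const dns = require('dns').promises;

// Pull the hostname out of a mongodb+srv:// URI; returns null for other schemes.
function srvHostname(uri) {
  const match = /^mongodb\+srv:\/\/(?:[^@/]*@)?([^/?:]+)/.exec(uri);
  return match ? match[1] : null;
}

// Build the SRV query name used to find the seed list.
function srvQueryName(hostname) {
  return `_mongodb._tcp.${hostname}`;
}

async function describeSrvUri(uri) {
  const hostname = srvHostname(uri);
  if (!hostname) throw new Error('not a mongodb+srv:// URI');
  const srv = await dns.resolveSrv(srvQueryName(hostname)); // list of hosts and ports
  const txt = await dns.resolveTxt(hostname);               // extra connection options
  return {
    hosts: srv.map((r) => `${r.name}:${r.port}`),
    options: txt.map((chunks) => chunks.join('')),
  };
}

// Uses the "mongoUrl" secret from the earlier script, if it is an SRV-style URI.
const uri = process.env['mongoUrl'];
if (uri && srvHostname(uri)) {
  describeSrvUri(uri).then(console.log, console.error);
}
```

If the SRV and TXT lookups succeed but the individual shard hostnames still fail with ENOTFOUND, the problem is in resolving the A records on that container, and switching to the mongodb:// format alone would probably not help.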

I paid for a hacker plan and after two months it’s still not working.
It’s unbelievable…