Client experienced a timeout when connecting to ‘m103-repl’

Hi All,

If anyone comes across the following error, please try the possible solutions given below.
If nothing works, please feel free to post in the discussion forum.

Error: Client experienced a timeout when connecting to ‘m103-repl’ - check that mongod/mongos
processes are running on the correct ports, and that the ‘m103-admin’ user
authenticates against the admin database.

Possible Solutions:

  1. Check sh.status() and the replica set status with rs.status() to confirm all nodes are running on the ports the lab asks for.
  2. The user m103-admin should authenticate against the admin database.
  3. The replica set name should be exactly the same as mentioned in the lab notes.
  4. Check whether the VM is running out of memory and, if so, allocate more to Vagrant.
    To do so, modify the Vagrantfile in the m103-vagrant-env directory at line 16, changing vb.memory = 2048 to vb.memory = 4096 (see the sketch after this list).
    Alternatively, the memory setting can also be changed directly through the VirtualBox UI.
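For reference, the relevant provider block in the Vagrantfile would look roughly like this after the change (a sketch; the surrounding lines in your copy may differ slightly):

    config.vm.provider "virtualbox" do |vb|
      # bump the VM memory from the default 2048 MB to 4096 MB
      vb.memory = 4096
    end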

Happy Learning!

Thanks,
Muskan
Curriculum Support Engineer


I also hit the same error:

Client experienced a timeout when connecting to 'm103-repl-2' - check that mongod/mongos
processes are running on the correct ports, and that the 'm103-admin' user authenticates against the admin database.

I resolved it after doing the following:

  1. created authenticated user for new replica set
  2. added the additional nodes, 5 and 6, to the new replica set
  3. stopped all my mongo processes
  4. started all the mongo processes again

No need for me to delete anything.

Walking through each of the steps…

Checking the log confirmed the validation script was having a problem when trying to connect as the m103-admin user:

vagrant@m103:~$ grep log mongod-repl-4.conf
  path: /var/mongodb/db/4/mongod.log
  logAppend: true
vagrant@m103:~$
vagrant@m103:~$ tail -2 /var/mongodb/db/4/mongod.log
2020-03-22T13:11:33.816+0000 I ACCESS   [conn235] SCRAM-SHA-1 authentication failed for m103-admin on admin from client 192.168.103.100:43495 ; UserNotFound: Could not find user m103-admin@admin
2020-03-22T13:11:33.816+0000 I NETWORK  [conn235] end connection 192.168.103.100:43495 (17 connections now open)
vagrant@m103:~$

The config for node 4 didn’t have authorization enabled (as per lab config), so there was previously no need to create that user for the new replica set.

So I created the user:

>     mongo admin --host localhost:27004 --eval '
>       db.createUser({
>         user: "m103-admin",
>         pwd: "m103-pass",
>         roles: [
>           {role: "root", db: "admin"}
>         ]
>       })
>     '

Authenticated access was now successful:
mongo --host "m103-repl-2/192.168.103.100:27004" -u "m103-admin" -p "m103-pass" --authenticationDatabase "admin"

validate_lab_shard_collection still failed, but not for the previous reason.

I saw the suggestion of exhausted resources, so I went to shut down all my nodes.
The first attempt failed:
mongo admin --port 27004 --username m103-admin --password m103-pass --eval 'db.shutdownServer()'
->
No electable secondaries caught up as of 2020-03-22T13:35:59.395+0000Please use the replSetStepDown command with the argument {force: true} to force node to step down
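The error message itself points at a workaround (forcing the primary to step down before shutting it down). A rough sketch of that, though it isn't the route I ended up taking:

    # force the step-down even though no secondary is caught up
    mongo admin --port 27004 --username m103-admin --password m103-pass \
      --eval 'db.adminCommand({ replSetStepDown: 60, force: true })'
    # the step-down drops client connections, so issue the shutdown as a second command
    mongo admin --port 27004 --username m103-admin --password m103-pass \
      --eval 'db.shutdownServer()'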

As I was sticking faithfully to the lab exercise, I had not yet added the other two nodes to the replica set. So I resolved that:
mongo --port 27004 -u "m103-admin" -p "m103-pass"

rs.add("192.168.103.100:27005")
rs.add("192.168.103.100:27006")

MongoDB Enterprise m103-repl-2:PRIMARY> rs.isMaster()
{
        "hosts" : [
                "192.168.103.100:27004",
                "192.168.103.100:27005",
                "192.168.103.100:27006"
        ],

Now shutdown of each of my nodes was successful.
Then when I started them all up again, the validation script completed successfully.
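For completeness, the stop/start cycle was just the usual pattern, roughly as below (I'm only showing the m103-repl-2 nodes here; the config file names for nodes 5 and 6 are assumptions based on the node 4 naming):

    # stop the secondaries first, the primary last
    mongo admin --port 27005 -u m103-admin -p m103-pass --eval 'db.shutdownServer()'
    mongo admin --port 27006 -u m103-admin -p m103-pass --eval 'db.shutdownServer()'
    mongo admin --port 27004 -u m103-admin -p m103-pass --eval 'db.shutdownServer()'

    # start each node again from its config file
    mongod -f mongod-repl-4.conf
    mongod -f mongod-repl-5.conf   # assumed file name
    mongod -f mongod-repl-6.conf   # assumed file name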

A general comment: these labs can be really time-consuming, sometimes taking many hours, so some of these modules take an age to complete. Some of that time is productive because it deepens understanding, but some of it doesn't feel like a productive use of our time, unfortunately.


Hi @Andrew_01236,

Thanks for sharing your experience as well as the feedback.

We are currently in the process of upgrading the labs in the M103 course, and I would love to know your thoughts on this. I've inboxed you with a few questions.

Thanks,
Shubham Ranjan
Curriculum Services Engineer


Hi, I did increase the RAM for the Vagrant environment, but I still got the same result:
Client experienced a timeout when connecting to ‘m103-repl’ - check that mongod/mongos
processes are running on the correct ports, and that the ‘m103-admin’ user
authenticates against the admin database.

I re-checked all port numbers and the status of the replica set, the CSRS, and the mongos config, but I still get the same result.

This is the combined output from running the lab. Any suggestion is welcome:

C:\Users\m103>cd m103-vagrant-env

C:\Users\m103\m103-vagrant-env>vagrant up
Bringing machine ‘mongod-m103’ up with ‘virtualbox’ provider…
==> mongod-m103: Clearing any previously set forwarded ports…
==> mongod-m103: Clearing any previously set network interfaces…
==> mongod-m103: Preparing network interfaces based on configuration…
mongod-m103: Adapter 1: nat
mongod-m103: Adapter 2: hostonly
==> mongod-m103: Forwarding ports…
mongod-m103: 22 (guest) => 2222 (host) (adapter 1)
==> mongod-m103: Running ‘pre-boot’ VM customizations…
==> mongod-m103: Booting VM…
==> mongod-m103: Waiting for machine to boot. This may take a few minutes…
mongod-m103: SSH address: 127.0.0.1:2222
mongod-m103: SSH username: vagrant
mongod-m103: SSH auth method: private key
mongod-m103: Warning: Connection aborted. Retrying…
mongod-m103: Warning: Connection reset. Retrying…
==> mongod-m103: Machine booted and ready!
==> mongod-m103: Checking for guest additions in VM…
mongod-m103: The guest additions on this VM do not match the installed version of
mongod-m103: VirtualBox! In most cases this is fine, but in rare cases it can
mongod-m103: prevent things such as shared folders from working properly. If you see
mongod-m103: shared folder errors, please make sure the guest additions within the
mongod-m103: virtual machine match the version of VirtualBox you have installed on
mongod-m103: your host and reload your VM.
mongod-m103:
mongod-m103: Guest Additions Version: 4.3.40
mongod-m103: VirtualBox Version: 6.1
==> mongod-m103: Setting hostname…
==> mongod-m103: Configuring and enabling network interfaces…
==> mongod-m103: Mounting shared folders…
mongod-m103: /shared => C:/Users/m103/m103-vagrant-env/shared
mongod-m103: /vagrant => C:/Users/m103/m103-vagrant-env
mongod-m103: /dataset => C:/Users/m103/m103-vagrant-env/dataset
==> mongod-m103: Machine already provisioned. Run vagrant provision or use the --provision
==> mongod-m103: flag to force provisioning. Provisioners marked to run always will still run.

C:\Users\m103\m103-vagrant-env>vagrant ssh
Welcome to Ubuntu 14.04.6 LTS (GNU/Linux 3.13.0-170-generic x86_64)

System information disabled due to load higher than 2.0

New release ‘16.04.6 LTS’ available.
Run ‘do-release-upgrade’ to upgrade to it.

Last login: Mon Jun 29 00:18:22 2020 from 10.0.2.2

vagrant@m103:~$ ps -ef | grep mongo
vagrant   2089     1  5 00:30 ?        00:00:33 mongod -f csrs_1.conf
vagrant   2174     1  5 00:30 ?        00:00:32 mongod -f csrs_2.conf
vagrant   2266     1  5 00:30 ?        00:00:31 mongod -f csrs_3.conf
vagrant   2374     1  5 00:31 ?        00:00:29 mongod -f /data/mongod-repl-1.conf
vagrant   2474     1  5 00:31 ?        00:00:28 mongod -f /data/mongod-repl-2.conf
vagrant   2584     1  5 00:32 ?        00:00:27 mongod -f /data/mongod-repl-3.conf
vagrant   2744     1  1 00:34 ?        00:00:03 mongos -f mongos.conf
vagrant   2814  2063  0 00:36 pts/1    00:00:00 mongo --port 26000 --username m103-admin --password xxxxxxxxx --authenticationDatabase admin
vagrant   2903  1945  0 00:40 pts/0    00:00:00 grep --color=auto mongo
vagrant@m103:~$ validate_lab_first_sharded_cluster

Replica set ‘m103-repl’ not configured correctly - make sure each node is started with
a wiredTiger cache size of 0.1 GB. Your cluster will crash in the following lab
if you don’t do this!
vagrant@m103:~$ validate_lab_first_sharded_cluster

Client experienced a timeout when connecting to ‘m103-repl’ - check that mongod/mongos
processes are running on the correct ports, and that the ‘m103-admin’ user
authenticates against the admin database.
vagrant@m103:~$

=================

from mongos=>

Microsoft Windows [Version 10.0.19041.329]
© 2020 Microsoft Corporation. All rights reserved.

C:\Users\a>cd …

C:\Users>cd m103

C:\Users\m103>cd m103-vagrant-env

C:\Users\m103\m103-vagrant-env>vagrant ssh
Welcome to Ubuntu 14.04.6 LTS (GNU/Linux 3.13.0-170-generic x86_64)

System information disabled due to load higher than 2.0

New release ‘16.04.6 LTS’ available.
Run ‘do-release-upgrade’ to upgrade to it.

Last login: Mon Jun 29 00:47:12 2020 from 10.0.2.2
vagrant@m103:~$ mongo --port 26000 --username m103-admin --password m103-pass --authenticationDatabase admin
MongoDB shell version v3.6.18
connecting to: mongodb://127.0.0.1:26000/?authSource=admin&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("19b69984-b361-4fea-94ef-1b9f7e4aee58") }
MongoDB server version: 3.6.18
MongoDB Enterprise mongos> rs.status()
{
        "info" : "mongos",
        "ok" : 0,
        "errmsg" : "replSetGetStatus is not supported through mongos",
        "operationTime" : Timestamp(1593391916, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1593391916, 1),
                "signature" : {
                        "hash" : BinData(0,"nyCHD6a4VfuUARxXfN/HCleaSiQ="),
                        "keyId" : NumberLong("6843469386137731096")
                }
        }
}
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5ef8e326abe2e1ab6ec59763")
  }
  shards:
        { "_id" : "m103-repl", "host" : "m103-repl/192.168.103.100:27001,m103:27002,m103:27003", "state" : 1 }
  active mongoses:
        "3.6.18" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 5
        Last reported error: Could not find host matching read preference { mode: "primary" } for set m103-repl
        Time of Reported error: Mon Jun 29 2020 00:31:55 GMT+0000 (UTC)
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        { "_id" : "config", "primary" : "config", "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                m103-repl  1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : m103-repl Timestamp(1, 0)

MongoDB Enterprise mongos>

Please show the output of rs.status().
You have a mixed-type configuration, as seen above (hostnames and IPs). Use only IPs.
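For example, one way to swap a member from hostname to IP is to remove and re-add it from the primary, roughly like this (a sketch; editing the member list with rs.reconfig() is another option):

    // connect to the primary of m103-repl, then for each hostname-based member:
    rs.remove("m103:27002")
    rs.add("192.168.103.100:27002")
    rs.remove("m103:27003")
    rs.add("192.168.103.100:27003")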

Hi Ramachandra, thanks for the suggestion. I changed all the configuration to the IP format; however, this did not solve the problem.

Your first validate gave some WiredTiger errors.
Has that been taken care of?
If resources are low, you will get a timeout.
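For reference, the 0.1 GB WiredTiger cache is set per node in the mongod config file, along these lines (only the relevant storage section is shown; the dbPath here is just an example):

    storage:
      dbPath: /var/mongodb/db/1    # example path; keep whatever your node already uses
      wiredTiger:
        engineConfig:
          cacheSizeGB: 0.1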

I asked for rs.status().
Can you connect to each node of m103-repl and to the replica set as a whole?
Are all ports and the replica set name as per the lab?

Hi Ramachandra.

  1. If you are talking about cacheSizeGB: 0.1, yes, I did take care of it.

  2. The names of all the processes that I am running in the lab are:
    vagrant 1966 1 6 14:47 ? 00:00:38 mongod -f /data/csrs_1.conf
    vagrant 2050 1 6 14:47 ? 00:00:36 mongod -f /data/csrs_2.conf
    vagrant 2136 1 6 14:47 ? 00:00:35 mongod -f /data/csrs_3.conf
    vagrant 2244 1 5 14:48 ? 00:00:32 mongod -f /data/mongod-repl-1.conf
    vagrant 2338 1 5 14:48 ? 00:00:31 mongod -f /data/mongod-repl-3.conf
    vagrant 2443 1 5 14:48 ? 00:00:31 mongod -f /data/mongod-repl-2.conf
    vagrant 2573 1 1 14:49 ? 00:00:05 mongos -f /data/mongos.conf

  3. I can connect to the whole replica set, see the output:
    rs.status()
    {
        "set" : "m103-repl",
        "date" : ISODate("2020-06-29T14:54:07.812Z"),
        "myState" : 1,
        "term" : NumberLong(5),
        "syncingTo" : "",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
            "lastCommittedOpTime" : {
                "ts" : Timestamp(1593442442, 1),
                "t" : NumberLong(5)
            },
            "readConcernMajorityOpTime" : {
                "ts" : Timestamp(1593442442, 1),
                "t" : NumberLong(5)
            },
            "appliedOpTime" : {
                "ts" : Timestamp(1593442442, 1),
                "t" : NumberLong(5)
            },
            "durableOpTime" : {
                "ts" : Timestamp(1593442442, 1),
                "t" : NumberLong(5)
            }
        },
        "members" : [
            {
                "_id" : 0,
                "name" : "192.168.103.100:27001",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 348,
                "optime" : {
                    "ts" : Timestamp(1593442442, 1),
                    "t" : NumberLong(5)
                },
                "optimeDate" : ISODate("2020-06-29T14:54:02Z"),
                "syncingTo" : "",
                "syncSourceHost" : "",
                "syncSourceId" : -1,
                "infoMessage" : "",
                "electionTime" : Timestamp(1593442130, 1),
                "electionDate" : ISODate("2020-06-29T14:48:50Z"),
                "configVersion" : 3,
                "self" : true,
                "lastHeartbeatMessage" : ""
            },
            {
                "_id" : 1,
                "name" : "192.168.103.100:27002",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 304,
                "optime" : {
                    "ts" : Timestamp(1593442442, 1),
                    "t" : NumberLong(5)
                },
                "optimeDurable" : {
                    "ts" : Timestamp(1593442442, 1),
                    "t" : NumberLong(5)
                },
                "optimeDate" : ISODate("2020-06-29T14:54:02Z"),
                "optimeDurableDate" : ISODate("2020-06-29T14:54:02Z"),
                "lastHeartbeat" : ISODate("2020-06-29T14:54:05.911Z"),
                "lastHeartbeatRecv" : ISODate("2020-06-29T14:54:07.291Z"),
                "pingMs" : NumberLong(1),
                "lastHeartbeatMessage" : "",
                "syncingTo" : "192.168.103.100:27003",
                "syncSourceHost" : "192.168.103.100:27003",
                "syncSourceId" : 2,
                "infoMessage" : "",
                "configVersion" : 3
            },
            {
                "_id" : 2,
                "name" : "192.168.103.100:27003",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 321,
                "optime" : {
                    "ts" : Timestamp(1593442442, 1),
                    "t" : NumberLong(5)
                },
                "optimeDurable" : {
                    "ts" : Timestamp(1593442442, 1),
                    "t" : NumberLong(5)
                },
                "optimeDate" : ISODate("2020-06-29T14:54:02Z"),
                "optimeDurableDate" : ISODate("2020-06-29T14:54:02Z"),
                "lastHeartbeat" : ISODate("2020-06-29T14:54:07.310Z"),
                "lastHeartbeatRecv" : ISODate("2020-06-29T14:54:06.184Z"),
                "pingMs" : NumberLong(1),
                "lastHeartbeatMessage" : "",
                "syncingTo" : "192.168.103.100:27001",
                "syncSourceHost" : "192.168.103.100:27001",
                "syncSourceId" : 0,
                "infoMessage" : "",
                "configVersion" : 3
            }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1593442442, 1),
        "$gleStats" : {
            "lastOpTime" : Timestamp(0, 0),
            "electionId" : ObjectId("7fffffff0000000000000005")
        },
        "$configServerState" : {
            "opTime" : {
                "ts" : Timestamp(1593442440, 1),
                "t" : NumberLong(4)
            }
        },
        "$clusterTime" : {
            "clusterTime" : Timestamp(1593442442, 1),
            "signature" : {
                "hash" : BinData(0,"TnnJ9wyyjAlRKB9wt6zXCcPHA6o="),
                "keyId" : NumberLong("6843618335603556363")
            }
        }
    }
    MongoDB Enterprise m103-repl:PRIMARY>

Please show latest sh.status() after you changed hostname to IP
Also addShard output

I was not asking about the config file names.
Check whether the replSetName is the same in all three config files, the clusterRole, etc.
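A quick way to compare those settings across the config files is something like this (paths taken from the process list above):

    grep -E 'replSetName|clusterRole|keyFile' /data/mongod-repl-*.conf /data/csrs_*.conf /data/mongos.conf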

Hi @alejandro_28935,

Please share the following information:

Also, what steps did you follow for changing the hostname to the IP address?

~ Shubham

Hi Ramachandra. Here is the info you asked for:

Please show latest sh.status() after you changed hostname to IP

Also addShard output

Also, what steps did you follow for changing the hostname to IP Address ?

  1. I did fix the IP problem by dropping the nodes using the command rs.remove() and then adding back the replica set members using rs.add("192.168.103.100:27002") instead of rs.add("m103:27002").

https://docs.mongodb.com/manual/tutorial/remove-replica-set-member/

  2. Here is the result of adding the shard:

vagrant@m103:/data$ mongo --port 26000 --username m103-admin --password m103-pass --authenticationDatabase admin
MongoDB shell version v3.6.18
connecting to: mongodb://127.0.0.1:26000/?authSource=admin&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("6ce23bba-df4c-4426-a48a-a061a0ea164a") }
MongoDB server version: 3.6.18
MongoDB Enterprise mongos> sh.addShard("m103-repl/192.168.103.100:27001")
{
        "shardAdded" : "m103-repl",
        "ok" : 1,
        "operationTime" : Timestamp(1593526735, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1593526735, 1),
                "signature" : {
                        "hash" : BinData(0,"lJbgktDcH89G6lYdld0MNumDzc0="),
                        "keyId" : NumberLong("6843618335603556363")
                }
        }
}
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5ef96a9ef35a5f1f01cd3c9e")
  }
  shards:
        { "_id" : "m103-repl", "host" : "m103-repl/192.168.103.100:27001,192.168.103.100:27002,192.168.103.100:27003", "state" : 1 }
  active mongoses:
        "3.6.18" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 5
        Last reported error: Could not find host matching read preference { mode: "primary" } for set m103-repl
        Time of Reported error: Tue Jun 30 2020 14:17:16 GMT+0000 (UTC)
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        { "_id" : "config", "primary" : "config", "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                m103-repl  1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : m103-repl Timestamp(1, 0)
MongoDB Enterprise mongos>

Even now, when I run the lab validation, I get the same message about the connection problem. I do not know what else I can do to solve this one.

Regards,

sh.status() and addShard look fine.
Are your keyfiles consistent?
Were any authentication params missed in the configs?
Does your mongos.log show any errors?
Is there a discrepancy in /etc/hosts? (A few quick checks for these are sketched below.)
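Something along these lines (the keyfile and log paths are assumptions, so substitute whatever your config files actually point at):

    # every mongod and the mongos should point at the same keyfile, identical on disk
    grep keyFile /data/*.conf
    md5sum /var/mongodb/pki/m103-keyfile      # assumed path; use the one the grep shows
    # look for recent errors in the mongos log (assumed path; take it from mongos.conf)
    grep -i error /var/mongodb/db/mongos.log | tail -20
    # check that hostnames and 192.168.103.100 resolve consistently
    cat /etc/hosts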

Make sure that your replica set name is correct.
Make sure to include at least the IP or hostname of the master (primary) in the replica set string.
Make sure to use the correct port of the master node in the replica set.
Check that you are using the correct keyfile defined for mongos and the replica set.
After that you just need to use:

sh.addShard("your_rs_name/rs_master_ip_or_hostname:port")
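In this thread, for example, that works out to:

    sh.addShard("m103-repl/192.168.103.100:27001")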