My local sharded Mongo cluster has been running without issue on version 4.4 for some time. I am attempting to upgrade to 5.0.13 with the ultimate goal of upgrading to 6.0.
When I run docker-compose up, the containers start, but the shards log the following error messages. Note that “a8f4b836a4fc” is the container ID of the old config server, which ran the 4.4 image and is no longer running.
{"t":{"$date":"2023-03-14T16:45:32.342+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"a8f4b836a4fc:27017"}}
{"t":{"$date":"2023-03-14T16:45:32.348+00:00"},"s":"I", "c":"-", "id":4333222, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM received error response","attr":{"host":"a8f4b836a4fc:27017","error":"HostUnreachable: Error connecting to a8f4b836a4fc:27017 :: caused by :: Could not find address for a8f4b836a4fc:27017: SocketException: Host not found (authoritative)","replicaSet":"rs0","response":"{}"}}
{"t":{"$date":"2023-03-14T16:45:32.348+00:00"},"s":"I", "c":"NETWORK", "id":4712102, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Host failed in replica set","attr":{"replicaSet":"rs0","host":"a8f4b836a4fc:27017","error":{"code":6,"codeName":"HostUnreachable","errmsg":"Error connecting to a8f4b836a4fc:27017 :: caused by :: Could not find address for a8f4b836a4fc:27017: SocketException: Host not found (authoritative)"},"action":{"dropConnections":true,"requestImmediateCheck":true}}}
{"t":{"$date":"2023-03-14T16:45:32.852+00:00"},"s":"I", "c":"-", "id":4333222, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM received error response","attr":{"host":"a8f4b836a4fc:27017","error":"HostUnreachable: Error connecting to a8f4b836a4fc:27017 :: caused by :: Could not find address for a8f4b836a4fc:27017: SocketException: Host not found (authoritative)","replicaSet":"rs0","response":"{}"}}
{"t":{"$date":"2023-03-14T16:45:32.852+00:00"},"s":"I", "c":"NETWORK", "id":4712102, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Host failed in replica set","attr":{"replicaSet":"rs0","host":"a8f4b836a4fc:27017","error":{"code":6,"codeName":"HostUnreachable","errmsg":"Error connecting to a8f4b836a4fc:27017 :: caused by :: Could not find address for a8f4b836a4fc:27017: SocketException: Host not found (authoritative)"},"action":{"dropConnections":true,"requestImmediateCheck":false,"outcome":
If I remove the containers’ volumes, the upgrade works fine. Based on that, I grepped the volumes for “a8f4b836a4fc” and found it in several places, for example: configsvrConnectionStringrs0/a8f4b836a4fc:27017. In other words, the shards’ storage volumes retain the container ID of the old config server and are possibly using it for connection purposes.
With fresh volumes, the WiredTiger files show that configsvrConnectionString references the config server by container name rather than by container ID, which makes a lot more sense to me. I couldn’t find anything online about Mongo 4.4 referencing the config server by container ID, or about why that would be cached or stored in the first place, since container IDs change. I would like to understand why this is happening and how I can prevent my 4.4 setup from “caching” the container ID in its storage volumes and then trying to use the old ID to connect after the upgrade is complete.
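One possible explanation, stated here as an assumption and not a confirmed diagnosis: when rs.initiate() is run without an explicit member list, the replica set records the host's own hostname as the member address, and Docker sets a container's hostname to its short container ID unless told otherwise. A minimal sketch of the mongocfg service with the hostname pinned follows; the `hostname` key is the only addition, and whether it resolves the problem in this cluster is untested.

```yaml
# Sketch only: the 4.4 mongocfg service with a fixed hostname.
# Assumption: the replica set was initiated without an explicit host,
# so mongod stored whatever hostname the container had at that time.
services:
  mongocfg:
    container_name: mongocfg
    hostname: mongocfg   # added line; keeps the hostname stable across container re-creation
    image: mongo:4.4.0
    command: mongod --configsvr --replSet rs0 --dbpath /data/db --port 27017
    volumes:
      - ~/mongo-shard/mongodatacfg:/data/db
```

This would only affect a replica set initiated after the change. A set that has already stored the container ID would presumably need its member host updated with rs.reconfig() before upgrading, which is not shown here.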
Thanks in advance.
Here is the docker-compose.yaml that works for 4.4 (written by someone who has since left the company):
version: '3.7'
services:
  mongodb1:
    container_name: mongodb1
    image: mongo:4.4.0
    command: mongod --shardsvr --dbpath /data/db --port 27017
    ports:
      - 27017:27017
    expose:
      - "27017"
    environment:
      TERM: xterm
    volumes:
      - ~/mongo-shard/mongodata1:/data/db
  mongodb2:
    container_name: mongodb2
    image: mongo:4.4.0
    command: mongod --shardsvr --dbpath /data/db --port 27017
    ports:
      - 27027:27017
    expose:
      - "27017"
    environment:
      TERM: xterm
    volumes:
      - ~/mongo-shard/mongodata2:/data/db
  mongodb3:
    container_name: mongodb3
    image: mongo:4.4.0
    command: mongod --shardsvr --dbpath /data/db --port 27017
    ports:
      - 27037:27017
    expose:
      - "27017"
    environment:
      TERM: xterm
    volumes:
      - ~/mongo-shard/mongodata3:/data/db
  mongocfg:
    container_name: mongocfg
    image: mongo:4.4.0
    command: mongod --configsvr --replSet rs0 --dbpath /data/db --port 27017
    environment:
      TERM: xterm
    expose:
      - "27017"
    volumes:
      - ~/mongo-shard/mongodatacfg:/data/db
  mongos:
    container_name: mongos
    image: mongo:4.4.0
    depends_on:
      - mongocfg
    command: mongos --configdb rs0/mongocfg:27017 --bind_ip_all --port 27017
    ports:
      - 27022:27017
    expose:
      - "27017"
Here is the docker-compose file for 5.0.13 (based on his 4.4 file):
version: '2'
services:
  mongos:
    image: mongo:5.0.13
    container_name: mongos
    command: mongos --port 27017 --configdb rs0/mongocfg:27017 --bind_ip_all
    ports:
      - 27022:27017
  mongocfg:
    image: mongo:5.0.13
    container_name: mongocfg
    command: mongod --port 27017 --configsvr --replSet rs0 --bind_ip_all
    volumes:
      - ~/mongo-shard/mongodatacfg:/data/db
  mongodb1:
    image: mongo:5.0.13
    container_name: mongodb1
    command: mongod --port 27017 --shardsvr --replSet rs-shard-01 --bind_ip_all
    volumes:
      - ~/mongo-shard/mongodata1:/data/db
    ports:
      - 27027:27017
  mongodb2:
    image: mongo:5.0.13
    container_name: mongodb2
    command: mongod --port 27017 --shardsvr --replSet rs-shard-02 --bind_ip_all
    volumes:
      - ~/mongo-shard/mongodata2:/data/db
    ports:
      - 27037:27017
  mongodb3:
    image: mongo:5.0.13
    container_name: mongodb3
    command: mongod --port 27017 --shardsvr --replSet rs-shard-03 --bind_ip_all
    volumes:
      - ~/mongo-shard/mongodata3:/data/db
    ports:
      - 27047:27017