Hi,
I'm a long-time MongoDB user, and this is the first time I'm stuck on a weird issue…
I have a cluster with 5 shards:
{ "_id" : "reports-z1-0", "host" : "reports-z1-0/mongodb-shard-reports-0-0.mongodb-shard-reports-0.default:27018,mongodb-shard-reports-0-1.mongodb-shard-reports-0.default:27018,mongodb-shard-reports-0-2.mongodb-shard-reports-0.default:27018", "state" : 1, "tags" : [ "z1" ] }
{ "_id" : "reports-z1-1", "host" : "reports-z1-1/mongodb-shard-reports-1-0.mongodb-shard-reports-1.default:27018,mongodb-shard-reports-1-1.mongodb-shard-reports-1.default:27018,mongodb-shard-reports-1-2.mongodb-shard-reports-1.default:27018", "state" : 1, "tags" : [ "z1" ] }
{ "_id" : "shard-z0-0", "host" : "shard-z0-0/mongodb-shard-data-0-0.mongodb-shard-data-0.default:27018,mongodb-shard-data-0-1.mongodb-shard-data-0.default:27018,mongodb-shard-data-0-2.mongodb-shard-data-0.default:27018", "state" : 1, "tags" : [ "z0" ] }
{ "_id" : "shard-z0-1", "host" : "shard-z0-1/mongodb-shard-data-1-0.mongodb-shard-data-1.default:27018,mongodb-shard-data-1-1.mongodb-shard-data-1.default:27018,mongodb-shard-data-1-2.mongodb-shard-data-1.default:27018", "state" : 1, "tags" : [ "z0" ] }
{ "_id" : "shard-z0-2", "host" : "shard-z0-2/mongodb-shard-data-2-0.mongodb-shard-data-2.default:27018,mongodb-shard-data-2-1.mongodb-shard-data-2.default:27018,mongodb-shard-data-2-2.mongodb-shard-data-2.default:27018", "state" : 1, "tags" : [ "z0" ] }
I upgraded from 4.2 to the latest release, 4.4.2.
I then tried to set the feature compatibility version:
db.adminCommand( { setFeatureCompatibilityVersion: "4.4" } )
The command returns:
{
    "operationTime" : Timestamp(1607534165, 15),
    "ok" : 0,
    "errmsg" : "No chunks were found for the collection",
    "code" : 117,
    "codeName" : "ConflictingOperationInProgress",
    "$gleStats" : {
        "lastOpTime" : {
            "ts" : Timestamp(1607534165, 15),
            "t" : NumberLong(79)
        },
        "electionId" : ObjectId("7fffffff000000000000004f")
    },
    "lastCommittedOpTime" : Timestamp(1607534165, 15),
    "$configServerState" : {
        "opTime" : {
            "ts" : Timestamp(1607534165, 4),
            "t" : NumberLong(73)
        }
    },
    "$clusterTime" : {
        "clusterTime" : Timestamp(1607534165, 15),
        "signature" : {
            "hash" : BinData(0,"nQDtxjtK93fj0wnyN8Wy19Phb9U="),
            "keyId" : NumberLong("6860912798809980930")
        }
    }
}
I checked all nodes by hand and some are showing:
The server generated these startup warnings when booting:
2020-12-09T17:13:20.379+00:00: A featureCompatibilityVersion upgrade did not complete. To fix this, use the setFeatureCompatibilityVersion command to resume upgrade to 4.4
2020-12-09T17:13:20.379+00:00: currentfeatureCompatibilityVersion: upgrading to 4.4
and
db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )
{
    "featureCompatibilityVersion" : {
        "version" : "4.2",
        "targetVersion" : "4.4"
    },
    "ok" : 1,
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("7fffffff0000000000000049")
    },
    "lastCommittedOpTime" : Timestamp(1607534584, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1607534584, 1),
        "signature" : {
            "hash" : BinData(0,"MfD/ygGU2rnKml3T/d91iImtIdk="),
            "keyId" : NumberLong("6860912798809980930")
        }
    },
    "operationTime" : Timestamp(1607534584, 1)
}
The config server and shard-z0-0 are stuck in that state.
shard-z0-1 and shard-z0-2 still report an FCV of 4.2.
reports-z1-0 and reports-z1-1 report the expected FCV of 4.4.
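To make the per-node picture easier to compare, here is a minimal sketch in plain JavaScript (it should also paste into the mongo shell) that groups nodes by FCV state. The function name summarizeFcv and the node labels are hypothetical, not part of MongoDB. The input is assumed to be the "featureCompatibilityVersion" sub-document from each node's getParameter reply, copied by hand. It relies on the assumption, consistent with the output above, that a "targetVersion" field appears only while an FCV change is in progress.

```javascript
// Hypothetical helper (not a MongoDB API): groups node labels by FCV state.
// fcvByNode maps a node label to the "featureCompatibilityVersion"
// sub-document returned by getParameter on that node.
function summarizeFcv(fcvByNode) {
  const groups = {};
  for (const node of Object.keys(fcvByNode)) {
    const fcv = fcvByNode[node];
    // Assumption: targetVersion is present only during an FCV transition.
    const state = fcv.targetVersion
      ? "in progress (" + fcv.version + ", target " + fcv.targetVersion + ")"
      : "stable " + fcv.version;
    if (!groups[state]) {
      groups[state] = [];
    }
    groups[state].push(node);
  }
  return groups;
}

// Values transcribed from the states described above; "config" is just a label.
const observed = {
  "config": { version: "4.2", targetVersion: "4.4" },
  "shard-z0-0": { version: "4.2", targetVersion: "4.4" },
  "shard-z0-1": { version: "4.2" },
  "shard-z0-2": { version: "4.2" },
  "reports-z1-0": { version: "4.4" },
  "reports-z1-1": { version: "4.4" }
};
const summary = summarizeFcv(observed);
```

Here summary ends up with three groups (in progress, stable 4.2, stable 4.4), which is the mixed state I am trying to get out of.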
I tried restarting everything, but it doesn't help.
Because of this, all chunk splits are stuck:
"ConflictingOperationInProgress: Chunks cannot be split while a feature compatibility version upgrade or downgrade is in progress"
I also spotted this message on the data nodes:
"ctx":"initandlisten","msg":"A featureCompatibilityVersion upgrade did not complete. To fix this, use the setFeatureCompatibilityVersion command to resume upgrade to 4.4","attr":{"currentfeatureCompatibilityVersion":"upgrading to 4.4"},"tags":["startupWarnings"]}
How can I recover from that state?
Thanks