Sharded cluster deployment pods in unhealthy state forever

krishna_shedbalkar · April 2, 2023, 6:48am

I'm deploying a sharded cluster using the following config:

apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: mongo-shard
spec:
  shardCount: 2
  mongodsPerShardCount: 3
  mongosCount: 1
  configServerCount: 3
  version: "4.2.2-ent"
  opsManager:
    configMapRef:
      name: mongodb-project
  credentials: mongo-api-keys
  type: ShardedCluster
  persistent: true

But the status of my deployment is:
$ kubectl get pods -n mongodb

$ kubectl describe pod/mongo-shard-0-0 -n mongodb
Type Reason Age From Message

Warning Unhealthy 2m1s (x8190 over 10h) kubelet Readiness probe failed:

$ kubectl describe pod/mongo-shard-mongos-0 -n mongodb
Events:
Type Reason Age From Message

Warning Unhealthy 3m8s (x8180 over 10h) kubelet Readiness probe failed:

Someone please help

Aasawari · April 4, 2023, 4:22am

Hi @krishna_shedbalkar and welcome to the MongoDB Community forum!!

As mentioned in the Kubernetes documentation:

Sometimes, applications are temporarily unable to serve traffic. For example, an application might need to load large data or configuration files during startup, or depend on external services after startup. In such cases, you don’t want to kill the application, but you don’t want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.

The readiness probe failure could possibly be resolved by increasing the readiness probe timeout set for the pods in the deployment.yaml files.
Could you increase it to a higher value and see whether the pods come up and start?
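
To make clear which value is meant, here is a minimal Python sketch that reads the probe settings from a line of `kubectl describe pod` output. It assumes the line format shown in the pod description posted later in this thread; the function name `parse_probe` is hypothetical, and the field mapping to Pod spec names is my reading of that output, not something taken from the operator.

```python
import re


def parse_probe(line):
    """Parse a probe line from `kubectl describe pod` into a dict of settings."""
    kind, _, rest = line.strip().partition(":")
    # Collect key=value pairs such as delay=5s or #failure=4 (seconds suffix optional).
    fields = dict(re.findall(r"(#?\w+)=(\d+)s?", rest))
    return {
        "kind": kind,
        "initialDelaySeconds": int(fields["delay"]),
        "timeoutSeconds": int(fields["timeout"]),
        "periodSeconds": int(fields["period"]),
        "successThreshold": int(fields["#success"]),
        "failureThreshold": int(fields["#failure"]),
    }


# Sample line copied from the pod description later in this thread.
probe = parse_probe(
    "Readiness: exec [/opt/scripts/readinessprobe] "
    "delay=5s timeout=1s period=5s #success=1 #failure=4"
)
print("timeoutSeconds:", probe["timeoutSeconds"])
# Approximate time of consecutive failures before the container is marked unready.
print("seconds until unready:", probe["periodSeconds"] * probe["failureThreshold"])
```

A longer timeout is only likely to help if the probe fails because it runs out of time. If the probe fails because the database process never starts, the pod logs are the better place to look.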

However, to understand the issue in detail, could you help me with some information regarding the deployment:

Do you see any error log messages in the pod logs which would be helpful in identifying the issue?
Are you following any script or documentation for the deployment? If yes, could you share the link or documentation?
Has this issue started abruptly or was there some change in the deployment or service files?

Lastly, I would recommend checking the resource utilisation of the pods to ensure that they have enough resources allocated. If the pods are running out of memory or CPU, they may not be able to respond to readiness probes.

Regards
Aasawari

Piyush_Harshwal · June 13, 2023, 7:44am

Hi Aasawari,

We are also facing the same readiness probe problem.

Below is the YAML file used for the cluster deployment:

apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: nwcc-sharded-cluster
spec:
  shardCount: 1
  mongodsPerShardCount: 3
  mongosCount: 2
  configServerCount: 3
  version: "5.0.7-ent"
  opsManager:
    configMapRef:
      name: nwcc-lab
  credentials: nwcc-organization-secret
  type: ShardedCluster
  persistent: true

  mongosPodSpec:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/mongodb
            operator: In
            values:
            - nwcc
    podTemplate:
      spec:
        containers:
        - name: mongodb-enterprise-database
          resources:
            limits:
              memory: 16G
    podAntiAffinityTopologyKey: "kubernetes.io/hostname"

  shardPodSpec:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/mongodb
            operator: In
            values:
            - nwcc
    podTemplate:
      spec:
        containers:
        - name: mongodb-enterprise-database
          resources:
            limits:
              memory: 64G
    podAntiAffinityTopologyKey: "kubernetes.io/hostname"

  configSrvPodSpec:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/mongodb
            operator: In
            values:
            - nwcc
    podTemplate:
      spec:
        containers:
        - name: mongodb-enterprise-database
          resources:
            limits:
              memory: 16G
    podAntiAffinityTopologyKey: "kubernetes.io/hostname"

==================================================

Below is the pod description:

[root@nwcc1-servicenode1 mongodb-procedure-manifests]# oc describe pod nwcc-sharded-cluster-config-0
Name: nwcc-sharded-cluster-config-0
Namespace: mongodb
Priority: 0
Node: nwcc1-worker5.nwcc-wifi-analytics.wifi-analytics.singnet.com.sg/172.16.10.9
Start Time: Mon, 12 Jun 2023 18:58:45 +0800
Labels: app=nwcc-sharded-cluster-cs
controller=mongodb-enterprise-operator
controller-revision-hash=nwcc-sharded-cluster-config-6ccffbc774
pod-anti-affinity=nwcc-sharded-cluster-config
statefulset.kubernetes.io/pod-name=nwcc-sharded-cluster-config-0
Annotations: k8s.v1.cni.cncf.io/network-status:
[{
“name”: “openshift-sdn”,
“interface”: “eth0”,
“ips”: [
“10.131.1.191”
],
“default”: true,
“dns”: {}
}]
k8s.v1.cni.cncf.io/networks-status:
[{
“name”: “openshift-sdn”,
“interface”: “eth0”,
“ips”: [
“10.131.1.191”
],
“default”: true,
“dns”: {}
}]
openshift.io/scc: restricted-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status: Running
IP: 10.131.1.191
IPs:
IP: 10.131.1.191
Controlled By: StatefulSet/nwcc-sharded-cluster-config
Init Containers:
mongodb-enterprise-init-database:
Container ID: cri-o://56abcca8beaa4dcb09a1ac266af32b06379bb15f651d4ceecba5b325fa1f27f6
Image: registry.nwcc-wifi-analytics.wifi-analytics.singnet.com.sg:5000/mongodb/mongodb-enterprise-init-database-ubi:1.0.15
Image ID: registry.nwcc-wifi-analytics.wifi-analytics.singnet.com.sg:5000/mongodb/mongodb-enterprise-init-database-ubi@sha256:50ad43c3172b335148ff9174426ccb20a8452e3bc8347ea419a3015fa65f390a
Port:
Host Port:
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 12 Jun 2023 18:58:58 +0800
Finished: Mon, 12 Jun 2023 18:58:58 +0800
Ready: True
Restart Count: 0
Environment:
Mounts:
/opt/scripts from database-scripts (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lx8q9 (ro)
Containers:
mongodb-enterprise-database:
Container ID: cri-o://4b647be008a6242df1cf61c66b9f4ff208081f6d041c78c530a0df3c5520e7d7
Image: registry.nwcc-wifi-analytics.wifi-analytics.singnet.com.sg:5000/mongodb/mongodb-enterprise-database-ubi:2.0.2
Image ID: registry.nwcc-wifi-analytics.wifi-analytics.singnet.com.sg:5000/mongodb/mongodb-enterprise-database-ubi@sha256:10eda5c39dda93a2d00ebbfd28e2c3cc5ea5e92337bd8ad539795affaad16d82
Port: 27017/TCP
Host Port: 0/TCP
Command:
/opt/scripts/agent-launcher.sh
State: Running
Started: Mon, 12 Jun 2023 18:58:59 +0800
Ready: False
Restart Count: 0
Limits:
memory: 16G
Requests:
memory: 16G
Liveness: exec [/opt/scripts/probe.sh] delay=10s timeout=30s period=30s #success=1 #failure=6
Readiness: exec [/opt/scripts/readinessprobe] delay=5s timeout=1s period=5s #success=1 #failure=4
Startup: exec [/opt/scripts/probe.sh] delay=1s timeout=30s period=20s #success=1 #failure=10
Environment:
AGENT_FLAGS: -logFile,/var/log/mongodb-mms-automation/automation-agent.log,
BASE_URL: http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080
GROUP_ID: 6483123e672c203f9e99fa42
LOG_LEVEL:
MULTI_CLUSTER_MODE: false
SSL_REQUIRE_VALID_MMS_CERTIFICATES: true
USER_LOGIN: yidelwie
Mounts:
/data from data (rw,path=“data”)
/journal from data (rw,path=“journal”)
/mongodb-automation from agent (rw,path=“mongodb-automation”)
/mongodb-automation/agent-api-key from agent-api-key (rw)
/opt/scripts from database-scripts (ro)
/tmp from agent (rw,path=“tmp”)
/var/lib/mongodb-mms-automation from agent (rw,path=“mongodb-mms-automation”)
/var/log/mongodb-mms-automation from data (rw,path=“logs”)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lx8q9 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-nwcc-sharded-cluster-config-0
ReadOnly: false
agent:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
agent-api-key:
Type: Secret (a volume populated by a Secret)
SecretName: 6483123e672c203f9e99fa42-group-secret
Optional: false
database-scripts:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
kube-api-access-lx8q9:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
ConfigMapName: openshift-service-ca.crt
ConfigMapOptional:
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message

Warning Unhealthy 4m13s (x15507 over 20h) kubelet Readiness probe failed:

The following errors appear in the pod logs, which are posted in the next comment.

Please suggest a workaround.

Piyush_Harshwal · June 13, 2023, 7:45am

{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:22.279+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:22.279] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:22.380+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:22.380] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:22.381+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:22.380] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:22.380] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:23.301+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:23.301] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:23.402+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:23.402] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:23.402+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:23.402] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:23.402] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:24.292+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:24.292] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:24.394+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:24.394] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:24.394+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:24.394] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:24.394] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.293+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:25.293] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.395+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:25.395] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.395+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:25.395] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:25.395] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.648+0000] [.info] [src/director/director.go:computePlan:278] [06:38:25.648] … process has a plan : Download,DownloadMongosh,Start,WaitAllRsMembersUp,RsInit,WaitFeatureCompatibilityVersionCorrect”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.649+0000] [.info] [src/director/director.go:tracef:806] [06:38:25.649] Running step: ‘Download’ of move ‘Download’”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.649+0000] [.info] [src/director/director.go:tracef:806] [06:38:25.649] because “}
{“logType”:“automation-agent-verbose”,“contents”:”[‘desiredState.FullVersion’ is not a member of ‘currentState.VersionsOnDisk’ (‘desiredState.FullVersion’={"trueName":"5.0.7-ent","gitVersion":"b977129dc70eed766cbee7e412d901ee213acbda","modules":["enterprise"],"major":5,"minor":0,"patch":7}, ‘currentState.VersionsOnDisk’=)]”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.649+0000] [.info] [src/action/helpers.go:touchMarkerFile:793] [06:38:25.649] Marker file /var/lib/mongodb-mms-automation created”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.649+0000] [.info] [src/action/downloadmongo.go:downloadUngzipUntarMongoDb:294] [06:38:25.649] Starting to download and extract http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz into /var/lib/mongodb-mms-automation”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.653+0000] [.error] [src/util/download.go:downloadCustomClient:272] [06:38:25.653] Got 404 status code for url=http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz.”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.653+0000] [.error] [src/action/downloadmongo.go:downloadUngzipUntarMongoDb:313] [06:38:25.653] Error downloading url=http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz to /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-5.0.7-ent : [06:38:25.653] Got 404 status code for url=http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz.”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.653+0000] [.info] [src/action/downloadmongo.go:downloadMongoBinary:241] [06:38:25.653] Error downloading http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz : sleeping for 30 seconds and trying the download again.”}
{“logType”:“automation-agent-verbose”,“contents”:“err = [06:38:25.653] Error downloading url=http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz to /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-5.0.7-ent : [06:38:25.653] Got 404 status code for url=http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz.”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.953+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:25.953] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:25.991+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:25.991] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:26.293+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:26.293] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:26.394+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:26.394] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:26.394+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:26.394] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:26.394] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:27.026+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:27.026] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:27.034+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:27.034] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:27.293+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:27.293] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:27.395+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:27.395] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:27.395+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:27.395] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:27.395] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:28.045+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:28.045] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:28.052+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:28.052] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:28.294+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:28.294] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:28.396+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:28.396] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:28.396+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:28.396] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:28.396] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:29.054+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:29.054] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:29.061+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:29.061] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:29.295+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:29.295] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:29.396+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:29.396] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:29.396+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:29.396] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:29.396] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:30.082+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:30.082] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:30.091+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:30.091] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:30.295+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:30.295] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:30.396+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:30.396] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:30.396+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:30.396] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:30.396] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:31.123+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:31.122] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:31.160+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:31.160] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:31.296+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:31.296] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:31.397+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:31.397] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:31.397+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:31.397] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:31.397] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:32.163+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:32.163] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:32.171+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [06:38:32.171] clusterConfig unchanged”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:32.297+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:32.297] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:32.398+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:32.398] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:32.398+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] [06:38:32.398] Failed to fetch replStatus for nwcc-sharded-cluster-config-0 : [06:38:32.398] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down”}
{“logType”:“automation-agent-verbose”,“contents”:“[2023-06-13T06:38:33.217+0000] [.info] [src/config/config.go:ReadClusterConfig:440] [06:38:33.217] Retrieving cluster config from http://nwcc-opsmanager-svc.mongodb.svc.cluster.local:8080/agents/api/automation/conf/v1/6483123e672c203f9e99fa42?av=12.0.14.7630&aos=linux&aa=x86_64&ab=64&ad=rhel83&ah=nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local&ahs=nwcc-sharded-cluster-config-0&at=1686567541084…”}
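
Most of the excerpt above repeats the same "is down" message, which makes the less frequent errors easy to miss. As a rough aid, here is a minimal Python sketch that counts distinct error and warning messages while ignoring timestamps. The function name `summarize` and the regular expression are assumptions based only on the line format visible in this excerpt, not on any documented agent log schema. The sample lines are copied from the excerpt, with the forum's curly quotes replaced by straight ones.

```python
import re
from collections import Counter

# Matches the bracketed prefix seen inside "contents" in the excerpt:
# [timestamp] [.level] [source:function:line] [time] message
LINE_RE = re.compile(
    r"\[\d{4}-\d{2}-\d{2}T[^\]]+\] \[\.(?P<level>\w+)\] "
    r"\[(?P<src>[^\]]+)\] \[[\d:.]+\] (?P<msg>.*)"
)


def summarize(lines, levels=("error", "warn")):
    """Count distinct messages at the given levels, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m.group("level") not in levels:
            continue
        # Drop the closing JSON quote and brace (straight or curly quote).
        msg = m.group("msg").rstrip().rstrip('"”}')
        # Remove embedded [hh:mm:ss.mmm] stamps so repeated messages collapse.
        msg = re.sub(r"\[\d{2}:\d{2}:\d{2}\.\d{3}\] ", "", msg)
        counts[(m.group("level"), msg)] += 1
    return counts


sample = [
    '{"logType":"automation-agent-verbose","contents":"[2023-06-13T06:38:22.279+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:22.279] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down"}',
    '{"logType":"automation-agent-verbose","contents":"[2023-06-13T06:38:22.380+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1105] [06:38:22.380] Server at nwcc-sharded-cluster-config-0.nwcc-sharded-cluster-cs.mongodb.svc.cluster.local:27017 (local=false) is down"}',
    '{"logType":"automation-agent-verbose","contents":"[2023-06-13T06:38:25.653+0000] [.error] [src/util/download.go:downloadCustomClient:272] [06:38:25.653] Got 404 status code for url=http://172.16.10.28:8080/mongodb/linux/mongodb-linux-x86_64-enterprise-rhel80-5.0.7.tgz."}',
]

for (level, msg), n in summarize(sample).most_common():
    print(f"{n:>4}  {level:<5}  {msg}")
```

One reading of the excerpt is that the `Download` step of the agent's plan keeps getting a 404 for the 5.0.7 archive, so the later `Start` step is never reached and the server is reported as down. That is only an inference from these lines; whether the archive is actually available at that URL would need to be checked on the Ops Manager side.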

Aasawari · June 20, 2023, 7:09am

Hi @Piyush_Harshwal

In general it is preferable to start a new discussion to keep the details of different environments/questions separate and improve visibility of new discussions. That will also allow you to mark your topic as “Solved” when you resolve any outstanding questions.

Mentioning the url of an existing discussion on the forum will automatically create links between related discussions for other users to follow.

Please have a look at How to write a good post/question for some ideas on best practices.

I also recommend reading Getting Started with the MongoDB Community: README.1ST for some tips to help improve your community outcomes.

Regards,
Aasawari