Hi everyone,
I have a 3-member PSS replica set on Kubernetes that experiences memory growth over several days (typically 2-5). When the primary approaches the k8s statefulset memory limit, it is either OOMKilled or becomes unresponsive. The database is typically accessed by two backend NodeJS processes.
rs0:PRIMARY> db.stats(1024*1024)
{
    "db" : "XXXX",
    "collections" : 17,
    "views" : 0,
    "objects" : 157290,
    "avgObjSize" : 636.4013287558014,
    "dataSize" : 95.46238422393799,
    "storageSize" : 44.02734375,
    "numExtents" : 0,
    "indexes" : 35,
    "indexSize" : 7.859375,
    "scaleFactor" : 1048576,
    "fsUsedSize" : 1546.35546875,
    "fsTotalSize" : 10015.26953125,
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1640188489, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1640188489, 1)
}
rs0:PRIMARY> db.serverStatus().tcmalloc.tcmalloc.formattedString
------------------------------------------------
MALLOC: 224431808 ( 214.0 MiB) Bytes in use by application
MALLOC: + 8695808 ( 8.3 MiB) Bytes in page heap freelist
MALLOC: + 5014824 ( 4.8 MiB) Bytes in central cache freelist
MALLOC: + 2499968 ( 2.4 MiB) Bytes in transfer cache freelist
MALLOC: + 39351960 ( 37.5 MiB) Bytes in thread cache freelists
MALLOC: + 5636096 ( 5.4 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 285630464 ( 272.4 MiB) Actual memory used (physical + swap)
MALLOC: + 3829760 ( 3.7 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 289460224 ( 276.1 MiB) Virtual address space used
MALLOC:
MALLOC: 12497 Spans in use
MALLOC: 107 Thread heaps in use
MALLOC: 4096 Tcmalloc page size
------------------------------------------------
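As a quick sanity check, the component lines above do add up to the reported total, so the snapshot is internally consistent:

```python
# Sum the tcmalloc component lines (MiB) from the stats above.
components = {
    "application":         214.0,
    "page heap freelist":    8.3,
    "central cache":         4.8,
    "transfer cache":        2.4,
    "thread cache":         37.5,
    "malloc metadata":       5.4,
}
physical = sum(components.values())
print(round(physical, 1))  # 272.4, matching "Actual memory used"
```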
The k8s cluster has 2 nodes, each with 1.7G total allocatable memory.
As you can see, the complete working set easily fits within the WiredTiger cache:
rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]
268435456
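For context, my understanding from the MongoDB docs is that 268435456 (256 MiB) is simply the default WiredTiger cache size: the larger of 50% of (RAM - 1 GiB) or 256 MiB, so with a small container the 256 MiB floor applies. A quick sketch:

```python
# Default WiredTiger internal cache size per the MongoDB docs:
# max(50% of (RAM - 1 GiB), 256 MiB).
GiB = 1024 ** 3
MiB = 1024 ** 2

def default_wt_cache_bytes(ram_bytes):
    return max(int(0.5 * (ram_bytes - GiB)), 256 * MiB)

limit = 384 * MiB                      # the 384M container limit
print(default_wt_cache_bytes(limit))   # 268435456, i.e. the 256 MiB floor
```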
rs0:PRIMARY> db.serverStatus().connections
{
"current" : 58,
"available" : 838802,
"totalCreated" : 51153,
"active" : 2
}
The k8s mongodb statefulset memory spec:
resources:
  limits:
    memory: 384M
  requests:
    cpu: 10m
    memory: 128M
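My own back-of-envelope arithmetic (the up-to-1 MiB per connection figure is from the MongoDB docs; the thread cache figure is from the tcmalloc stats above) suggests this limit leaves almost no headroom beyond the WiredTiger cache:

```python
MiB = 1024 ** 2

wt_cache        = 256 * MiB   # "maximum bytes configured" above
connections     = 58          # current connections; docs say up to ~1 MiB each
conn_overhead   = connections * 1 * MiB
tcmalloc_caches = 38 * MiB    # thread cache freelists from the tcmalloc stats
limit           = 384 * MiB

budget = wt_cache + conn_overhead + tcmalloc_caches
print(budget // MiB, limit // MiB)  # 352 vs 384 -- very little headroom
```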
The nightly scheduled mongodump params are:
--readPreference primary --forceTableScan --numParallelCollections=2
Versions: MongoDB 4.2.17, mongodb-tools 100.5.1
I have searched extensively for minimum memory recommendations for such a tiny replica set configuration, but cannot find any. I have also tried marginal increases to the memory limit beyond 384M, but that only seems to delay the inevitable OOMKill.
Hints or help are greatly appreciated.