MongoDB container exceeds k8s RAM limit

Hi!
I have a problem: my MongoDB container always exceeds my k8s memory limit after a while.
For example, after an hour under load it can reach 4-6 GiB of RAM.

My k8s yaml contains:

        resources:
          requests:
            memory: "1Gi"
          limits:
            memory: "3Gi

Also, I point MongoDB at a config file via the container command:

["mongod","--config","/etc/mongod.conf"]

Which has:

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
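
(I believe the same limit can equivalently be passed on the mongod command line, without a config file:

["mongod","--wiredTigerCacheSizeGB","1"]

but I stuck with the config file.)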

On startup I even get a log line showing that the k8s limits are applied: “availableMemSizeMB”:3072 and “cacheSizeGB”:1.0…

{"t":{"$date":"2020-11-20T10:20:54.897+00:00"},"s":"W",  "c":"CONTROL",  "id":20720,   "ctx":"initandlisten","msg":"Available memory is less than system memory","attr":{"availableMemSizeMB":3072,"systemMemSizeMB":15847}}
{"t":{"$date":"2020-11-20T10:20:54.897+00:00"},"s":"I",  "c":"CONTROL",  "id":23403,   "ctx":"initandlisten","msg":"Build Info","attr":{"buildInfo":{"version":"4.4.0","gitVersion":"563487e100c4215e2dce98d0af2a6a5a2d67c5cf","openSSLVersion":"OpenSSL 1.1.1  11 Sep 2018","modules":[],"allocator":"tcmalloc","environment":{"distmod":"ubuntu1804","distarch":"x86_64","target_arch":"x86_64"}}}}
{"t":{"$date":"2020-11-20T10:20:54.897+00:00"},"s":"I",  "c":"CONTROL",  "id":51765,   "ctx":"initandlisten","msg":"Operating System","attr":{"os":{"name":"Ubuntu","version":"18.04"}}}
{"t":{"$date":"2020-11-20T10:20:54.897+00:00"},"s":"I",  "c":"CONTROL",  "id":21951,   "ctx":"initandlisten","msg":"Options set by command line","attr":{"options":{"config":"/etc/mongod.conf","net":{"bindIp":"*"},"storage":{"wiredTiger":{"engineConfig":{"cacheSizeGB":1.0}}}}}}

On Grafana I can see the following RAM usage:

Extra info that I think might be useful:
MongoDB version: 4.4.0

db.stats()
{
        "db" : "YYY",
        "collections" : 2,
        "views" : 1,
        "objects" : 370687,
        "avgObjSize" : 31445.03214302093,
        "dataSize" : 11656264630,
        "storageSize" : 2233819136,
        "indexes" : 33,
        "indexSize" : 929951744,
        "totalSize" : 3163770880,
        "scaleFactor" : 1,
        "fsUsedSize" : 10017492992,
        "fsTotalSize" : 161033994240,
        "ok" : 1
}
db.serverStatus().wiredTiger.cache
{
        "application threads page read from disk to cache count" : 3321,
        "application threads page read from disk to cache time (usecs)" : 9010883,
        "application threads page write from cache to disk count" : 958,
        "application threads page write from cache to disk time (usecs)" : 363838,
        "bytes allocated for updates" : 7305804,
        "bytes belonging to page images in the cache" : 170520520,
        "bytes belonging to the history store table in the cache" : 554,
        "bytes currently in the cache" : 193956579,
        "bytes dirty in the cache cumulative" : 173682490,
        "bytes not belonging to page images in the cache" : 23436058,
        "bytes read into cache" : 44543321,
        "bytes written from cache" : 206773479,
        "cache overflow score" : 0,
        "checkpoint blocked page eviction" : 0,
        "eviction calls to get a page" : 7724,
        "eviction calls to get a page found queue empty" : 87,
        "eviction calls to get a page found queue empty after locking" : 189,
        "eviction currently operating in aggressive mode" : 0,
        "eviction empty score" : 0,
        "eviction passes of a file" : 3626,
        "eviction server candidate queue empty when topping up" : 23,
        "eviction server candidate queue not empty when topping up" : 421,
        "eviction server evicting pages" : 0,
        "eviction server slept, because we did not make progress with eviction" : 1870,
        "eviction server unable to reach eviction goal" : 0,
        "eviction server waiting for a leaf page" : 0,
        "eviction state" : 64,
        "eviction walk target pages histogram - 0-9" : 1940,
        "eviction walk target pages histogram - 10-31" : 1310,
        "eviction walk target pages histogram - 128 and higher" : 0,
        "eviction walk target pages histogram - 32-63" : 292,
        "eviction walk target pages histogram - 64-128" : 84,
        "eviction walk target strategy both clean and dirty pages" : 0,
        "eviction walk target strategy only clean pages" : 0,
        "eviction walk target strategy only dirty pages" : 3626,
        "eviction walks abandoned" : 316,
        "eviction walks gave up because they restarted their walk twice" : 1084,
        "eviction walks gave up because they saw too many pages and found no candidates" : 820,
        "eviction walks gave up because they saw too many pages and found too few candidates" : 41,
        "eviction walks reached end of tree" : 3153,
        "eviction walks started from root of tree" : 1955,
        "eviction walks started from saved location in tree" : 1671,
        "eviction worker thread active" : 4,
        "eviction worker thread created" : 0,
        "eviction worker thread evicting pages" : 7448,
        "eviction worker thread removed" : 0,
        "eviction worker thread stable number" : 0,
        "files with active eviction walks" : 0,
        "files with new eviction walks started" : 2069,
        "force re-tuning of eviction workers once in a while" : 0,
        "forced eviction - history store pages failed to evict while session has history store cursor open" : 0,
        "forced eviction - history store pages selected while session has history store cursor open" : 0,
        "forced eviction - history store pages successfully evicted while session has history store cursor open" : 0,
        "forced eviction - pages evicted that were clean count" : 0,
        "forced eviction - pages evicted that were clean time (usecs)" : 0,
        "forced eviction - pages evicted that were dirty count" : 0,
        "forced eviction - pages evicted that were dirty time (usecs)" : 0,
        "forced eviction - pages selected because of too many deleted items count" : 0,
        "forced eviction - pages selected count" : 12,
        "forced eviction - pages selected unable to be evicted count" : 0,
        "forced eviction - pages selected unable to be evicted time" : 0,
        "forced eviction - session returned rollback error while force evicting due to being oldest" : 0,
        "hazard pointer blocked page eviction" : 7,
        "hazard pointer check calls" : 7460,
        "hazard pointer check entries walked" : 40419,
        "hazard pointer maximum array length" : 2,
        "history store key truncation calls that returned restart" : 0,
        "history store key truncation due to mixed timestamps" : 0,
        "history store key truncation due to the key being removed from the data page" : 9,
        "history store score" : 0,
        "history store table insert calls" : 0,
        "history store table insert calls that returned restart" : 0,
        "history store table max on-disk size" : 0,
        "history store table on-disk size" : 4096,
        "history store table out-of-order resolved updates that lose their durable timestamp" : 0,
        "history store table out-of-order updates that were fixed up by moving existing records" : 0,
        "history store table out-of-order updates that were fixed up during insertion" : 0,
        "history store table reads" : 0,
        "history store table reads missed" : 0,
        "history store table reads requiring squashed modifies" : 0,
        "history store table remove calls due to key truncation" : 0,
        "history store table writes requiring squashed modifies" : 0,
        "in-memory page passed criteria to be split" : 24,
        "in-memory page splits" : 12,
        "internal pages evicted" : 0,
        "internal pages queued for eviction" : 3,
        "internal pages seen by eviction walk" : 1153,
        "internal pages seen by eviction walk that are already queued" : 29,
        "internal pages split during eviction" : 0,
        "leaf pages split during eviction" : 255,
        "maximum bytes configured" : 1073741824,
        "maximum page size at eviction" : 0,
        "modified pages evicted" : 7440,
        "modified pages evicted by application threads" : 0,
        "operations timed out waiting for space in cache" : 0,
        "overflow pages read into cache" : 0,
        "page split during eviction deepened the tree" : 0,
        "page written requiring history store records" : 32,
        "pages currently held in the cache" : 5169,
        "pages evicted by application threads" : 0,
        "pages queued for eviction" : 36108,
        "pages queued for eviction post lru sorting" : 66063,
        "pages queued for urgent eviction" : 2,
        "pages queued for urgent eviction during walk" : 2,
        "pages read into cache" : 3368,
        "pages read into cache after truncate" : 1,
        "pages read into cache after truncate in prepare state" : 0,
        "pages requested from the cache" : 458124,
        "pages seen by eviction walk" : 408760,
        "pages seen by eviction walk that are already queued" : 68447,
        "pages selected for eviction unable to be evicted" : 7,
        "pages selected for eviction unable to be evicted as the parent page has overflow items" : 0,
        "pages selected for eviction unable to be evicted because of active children on an internal page" : 0,
        "pages selected for eviction unable to be evicted because of failure in reconciliation" : 0,
        "pages walked for eviction" : 911582,
        "pages written from cache" : 10154,
        "pages written requiring in-memory restoration" : 32,
        "percentage overhead" : 8,
        "tracked bytes belonging to internal pages in the cache" : 1770576,
        "tracked bytes belonging to leaf pages in the cache" : 192186003,
        "tracked dirty bytes in the cache" : 380,
        "tracked dirty pages in the cache" : 1,
        "unmodified pages evicted" : 0
}

Hello @Bartosz_Wlazlo!

Are you running more than one mongod on the same machine?

Hi @santimir.
No, it's a single mongod, one k8s pod (1 container).

Do you know how to solve this issue? I am running into the same thing.

@Bartosz_Wlazlo did you get this fixed? I hit the same issue and tried configuring wiredTigerCacheSizeGB, but it didn’t work.

Hi guys

This is a well-known issue and is not specific to MongoDB.

The OS inside the container does not see the memory settings of the Pod, but those of the Node. So with Pod request/limit = 1GB and Node = 16GB, the container, and therefore mongodb, assumes it has 16GB of memory available. And there you have the OOM.

You can check the mongod log and you will see it reports the available memory of the Node (systemMemSizeMB above). Interestingly, the cache setting is correct with respect to the Pod request/limit.
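
If you need the Pod limit to be visible inside the container, one option is the Kubernetes Downward API, which can inject the memory limit as an environment variable. A minimal sketch (the variable name is just illustrative):

        env:
          - name: MEMORY_LIMIT_BYTES        # illustrative name
            valueFrom:
              resourceFieldRef:
                resource: limits.memory     # resolves to this container's memory limit

An entrypoint script could then derive cacheSizeGB from that value instead of trusting what the OS reports.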