WiredTiger cache status (Working Set)

Hi Community,

Can someone help analyze the WiredTiger cache stats below and suggest whether the WiredTiger cache size needs to be increased?

{
        "application threads page read from disk to cache count" : 327399071,
        "application threads page read from disk to cache time (usecs)" : 30961681878,
        "application threads page write from cache to disk count" : 3957777,
        "application threads page write from cache to disk time (usecs)" : 879859200,
        "bytes belonging to page images in the cache" : 5962097784,
        "bytes belonging to the cache overflow table in the cache" : 182,
        "bytes currently in the cache" : 6297119974,
        "bytes not belonging to page images in the cache" : 335022189,
        "bytes read into cache" : NumberLong("10697633891413"),
        "bytes written from cache" : 96058542714,
        "cache overflow cursor application thread wait time (usecs)" : 0,
        "cache overflow cursor internal thread wait time (usecs)" : 0,
        "cache overflow score" : 0,
        "cache overflow table entries" : 0,
        "cache overflow table insert calls" : 0,
        "cache overflow table max on-disk size" : 0,
        "cache overflow table on-disk size" : 0,
        "cache overflow table remove calls" : 0,
        "checkpoint blocked page eviction" : 55,
        "eviction calls to get a page" : 336695779,
        "eviction calls to get a page found queue empty" : 9160503,
        "eviction calls to get a page found queue empty after locking" : 4009923,
        "eviction currently operating in aggressive mode" : 0,
        "eviction empty score" : 0,
        "eviction passes of a file" : 1666156000,
        "eviction server candidate queue empty when topping up" : 1468129,
        "eviction server candidate queue not empty when topping up" : 2532493,
        "eviction server evicting pages" : 0,
        "eviction server slept, because we did not make progress with eviction" : 14927017,
        "eviction server unable to reach eviction goal" : 0,
        "eviction state" : 64,
        "eviction walk target pages histogram - 0-9" : 1660912885,
        "eviction walk target pages histogram - 10-31" : 1167311,
        "eviction walk target pages histogram - 128 and higher" : 0,
        "eviction walk target pages histogram - 32-63" : 606182,
        "eviction walk target pages histogram - 64-128" : 3469622,
        "eviction walks abandoned" : 4168376,
        "eviction walks gave up because they restarted their walk twice" : 1645756428,
        "eviction walks gave up because they saw too many pages and found no candidates" : 8456934,
        "eviction walks gave up because they saw too many pages and found too few candidates" : 17760,
        "eviction walks reached end of tree" : 3300161895,
        "eviction walks started from root of tree" : 1657176952,
        "eviction walks started from saved location in tree" : 8979048,
        "eviction worker thread active" : 4,
        "eviction worker thread created" : 0,
        "eviction worker thread evicting pages" : 323658140,
        "eviction worker thread removed" : 0,
        "eviction worker thread stable number" : 0,
        "failed eviction of pages that exceeded the in-memory maximum count" : 6607,
        "failed eviction of pages that exceeded the in-memory maximum time (usecs)" : 5902,
        "files with active eviction walks" : 0,
        "files with new eviction walks started" : 1654405467,
        "force re-tuning of eviction workers once in a while" : 0,
        "hazard pointer blocked page eviction" : 659884,
        "hazard pointer check calls" : 327250926,
        "hazard pointer check entries walked" : 9220828506,
        "hazard pointer maximum array length" : 42,
        "in-memory page passed criteria to be split" : 19000,
        "in-memory page splits" : 9509,
        "internal pages evicted" : 1055038,
        "internal pages split during eviction" : 114,
        "leaf pages split during eviction" : 43773,
        "maximum bytes configured" : 7874805760,
        "maximum page size at eviction" : 26877,
        "modified pages evicted" : 1021176,
        "modified pages evicted by application threads" : 0,
        "operations timed out waiting for space in cache" : 0,
        "overflow pages read into cache" : 0,
        "page split during eviction deepened the tree" : 0,
        "page written requiring cache overflow records" : 0,
        "pages currently held in the cache" : 216173,
        "pages evicted because they exceeded the in-memory maximum count" : 22800,
        "pages evicted because they exceeded the in-memory maximum time (usecs)" : 6083073,
        "pages evicted because they had chains of deleted items count" : 3886695,
        "pages evicted because they had chains of deleted items time (usecs)" : 3303170,
        "pages evicted by application threads" : 60,
        "pages queued for eviction" : 398428510,
        "pages queued for urgent eviction" : 11843046,
        "pages queued for urgent eviction during walk" : 19765,
        "pages read into cache" : 327416121,
        "pages read into cache after truncate" : 5061,
        "pages read into cache after truncate in prepare state" : 0,
        "pages read into cache requiring cache overflow entries" : 0,
        "pages read into cache requiring cache overflow for checkpoint" : 0,
        "pages read into cache skipping older cache overflow entries" : 0,
        "pages read into cache with skipped cache overflow entries needed later" : 0,
        "pages read into cache with skipped cache overflow entries needed later by checkpoint" : 0,
        "pages requested from the cache" : 20182969265,
        "pages seen by eviction walk" : 2277363640,
        "pages selected for eviction unable to be evicted" : 686461,
        "pages walked for eviction" : 80676236195,
        "pages written from cache" : 3998879,
        "pages written requiring in-memory restoration" : 20088,
        "percentage overhead" : 8,
        "tracked bytes belonging to internal pages in the cache" : 163573705,
        "tracked bytes belonging to leaf pages in the cache" : 6133546269,
        "tracked dirty bytes in the cache" : 17379297,
        "tracked dirty pages in the cache" : 52,
        "unmodified pages evicted" : 324927301
}

I have checked the parameters below. Do they indicate that the WiredTiger cache needs to be increased? Are there any other parameters I should review before deciding to increase the WiredTiger cache?
"maximum bytes configured" : 7874805760,
"bytes currently in the cache" : 6297119974,
"tracked dirty bytes in the cache" : 17379297,
"pages written from cache" : 3998879,
"pages read into cache" : 327416121,

Regards
SS

Hi @satvant_sandhu, welcome to the community!

Generally the WiredTiger cache is set to roughly 50% of RAM by default (more precisely, the larger of 50% of (RAM - 1 GB) and 256 MB). This default was chosen because it works well in most cases and leaves about half of the RAM for the OS (other processes, filesystem cache, etc.). In short, there’s typically no need to change this value unless your needs are very specific, and then only after deep troubleshooting by a MongoDB engineer.
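As a rough illustration of the default sizing rule (the larger of 50% of (RAM - 1 GB) and 256 MB), here is a minimal Python sketch. The function names are made up for this example, and it assumes binary units (1 GB = 1024^3 bytes). The second helper only inverts the 50% rule, so its result is meaningful only if the cache size was left at its default and is above the 256 MB floor, which is an assumption about this deployment and not something the posted stats confirm.

```python
GIB = 1024 ** 3  # assumption: "GB" in the sizing rule is treated as binary units
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(ram_bytes: int) -> int:
    """Default cache size: the larger of 50% of (RAM - 1 GB) and 256 MB."""
    return max((ram_bytes - GIB) // 2, 256 * MIB)


def implied_ram_bytes(cache_bytes: int) -> int:
    """Invert the 50% rule. Only valid if the default is in use and above the floor."""
    return 2 * cache_bytes + GIB


# A machine with 16 GiB of RAM gets a 7.5 GiB cache by default.
print(default_wiredtiger_cache_bytes(16 * GIB) / GIB)

# "maximum bytes configured" from the posted stats, mapped back to total RAM.
print(round(implied_ram_bytes(7874805760) / GIB, 2))
```

If the default is in use, the posted "maximum bytes configured" value would correspond to a host reporting a little under 16 GiB of RAM.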

It’s generally more beneficial to increase the amount of RAM in the machine, instead of changing the WiredTiger cache size.

Having said that, what issue are you seeing? Are you seeing slow queries, extremely busy disk, or similar performance issues?

Best regards
Kevin

Hi Kevinadi,

Thanks for the update.

We have customer complaints about slow queries, and we want to make sure our current working set fits in the WiredTiger cache. We can increase the RAM size, but we need help understanding the WiredTiger cache stats. Which stats should we look at to conclude whether the current WiredTiger cache size is sufficient to hold the working set?

Could you please analyze the WiredTiger cache stats given in the question and recommend whether we need to increase the RAM? Also, what are the main parameters in the cache stats that we should look at, and what are the recommended threshold values?

Thanks in advance.

Regards
Satvant Singh

Hi @satvant_sandhu

I’m afraid analysing WiredTiger performance and other deep performance-related issues is not as simple as checking some statistics. MongoDB records thousands of statistics each second in full-time diagnostic data capture (FTDC) because all of them have to be examined holistically in relation to one another. There is no single metric that can tell you what’s happening in the system. Note that these metrics are one of the important tools for troubleshooting a deployment, but they are not the only tool. In a typical investigation, FTDC data, disk data, OS data, and all other relevant information are collected and examined thoroughly, so doing that on a public forum is extra challenging :)

Having said that, we can sort of see if a deployment is overworked by checking some things:

  1. The output of mongostat, specifically the dirty and used columns. If dirty is consistently above 5% and used is consistently above 80%, that’s one sign that the working set doesn’t fit in RAM (or that the disk is too slow, which is another possibility).
  2. The output of iostat. This varies from system to system, but if you see long waits for IO, or if the IO is heavily utilized, that’s another sign that the database is hitting disk more often, and thus more RAM may be needed.
  3. If you see a lot of slow queries in the MongoDB logs, and all those queries are correctly optimized, then perhaps the hardware is not enough to service the workload.

Ideally you want to see low dirty in mongostat (high used is OK and healthy, and you want this to be about 80%, though this also depends on the use case), low delay in IO operations, and no slow queries in the logs.
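For reference, mongostat derives used and dirty from the same counters that appear in the posted stats: "bytes currently in the cache" and "tracked dirty bytes in the cache", each divided by "maximum bytes configured". A minimal Python sketch of that arithmetic follows; the function name is made up for this example, and the input is just the three values copied from the question. Keep in mind this is a single point-in-time snapshot, whereas the 80% and 5% guidelines above are about values that stay high consistently.

```python
from typing import Dict, Tuple


def cache_percentages(cache: Dict[str, int]) -> Tuple[float, float]:
    """Return (used %, dirty %) of the configured WiredTiger cache size."""
    max_bytes = cache["maximum bytes configured"]
    used = 100.0 * cache["bytes currently in the cache"] / max_bytes
    dirty = 100.0 * cache["tracked dirty bytes in the cache"] / max_bytes
    return used, dirty


# Values copied from the stats posted in the question.
posted = {
    "maximum bytes configured": 7874805760,
    "bytes currently in the cache": 6297119974,
    "tracked dirty bytes in the cache": 17379297,
}

used, dirty = cache_percentages(posted)
print(f"used={used:.2f}% dirty={dirty:.2f}%")
```

For this snapshot, used comes out just under 80% and dirty well under 1%, but only a trend over time (for example, from repeated mongostat samples) can show whether those values hold under load.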

Best regards
Kevin
