How to know if a document "lives" in the working set? (looking at totalDocsExamined)

Hi (-:

So I know about the concept of the “working set” in MongoDB. From what I’ve read, the working set is a pool of some “hot” documents that sit in RAM for quick access. MongoDB has an “internal statistics special sauce” to determine which docs will be there.

So now, if the user runs a query, and a document is in the working set, it can be fetched quickly from the RAM, and skip hitting the DISK.

So I’ve run this query and looked at totalDocsExamined:

From what I’ve learned, a totalDocsExamined of 1 means that 1 doc was fetched from DISK.

So my question is, does totalDocsExamined of 1 mean that this document was actually fetched from DISK and not from the working set?

So if I continue with the query above, and try different userIds, would I be able to sometime get lucky and “catch” a document which sits in the working set, and will return totalDocsExamined of 0?

I am asking this, because no matter how much I’ve tried, even with very “hot” documents that are being accessed by all users all the time, I always see totalDocsExamined of 1.

According to the documentation, totalDocsExamined has nothing to do with whether the document was read from disk, and nothing to do with whether it is in the working set.

More or less, a document is examined if the server cannot determine, using the index alone, whether that document should be returned.

For example, assuming you have the index {foo:1,bar:1}.

If your query is something like {foo:123,bar:456}, the server can decide from the index alone whether a document matches, since the index has both fields (it still has to fetch the document to return it, unless the projection is also satisfied by the index). However, if your query is {foo:123,bar:456,isActive:false}, all documents with foo:123 and bar:456 will be examined to see if isActive is true or false.

I am pretty sure that you will get a totalDocsExamined:0 and nReturned not 0 only for covered queries. That is when all queried fields and projected fields are part of the index.
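
To check this on your own data, here is a minimal sketch using pymongo. The helper names (`summarize_explain`, `explain_find`, `demo`), the connection string and the `test.items` namespace are my own assumptions for illustration, not anything from the original question. It runs the `explain` command with `executionStats` verbosity and reads the counters discussed above. Note that the projection has to exclude `_id`, because `_id` is returned by default and is not part of the {foo:1,bar:1} index.

```python
from typing import Any, Dict, Optional


def summarize_explain(explain_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the counters of interest from an explain document."""
    stats = explain_doc["executionStats"]
    returned = stats["nReturned"]
    docs = stats["totalDocsExamined"]
    return {
        "nReturned": returned,
        "totalKeysExamined": stats["totalKeysExamined"],
        "totalDocsExamined": docs,
        # Heuristic: results came back without any document being examined.
        "covered": returned > 0 and docs == 0,
    }


def explain_find(coll, query: Dict[str, Any],
                 projection: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Explain a find on a pymongo collection (needs a running server)."""
    cmd: Dict[str, Any] = {"find": coll.name, "filter": query}
    if projection is not None:
        cmd["projection"] = projection
    return coll.database.command("explain", cmd, verbosity="executionStats")


def demo() -> None:
    """Hypothetical walk-through; run it only against a test deployment."""
    from pymongo import MongoClient  # imported here so the helpers stay dependency-free

    coll = MongoClient("mongodb://localhost:27017")["test"]["items"]  # assumed names
    coll.create_index([("foo", 1), ("bar", 1)])
    coll.insert_one({"foo": 123, "bar": 456, "isActive": False})

    query = {"foo": 123, "bar": 456}
    # No projection: the document still has to be fetched to be returned.
    print(summarize_explain(explain_find(coll, query)))
    # Only indexed fields, _id excluded: a candidate for a covered query.
    print(summarize_explain(explain_find(coll, query, {"_id": 0, "foo": 1, "bar": 1})))
```

Calling `demo()` prints one summary per query so the two cases can be compared side by side. Neither number says anything about RAM versus disk.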

Reading from and writing to disk is the storage engine's job. I really do not know what kind of statistics you can get from WiredTiger.
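
For what it is worth, the `serverStatus` command does expose a `wiredTiger.cache` section with counters such as "pages read into cache". The sketch below is only a rough idea, with hypothetical helper names (`cache_snapshot`, `pages_read_between`, `measure`): it compares that counter before and after a query. These counters are server-wide, so any other activity on the server changes them too, and they say nothing about one specific document.

```python
from typing import Any, Callable, Dict

# Counter names as reported under serverStatus -> wiredTiger -> cache.
CACHE_KEYS = (
    "bytes currently in the cache",
    "maximum bytes configured",
    "pages read into cache",
)


def cache_snapshot(server_status: Dict[str, Any]) -> Dict[str, int]:
    """Keep only the cache counters we care about from a serverStatus reply."""
    cache = server_status["wiredTiger"]["cache"]
    return {key: cache[key] for key in CACHE_KEYS}


def pages_read_between(before: Dict[str, int], after: Dict[str, int]) -> int:
    """Pages loaded into the cache between two snapshots (server-wide)."""
    return after["pages read into cache"] - before["pages read into cache"]


def measure(db, run_query: Callable[[], Any]) -> int:
    """Run a query between two snapshots; db is a pymongo Database (assumption)."""
    before = cache_snapshot(db.command("serverStatus"))
    run_query()
    after = cache_snapshot(db.command("serverStatus"))
    return pages_read_between(before, after)
```

On a quiet test server, something like `measure(db, lambda: db.items.find_one({"foo": 123}))` returning 0 would suggest the query was served from the cache, but treat that as a hint rather than proof.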

@steevej should be correct.

Thinking about it from a design perspective, the metric totalDocsExamined is an important number that indicates how efficient your query is. If it dropped to 0 whenever all examined docs were fetched from RAM, it would confuse developers into believing (wrongly) that their query is very efficient. The number would also depend heavily on the available memory size, which is not what developers expect.

Only covered queries will have 0 as totalDocsExamined.
