Does excluding fields reduce working set size?

Khaled_ElNaggar · August 10, 2022, 1:39pm

Hello, say I have a collection called article whose documents have author, views and content fields.

For some APIs I want to return the whole document, so I query db.article.find({})

In others I exclude content, so I query db.article.find({}, {content: 0})

From the database engine's perspective, does this optimization help reduce the working set’s size (compared to returning the full document)?

Thank you

NeNaD · August 10, 2022, 1:42pm

Hi,

Yes, it’s an optimization: the content field will not be sent over the network, which means a faster response.

Stennie_X · August 10, 2022, 3:29pm

Hi @Khaled_ElNaggar,

As confirmed by @NeNaD, projection can reduce the size of results returned over the network by removing unnecessary fields from result documents.

Reducing the size of result documents will likely benefit the working set for your client application that has to manipulate result documents, but it generally does not reduce the working set for your MongoDB deployment.

There are two cases to consider with respect to working set impact and projections for a query:

1. The special case of a covered query that can be satisfied entirely using an index (so the original document doesn’t need to be in memory to satisfy this query). Projection can be useful to ensure a query is covered (for example, specifying _id:0 for a secondary index that does not include the _id field). However, the size of the covering index adds to your working set and documents that are frequently or recently accessed will also be in the working set (possibly leading to more memory usage than without the covering index).

2. The general case of a query that is not covered (such as your example above). Projection will not reduce the size of the working set: the full document will be loaded in memory (uncompressed) in order to select the required fields. Large documents where you are frequently working with a small subset of data are typically a schema design anti-pattern.
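
To make the two cases concrete, here is a rough sketch in plain JavaScript. The couldBeCovered helper is hypothetical (it is not a MongoDB API), the { author: 1, views: 1 } index and the filter value are assumptions for illustration, and the check is deliberately simplified: it ignores other conditions such as multikey (array) indexes.

```javascript
// Hypothetical helper (not a MongoDB API): a simplified check for whether a
// query could be covered by a single index. Real coverage depends on more
// conditions (for example, multikey indexes), which this sketch ignores.
function couldBeCovered(indexKeys, filter, projection) {
  const indexed = new Set(Object.keys(indexKeys));

  // Every field used in the filter must be part of the index.
  const filterCovered = Object.keys(filter).every((f) => indexed.has(f));

  // Every field explicitly included in the result must be part of the index.
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);
  const projectionCovered = returned.every((f) => indexed.has(f));

  // _id is returned by default, so it must be excluded explicitly
  // unless the index itself contains _id.
  const idHandled = projection._id === 0 || indexed.has("_id");

  // An exclusion-only projection (such as { content: 0 }) returns every other
  // field, so the full document is needed and the query cannot be covered.
  return returned.length > 0 && filterCovered && projectionCovered && idHandled;
}

const index = { author: 1, views: 1 };   // assumed secondary index
const filter = { author: "some-author" }; // assumed example value

// Not covered: _id is returned by default and is not in the index.
console.log(couldBeCovered(index, filter, { author: 1, views: 1 }));

// Could be covered: _id is excluded and all fields are in the index.
console.log(couldBeCovered(index, filter, { _id: 0, author: 1, views: 1 }));

// Not covered: the query from the question needs the full document.
console.log(couldBeCovered(index, {}, { content: 0 }));
```

Against a real deployment, the authoritative check is the query plan: db.article.find(filter, projection).explain("executionStats") should report totalDocsExamined: 0 for a covered query.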

For more information on document size and working set impact, please see:

One more projection caveat (and a common misstep) is using $project early in an aggregation pipeline as an attempted optimisation. The aggregation framework automatically does dependency analysis to determine which fields are needed for subsequent stages. Early projection is redundant and can lead to less optimal memory usage for pipeline execution.
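
As an illustration of that caveat, the sketch below shows two pipeline definitions over the article collection from the question. The totalViews field name and the grouping by author are assumptions made for the example.

```javascript
// Early $project added as an attempted optimisation. The $group stage only
// reads author and views, which dependency analysis can already determine.
const withEarlyProject = [
  { $project: { author: 1, views: 1 } },
  { $group: { _id: "$author", totalViews: { $sum: "$views" } } },
];

// Same result without the redundant stage.
const withoutEarlyProject = [
  { $group: { _id: "$author", totalViews: { $sum: "$views" } } },
];

console.log(JSON.stringify(withoutEarlyProject));
```

In mongosh either array can be passed to db.article.aggregate(...), and db.article.explain().aggregate(...) can be used to inspect how the server plans the pipeline.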

Regards,
Stennie

Khaled_ElNaggar · August 11, 2022, 9:09am

Thank you for your thorough answer. Really helped.
