MongoDB - combining multiple collections, but each individual query might not need all columns

Here is the scenario: we have 2 tables (issues, anomalies) in BigQuery, which we plan to combine into a single document in MongoDB, since the 2 collections (issues, anomalies) both hold data about a particular site.

[
  {
    "site": "abc",
    "issues": {
      -- issues data --
    },
    "anomalies": {
      -- anomalies data --
    }
  }
]

Some queries require the ‘issues’ data, while others require the ‘anomalies’ data. In the future, we might need to show ‘issues’ and ‘anomalies’ data together, which is why I’m planning to combine the two in a single document.

Questions on the approach above, wrt performance/volume of data read:

When we read the combined document, is there a way to read only specific columns (so the volume of data read is not huge)? Or is the entire document loaded into memory whenever we read it?

Pls let me know.

tia!

A general rule is to keep together data that is used together.

So having issues and anomalies within the same site document would be the way to go.

Performance-wise, if you need both most of the time, it is better since you avoid $lookup. I am not sure what you mean by reading. It could be either of these:

  1. reading by the server when the requested data is not in RAM
  2. reading by the application when the data is manipulated

You cannot really control the server-side reading: documents are read and written as a whole. Even if your application only needs issues for a particular use case, anomalies will still be read from disk into cache if they are not already there. But that should not be a problem unless your documents are very big, as data on disk is compressed.

However, you can control which data the server sends to your application with $project.
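
To make that concrete, here is a minimal sketch in Python, assuming the PyMongo driver and a collection of site documents shaped like the one in the question. The helper names and the idea of a "sites" collection are my own placeholders, not something from the original post. It shows two ways to ask for only the issues part: a projection passed to find_one, and the equivalent $match + $project aggregation.

```python
# Hypothetical helpers: the function names and the "sites" collection
# mentioned in the docstring are illustrative assumptions.

def issues_projection():
    """Projection document: include `issues`, suppress the default `_id`."""
    return {"_id": 0, "issues": 1}


def issues_pipeline(site):
    """Aggregation equivalent: filter by site, then keep only `issues`."""
    return [
        {"$match": {"site": site}},
        {"$project": {"_id": 0, "issues": 1}},
    ]


def fetch_issues(collection, site):
    """Return one site's document containing only the `issues` field.

    `collection` is expected to behave like a PyMongo Collection,
    e.g. MongoClient(uri)["mydb"]["sites"] (database/collection names assumed).
    """
    return collection.find_one({"site": site}, issues_projection())


def fetch_issues_aggregate(collection, site):
    """Same result via the aggregation pipeline, returned as a list."""
    return list(collection.aggregate(issues_pipeline(site)))
```

Either call only trims what travels from the server to the application. As said above, the server still loads the whole document (anomalies included) into its cache to serve the request.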
