Estimating how much RAM a MongoDB system will need is not an easy task. As a rule of thumb we have: the size of all indexes plus the number of active documents multiplied by the average document size. So far so good.
But how can I verify any numbers when my system is up and running? How do I know when I reach limits?
Let’s work through an example:
- we have 128GB RAM
- for simplicity, assume all indexes together take 23.5 GB
- by default MongoDB allocates 50% of (RAM - 1 GB) for the WiredTiger cache, so in this example we have 63.5 GB for MongoDB
- 63.5 GB minus 23.5 GB for the indexes leaves 40 GB for documents
- from the mongod.log we get that the average document size is 4 MB
- 40 GB divided by 4 MB means we could hold at most approximately 10,000 documents in RAM
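The arithmetic above can be sketched in a few lines (all numbers are the example values from the list, not measured ones):

```python
# Sizing rule of thumb from the example above.
GIB = 1024 ** 3
MIB = 1024 ** 2

ram = 128 * GIB
index_size = 23.5 * GIB
avg_doc_size = 4 * MIB          # taken from mongod.log in the example

# WiredTiger's default cache size is 50% of (RAM - 1 GB).
cache = 0.5 * (ram - 1 * GIB)   # 63.5 GiB
room_for_docs = cache - index_size  # 40 GiB left for documents
max_docs_in_ram = room_for_docs / avg_doc_size

print(cache / GIB)              # 63.5
print(room_for_docs / GIB)      # 40.0
print(int(max_docs_in_ram))     # 10240, i.e. roughly 10,000 documents
```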
Here comes the part I want to verify.
- How can I check that the system is still doing well?
- How can I see that documents are served from RAM and not from disk?
- How can I get the number of actively used documents? In the example above we could have 5,000 documents used recently and 5,000 documents that have not been used for a long time. I would read this as: all fine, roughly 40% room to grow (and the indexes will grow too). How can this be measured?
- How are “active documents” defined, and which time frame do we use? I assume this depends on the use case; in the line before I just wrote “recently”.
- Maybe the ratio of dirty RAM to currently allocated RAM can be an indicator, but this would not deliver an early warning; it can only alarm when it is already too late.
- On the other hand, at a higher level: it would be interesting to know the “age” of a document when it is pushed out of RAM (the older the document, the more RAM is available for my active documents).
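For the dirty-RAM idea above, the raw numbers are available in the `wiredTiger.cache` section of `db.serverStatus()`. The metric names below are from that output; the helper function around them is my own sketch, and in practice `status` would come from a driver call such as `client.admin.command("serverStatus")` in pymongo:

```python
# Sketch: derive cache fill and dirty ratios from the wiredTiger.cache
# section of serverStatus. Only the three metric names are from MongoDB;
# the function and the sample numbers below are illustrative assumptions.

def cache_pressure(status):
    cache = status["wiredTiger"]["cache"]
    configured = cache["maximum bytes configured"]
    used = cache["bytes currently in the cache"]
    dirty = cache["tracked dirty bytes in the cache"]
    return {
        "fill_ratio": used / configured,    # how full the cache is
        "dirty_ratio": dirty / configured,  # pressure from unflushed writes
    }

# Fabricated sample numbers, only to show the shape of the result:
sample = {"wiredTiger": {"cache": {
    "maximum bytes configured": 68_174_084_096,      # ~63.5 GiB
    "bytes currently in the cache": 54_539_267_276,  # ~80% full
    "tracked dirty bytes in the cache": 3_408_704_204,
}}}
print(cache_pressure(sample))  # fill_ratio ~0.80, dirty_ratio ~0.05
```

As noted above, these ratios describe the current pressure, not a trend, so on their own they confirm a problem rather than warn of one.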
Does anyone know how to measure this?
The initial question could be rephrased as:
How can I measure the minimal cache size needed for recently used documents?
So this is not like the design phase, where we estimate document sizes, connections and index sizes to get a feeling; this is about measuring a running system.
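One way to approach this measurement, sketched under assumptions: a read served from RAM increments `pages requested from the cache`, while a read that has to go to disk also increments `pages read into cache` (both counters are in the `wiredTiger.cache` section of `db.serverStatus()`). Diffing two samples taken some interval apart gives a cache-miss ratio for that interval, which can serve as a proxy for whether the recently used documents fit in cache. The helper and the sample numbers are my own; only the two metric names come from MongoDB:

```python
# Sketch: cache-miss ratio between two serverStatus samples as a proxy
# for "does the working set fit in RAM". Metric names are from
# wiredTiger.cache; the function and sample data are assumptions.

def miss_ratio(before, after):
    def counters(status):
        c = status["wiredTiger"]["cache"]
        return (c["pages requested from the cache"],
                c["pages read into cache"])
    req0, read0 = counters(before)
    req1, read1 = counters(after)
    requested = req1 - req0
    return (read1 - read0) / requested if requested else 0.0

# Fabricated samples taken, say, 60 seconds apart:
t0 = {"wiredTiger": {"cache": {"pages requested from the cache": 1_000_000,
                               "pages read into cache": 20_000}}}
t1 = {"wiredTiger": {"cache": {"pages requested from the cache": 1_100_000,
                               "pages read into cache": 21_000}}}
print(miss_ratio(t0, t1))  # 0.01 -> 1% of page requests went to disk
```

A miss ratio that stays low while data grows would suggest the recently used documents still fit; a rising trend would be the kind of early warning that the dirty-ratio alone cannot give.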