Storage estimation based on existing collections

Hi,

I need to add more storage to our MongoDB server as the disk is getting full.
I am using MongoDB Community Edition version 4.2 with the WiredTiger storage engine.

When I check the stats of my large collection, I see a big difference between the storageSize and the size: the storageSize is much smaller than the size (about a quarter of it).

I read that the WiredTiger storage engine compresses the data, hence the difference.

My question is:
When I add more disk space, should I base my calculation on the storageSize or the size?
Although I understand that in the end the data will be compressed by the WiredTiger storage engine, I wonder whether MongoDB needs the full uncompressed size on disk for its internal operations?

Thanks,
Tamar

Hi @Tamar_Nirenberg and welcome to the MongoDB community forums!

In practice, it’s advisable to evaluate the compressibility of your documents. This assessment should guide the decision of how much disk space to add, taking into account the anticipated growth of your data.

As mentioned in the WiredTiger documentation, data in the WiredTiger cache is uncompressed, while data on disk is stored in compressed form. If I understand the question correctly, you should base your calculation on storageSize (the compressed size of your documents).
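
To make that concrete, here is a rough Python sketch of how such an estimate could be worked out from the collection stats. It is only an illustration, not an official sizing method: the function name, the `headroom` safety margin, and the choice to scale index size in proportion to data size are my own assumptions, and it presumes future documents compress about as well as the existing ones. The `size`, `storageSize` and `totalIndexSize` fields are the ones reported by `collStats`.

```python
def estimate_additional_disk_bytes(stats, new_uncompressed_bytes, headroom=0.2):
    """Rough on-disk bytes needed to hold `new_uncompressed_bytes` of future data.

    `stats` is a dict shaped like collStats output, for example from
    pymongo's db.command("collStats", "<collection name>").
    `headroom` is a hypothetical safety margin (20% by default), not a
    MongoDB setting.
    """
    size = stats["size"]                    # uncompressed data size, bytes
    storage = stats["storageSize"]          # compressed size on disk, bytes
    indexes = stats.get("totalIndexSize", 0)  # index size on disk, bytes

    if size <= 0:
        raise ValueError("collection is empty, cannot derive a compression ratio")

    # Disk bytes currently used per uncompressed byte of data.
    # Assumption: new data compresses like existing data and indexes
    # grow in proportion to the data.
    disk_per_data_byte = (storage + indexes) / size

    projected = new_uncompressed_bytes * disk_per_data_byte
    return round(projected * (1 + headroom))


GIB = 2**30

# Made-up numbers: storageSize is about a quarter of size, as in the question.
example_stats = {
    "size": 400 * GIB,
    "storageSize": 100 * GIB,
    "totalIndexSize": 20 * GIB,
}

extra = estimate_additional_disk_bytes(example_stats, 200 * GIB)
print(f"Extra disk to plan for: {extra / GIB:.1f} GiB")
```

With these made-up figures, 200 GiB of new uncompressed data works out to roughly 72 GiB of disk (60 GiB for data and indexes plus the 20% margin). Please treat the result as a starting point and re-check the ratio as your data grows.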

Let us know if you have any further questions.

Warm Regards
Aasawari