Data size shrank after upgrading a 5-year-old MongoDB

Hey guys,

My colleagues and I are working with inherited infra, so please don’t judge :) Today we upgraded Mongo on one of the servers from 3.2 to 4.2. After a mongodump followed by a mongorestore, we noticed that the size of the collections had decreased by about 30% (~4GB less). We’re not that familiar with the stored data since we’re new hires, but it does not look like anything is missing.

Do you have any idea what could have caused that reduction in storage size? Do recent versions of Mongo store data in a more efficient way than back in 2015?

Thanks

Hi @Josh_White, welcome to the community.

If you think all your data is there, I don’t think there’s anything to worry about. This is a side effect of how WiredTiger manages its storage. When a document is deleted, WiredTiger doesn’t necessarily release the space back to the OS, on the thinking that a typical database will usually have more data in the future, not less. Thus, the space left by deleted documents is kept for reuse.

If WiredTiger kept releasing space to the OS and then reallocating it, this release-reallocate cycle would do no useful work and would be a net negative for performance. Hence WiredTiger does not do this.

This is outlined briefly in “How do I reclaim disk space in WiredTiger”. In your case, the “file bytes available for reuse” numbers from the old database, found in the wiredTiger “block-manager” section of the db.collection.stats() output and summed across the collections, should roughly add up to the “missing” size in the new database.
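If it helps, here is a minimal sketch of how those numbers could be added up. It assumes you already have each collection's stats document as a Python dict; the function names and the sample values are made up for illustration and are not from your deployment.

```python
from typing import Any, Iterable, Mapping

# Key name as it appears in the wiredTiger "block-manager" section of the stats output.
REUSE_KEY = "file bytes available for reuse"


def reusable_bytes(coll_stats: Mapping[str, Any]) -> int:
    """Return the reusable bytes reported in one collection's stats document.

    Returns 0 if the document has no wiredTiger/block-manager section
    (for example, if the collection is not stored by WiredTiger).
    """
    block_manager = coll_stats.get("wiredTiger", {}).get("block-manager", {})
    return int(block_manager.get(REUSE_KEY, 0))


def total_reusable_bytes(all_stats: Iterable[Mapping[str, Any]]) -> int:
    """Sum the reusable bytes over several collections' stats documents."""
    return sum(reusable_bytes(stats) for stats in all_stats)


if __name__ == "__main__":
    # Hypothetical stats documents, trimmed to the one field of interest.
    sample = [
        {"ns": "app.events", "wiredTiger": {"block-manager": {REUSE_KEY: 3_000_000_000}}},
        {"ns": "app.users", "wiredTiger": {"block-manager": {REUSE_KEY: 1_000_000_000}}},
    ]
    print(f"{total_reusable_bytes(sample) / 1e9:.1f} GB available for reuse")
```

With PyMongo, the stats documents could be collected with something like `db.command("collstats", name)` for each `name` in `db.list_collection_names()`, run against the old 3.2 instance. Indexes live in their own files, so this collection-only sum may come out somewhat lower than the full difference.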

Best regards,
Kevin
