Huge database usage with auto-scaling enabled

Hi All,

I’m trying to diagnose an issue with our MongoDB installation (hosted with MongoDB). We enabled auto-scaling on our account as we’re syncing a few GB of data, and then our account ran out of space, cutting off access to the DB.

However, since doing this, auto-scaling seems to be triggering constantly and repeatedly increasing the size of our cluster.

So while we’re only storing around ~2 GB of data in our system, MongoDB is reporting ~130 GB of disk usage overall. In mongosh, db.stats() reports a dataSize of 1,662 MB but an fsUsedSize of 131,548 MB.
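For anyone wanting to check the same numbers, they come from something like this in mongosh (just a sketch; the scale argument converts the output to MB):

> db.stats(1024 * 1024)   // values scaled to MB; compare dataSize (~1,662) with fsUsedSize (~131,548)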

I’m not quite sure how to fix this: we’re now paying a really high rate for a system that’s only using ~2 GB, but if we reduce the size it’s “out of space” again.

All of this happened within a 24-hour period, in which it went from a ~2 GB instance to 130 GB. Any ideas would be appreciated.

Thanks
~Ben

Hi @Ben_Lewis,

I assume this is an Atlas deployment based on the “auto scaling” mention. If that’s the case and the auto-scaling upgrade has got stuck like this, I would definitely raise it with the Atlas in-app chat support team (or open a support case if you have a support subscription).

The Atlas support team will have more insight into your cluster / Atlas project.

Regards,
Jason


Hey Jason,

Thanks for the reply. Yes, it’s a MongoDB-managed solution.

Chat support has been less than helpful and sent me here unfortunately.

I can confirm the problem is with the oplog, which is reporting 170 GB, which is insane given our system stores around 2 GB.
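For anyone else looking, the oplog size can be checked from mongosh with something like this (read-only checks, so they should be safe to run against Atlas):

> rs.printReplicationInfo()   // configured oplog size vs. the size and window actually in use
> db.getSiblingDB("local").getCollection("oplog.rs").stats(1024 * 1024)   // size / storageSize in MB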

But I can’t seem to change the oplog size, or clear it, or do anything… it’s super frustrating.

OK, so I managed to figure out some things.

  1. for some reason the oplog went haywire and was holding 170 GB of data despite being configured for a 990 MB maximum (not sure how this is possible). You can’t set the max size in MB via the Atlas web GUI.

  2. editing the max oplog window in the Atlas GUI did nothing; the CLI still reported 24 h

  3. editing the oplog size via the shell is not possible, as “replSetResizeOplog” doesn’t seem to be available on their cloud servers (see the example below)
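(For context on point 3: on a self-managed replica set the resize would normally be done with the admin command below, but it appears to be blocked on Atlas-managed clusters.)

> db.adminCommand({ replSetResizeOplog: 1, size: 2500 })   // size in MB; on self-managed deployments, run against each member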

BUT

  1. you can set them via the Atlas CLI; when I set a max size different from the default, it immediately started purging data from the local database and shrinking its file size.
> atlas clusters advancedSettings update Cluster0 --oplogSizeMB 2500
> atlas clusters advancedSettings update Cluster0 --oplogMinRetentionHours 0.25 

Then compact the oplog via the shell (oplog.rs lives in the local database):

> db.getSiblingDB("local").runCommand({ compact: "oplog.rs", force: true })
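After that, re-running the same read-only checks should show the space being reclaimed once the compact goes through:

> rs.printReplicationInfo()   // oplog size and window after the resize
> db.stats(1024 * 1024)       // fsUsedSize should start dropping back down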

Thanks for your help Jason - it was actually another post you commented on about the oplog that really helped me home in on this one.

Seems crazy to me that Atlas support couldn’t have recommended looking at the oplog when I reported this problem.
