MongoDB single node replica set - localDB size bloats and container restarts

I am trying to bring up a single-node MongoDB replica set on a K8s cluster with the following profile:
Server version: 4.4.13
WiredTiger storage engine with journaling enabled
CPU: 4
Memory: 4 Gi
PV size: 30 GB

Once the pod is up and running, I run a restore script (mongorestore) against a mongodump backup, which loads about 13 GB of data into a test database. While the data is loading I notice pod restarts, and the terminated container's logs contain a WiredTiger "Disk quota exceeded" error.

Before the restore, mongod had no data and the used disk size was negligible.
While the restore is running, disk usage shoots up and causes the pod to restart.
After the restore fails, mongod settles down and the disk usage goes back to the expected size.

After this experiment, the show dbs output indicates that the local.oplog.rs collection is bloated with 10-13 GB of data, which is about the same size as the test DB data … Also noticed that the wiredTiger.wt file size is not 0.
The final disk size shows 22 GB (local + test DB).
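
(For reference, sizes like these can be checked from the mongo shell along the following lines; this is just a sketch, not the exact commands I used.)

// Per-database size on disk, in GB
db.adminCommand({ listDatabases: 1 }).databases.forEach(function(d) {
    print(d.name + ": " + (d.sizeOnDisk / 1024 / 1024 / 1024).toFixed(2) + " GB");
});
// On-disk size of the oplog, in bytes
db.getSiblingDB("local").getCollection("oplog.rs").stats().storageSize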

Now any further attempt to restore causes mongod to keep crashing … The container logs also contain fassert errors and WiredTiger recovery logs.

Please help me understand why the mongo local DB is bloating up so much and why the restore is causing such huge disk usage.

Thanks
Gayathri

The collection local.oplog.rs is there for replication purposes (and change streams). It holds as many operations as possible (up to a configurable limit). Since you run mongorestore with 13 GB of data, mongod will try to keep all of those inserts in the oplog so that they can be replicated or streamed via a change stream.

If you do not want the overhead of oplog.rs, simply do not run as a replica set.

If running a replica set is a must (unlikely since you have a single node), you have to live with oplog.rs. You may make it smaller. You may also use a disk snapshot to restore your DB rather than mongorestore.
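
For example, something like this resizes the oplog at runtime (size is in MB, minimum 990; minRetentionHours should be available starting with 4.4 — treat the values below as placeholders):

// Resize the oplog of the current member to roughly 1 GB
db.adminCommand({ replSetResizeOplog: 1, size: 1024 })
// On 4.4 you can also cap how long entries are retained, e.g. 1 hour
db.adminCommand({ replSetResizeOplog: 1, size: 1024, minRetentionHours: 1 })
// Note: shrinking the configured size does not immediately reclaim disk space;
// check the documentation on compacting oplog.rs for the caveats.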

The term "bloated" is quite negative because the oplog is really needed for replication and change streams. If you use neither, then do not run a replica set.


Thanks steevej… You mentioned that oplog.rs holds as many operations as possible (up to a configurable limit), but I notice that it is going past the oplog limit that I have … Is this expected? I can work around it if I can somehow contain this collection's size using some parameter.

MongoDB Enterprise rs0:PRIMARY> show dbs
admin                    0.000GB
config                   0.000GB
iam                      0.000GB
local                    2.228GB
maglev-ingress           0.000GB
managed-services-shared  0.695GB
sys-ops                  0.000GB
MongoDB Enterprise rs0:PRIMARY> db.printReplicationInfo()
configured oplog size:   1520.99560546875MB
log length start to end: 741secs (0.21hrs)
oplog first event time:  Wed Mar 22 2023 18:54:54 GMT+0000 (UTC)
oplog last event time:   Wed Mar 22 2023 19:07:15 GMT+0000 (UTC)
now:                     Wed Mar 22 2023 19:07:19 GMT+0000 (UTC)

The oplog.rs is one of the collections in the local database.

Can you share the size of the other collections?
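
Something like this would do (a rough sketch run from the mongo shell):

// Print the on-disk size, in bytes, of every collection in the local database
var localDb = db.getSiblingDB("local");
localDb.getCollectionNames().forEach(function(name) {
    print(name + ": " + localDb.getCollection(name).stats().storageSize + " bytes");
});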

This output differs from the earlier example… but I have captured all the metrics below.
All collection sizes apart from oplog.rs are minimal.

MongoDB Enterprise rs0:PRIMARY> show dbs
admin                    0.000GB
config                   0.000GB
iam                      0.000GB
local                    7.663GB
maglev-ingress           0.000GB
managed-services-shared  9.026GB
sys-ops                  0.014GB
MongoDB Enterprise rs0:PRIMARY> use local
switched to db local
MongoDB Enterprise rs0:PRIMARY> show collections
oplog.rs
replset.election
replset.initialSyncId
replset.minvalid
replset.oplogTruncateAfterPoint
startup_log
system.replset
system.rollback.id
MongoDB Enterprise rs0:PRIMARY> db.oplog.rs.stats().storageSize
8227635200
MongoDB Enterprise rs0:PRIMARY> db.replset.election.stats().storageSize
36864
MongoDB Enterprise rs0:PRIMARY> db.replset.initialSyncId.stats().storageSize
20480
MongoDB Enterprise rs0:PRIMARY> db.replset.minvalid.stats().storageSize
36864
MongoDB Enterprise rs0:PRIMARY> db.replset.oplogTruncateAfterPoint.stats().storageSize
36864
MongoDB Enterprise rs0:PRIMARY> db.startup_log.stats().storageSize
36864
MongoDB Enterprise rs0:PRIMARY> db.system.replset.stats().storageSize
36864
MongoDB Enterprise rs0:PRIMARY> db.system.rollback.id.stats().storageSize
36864
MongoDB Enterprise rs0:PRIMARY> db.printRepliactionInfo
local.printRepliactionInfo
MongoDB Enterprise rs0:PRIMARY> db.printReplicationInfo()
configured oplog size:   1520.99560546875MB
log length start to end: 8735secs (2.43hrs)
oplog first event time:  Thu Mar 23 2023 10:18:45 GMT+0000 (UTC)
oplog last event time:   Thu Mar 23 2023 12:44:20 GMT+0000 (UTC)
now:                     Thu Mar 23 2023 12:44:26 GMT+0000 (UTC)

Another question I had…
When I run mongorestore, I also notice that disk usage shoots up to about 23 GB even though the total size of the data (including the local DB) is 15 GB. This is transient and settles down to the expected size once the restore operation completes. This suggests there is some temporary metadata file that is being written and then cleaned up… Any light on this? Just FYI, I have journaling enabled, if that matters in this case.
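
For what it is worth, this is a rough sketch of how I compare the size MongoDB reports against the actual disk usage on the PV during the restore (sizeOnDisk is in bytes):

// Sum the on-disk size reported by MongoDB across all databases
var totalBytes = 0;
db.adminCommand({ listDatabases: 1 }).databases.forEach(function(d) {
    totalBytes += d.sizeOnDisk;
});
print("total sizeOnDisk: " + (totalBytes / 1024 / 1024 / 1024).toFixed(2) + " GB");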

Thanks