Why are my databases different sizes across the replica set?

I’m working on getting an old (v 2.6.9) mongo replica set healthy so that we can upgrade it. Unfortunately, there is some weird replication behavior.

Primary show databases output:

History             0.203GB
PublicRecords       0.203GB
Starfields         83.913GB
admin               0.078GB
config              0.078GB
data_sets           0.078GB
local             435.865GB
mongo_data_local    0.078GB
mongo_data_test     0.078GB
nagios              0.203GB
test                0.078GB

Secondary show databases output:

History          0.078GB
PublicRecords    0.203GB
Starfields      83.913GB
admin            (empty)
data_sets        0.078GB
local           50.054GB
nagios           0.078GB
test             (empty)

If you add the data_sets and the local db sizes together, they come to almost exactly the same amount.

I can’t figure out why that is. data_sets on the secondary is also missing a collection.

But rs.status() lists them both as healthy (along with an arbiter).

I want to get the data in a good state before attempting any sort of upgrade. What could cause this?

Hi @Trevor_Rydalch,

Database sizes can vary across replica set members, since some members may have gone through more resyncs than others. In addition, 2.6.9 is a historical version that uses the MMAPv1 storage engine, which is much more sensitive to fragmentation.
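
One quick way to see how much of a database's on-disk size is actual data versus pre-allocated or fragmented space (a rough check, using data_sets as an example database name) is to compare the fields reported by db.stats() on each member:

// Run on each member; under MMAPv1, dataSize is the logical data,
// while storageSize and fileSize include pre-allocated and fragmented space
var s = db.getSiblingDB("data_sets").stats();
print("dataSize:    " + s.dataSize);
print("storageSize: " + s.storageSize);
print("fileSize:    " + s.fileSize);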

Additionally, the size of the local database depends on the oplog size configured for each member. In 2.6.9 the oplog size is defined per member, and most likely the primary has a much larger oplog, resulting in a larger local database compared to the secondary.
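
To check that (a minimal sketch, assuming you can connect to each member with the mongo shell), compare the configured oplog size on the primary and the secondary:

// Run on each member; prints the configured oplog size and the time window it covers
db.printReplicationInfo()

// The same value is available programmatically, in MB
db.getReplicationInfo().logSizeMB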

If you compact or resync all nodes, all non-local databases should end up at roughly the same size. However, this is not critical for upgrades. You can run counts on the collections to verify that you get the same number of documents on each member.
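
As a sketch (data_sets here is just one example database; repeat for the others you care about), something like this lets you compare per-collection counts across members:

// On a secondary, allow reads from this connection first
rs.slaveOk()

var d = db.getSiblingDB("data_sets");
d.getCollectionNames().forEach(function (name) {
    print("data_sets." + name + ": " + d.getCollection(name).count());
});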

Best
Pavel

I just performed a resync of one of the nodes. It’s nearly complete, but the average object size is considerably larger than on the primary and it’s increasing. Any idea why this might be?

Additionally, what could cause document counts to not match across all members of a replica set? The other secondary in the set has fewer documents (a very small percentage) and is not behind on syncing.

This is rather unexpected. When you run rs.printSlaveReplicationInfo(), do you see the same info on both nodes?
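
For reference, this is the check I mean, run from the primary's shell:

// Lists each secondary with its last sync time and how far it is behind the primary
rs.printSlaveReplicationInfo()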

Perhaps, since 2.6 is far from being supported, it is best to take a mongodump from your primary with the 3.6 tools and restore it into a new 3.6 replica set? See the sketch below.

Thanks,
Pavel
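
A rough sketch of that dump-and-restore approach (the host names and paths here are placeholders, not values from this thread):

# Dump from the old 2.6 primary using the 3.6 mongodump
mongodump --host old-primary.example.com --port 27017 --out /backups/rs-dump

# Restore into the primary of the new 3.6 replica set
mongorestore --host new-primary.example.com --port 27017 /backups/rs-dump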