Index rebuilding in Atlas replica set

Our application workload requires regular mass deletes and inserts of new data in a specific collection (based on outside factors, various subsets are deleted daily and replacement data is loaded in).

We have discovered that our indexes seem to grow over time, even when the total data size and document count do not. Running reIndex() cut them down to one third of their size. That difference can be make-or-break for performance, since we could fit many more indexes into RAM.

The problem we have on Atlas is:

  • when we run db.collection.reIndex() on our replica set, the reindexing happens only on the primary node
  • according to docs, we should put the other servers into standalone mode to run it
  • we are not able to directly connect to replica set hosts (might be our fault, but still we would need to remove them from the replica set)

Is there an operationally sustainable (automated, minimal or zero downtime) way to regularly rebuild specific indexes on all hosts in a replica set?

Hi @SamDaroczy,

when we run db.collection.reIndex() on our replica set, the reindexing happens only on the primary node

For replica sets, db.collection.reIndex() will not propagate from the primary to secondaries. db.collection.reIndex() will only affect a single mongod instance.

according to docs, we should put the other servers into standalone mode to run it

I presume you are referring to the v5.0 docs. The v4.4 docs (and earlier) do not state this requirement.

we are not able to directly connect to replica set hosts (might be our fault, but still we would need to remove them from the replica set)

Have you tried using the non-SRV connection string and removing the replicaSet option? An example is below:

mongo "mongodb://<SecondaryHostname>:27017/<DatabaseName>" --ssl --authenticationDatabase admin --username <username> --password <password>

Note: You can generally find the connection string that lists all host names by selecting an older version in the Connect modal window.
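If you want to script this from a driver rather than the shell, the same idea can be sketched in Python. This is only a minimal illustration, not an official recipe: the helper name direct_uri, the example host name, and the credentials are hypothetical, and it assumes your driver supports the standard directConnection, tls and authSource URI options. The resulting string could then be handed to a driver client such as PyMongo's MongoClient.

```python
from urllib.parse import quote_plus


def direct_uri(host, database, username, password, port=27017):
    """Build a non-SRV URI that targets a single replica set member.

    Hypothetical helper for illustration only.
    """
    # Percent-encode credentials so characters like '@' or '/' do not break the URI.
    user = quote_plus(username)
    pwd = quote_plus(password)
    # No replicaSet option is included; directConnection=true asks the driver
    # to talk to this one host instead of discovering the whole set.
    return (
        f"mongodb://{user}:{pwd}@{host}:{port}/{database}"
        "?tls=true&authSource=admin&directConnection=true"
    )


# Placeholder values, not real hosts or credentials.
print(direct_uri("secondary-host.example.net", "mydb", "appuser", "p@ss/word"))
```

Leaving out replicaSet and setting directConnection=true is what keeps the connection pinned to the one member you want to reindex.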

Additionally, it sounds like you have already run this on your PRIMARY, but I would recommend running it on SECONDARY nodes only. You can run it on the SECONDARY nodes one by one, then perform a Test Failover to step the PRIMARY down to a SECONDARY, and run the last reIndex() on that node (this assumes a default 3-node replica set on Atlas).
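The ordering described above can be sketched as a small planning step. This is a rough sketch under assumptions: the function name rolling_reindex_order is hypothetical, and the input is assumed to be shaped like the "members" array returned by the replSetGetStatus command (each entry having "name" and "stateStr"). It only computes the order; the reIndex() calls and the Test Failover itself are not performed here.

```python
def rolling_reindex_order(members):
    """Return member names in rolling order: secondaries first, primary last.

    Hypothetical helper; 'members' is assumed to look like the "members"
    array of replSetGetStatus output.
    """
    secondaries = [m["name"] for m in members if m["stateStr"] == "SECONDARY"]
    primaries = [m["name"] for m in members if m["stateStr"] == "PRIMARY"]
    # The primary goes last: it must be stepped down (Test Failover on Atlas)
    # and become a SECONDARY before reIndex() is run on it.
    return secondaries + primaries


# Made-up host names for a default 3-node replica set.
members = [
    {"name": "host-00.example.net:27017", "stateStr": "PRIMARY"},
    {"name": "host-01.example.net:27017", "stateStr": "SECONDARY"},
    {"name": "host-02.example.net:27017", "stateStr": "SECONDARY"},
]

for step, host in enumerate(rolling_reindex_order(members), start=1):
    print(step, host)
```

Working through one member at a time means the other members keep serving traffic while each rebuild runs.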

Hope this helps.

Kind Regards,
Jason

