Compaction Issue in MongoDB Atlas Replica Set

We have a MongoDB replica set in Atlas with a large number of documents being created and deleted. However, it seems that MongoDB is not reclaiming or reusing the storage freed by TTL deletions, as storage consumption continues to increase.

To address this, I wrote a script that iterates over all databases and collections and executes the compact command (a minimal sketch is shown after the list below). However, I noticed that:

  1. Compaction only works on the primary node.
  2. To compact data on replicas, I had to promote each replica to primary by triggering a resiliency test via the Atlas UI.
  3. After performing this on two nodes, I was able to reclaim a substantial amount of storage.
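
For reference, here is a minimal sketch of that script (mongosh syntax; the system-database filter and output are illustrative, not my exact code):

    // Iterate over all user databases and collections and run compact on each.
    // Assumes the shell is connected directly to the node being compacted.
    db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
        if (["admin", "local", "config"].includes(d.name)) return; // skip system databases
        var database = db.getSiblingDB(d.name);
        database.getCollectionNames().forEach(function (coll) {
            print("Compacting " + d.name + "." + coll);
            printjson(database.runCommand({ compact: coll }));
        });
    });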

The Issue

Now, I’m in a situation where two nodes have significantly more free storage than the third. The problem is:

  • I can’t compact the third server because the primary election always selects one of the two nodes that have already undergone compaction.
  • Atlas does not allow me to manually resync or force compaction on a replica.
  • Even when I connect directly to the third node, the compact command is still routed to the primary.
  • Since MongoDB uses logical replication, compaction commands are not replicated.

Questions:

  1. How can I compact my third server in this scenario?
  2. What happens if the third node reaches its storage limit while the other two still have free space due to compaction?

Would appreciate any insights or recommended approaches to resolve this. Thanks!

  1. How can I compact my third server in this scenario?

Here is a complete, step-by-step guide to performing the compact operation on an Atlas cluster.

All Atlas deployments use the WiredTiger (WT) storage engine. WT is a no-overwrite data engine: rather than returning freed disk space to the operating system, it marks freed blocks as available for re-use and writes new data into them before extending the data files. More often than not, there is no need to forcibly compact your data, as WT should manage this for you.

However, if you would like to reclaim disk space regardless of the information provided above, you can use compact() to try to release unused space that has been allocated for re-use by WT.
Taking the above into account, please see the procedure outlined below (which applies to both replica sets and sharded clusters).
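
To gauge how much space a compact() might return before running it, you can look at the "file bytes available for reuse" figure in a collection's WiredTiger statistics (the database and collection names below are placeholders):

    // Approximate number of bytes that compact could release for one collection.
    var stats = db.getSiblingDB("mydb").getCollection("mycoll").stats();
    print(stats.wiredTiger["block-manager"]["file bytes available for reuse"]);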

Procedure

  1. Create a MongoDB user with the required privileges to run the compact() command.
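
    For a self-managed deployment, a minimal sketch of such a user looks like the following (in Atlas you would grant the equivalent privileges through a database user and custom role created in the UI or API; the role name, user name, and password here are placeholders):

    // Run against the admin database.
    db.getSiblingDB("admin").createRole({
        role: "compactRole",
        privileges: [ { resource: { db: "", collection: "" }, actions: [ "compact" ] } ],
        roles: []
    })
    db.getSiblingDB("admin").createUser({ user: "compactUser", pwd: "CHANGE_ME", roles: [ "compactRole" ] })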

  2. Log into a SECONDARY node via the mongo shell with the user that you just created. This can be done by specifying only the SECONDARY node’s hostname in the mongo shell command, similar to the following (where the 02 node is a current SECONDARY):

    mongo "mongodb://test-shard-00-02-abc123.mongodb.net:27017/test" --ssl --authenticationDatabase admin --username USERNAME --password PASSWORD
    
  3. Run the compact() command. See the documentation for required or optional fields.

    NOTE: Only run the command on one SECONDARY node at a time. Wait for it to complete before moving on to the next node!
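
    For example, to compact a single collection from that shell session ("mycollection" is a placeholder name):

    // Compact one collection in the current database.
    db.runCommand({ compact: "mycollection" })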

  4. Confirm that you have completed the compact() command by downloading and reviewing the member’s mongod log. The following message can be seen when the compact() command is complete:

    2019-06-12T22:45:30.765+0000 I COMMAND [conn9382] compact <collection ns> end
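
    While the command is running, one way to check whether it is still in progress from another shell session is to look for it in currentOp (the filter shown is illustrative):

    // Lists any in-progress compact operations on this node.
    db.currentOp({ "command.compact": { $exists: true } })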
    
  5. Perform steps 2-4 on all current SECONDARY nodes in the cluster.

  6. Once you have completed running compact() on all SECONDARY nodes in the cluster, use the cluster’s Test Failover functionality to step down the current PRIMARY member(s) and wait for the member(s) to transition to SECONDARY status.

  7. Perform the compact() command again on the remaining member(s) that stepped down in step 6.

For more background on the compact() command in MongoDB, please see our compact() command documentation.

  2. What happens if the third node reaches its storage limit while the other two still have free space due to compaction?

Enable storage auto-scaling, and you can always reach out to MongoDB support via chat or a support case if you encounter any issues.