Building a 4-field compound index on a 1.2B-document collection (MongoDB 7.0.20): expected impact on the cluster?

Hello everyone,

We are planning to create a new compound index on a very large collection, and I would like to hear advice from people who have done something similar.

Environment / background:

  • MongoDB version: 7.0.20
  • Collection size: ~1.2 billion documents (90-day TTL)
  • Index to be created: compound index on 4 fields (string + timestamp + int32 + int32); a rough sketch of the createIndex call is shown after this list
  • Cluster type: replica set (3 nodes)
  • Hardware per node: 12 vCPU, 23.5 GB RAM, 2.5 TB fast SSD

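For concreteness, the planned index would be created with something along these lines (the field names below are placeholders rather than our real schema; only the shape, string + timestamp + int32 + int32, matches):

```js
// Placeholder field names; the real ones differ, but the key types match
// the compound index described above (string + timestamp + int32 + int32).
db.myCollection.createIndex(
  { tenantId: 1, createdAt: 1, statusCode: 1, retryCount: 1 },
  { name: "tenantId_1_createdAt_1_statusCode_1_retryCount_1" }
);
```
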
The largest index we have ever built was on a collection of about 2 million documents. That build went relatively smoothly: the impact was noticeable only at the very beginning, after which the build continued in the background with little effect on cluster performance.

My questions are:

  1. With 1.2B documents and a 4-field compound index, how long can the build realistically take? Hours? Days?
  2. What level of impact on the cluster should we expect during the build (CPU, memory, I/O, replication lag)?
  3. Are there any best practices for reducing the impact? (e.g. raising maxIndexBuildMemoryUsageMegabytes, building on a secondary first via a rolling index build, temporarily reducing write load, etc.; see the sketch after this list for what we have in mind). Normally we create indexes through MongoDB Compass, and so far that workflow has been smooth for smaller datasets.
  4. Any real-world cases / metrics from those who built indexes on billion+ document collections would be extremely helpful.

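To make question 3 more concrete, here is roughly what we are considering around the build. The parameter value is just our own guess sized against the ~23.5 GB of RAM per node, not a recommendation we found anywhere:

```js
// Check the current per-build memory limit (200 MB by default, as far as we can tell)
db.adminCommand({ getParameter: 1, maxIndexBuildMemoryUsageMegabytes: 1 });

// Possibly raise it for the duration of the build; 1024 MB is only our guess,
// to be weighed against the ~23.5 GB of RAM available per node
db.adminCommand({ setParameter: 1, maxIndexBuildMemoryUsageMegabytes: 1024 });

// While the build runs, watch its progress and the replication lag
db.currentOp({ "command.createIndexes": { $exists: true } });
rs.printSecondaryReplicationInfo();
```
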
Thanks in advance for sharing your experience!