I’ve been investigating this further, and have even spun up a separate MongoDB 4.0 cluster (same specs) to test alongside the 6.0.2 cluster. I’ve found two things:
- On both clusters, with the same data (1.3M documents), the same indexes, the same specs, and the exact same query, the 4.0 database never spills the $group to disk (disk use explicitly disabled with allowDiskUse: false). On the 6.0.2 cluster, however, it not only uses disk but does so even with just 100,000 documents (tested by inserting a $limit before the $group). Why is spilling to disk so aggressive in 6.0? Is this configurable at all? A rough sketch of the test is below this list.
- When it does spill to disk, the performance difference is significant (not surprisingly). I can see an “internal-xxx-xxxx.wt” file being created in my MongoDB data directory. It only ever reaches 10MB (tiny), but it grows slowly, at roughly 3MB per minute, so it takes about 3 minutes to get there, then it stops at that size before shrinking to 8MB. I/O on the affected volume is incredibly low (the oplog sits on a separate disk), yet one core of the 8-core, 64GB RAM machine is pinned at 100%. The second sketch below shows how I’ve been inspecting the spill.
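For context, this is roughly the shape of what I’m running on both clusters, in mongosh. The collection and field names here are placeholders, not my real schema:

```javascript
// Placeholder collection/fields; the real pipeline groups ~1.3M documents.
const pipeline = [
  { $group: { _id: "$customerId", total: { $sum: "$amount" }, docs: { $sum: 1 } } }
];

// 4.0: completes entirely in memory, even with disk use explicitly disabled.
db.orders.aggregate(pipeline, { allowDiskUse: false });

// 6.0.2: the same pipeline spills to disk, and still spills when the input
// is capped with a $limit ahead of the $group.
db.orders.aggregate([{ $limit: 100000 }, ...pipeline]);
```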
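And this is how I’ve been checking whether the $group actually spilled on 6.0.2. My understanding (please correct me if I’m wrong) is that explain’s executionStats reports disk-use counters for the group stage, and that 6.0.1 added an allowDiskUseByDefault server parameter defaulting to true; the exact stat field names may differ between the classic and slot-based engines:

```javascript
// Look for usedDisk / spill counters on the group stage (field names may vary
// between the classic and slot-based execution engines).
const stats = db.orders
  .explain("executionStats")
  .aggregate([{ $group: { _id: "$customerId", total: { $sum: "$amount" } } }]);
printjson(stats);

// Assumption: 6.0.1+ exposes allowDiskUseByDefault (defaults to true), which
// lets pipelines write temp files without allowDiskUse being passed explicitly.
db.adminCommand({ getParameter: 1, allowDiskUseByDefault: 1 });

// Presumably it can be turned off, though that would only forbid spilling
// rather than raise the in-memory limit:
// db.adminCommand({ setParameter: 1, allowDiskUseByDefault: false });
```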
This feels to me like a bug, but I’d appreciate confirmation either way so that I’m not wasting my time here.