Hello, I’m fairly new to MongoDB Aggregation Pipelines. I’m trying to test the performance of MongoDB pipelines for possible use in my company.
For this test, I have generated 50 million documents (using this amazing tool). The document structure is fairly simple:
- cost (float)
The main index is a composite index for
One of my requirements for aggregation queries is to sum cost for a specific day (or month), filtering by type. The pipeline I designed is quite simple, but it does the job and it’s also quite fast, except for the first “execution”.
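For illustration only, here is a minimal sketch of what a pipeline of this kind might look like, written as plain Python dictionaries. The `cost` and `type` fields come from the description above; the `date` field name, the `total_cost` output name, and the helper `build_daily_cost_pipeline` are assumptions made up for this example, not the actual schema or pipeline.

```python
from datetime import datetime, timedelta


def build_daily_cost_pipeline(day, doc_type, date_field="date"):
    """Build a pipeline that sums `cost` for one day and one `type`.

    `date_field` is a hypothetical name; replace it with the real
    date field of the collection.
    """
    # Half-open range [start, end) covering the whole day.
    start = datetime(day.year, day.month, day.day)
    end = start + timedelta(days=1)
    return [
        # Filter first, so an index on the filtered fields can be used.
        {"$match": {"type": doc_type, date_field: {"$gte": start, "$lt": end}}},
        # Collapse all matching documents into a single total.
        {"$group": {"_id": None, "total_cost": {"$sum": "$cost"}}},
    ]


pipeline = build_daily_cost_pipeline(datetime(2023, 1, 15), "A")
```

With PyMongo, a list like this could be passed to `collection.aggregate(pipeline)`. For a monthly sum, only the date range in the `$match` stage would need to change.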
To give an idea of the performance: on my 2.4 GHz laptop, with a collection of 50 million documents, the first execution takes between 2 and 3 seconds, and subsequent executions take between 500 and 600 ms.
Overall, these are excellent response times, but I’m wondering why the first execution is always a bit slower?
Is MongoDB caching the “correct” execution plan? Or am I getting cached results from the second execution onward?
I would love to understand more, and also how I should go about finding out what MongoDB is doing under the hood in cases like this.