Hi,
I have a very slow aggregation (240 sec), but an index exists, which should match the query perfectly.
If I add a hint to the aggregation with the index name it is blazing fast (0.2 sec) on 120M records.
Now, if I compare the explain output between the aggregation with and without hint, they look the same.
Both use a IXSCAN on the same index.
If your usual use-case is the query shared where you have equality match on paid_by and range match on timestamp, your index would be better with paid_by before timestamp.
I don’t think your $project helps but it may negatively impact.
Hi!
I can help you with some insights that I took from your query and your index.
The first think I noticed is that your index uses a lot of fields. This can be a problem because the indexes are documents just like the documents from your collections. So if your index is huge, this will increase the overall cost of the operations in your collections.
I would recommend you to create an index with less fields.
The second thing that I noticed is that you are not using the ESR rule for your index, which says that you should place first the equality, then sort and lastly the range field.
For your use case, action and paid_by are equality operators, timestamp and quality are range operators and you don’t have any sort operation, so for this case it would not be necessary.
Another topic that I would like to mention is that you should try to always use the more granular fields first, for the equality operators. It means that, in your cause, you should place paid_by after action, because I imagine that paid_by can have much more different values than action.
And since you only need the _id field and the quality field, I would recommend you inserting this field in the last position of your index, so your query will not need to fetch the document from disk, because the index will have everything that is needed.
I would recommend this index for this specific query: