Insanely slow aggregation query

The query below takes up to 10 seconds to aggregate data. I have around 10 million records in collection. I tried multiple approaches but nothing worked. Any ideas?

db.collection.aggregate([
            {
                $match: {
                    start: {
                        $gte: new Date(monthBefore),
                        $lte: new Date(today)
                    }
                }
            },
            {
                $group: {
                    _id: {
                        symbol: '$symbol'
                    },
                    unverifiedCount: {
                        $sum: {
                            $cond: {
                                if: { $eq: ['$isVerified', false] }, then: '$count', else: 0
                            }
                        }
                    },
                    verifiedCount: {
                        $sum: {
                            $cond: {
                                if: { $eq: ['$isVerified', true] }, then: '$count', else: 0
                            }
                        }
                    }
                }
            }, {
                $sort: {
                    unverifiedCount: -1
                }
            }
        ])

Would like to mention that creating indexes for the collection didn’t increase the performance either.
Indexes:

 "start" : -1

"symbol" : -1,
"start" : -1,
"isVerified" : -1

"symbol" : -1

Hi @Ali_Abdulhameed,

Welcome to the community :wave:

/// index 1
"start" : -1

/// index 2
"symbol" : -1,
"start" : -1,
"isVerified" : -1

/// index 3
"symbol" : -1

I presume these are 3 seperate indexes. Please correct me if I’m wrong here.

Only the first index "start" : -1 is relevant for your pipeline. Additionally, once the results are fetched via $match , the $group and $sort stage are processing results in-memory. Could you provide the explain() output for the aggregation you are running?

Please to refer to the pipeline operators and indexes documentation which you may find useful.

Regards,
Jason

2 Likes

It would be most helpful if you provide results for explain("executionStats") when asking about performance issue for a particular pipeline. Please also let us know what version of MongoDB this is and what server attributes are (like amount of RAM, etc).

Asya

1 Like

Here is the result of the query with explain()

{
        "stages" : [
                {
                        "$cursor" : {
                                "query" : {
                                        "start" : {
                                                "$gte" : ISODate("2021-12-11T00:00:00Z"),
                                                "$lte" : ISODate("2022-01-11T00:00:00Z")
                                        }
                                },
                                "fields" : {
                                        "count" : 1,
                                        "isVerified" : 1,
                                        "symbol" : 1,
                                        "_id" : 0
                                },
                                "queryPlanner" : {
                                        "plannerVersion" : 1,
                                        "namespace" : "Database.collection",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {
                                                "$and" : [
                                                        {
                                                                "start" : {
                                                                        "$lte" : ISODate("2022-01-11T00:00:00Z")
                                                                }
                                                        },
                                                        {
                                                                "start" : {
                                                                        "$gte" : ISODate("2021-12-11T00:00:00Z")
                                                                }
                                                        }
                                                ]
                                        },
                                        "winningPlan" : {
                                                "stage" : "FETCH",
                                                "inputStage" : {
                                                        "stage" : "IXSCAN",
                                                        "keyPattern" : {
                                                                "start" : -1
                                                        },
                                                        "indexName" : "start_-1",
                                                        "isMultiKey" : false,
                                                        "multiKeyPaths" : {
                                                                "start" : [ ]
                                                        },
                                                        "isUnique" : false,
                                                        "isSparse" : false,
                                                        "isPartial" : false,
                                                        "indexVersion" : 2,
                                                        "direction" : "forward",
                                                        "indexBounds" : {
                                                                "start" : [
                                                                        "[new Date(1641859200000), new Date(1639180800000)]"
                                                                ]
                                                        }
                                                }
                                        },
                                        "rejectedPlans" : [ ]
                                },
                                "executionStats" : {
                                        "executionSuccess" : true,
                                        "nReturned" : 7222420,
                                        "executionTimeMillis" : 14323,
                                        "totalKeysExamined" : 7222420,
                                        "totalDocsExamined" : 7222420,
                                        "executionStages" : {
                                                "stage" : "FETCH",
                                                "nReturned" : 7222420,
                                                "executionTimeMillisEstimate" : 13158,
                                                "works" : 7222421,
                                                "advanced" : 7222420,
                                                "needTime" : 0,
                                                "needYield" : 0,
                                                "saveState" : 56429,
                                                "restoreState" : 56429,
                                                "isEOF" : 1,
                                                "invalidates" : 0,
                                                "docsExamined" : 7222420,
                                                "alreadyHasObj" : 0,
                                                "inputStage" : {
                                                        "stage" : "IXSCAN",
                                                        "nReturned" : 7222420,
                                                        "executionTimeMillisEstimate" : 2720,
                                                        "works" : 7222421,
                                                        "advanced" : 7222420,
                                                        "needTime" : 0,
                                                        "needYield" : 0,
                                                        "saveState" : 56429,
                                                        "restoreState" : 56429,
                                                        "isEOF" : 1,
                                                        "invalidates" : 0,
                                                        "keyPattern" : {
                                                                "start" : -1
                                                        },
                                                        "indexName" : "start_-1",
                                                        "isMultiKey" : false,
                                                        "multiKeyPaths" : {
                                                                "start" : [ ]
                                                        },
                                                        "isUnique" : false,
                                                        "isSparse" : false,
                                                        "isPartial" : false,
                                                        "indexVersion" : 2,
                                                        "direction" : "forward",
                                                        "indexBounds" : {
                                                                "start" : [
                                                                        "[new Date(1641859200000), new Date(1639180800000)]"
                                                                ]
                                                        },
                                                        "keysExamined" : 7222420,
                                                        "seeks" : 1,
                                                        "dupsTested" : 0,
                                                        "dupsDropped" : 0,
                                                        "seenInvalidated" : 0
                                                }
                                        }
                                }
                        }
                },
                {
                        "$group" : {
                                "_id" : {
                                        "symbol" : "$symbol"
                                },
                                "unverifiedCount" : {
                                        "$sum" : {
                                                "$cond" : [
                                                        {
                                                                "$eq" : [
                                                                        "$isVerified",
                                                                        {
                                                                                "$const" : false
                                                                        }
                                                                ]
                                                        },
                                                        "$count",
                                                        {
                                                                "$const" : 0
                                                        }
                                                ]
                                        }
                                },
                                "verifiedCount" : {
                                        "$sum" : {
                                                "$cond" : [
                                                        {
                                                                "$eq" : [
                                                                        "$isVerified",
                                                                        {
                                                                                "$const" : true
                                                                        }
                                                                ]
                                                        },
                                                        "$count",
                                                        {
                                                                "$const" : 0
                                                        }
                                                ]
                                        }
                                }
                        }
                },
                {
                        "$sort" : {
                                "sortKey" : {
                                        "unverifiedCount" : -1
                                }
                        }
                }
        ],
        "ok" : 1
}

I included the results of explain(“executionStats”) in a reply to the original post.

I included the results of explain(“executionStats”) in a reply to the original post…

Thanks for providing the output @Ali_Abdulhameed,

Based off the output, the 7222420 (~7.2million) documents are being fetched via ranged index query which means indexing is not the issue here.

Could you provide the MongoDB version being utilised as well as the infrastructure resource details (RAM, CPU, etc) as suggested by Asya?

One suggestion would be to perform the $match stage by itself, monitor the amount of time this takes and compare this to the timing of a $match, $group and finally compared to a $match, $group, $sort (the original pipeline you have posted).

I would suggest also monitoring resource utilisation during each of these tests (RAM, I/O, and CPU) to see if there are any bottlenecks and additionally running each of the aggregations several times to allow indexes and documents to be cached in RAM

If the $match stage is slow, it could possibly be caused by RAM exhaustion (leading to swapping to/from disk with high I/O wait on CPU).

If the $group and $sort stage add significant overhead, it would suggest CPU might be the limiting factors since these stages are happening in-memory.

Regards,
Jason