Aggregate slow in a huge collection

Hello all,

I am working with mongodb 4.4 and i am faced with strange behavior.

I run a this very simple aggregate in a collection that contains actually 30 millions of documents :

db.ECRITURE.aggregate([
    { "$match": {
        "A_ECR_01": "01",
	      "A_ECR_02": "02"
    } },
    { "$group": {
        "_id": {"$concat": ["$A_ECR_01", "_", "$A_ECR_02" , "_", "$A_ECR_03"]},
        "A_CHN_ACH_011":{
            "$sum":{
               "$cond":[
                  {
                     "$in":[
                        "$codRegroupement",
                        [
                           "DEB_SST"
                        ]
                     ]
                  },
                  "$A_CHN_ACH_011",
                  null
               ]
            }
         }
    } }
])

On my collection i have a compound index on A_ECR_01, A_ECR_02, A_ECR_03. But this aggregate takes about 30s.

But if i execute this same aggregate on a collection containing only the documents for “A_ECR_01”: “01” and “A_ECR_02”: “02” it takes 1s.

Note : there is around 500K documents in the collections for “A_ECR_01”: “01” and “A_ECR_02”: “02”.

It’s like the match stage doesn’t work well.

Do you have an idea where to investigate ?

Best regards.

Hi @steph_xp,

Welcome to MongoDB community!

First of all have you tried using just the index with the 2 exact fields for the match stage on this collection? If that yield better results why not to use it.

I suspect this is since $group stage has concatenation in its grouping expression so the engine cannot use the index and needs to access each document to perform the grouping.

Why not to group based on the 3 values and add the concatenation in a next project stage?

"_id": {A: "$A_ECR_01", B:  "$A_ECR_02" ,C: "$A_ECR_03"},
        "A_CHN_ACH_011":{

If that won’t help, please run explain(“executionStats”). aggregate and show output.

Thanks
Pavel