Slow query with missing field

Hi,

I have the below query which is very slow on a collection of about 2M documents. I believe it is due to the ‘retracted=null’ filter (if I remove it the query is fast). I have the following indexes: retracted_1_date_-1_feedCode_1
and
retracted_1_date_-1_feedCodes_1_feedCode_1_status_1

Oddly enough the first one seems to be the one that’s used as the second one shows zero uses (which is strage since status is part of the query).
It should be note that the vast majority of documents have retracted missing null (the field is actually missing), but based on what I read that should not make a difference. Any ideas why this query is so slow? Thanks.

"find": "stories",
    "filter": {
      "retracted": {
        "$eq": null
      },
      "status": {
        "$in": [
          1,
          null
        ]
      },
      "$or": [
        {
          "feedCode": {
            "$in": [
              "EVERT"
            ]
          }
        },
        {
          "feedCodes": {
            "$in": [
              "EVERT"
            ]
          }
        }
      ]
    },
    "sort": {
      "date": -1
    },
    "projection": {
      "body": 0
    },
    "limit": 21,

Hello, welcome to the MongoDB community.

If the retracted field is generally missing, it might be ideal to validate the use of a Sparse index.

https://www.mongodb.com/docs/manual/core/index-sparse/#:~:text=Sparse%20indexes%20only%20contain%20entries,all%20documents%20of%20a%20collection.

Thanks. I looked into that, but need to be able to query specifically documents that are missing the field (or have it set as null) and meet other conditions. I don’t think a sparse index would help achieve that?

A filtered index perhaps?

Could you be more specific? I’m not sure how I would apply a filtered index to this need (where all my stories have a code, and I need to query on the code but include only stories with retracted = null).

More generally though, why is the query slow, and why is it not using the index I would expect?

I imagine the problem with retracted is because MongoDB needs to check each document to see if the field is missing or null, which can be slow if it’s not properly indexed or if the majority of documents meet the criteria. Ideally, provide an explanation.

Regarding the index, try creating a filtered index like this and see the result:

db.coll.createIndex(
    { fields ... }, 
    { "partialFilterExpression": { "retracted": { "$eq": null } } }
);