Why my aggregation is not using index? (Atlas Search configured)

Hello there,

I am developing an application that access a mongo database. We have created a collection with 455k documents for testing purposes. While developing, I realized that the performance was not that good (more than a second to perform the query). I know it sounds fast, but it doesn’t look good for future.

So, I was using Mongo Atlas Profiler and there was where I saw a strange behaviour. The planSummary is COLLSCAN, and the docsExamined are 455735. I can believe with that info my entire collection is being scaned. Here comes the problem: the field that I am using inside match has an index and I really don’t know why it is not being used. I use an Atlas search in another field in this collection (I didn’t create it as the default).

Below are the infos that Atlas profiler is showing to me. Sorry, I know it is big. Being more specific: the field tenantId in the first match in pipeline has an index. As I said, it is an aggregation, with some steps…

{
  "type": "command",
  "ns": "smartpos-dev-env.product",
  "command": {
    "aggregate": "product",
    "pipeline": [
      {
        "$match": {
          "tenantId": "bbb60d4e-212f-445e-97a7-ddad13395931"
        }
      },
      {
        "$lookup": {
          "from": "category",
          "localField": "category",
          "foreignField": "_id",
          "as": "category"
        }
      },
      {
        "$unwind": "$category"
      },
      {
        "$project": {
          "legacyId": "$productCode",
          "id": 1,
          "codAlfa": 1,
          "description": 1,
          "saleValue": 1,
          "promotionalValue": 1,
          "promotionalDisplayTimer": 1,
          "promotionalExpirationDate": 1,
          "lastUpdate": 1,
          "detail": 1,
          "googleProductCategory": 1,
          "hasVariant": 1,
          "featuredPosition": 1,
          "categoryId": "$category._id",
          "categoryDescription": "$category.description"
        }
      },
      {
        "$sort": {
          "featuredPosition": 1
        }
      },
      {
        "$skip": 0
      },
      {
        "$limit": 30
      }
    ],
    "cursor": {},
    "allowDiskUse": false,
    "collation": {
      "locale": "pt"
    },
    "$db": "smartpos-dev-env",
    "$clusterTime": {
      "clusterTime": {
        "$timestamp": {
          "t": 1657059071,
          "i": 20
        }
      },
      "signature": {
        "hash": {
          "$binary": {
            "base64": "XKFGrZDq3g2cs9HUQaoigMg67z0=",
            "subType": "00"
          }
        },
        "keyId": 7116985341772300000
      }
    },
    "lsid": {
      "id": {
        "$binary": {
          "base64": "56xHmGfTRCG9TAuLgbUhKg==",
          "subType": "04"
        }
      }
    }
  },
  "planSummary": "COLLSCAN",
  "keysExamined": 7,
  "docsExamined": 455735,
  "hasSortStage": true,
  "cursorExhausted": true,
  "numYields": 455,
  "nreturned": 8,
  "queryHash": "21B91BC1",
  "planCacheKey": "F49AA791",
  "reslen": 2575,
  "locks": {
    "Global": {
      "acquireCount": {
        "r": 473
      }
    },
    "Mutex": {
      "acquireCount": {
        "r": 18
      }
    }
  },
  "readConcern": {
    "level": "local",
    "provenance": "implicitDefault"
  },
  "writeConcern": {
    "w": "majority",
    "wtimeout": 0,
    "provenance": "implicitDefault"
  },
  "storage": {},
  "remote": "54.211.177.188:35965",
  "protocol": "op_msg",
  "durationMillis": 301,
  "v": "5.0.9"
}

Hi @Renan_Geraldo,

Actually, this behavior doesn’t seem to be due to the Atlas Search index configured on this collection. Since the first stage in the pipeline is a $match, and not a $search, the presence or absence of an Atlas Search index on this collection will make no difference.

I suspect that the behavior that you are observing might be due to the collation being used in the query, which is:

    "collation": {
      "locale": "pt"
    },

Does the index on tenantId have the same collation defined as the above? Note that an index with a collation cannot support an operation that performs string comparisons on the indexed fields if the operation specifies a different collation, as documented here. Also, the reverse is true - if the index does not have a collation defined then a query like the above that specifies a collation won’t be able to use that index, and will resort to a COLLSCAN.

Please check if that’s the case here.

Thanks,
Harshad

3 Likes

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.