Why Query doesnt use index with ESR rule here?

Hi below query has been executed in Mongo

  "command": {
    "getMore": 7229634113631845000,
    "collection": "data",
    "batchSize": 4899,

.......


  "originatingCommand": {
    "find": "data",
    "filter": {
      "accountId": "AAA-367YTGSA",
      "customIterator": {
        "$gte": {
          "$date": "2072-11-05T01:41:58.041Z"
        }
      },
      "startTime": {
        "$lte": {
          "$date": "2022-12-06T17:00:00Z"
        }
      },
      "type": {
        "$in": [
          "TYPE_A",
          "TYPE_B"
        ]
      }
    },
    "sort": {
      "accountId": 1,
      "customIterator": 1
    },
    "limit": 5000,
    "maxTimeMS": 300000,

.....



  "planSummary": [
    {
      "IXSCAN": {
        "accountId": 1,
        "customIterator": 1,
        "startTime": 1,
        "type": 1
      }
    }
  ],

I have two indexes as below:

First Index:

accountId_customIterator_startTime_type
accountId:1 customIterator:1 startTime:1 type:1  

Second Index:

accountId_type_customIterator_startTime
accountId:1  type:1  customIterator:1  startTime:1  

As per my understanding, the query should be using the second Index as per ESR rule but plan summary states the story otherwise.

"planSummary": [
    {
      "IXSCAN": {
        "accountId": 1,
        "customIterator": 1,
        "startTime": 1,
        "type": 1
      }

What I am missing here?

Hi @Abhinav_27971 ,

We show the ESR rule as a good starting point, but its not necessarily the only consideration an optimiser takes.

If a query shape can be fulfilled with an index the database will try to exmine it. Possibly it will also be chosen over another index even if it results with worse performance eventually or over time (as plans get cached for queries).

In your case I believe the database treated the type $in as a range operator while respecting your sorting fields which require

"accountId": 1,
 "customIterator": 1,

If you will use an index that does not have those fields consecutively, it will not be able to use it for the sort. This is a heavy penalty to sort in memory compare to sort on index and I beilive this is the main incentive of choosing the index.

I would say the best index for this query is actually:

{
        "accountId": 1,
        "customIterator": 1,
        "type": 1,
        "startTime": 1

      }

So we have the equality and sort bundled as first part then we go to any other equality/range fields.

Again, to make sure that this index is the best one you can use a hint clause and run it to check if performance is indeed better. Otherwise, use the most performant index even if it is not exactly according to ESR…

Thanks
Pavel

1 Like