MultiKey Index (Wildcard) not used in range query

Hi,

Long time reader first time poster! thanks for all the questions and answers so far.

I have an interesting issue, where a range query avoids the index but a single $gte or $lte will use the multi key index and return in sub-second.

I have setup a basic test that mimics the basic properties of my actual scenario:

{
  "_id": "",
  "groupId": "da65de02-35f0-4885-8c67-8b68109cd2a3",
  "data": [
    {
      "parent": [
        {
          "child": {
            "intValue": 1,
            "dateValue": "2025-01-04T15:14:16.086Z",
            "stringValue": "1"
          }
        }
      ]
    }
  ]
}

I created 200K records, grouped into two 100K groups, to better represent my data. The data I am using in production is more than 400K for each document so this test does not completely mimic the results but the test is to ensure the index is used.

collection.createIndex(
{
    groupId: 1,
    "data.$**": 1,
  },
  {
    name: "groupId_data_wildcard",
  }
);

Then when we run an aggregation on the data as follows:

[{
    $match:
    {
      groupId: "da65de02-35f0-4885-8c67-8b68109cd2a3",
      "data.parent.child.intValue": {
        $gte: 1,
        $lt: 20
      }
    }
}]

The index is used but the wildcard part of the index is not used at all.


If I remove the range query and change to $gte or $lte then the wildcard index is used (Multi Key Index is now yes in the explain where as before it is no.

[{
  $match:
    /**
     * query: The query in MQL.
     */
    {
      groupId:
        "da65de02-35f0-4885-8c67-8b68109cd2a3",
      "data.parent.child.intValue": {
        $lt: 20,
      }
    }
}]


Final Thought

The crazy thing is if I remap the parent object to a new field in the aggregation and then do the additional filter I can get the time down to a more respectable number, however it stil is not ideal and feels very hacky.

[
  {
    $match:
      {
        groupId: "da65de02-35f0-4885-8c67-8b68109cd2a3",
        "data.parent.child.intValue": {
          $gte: 0
        }
      }
  },
  {
    $project:
      {
        newData: "$data.parent"
      }
  },
  {
    $unwind:
      {
        path: "$newData",
        preserveNullAndEmptyArrays: true
      }
  },
  {
    $sort:
      {
        "newData.child.intValue": 1
      }
  },
  {
    $sort:
      {
        "newData.child.intValue": 1
      }
  },
  {
    $match:
      {
        "newData.child.intValue": {
          $lt: 20
        }
      }
  }
]


Questions

  1. Is this expected behaviour?
  2. Can I optimise this query at all?
  3. Any suggestions?

Thanks in advance,