Why compound wildcard index still do fetch?

Hi @steevej ,

I tried method 2 group lookup, but it took about 2 minutes to get all data:

db.device.aggregate(
[
  {
    $match: {
      year: 2024,
      month: 8,
      day: 1,
      "attributes.country": {
        $gte: "AA",
        $lte: "ZZ"
      }
    }
  },
  {
    $group: {
      _id: {
        appId: "$appId",
        os: "$os",
        country: "$attributes.country"
      }
    }
  },
  {
    $lookup: {
      from: "device",
      let: {
        id_appId: "$_id.appId",
        id_os: "$_id.os",
        id_country: "$_id.country"
      },
      as: "result",
      pipeline: [
        {
          $match: {
            year: 2024,
            month: 8,
            day: 1,
            $expr: {
              $and: [
                {
                  $eq: ["$appId", "$id_appId"]
                },
                { $eq: ["$os", "$id_os"] },
                {
                  $eq: [
                    "$attributes.country",
                    "$id_country"
                  ]
                }
              ]
            },
            "attributes.lang": {$gte: "aa", $lte: "zz"}
          },
        },
                  {
            $group: {
              _id: {
                appId: "$id_appId",
                os: "$id_os",
                country: "$id_country",
                lang: "$attributes.lang"
              },
              value: {
                $sum: 1
              }
            }
          }
      ]
    }
  },
  {$project: {
    _id: 0,
    result: "$result"
  }},
  {$unwind: {
    path: "$result",
    preserveNullAndEmptyArrays: true
  }},
  {$group: {
    _id: {
      appId: "$result._id.appId",
      os: "$result._id.os",
      country: "$result._id.country",
      lang: "$result._id.lang",
    },
    value: {
      $sum: "$result.value"
    }
  }}
]
)

I think the method 2 could not be a good solution.
I added appId in first $match, but it still needs 7 seconds:

  {
    $match: {
      year: 2024,
      month: 8,
      day: 1,
      appId: 37,
      "attributes.country": {
        $gte: "AA",
        $lte: "ZZ"
      }
    }
  },

Besides, if I need more attributes to group, I guess more $lookup are needed.

The reason why I put country, lang… to a attributes field is because I have many reports that will group by different fields, for example:

  • report by country
  • report by lang
  • report by appVersion
  • report by contient, country, province
  • report by new_field_1, new_field2

In the future, new fields will be added to attributes field to support new reports, and I won’t need to recreate my compound wildcard index.
So I guess I should put country, lang, … to a attributes field.
But if there is a new report, like new_field_report by country, I need to add this new_field to my compound wildcard index.
This is the disadvantage.

Please let me know if any better ideas!
Thank you very much.