Querying for all unique key-value pairs in an array property across all documents

steevej · January 20, 2023, 1:50pm

For

I will simply remove the $sort. You still get an IXSCAN without it. The $match and $project seems sufficient to use the index.

Note that performance issues are not strictly caused by the code logic. It also depends on the hardware you use. If the working set does not fit in RAM, you will be I/O bound and that is slow.

Read Atlas Search to return all the applicable filters for a given search query without specifying - #2 by Erik_Hatcher, the idea is smart as you may be able to add an extra field to speed up things rather than a new collection. Your documents could look like:

[
{
  "Tags": [
    {
      "k": "Type",
      "v": "Asset"
    },
	{
      "k": "Code",
      "v": "DEV"
    },
  ], 
  "k" : [ "Type" , "Code" ]
  "Owner": "Test"
},
{
  "Tags": [
    {
      "k": "Type",
      "v": "Metric"
    },
	{
      "k": "Code",
      "v": "POL"
    },
  ],
 "k" : [ "Type" , "Code" ] ,
  "Owner": "Test"
},
{
  "Tags": [
    {
      "k": "Type",
      "v": "Asset"
    },
	{
      "k": "Asset",
      "v": "asset1.json"
    },
  ],
  "k" : [ "Type" , "Asset" ] ,
  "Owner": "Test"
}
]

The index Owner:1,k:1 would be a much smaller working set.

Another thing you could try is to use the index Owner:1,Tags:1 to see if the index size is smaller it may help.

Schema wise, do you really need the attribute pattern for tags. The range of value of “k” seems to be quite small so having direct attributes “Type”, “Asset” and “Code” might be much better.