$searchMeta Facets - Return only filtered elements

Hi,
I’m having trouble with facets when working with arrays. The facet returns all elements of matching arrays, and I would like to return only those elements that matched my condition.

Given the following collection that contains a single document:

{"tags":["aaa","bbb","ccc"]}

I would like to run the following query and find all tags that contain the character “a”:

{
  "index": "test",
  "facet": {
    "operator": {
      "autocomplete": {
        "query": "a",
        "path": "tags"
      }
    },
    "facets": {
      "titleFacet": {
        "type": "string",
        "path": "tags",
        "numBuckets": 100
      }
    }
  }
}

This is the search index:

{
  "analyzer": "search_keyword_lowercaser",
  "searchAnalyzer": "search_keyword_lowercaser",
  "mappings": {
    "dynamic": false,
    "fields": {
      "tags": [
        {
          "analyzer": "search_keyword_lowercaser",
          "searchAnalyzer": "search_keyword_lowercaser",
          "type": "string"
        },
        {
          "type": "stringFacet"
        },
        {
          "analyzer": "search_keyword_lowercaser",
          "foldDiacritics": false,
          "maxGrams": 15,
          "minGrams": 1,
          "tokenization": "nGram",
          "type": "autocomplete"
        }
      ]
    }
  },
  "analyzers": [
    {
      "charFilters": [],
      "name": "search_keyword_lowercaser",
      "tokenFilters": [
        {
          "type": "lowercase"
        }
      ],
      "tokenizer": {
        "type": "keyword"
      }
    }
  ]
}

The problem is that I am getting all tags, instead of only “aaa”.
What can be done to solve this?

Hi @Shai_Binyamin, and welcome to the MongoDB Community Forums!

Based on the information you shared, I created some sample data that looks like this:

Atlas atlas-cihc7e-shard-0 [primary] test> db.sample.find()
[
  {
    _id: ObjectId("6526396b928f922719d4fa65"),
    tags: [ 'bbb', 'bbb', 'ccc', 'ddd', 'ddd' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa66"),
    tags: [ 'ddd', 'ccc', 'fff', 'jjj', 'ccc' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa67"),
    tags: [ 'bbb', 'fff', 'aaa', 'yyy', 'bbb' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa68"),
    tags: [ 'yyy', 'yyy', 'bbb', 'bbb', 'aaa' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa6a"),
    tags: [ 'ddd', 'yyy', 'ccc', 'fff', 'yyy' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa6b"),
    tags: [ 'fff', 'yyy', 'aaa', 'bbb', 'jjj' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa6c"),
    tags: [ 'aaa', 'fff', 'yyy', 'fff', 'aaa' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa6d"),
    tags: [ 'jjj', 'ccc', 'fff', 'ccc', 'ddd' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa6e"),
    tags: [ 'bbb', 'ddd', 'fff', 'ccc', 'ddd' ]
  }
]

I used the same index definition and ran the following search query:

Atlas atlas-cihc7e-shard-0 [primary] test> db.sample.aggregate([
  { $search: { facet: {
    operator: { autocomplete: { query: "a", path: "tags" } },
    facets: { titleFacet: { type: "string", path: "tags", numBuckets: 100 } }
  } } }
])
[
  {
    _id: ObjectId("6526396b928f922719d4fa6c"),
    tags: [ 'aaa', 'fff', 'yyy', 'fff', 'aaa' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa67"),
    tags: [ 'bbb', 'fff', 'aaa', 'yyy', 'bbb' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa6b"),
    tags: [ 'fff', 'yyy', 'aaa', 'bbb', 'jjj' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa68"),
    tags: [ 'yyy', 'yyy', 'bbb', 'bbb', 'aaa' ]
  }
]

The query above returns every document whose tags array contains at least one tag matching “a”.

If this is not what you are looking for, could you share some sample data along with the expected output and the output you are currently receiving?

Regards
Aasawari

Hi @Aasawari, thanks for your response.
My goal is to only return the tag “aaa” in this case, and not to show other tags.
I would like to return only the matching elements of the array, and not the non-matching tags that sit in the same array as the matching ones.

Thank you for the clarification, @Shai_Binyamin.

If I understand correctly, and based on the example shared above, you need output like this:

[
  {
    _id: ObjectId("6526396b928f922719d4fa6c"),
    tags: [ 'aaa' ]
  },
  {
    _id: ObjectId("6526396b928f922719d4fa67"),
    tags: [ 'aaa' ]
  },
....
]

and so on, assuming that ‘aaa’ occurs only once in each tags array.

There is more than one way to achieve this.

  1. You can use $reduce with the $cond conditional operator inside an $addFields stage:
Atlas atlas-cihc7e-shard-0 [primary] test> db.sample.aggregate([
  { $search: { facet: {
    operator: { autocomplete: { query: "a", path: "tags" } },
    facets: { titleFacet: { type: "string", path: "tags", numBuckets: 100 } }
  } } },
  { $addFields: { tags: { $reduce: {
    input: "$tags",
    initialValue: [],
    in: { $cond: {
      if: { $eq: ["$$this", "aaa"] },
      then: { $concatArrays: ["$$value", ["$$this"]] },
      else: "$$value"
    } }
  } } } }
])
[
  { _id: ObjectId("6526396b928f922719d4fa6c"), tags: [ 'aaa', 'aaa' ] },
  { _id: ObjectId("6526396b928f922719d4fa67"), tags: [ 'aaa' ] },
  { _id: ObjectId("6526396b928f922719d4fa6b"), tags: [ 'aaa' ] },
  { _id: ObjectId("6526396b928f922719d4fa68"), tags: [ 'aaa' ] }
]
  2. The other method is to use $filter in the $addFields aggregation stage:
Atlas atlas-cihc7e-shard-0 [primary] test> db.sample.aggregate([
  { $search: { facet: {
    operator: { autocomplete: { query: "a", path: "tags" } },
    facets: { titleFacet: { type: "string", path: "tags", numBuckets: 100 } }
  } } },
  { $addFields: { tags: { $filter: {
    input: "$tags",
    as: "thisTag",
    cond: { $eq: ["$$thisTag", "aaa"] }
  } } } }
])
[
  { _id: ObjectId("6526396b928f922719d4fa6c"), tags: [ 'aaa', 'aaa' ] },
  { _id: ObjectId("6526396b928f922719d4fa67"), tags: [ 'aaa' ] },
  { _id: ObjectId("6526396b928f922719d4fa6b"), tags: [ 'aaa' ] },
  { _id: ObjectId("6526396b928f922719d4fa68"), tags: [ 'aaa' ] }
]
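
Both pipelines above hardcode the value "aaa". As a minimal sketch, assuming the goal is to keep every tag that contains the search text, the $filter condition could use $regexMatch instead of $eq. The helper names escapeRegex and buildMatchingTagsPipeline are hypothetical, and the plain autocomplete operator is used without the facet collector for brevity. A case-insensitive regex only approximates what the nGram autocomplete index matches, and $regexMatch expects every element of tags to be a string.

```javascript
// Hypothetical helper: escape regex metacharacters so the query text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a pipeline that finds documents via autocomplete
// and then trims each tags array down to the tags containing the query text.
function buildMatchingTagsPipeline(query) {
  return [
    { $search: { autocomplete: { query: query, path: "tags" } } },
    { $addFields: { tags: { $filter: {
      input: "$tags",
      as: "thisTag",
      // Case-insensitive substring match (assumption: this mirrors the lowercase analyzer).
      cond: { $regexMatch: { input: "$$thisTag", regex: escapeRegex(query), options: "i" } }
    } } } }
  ];
}

// In mongosh: db.sample.aggregate(buildMatchingTagsPipeline("a"))
```

Building the pipeline in a function keeps the search text in one place, so the $search stage and the $filter condition cannot drift apart.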

P.S. Please note that the first document in the output has “aaa” more than once in its “tags” array in my sample data, which is why it appears twice in the result.
Also, the queries above are based on the sample documents I created; I recommend testing them thoroughly and evaluating their performance before using them in a production environment.
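
Since the original question is about facet buckets, here is a rough sketch of one possible way to get counts for only the matching tags. It assumes, as the question describes, that the string facet counts every tag of the matching documents, so it sidesteps the facet collector and counts inside the aggregation pipeline instead. The name buildMatchingTagCountsPipeline is hypothetical, and this has not been measured for performance.

```javascript
// Hypothetical helper: count occurrences of only those tags that contain the query text.
function buildMatchingTagCountsPipeline(query) {
  // Escape regex metacharacters so the query text is matched literally.
  const literal = query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return [
    // 1. Use the autocomplete index to narrow down the candidate documents.
    { $search: { autocomplete: { query: query, path: "tags" } } },
    // 2. Emit one document per array element.
    { $unwind: "$tags" },
    // 3. Drop the elements that do not contain the query text.
    { $match: { tags: { $regex: literal, $options: "i" } } },
    // 4. Group by tag and count, sorted by descending count.
    { $sortByCount: "$tags" }
  ];
}

// In mongosh: db.sample.aggregate(buildMatchingTagCountsPipeline("a"))
```

Each result document has the shape { _id: <tag>, count: <n> }. The count is per array element, so a document that lists the same tag twice contributes 2 to that tag's count.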

Please feel free to reach out in case of any further concerns.

Warm regards
Aasawari