Atlas Search for fields that are arrays of strings

Hi, I’m also having issues using Atlas Search on fields that are arrays of strings…

I have a very simple index, and it won’t handle even a basic search that a regex would.

For example, my collection has around 32,000 documents in this format:

    {
        "_id": "59d400b6d8c987b0196efe50",
        "name": "Natura",
        "domains": [
            "natura.net",
            "natura.com"
        ]
    }

I made a very simple Atlas Search index like this:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "domains": {
        "type": "string"
      }
    }
  }
}

And even with a very basic search like this, it won’t return correct results:

Search:

{
  $search: {
    index: 'companyDomains',
    text: {
      query: "natura",
      path: "domains",
      fuzzy: {},
    },
  },
}

Result:

[
    {
        "_id": "5f6210981923bf120bdac7b7",
        "name": "Laura",
        "domains": [
            "laura-br.com"
        ]
    }
]

Which is very weird, because the only result returned has almost nothing to do with the search string.

If I make a simple regex $match stage with domains: /natura/i, I get 14 results!
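For reference, here is a minimal sketch of that regex stage, with an in-memory imitation of how a regex condition applies to an array field (a document matches when any element matches). The two documents are the samples from this thread, and the filter is only an illustration of the matching rule, not how the server evaluates it.

```javascript
// The $match stage described above.
const matchStage = { $match: { domains: /natura/i } };

// Sample documents from this thread.
const docs = [
  { _id: "59d400b6d8c987b0196efe50", name: "Natura", domains: ["natura.net", "natura.com"] },
  { _id: "5f6210981923bf120bdac7b7", name: "Laura", domains: ["laura-br.com"] },
];

// Illustration only: a regex on an array field matches if ANY element matches.
const regex = matchStage.$match.domains;
const matched = docs.filter((doc) => doc.domains.some((d) => regex.test(d)));

console.log(matched.map((doc) => doc.name)); // [ 'Natura' ]
```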

I’m still trying to understand what the issue is with Atlas Search and arrays of strings…

Hey @Rafael_Levy,

Welcome to the MongoDB Community Forums!

Since your query is natura and you want Atlas Search to return documents with domains like natura.net and natura.com, I would recommend trying the autocomplete operator instead of the text operator to see if it suits your use case and requirements.

I reproduced this on my end to confirm. I inserted the following documents:


{
 "_id":"59d400b6d8c987b0196efe50",
 "name":"Natura",
 "domains":["natura.net","natura.com"]
},

{
 "_id":"5f6210981923bf120bdac7b7",
 "name":"Laura",
 "domains":["laura-br.com"]
}

This is my index definition:

{
  "analyzer": "lucene.whitespace",
  "searchAnalyzer": "lucene.whitespace",
  "mappings": {
    "dynamic": false,
    "fields": {
      "domains": {
        "type": "autocomplete"
      }
    }
  }
}

Then, when I used the following search query:

{
  index: 'default',
  autocomplete: {
    query: 'natura',
    path: 'domains'
  }
}

it returned only the document with the domains natura.net and natura.com, and not the laura one.
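The snippet above shows only the body of the search stage. As a rough sketch, it could be wrapped into a full aggregation pipeline like this. The helper name `buildAutocompletePipeline`, the `$limit` and `$project` stages, and the collection name in the comment are assumptions for illustration and were not part of my test.

```javascript
// Hypothetical helper: wraps the autocomplete operator in a $search stage.
function buildAutocompletePipeline(query, { index = "default", path = "domains", limit = 10 } = {}) {
  return [
    // Same operator body as above, now inside $search.
    { $search: { index, autocomplete: { query, path } } },
    // Assumed extras: cap the results and expose the relevance score.
    { $limit: limit },
    { $project: { name: 1, domains: 1, score: { $meta: "searchScore" } } },
  ];
}

const pipeline = buildAutocompletePipeline("natura");
console.log(JSON.stringify(pipeline, null, 2));

// In mongosh this could then be run as:
//   db.companies.aggregate(pipeline)   // "companies" is an assumed collection name
```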

You can read more about setting up and using autocomplete in the documentation: How to Index Fields for Autocomplete

Hope this helps. Feel free to reach out for anything else as well.

Regards,
Satyam

@Rafael_Levy is your search index using the standard analyzer (the default when no analyzer is specified)? If so, “natura.com” is kept as the single token “natura.com”, whereas the simple analyzer would tokenize it to “natura” and “com”. That would allow you to search on “natura”.
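To make the difference concrete, here is a small sketch that imitates the two analyzers and also suggests why the Laura document may have come back. Both tokenizer functions are crude approximations written for this example, not the real Lucene implementations. The sketch also assumes that `fuzzy: {}` allows up to 2 edits (the documented default for `maxEdits`) and uses plain Levenshtein distance as a stand-in for the fuzzy matching.

```javascript
// Approximation of the simple analyzer: lowercase, split on any non-letter.
function simpleTokens(text) {
  return text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
}

// Very rough approximation of the standard analyzer for these inputs:
// lowercase, split on whitespace and hyphens, keep dots between letters.
function standardTokens(text) {
  return text.toLowerCase().split(/[\s\-]+/).filter(Boolean);
}

// Plain Levenshtein edit distance using a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

const MAX_EDITS = 2; // assumed default for fuzzy: {}
for (const domain of ["natura.com", "laura-br.com"]) {
  for (const token of standardTokens(domain)) {
    const distance = editDistance("natura", token);
    console.log(`standard token "${token}": distance ${distance}, fuzzy match: ${distance <= MAX_EDITS}`);
  }
}
console.log(simpleTokens("natura.com")); // [ 'natura', 'com' ]
```

Under these assumptions, “laura” is 2 edits away from “natura” and falls within the fuzzy limit, while the single token “natura.com” is 4 edits away and does not. With simple-style tokens, “natura” matches exactly and no fuzziness is needed.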

You can read more about tokenization here: https://www.mongodb.com/docs/atlas/atlas-search/analyzers/

Autocomplete will work in this case, but you can also read about other advanced options, such as tokenizing email addresses, here: https://www.mongodb.com/docs/atlas/atlas-search/analyzers/tokenizers/#uaxurlemail
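As a sketch of the simple-analyzer suggestion, the original index could set the analyzer on the domains field, and the text query could then drop `fuzzy`. The index name `companyDomains` comes from the question. Everything else here is an untested assumption about how the pieces would fit together, so check it against your own data.

```javascript
// Index definition: same mapping as in the question, plus the simple analyzer.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      domains: {
        type: "string",
        analyzer: "lucene.simple", // splits "natura.com" into "natura" and "com"
      },
    },
  },
};

// Text query without fuzzy, so only exact token matches are returned.
const searchStage = {
  $search: {
    index: "companyDomains",
    text: { query: "natura", path: "domains" },
  },
};

console.log(JSON.stringify(indexDefinition, null, 2));
console.log(JSON.stringify(searchStage, null, 2));
```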