Atlas search not working properly when searching for a string with space in it

Hi, I’m working on a feature that needs to search by name in our database. The search works fine for single-word queries, but when I enter a string containing spaces (two or more words), the results are inaccurate: it returns documents that match only one of the words I entered. From what I have read in a similar post on this forum, it might be a tokenization issue. The person who solved it used a fix that works for exact matches, but I need an autocomplete solution.

Hi @Dan_Muntean1! Can you share your index definition, an example query and sample document?

Hi, here is my index:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "groupTag": {
        "type": "autocomplete"
      },
      "name": {
        "type": "autocomplete"
      }
    }
  }
}

Here is my document:

{
  "_id": {
    "$oid": "63e4f8a261f74736f0fcc8b6"
  },
  "_v": 4,
  "groupTag": "T-2",
  "createdAt": {
    "$date": {
      "$numberLong": "1675950242663"
    }
  },
  "updatedAt": {
    "$date": {
      "$numberLong": "1676022397735"
    }
  },
  "name": "Dan test group"
}

If I search for the name “Dan test group”, the search returns other documents whose name field contains any of the words “Dan”, “test”, or “group”, so I end up getting information that I didn’t search for. I need the search to return only documents whose name contains the entire string that I typed (“Dan test group”).

Hi Dan,

This is happening because you are using the autocomplete field mapping, which returns results that partially match your search query. If you are interested in returning exact matches, you might want to consider using a string field mapping type with the lucene.keyword analyzer. Your index definition would look something like this:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "groupTag": {
        "type": "string",
        "analyzer": "lucene.keyword"
      },
      "name": {
        "type": "string",
        "analyzer": "lucene.keyword"
      }
    }
  }
}

You can learn more about exact matching in this blog post.


Thank you for the help, but this solution does not work for me. I still need to use autocomplete, but when I type two words, I need each result to contain both words. Currently it returns separate results for each word.
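One possible way to keep autocomplete while requiring every typed word is to split the input on whitespace and place one autocomplete clause per word inside compound.must. The sketch below only illustrates that idea and is not something confirmed in this thread; the helper name buildAllWordsSearch, the index name default, and the collection name groups are assumptions.

```javascript
// Hypothetical helper: builds a $search stage in which every word of the
// input must match the "name" field through the autocomplete operator.
function buildAllWordsSearch(input, indexName = 'default') {
  // Split on any run of whitespace and drop empty fragments.
  const words = input.trim().split(/\s+/).filter(Boolean);
  if (words.length === 0) {
    throw new Error('Search input must contain at least one word');
  }
  return {
    $search: {
      index: indexName, // assumed index name
      compound: {
        // One clause per word; "must" requires all of them to match.
        must: words.map((word) => ({
          autocomplete: { query: word, path: 'name' },
        })),
      },
    },
  };
}

// Usage in mongosh (the collection name "groups" is an assumption):
// db.groups.aggregate([buildAllWordsSearch('Dan test group')]);
```

Because each word is matched on its own, this would accept the words in any order, so it may be looser than a strict phrase match.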

Hi @amyjian, this seems to work as per the documentation. However, I’m still not getting the expected results; instead, I’m getting a zero count.
I tried the index definitions below, and neither one works:

  1. "task_name": { "analyzer": "lucene.simple", "multi": { "keywordAnalyzer": { "analyzer": "lucene.keyword", "type": "string" } }, "searchAnalyzer": "lucene.simple", "type": "string" }

  2. "task_name": { "analyzer": "lucene.keyword", "type": "string" }

Hi @Dan_Muntean1, did you find a solution for your issue? I’m having the same issue and am not able to fix it.

Hello, I have an idea of how to solve it, but I haven’t managed to implement it yet. I will come back with an answer in the following weeks. As I understand it, we need to add “tokenOrder: sequential” to the autocomplete operator. Maybe this will help you: https://www.mongodb.com/docs/atlas/atlas-search/autocomplete/
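For reference, here is a minimal sketch of what adding tokenOrder to the autocomplete operator might look like. The operator accepts "any" (the default) or "sequential" for this option. The helper name buildSequentialAutocomplete, the index name default, and the collection name groups are assumptions, and this has not been verified against the data in this thread.

```javascript
// Hypothetical helper: builds a $search stage that asks the autocomplete
// operator to match the query tokens in order rather than in any order.
function buildSequentialAutocomplete(input, indexName = 'default') {
  return {
    $search: {
      index: indexName, // assumed index name
      autocomplete: {
        query: input,
        path: 'name',
        tokenOrder: 'sequential', // default is 'any'
      },
    },
  };
}

// Usage in mongosh (the collection name "groups" is an assumption):
// db.groups.aggregate([buildSequentialAutocomplete('Dan test group')]);
```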

Hi @Dan_Muntean1, I have tried that as well. However, it’s not returning any results.

{
  "analyzer": "lucene.simple",
  "searchAnalyzer": "lucene.whitespace",
  "mappings": {
    "dynamic": true,
    "fields": {
      "task_name": [
        {
          "type": "stringFacet"
        },
        {
          "type": "string"
        },
        {
          "foldDiacritics": false,
          "maxGrams": 7,
          "minGrams": 3,
          "type": "autocomplete"
        }
      ]
    }
  }
}

Can you please help and tell me if I’m doing anything wrong here?

I am using lucene.standard for both the analyzer and the search analyzer, with minGrams set to 2 and maxGrams set to 15. These are the defaults I get, and from what I’ve understood, minGrams 2 is the best practice. With minGrams 2, the search works for queries of two or more letters. Also, can you share the code where you are querying this?
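As a rough illustration of the settings described above, an index definition along these lines would use lucene.standard for both analyzers with minGrams 2 and maxGrams 15 on the autocomplete field. This is a sketch written as a JavaScript object; the field names are taken from the index shared earlier in the thread, and the exact layout is an assumption rather than the definition actually in use.

```javascript
// Sketch of an Atlas Search index definition matching the described settings.
// Field names come from earlier in the thread; everything else is assumed.
const indexDefinition = {
  analyzer: 'lucene.standard',
  searchAnalyzer: 'lucene.standard',
  mappings: {
    dynamic: true,
    fields: {
      name: {
        type: 'autocomplete',
        analyzer: 'lucene.standard',
        minGrams: 2, // shortest indexed prefix
        maxGrams: 15, // longest indexed prefix
      },
      groupTag: {
        type: 'autocomplete',
      },
    },
  },
};

// Print the definition as JSON, e.g. to paste into the Atlas JSON editor.
console.log(JSON.stringify(indexDefinition, null, 2));
```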

Right now I’m using the MongoDB cloud platform to check.

Please share the query that you use in the MongoDB platform.

db.collection.aggregate([
  {
    $search: {
      index: 'tasks_search',
      compound: {
        ...query,
      },
      count: {
        type: 'total',
      },
      sort: sort
        ? sort
        : {
            due_date: 1,
          },
    },
  },
  {
    $skip: skip || 0,
  },
  {
    $limit: limit,
  },
  {
    $group: {
      _id: null,
      data: { $push: '$$ROOT' },
      count: { $first: '$$SEARCH_META.count.total' },
    },
  },
  {
    $project: {
      _id: 0,
      data: 1,
      count: 1,
    },
  },
])

First, try to get just the results without the item count to isolate the problem. Here is a snippet of code that I use to get the total count of items:

{
  $group: {
    _id: null,
    total: { $sum: 1 },
    items: { $push: '$$ROOT' },
  },
},
{
  $project: {
    _id: 0,
    total: 1,
    items: { $slice: ['$items', skip, limit] },
  },
}
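To try the search without any counting logic, a stripped-down pipeline could look like the sketch below. It keeps only the search, paging, and a projection of the relevance score, which may make it easier to see whether the $search stage itself returns anything. The helper name buildDebugPipeline, the collection name tasks, and the example query text are assumptions.

```javascript
// Hypothetical helper: wraps a $search stage in a minimal pipeline with
// paging and a score projection, and no $group or count stages.
function buildDebugPipeline(searchStage, skip = 0, limit = 10) {
  return [
    searchStage,
    { $skip: skip },
    { $limit: limit },
    {
      $project: {
        task_name: 1,
        score: { $meta: 'searchScore' }, // relevance score of each hit
      },
    },
  ];
}

// Usage in mongosh (collection name and query text are assumptions):
// db.tasks.aggregate(buildDebugPipeline({
//   $search: {
//     index: 'tasks_search',
//     autocomplete: { query: 'some task', path: 'task_name' },
//   },
// }));
```

If this returns documents but the full pipeline does not, the problem is more likely in the grouping or count stages than in the index.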