Our team is trying to implement search using MongoDB Atlas and has run into a problem getting the expected results when a query contains a stop word. The problem occurs when we want an AND condition via the compound operator.
By default, it seems that when you use the text or autocomplete operator, Atlas looks for a match for each term in the string separately. So the more words you type, the more results you get, which is effectively an OR condition.
To handle that, we put each term into its own autocomplete operator. This gives us AND behavior, but it does not work when one of the terms is a stop word.
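The per-term approach described above can be sketched roughly as follows. This is only an illustration of the query shape, not our exact code: the helper name build_and_search_stage and the index name "default" are assumptions, and the path is the Content field from our documents.

```python
def build_and_search_stage(query, path="Content", index="default"):
    """Build a $search stage that requires every term of `query` to match.

    Each whitespace-separated term gets its own autocomplete clause inside
    compound.must, so all clauses have to match (AND behavior).
    The index name "default" is an assumption.
    """
    terms = query.split()
    return {
        "$search": {
            "index": index,
            "compound": {
                "must": [
                    {"autocomplete": {"query": term, "path": path}}
                    for term in terms
                ]
            },
        }
    }


# Example: three terms produce three must clauses.
stage = build_and_search_stage("Testdokument kommuner att")
pipeline = [stage]
```

The resulting pipeline list can be passed to aggregate() on the collection. Because "att" gets its own must clause, the whole query fails as soon as that single clause cannot match.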
For example, we want every word, stop word or not, to be present in the matched document: word1 stopWord word2
{
"_id": "111111111122",
"ProductId": "111111111111",
"Name": "Testdokument Jelena",
"Url": "/test-portal/test-page-jelena",
"Content": "Testdokument Jelena Vidare vill regeringen införa ändringar som medför skyldighet för Försäkringskassan och kommunerna att informera Inspektionen för vård och omsorg när en enskild kan antas bedriva verksamhet för personlig assistans utan tillstånd.",
"Description": "Testdokument Jelena Vidare vill regeringen införa ändringar som medför skyldighet för Försäkringskassan och kommunerna att informera Inspektionen för vård och omsorg när en enskild...",
"AccessItems": [
"Admin"
],
"FilterRoute": "test-page-jelena",
"TypeOfContent": "page"
}
I can find this document if I search for: Testdokument kommuner
But I cannot find it if I search for any of these:
Testdokument kommuner att
Testdokument kommuner och
Testdokument kommuner för
We search in the Content field using the index mentioned in the first post.
Thank you for your fast reply, but we split the query into separate terms to get AND behavior: we want all of the words to be in the document (stop words included). Your query matches documents with an OR condition, meaning a matched document contains at least one of the words.
must works like an AND statement (see docs here). Does this work when stop words are not involved? I wonder if that is specifically where the issue is. In other words, do you specifically WANT to index stop words?
As mentioned in the example above, with must we can find the document as long as the search contains no stop word. If it does contain one, the document is not found, even though it contains all of the words, including the stop word.
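A toy model of what seems to be happening, written in plain Python with no Atlas involved. It assumes the index analyzer drops stop words at index time, uses a small hand-picked sample of Swedish stop words, and approximates autocomplete with prefix matching. All names here are hypothetical, and the real analyzer is more involved than this.

```python
import re

# Assumed sample of Swedish stop words; a real analyzer's list is much longer.
STOP_WORDS = {"att", "och", "för"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def index_tokens(text, drop_stop_words):
    """Return the set of tokens that would end up in the toy index."""
    tokens = tokenize(text)
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return set(tokens)


def matches_all(query, indexed):
    """AND semantics: every query term must be a prefix of some indexed token."""
    return all(
        any(token.startswith(term) for token in indexed)
        for term in tokenize(query)
    )


content = "Testdokument Jelena skyldighet för Försäkringskassan och kommunerna att informera"
without_stop_words = index_tokens(content, drop_stop_words=True)
with_stop_words = index_tokens(content, drop_stop_words=False)

print(matches_all("Testdokument kommuner", without_stop_words))      # True
print(matches_all("Testdokument kommuner att", without_stop_words))  # False
print(matches_all("Testdokument kommuner att", with_stop_words))     # True
```

In this model the clause for "att" has nothing to match once stop words are removed from the index, so the AND over all clauses fails even though the original text contains the word.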
It looks like we need to index stop words. Is that possible, and if so, how?
If you also want to match stop words, you can use the simple, whitespace, or even keyword analyzers (if you are already storing the individual words as a string).
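As a rough sketch of that suggestion, an index definition that maps the Content field as autocomplete with the lucene.simple analyzer might look like the following. The original index from the first post is not shown here, so the field options (edgeGram tokenization, minGrams 2, maxGrams 15) and the helper name build_index_definition are assumptions to adapt, not the actual configuration.

```python
import json


def build_index_definition(field="Content", analyzer="lucene.simple"):
    """Build an Atlas Search index definition with a stop-word-free analyzer.

    lucene.simple lowercases and splits on non-letter characters;
    lucene.whitespace splits on whitespace only and does not lowercase.
    Neither removes stop words. The gram sizes below are assumed values.
    """
    return {
        "mappings": {
            "dynamic": False,
            "fields": {
                field: {
                    "type": "autocomplete",
                    "analyzer": analyzer,
                    "tokenization": "edgeGram",
                    "minGrams": 2,
                    "maxGrams": 15,
                }
            },
        }
    }


# Print the definition as JSON, ready to paste into the index editor.
print(json.dumps(build_index_definition(), indent=2, ensure_ascii=False))
```

After changing the analyzer the index has to be rebuilt before stop words such as att, och, and för become searchable. Note that switching away from a language analyzer also gives up its stemming, so it is worth re-testing queries that relied on it.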