What is the most efficient way to run a “contains” query across all fields in a collection? The search may be for a partial word or a partial phrase.
I have tried using wildcard and regex queries, but they are quite slow and fall short of the performance I need.
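For reference, a wildcard “contains all fields” query of the kind being described might look like the sketch below. The collection name, search fragment, and index assumptions are all hypothetical; running it requires a cluster with an Atlas Search index, so only the pipeline shape is shown here.

```javascript
// Sketch of a "contains" search for the fragment "ort" across all indexed
// fields, assuming a dynamic default Atlas Search index on the collection.
const pipeline = [
  {
    $search: {
      wildcard: {
        query: "*ort*",            // partial word/phrase wrapped in wildcards
        path: { wildcard: "*" },   // search every indexed field
        allowAnalyzedField: true   // needed when matching analyzed fields
      }
    }
  }
];
// db.items.aggregate(pipeline)   // "items" is a hypothetical collection
```

Both the leading and trailing `*` in the query and the wildcard path contribute to the cost: the engine cannot use the term dictionary's prefix ordering and must consider many fields.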
Have you considered using autocomplete with nGram tokenization?
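As an illustration of that suggestion, an Atlas Search index mapping for an autocomplete field with nGram tokenization might look like the fragment below. The field name `title` and the gram sizes are hypothetical; note that autocomplete requires each field to be mapped explicitly, which is the limitation raised in the next reply.

```json
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": {
        "type": "autocomplete",
        "tokenization": "nGram",
        "minGrams": 3,
        "maxGrams": 7,
        "foldDiacritics": true
      }
    }
  }
}
```

With nGram tokenization the index stores substrings of each term, so a “contains”-style lookup becomes a cheap exact-term match at query time, at the cost of a larger index.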
Yes, but it doesn’t support wildcard paths. My requirement is to be able to search for a given string in any field.
Also, I have observed that wildcard queries (contains search across all fields) are slow even for small datasets (<500 docs). It seems that other clauses in the same query that are supposed to reduce the number of documents searched have no effect when wildcards are specified.
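A compound query of the shape being described might look like the sketch below (collection and field names are hypothetical). One possible explanation for the observation, offered here as an assumption rather than a confirmed cause, is that the wildcard clause enumerates matching terms from the index dictionary regardless of how few documents the filter clause leaves in play.

```javascript
// Hypothetical compound query: a narrowing filter clause plus a
// wildcard "contains" clause across all fields.
const pipeline = [
  {
    $search: {
      compound: {
        filter: [
          // Hypothetical clause meant to shrink the candidate set
          { text: { query: "acme", path: "tenant" } }
        ],
        must: [
          {
            wildcard: {
              query: "*ort*",
              path: { wildcard: "*" },
              allowAnalyzedField: true
            }
          }
        ]
      }
    }
  }
];
```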
@Elle_Shwer any thoughts on why the wildcard (contains) search would be slow on filtered datasets? I tested it on a dataset of only 8 documents, and the query still took over 7 seconds.
In general, it is well known that wildcards are slow and computationally expensive, especially when you combine a wildcard query with a wildcard path. That is why we generally recommend that users use autocomplete if they can afford to do so.
Without seeing your exact query, it’s hard to know why it took so long. But if you’re doing a contains search on very few characters over very large documents, I am not surprised at all.
The documents are not large. I have been able to work around this for now by limiting the search to specific fields, listing them explicitly in the regex operator.
While this is not optimal, I think I can get away with it for the time being.
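The workaround described above might be sketched as follows; the field names `title` and `description` are hypothetical stand-ins for whichever fields are listed explicitly.

```javascript
// Hypothetical regex "contains" search restricted to named fields
// instead of a wildcard path.
const pipeline = [
  {
    $search: {
      regex: {
        query: ".*ort.*",                 // regex form of a contains match
        path: ["title", "description"],   // explicit fields only
        allowAnalyzedField: true
      }
    }
  }
];
```

Restricting the path bounds the number of indexed fields the regex must be evaluated against, which is likely why this performs better than the wildcard-path version.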