Questions about the text operator for Atlas Search - how to build an effective search for usernames?

Hi fellow users. My objective is to build an effective $search pipeline for a collection with usernames, but I have so many questions. I have watched videos and read documentation, but if anyone could help me out here it would be awesome. I’m simply looking for an Atlas Search setup where users get the results they expect. Here is what I have so far:

Data

[{name: "John Doe"}, {name: "John Eriksen"}, {name: "Lara Croft"}]

Index

{ "mappings": { "dynamic": true } }

Pipeline

[
  {
    '$search': {
      'text': {
        'query': 'John Doe', 
        'path': 'name'
      }
    }
  }, {
    '$addFields': {
      'score': {
        '$meta': 'searchScore'
      }
    }
  }
]

I get the following results

  1. John Doe → 0.66
  2. John Eriksen → 0.21

Questions

  1. Why is “John Eriksen” part of the result set at all when the user queried for John Doe? It would make sense if searching for “John” only, but when we have an exact match, why is John Eriksen even there?
  2. Why is the score of “John Doe” not 1? It’s an exact match?
  3. Now that “John Eriksen” is part of the result set, why is the score so relatively high?

Thanks for the help - really trying to build some great UX for the end users here :sunglasses:

The text operator considers each term in a query individually, so a document matches if it contains any of them. If you want matches for only John Doe, you have two options: phrase or space delimited terms. It sounds like you are looking for a phrase query:

{
    '$search': {
      'phrase': {
        'query': 'John Doe', 
        'path': 'name'
      }
    }
  }
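To illustrate the difference, here is a rough sketch in plain JavaScript that mimics the matching rules on the sample data. It is a toy model, not how Atlas Search is implemented: the `tokenize`, `textLikeMatch` and `phraseLikeMatch` helpers are made up for this example, and real scoring is done by the search engine, not by this code.

```javascript
// Toy model only: mimics which documents match, not how Atlas Search scores them.
const docs = [
  { name: "John Doe" },
  { name: "John Eriksen" },
  { name: "Lara Croft" },
];

// Rough stand-in for an analyzer: lowercase and split on non-alphanumerics.
const tokenize = (s) =>
  s.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);

// text-like behaviour: a document matches if it shares ANY term with the query.
function textLikeMatch(query, documents) {
  const terms = new Set(tokenize(query));
  return documents.filter((d) => tokenize(d.name).some((t) => terms.has(t)));
}

// phrase-like behaviour: the query terms must appear adjacent and in order.
function phraseLikeMatch(query, documents) {
  const q = tokenize(query);
  return documents.filter((d) => {
    const t = tokenize(d.name);
    for (let i = 0; i + q.length <= t.length; i++) {
      if (q.every((term, j) => t[i + j] === term)) return true;
    }
    return false;
  });
}

console.log(textLikeMatch("John Doe", docs).map((d) => d.name));
// [ 'John Doe', 'John Eriksen' ]
console.log(phraseLikeMatch("John Doe", docs).map((d) => d.name));
// [ 'John Doe' ]
```

"John Eriksen" shows up in the text-like result because it shares the term "john" with the query. Also keep in mind that searchScore is a relevance score rather than a 0-to-1 similarity, so an exact match is not expected to come out as 1.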

Let me know if that solves your issue.

Hi @Marcus - thanks for taking your time to answer. I’m really trying to build what I would think is basic functionality, but it is really hard to find good examples.

The phrase operator is somewhat better, but not optimal (in its raw form at least).

If we now have the following docs:

[{name: "Jonas Jespersen"}, {name: "John Doe"}, {name: "John Eriksen"}, {name: "Lara Croft"}]

and pipeline:

[
  {
    '$search': {
      'phrase': {
        'query': 'Jonas Jes', 
        'path': 'name'
      }
    }
  }
]

I get 0 results :scream: - not very good for a username search where there is pretty much a complete match :grin:

A query for “Jonas Jes” is actually a partial match and different from your first question. For a partial match, you need something like autocomplete or regex (regex is slower than autocomplete). Phrase will only work if the query is an exact match of “Jonas Jespersen.” I hope that makes sense.
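A rough way to picture why a prefix index helps: if each word is stored as its leading character sequences (edge grams), then "Jes" is itself one of the stored tokens for "Jespersen". The sketch below is a simplified model in plain JavaScript, not the actual tokenizer. The helper names are hypothetical, and the gram lengths of 2 and 7 are just example values.

```javascript
// Simplified model of edge-gram prefix matching; not the real Atlas tokenizer.

// All prefixes of one token with lengths from minGram to maxGram.
function edgeGrams(token, minGram, maxGram) {
  const grams = [];
  const upper = Math.min(maxGram, token.length);
  for (let len = minGram; len <= upper; len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

// The set of grams "indexed" for a whole name, word by word.
function indexGrams(name, minGram, maxGram) {
  const words = name.toLowerCase().split(/\s+/).filter(Boolean);
  return new Set(words.flatMap((w) => edgeGrams(w, minGram, maxGram)));
}

// A query matches if every query word (cut to maxGram) is an indexed gram.
function prefixMatches(query, name, minGram = 2, maxGram = 7) {
  const grams = indexGrams(name, minGram, maxGram);
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .every((w) => grams.has(w.slice(0, maxGram)));
}

const names = ["Jonas Jespersen", "John Doe", "John Eriksen", "Lara Croft"];
console.log(edgeGrams("jespersen", 2, 7));
// [ 'je', 'jes', 'jesp', 'jespe', 'jesper', 'jespers' ]
console.log(names.filter((n) => prefixMatches("Jonas Jes", n)));
// [ 'Jonas Jespersen' ]
```

In this model a one-letter query such as "J" matches nothing, because the smallest stored gram has two characters. That is the trade-off behind choosing the minimum gram size.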

Thanks again for answering @Marcus - seems like you know MongoDB full text search pretty well. What would you recommend for an effective username search pipeline?

I’m a Product Manager for the product, but I don’t know anything. Permanent learner.

As for the pipeline, that depends on your use case. My simple recommendation would be to create an index with autocomplete on username, with minGram of 2 and maxGram of 7 for a small dataset. If the collection is >500,000 documents maybe make minGram larger or prepare to pay for beefy boxes.
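A minimal sketch of what such an index definition might look like, written as a JavaScript object so it can be pasted into a script. I am assuming the field is called `name` (as in the sample documents above) and edgeGram tokenization; the option names `tokenization`, `minGrams` and `maxGrams` are how I understand the autocomplete field mapping, so please double-check them against the current documentation.

```javascript
// Sketch of an Atlas Search index definition with an autocomplete field.
// Assumptions: the field is named "name" and edgeGram tokenization is wanted.
const autocompleteIndex = {
  mappings: {
    dynamic: false, // only index the fields listed below
    fields: {
      name: {
        type: "autocomplete",
        tokenization: "edgeGram", // grams start at the beginning of each word
        minGrams: 2, // shortest prefix that can match
        maxGrams: 7, // longest prefix stored in the index
      },
    },
  },
};

// Print as JSON, e.g. to paste into the index editor.
console.log(JSON.stringify(autocompleteIndex, null, 2));
```

A wider gap between the two gram sizes means more stored tokens per document, which is why the index grows quickly on large collections.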

The autocomplete query operator is also straightforward to use. However, if you need to support diacritics in usernames like ö or å, it won’t work well for that use case for a few more weeks. If this works for you let me know / accept the answer. If not, someone will likely follow up from the community or from the company.
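For reference, a query with the autocomplete operator could look roughly like the sketch below. It assumes an autocomplete index on the `name` field saved as the default search index; `buildNameSearch` and the `users` collection are names made up for this example.

```javascript
// Sketch: build an aggregation pipeline that uses the autocomplete operator.
// Assumes an autocomplete index on "name"; buildNameSearch is a made-up helper.
function buildNameSearch(query, limit = 10) {
  return [
    {
      $search: {
        autocomplete: {
          query: query, // what the user has typed so far
          path: "name", // field indexed with type "autocomplete"
        },
      },
    },
    { $limit: limit },
    {
      $project: {
        _id: 0,
        name: 1,
        score: { $meta: "searchScore" },
      },
    },
  ];
}

// In mongosh, with a hypothetical "users" collection:
//   db.users.aggregate(buildNameSearch("Jonas Jes"))
console.log(JSON.stringify(buildNameSearch("Jonas Jes"), null, 2));
```

Keeping the pipeline in a small function makes it easy to adjust the limit or add stages later without touching the calling code.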

The collection is way below 500,000 docs, so I will look into autocomplete.

We have usernames from around the world, so looking very much forward to ö, æ support etc :+1:

Would be really awesome with a deep-dive article/blogpost on making user search; maybe a simple example where you start searching for username and then expand search to username + email :innocent:

What a great idea Alex! I will talk with the team and see if we can schedule something like that. As you know, there is a lot going on. We have not done a ton of blog posts for Atlas Search that are so use case specific. If we have the bandwidth to do it, we will do it.

We actually had a couple of other customers ask for guidance around building a feature like this for their apps about 6 and 12 months back. Thanks. Please consider either accepting the initial answer or the follow-up answer, or clarifying the scope of the question. I hope that other users can look at this conversation and derive some actionable guidance.

We have usernames from around the world, so looking very much forward to ö, æ support etc

I will let you know when we release this feature.

I’m also wishing there was more material on $search