Refining Text Search Results in MongoDB 5.0 - Specificity Over General Matches

adhishthite · July 14, 2023, 5:29pm

Greetings,

I am currently managing a MongoDB 5.0 collection that stores data in the following structure:

{
  "_id": "<id>",
  "name": "James Smith",
  "aliases": [
    "Jimmy Smith",
    "Jammy Smith",
    "Johnny Smith"
  ]
}

This collection encompasses more than 100k records. To facilitate text search, I have established an index as follows:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "aliases": {
        "type": "string"
      },
      "name": {
        "type": "string"
      }
    }
  },
  "storedSource": {
    "include": [
      "aliases",
      "name"
    ]
  }
}

My search function executes a query on this index in the following manner:

[
  {
    "$search": {
      "index": "my-index",
      "text": {
        "query": "<NAME>",
        "path": ["name", "aliases"]
      },
      "scoreDetails": true
    }
  },
  {
    "$project": {
      "_id": 0,
      "name": "$name",
      "aliases": "$aliases",
      "score": {"$meta": "searchScore"}
    }
  },
  {
    "$sort": {
      "score": -1
    }
  }
]

However, I’m facing an issue where a search for “Gunther Lopez Smith” returns results that include “James Smith”. This issue extends to returning results for all instances of “Smith” in the database, resulting in approximately 2k records.

While I am currently using lucene.standard, attempts to switch to lucene.keyword yielded no records for a “Smith James” query and also seemed to introduce case-sensitivity.

My objective is to refine the search such that a query for “Johnny Smith” returns “James Smith”, and potentially other matches for the phrase “smith James”, but without including all entries with “Smith”. How can I accomplish this level of specificity in my text search results?

Thank you for your assistance!

amyjian · July 14, 2023, 6:30pm

Hi @adhishthite, welcome to the MongoDB Community! Have you seen this article on Partial Matching in Atlas Search? I think the phrase operator, with its slop parameter specifying the allowable distance between words, may help achieve the functionality you are looking for.

Let me know if this helps!
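
For reference, here is a rough sketch of what a phrase query with slop could look like against the index from the original post. It is written as a small Python helper that only builds the pipeline; the function name `build_phrase_pipeline` and the default `slop=2` are illustrative assumptions, not values taken from the article.

```python
def build_phrase_pipeline(query, slop=2, index="my-index"):
    """Build an aggregation pipeline that uses the Atlas Search phrase operator.

    `slop` is the allowed distance between the words of the phrase;
    the default of 2 is an assumption chosen for illustration only.
    """
    return [
        {
            "$search": {
                "index": index,
                "phrase": {
                    "query": query,
                    "path": ["name", "aliases"],
                    "slop": slop,
                },
            }
        },
        {
            "$project": {
                "_id": 0,
                "name": 1,
                "aliases": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]
```

Assuming a pymongo collection object named `collection`, this would be run as `collection.aggregate(build_phrase_pipeline("Johnny Smith"))`. The right slop value depends on the data, so it is worth trying a few.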

adhishthite · July 14, 2023, 7:56pm

Hi @amyjian,

Thanks for your response.

I have already tried using slop and phrase operators, but they do not work in cases where order is reversed. At least for me, they didn’t.
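
One order-independent alternative I am considering, sketched here under assumptions and not yet verified against my data: split the query into words and require every word to match via `compound.must`, with one `text` clause per word. The helper name `build_all_terms_pipeline` is hypothetical, and splitting on whitespace only approximates how lucene.standard tokenizes the input.

```python
def build_all_terms_pipeline(query, index="my-index"):
    """Build a pipeline that requires every word of `query` to match.

    Each word becomes its own `text` clause under `compound.must`, so word
    order does not matter. A word may match in either `name` or `aliases`.
    Whitespace splitting is an assumption, not the analyzer's exact behavior.
    """
    terms = query.split()
    if not terms:
        raise ValueError("query must contain at least one word")

    # One required clause per word: a document must satisfy all of them.
    must_clauses = [
        {"text": {"query": term, "path": ["name", "aliases"]}}
        for term in terms
    ]

    return [
        {"$search": {"index": index, "compound": {"must": must_clauses}}},
        {
            "$project": {
                "_id": 0,
                "name": 1,
                "aliases": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]
```

With this shape, "Smith James" should still find "James Smith", while "Gunther Lopez Smith" should no longer return every "Smith", because "Gunther" and "Lopez" must also be present. One caveat: different words can match in different fields (for example, one word in `name` and another in an alias).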