Greetings,
I am currently managing a MongoDB 5.0 collection that stores data in the following structure:
{
  "_id": "<id>",
  "name": "James Smith",
  "aliases": [
    "Jimmy Smith",
    "Jammy Smith",
    "Johnny Smith"
  ]
}
The collection holds more than 100k documents. To support text search, I created the following Atlas Search index:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "aliases": {
        "type": "string"
      },
      "name": {
        "type": "string"
      }
    }
  },
  "storedSource": {
    "include": [
      "aliases",
      "name"
    ]
  }
}
My search function executes a query on this index in the following manner:
[
  {
    "$search": {
      "index": "my-index",
      "text": {
        "query": "<NAME>",
        "path": ["name", "aliases"]
      },
      "scoreDetails": true
    }
  }, {
    "$project": {
      "_id": 0,
      "name": "$name",
      "aliases": "$aliases",
      "score": {"$meta": "searchScore"}
    }
  }, {
    "$sort": {
      "score": -1
    }
  }
]
However, a search for “Gunther Lopez Smith” returns results that include “James Smith”. In fact, it returns every document containing “Smith”, roughly 2k records.
I am currently using the lucene.standard analyzer. Switching to lucene.keyword yielded no records for a “Smith James” query and also seemed to introduce case sensitivity.
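To make the two behaviours above concrete, here is a rough sketch of how the analyzers treat the sample document. The functions standardTokens, keywordTokens and matches are hypothetical stand-ins written for illustration, not Atlas Search APIs. They assume that lucene.standard splits text into lowercased word tokens, that lucene.keyword keeps the whole string as one case-preserved token, and that the text operator matches when any query token equals any indexed token.

```javascript
// Simplified stand-ins for the two analyzers. Real Lucene analyzers handle
// more cases (Unicode word boundaries, etc.); this is only an approximation.
function standardTokens(text) {
  // Approximates lucene.standard: split on non-letter/non-digit runs, lowercase.
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

function keywordTokens(text) {
  // Approximates lucene.keyword: the whole string is one token, case preserved.
  return [text];
}

// Assumed matching rule for the text operator: a document matches when ANY
// query token equals ANY token indexed from the searched fields.
function matches(analyze, query, fieldValues) {
  const indexed = new Set(fieldValues.flatMap(analyze));
  return analyze(query).some((token) => indexed.has(token));
}

// Values of "name" and "aliases" from the sample document.
const doc = ["James Smith", "Jimmy Smith", "Jammy Smith", "Johnny Smith"];

console.log(matches(standardTokens, "Gunther Lopez Smith", doc)); // true
console.log(matches(keywordTokens, "Smith James", doc)); // false
console.log(matches(keywordTokens, "james smith", doc)); // false
console.log(matches(keywordTokens, "James Smith", doc)); // true
```

Under these assumptions, the standard analyzer matches on the single shared token "smith", while the keyword analyzer only matches an exact, same-case copy of the whole field value.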
My objective is to refine the search so that a query for “Johnny Smith” returns “James Smith” (one of whose aliases it matches), and potentially other matches for a phrase such as “smith James”, but not every entry that contains “Smith”. How can I make my text search results this specific?
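For illustration, one possible direction is to require every query term to match, using the compound operator with one must clause per term. The sketch below is untested against this data set. The helper name buildAllTermsPipeline is hypothetical, and it assumes the index keeps the lucene.standard analyzer so that terms are matched case-insensitively and in any order.

```javascript
// Hypothetical helper: builds a pipeline in which every query term is required.
function buildAllTermsPipeline(name, indexName = "my-index") {
  // Split the query on whitespace; the analyzer lowercases each term itself.
  const terms = name.trim().split(/\s+/).filter(Boolean);
  return [
    {
      $search: {
        index: indexName,
        compound: {
          // One "must" clause per term: all terms have to match somewhere in
          // "name" or "aliases" (not necessarily in the same array element).
          must: terms.map((term) => ({
            text: { query: term, path: ["name", "aliases"] },
          })),
        },
      },
    },
    {
      $project: {
        _id: 0,
        name: 1,
        aliases: 1,
        score: { $meta: "searchScore" },
      },
    },
    { $sort: { score: -1 } },
  ];
}

// Print the $search stage generated for a two-term query.
const pipeline = buildAllTermsPipeline("Johnny Smith");
console.log(JSON.stringify(pipeline[0], null, 2));
```

The result could be passed to aggregate() on the collection. With this shape, “Gunther Lopez Smith” would need all three terms to be present, so a document containing only “Smith” would no longer match. A known limitation of the sketch is that the terms may be satisfied by different aliases of the same document.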
Thank you for your assistance!