Low searchScore on 100% match

I have defined an autocomplete index like this:

    {
      "mappings": {
        "dynamic": false,
        "fields": {
          "presentationSeries": {
            "fields": {
              "title": {
                "analyzer": "lucene.danish",
                "type": "autocomplete"
              }
            },
            "type": "document"
          }
        }
      }
    }

And I run an aggregation search like this:

    [{$search: {
      index: 'season-title',
      autocomplete: {
        query: 'deadline*',
        path: 'presentationSeries.title'
      }
    }}, {$project: {
      _id: 0,
      presentationSeriesId: "$_id",
      title: "$presentationSeries.title",
      score: {
        "$meta": "searchScore"
      }
    }}]

When running this against documents where presentationSeries.title is “Deadline”, the match is 100% exact, so I would expect the searchScore to be higher than for a partial match. For example, a full match:

    {
      "title": "Deadline",
      "seasonId": "751967870000",
      "seriesId": "756177704000",
      "score": 4.8309083
    }

A partial match:

    {
      "title": "Deadline 17:00",
      "seasonId": "751946357000",
      "seriesId": "756177704000",
      "score": 5.6943293
    }

When sorting on score, the full match is shown last, which is counterintuitive. Is there a reason why a full match is scored lower than a partial one? And is there a way to boost the full match?

Hi @Preben_Asmussen,

Is there a reason why full match is scored lower than partial?

As noted in the documentation for the Atlas Search autocomplete operator:

"autocomplete offers less fidelity in score in exchange for faster query execution."

And is there a way to boost full match?

One possible workaround is to combine the autocomplete operator with the phrase operator inside a compound query, and to boost the score of the phrase clause so that an exact match ranks higher.

Example of the scoring before applying the workaround, where the exact match is scored the lowest:

    db.collection.aggregate([
      { $search: { "index": "titleindex", "autocomplete": { "query": "Test title", "path": "title" } } },
      { $project: { "title": 1, "score": { "$meta": "searchScore" } } }
    ])
    [
      {
        _id: ObjectId("6215952b58c5cdce879f80e0"),
        title: 'this is a test title 100 10 10 10',
        score: 1.5060590505599976
      },
      {
        _id: ObjectId("6215952458c5cdce879f80df"),
        title: 'Test title 100 10 10 10',
        score: 1.4570086002349854
      },
      {
        _id: ObjectId("6215951c58c5cdce879f80de"),
        title: 'Test title', // <--- Exact match scored the lowest
        score: 1.3464860916137695
      }
    ]

Below are the search index definition used for the workaround, the $search aggregation, and the resulting scores.

The search index named titleindex on the title field:

    {
      "mappings": {
        "dynamic": true,
        "fields": {
          "title": [
            {
              "type": "string"
            },
            {
              "tokenization": "nGram",
              "type": "autocomplete"
            }
          ]
        }
      }
    }

The $search aggregation using autocomplete and phrase within a compound operator:

    db.collection.aggregate([
      {
        $search: {
          "index": "titleindex",
          "compound": {
            "should": [
              {
                "phrase": {
                  "query": "Test title",
                  "path": "title",
                  "score": { "boost": { "value": 5 } } // <--- Score boosting on exact match
                }
              },
              {
                "autocomplete": {
                  "query": "Test title",
                  "path": "title"
                }
              }
            ]
          }
        }
      },
      {
        $project: {
          "title": 1,
          "score": { "$meta": "searchScore" }
        }
      }
    ])

The output, where the exact match (the document with {title: "Test title"}) is scored the highest:

    [
      {
        _id: ObjectId("6215951c58c5cdce879f80de"),
        title: 'Test title', // <--- Exact match scoring the highest
        score: 5.011984825134277
      },
      {
        _id: ObjectId("6215952458c5cdce879f80df"),
        title: 'Test title 100 10 10 10',
        score: 3.9012293815612793
      },
      {
        _id: ObjectId("6215952b58c5cdce879f80e0"),
        title: 'this is a test title 100 10 10 10',
        score: 3.379971742630005
      }
    ]
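
As a rough, untested sketch, the same approach might be adapted to the index from the question along the following lines. It assumes presentationSeries.title is additionally indexed as type string (mirroring the titleindex definition above), since the phrase operator needs a string mapping. The helper name buildBoostedAutocompletePipeline is hypothetical, the boost of 5 is simply the value used above rather than a tuned one, and the trailing * from the original query is dropped on the assumption that autocomplete does not need it.

```javascript
// Assumed index definition: the question's index with an extra "string"
// mapping on presentationSeries.title so that `phrase` can query it.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      presentationSeries: {
        type: "document",
        fields: {
          title: [
            { type: "string", analyzer: "lucene.danish" },
            { type: "autocomplete", analyzer: "lucene.danish" }
          ]
        }
      }
    }
  }
};

// Hypothetical helper (not part of Atlas Search): builds the compound
// pipeline shown above for a given index, path and query.
function buildBoostedAutocompletePipeline({ index, path, query, boost = 5 }) {
  return [
    {
      $search: {
        index,
        compound: {
          should: [
            // Boosted clause: contributes extra score when the phrase matches.
            { phrase: { query, path, score: { boost: { value: boost } } } },
            // Unboosted clause: the original autocomplete match.
            { autocomplete: { query, path } }
          ]
        }
      }
    },
    {
      $project: {
        _id: 0,
        title: "$" + path,
        score: { $meta: "searchScore" }
      }
    }
  ];
}

// Index name and path are taken from the question.
const pipeline = buildBoostedAutocompletePipeline({
  index: "season-title",
  path: "presentationSeries.title",
  query: "deadline"
});

// Print the pipeline so it can be pasted into mongosh,
// e.g. db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Keep in mind that phrase also matches longer titles that contain the phrase, so the ordering still depends on how the relevance scoring treats shorter versus longer fields in your data.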

Please note that I’ve only tested this on sample documents I created, so results may differ in your environment.

I highly recommend testing this in your own test environment first to see whether the workaround suits your use case with regard to autocomplete and scoring for exact matches.
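
One way to make that testing repeatable is a small check on the returned documents. The sketch below is only an illustration: exactMatchRanksFirst is a hypothetical helper, it assumes each result has title and score fields as projected above, and the sample arrays reuse the scores from my test documents.

```javascript
// Hypothetical check: returns true when the highest-scored result's title
// equals the query (compared case-insensitively).
function exactMatchRanksFirst(results, query) {
  if (results.length === 0) return false;
  // Sort a copy by descending score so the input array is left untouched.
  const sorted = [...results].sort((a, b) => b.score - a.score);
  return sorted[0].title.toLowerCase() === query.toLowerCase();
}

// Scores from the autocomplete-only query above.
const autocompleteOnly = [
  { title: "this is a test title 100 10 10 10", score: 1.5060590505599976 },
  { title: "Test title 100 10 10 10", score: 1.4570086002349854 },
  { title: "Test title", score: 1.3464860916137695 }
];

// Scores from the compound query with the boosted phrase clause.
const withBoostedPhrase = [
  { title: "Test title", score: 5.011984825134277 },
  { title: "Test title 100 10 10 10", score: 3.9012293815612793 },
  { title: "this is a test title 100 10 10 10", score: 3.379971742630005 }
];

console.log(exactMatchRanksFirst(autocompleteOnly, "Test title"));  // false
console.log(exactMatchRanksFirst(withBoostedPhrase, "Test title")); // true
```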

Hope this helps.

Regards,
Jason
