Prioritize exact match using Search Indexes

Hello

I’m currently working on a project with a collection of foods. Each food has a name field, and I would like to use the Full Text Search capabilities to query the collection. For example, the collection has 3 documents with the exact name “APPLE”, but a few hundred other food documents also contain APPLE in the name field, and some of those contain “APPLE” multiple times. My question is: how do I set up a search index and query to prioritize exact matches? If I want the top 10 results, how do I ensure the 3 documents named exactly “APPLE” show up at the top? To be clear, I do not want an exact-match-only search; I still want other documents that contain APPLE. The exact matches just need to score highest. Currently, based on relevance, documents that contain APPLE multiple times in the name are scored higher.

Hey @Shivam_Patel,

Welcome to the MongoDB Community Forums!

In order to better understand your use case, can you please provide us with the following details:

  • sample documents
  • the search index you’re currently using
  • your query
  • the expected output

This would help us understand your issue and help you better.

Regards,
Satyam

Hi

Here are some sample documents in my collection:

[
  {
    "_id": "6407a138694fa2f8499bcf1f",
    "name": "BARRILITOS, APPLE SODA, APPLE, APPLE"
  },
  {
    "_id": "6407a174694fa2f8499c2458",
    "name": "JUMEX, APPLE NECTAR, APPLE, APPLE"
  },
  {
    "_id": "6407a0ba694fa2f8499aec10",
    "name": "APPLE"
  },
  {
    "_id": "6407a0ba694fa2f8499aed86",
    "name": "APPLE PIE"
  },
  {
    "_id": "6407a0ae694fa2f8499ad42e",
    "name": "APPLE CIDER"
  }
]

I’m currently using the lucene.standard analyzer. This is the current query I’m using.

[
  {
    $search: {
      index: "foodSearch",
      text: {
        query: "APPLE",
        path: "name"
      }
    }
  }
]

The desired output is:

[
  {
    "_id": "6407a0ba694fa2f8499aec10",
    "name": "APPLE"
  },
  {
    "_id": "6407a0ae694fa2f8499ad42e",
    "name": "APPLE CIDER"
  },
  {
    "_id": "6407a0ba694fa2f8499aed86",
    "name": "APPLE PIE"
  },
  {
    "_id": "6407a138694fa2f8499bcf1f",
    "name": "BARRILITOS, APPLE SODA, APPLE, APPLE"
  },
  {
    "_id": "6407a174694fa2f8499c2458",
    "name": "JUMEX, APPLE NECTAR, APPLE, APPLE"
  }
]

How do I modify my query or search index to achieve something like this?

Hey @Shivam_Patel,

As I understand it, you have many documents containing the word ‘APPLE’, and you want a search that returns all the exact matches first, followed by the rest of the matches.

I created a sample collection using the sample documents you provided. My collection looked like this:

[
  {
    "_id": "6412e7eff819dac5b09cabe2",
    "name": "BARRILITOS, APPLE SODA, APPLE, APPLE"
  },
  {
    "_id": "6412e809f819dac5b09cabe3",
    "name": "JUMEX, APPLE NECTAR, APPLE, APPLE"
  },
  {
    "_id": "6412e81ff819dac5b09cabe4",
    "name": "APPLE"
  },
  {
    "_id": "6412e834f819dac5b09cabe5",
    "name": "APPLE PIE"
  },
  {
    "_id": "6412e844f819dac5b09cabe6",
    "name": "APPLE CIDER"
  },
  {
    "_id": "6412f2849b835a5a29568133",
    "name": "APPLE"
  },
  {
    "_id": "6412f28f9b835a5a29568134",
    "name": "APPLE"
  }
]

I used the lucene.standard analyzer with the following search index definition:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "name": {
        "type": "string"
      }
    }
  }
}

For the query you provided:

{
    $search: {
        index: 'default',
        text: {
            query: 'APPLE',
            path: 'name'
        }
    }
}

I got the following result:

{
  _id: ObjectId("6412e81ff819dac5b09cabe4"),
  name: 'APPLE'
},
{
  _id: ObjectId("6412f2849b835a5a29568133"),
  name: 'APPLE'
},
{
  _id: ObjectId("6412f28f9b835a5a29568134"),
  name: 'APPLE'
},
{
  _id: ObjectId("6412e7eff819dac5b09cabe2"),
  name: 'BARRILITOS, APPLE SODA, APPLE, APPLE'
},
{
  _id: ObjectId("6412e809f819dac5b09cabe3"),
  name: 'JUMEX, APPLE NECTAR, APPLE, APPLE'
},
{
  _id: ObjectId("6412e834f819dac5b09cabe5"),
  name: 'APPLE PIE'
},
{
  _id: ObjectId("6412e844f819dac5b09cabe6"),
  name: 'APPLE CIDER'
}

As we can see, the three exact matches appear first, followed by the other documents that contain APPLE.
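
To see why the documents come back in this order, the relevance score can be added to the output with a $project stage. A minimal sketch that reuses the query from this thread; the variable name scoredPipeline and the collection name in the closing comment are placeholders, not something taken from the original setup:

```javascript
// Same text query as in this thread, plus a $project stage that
// surfaces the relevance score Atlas Search assigned to each document.
// 'default' and 'name' are the index name and field used in this thread.
const scoredPipeline = [
  {
    $search: {
      index: 'default',
      text: { query: 'APPLE', path: 'name' }
    }
  },
  {
    $project: {
      name: 1,
      // searchScore is the metadata keyword for the $search relevance score
      score: { $meta: 'searchScore' }
    }
  }
];

// In mongosh this could be run as: db.foods.aggregate(scoredPipeline)
// ("foods" is a placeholder collection name).
```

Comparing the score values across documents shows how much each match contributes, which helps when results on a larger collection rank differently.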

Please let me know if my understanding is correct here or not. If it’s not working as expected, please post more details such as your search result, index definition, etc. Feel free to reach out for anything else as well.
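
If exact matches do not rise to the top on the full collection, one common approach is to combine the broad match with a boosted exact-match clause in a compound query. The following is a sketch only and has not been run against the data above. It assumes the name field is indexed a second time with lucene.keyword through a multi analyzer; the multi name keywordAnalyzer, the boost value 5, and the variable names are illustrative choices, not something from this thread.

```javascript
// Assumed index definition: "name" is analyzed with lucene.standard for the
// broad match and, via a multi analyzer, with lucene.keyword for exact match.
// "keywordAnalyzer" is a hypothetical name chosen for this sketch.
const indexDefinition = {
  mappings: {
    dynamic: true,
    fields: {
      name: {
        type: 'string',
        analyzer: 'lucene.standard',
        multi: {
          keywordAnalyzer: { type: 'string', analyzer: 'lucene.keyword' }
        }
      }
    }
  }
};

const boostedPipeline = [
  {
    $search: {
      index: 'default',
      compound: {
        // must: every returned document contains the term APPLE
        must: [
          { text: { query: 'APPLE', path: 'name' } }
        ],
        // should: optional clause that adds to the score when the whole
        // field value is exactly "APPLE" (lucene.keyword is case-sensitive)
        should: [
          {
            text: {
              query: 'APPLE',
              path: { value: 'name', multi: 'keywordAnalyzer' },
              score: { boost: { value: 5 } } // arbitrary boost, tune as needed
            }
          }
        ]
      }
    }
  },
  { $limit: 10 }
];
```

Because lucene.keyword indexes the whole field value as a single case-sensitive term, the should clause only matches documents whose name is exactly "APPLE". All other documents are still returned by the must clause, just with a lower score.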

Regards,
Satyam
