How to ignore multiple occurrences of search keywords in MongoDB $text search?

Hans_Krishandi · December 30, 2023, 11:15am

I have a simple blog website where different authors can create articles, each consisting of a title and a content field. I use MongoDB because of a legacy implementation and other factors.

The website has a search bar where users can run a free-text search to get a list of relevant articles. In the backend, I use a MongoDB text index and $text search to run the queries.

How I created the index (title has more weight)

db.articles.createIndex(
  { title: "text", content: "text" },
  {
    weights: {
      title: 10,
      content: 1
    },
    name: "ArticleIndex"
  }
)

Example query

db.articles.find(
  { $text: { $search: "coffee bake" } },
  { score: { $meta: "textScore" } }
).sort(
  // Sort by text score first, then by _id descending as the tie-breaker
  { score: { $meta: "textScore" }, _id: -1 }
)

In the query, I added _id: -1 to the sort so that the most recently created articles come first when there's a tie.

Now the problem is that some authors try to manipulate the ranking by repeating certain keywords many times in the content, to the point that it is quite obvious. For instance, most of my users search for the city name New York, so one particular author spams the phrase New York all over the content. As a result, his article gets a high text score from MongoDB $text search and always appears at the top.
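To make the effect concrete, here is a toy illustration in plain JavaScript. It is not MongoDB's actual textScore formula; the functions frequencyScore and presenceScore and the two sample strings are made up for this example. It only shows why a score that counts every occurrence rewards repetition, while a score that counts each search term at most once does not.

```javascript
// Toy model only: this is NOT MongoDB's real textScore formula.

// Split text into lowercase alphanumeric words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Frequency-based score: every occurrence of a query term counts.
function frequencyScore(query, doc) {
  const terms = new Set(tokenize(query));
  return tokenize(doc).filter((word) => terms.has(word)).length;
}

// Presence-based score: each query term counts at most once.
function presenceScore(query, doc) {
  const words = new Set(tokenize(doc));
  return [...new Set(tokenize(query))].filter((term) => words.has(term)).length;
}

// Hypothetical sample content.
const honest = "A weekend guide to New York bakeries";
const spammy = "New York New York New York best New York deals in New York";

console.log(frequencyScore("new york", honest), frequencyScore("new york", spammy)); // 2 10
console.log(presenceScore("new york", honest), presenceScore("new york", spammy));   // 2 2
```

What I am after is something closer to the second behaviour, where repeating a keyword gives no extra benefit.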

Is there a way in MongoDB $text search to ignore multiple occurrences of the search keywords? Also, is there a way to make the _id field contribute to the sorting score, i.e. so that the latest items get higher scores?
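For reference, one workaround I can think of is to move the query into an aggregation pipeline and clamp the text score before sorting. The sketch below is only a rough idea: buildSearchPipeline is a hypothetical helper, and the cap of 15 is an arbitrary value I have not tuned. Note that this does not truly ignore repeated keywords. It only stops the score from growing past the cap, so heavily spammed articles tie with other strong matches and the _id tie-breaker then decides the order.

```javascript
// Hypothetical helper: builds a pipeline for db.articles.aggregate().
// scoreCap is an assumed tuning value, not a MongoDB setting.
function buildSearchPipeline(searchText, scoreCap) {
  return [
    // $text has to be used in the first $match stage of the pipeline.
    { $match: { $text: { $search: searchText } } },
    // Expose the raw text score as a regular field.
    { $addFields: { score: { $meta: "textScore" } } },
    // Clamp the score so extra repetitions stop paying off above the cap.
    { $addFields: { cappedScore: { $min: ["$score", scoreCap] } } },
    // Among articles with the same capped score, the newest _id comes first.
    { $sort: { cappedScore: -1, _id: -1 } }
  ];
}

const pipeline = buildSearchPipeline("coffee bake", 15);
console.log(JSON.stringify(pipeline, null, 2));
// In mongosh: db.articles.aggregate(pipeline)
```

I am not sure this is the right approach, which is why I am asking whether $text search itself offers something better.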