Full text search on multiple unrelated fields

Hey i’m new to this community so hi everyone !

Just to mention on start i have a sharded Mongo DB ( 3 config replicas, 2 routers , 2 shards x 3 replicas , hosted locally NOT on Atlas) , not sure if this is important to my issue just to mention it.

So in one of my collections i want to preform full text search, so here is a sample data:

{
  "first_name":"Jhon",
  "category":"sport",
  "location":"Maribor, Slovenia",
  "url":"someurl.com",
  ...
}

I want to be able to perform full text search on multiple fields , but i saw that mongo allows 1 text index per collection. I also so that u can put multiple fields as part of that index ( in my case first_name, category, location etc. ), but the issue i have with this is when i search for example if an object has “Mar” in their first_name , i assume the results will return the objects that contain “Maribor” in location field as well , which is not the output that i want ( i want to get results that only match the criteria against the first_name field ).

I wanted to ask if this is the approach to go , is there a better solution ? ( Atlas is not an option for me at the moment ).

Thanks a lot

Hi @Omen_Omen ,

The Atlas Search indexes come to solve such a problem.

To solve it with regular text indexes you will have to add another stage after the text match, for example add a regex match in the next stage on the first_name field as you require in the case.

Please note that the te, t indexes search words and not subtexts

Thanks
Pavel

Hey @Pavel_Duchovny thanks for the response !

As i mentioned currently Atlas isn’t an option for me.
In regards to your solution about adding additional stage with regex i’m not sure i understand you, would you mind elaborating on that or show a short example ?

Thanks a lot

db.collection.aggregate(   [     { $match: { $text: { $search: "Maribor" } } },     { $match: { "first_name" :  /Maribor/ } }    ])

This query will only return results that have first_name with maribor and not any other…

1 Like

Thanks for the example.

Is this the go to practice for my case ? Doesn’t the second match return exact value ( which can be achieved without the text index on the field ? ) I’m just trying to find the most optimal solution

It can be also a sub expression like /Mar/ or , I thought you are searching for a word inside of sentences or a few words…

Okay so lets say i have the following data:

{
    "first_name":"marvin jones",
    "bio":"I've watched the movie marvelous mesigner , its amazing",
    "link":"www.smarvit.com"
}

If i only apply the search query it will return all 3 results? And than if i add the first_name part:

db.collection.aggregate( [ { $match: { $text: { $search: "marv" } } }, { $match: { "first_name" : /marv/ } } ])

It will only return the results that match the first_name filter ?

Yes thats the idea.

However, text indexes are built to search for words and not substrings. Atlas search has a $regex operator to enhance that :slight_smile:

If you need to search for substrings you will have to use regex matches from the start (those are not optimal for index used unfortenatly)

Atlas search is one heck of a technology that boost any search. I am very in favor of moving to Atlas :slight_smile:

In your example only the following query will work :confused:

.aggregate( [ { $match: { $text: { $search: "marvin" } } }, { $match: { "first_name" : /marv/ } } ])

Ty

So basically i can’t search for substrings efficiently ? In the example above , i have to $search for the whole word , and than match against the substring? Does that mean i can’t search substrings ( the above result would return if the first name was marvin but if it was marvery it won’t return any result, meaing i have to know the full word ( for the $search part ) if i wanted to do substring search ( for the regex part ), which doesn’t really makes sense ?
Am i getting this right or ?

You can just use :

.aggregate( [ { $match: { "first_name" : /^marv/ } } ])

This type of query can use an index on first_name…

Searching substrings/full text search is more of Atlas search capabilities. You will need to test how regex or regular text search can be done on your use case…

Will this query work if i had single field ( asc ) index on first_name ( without the text index ) and for example. i have

{
"first_name":"marvin gaye"
}

?

Sorry again if i’m asking too many questions but i was a bit confused on the indexing part in Mongo

The regex with start anchor and without case insensitivity specifications should work with regular indexes

Ty

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.