Full text search performance

Hi everyone,

I would be very appreciated if somebody could help me with my question below.

  1. db.getCollection(SomeCollection’).find({
    “field1”:25, //int32, indexed (separate index)
    “field2”:“Option1”, // string, indexed (separate index)
    “field3”:/\Qtest@example.com\E/, //string, indexed (separate index)
    “field4”:/\QSome phrase\E/, // very big string, NOT indexed, searching for phrase with regex
    “field5”: {$gt: new ISODate(“2021-01-01T00:00:00Z”)} // datetime, indexed
    })

  2. db.getCollection(SomeCollection’).find({
    “field1”:25, //int32, indexed (separate index)
    “field2”:“Option1”, // string, indexed (separate index)
    “field3”:/\Qtest@example.com\E/, //string, indexed (separate index)
    $text: {$search: ““Some phrase””}, // full text search index created for this field, searching for phrase
    “field5”: {$gt: new ISODate(“2021-01-01T00:00:00Z”)} // datetime, indexed
    })

MongoDB version 4.2 is used.
As I see from winning plan (actually it is always one), in second query indexes are not used (well, except text index), so I’m receiving the results with second query even slower than with first one without any index at all. This problem is getting high priority for our customers, is it possible somehow increase the speed of such search and make it using the indexes? I mean full-text search + additional filters or make any of this two queries somehow max speed. The number of additional filters could be different, I put that 5 fields just as an example.
I’m thinking about sharding, but that’s not an option at the moment.

thank you in advance!

In MongoDB a query will usually only be able to use a single index (exception is $or queries which this one is not). That’s why when querying on multiple fields, compound indexes are recommended. Compound index can include a text index as part of it.

So best thing would be to figure out which fields are most selective and add them into a compound index along with $text index.

Asya
P.S. if you are on Atlas, you can look into Atlas Search.
P.P.S. sharding won’t help you - it will just create more inefficient queries.

2 Likes

Thank you for your answer! That link to the documentation was very useful!

Which problems we could have if we will create compound index with, let’s say 10 fields + text field (that collection is about 200Gb)? How much could such an index affect the data insert?

thank you in advance!

P.S. About atlas search, I guess we will be more interested in serverless iInstances, but as I understood it is not ready for production at the moment.

It’s hard to say how much your insert performance will be impacted by additional indexes because it largely depends on how much extra “capacity” you already have. Best way to tell is to create indexes and test!