Regex query taking time - is there any way to bring the query execution time low?

I have name index which I use for sorting and I have to apply regex search on the same, below is the index:-
{ a: 1, b: 1, name: 1 }

Query:-
{ a: true, b: 'value', name: /jon/i, sort: { name: 1 }}

This Query is using the above index but { a: 1, b: 1, name: 1 }, but the query is taking more than 6s and the no.of records are 85K for { a: true, b: 'value' } query and post name filtering 26 records are returned.

Which index should be created to support this regex query?
OR
How can I perform a regex search quicker?
Thanks for the help in advance!

Hello @Viraj_Chheda ,

I notice you haven’t had a response to this topic yet - were you able to find a solution?
If not, could you please provide below details?

  • MongoDB Version
  • Some sample documents
  • Query you are running
  • Please run your query with explain in executionStats mode (e.g. `db.collection.explain(‘executionStats’).aggregate(…)) and share the output.

Regards,
Tarun

@Tarun_Gaur I have still not found solution to this, below are the details:-

  • MongoDB Version: 4.4.17, Mongo Compass Version: 1.33.1
  • Some sample documents:
{
  "_id": {
    "$oid": "61d6d6a5568e7e17161g3565"
  },
  "user_details": {
    "id": 54,
    "name": "A SHIVAKUMAR",
    "name_in_lowercase": "a shivakumar",
  }
  "is_disable": false
  "source_with_assessment_id": "course_lesson_95891",
  .
  .
  .
  #other fields
}
  • Query you are running
# basically I want to get results as we do with 'like' query
# so that all the names which has 'shiv' in them apperares in the results eg: shivabc, abcshivdef, abcshiv

Filter - { is_disable: false, source_with_assessment_id: 'course_lesson_95891', "user_details.name_in_lowercase": /shiv/i }

Sort - { "user_details.name_in_lowercase": 1 }

  • Please run your query with execution stats mode
    Here execution time is 6621 ms, but I require results to be in ~100 ms
{
"executionStats": {
    "executionSuccess": true,
    "nReturned": 26,
    "executionTimeMillis": 6621,
    "totalKeysExamined": 91735,
    "totalDocsExamined": 26,
    "executionStages": {
      "stage": "FETCH",
      "nReturned": 26,
      "executionTimeMillisEstimate": 27,
      "works": 91736,
      "advanced": 26,
      "needTime": 91708,
      "needYield": 0,
      "saveState": 2477,
      "restoreState": 2477,
      "isEOF": 1,
      "docsExamined": 26,
      "alreadyHasObj": 0,
      "inputStage": {
        "stage": "IXSCAN",
        "filter": {
          "user_details.name_in_lowercase": {
            "$regex": "shiv",
            "$options": "i"
          }
        },
        "nReturned": 26,
        "executionTimeMillisEstimate": 27,
        "works": 91735,
        "advanced": 26,
        "needTime": 91708,
        "needYield": 0,
        "saveState": 2477,
        "restoreState": 2477,
        "isEOF": 1,
        "keyPattern": {
          "is_disable": 1,
          "source_with_assessment_id": 1,
          "user_details.name_in_lowercase": 1
        },
        "indexName": "is_disable_1_source_with_assessment_id_1_user_details.name_in_lowercase_1",
        "isMultiKey": false,
        "multiKeyPaths": {
          "is_disable": [],
          "source_with_assessment_id": [],
          "user_details.name_in_lowercase": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "is_disable": [
            "[false, false]"
          ],
          "source_with_assessment_id": [
            "[\"course_lesson_95891\", \"course_lesson_95891\"]"
          ],
          "user_details.name_in_lowercase": [
            "[\"\", {})",
            "[/shiv/i, /shiv/i]"
          ]
        },
        "keysExamined": 91735,
        "seeks": 1,
        "dupsTested": 0,
        "dupsDropped": 0
      }
    },
    "allPlansExecution":[] # 27 Plans were there
}

From the explain output you posted

  • Total documents examined (“totalDocsExamined”) are 26
  • Total returned documents (“nReturned”) are 26
  • Total keys examined (totalKeysExamined) are 91735

The server examines a lot of index keys to return just 26 documents, so the query does not effectively use the index. Ideally, nReturned, totalKeysExamined, and totalDocsExamined should be the same number (all 26 in this example), or reasonably close to one another, which means that the server does little to no extra work to return the results. In this particular example, the server needed to examine more than 3500 index keys per one returned document.

Initially, I was going to recommend you to use anchored regex and collation but as you mentioned above that you need all the results where shiv is present irrespective of the location so this won’t be helpful for your use case. However, if you can find a pattern in your queries that can help narrow down the index keys needed to be examined, that will most certainly be helpful. For example, search for shiv anywhere in the name, but maybe you can be reasonably sure that the person’s user id is between 10 and 1000. This additional information would be very useful for performance gain.

If you’re using Atlas, perhaps you can check out Atlas Search which is an embedded full-text search in MongoDB Atlas that gives you a seamless, scalable experience.

Thanks, @Tarun_Gaur for the inputs. We will check if we should avail atlas search for this.

1 Like