Slow $match query after $search

Hello everyone, thanks for reading. I have a collection called submissions that has almost 500k documents with the following structure:
[image: sample submission document]
I’m trying to filter these submissions after a $search query. I want to filter them by organization name and also by event._id, so I created this aggregation pipeline:

[
  {
    '$search': {
      index: 'cl-submissions-all-fields',
      text: {
        query: 'Massachusetts Institute of Technology',
        path: ['organizations.name']
      }
    }
  },
  {
    '$match': {
      'event._id': ObjectId('622e5fee874f59246a93951b')
    }
  }
]

The issue I’m having is that the $search stage on its own is pretty fast, but once the $match stage is added the pipeline takes almost 30 seconds to return. I have an index on the event._id field and even added one on the whole object, but nothing changed.

Hoping someone can help me. Thank you for your time.

Hi @gapinzon - Welcome to the community!

“I have an index for the event._id field and even added one for the whole object but it didn’t change”

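As noted in the $match documentation, a $match stage can take advantage of indexes when it is placed at the very beginning of a pipeline. Since the $search stage is required to be the first stage, the $match stage used in your pipeline won’t take advantage of indexes.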

Based on the title, I presume the issue is not with the resulting output but with how long the output takes to generate. Please correct me if I am wrong here.

I am not yet aware of the details of the use case for this aggregation, but it may be possible to use the compound operator with the filter option in place of the $match stage to achieve what you are after. There is a similar example in the filter section of the compound operator documentation. I have created a similar environment with 2 test docs as follows:

/// Note: Only the event._id value differs between the 2 docs.

adb> db.search.find()
[
  {
    _id: ObjectId("6254f404b7829aa13d1b6c17"),
    title: 'Test Multiple Speakers 3',
    organizations: [
      { name: 'Massachusetts Institute of Technology' },
      { name: 'testorg' }
    ],
    event: { _id: ObjectId("622e5fee874f59246a93951b") }
  },
  {
    _id: ObjectId("6254f404b7829aa13d1b6c18"),
    title: 'Test Multiple Speakers 3',
    organizations: [
      { name: 'Massachusetts Institute of Technology' },
      { name: 'testorg' }
    ],
    event: { _id: ObjectId("123e5fee874f59246a939abc") }
  }
]

Using a compound operator, I perform the following $search aggregation with the filter and should options:

{
  $search: {
    "index": "default",
    "compound": {
      "should": [{
        "text": {
          "query": "Massachusetts Institute of Technology",
          "path": "organizations.name"
        }
      }],
      "filter": [{
        "equals": {
          "value": ObjectId("622e5fee874f59246a93951b"),
          "path": "event._id"
        }
      }]
    }
  }
}

which results in the following document being returned:

  {
    _id: ObjectId("6254f404b7829aa13d1b6c17"),
    title: 'Test Multiple Speakers 3',
    organizations: [
      { name: 'Massachusetts Institute of Technology' },
      { name: 'testorg' }
    ],
    event: { _id: ObjectId("622e5fee874f59246a93951b") }
  }

However, the above is just an example that may help you achieve what you want, since the $match stage you’ve provided will not use an index. Please alter it to suit your use case and environment.
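
To tie this back to the original pipeline, here is a minimal sketch of how the $search and $match stages could be folded into a single $search stage. The helper name buildSubmissionSearch is hypothetical and not part of any MongoDB API. It also uses must instead of should, on the assumption that the organization name match is mandatory: as far as I understand, when a filter clause is present, should clauses are optional and mainly influence scoring, so documents that only match the filter could still be returned.

```javascript
// Hypothetical helper (not a MongoDB API): builds a one-stage pipeline in
// which Atlas Search applies both conditions, so no $match stage is needed.
function buildSubmissionSearch(indexName, orgName, eventId) {
  return [
    {
      $search: {
        index: indexName,
        compound: {
          // must: the text match is required and contributes to the score
          must: [{ text: { query: orgName, path: 'organizations.name' } }],
          // filter: also required, but does not contribute to the score
          filter: [{ equals: { path: 'event._id', value: eventId } }]
        }
      }
    }
  ];
}

// Assumed usage in mongosh, where ObjectId is available:
// db.submissions.aggregate(buildSubmissionSearch(
//   'cl-submissions-all-fields',
//   'Massachusetts Institute of Technology',
//   ObjectId('622e5fee874f59246a93951b')
// ))
```

This assumes the search index maps event._id as objectId, since the equals operator needs the field to be indexed with a type it supports.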

For additional reference, the search index I have used in this example is defined as follows:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "event": {
        "fields": {
          "_id": [
            {
              "dynamic": true,
              "type": "document"
            },
            {
              "type": "objectId"
            }
          ]
        },
        "type": "document"
      }
    }
  }
}
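
For comparison, a leaner mapping that indexes event._id as objectId only might look like the sketch below. This is an assumption on my part and not the definition I tested with above; the variable name searchIndexDefinition is only for illustration.

```javascript
// Assumed, simplified search index definition: dynamic mapping stays on for
// the other top-level fields, while event._id is mapped explicitly as
// objectId so the equals operator can filter on it.
const searchIndexDefinition = {
  mappings: {
    dynamic: true,
    fields: {
      event: {
        type: 'document',
        fields: {
          _id: { type: 'objectId' }
        }
      }
    }
  }
};

// Assumed usage in mongosh on a deployment that supports this helper:
// db.search.createSearchIndex('default', searchIndexDefinition)
```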

If further assistance is required, could you provide the following information:

  1. Sample documents from the output with just the $search stage
  2. Output from the aggregation pipeline you’ve provided
  3. Use case details

Hope this helps.

Regards,
Jason
