Atlas Search query optimization

I’m facing a performance challenge with a MongoDB aggregation pipeline that uses Atlas Search, and I would appreciate some guidance on how to optimize it.

#### **The Problem**

We have a feature for searching delivery documents where users can look up a document by its number. Recently, a user searched with a single digit (e.g., “1”). This triggered a query that ran for over 30 seconds (resulting in a timeout) and caused our Atlas cluster to auto-scale from M20 to M30 to handle the load.

Our temporary workaround has been to enforce a 3-character minimum validation on the front-end and to set a 15-second maxTimeMS on the backend query to prevent prolonged server strain. However, we want to fix the root cause by optimizing the backend query to be performant even with short, non-specific inputs.
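
For reference, the backend guard is just the standard aggregation option (simplified; `pipeline` stands for the stages shown below):

```javascript
// Cap execution at 15 s so a runaway search fails fast instead of
// holding the cluster for 30+ seconds.
db.getCollection('nfedeliveries').aggregate(pipeline, { maxTimeMS: 15000 });
```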

#### **Code Context**

Our search logic is built within a repository that dynamically assembles an aggregation pipeline. Below are the most relevant parts of the code.

```javascript
db.getCollection('nfedeliveries').aggregate([
  // STAGE 1: The problematic $search stage.
  // The 'should' clause acts as an OR. The last regex, "5.*", is far too
  // broad and forces Atlas Search to scan a huge number of documents.
  {
    "$search": {
      "index": "default",
      "compound": {
        "should": [
          { "regex": { "path": "document", "allowAnalyzedField": true, "query": "510046.*" } },
          { "regex": { "path": "document", "allowAnalyzedField": true, "query": "510048.*" } },
          { "regex": { "path": "document", "allowAnalyzedField": true, "query": "510054.*" } },
          { "regex": { "path": "document", "allowAnalyzedField": true, "query": "510077.*" } },
          // THIS IS THE LINE CAUSING THE ISSUE
          { "regex": { "path": "document", "allowAnalyzedField": true, "query": "5.*" } }
        ]
      }
    }
  },

  // STAGE 2: The $match stage.
  // This stage becomes the bottleneck: it runs after the broad search
  // and has to filter the massive result set (~140k+ documents) in memory.
  {
    "$match": {
      "owner": ObjectId("6899d28cf69a4da66f67d69c"),
      "issueDate": {
        "$gte": ISODate("2025-07-27T03:00:00Z"),
        "$lte": ISODate("2025-09-06T02:59:59.999Z")
      },
      "unusable": { "$in": [ false, null ] },
      "receivementInfos": { "$exists": false },
      "$or": [
        { "deliveryPerson": { "$exists": false } },
        // assumption: null ("no delivery person assigned"); the value
        // was lost when the snippet was pasted
        { "deliveryPerson": { "$eq": null } }
      ]
    }
  },

  // STAGE 3: Pagination.
  // In practice the query times out before it ever reaches this stage.
  {
    "$facet": {
      "metadata": [ { "$count": "totalDocs" } ],
      "data": [
        { "$sort": { "updatedAt": -1 } },
        { "$skip": 0 },
        { "$limit": 15 }
      ]
    }
  }
], { "maxTimeMS": 30000 });
```


#### **Problem Analysis and Hypothesis**

Our hypothesis is that the `regex` query with a very broad pattern (e.g., `'1.*'`) returns a massive result set. Atlas Search quickly finds every document that starts with "1", which could be tens or even hundreds of thousands of documents.

The bottleneck occurs in the next stage: the pipeline then tries to apply the `$match` filter (for `owner`, `issueDate`, etc.) to this enormous dataset in memory. This consumes a massive amount of CPU and time, ultimately leading to the timeout.
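
One way to confirm this (sketch; the exact explain output varies by server version) is to run the pipeline through `explain` and compare how many documents the `$search` stage returns versus how many survive `$match`:

```javascript
// Run the same pipeline with executionStats to see per-stage counts:
// the internal search stage's returned-document count vs. the $match
// output shows how much work the broad regex pushes into the pipeline.
const stats = db.getCollection('nfedeliveries')
  .explain("executionStats")
  .aggregate(pipeline); // `pipeline` = the stages shown above
printjson(stats);
```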

#### **Our Questions and Request for Help**

We would like to know the best approach to optimize this scenario. Our specific questions are:

1.  **How can we restructure this query so that filters like `owner` and `issueDate` are applied more efficiently?** (A sketch of the direction we are considering follows this list.)

2.  Is the `regex` operator the best choice for this "starts-with" use case? Would the `autocomplete` operator, or a different index configuration or feature provided by MongoDB, offer better performance for short inputs? (See the second sketch below.)

3.  Are there any other pipeline optimizations or Atlas Search indexing strategies we should consider for this type of problem? (Our current idea is in the third sketch below.)

4.  If we store the document number in a new numeric field instead of the current string field, would that help overall performance? (Fourth sketch below.)
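
Regarding question 1, this is the direction we are considering: moving the equality and range predicates into the `$search` stage as `compound.filter` clauses, so Atlas Search applies them during the index scan instead of handing ~140k candidates to `$match`. Untested sketch; it assumes `owner` and `issueDate` are mapped in the search index:

```javascript
{
  "$search": {
    "index": "default",
    "compound": {
      // filter clauses are applied inside the index and do not
      // affect relevance scoring
      "filter": [
        { "equals": { "path": "owner", "value": ObjectId("6899d28cf69a4da66f67d69c") } },
        { "range": {
            "path": "issueDate",
            "gte": ISODate("2025-07-27T03:00:00Z"),
            "lte": ISODate("2025-09-06T02:59:59.999Z")
        } }
      ],
      "should": [
        { "regex": { "path": "document", "allowAnalyzedField": true, "query": "510046.*" } }
        // ...remaining document-number patterns...
      ],
      // require at least one should clause to match, since filter
      // clauses alone would otherwise satisfy the compound
      "minimumShouldMatch": 1
    }
  }
}
```

Predicates the search index cannot express would stay behind in a much smaller trailing `$match`; `receivementInfos: { $exists: false }` might also be expressible as an `exists` clause under `mustNot`.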
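Regarding question 2, the `autocomplete` field type looks like a better fit for starts-with lookups than `regex`, since edge n-grams are generated at index time and short prefixes become cheap term lookups. A sketch of the mapping and query we have in mind (untested; `minGrams: 3` would line up with our 3-character front-end minimum):

```javascript
// Search index mapping (sketch): index "document" for autocomplete
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "document": {
        "type": "autocomplete",
        "tokenization": "edgeGram",
        "minGrams": 3,
        "maxGrams": 15
      }
    }
  }
}

// Query side: the autocomplete operator instead of regex
{
  "$search": {
    "index": "default",
    "autocomplete": { "path": "document", "query": "510046" }
  }
}
```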
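Regarding question 3, one option we found in the docs is letting `$search` itself handle sorting and counting, replacing the blocking `$sort` and the `$facet` count. This assumes `updatedAt` is indexed in the search index as a date so the `sort` option can use it. Sketch:

```javascript
[
  {
    "$search": {
      "index": "default",
      "compound": { /* filter + should clauses as in the sketch above */ },
      // sort using the index instead of a blocking $sort stage
      "sort": { "updatedAt": -1 },
      // ask Atlas Search for the total hit count up front
      "count": { "type": "total" }
    }
  },
  { "$skip": 0 },
  { "$limit": 15 },
  // the count is exposed to later stages via $$SEARCH_META
  { "$addFields": { "totalDocs": "$$SEARCH_META.count.total" } }
]
```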
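Regarding question 4, our understanding is that a numeric copy of the document number would mainly speed up exact and range matches, not starts-with matching, since prefix semantics do not exist for numbers. For exact lookups, something like this (hypothetical `documentNumber` field, assumed to be indexed as a number) should be very cheap:

```javascript
// Exact match against a hypothetical numeric copy of the document
// number; equals on a number avoids regex/term scans entirely
{
  "$search": {
    "index": "default",
    "equals": { "path": "documentNumber", "value": 510046 }
  }
}
```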

Any help or guidance would be greatly appreciated. Thank you!