Simple $group on one field with 5 million documents yielding horrific performance

$group is performing terribly. I have tried many different approaches. Below are some of the approaches. Explain Plan at the end of this post. Any help would be helpful. I have tried this on local MDB, Atlas (M40). Any help would be great!

Take the following schema. Everything not required has been removed. _id is not used.
“bankAccount” can be null (an unallocated account). “accountNumber” is not unique. I want to find the number of accounts allocated to each “accountNumber”.

{
    "_id" : 12345,
    "bankAccount" : {
        "accountNumber" : "1234567890123450"
        }
    }
}

I have several indexes and the Explain Plan (at the end of this post) indicates an index is being used!

The collection size is 9 million (soon another 9 million will be added), with 5 million having am “BankAccount” object.

Using LINQ:

var sw = Stopwatch.StartNew();
var result = await bankAccountRepo.Queryable()  
							  .Where(n => n.BankAccount.AccountNumber != null)
							   .GroupBy(n => n.BankAccount.AccountNumber)
							   .Select(n => new { AccountNumber = n.Key, Count = n.Count() })
							   .ToListAsync();
sw.Stop();								   
Console.WriteLine($"{sw.Elapsed}    Rows:{result.Count}");

Result
00:00:13.7990042 Rows:175

14 seconds - Kidding me! SQL-Server does this in <100ms It’s only returning 175 rows.

Next, I tried an aggregation.

[
    { "$match" : { "bankAccount.accountNumber" : { "$ne" : null } } }, 
    { "$group" : { "_id" : "$bankAccount.accountNumber", "count" : { "$sum" : 1 } } }, 
    { "$project" : { "AccountNumber" : "$_id", "Count" : "$count", "_id" : 0 } }
]

Result
00:00:13.2047294 Rows:175

Same!

The Explain says it’s using an index scan,
It does not need to examine any of the documents according to the Explain Plan
“totalKeysExamined” : NumberInt(5043932),
** “totalDocsExamined” : NumberInt(0),**

The group section of the plan estimates 14.8sec
“executionTimeMillisEstimate” : NumberLong(14854)

The projection stage estimates at 2.1sec
“executionTimeMillisEstimate” : NumberInt(2101),

What is going on? This makes the $group operator totally unusable!

Next Thought: I’ll retrieve all the "accountNumber"s and group them in memory using LINQ.

  1. Retrieve all account Numbers
var accountNumbers =	await bankAccountRepo.Queryable()
							                       .Where(n => n.BankAccount.AccountNumber != null)
							                       .Select(n => n.BankAccount.AccountNumber)
							                       .ToListAsync();
  1. Group them inMemory using LINQ
var grouped = accountNumbers.GroupBy(n => n)
						   .Select(n => new { AccountNumber = n.Key, Count = n.Count() })
						   .ToList();

3. Results

  1. Retrieve all AccountNumbers from Mongo: 00:00:20.3103406 Rows:5043932
  2. InMemory LINQ Grouping: 00:00:00.2270047 Rows:175

Conclusion AND Questions

  1. I want to know if I have done something wrong. Right now, this is unusable.
  2. Why can LINQ group 5 Million rows in 0.23 seconds while MongoDB is taking 15 seconds?
  3. Why does it take 20.3 seconds to return all the values in an index (which is in memory on the server).
  4. I’m migrating a large real-time banking application from SQL-Server to MongoDB. When I have found poor MongoDB performance in the past, I have been able to split up the query and use other tricks to at least meet the performance of SQL-Server. With this simple reporting query, Nothing I have tried has dropped performance below 13 seconds.

Explain Plan:

{
    "explainVersion" : "1",
    "stages" : [
        {
            "$cursor" : {
                "queryPlanner" : {
                    "namespace" : "paymentsDb-Migration.bankAccount",
                    "indexFilterSet" : false,
                    "parsedQuery" : {
                        "bankAccount.accountNumber" : {
                            "$not" : {
                                "$eq" : null
                            }
                        }
                    },
                    "queryHash" : "2B6152F6",
                    "planCacheKey" : "10155D97",
                    "maxIndexedOrSolutionsReached" : false,
                    "maxIndexedAndSolutionsReached" : false,
                    "maxScansToExplodeReached" : false,
                    "winningPlan" : {
                        "stage" : "PROJECTION_DEFAULT",
                        "transformBy" : {
                            "bankAccount.accountNumber" : NumberInt(1),
                            "_id" : NumberInt(0)
                        },
                        "inputStage" : {
                            "stage" : "IXSCAN",
                            "keyPattern" : {
                                "bankAccount.accountNumber" : NumberInt(1),
                                "bankAccount.abaDetails.number" : NumberInt(1)
                            },
                            "indexName" : "accountNumber_abaDetails.number_V1",
                            "isMultiKey" : false,
                            "multiKeyPaths" : {
                                "bankAccount.accountNumber" : [

                                ],
                                "bankAccount.abaDetails.number" : [

                                ]
                            },
                            "isUnique" : true,
                            "isSparse" : true,
                            "isPartial" : false,
                            "indexVersion" : NumberInt(2),
                            "direction" : "forward",
                            "indexBounds" : {
                                "bankAccount.accountNumber" : [
                                    "[MinKey, undefined)",
                                    "(null, MaxKey]"
                                ],
                                "bankAccount.abaDetails.number" : [
                                    "[MinKey, MaxKey]"
                                ]
                            }
                        }
                    },
                    "rejectedPlans" : [
                        {
                            "stage" : "PROJECTION_DEFAULT",
                            "transformBy" : {
                                "bankAccount.accountNumber" : NumberInt(1),
                                "_id" : NumberInt(0)
                            },
                            "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "bankAccount.accountNumber" : NumberInt(1),
                                    "bankAccount.stipStatus" : NumberInt(1),
                                    "bankAccount.isActive" : NumberInt(1),
                                    "bankAccount.createdDateTime" : NumberInt(1)
                                },
                                "indexName" : "accountNumber_stipStatus_isActive_createdDateTime_V1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                    "bankAccount.accountNumber" : [

                                    ],
                                    "bankAccount.stipStatus" : [

                                    ],
                                    "bankAccount.isActive" : [

                                    ],
                                    "bankAccount.createdDateTime" : [

                                    ]
                                },
                                "isUnique" : false,
                                "isSparse" : true,
                                "isPartial" : false,
                                "indexVersion" : NumberInt(2),
                                "direction" : "forward",
                                "indexBounds" : {
                                    "bankAccount.accountNumber" : [
                                        "[MinKey, undefined)",
                                        "(null, MaxKey]"
                                    ],
                                    "bankAccount.stipStatus" : [
                                        "[MinKey, MaxKey]"
                                    ],
                                    "bankAccount.isActive" : [
                                        "[MinKey, MaxKey]"
                                    ],
                                    "bankAccount.createdDateTime" : [
                                        "[MinKey, MaxKey]"
                                    ]
                                }
                            }
                        },
                        {
                            "stage" : "PROJECTION_DEFAULT",
                            "transformBy" : {
                                "bankAccount.accountNumber" : NumberInt(1),
                                "_id" : NumberInt(0)
                            },
                            "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "bankAccount.accountNumber" : NumberInt(1),
                                    "bankAccount.createdDateTime" : NumberInt(1)
                                },
                                "indexName" : "accountNumber_createdDateTime_V1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                    "bankAccount.accountNumber" : [

                                    ],
                                    "bankAccount.createdDateTime" : [

                                    ]
                                },
                                "isUnique" : false,
                                "isSparse" : true,
                                "isPartial" : false,
                                "indexVersion" : NumberInt(2),
                                "direction" : "forward",
                                "indexBounds" : {
                                    "bankAccount.accountNumber" : [
                                        "[MinKey, undefined)",
                                        "(null, MaxKey]"
                                    ],
                                    "bankAccount.createdDateTime" : [
                                        "[MinKey, MaxKey]"
                                    ]
                                }
                            }
                        },
                        {
                            "stage" : "PROJECTION_DEFAULT",
                            "transformBy" : {
                                "bankAccount.accountNumber" : NumberInt(1),
                                "_id" : NumberInt(0)
                            },
                            "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "bankAccount.accountNumber" : NumberInt(1)
                                },
                                "indexName" : "accountNumber_V1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                    "bankAccount.accountNumber" : [

                                    ]
                                },
                                "isUnique" : false,
                                "isSparse" : true,
                                "isPartial" : false,
                                "indexVersion" : NumberInt(2),
                                "direction" : "forward",
                                "indexBounds" : {
                                    "bankAccount.accountNumber" : [
                                        "[MinKey, undefined)",
                                        "(null, MaxKey]"
                                    ]
                                }
                            }
                        }
                    ]
                },
                "executionStats" : {
                    "executionSuccess" : true,
                    "nReturned" : NumberInt(5043932),
                    "executionTimeMillis" : NumberInt(14858),
                    "totalKeysExamined" : NumberInt(5043932),
                    "totalDocsExamined" : NumberInt(0),
                    "executionStages" : {
                        "stage" : "PROJECTION_DEFAULT",
                        "nReturned" : NumberInt(5043932),
                        "executionTimeMillisEstimate" : NumberInt(2101),
                        "works" : NumberInt(5043933),
                        "advanced" : NumberInt(5043932),
                        "needTime" : NumberInt(0),
                        "needYield" : NumberInt(0),
                        "saveState" : NumberInt(5557),
                        "restoreState" : NumberInt(5557),
                        "isEOF" : NumberInt(1),
                        "transformBy" : {
                            "bankAccount.accountNumber" : NumberInt(1),
                            "_id" : NumberInt(0)
                        },
                        "inputStage" : {
                            "stage" : "IXSCAN",
                            "nReturned" : NumberInt(5043932),
                            "executionTimeMillisEstimate" : NumberInt(737),
                            "works" : NumberInt(5043933),
                            "advanced" : NumberInt(5043932),
                            "needTime" : NumberInt(0),
                            "needYield" : NumberInt(0),
                            "saveState" : NumberInt(5557),
                            "restoreState" : NumberInt(5557),
                            "isEOF" : NumberInt(1),
                            "keyPattern" : {
                                "bankAccount.accountNumber" : NumberInt(1),
                                "bankAccount.abaDetails.number" : NumberInt(1)
                            },
                            "indexName" : "accountNumber_abaDetails.number_V1",
                            "isMultiKey" : false,
                            "multiKeyPaths" : {
                                "bankAccount.accountNumber" : [

                                ],
                                "bankAccount.abaDetails.number" : [

                                ]
                            },
                            "isUnique" : true,
                            "isSparse" : true,
                            "isPartial" : false,
                            "indexVersion" : NumberInt(2),
                            "direction" : "forward",
                            "indexBounds" : {
                                "bankAccount.accountNumber" : [
                                    "[MinKey, undefined)",
                                    "(null, MaxKey]"
                                ],
                                "bankAccount.abaDetails.number" : [
                                    "[MinKey, MaxKey]"
                                ]
                            },
                            "keysExamined" : NumberInt(5043932),
                            "seeks" : NumberInt(1),
                            "dupsTested" : NumberInt(0),
                            "dupsDropped" : NumberInt(0)
                        }
                    }
                }
            },
            "nReturned" : NumberLong(5043932),
            "executionTimeMillisEstimate" : NumberLong(13523)
        },
        {
            "$group" : {
                "_id" : "$bankAccount.accountNumber",
                "count" : {
                    "$sum" : {
                        "$const" : 1.0
                    }
                }
            },
            "maxAccumulatorMemoryUsageBytes" : {
                "count" : NumberLong(12600)
            },
            "totalOutputDataSizeBytes" : NumberLong(45675),
            "usedDisk" : false,
            "nReturned" : NumberLong(175),
            "executionTimeMillisEstimate" : NumberLong(14854)
        },
        {
            "$project" : {
                "AccountNumber" : "$_id",
                "Count" : "$count",
                "_id" : false
            },
            "nReturned" : NumberLong(175),
            "executionTimeMillisEstimate" : NumberLong(14854)
        }
    ],
    "serverInfo" : {
        "host" : "atlas-4vy2as-shard-00-02.krrw3.mongodb.net",
        "port" : NumberInt(27017),
        "version" : "5.0.14",
        "gitVersion" : "1b3b0073a0b436a8a502b612f24fb2bd572772e5"
    },
    "serverParameters" : {
        "internalQueryFacetBufferSizeBytes" : NumberInt(104857600),
        "internalQueryFacetMaxOutputDocSizeBytes" : NumberInt(104857600),
        "internalLookupStageIntermediateDocumentMaxSizeBytes" : NumberInt(104857600),
        "internalDocumentSourceGroupMaxMemoryBytes" : NumberInt(104857600),
        "internalQueryMaxBlockingSortMemoryUsageBytes" : NumberInt(104857600),
        "internalQueryProhibitBlockingMergeOnMongoS" : NumberInt(0),
        "internalQueryMaxAddToSetBytes" : NumberInt(104857600),
        "internalDocumentSourceSetWindowFieldsMaxMemoryBytes" : NumberInt(104857600)
    },
    "command" : {
        "aggregate" : "cuscalBankAccount",
        "pipeline" : [
            {
                "$match" : {
                    "bankAccount.accountNumber" : {
                        "$ne" : null
                    }
                }
            },
            {
                "$group" : {
                    "_id" : "$bankAccount.accountNumber",
                    "count" : {
                        "$sum" : 1.0
                    }
                }
            },
            {
                "$project" : {
                    "AccountNumber" : "$_id",
                    "Count" : "$count",
                    "_id" : 0.0
                }
            }
        ],
        "allowDiskUse" : false,
        "maxTimeMS" : NumberLong(0),
        "cursor" : {

        },
        "$db" : "monoovaPaymentsDb-Migration"
    },
    "ok" : 1.0,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1676282832, 1),
        "signature" : {
            "hash" : BinData(0, "bx+wC+MVULo2KX1DYS8h0apGhcU="),
            "keyId" : NumberLong(7133047024810721286)
        }
    },
    "operationTime" : Timestamp(1676282832, 1)
}