I stumbled upon two issues when migrating code from MapReduce to an aggregation pipeline using `$accumulator` in a `$group` stage, on a MongoDB 6.0.8 server using mongosh 1.10.1.
I explicitly set `allowDiskUseByDefault` to `true` and additionally pass `allowDiskUse: true` in the `runCommand` for the aggregation:
```js
db.adminCommand({
  setParameter: 1,
  allowDiskUseByDefault: true
})
```

...

```js
db.runCommand({
  aggregate: sourceCollection,
  pipeline: pipeline,
  allowDiskUse: true,
  cursor: {},
});
```
The first one was an OOM error that occurred when the data fed into the `$group` stage exceeded a certain size. The average size of the documents going into the `$group` was 231 KB, and the aggregation bombed when processing 620 such documents (~140 MB), but worked up to 612 documents (~138 MB):

```
MongoServerError: PlanExecutor error during aggregation :: caused by :: Out of memory
```
I worked around this one by limiting the number of documents processed in one run to stay well below that threshold (I chose a maximum of 200 documents at a time). Since this is a time-based aggregation, I cannot freely choose the exact number of documents, as I always need to process whole hours or days respectively.
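For illustration, here is a minimal sketch of that batching approach; the `createdAt` field and the fixed one-hour window are assumptions for the example, not the real schema:

```js
// Hypothetical sketch: restrict each run to one whole hour so the
// input to the $group stage stays well below the size that OOM'd.
const hourStart = ISODate("2023-07-01T10:00:00Z");
const hourEnd = new Date(hourStart.getTime() + 60 * 60 * 1000);

db.runCommand({
  aggregate: sourceCollection,
  pipeline: [
    // Narrow the input window before the expensive $group stage.
    { $match: { createdAt: { $gte: hourStart, $lt: hourEnd } } },
    // ...the $accumulator-based $group and the remaining stages...
  ],
  allowDiskUse: true,
  cursor: {},
});
```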
The second error I stumbled over was this one:

```
JavaScript execution interrupted
```
The documents processed each contain an array of objects, and the `$accumulator` merges these arrays: a sub-document is added to the target array if it isn't present yet; if it is, some of its properties are summed up and others are simply overwritten. If the number of sub-documents in these arrays exceeded a certain amount, I got the aforementioned "JavaScript execution interrupted". In most cases it was fine, but there were a couple of hours where we suffered a bot attack that led to an excessive number of sub-documents per array.
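To make the merge logic concrete, here is a rough sketch of such an `$accumulator`; the field names (`items`, `key`, `count`, `lastSeen`) are placeholders, not the actual schema:

```js
// Hypothetical sketch of the merge described above: sub-documents are
// matched by a key; counters are summed, other fields are overwritten.
const groupStage = {
  $group: {
    _id: "$someGroupKey",
    items: {
      $accumulator: {
        init: function () { return []; },
        accumulateArgs: ["$items"],
        accumulate: function (state, items) {
          for (const item of items) {
            const existing = state.find((s) => s.key === item.key);
            if (!existing) {
              state.push(item);                  // not present yet: add it
            } else {
              existing.count += item.count;      // sum numeric properties
              existing.lastSeen = item.lastSeen; // simply set the others
            }
          }
          return state;
        },
        merge: function (a, b) {
          // Apply the same element-wise merge to two partial states.
          for (const item of b) {
            const existing = a.find((s) => s.key === item.key);
            if (!existing) {
              a.push(item);
            } else {
              existing.count += item.count;
              existing.lastSeen = item.lastSeen;
            }
          }
          return a;
        },
        lang: "js",
      },
    },
  },
};
```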
I worked around this problem by inserting an additional `$set` stage before the `$group`, in which I sort these arrays by a specific key and then limit each array to 2,000 elements. I didn't experiment to find the exact limit at which it starts to fail, but with the chosen limit of 2,000 I consistently achieved successful aggregations.
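A minimal sketch of that guard stage, assuming the array lives in `items` and is sorted by a hypothetical `count` key (`$sortArray` is available from MongoDB 5.2 onwards):

```js
// Hypothetical $set stage inserted before the $group: sort each array
// by a key and cap it at 2,000 elements so the JS accumulator's
// working set stays bounded.
const capArraysStage = {
  $set: {
    items: {
      $slice: [
        { $sortArray: { input: "$items", sortBy: { count: -1 } } },
        2000, // keep at most the top 2,000 elements per document
      ],
    },
  },
};
```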
So my question is: shouldn't `$group` just spill to disk if `allowDiskUse` is enabled? Why is there still an OOM error, and how else could this have been avoided? MapReduce ran very long and consumed quite some memory, but I could depend on it completing eventually. With the aggregation framework I have a somewhat bad feeling that there may be circumstances where an aggregation bombs and we'd have to find a workaround. Is there maybe a configuration setting I missed that would allocate more resources to specific aggregations? This is running on a server with 128 GB of RAM, and I assume such a server is not that out of the ordinary, so it would be nice if larger aggregations could actually make use of the available memory.