allowDiskUse is not working in aggregation pipeline

Hi, I was working on an aggregation pipeline to clean up duplicate data in a collection, but it throws the error "MongoServerError: PlanExecutor error during aggregation :: caused by :: Exceeded memory limit for $group, but didn't allow external spilling; pass allowDiskUse:true to opt in" even after passing { allowDiskUse: true }.

The objective is to delete all duplicates (documents with the same phoneNumber and the same startTime) while retaining the first entry.

Below are my pipeline stages:

const pipeline = [
    {
        $group: {
            _id: {
                phoneNumber: "$details.phoneNumber",
                startTime: "$details.startTime",
            },
            docs: { $push: "$$ROOT" },
            count: { $sum: 1 },
        },
    },
    {
        $match: {
            count: { $gt: 1 },
        },
    },
    {
        $unwind: "$docs",
    },
    {
        $sort: {
            "docs.createdAt": 1,
        },
    },
    {
        $skip: 1,
    },
    {
        $replaceRoot: { newRoot: "$docs" },
    },
]

db.collectionName.aggregate(pipeline, { allowDiskUse: true })

So what's the issue here? Why does it still fail even after passing allowDiskUse?

I'm using v5.8 of the MongoDB Node.js driver.

What Atlas tier is the cluster using? Shared cluster tiers (M0 free, M2, M5) have operational limits, one of which is that allowDiskUse is not supported in aggregation pipelines.

I was testing it on our dev database, which is on M0, so it failed just as you described.
Thanks.

Before testing in production, is there any way we can optimise the process and make sure nothing unintended happens?

Test on a dedicated tier such as M10, which supports the functionality you need.
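
On the optimisation question, here is a rough sketch rather than a definitive fix. Two things stand out in the pipeline from the question: pushing "$$ROOT" makes $group hold every full document in memory, and $skip: 1 after the $unwind and $sort drops a single document from the whole result, not one per group. The version below collects only _id values and uses $first after a $sort to pick the document to keep. The field names are taken from the question; the function names, the batch size of 1000 and the dry-run flag are my own assumptions, and I have not run this against your data. An index on createdAt would likely help the initial $sort, but that is also an assumption about your collection.

```javascript
// Sketch only: function names and the batch size are hypothetical.

// Builds a pipeline that emits, for each duplicated (phoneNumber, startTime)
// pair, the _ids of every document except the earliest one by createdAt.
function buildDuplicateIdPipeline() {
  return [
    // Sort first so that $first in the next stage picks the earliest document.
    { $sort: { createdAt: 1 } },
    {
      $group: {
        _id: {
          phoneNumber: "$details.phoneNumber",
          startTime: "$details.startTime",
        },
        keepId: { $first: "$_id" }, // the document to retain
        ids: { $push: "$_id" }, // _ids only, instead of "$$ROOT"
        count: { $sum: 1 },
      },
    },
    { $match: { count: { $gt: 1 } } },
    {
      $project: {
        _id: 0,
        keepId: 1,
        // Every _id in the group except the retained one.
        duplicateIds: { $setDifference: ["$ids", ["$keepId"]] },
      },
    },
  ];
}

// Dry run by default: returns the _ids that would be deleted and only
// calls deleteMany when dryRun is explicitly set to false.
async function removeDuplicates(collection, { dryRun = true } = {}) {
  const cursor = collection.aggregate(buildDuplicateIdPipeline(), {
    allowDiskUse: true, // has no effect on shared tiers (M0, M2, M5)
  });

  const toDelete = [];
  for await (const group of cursor) {
    for (const id of group.duplicateIds) toDelete.push(id);
  }

  if (!dryRun) {
    // Delete in batches so a single $in filter does not grow too large.
    const batchSize = 1000;
    for (let i = 0; i < toDelete.length; i += batchSize) {
      await collection.deleteMany({
        _id: { $in: toDelete.slice(i, i + batchSize) },
      });
    }
  }
  return toDelete;
}
```

Calling removeDuplicates(db.collection("collectionName")) first and inspecting the returned _ids, then re-running with { dryRun: false }, is one way to check that nothing unintended gets removed. Taking a backup or testing on a copy of the collection beforehand is still advisable.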
