Node.js | GridFS: Executor error during find command :: caused by :: Sort exceeded memory limit

Ni_Ma · April 7, 2022, 4:54pm

I have uploaded a 400 MB zip file to MongoDB using GridFS. I then try to download it using the following code:

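const fs = require('fs')
const mongodb = require('mongodb')
const Mongoose = require('mongoose')

// bucketName and filename are defined elsewhere in the application.
let mongoGridFsBucket = new mongodb.GridFSBucket(Mongoose.connection.db, {
  chunkSizeBytes: 1024,
  bucketName
})

// Look up the stored file by name and stream its chunks.
let gridFsDownloadStream = mongoGridFsBucket.openDownloadStreamByName(filename)

gridFsDownloadStream.on('error', console.error)

gridFsDownloadStream.on('end', function() {
  console.info('downloaded')
})

gridFsDownloadStream.pipe(fs.createWriteStream('/local/path/to/downloaded.zip'))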

and get this error:

MongoServerError: Executor error during find command :: caused by :: Sort exceeded memory limit of 33554432 bytes, but did not opt in to external sorting.

The above code works fine for smaller files (e.g. 9 MB files); I’ve already tested that successfully.

However, in this case the file is too big. I looked for a solution online, and apparently there is an allowDiskUse flag that I need to set somewhere, but I don’t know where or how.

There is no place in the above code where I could set this allowDiskUse to true so I don’t know what else to do to make this work.

Nikitas · May 13, 2022, 7:00pm

So did you find a solution to this problem? I’m still looking. Please share.

Nikitas · May 14, 2022, 1:59pm

I did some research and I’ve found a solution.

Apparently, when you download a file by streaming it with GridFS, the chunk documents that make up the file are first sorted. According to this blog post, when doing a sort, MongoDB first attempts to retrieve the documents in the order specified by an index. When no suitable index is available, it tries to load the documents into memory and sort them there.
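One way to check whether that sort is served by an index is to look at the query plan. Below is a minimal sketch, not an official utility: `planHasBlockingSort` is a hypothetical helper name, and it assumes the explain output nests child stages under `inputStage`, `inputStages`, or `queryPlan`, which can vary between server versions. An in-memory sort shows up as a `SORT` stage in the plan.

```javascript
// Hypothetical helper: walks an explain() plan and reports whether it
// contains a SORT stage, meaning the sort is done in memory rather than
// being served by an index.
function planHasBlockingSort(stage) {
  if (!stage || typeof stage !== 'object') return false;
  if (stage.stage === 'SORT') return true;
  // Child stages may sit under any of these keys, depending on the plan shape.
  const children = [stage.inputStage, stage.queryPlan, ...(stage.inputStages || [])];
  return children.some((child) => planHasBlockingSort(child));
}

// Usage sketch (needs a live connection; `db`, `fileId` and the 'fs' bucket
// name are assumptions, adjust them to your setup):
//   const plan = await db.collection('fs.chunks')
//     .find({ files_id: fileId }).sort({ n: 1 }).explain('queryPlanner');
//   console.log(planHasBlockingSort(plan.queryPlanner.winningPlan));
```

If this returns true for the chunks query, the download is relying on an in-memory sort and will hit the limit for large files.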

The catch is that, by default, MongoDB aborts the operation when an in-memory sort uses more than 32 MB. That is when we run into the “Sort exceeded memory limit” error described above. To solve this problem, you’ll have to create an index on the ‘n’ field of the chunks collection that contains the file you want to download:

    // Create an index for the 'n' field to sort the chunks collection.
    db.collection('media.chunks').createIndex({n: 1});

Stennie_X · May 19, 2022, 12:08pm

Welcome to the MongoDB Community @Nikitas!

The blog post you have shared is relevant to the limitations of an in-memory sort. However, drivers that conform to the GridFS specification should automatically create the required GridFS indexes for API retrieval.

If these indexes do not exist, you can manually create them.

An index on n alone is missing files_id, which is needed to efficiently retrieve all chunks belonging to a specific uploaded file.

The expected GridFS indexes are actually:

db.fs.chunks.createIndex( { files_id: 1, n: 1 }, { unique: true } );
db.fs.files.createIndex( { filename: 1, uploadDate: 1 } );
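From Node.js, the same two indexes could be created with something like the sketch below. `ensureGridFsIndexes` is a hypothetical helper name, `db` is assumed to be a connected `Db` instance (for example `Mongoose.connection.db` from the question), and the bucket name defaults to `fs`; pass your own bucket name if you used a custom one.

```javascript
// Hypothetical helper: creates the two expected GridFS indexes for a bucket.
// `db` is assumed to be a connected Db instance from the Node.js driver.
async function ensureGridFsIndexes(db, bucketName = 'fs') {
  // Compound index used to fetch one file's chunks in order.
  const chunksIndex = await db
    .collection(`${bucketName}.chunks`)
    .createIndex({ files_id: 1, n: 1 }, { unique: true });

  // Index used to look up files by name.
  const filesIndex = await db
    .collection(`${bucketName}.files`)
    .createIndex({ filename: 1, uploadDate: 1 });

  // createIndex resolves to the name of the index.
  return [chunksIndex, filesIndex];
}

// Usage sketch (assumes an open Mongoose connection and a 'media' bucket):
//   await ensureGridFsIndexes(Mongoose.connection.db, 'media');
```

Note that creating the unique index will fail if the chunks collection already contains duplicate (files_id, n) pairs.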

Regards,
Stennie
