Hi, some of our collections are huge and the database keeps growing, which is causing space issues on the drive. The developers said to delete data older than one year, but I'm not sure how to do that. What is the best way to delete the data, and how do we compact the collections afterwards?
Hello @Ana ,
Assuming you have a “date created” field or similar in your documents, to delete documents older than one year, you can use the following command in MongoDB’s shell:
db.collectionName.deleteMany({dateField: {$lt: ISODate("YYYY-MM-DDTHH:mm:ss.sssZ")}})
In this command, collectionName should be replaced with the name of your collection and dateField should be replaced with the name of the field in the document that holds the date. The ISODate expression specifies the date before which documents should be deleted.
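As a rough illustration of how the cutoff in that command could be computed instead of typed by hand, here is a minimal sketch. The helper names cutoffDate and olderThanFilter are hypothetical, and it assumes the date field stores real BSON dates rather than strings.

```javascript
// Returns a Date that is `years` calendar years before `now` (in UTC).
function cutoffDate(years, now = new Date()) {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - years);
  return cutoff;
}

// Builds a filter matching documents whose date field is older than the cutoff.
// `fieldName` is whatever field holds the date in your documents.
function olderThanFilter(fieldName, years, now = new Date()) {
  return { [fieldName]: { $lt: cutoffDate(years, now) } };
}

// In the MongoDB shell (collectionName and dateField are placeholders):
//   const filter = olderThanFilter("dateField", 1);
//   db.collectionName.countDocuments(filter); // how many documents would be removed
//   db.collectionName.deleteMany(filter);
```

Counting the matching documents first is a cheap way to confirm the filter selects what you expect before anything is deleted.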
If this is not what your documents look like, please provide some example documents and a way to determine the age of a given document.
To compact a MongoDB collection and reclaim disk space, you can use the compact command. This rewrites and defragments the collection's data and indexes:
db.runCommand({compact: 'collectionName'})
Replace collectionName with the name of the collection you want to compact. Note that the compact command may take a long time to complete for large collections, and it may also cause increased CPU and I/O usage during the process. Also note that the compact command is not guaranteed to free up disk space, as explained in the disk space section of the compact documentation page.
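To judge whether compact has anything to reclaim, it can help to look at the collection statistics before and after running it. The following is a minimal sketch, assuming the WiredTiger storage engine; summarizeStats is a hypothetical helper name, and it only reads fields from the document returned by db.collectionName.stats().

```javascript
// Picks out the collection statistics that matter for disk usage.
// `stats` is the document returned by db.collectionName.stats().
function summarizeStats(stats) {
  const blockManager =
    (stats.wiredTiger && stats.wiredTiger["block-manager"]) || {};
  const reusable = blockManager["file bytes available for reuse"];
  return {
    dataBytes: stats.size, // uncompressed size of the documents
    onDiskBytes: stats.storageSize, // space allocated to the collection on disk
    // Free space inside the data file; null if the field is not reported.
    reusableBytes: reusable === undefined ? null : reusable,
  };
}

// In the MongoDB shell (collectionName is a placeholder):
//   summarizeStats(db.collectionName.stats());
```

After a large delete, dataBytes drops right away while onDiskBytes usually stays the same, because the freed space is kept inside the file for reuse. If reusableBytes is small, compact is unlikely to shrink the file much.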
Note: Please update and test the queries as per your use-case and requirements in your test environment before making any changes in production. Always have an up-to-date backup before performing server maintenance such as the compact operation.
Please refer to the compact command documentation to learn more about it.
If these solutions don't work for you, please provide more details, such as your MongoDB version, your deployment topology, your data size and remaining disk space, and any other information that may help.
Regards,
Tarun
Thank you so much. But fs.chunks is the collection that is huge, and it doesn't have a date field. How do I delete from it? fs.files has an upload date. Please advise.
Hi @Ana, an approach to this would be something like:
// Define the date range of the files to remove (date-only strings are parsed as midnight UTC)
const startDate = new Date("2021-01-01");
const endDate = new Date("2022-01-01"); // exclusive bound, so all of 2021-12-31 is included
// Find all files whose upload date falls within the range
const cursor = db.fs.files.find({ uploadDate: { $gte: startDate, $lt: endDate } });
// Loop through each file found, deleting its chunks and then the file document itself
while (cursor.hasNext()) {
  const file = cursor.next();
  db.fs.chunks.deleteMany({ files_id: file._id });
  db.fs.files.deleteOne({ _id: file._id });
}
Removing data from GridFS must be done this way, as GridFS stores the large file data as smaller chunks in the “fs.chunks” collection, and the file metadata in a single entry in the “fs.files” collection.
Therefore, to remove an entire file, you must remove both the file's document in the “fs.files” collection and all of its chunks in the “fs.chunks” collection. The loop above deletes the chunks first and the “fs.files” document last, so if it is interrupted partway through, the file can still be found by its upload date on the next run.
Each chunk refers to its file through the “files_id” field, which holds the “_id” of the matching “fs.files” document, so that is the field to match when removing chunks.
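If there are many files, deleting them one at a time can be slow. The same idea can be sketched in batches, as below. This is only an illustration under some assumptions: purgeGridFsBefore and batchSize are hypothetical names, the bucket uses the default “fs” prefix, and db is the shell's database handle passed in as an argument.

```javascript
// Deletes GridFS files uploaded before `cutoff`, up to `batchSize` files per pass.
// Chunks are removed before the fs.files documents so that an interrupted run
// can be repeated: the remaining fs.files documents still match the date filter.
function purgeGridFsBefore(db, cutoff, batchSize = 500) {
  let filesDeleted = 0;
  let chunksDeleted = 0;
  while (true) {
    // Fetch only the _id values of the next batch of old files.
    const ids = db.fs.files
      .find({ uploadDate: { $lt: cutoff } }, { _id: 1 })
      .limit(batchSize)
      .toArray()
      .map((doc) => doc._id);
    if (ids.length === 0) break;

    chunksDeleted += db.fs.chunks.deleteMany({ files_id: { $in: ids } }).deletedCount;
    const result = db.fs.files.deleteMany({ _id: { $in: ids } });
    filesDeleted += result.deletedCount;
    // Stop if nothing was removed, to avoid looping forever on the same batch.
    if (result.deletedCount === 0) break;
  }
  return { filesDeleted, chunksDeleted };
}

// In the MongoDB shell:
//   purgeGridFsBefore(db, new Date("2022-01-01"));
```

A smaller batchSize keeps each delete short at the cost of more round trips. As with the loop above, try it on a test environment first.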
After that, you can run db.runCommand({compact: 'collectionName'}), keeping in mind the points mentioned by @Tarun_Gaur.
Best
Thank you so much. Will try this.
Feel free to share any problems you have!
Hello, I have run the delete and then the compact. I see a difference in the collection sizes but can't see any difference in the drive size. Please check before and after.
Oh…storage size is the same but the size is different. Any idea what else I can do now?
On the test environment, the storage size also went down significantly after the purge and compact.
I tried to run a repair but got the error "TypeError: db.repairDatabase is not a function". The MongoDB version is 4.2. I'd appreciate your help.