I wanted to delete millions of old records.date is a field we can use that

Db.getcollection(‘da’). Count({year:{2021}})

its a huge record needs to delete as batches of 10k or 20k. can we do that?

Seems to generate a syntax error.

Why do you want to batch your deletes?

Please share sample documents, some that need to be deleted and some that should not.

If you have a year you probably have month too. You could delete one month at a time rather than a year at a time.

Db.getcollection(‘da’). Count({year:{$lt:2021}})

Actually this was for finding count of record its about 50Million. So we cant delete that much at a time. we dont need to insert other conditions. Just delete those all records is enough.

So what u suggest is delete per month right?

Try to use below command

var bulk = db.collection.initializeUnorderedBulkOp();
bulk.find({timestamp: {$lte: new Date(“2020-10-31T23:59:59Z”)}}).remove();
bulk.execute();

I did that on my production.

Note: You need to run when you env. is quite.

1 Like

yes, by restricting the delete to a given month you can easily distribute the work over time.

another solution would be to use aggregation to $out into a temp. collection, drop the original, the move back the temp. collection into the original. that might be faster if you keep less documents than the number you want to delete. you might even reclaim more disk space.

1 Like

Not that this suggestion would help out much at this time, but if you know that records will not be valid after a certain period of time and you’re just going to delete them, you could look at using a TTL index and let the database engine remove the records after a certain period of time based on a date field. At high volumes of data this might not be the right approach (you would need to test as best as you can), but in some scenarios it might work well for you.

2 Likes