I have a MongoDB collection with over 2.9M records and i want to remove the old data till a specific date range. & there’s only one index (i.e. _id UNIQUE) created over this collection
Upon running the following command in a shell, it is taking too much time:
"db.collection.remove({_id: {$in: db.collection.find({}, {_id : 1}).limit(100).sort({id:1}).toArray().map(function(doc) { return doc._id; }) }})
"
So which index is more suitable for a function “deleteMany or remove” to remove data from a column using a wildcard?
Secondly, i am creating a new index right now , on the same column in ASC order with option “Create index in the backgound”. How much time the index creation will take ?
Unfortunately we don’t have a date column in the collection but we do have a column that holds such information in this format (“1021-30-12-2021”) . The right side of this string is the date (i.e. 30-12-2021).
I tried using this query ( db.collection.deleteMany({routeKey: /12-01-2022/}) ) but the query is taking too much time.
Because you are using a regular expression that is not anchored at the beginning. This means that even if you have an index, all documents needs to be scanned to produce a result.
The first error is that you do not keep your dates as date but as string. The second error is that you used the worst string representation of a date by using day-month-year. This means you cannot even use relational operators, you cannot sort your data on date.
If querying with date is important you must have a date field store as a date data type. It takes less space than string, it is faster to compare than string and there is a rich library of date manipulation function.
Using the right model, using the right data type and have an index are the first steps for performance.
The following is certainly time consuming and having an ascending index on a field for which you already have a descending index is probably completely useless.
Do not create a new index just for your delete. It will probably take more time to 1) create the index 2) delete the documents and 3) update the index for all deleted document than just doing the delete.