Inserting and deleting a large number of records

As a Kafka subscriber, I receive 2,000+ records every 10 seconds and store them one by one in a single collection (100+ million records in total).
For deletion, I have a scheduler that runs every 30 minutes and deletes records older than 2 days. Because the _id field is indexed by default, I use it as the delete condition by converting a datetime to a MongoDB ObjectId:

(Node JS code)
const { ObjectID } = require('mongodb');
const moment = require('moment');

// Build an ObjectID whose embedded timestamp is 2 days in the past
const time = ObjectID.createFromTime(Math.floor(moment().subtract(2, 'days').valueOf() / 1000));
// `collection` is the handle to the target collection
await collection.deleteMany({ _id: { $lte: time } });

The problem is that this delete query takes too long to finish (and increases CPU utilization). Is there an alternative way to delete old records? I have heard about TTL, but I am not sure whether it works with large collections.

Thanks in advance. :v:

Hi @avnish, welcome to the community. Glad to have you here.

You heard right about TTL indexes. They are great at removing documents that are past their expiration date without you having to write code to explicitly delete them, and they work regardless of the collection size: a background task on the server removes expired documents roughly every 60 seconds, so deletes happen continuously rather than in one large batch. However, the _id field does not support TTL indexes, so you would have to index a field whose value is either a date or an array of date values.
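
As a minimal sketch, assuming your consumer stamps each document with a Date field (the `createdAt` field name, the `collection` handle, and the `record` variable here are all hypothetical placeholders):

(Node JS code)
// Assumption: every inserted document carries a `createdAt` Date field
await collection.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 2 * 24 * 60 * 60 } // expire documents ~2 days after createdAt
);

// On the Kafka consumer side, stamp each record as it is stored:
await collection.insertOne({ ...record, createdAt: new Date() });

With an index like this in place, the server takes care of expiring old documents, and your 30-minute delete scheduler can be retired.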

Check out the behavior section of the documentation for more details on the inner workings of this feature.

Hope this helps. Let me know if you have any questions.

Mahi
