How to limit the number of items in a collection sorted by a given prop

Hi!

We are trying to build a logging service that limits the number of items for a given property of the log entry (for example, only store the latest x records for a given project).

Capped collections would be great, but that means we would need to create as many collections as there are projects, which doesn’t seem like the best of ideas.

The other idea was, on every write (they are done in batches), to insert the new elements, get the count of items for that project, and delete the oldest ones above the limit (say 100), based on the number of items we just inserted.

Any other good approach to solve this efficiently?

Thanks!

(1) Using TTL indexes, you can delete documents automatically based on a date field. This is a way to control the size of the collection by removing older documents (but it does not control the number of documents within a collection).

TTL indexes are special single-field indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time or at a specific clock time.
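As a rough illustration of option (1), here is a minimal sketch of creating a TTL index. It assumes each log document has a `createdAt` date field and a 7-day retention period; both are made up for the example and are not from the original post. The small `IndexableCollection` interface is a stand-in shaped after the Node.js driver's `createIndex(keys, options)` call, so the snippet does not depend on a particular driver version.

```typescript
// Stand-in for the one collection method this sketch needs.
// Assumption: shaped after the Node.js driver's Collection.createIndex.
interface IndexableCollection {
  createIndex(
    keys: Record<string, 1 | -1>,
    options: { expireAfterSeconds: number }
  ): Promise<string>;
}

// Hypothetical retention period: 7 days.
const LOG_RETENTION_SECONDS = 7 * 24 * 60 * 60;

// Creates a TTL index on the assumed `createdAt` date field.
// MongoDB removes a document some time after
// createdAt + expireAfterSeconds has passed.
async function ensureLogTtlIndex(logs: IndexableCollection): Promise<string> {
  return logs.createIndex(
    { createdAt: 1 },
    { expireAfterSeconds: LOG_RETENTION_SECONDS }
  );
}
```

Note that this bounds the age of the logs, not the count per project, so it only fits if a time window is an acceptable substitute for "latest x records".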

(2) Another approach is to use Change Streams. With change streams, your application can listen to data change events and perform some action in response. For example, you can listen for insert events on a collection, check the number of documents in the collection, and delete the oldest ones if the number exceeds a limit.

Change streams are available for replica sets and sharded clusters.
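A minimal sketch of option (2), under some assumptions: the `WatchableCollection` interface below is a stand-in shaped after the Node.js driver's `watch(pipeline)` and `on('change', ...)` calls (check it against your driver version), the project id is assumed to live at `subscription.project.id` in the inserted document, and `cap` is a hypothetical function that trims one project's logs down to the limit.

```typescript
// Shape of the part of an insert change event this sketch reads.
interface ChangeEvent {
  operationType: string;
  fullDocument?: { subscription?: { project?: { id?: string } } };
}

// Stand-in for the driver's collection; assumption, not the real type.
interface WatchableCollection {
  watch(pipeline: object[]): {
    on(event: "change", listener: (e: ChangeEvent) => void): unknown;
  };
}

// Listens for inserts and asks `cap` to trim the affected project.
function watchAndCap(
  logs: WatchableCollection,
  cap: (projectId: string) => Promise<void>
): void {
  // Only insert events are of interest here.
  const stream = logs.watch([{ $match: { operationType: "insert" } }]);
  stream.on("change", (event) => {
    const projectId = event.fullDocument?.subscription?.project?.id;
    if (projectId) {
      // Handle failures so a rejected promise is not left unhandled.
      cap(projectId).catch((err) => console.error("cap failed:", err));
    }
  });
}
```

With batched writes this fires once per inserted document, so in practice you would probably want to debounce or group calls per project before trimming.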


Thanks for the quick response!

For now, I started doing this and it seems to be working perfectly:

async function capSourceData(projectId: string) {
  // Find the MAX_ITEMS-th newest log for this project (newest first by _id).
  const element = (await repository.logs
    .find({ 'subscription.project.id': projectId })
    .sort({ _id: -1 })
    .skip(MAX_ITEMS - 1)
    .limit(1)
    .toArray()).map(p => new Log(p));

  // If it exists, every older log of this project is over the limit: delete those.
  if (element && element[0]) {
    await repository.logs.deleteMany({
      $and: [
        { _id: { $lt: element[0].exposed.id } },
        { 'subscription.project.id': projectId }
      ]
    });
  }
}

I simply fire this promise without the await when adding the new batch of elements into the DB.
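One caveat with firing the promise without `await`: if it rejects and nothing handles the rejection, recent Node.js versions terminate the process by default. A minimal sketch of a wrapper that keeps the call non-blocking but still handles failures; `fireAndForget` and its parameters are hypothetical names introduced for this example.

```typescript
// Starts nothing itself: it takes an already-running promise and
// attaches a rejection handler so failures are reported, not unhandled.
function fireAndForget(
  task: Promise<unknown>,
  label: string,
  onError: (label: string, err: unknown) => void = (l, e) =>
    console.error(`${l} failed:`, e)
): void {
  task.catch((err) => onError(label, err));
}
```

Usage would look like `fireAndForget(capSourceData(projectId), 'capSourceData')` right after the batch insert.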

I like solution (2) of @Prasad_Saya.

  1. The inserts stay clean.
  2. No one can bypass the limitations as it is enforced elsewhere.
  3. You can spare load on the primary by having the enforcer read from a secondary.