I have a collection with multiple site ObjId's, how could I delete the oldest document based on the date field which have the same site objid field and also add a new field of isCurrent to the only record available?

{"_id":{"$oid":"5b732e09568cca439ce49bde"},"site":{"$oid":"5b6b067e3fce621fee5bc4la"},"date":{"$date":"2016-01-21T04:00:00.000Z"}}
    {"_id":{"$oid":"5d00f00fea90e20167601982"},"site":{"$oid":"5b6b067e3fce621fee5bc4la"},"date":{"$date":"2017-01-21T04:00:00.000Z"}}
{"_id":{"$oid":"5d00f00fea90e2016760199c"},"site":{"$oid":"5b6b067e3fce621fee5bc4la"},"date":{"$date":"2018-01-21T04:00:00.000Z"}}
{"_id":{"$oid":"5d39639ece0a210010320c59"},"site":{"$oid":"5b6b067e3fce621fee5bc4la"},"date":{"$date":"2020-01-21T04:00:00.000Z"}}
{"_id":{"$oid":"5d39639ece0a210010320c32"},"site":{"$oid":"5b6b067e3fce621fee5bc4ul"},"date":{"$date":"2016-01-21T04:00:00.000Z"}}
{"_id":{"$oid":"5d39639ece0a210010320c15"},"site":{"$oid":"5b6b067e3fce621fee5bc4ul"},"date":{"$date":"2020-01-21T04:00:00.000Z"}}

Is this a one time thing or a frequent use-case?

For a frequent use-case, it is important to try to do it with the least number of database requests. But for a one time thing, like a migration to a new schema, we can afford to be a sloppy.

My sloppy approach would be:

  1. $sort by one of the following index site-date, date-site, site, date
  2. $group by site using $max to find the most recent date
  3. $out in a temp collection
  4. read the temp collection to make a bulk write in the source collection that will set isCurrent to all site,date tuple of temp collection
  5. run a delete many on the source collection for isCurrent:{$exists:false}

@EJ_Rojas, if my post solved your issue, please mark it as the solution. If not share any solution you found on your own to help people facing the same issue. This will help keep this forum efficient and useful to all.