Identify gaps in date fields of a collections

I have a collection that has a field such as “publicationDate”: “2023-03-30T00:01:51”, and another with a dd/mm/YYYY string such as “articledatepub”: “24/03/2023”. as I am really at my first steps with MongoDB I would greatly appreciate an hint on how to identify “missing days” between the oldest and most recent date, especially in the first case. Around 100K records on a MongoDB 4.4 self hosted collections. Using python and pymongo. Thanks a lot.

The first think you should do is to use real dates rather than

Real dates, take less space, are faster to compare and are ordered correctly. With dd/mm/YYYY you would need to convert to date anyway for any calculation, sorting and order comparison. And because of the need to convert all the time, you won’t be able to use indexes with the field.

1 Like

Thanks Steeve. Very good point. Alas that date format is pre-existing in the original data. Would it be worth to do a one shot replacement with a proper date value?

I mentioned it because it is.