I have a data set of dates and I’d like to check whether a user-input date clashes with an existing one. So far this has been my approach:
for await (const doc of collection.find(query)) {
    if (check_clash(doc.date, user_date))
        break;
}
but I have 2 concerns:

1. Will this work even for a larger data set?
2. Is there any way to do this in batches? I’m guessing that fetching 1 doc at a time incurs some overhead per fetch, whereas if we took e.g. 20 docs at a time, compared them, and then took the next batch, it should take less time overall since there are fewer calls to the database.
Maybe pseudocode for what I want would look something like this:
await collection.find(query).getInBatches(20, (docs) => {
    for (const d of docs)
        if (check_clash(d.date, user_date))
            return false; // break out
});
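To make the intent concrete, here is a rough sketch of what I mean, written against the Node.js driver as far as I understand it (check_clash and user_date are the same placeholders as above, and I'm assuming the cursor's batchSize option controls how many docs come back per round trip):

// Sketch only: gather documents from the cursor into chunks of `batch_size`
// and compare a whole chunk before moving on to the next one.
async function has_clash(collection, query, user_date, batch_size = 20) {
    // batchSize should make the driver pull `batch_size` docs per round trip
    const cursor = collection.find(query).batchSize(batch_size);
    let batch = [];

    for await (const doc of cursor) {
        batch.push(doc);
        if (batch.length === batch_size) {
            if (batch.some((d) => check_clash(d.date, user_date)))
                return true; // clash found, stop early
            batch = [];
        }
    }
    // check whatever is left over in the final, smaller batch
    return batch.some((d) => check_clash(d.date, user_date));
}

Is something along these lines the right direction, or does the driver already do this under the hood?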
P.S. I tried the approach in this question: Batching data with find, but a solution there pointed out:
It is never a good idea to use skip in mongo queries
Depending on the rule of check_clash you may be able to implement this as an aggregation pipeline and run it server side.
In regards to the point about .skip, the alternative is to sort your data (make sure there is a supporting index) and then keep track of the last item processed; when you resume, you just fetch the items greater than the last one found.
So you could use the _id field as the primary key for this and do something like this (pseudocode):
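If I understand that suggestion correctly, the idea is roughly the following (my own sketch, not the linked answer's code, again assuming the Node.js driver):

// Sketch of the "resume after the last seen _id" idea: sort on _id
// (indexed by default), fetch one page, remember the last _id, and
// start the next page just after it, instead of using skip.
async function scan_in_pages(collection, query, user_date, page_size = 20) {
    let last_id = null;

    while (true) {
        const page_query = last_id ? { ...query, _id: { $gt: last_id } } : query;
        const docs = await collection
            .find(page_query)
            .sort({ _id: 1 })
            .limit(page_size)
            .toArray();

        if (docs.length === 0)
            break; // no more documents

        for (const d of docs)
            if (check_clash(d.date, user_date))
                return true; // clash found

        last_id = docs[docs.length - 1]._id; // resume point for the next page
    }
    return false;
}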
I’ve actually abstracted my problem in the original question to simplify things, but the core problem is that I have a user-input date range and bookings that also have a date range, stored like so:
MONTHLY_COLLECTION:
{
    "month_year": "01-2020",
    "dates": {
        "1": [...],   // array of booking IDs whose range falls within this date
        "5": [...],
        "28": [...]
    },
    ...