Hello everyone.
I need some advice how to proceed with this problem, scenario:
Collection of users (I prepared pseudo-materialized view with other required properties), they will have quite a few other clauses but the most important ones are date range filters, and now based on that date range I need to calculate users availability (calculation is based on 3 other collections, I can put them into the materialized view as well, not a lot of data).
Question:
Is it possible to calculate some values before executing $search
stage itself? For example some $addFields
so I can do my calculation in the code? Or maybe there is a better way to achieve that?
Thanks in advance for all of the tips!
Hi @Michal_Janocha
Question:
Is it possible to calculate some values before executing $search
stage itself? For example some $addFields
so I can do my calculation in the code? Or maybe there is a better way to achieve that?
The $search
stage must be the first stage the pipeline so the exampling adding $addFields
prior to that will most likely result in an error. I’m just wondering if you are trying to execute this operation in a singular operation? Have you thought about performing the calculation client side and then providing these values to the $search
stage?
To help better understand the context of the question, would you be able to provide some details / sample documents and expected output?
Regards,
Jason
Hello @Jason_Tran,
Thank you for the response! The part where you are saying about providing calculated values to $search
stage looks very promising, how can I achieve that?
Anyway I went with pre-calculated values in the search model so it already contains calculated values, the only drawback of this solution is that I have quite large array of objects in each document (~3000), at this moment it doesn’t affect performance but I am not sure if this is a proper solution.
{
"_id": {
"$oid": "628b81d159a887d838e37944"
},
"firstName": "User",
"lastName": "Example",
"email": "user@example.com",
"photoURL": "",
"skills": [{...}],
"certificates": [{...}],
"languages": [{...}],
"availability": [
{
allocatedHours: Double,
day: Date,
maxHours: Double,
availableHours: Double,
availablePercent: Double
}
] <= this array grew to 3k documents because I need daily granularity and I am storing availability up to 10 years
}
Apologies here, what I had meant was pre-calculating the fields (outside of the database) and then providing those to the $search
stage which is what you have done currently based off your most recent reply.
Anyway I went with pre-calculated values in the search model so it already contains calculated values, the only drawback of this solution is that I have quite large array of objects in each document (~3000), at this moment it doesn’t affect performance but I am not sure if this is a proper solution.
Just to clarify, do you mean each individual document output from the $search
stage has approximately ~3000 (and/or more) documents inside the availability
array field?
Regards,
Jason
I guess my current solution does not work as you expect, at this moment I am not providing anything external to my $search
stage, everything comes from materialized view which was built totally out of $search
.
Just to clarify, do you mean each individual document output from the $search
stage has approximately ~3000 (and/or more) documents inside the availability
array field?
No, I guess it would totally kill networking between db => backend => frontend. But my input documents (the ones in materialized view) indeed contains ~3000 documents in availability
array. After searching on them but before returning results I am using projection with $filter
to reduce amount of documents inside this array, you can see that solution below.
{
"$addFields" : {
"availability" : {
"$filter" : {
"input" : "$availability",
"as" : "item",
"cond" : {
"$and" : [
{ "$lte" : [ "$$item.day", ISODate('2022-11-30') ] },
{ "$gte" : [ "$$item.day", ISODate('2022-08-01') ] }
]
}
}
}
}
}