I am wondering what the best strategy would be here. We need to query a rather large collection, currently around 24M documents.
We search on 2 indexed fields, but to narrow the query down further we have an array of up to 10k user ids that can be added to it. Since the existing query is already an $or query, we have to add the array twice, resulting in a total of 20k ids being sent to Atlas.
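For context, here is roughly what that query shape looks like. The field names (`liker`, `liked`) are placeholders for our two indexed fields; the point is that the same id array ends up in the filter twice, once per $or branch:

```javascript
// Hypothetical query builder: the same candidate-id array is embedded
// in both $or branches, doubling the payload sent to Atlas.
function buildLikesQuery(userIds) {
  return {
    $or: [
      { liker: { $in: userIds } }, // ids appear once here...
      { liked: { $in: userIds } }, // ...and again here: 2x the ids on the wire
    ],
  };
}

const q = buildLikesQuery(["u1", "u2"]);
console.log(JSON.stringify(q));
```

With 10k ids, serializing this query object is what produces the 20k ids transported per request.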
Is this a bad idea? Would it be wiser to simply omit the 20k-id array and instead filter the results in code once they arrive?
It’s a query that runs often - thousands of times per day.
I also considered doing a $lookup directly on the collection, but isn’t that quite slow? From what I’ve read it works like a MySQL subquery, and thus should suffer from the N+1 problem, producing many small queries per document. Or did I miss something?
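For reference, a minimal sketch of what such a $lookup stage might look like (collection and field names are assumptions, not our actual schema). Note that an equality-match $lookup does run once per input document, but if the foreign field (`likes.liker` here) is indexed, each of those lookups is an index probe rather than a collection scan, so it is not quite the worst-case N+1 pattern:

```javascript
// Hypothetical aggregation pipeline: join each user to the likes
// they have given. Names ("likes", "liker", "active") are assumed.
const pipeline = [
  { $match: { active: true } }, // narrow the driving side first
  {
    $lookup: {
      from: "likes",
      localField: "_id",
      foreignField: "liker", // an index on likes.liker makes each probe cheap
      as: "likesGiven",
    },
  },
];

console.log(JSON.stringify(pipeline));
```

Whether this beats shipping 20k ids depends mostly on how many documents survive the $match stage.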
Basically the goal is this:
query user documents (up to a sample of 10k)
query like documents where a specific user id appears in either the LIKED or the LIKER field - this is the expensive one
Then on the server we loop through all users and check whether each user appears as LIKED or LIKER in the like documents we also fetched.
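The server-side step could be sketched like this (document shapes and field names are assumptions). Building a Set of all ids that appear in any like document makes the per-user check O(1) instead of scanning the like list for every user:

```javascript
// Collect every id that appears in a like document (as liker or liked),
// then keep only the users whose _id is in that set.
// Field names (_id, liker, liked) are assumed shapes, not a real schema.
function filterUsersWithLikes(users, likes) {
  const involved = new Set();
  for (const like of likes) {
    involved.add(like.liker);
    involved.add(like.liked);
  }
  return users.filter((u) => involved.has(u._id));
}

const users = [{ _id: "a" }, { _id: "b" }, { _id: "c" }];
const likes = [
  { liker: "a", liked: "x" },
  { liker: "y", liked: "c" },
];
console.log(filterUsersWithLikes(users, likes).map((u) => u._id)); // → [ 'a', 'c' ]
```

The trade-off versus sending the 20k-id array is that here the likes query returns a broader result set and the narrowing happens in application memory instead of on Atlas.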