How does MongoDB handle $in queries?

For example, how does MongoDB handle this simple query?

db.collection.find({_id: {$in: ['a', 'b', 'c']}})

We’re passing a huge list to $in (thousands to tens of thousands of values). Performance is slow and we’re trying to optimize it. I’m trying to understand how MongoDB handles $in queries so we can use them better. The official guide says to keep the number of elements below the hundreds, but I’m wondering why a larger list hurts performance.

Will MongoDB scan the _id index three times, looking up these elements one by one? In that case I think the complexity is O(M*log(N)), which should be optimal, but that doesn’t match what we saw in production.

Or will MongoDB scan the index once and compare each record against all three values? If so, I would expect O(N*M) complexity (with N as the number of elements to examine and M as the number of elements in the $in clause), and splitting the $in list into batches would improve performance.
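The batching idea mentioned above could be sketched roughly as follows. This is only an illustration, not a recommendation from the MongoDB docs: `findByIdsInBatches` is a hypothetical helper, `collection` is assumed to be a Node.js driver Collection, and the default batch size of 500 is an arbitrary placeholder. Whether batching actually helps would need to be measured.

```javascript
// Split a large list of values into fixed-size batches.
function chunk(values, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < values.length; i += size) {
    batches.push(values.slice(i, i + size));
  }
  return batches;
}

// Hypothetical helper: run one $in query per batch and concatenate the results.
// `collection` is assumed to be a Node.js driver Collection;
// the default batchSize of 500 is an arbitrary placeholder, not a tuned value.
async function findByIdsInBatches(collection, ids, batchSize = 500) {
  const results = [];
  for (const batch of chunk(ids, batchSize)) {
    const docs = await collection.find({ _id: { $in: batch } }).toArray();
    results.push(...docs);
  }
  return results;
}
```

Running the batches sequentially keeps each individual query small; comparing total time against the single large query would show whether the split is worth it.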

Hello,

While studying MongoDB, I came across the explain command as a powerful tool for query optimisation. I hope it helps you too.

WeDoTheBest4You
Thanks

Hello,

As @wedothebest_We_do_the_Best mentioned, you should try the explain tool to visualize your query. As for your question:
The guidance to keep the number of elements in the $in clause to a “reasonable” size is mainly about practical performance considerations. As the size of the $in list grows, the query can consume more memory and CPU resources, both for managing the larger list of values and for the increased complexity in the index lookup process. This doesn’t mean MongoDB scans the index multiple times; rather, the overhead is in managing the larger set of potential matches.
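To see where the overhead goes for a given $in list, the explain output can be reduced to a few counters. The sketch below assumes the query was explained with "executionStats" verbosity; `summarizeExplain` is a hypothetical helper and `items` is a placeholder collection name.

```javascript
// Hypothetical helper: pull the main counters out of an
// explain("executionStats") result.
function summarizeExplain(explainOutput) {
  const stats = explainOutput.executionStats;
  if (!stats) {
    throw new Error('run explain with "executionStats" verbosity');
  }
  return {
    nReturned: stats.nReturned,            // documents returned
    keysExamined: stats.totalKeysExamined, // index keys scanned
    docsExamined: stats.totalDocsExamined, // documents fetched
    millis: stats.executionTimeMillis,     // total execution time
  };
}

// Usage in mongosh (collection name `items` is a placeholder):
// const out = db.items.find({ _id: { $in: ['a', 'b', 'c'] } }).explain("executionStats");
// printjson(summarizeExplain(out));
```

Comparing keysExamined with nReturned as the $in list grows gives a rough picture of how much index work each additional value costs.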

Moreover, you can go through this to understand how indexes work efficiently and how to apply the ESR (Equality, Sort, Range) rule to use indexes effectively -

Thanks for the ESR suggestion, but I’m afraid there’s nothing we can do on the business side to make our queries match the ESR rule. Regarding the $in operator, I saw that the code breaks down the InMatchExpression into a series of ComparisonMatch. Is this always the case when MongoDB runs $in queries? https://github.com/mongodb/mongo/blob/master/src/mongo/db/matcher/expression_algo.cpp#L253