$vectorSearch filter & numCandidates

Hi @Charlie_Mattox-

Thanks for the question. I can definitely see why this behavior is confusing. numCandidates controls the size of the priority queue of vectors assessed while traversing the HNSW graph before limit documents are returned. When you use a filter, it has a secondary purpose: deciding whether to traverse the graph at all, or to run an exhaustive ENN search over the documents that meet the prefilter when there are fewer of them than numCandidates. What makes this even more complicated is that the decision is made on a per-segment basis. So even if you know from a $match query that 80 documents meet a prefilter, that doesn't mean numCandidates = 81 is the point at which ENN kicks in. The threshold is likely lower, depending on how many segments are produced, and the segment count is not exposed to the user.
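To make the per-segment point concrete, here is a toy Python model. It is not the actual server logic: the function names and the segment split are made up for illustration, and the only rule it encodes is the one described above (a segment falls back to ENN when fewer documents pass the filter than numCandidates).

```python
from typing import List


def segment_strategy(matching_docs: int, num_candidates: int) -> str:
    """Toy rule: search exhaustively (ENN) when fewer documents in the
    segment pass the filter than numCandidates, otherwise traverse the
    HNSW graph (ANN). A simplification, not the real implementation."""
    return "ENN" if matching_docs < num_candidates else "ANN"


def strategies(matches_per_segment: List[int], num_candidates: int) -> List[str]:
    """Apply the toy rule to each segment independently."""
    return [segment_strategy(m, num_candidates) for m in matches_per_segment]


# 80 documents pass the filter in total. The split across segments is
# hypothetical; the real segment layout is not exposed to the user.
hypothetical_split = [30, 25, 25]

print(strategies([80], 81))                # ['ENN'] if everything sat in one segment
print(strategies(hypothetical_split, 31))  # ['ENN', 'ENN', 'ENN']
print(strategies(hypothetical_split, 26))  # ['ANN', 'ENN', 'ENN']
```

With this made-up split, every segment already searches exhaustively at numCandidates = 31, well below 81, and at 26 the segments disagree with each other.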

This means there can be cases where a lower numCandidates doesn't trigger ENN and documents that meet the prefilter are missing from the top results, but when you bump up numCandidates they appear. In general, we recommend increasing numCandidates if you are not seeing the results you expect, which it appears you have already done.
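If it helps, here is a rough Python sketch of trying the same filtered query at a few numCandidates values. The index name, vector path, filter field, and query vector are placeholders I made up, so substitute your own. The code only builds the $vectorSearch stage as a plain dict and does not connect to a cluster.

```python
from typing import Any, Dict, List


def build_vector_search_stage(
    query_vector: List[float],
    pre_filter: Dict[str, Any],
    limit: int,
    num_candidates: int,
    index: str = "vector_index",  # placeholder index name
    path: str = "embedding",      # placeholder vector field
) -> Dict[str, Any]:
    """Build a $vectorSearch stage that combines a prefilter with numCandidates."""
    if num_candidates < limit:
        # numCandidates has to be at least as large as limit.
        raise ValueError("num_candidates must be >= limit")
    return {
        "$vectorSearch": {
            "index": index,
            "path": path,
            "queryVector": query_vector,
            "filter": pre_filter,
            "numCandidates": num_candidates,
            "limit": limit,
        }
    }


# Placeholder query vector and filter, for illustration only.
query_vector = [0.1, 0.2, 0.3]
pre_filter = {"category": {"$eq": "books"}}

# The same query at increasing numCandidates, to compare result sets.
stages = [
    build_vector_search_stage(query_vector, pre_filter, limit=10, num_candidates=n)
    for n in (100, 200, 400)
]
```

Each stage can then be passed as the first stage of an aggregation, for example collection.aggregate([stage]) in PyMongo, and you can compare which documents come back at each setting.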

In the meantime, I will work to make sure this behavior is more clearly documented. Hopefully that resolves some of your confusion.