Filter on MongoDB Vector Search doesn't work as expected

I’m building an aggregation pipeline in MongoDB and I’m encountering some unexpected behaviour.

The pipeline is as follows:

    [{
       "$search":{
          "index":"vector_index",
          "knnBeta":{
             "vector":[
                -0.30345699191093445,
                0.6833441853523254,
                1.2565147876739502,
                -0.6364057064056396
             ],
             "path":"embedding",
             "k":10,
             "filter":{
                "compound":{
                   "filter":[
                      {
                         "text":{
                            "path":"my.field.name",
                            "query":[
                               "value1",
                               "value2",
                               "value3",
                               "value4"
                            ]
                         }
                      },
                      {
                         "text":{
                            "path":"my.field.name2",
                            "query":"something_else"
                         }
                      }
                   ]
                }
             }
          }
       }
    },
    {
       "$project":{
          "score":{
             "$meta":"searchScore"
          },
          "embedding":0
       }
    }
    
    ]

The pipeline should run a vector search using the given index, path, and query vector (vector_index, embedding, vector), and that part seems to work correctly. The filter should limit the vector search to documents where my.field.name equals value1, value2, value3, or value4, and where my.field.name2 equals something_else.

Instead, only the second filter seems to work (in my real data, the value of the second filter is a single letter).

I tried using the must clause as well in place of the filter inside the compound clause but the outcome remains the same.

I also tried removing the second filter (the one without the list), and I still get unfiltered results.

Am I doing something wrong? How can I do this correctly?

I have the same problem: I’d like to filter the collection of documents before applying the vector search. But according to the documentation, KNN can’t be used inside a compound query. It’s very frustrating.

Here is a sample use case where it would be useful:
Imagine working with the sample_mflix dataset and wanting to make recommendations when a movie is selected. Suppose you’ve selected a “Drama” movie; you’d like the recommendations to be restricted to Drama movies and based on the plot of that movie.
Currently, I cannot filter on genres, so I get recommendations for Comedy, Action, War, etc. That’s not what I want.

I hoped the filter would work, but judging by @Matteo_Villosio’s question, that does not seem to be the case.

Hey @Frederic_Meriot

Using Vector Search via knnBeta allows you to run an approximate nearest neighbor query along with text pre-filtering.

You do not need to use a compound statement to achieve pre-filtering. If you go to the docs here, and choose the tab for the “Filter Example” you’ll see how you can use a filter with vector search, and even though that filter is nested inside the knnBeta statement it is doing a “pre-filter”.

With this filtering approach you should be able to do exactly what you’re looking for in that example.
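
As a rough sketch of what that could look like for the Drama recommendation case, the snippet below builds such a pipeline as plain Python dicts. It is only an illustration under a few assumptions: the helper name `build_prefiltered_knn_pipeline` is made up for this example, the index name is borrowed from the question above, `plot_embedding` is a placeholder for whatever field holds your vectors, and `genres` is assumed to be indexed in the same search index.

```python
from typing import Any, Dict, List


def build_prefiltered_knn_pipeline(
    query_vector: List[float],
    genre: str,
    k: int = 10,
) -> List[Dict[str, Any]]:
    """Build a $search pipeline whose knnBeta stage pre-filters on genre."""
    return [
        {
            "$search": {
                "index": "vector_index",  # assumption: index name from the question
                "knnBeta": {
                    "vector": query_vector,
                    "path": "plot_embedding",  # assumption: your embedding field
                    "k": k,
                    # A single operator sits directly under "filter";
                    # no compound wrapper is used in this sketch.
                    "filter": {
                        "text": {
                            "path": "genres",  # assumption: indexed in the same index
                            "query": genre,
                        }
                    },
                },
            }
        },
        {
            "$project": {
                "title": 1,
                "genres": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]


# Toy 4-dimensional vector, only to show the shape of the result.
pipeline = build_prefiltered_knn_pipeline([0.1, 0.2, 0.3, 0.4], "Drama", k=5)
```

With pymongo, the resulting list can be passed to `collection.aggregate(pipeline)`. The query vector must have the same number of dimensions as the vectors stored in the indexed field.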
