Filtering results with metadata does not work [Langchain]

Hello everyone, I'm trying to filter the embeddings by metadata before the similarity search runs, for a Q&A application built with Langchain.

This is the vector search index:

{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "contract_id",
      "type": "filter"
    }
  ]
}

This is the code:

vectorStore = MongoDBAtlasVectorSearch(collection, embedding_model, index_name="vector_index")
retriever = vectorStore.as_retriever(search_kwargs={"k":1})
qa = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever, return_source_documents=True)

How should I modify the retriever to filter on "contract_id"? I've already tried filter and pre-filter, but neither works.

Has anyone else had the same problem and solved it?

Hi!

I was able to get this working by adding a pre_filter entry (with an underscore) to the search_kwargs passed to the retriever, like so:

retriever = vectorStore.as_retriever(search_type="similarity", search_kwargs={"k": 1, "pre_filter": {"contract_id": "contract_id"}})

Be sure to import MongoDBAtlasVectorSearch from langchain_mongodb.
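In case it helps, here is a minimal sketch of how the filter dictionary could be built for a real contract ID rather than a placeholder string. The helper name build_search_kwargs and the ID "C-123" are made up for illustration. It uses the explicit $eq form of the filter, which I'd expect to behave the same as the short {"contract_id": value} form above. It assumes contract_id is indexed with type "filter", as in the index definition from the question.

```python
from typing import Any, Dict


def build_search_kwargs(contract_id: str, k: int = 1) -> Dict[str, Any]:
    """Hypothetical helper: build search_kwargs that pre-filter on contract_id."""
    return {
        "k": k,
        # Explicit $eq form of the filter; the field must be declared with
        # type "filter" in the vector search index for this to apply.
        "pre_filter": {"contract_id": {"$eq": contract_id}},
    }


# Usage with the vectorStore from the question (not executed here):
# retriever = vectorStore.as_retriever(
#     search_type="similarity",
#     search_kwargs=build_search_kwargs("C-123"),
# )
```

Keeping the filter in one small function makes it easy to print the dictionary and check the key names before handing it to the retriever.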

Semantic search with filtering is covered in the docs: https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/langchain/#std-label-langchain-create-index
