I am using MongoDB Atlas Search. The code below works fine, but I am unable to add filters. The examples I have found use raw aggregation. The LangChain wrapper has a function that accepts filters as an argument, i.e. similaritySearchWithScore(qurey, 5, {preFilter: {name: "test_file.pptx"}}). I used it, but it fails with: error: PlanExecutor error during aggregation :: caused by :: "filter.name" must be a document
Code:
```javascript
import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
const store = new MongoDBAtlasVectorSearch(embeddings, {
  collection: my_collection,
  indexName: "default",
  textKey: "page_content",
  embeddingKey: "page_embeddings",
});
return await store.similaritySearchWithScore(qurey, 5, { preFilter: { name: "test.pptx" } });
```
Do we need to add every field we want to filter on in the semantic search function to the following index template? I ask because I filtered on a field that was not included in the index, and it threw the error shown below.
If so, what "type" should be used for a collection field that holds JSON data? For a string, for example, we use "type": "token", "normalizer": "lowercase".
```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "page_embeddings": {
        "dimensions": 1536,
        "similarity": "cosine",
        "type": "knnVector"
      }
    }
  }
}
```
Error:
error: PlanExecutor error during aggregation :: caused by :: Path 'name' needs to be indexed as token
May I ask what you meant by ‘all fields’ here? Are you generating vector embeddings for multiple fields?
Yes, the $vectorSearch filter option matches only BSON boolean, string, and numeric values, so each field you filter on must be indexed with a matching Atlas Search field type.
And yes, for a string, index the field as the token type. Atlas Search indexes the terms in the string as a single token (searchable term) and stores them in a columnar storage format for efficient filtering or sorting operations. To read more about it, please refer to "Behavior of the token Type" in the MongoDB docs.
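To make the two points above concrete, here is a minimal sketch of what the index definition and the filter might look like for the name field from the question. It is an illustration under assumptions, not a confirmed fix: the helper eqFilter is hypothetical, the normalizer is left out for simplicity, and wrapping the value in an explicit $eq is inferred from the '"filter.name" must be a document' message rather than stated anywhere in this thread.

```typescript
// Sketch only: field names are taken from the question above;
// the helper name `eqFilter` is hypothetical.

// Index definition with the filter field declared next to the vector field.
const indexDefinition = {
  mappings: {
    dynamic: true,
    fields: {
      page_embeddings: {
        type: "knnVector",
        dimensions: 1536,
        similarity: "cosine",
      },
      // String field used in the filter, indexed as a single token.
      name: {
        type: "token",
      },
    },
  },
};

// Builds a filter whose value is a document (an explicit $eq) instead of
// a bare string. Assumption: this is the shape the error message asks for.
function eqFilter(path: string, value: string | number | boolean) {
  return { [path]: { $eq: value } };
}

const preFilter = eqFilter("name", "test.pptx");

// Print both objects so they can be compared with the Atlas UI and the query.
console.log(JSON.stringify(indexDefinition, null, 2));
console.log(JSON.stringify({ preFilter }));
```

The resulting object would then be passed as the third argument in the original snippet, i.e. store.similaritySearchWithScore(qurey, 5, { preFilter }). The index has to finish rebuilding before the filter can match anything.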
@Kushagra_Kesav Should we create a separate index for the filter fields, or should we declare them in the vector index template?
For example, I want to apply vector search only to documents where country='IN'. Should I create a separate index for country, or should I add this field to the vector index mapping?
Looking at one of the queries you've shared, it appears that one curly brace is missing from the query, and the word 'query' seems to be misspelled.
Additionally, in another case where you are getting an empty result, please ensure that a document exists with the value "document_meta.Disclaimer.Label": "Client Ready". This might be causing the absence of expected outcomes.
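One way to sanity-check this is to run an exact comparison against a few sample documents you already have in memory. The sketch below is a plain TypeScript helper with made-up sample data, not a MongoDB API: getPath and matching are hypothetical names, and unlike MongoDB's own path matching it does not look inside arrays.

```typescript
// Hypothetical helper: reads a dotted path such as "a.b.c" from a plain object.
// Returns undefined as soon as any segment is missing.
function getPath(doc: unknown, path: string): unknown {
  let current: unknown = doc;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Returns the documents whose value at `path` is exactly `expected`
// (case and whitespace both matter).
function matching(docs: unknown[], path: string, expected: string): unknown[] {
  return docs.filter((doc) => getPath(doc, path) === expected);
}

// Made-up sample documents for illustration only.
const samples = [
  { document_meta: { Disclaimer: { Label: "Client Ready" } } },
  { document_meta: { Disclaimer: { Label: "client ready" } } },
  { document_meta: {} },
];

const hits = matching(samples, "document_meta.Disclaimer.Label", "Client Ready");
console.log(hits.length);
```

If the count is zero for documents you expected to match, the stored value probably differs from the filter value in case, spacing, or nesting.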