I am struggling to come up with a way to filter a vector search to a given document, rather than searching all of them. I found an especially helpful example of $vectorSearch using a movies database on the MongoDB blog ("Building Generative AI Applications Using MongoDB: Harnessing the Power of Atlas Vector Search and Open Source Models"), but I am looking for an example that generates a list of product recommendations based on a customer's past purchases. My thought is that the flow will be:
- Take each purchase and create an embedding for it (name, category, description)
- Store each embedding along with the purchase
- Create a knnVector search index on the embedding field in the purchases collection, which also holds the customer data (id, name, etc.)
- Take each product and create an embedding for it (name, category, description)
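To make the flow above concrete, here is a rough sketch of what I have in mind. Everything named here is a placeholder of mine: `embed_fn` stands in for whatever embedding model gets used, the `embedding` and `customer_email` field names are made up, and the 384 dimensions in the index definition is only an assumption that depends on the model. The index definition follows the newer `vectorSearch` index format rather than a knnVector mapping, so treat it as one possible shape, not the required one.

```python
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]  # hypothetical: text in, vector out


def embedding_text(item: Dict[str, str]) -> str:
    """Join the fields I plan to embed (name, category, description)."""
    parts = [item.get("name", ""), item.get("category", ""), item.get("description", "")]
    return " | ".join(p for p in parts if p)


def with_embedding(item: Dict[str, str], embed_fn: Embedder) -> dict:
    """Return a copy of the purchase or product with its embedding attached."""
    doc = dict(item)
    doc["embedding"] = embed_fn(embedding_text(item))
    return doc


# Assumed index definition (field names and numDimensions are placeholders).
VECTOR_INDEX_DEFINITION = {
    "fields": [
        {"type": "vector", "path": "embedding", "numDimensions": 384, "similarity": "cosine"},
        {"type": "filter", "path": "customer_email"},
    ]
}
```

The same `with_embedding` helper would be applied to both purchases and products before inserting them, so the two collections share one embedding space.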
Where I am stuck is generating a list of similar products based on past purchases. I am assuming I can use $eq on the customer email, or a similar field. Let’s say a customer has three purchases. What I can’t wrap my brain around is whether I should be:
- building a list of the three purchases with an embedding for each, then looping through them and searching all of the product embeddings for similar matches (this seems inefficient)
- storing each of the three purchases separately and using the $eq operator on the customer email.
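For the first option, the loop I am picturing would look roughly like the sketch below: one $vectorSearch pipeline per purchase, each run against the products collection. The index name `product_index`, the `embedding` path, and the shape of the purchase documents are all assumptions of mine, not anything taken from the blog example.

```python
from typing import Any, Dict, List

Pipeline = List[Dict[str, Any]]


def product_search_pipeline(query_vector: List[float],
                            index_name: str = "product_index",  # assumed name
                            limit: int = 5,
                            num_candidates: int = 100) -> Pipeline:
    """Build a $vectorSearch pipeline that looks for products near one vector."""
    return [{
        "$vectorSearch": {
            "index": index_name,
            "path": "embedding",  # assumed field holding product embeddings
            "queryVector": query_vector,
            "numCandidates": num_candidates,
            "limit": limit,
        }
    }]


def pipelines_for_purchases(purchases: List[Dict[str, Any]]) -> List[Pipeline]:
    """One pipeline per past purchase, using that purchase's stored embedding."""
    return [product_search_pipeline(p["embedding"]) for p in purchases]


# Usage against a hypothetical pymongo database handle `db`:
#   purchases = list(db.purchases.find({"customer_email": email}))
#   for pipeline in pipelines_for_purchases(purchases):
#       matches = list(db.products.aggregate(pipeline))
```

With three purchases this means three separate searches, which is the part that feels inefficient to me.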
The example for the vector search has:
result = client['sample_mflix']['movies'].aggregate([
    {
        '$vectorSearch': {
            'queryVector': embed,
            'path': 'plot_embedding',
            'numCandidates': 100,
            'limit': 5,
            'index': 'sampleindex'
        }
    }
])
If I plug in the $eq operator, will this meet what I need? I am testing concurrently with this post, so I may find out on my own, but I wanted to float this as I am most likely making it too hard.
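What I mean by plugging in $eq is something like the sketch below, which uses the `filter` option of $vectorSearch. It assumes the collection being searched has a `customer_email` field and that this field is indexed for filtering in the vector index. The index name and field names are placeholders of mine.

```python
from typing import Any, Dict, List


def filtered_search_pipeline(query_vector: List[float],
                             customer_email: str) -> List[Dict[str, Any]]:
    """Build a $vectorSearch pipeline restricted to one customer's documents."""
    return [{
        "$vectorSearch": {
            "index": "purchase_index",  # assumed index name
            "path": "embedding",        # assumed embedding field
            "queryVector": query_vector,
            "numCandidates": 100,
            "limit": 5,
            # Pre-filter: only documents for this customer are considered.
            "filter": {"customer_email": {"$eq": customer_email}},
        }
    }]
```

Since the filter only narrows the documents in the collection being searched, it would restrict results to that customer's own purchases, so I am not sure it helps when the search target is the products collection.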