We are using basic Atlas text search to compare items from list A against list B. Sometimes we get matches that make sense by text-search standards but are not the same item in the real world and shouldn't match.
For instance:
LIST A: SAGE
LIST B: Sage Palm, Sausage, garlic.
So the result is that Sage matches Sage Palm and Sausage even though they are not the same item.
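To make the problem concrete, here is a tiny plain-Python illustration. It is not how Atlas Search scores documents (Atlas works on analyzed tokens, and the exact behavior depends on the index and analyzer configuration). It only shows that any loose, containment-style comparison reproduces the unwanted matches above. The function name `loose_match` is made up for this example.

```python
def loose_match(query: str, candidate: str) -> bool:
    """Return True if the query text appears anywhere inside the candidate."""
    return query.lower() in candidate.lower()

list_b = ["Sage Palm", "Sausage", "garlic"]

# Everything in list B that loosely "contains" the list A item SAGE
matches = [c for c in list_b if loose_match("SAGE", c)]
print(matches)
```

Both "Sage Palm" and "Sausage" come back, because the letters s-a-g-e appear in each of them.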
I was thinking of using synonyms: building an array of all possible permutations of each item and then comparing the whole phrase against it.
The search is tricky. I hope these examples make sense:
It is OK for GARLIC to match GARLIC POWDER or GARLIC OIL (and vice versa).
It is not OK for SAGO PALM to match PALM OIL. They have the word PALM in common, but SAGO PALM is a plant that is really poisonous for pets, while PALM OIL is extracted from the OIL PALM tree and is not toxic.
The synonym array worked. We ended up building an array for each poison item and storing all the possible combinations of the ingredient in it. Common ingredients such as salt, garlic, and onion had 4,000 to 14,000 elements in their arrays.
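For anyone who wants to try the same idea, here is a minimal sketch of the whole-phrase approach in plain Python. It is not our production code, and it does not use any Atlas Search API. The names `ITEM_VARIANTS`, `normalize`, and `match_item` are made up for illustration, and the variant lists are tiny sample data (the real arrays had thousands of entries).

```python
import re

def normalize(phrase: str) -> str:
    """Lowercase the phrase and collapse punctuation and extra whitespace."""
    return " ".join(re.findall(r"[a-z0-9]+", phrase.lower()))

# Illustrative sample data: one array of accepted phrasings per item.
ITEM_VARIANTS = {
    "garlic": ["garlic", "garlic powder", "garlic oil"],
    "sago palm": ["sago palm", "sago palms"],
    "sage": ["sage", "sage leaf"],
}

# Invert the arrays once so each lookup is a single dictionary access.
VARIANT_TO_ITEM = {
    normalize(variant): item
    for item, variants in ITEM_VARIANTS.items()
    for variant in variants
}

def match_item(ingredient: str):
    """Return the item whose array contains the whole phrase, or None."""
    return VARIANT_TO_ITEM.get(normalize(ingredient))

for phrase in ["Garlic Powder", "Palm Oil", "Sausage", "Sage Palm"]:
    print(phrase, "->", match_item(phrase))
```

Because the comparison is against the whole normalized phrase, GARLIC POWDER resolves to garlic, while PALM OIL, Sausage, and Sage Palm match nothing unless someone explicitly adds them to an array. The trade-off is the one we ran into: the arrays get large, since every acceptable phrasing has to be listed.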
Thank you, MongoDB team and community, for brainstorming.