Improve Your App's Search Results with Auto-Tuning
Historically, the only way to improve your app’s search query relevance was through manual intervention. For example, you can introduce score boosting to multiply a base relevance score when particular fields match, so that matches in key fields weigh more heavily than matches in others. That approach is, however, fixed by nature: the results are dynamic, but the scoring logic itself doesn’t change.
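As a concrete illustration, here’s a minimal sketch of that kind of manual boosting using Atlas Search’s compound operator. The index name ("default") and the field names ("cuisine" and "description") are assumptions for the example, not part of any particular schema:

```python
# A sketch of manual score boosting with Atlas Search.
# Index and field names are placeholders; adjust to your own schema.
boosted_search = [
    {
        "$search": {
            "index": "default",
            "compound": {
                "should": [
                    {
                        "text": {
                            "query": "romanian",
                            "path": "cuisine",
                            # Matches in the cuisine field count 3x toward the score.
                            "score": {"boost": {"value": 3}},
                        }
                    },
                    {
                        "text": {
                            "query": "romanian",
                            "path": "description",
                        }
                    },
                ]
            },
        }
    },
    {"$limit": 10},
]
```

Because the boost values are hard-coded, every user gets the same weighting, which is exactly the static behavior we want to move beyond.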
The following project will showcase how to leverage synonyms to create a feedback loop that is self-tuning, in order to deliver incrementally more relevant search results to your users—all without complex machine learning models!
We have a food search application where a user searches for “Romanian Food.” Assuming that we’re logging every user's clickstream data (their step-by-step interaction with our application), we can take a look at this “sequence” and compare it to the sequences of other sessions that yielded a strong call to action (CTA): a successful checkout.
Suppose another user searched for “German Cuisine” and produced a very similar clickstream sequence. We can build a script that analyzes these users’ (and other users’) clickstreams and identifies similarities. When it finds one, the script appends the new term to a synonyms document that contains “German,” “Romanian,” and other more common cuisines, like “Hungarian.”
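Atlas Search reads synonym mappings from a source collection of plain documents. As a sketch of what the script would maintain (the collection name and the cuisine values here are just illustrative), one “equivalent” mapping might look like:

```python
# One "equivalent" synonym mapping in an Atlas Search synonyms source collection.
# The search_tuner script's job is to append newly discovered terms to this array.
cuisine_synonyms = {
    "mappingType": "equivalent",
    "synonyms": ["romanian", "german", "hungarian"],
}
```

Your search index definition references this collection under a mapping name, and the text operator’s synonyms option applies it at query time, so newly appended terms can take effect without redeploying the app.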
Here’s the workflow we’re looking to accomplish: log each user’s events to a clickstreams collection, group them by session and search query, compare sessions that convert with those that don’t, and append the related search terms to a synonyms collection that Atlas Search uses on future queries.
In our app tier, as events are fired, we log them to a clickstreams collection, like:
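The exact event shape will vary by app; the documents below are an illustrative sketch (field names such as event_type and metadata.search_value are assumptions) that matches the two sessions discussed next:

```python
# Illustrative clickstream events; the field names are assumptions, not a fixed schema.
clickstream_events = [
    {"session_id": "1", "event_type": "search",
     "metadata": {"search_value": "romanian food"}},
    {"session_id": "1", "event_type": "add_to_cart"},
    {"session_id": "1", "event_type": "checkout"},
    {"session_id": "1", "event_type": "payment_success"},

    {"session_id": "2", "event_type": "search",
     "metadata": {"search_value": "hungarian food"}},
    {"session_id": "2", "event_type": "add_to_cart"},
]
```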
In this simplified list of events, we can see that {"session_id":"1"} searched for “romanian food,” which led to a conversion (payment_success), while {"session_id":"2"} searched for “hungarian food” and stalled after the add_to_cart event.
You can import this data yourself using sample_data.json.
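You can import the file with Compass or mongoimport. If you’d rather do it programmatically, here is a sketch using PyMongo; the connection string and database name are placeholders:

```python
import json

from pymongo import MongoClient

# Connection string and database name are placeholders.
client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
db = client["food_search"]

# Load the sample events and write them to the clickstreams collection.
with open("sample_data.json") as f:
    events = json.load(f)

db.clickstreams.insert_many(events)
```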
Let’s prepare the data for our search_tuner script.
By the way, it’s no problem that only some documents have a metadata field. Our $group stage handles both cases gracefully, simply skipping the field on documents where it’s missing.
Let’s create the view by building the query, then going into Compass and adding it as a new collection called group_by_session_id_and_search_query:
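Here is one way such a grouping pipeline could look, assuming the event shape sketched earlier; the accumulator choices and the converted flag are illustrative rather than the tutorial’s exact query:

```python
# One way to shape the grouping pipeline, assuming the sample event documents above.
pipeline = [
    {
        "$group": {
            "_id": "$session_id",
            # Only "search" events carry metadata.search_value; $max simply
            # ignores documents where the field is missing.
            "search_query": {"$max": "$metadata.search_value"},
            "events": {"$push": "$event_type"},
        }
    },
    {
        # Flag sessions that ended in a successful checkout.
        "$addFields": {"converted": {"$in": ["payment_success", "$events"]}}
    },
]

# Register the pipeline as a read-only view over the clickstreams collection.
db.create_collection(
    "group_by_session_id_and_search_query",
    viewOn="clickstreams",
    pipeline=pipeline,
)
```

In Compass, you can paste the same stages into the aggregation builder and save the result as a view with that name.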
Here’s what it will look like:
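Given the sample sessions above, the documents in the view would be shaped roughly like this (values shown for illustration):

```python
[
    {"_id": "1", "search_query": "romanian food",
     "events": ["search", "add_to_cart", "checkout", "payment_success"],
     "converted": True},
    {"_id": "2", "search_query": "hungarian food",
     "events": ["search", "add_to_cart"],
     "converted": False},
]
```

With sessions grouped this way, a search_tuner pass can compare them and fold the queries of similar sessions into the synonyms collection. The sketch below is one possible approach; the similarity measure, the threshold, and the extract_cuisine helper are all hypothetical choices, not the tutorial’s exact script:

```python
def extract_cuisine(query: str) -> str:
    # Naive normalization: "romanian food" -> "romanian" (hypothetical helper).
    return query.lower().replace(" food", "").replace(" cuisine", "").strip()


def tune_synonyms(db, similarity_threshold: float = 0.5) -> None:
    sessions = list(db.group_by_session_id_and_search_query.find())
    converting = [s for s in sessions if s["converted"]]
    others = [s for s in sessions if not s["converted"]]

    for candidate in others:
        for winner in converting:
            # Crude similarity: how much of the converting session's journey
            # the candidate session also completed, in any order.
            overlap = set(candidate["events"]) & set(winner["events"])
            similarity = len(overlap) / len(set(winner["events"]))
            if similarity >= similarity_threshold:
                # Fold the candidate's cuisine into the winner's synonym group.
                db.synonyms.update_one(
                    {"mappingType": "equivalent",
                     "synonyms": extract_cuisine(winner["search_query"])},
                    {"$addToSet": {
                        "synonyms": extract_cuisine(candidate["search_query"])}},
                )
                break
```

Run on a schedule (for example, a cron job or an Atlas trigger), this is what closes the loop: the more your users search and buy, the richer the synonym mappings become.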
There you have it, folks. We’ve taken raw data recorded from our application server and put it to use by building a feedback loop that encourages positive user behavior.
By measuring this feedback loop against your KPIs, you can run simple A/B tests on specific synonym mappings and user patterns to optimize your application!