Query Analytics Part 1: Know Your Queries
Do you know what your users are searching for? What they’re finding? Or not finding?
The quality of search results drives users toward or away from using a service. If you can’t find it, it doesn’t exist… or it may as well not exist. A lack of discoverability leads to a lost customer. A library patron can’t borrow a book they can’t find. The bio-medical researcher won’t glean insights from research papers or genetic information that is not in the search results. If users aren’t finding what they need, expect, or what delights them, they’ll go elsewhere.
As developers, we’ve successfully deployed full-text search into our application. We can clearly see that our test queries are able to match what we expect, and the relevancy of those test queries looks good. But as we know, our users immediately try things we didn’t think to try and account for and will be presented with results that may or may not be useful to them. If you’re selling items from your search results page and “Sorry, no results match your query” comes up, how much money have you not made? Even more insidious are results for common queries that aren’t providing the best results you have to offer; while users get results, there might not be the desired product within quick and easy reach to click and buy now.
Having Atlas Search enabled and in production is really the beginning of your search journey, and the beginning of the value you’ll get out of a well-tuned, and monitored, search engine. Atlas Search provides Query Analytics, giving us actionable insights into the `$search` activity of our Atlas Search indexes.

Note: Query Analytics is available in public preview for all MongoDB Atlas clusters on an M10 or higher running MongoDB v5.0 or higher to view the analytics information for the tracked search terms in the Atlas UI. Atlas Search doesn't track search terms or display analytics for queries on free and shared-tier clusters.
Note: Atlas Search Query Analytics focuses entirely on the frequency and number of results returned from each `$search` call. There are also several search metrics available for operational monitoring of CPU, memory, index size, and other useful data points.
You might be thinking, “Hey, I thought this Atlas Search thing would magically make my search results work well — why aren’t the results as my users expect? Why do some seemingly reasonable queries return no results or not quite the best results?”
Consider these various types of queries of concern:
| Query challenge | Example |
|---|---|
| Common name typos/variations | Jacky Chan, Hairy Potter, Dotcor Suess |
| Relevancy challenged | the purple rain, the the [yes, there’s a band called that], to be or not to be |
| Part numbers, dimensions, measurements | ⅝” driver bit, 1/2" wrench, size nine dress, Q-36, Q36, Q 36 |
| Requests for assistance | Help!, support, want to return a product, how to redeem a gift card, fax number |
| Because you know better | cheap sushi [the user really wants “good” sushi, don’t recommend the cheap stuff], blue shoes [boost the brands you have in stock that make you the most money], best guitar for a beginner |
| Word stems | Find nemo, finds nemo, finding nemo |
| Various languages, character sets, romanization | Flughafen, integraçao, 中文, ko’nichiwa |
| Context, such as location, recency, and preferences | MDB [boost most recent news of this company symbol], pizza [show me nearby and open restaurants] |
Consider the choices we made, or that were dynamically made for us, when we built our Atlas Search index: specifically, the analyzer choice we make per string field. What we indexed determines what is searchable and in what ways it is searchable. A default `lucene.standard`-analyzed field gives us pretty decent, language-agnostic “words” as searchable terms in the index. That’s the default, and not a bad one. However, if your content is in a particular language, it may have structural and syntactic rules that can be incorporated into the index and queries too. If you have part numbers, item codes, license plates, or other types of data that are precisely specified in your domain, users will enter them without the exact special characters, spacing, or case. Often, as developers or domain experts of a system, we don’t try the wrong or almost-correct syntax or format when testing our implementation, but our users do.

With the number of ways that search results can go astray, we need to keep a close eye on what our users are experiencing and carefully tune and improve.
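To make the analyzer trade-off concrete, here is a minimal sketch of an Atlas Search index definition, built as a Python dict, that keeps standard word tokenization for a free-text field but indexes an exact-match field with `lucene.keyword`. The index name and the `description`/`part_number` field names are illustrative assumptions, not from this article.

```python
# Sketch of an Atlas Search index definition as a plain Python dict.
# Assumed field names (`description`, `part_number`) and index name are
# hypothetical; adapt them to your own schema.
index_definition = {
    "name": "parts_index",
    "definition": {
        "mappings": {
            "dynamic": False,
            "fields": {
                "description": {
                    "type": "string",
                    "analyzer": "lucene.standard",  # tokenized into words
                },
                "part_number": {
                    "type": "string",
                    "analyzer": "lucene.keyword",  # whole value as one exact term
                },
            },
        }
    },
}
```

Note the consequence: with `lucene.keyword`, a stored value of “Q-36” matches only the exact string “Q-36”; handling user variants like “Q36” or “q 36” typically calls for a custom analyzer with lowercasing and character filtering.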
Maintaining a healthy search-based system deserves attention to the kinds of challenges just mentioned. A healthy search system management cycle includes these steps:
1. (Re-)deploy search
2. Measure and test
3. Make adjustments
4. Go to 1; repeat
How you go about re-deploying the adjustments will depend on the nature of the changes being made, which could involve index configuration and/or application or query adjustments.
Here’s where the local development environment for Atlas could be useful, as a way to make configuration and app changes in a comfortable local environment, push the changes to a broader staging environment, and then push further into production when ready.
You’ll want to have a process for analyzing the search usage of your system, by tracking queries and their results over time. Tracking queries simply requires the addition of `searchTerms` tracking information to your search queries, as in this template:

```
{
  $search: {
    "index": "<index name>",
    "<operator>": {
      <operator-specification>
    },
    "tracking": {
      "searchTerms": "<term-to-search>"
    }
  }
}
```
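As a concrete instance of that template, here is a small sketch, as a Python function, that builds a tracked `$search` aggregation pipeline. The index name (`default`), the `text` operator, and the `title` path are assumptions for illustration; with `pymongo` you would pass the resulting list to `collection.aggregate()`.

```python
# Build a hypothetical $search pipeline that records the user's raw query
# text for Query Analytics via tracking.searchTerms.
def tracked_search(query_text):
    return [
        {
            "$search": {
                "index": "default",        # assumed index name
                "text": {
                    "query": query_text,
                    "path": "title",       # assumed field
                },
                "tracking": {
                    "searchTerms": query_text,  # what Query Analytics rolls up
                },
            }
        },
        {"$limit": 10},
    ]

pipeline = tracked_search("Jacky Chan")
```

One design note: passing the user's raw, unmodified input as `searchTerms` (even if you normalize the text used by the operator itself) keeps the analytics reports reflective of what users actually typed.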
You’ve measured, twice even, and you’ve spotted a query or class of queries that need some fine-tuning. It’s part art and part science to tune queries, and with a virtuous search query management cycle in place to measure and adjust, you can have confidence that changes are improving the search results for you and your customers.
Now, apply these adjustments, test, repeat, adjust, re-deploy, test... repeat.
So far, we’ve laid the general rationale and framework for this virtuous cycle of query analysis and tuning feedback loop. Let’s now see what actionable insights can be gleaned from Atlas Search Query Analytics.
The Atlas Search Query Analytics feature provides two reports of search activity: All Tracked Search Queries and Tracked Search Queries with No Results. Each report provides the top tracked “search terms” for a selected time period, from the last day up to the last 90 days.
Let’s talk about the significance of each report.
What are the most popular search terms coming through your system over the last month? This report rolls that up for you.
The approximate number of search queries over the selected time period is useful for tracking the traffic through your newly added search box. What does the distribution of queries look like? Are folks always searching for the same small set of terms or phrases, or are the queries all over the place? In the screenshot shown here, because the data is generated daily by a test query runner with a small set of queries, the top ten query terms make up the majority (83.85%) of the queries in the selected time period (the last 30 days).
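If you also export tracked terms into your own logs, the concentration figure above is easy to reproduce. This is a rough sketch, with made-up toy data, of computing what share of all queries the top N terms account for:

```python
# Given a flat log of tracked search terms, compute the fraction of all
# queries covered by the top-n most frequent terms. The sample log below is
# fabricated for illustration.
from collections import Counter

def top_n_share(terms, n=10):
    counts = Counter(terms)
    top = sum(count for _, count in counts.most_common(n))
    return top / len(terms)

log = (
    ["jacky chan"] * 50
    + ["hairy potter"] * 30
    + ["purple rain"] * 20
    + ["q36", "fax number"] * 5
)
share = top_n_share(log, n=10)
```

A heavily concentrated distribution (a few terms dominating) means tuning a handful of queries pays off broadly; a long, flat tail suggests investing in analyzer and relevancy defaults instead.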
For the selected time period, you can drill into the top-tracked search terms.
So while we can rejoice that our app’s search system is getting some usage, we need to be vigilant and check these top-ranked search terms. As you’ll see, a top query term could also be getting no results! So we need to check how our system handles these top queries and adjust based on our findings there. The top search terms report is a starting point of queries worth digging into further. This report allows you to drill into each query (“Search Terms”) with the View link in the right column. The detail view of a query provides the full query pipelines used for those tracked search terms, as shown below.
Ah, so our hot “Jacky Chan” query is getting no results. Why’s that? We’ll answer that in the second part of this article series, as well as discuss what happened with some of those other queries. The other queries shown return zero results because of filtering criteria that were applied by the user (there’s no “Horror” movie matching “bruce lee,” for example). The “View” links on the right side of the report detail the full aggregation pipeline used and deserve attention to see the full context of the query. There could be more to a search request than just the query terms.
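The “zero results because of a filter” situation described above can be sketched as a `compound` `$search` stage: the text clause matches documents, but a user-applied filter clause narrows the candidates to nothing. Index, field names, and the genre value are illustrative assumptions.

```python
# Hypothetical compound $search stage: a "bruce lee" text query combined
# with a user-selected "Horror" genre filter. If no matching document has
# that genre, the stage returns zero results even though the term is popular.
stage = {
    "$search": {
        "index": "default",  # assumed index name
        "compound": {
            "must": [
                {"text": {"query": "bruce lee", "path": ["title", "cast"]}}
            ],
            "filter": [
                {"text": {"query": "Horror", "path": "genres"}}
            ],
        },
        "tracking": {"searchTerms": "bruce lee"},
    }
}
```

This is why the full pipeline in the “View” detail matters: the tracked search term alone (“bruce lee”) looks healthy, and only the surrounding clauses explain the empty result set.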
Analyzing your queries can yield insights that you’d otherwise miss. Underperforming queries keep you and your users from getting where you both want to go. Your users could walk away unsatisfied. You could be missing out on sales or engagement.
Leveraging Atlas Search Query Analytics is a great step forward toward query usage insights and search result improvements. But this is just a starting point. Beyond what users search for and how many results are returned, consider these questions to incorporate into your search system analytics process and infrastructure:

- For a given query, which documents are users engaging with (clicking into)?
- Are there queries that returned results but got no engagement once presented?
- Are users converting searches into purchases?
Stay Tuned - Query Analytics Part 2: Tuning the System is coming soon!
We’d love your feedback on the preview of this feature! Please let us know using the MongoDB Feedback Engine.