MongoDB
MongoDB Developer Center
chevron-right
Developer Topics
chevron-right
Products
chevron-right
MongoDB
chevron-right

Case-Insensitive Queries Without Case-Insensitive Indexes

Lauren Schaefer, Daniel CoupalPublished Feb 12, 2022 • Updated May 31, 2022
MongoDBSchema
facebook icontwitter iconlinkedin icon
random alt
Rate this article
star-empty
star-empty
star-empty
star-empty
star-empty
We've reached the sixth and final (at least for now) MongoDB schema design anti-pattern. In the first five posts in this series, we've covered the following anti-patterns.
Today, we'll explore the wonderful world of case-insensitive indexes. Not having a case-insensitive index can create surprising query results and/or slow queries...and make you hate everything.
Ron Swanson says 'I hate everything'
Once you know the details of how case-insensitive queries work, the implementation is fairly simple. Let's dive in!
Check out the video above to see the case-insensitive queries and indexes in action.

Case-Insensitive Queries Without Case-Insensitive Indexes

MongoDB supports three primary ways to run case-insensitive queries.
First, you can run a case-insensitive query using
$regex
with the i option. These queries will give you the expected case-insensitive results. However, queries that use $regex cannot efficiently utilize case-insensitive indexes, so these queries can be very slow depending on how much data is in your collection.
Second, you can run a case-insensitive query by creating a
case-insensitive index
(meaning it has a
collation strength
of 1 or 2) and running a query with the same collation as the index. A
collation
defines the language-specific rules that MongoDB will use for string comparison. Indexes can optionally have a
collation with a strength that ranges from 1 to 5
. Collation strengths of 1 and 2 both give you case-insensitivity. For more information on the differences in collation strengths, see the
MongoDB docs
. A query that is run with the same collation as a case-insensitive index will return case-insensitive results. Since these queries are
covered by indexes
, they execute very quickly.
Third, you can run a case-insensitive query by setting the default collation strength for queries and indexes to a strength of 1 or 2 when you create a collection. All queries and indexes in a collection automatically use the default collation unless you specify otherwise when you execute a query or create an index. Therefore, when you set the default collation to a strength of 1 or 2, you'll get case-insensitive queries and indexes by default. See the collation option in the
db.createCollection() section of the MongoDB Docs
for more details.
Warning for queries that do not use $regex: Your index must have a collation strength of 1 or 2 and your query must use the same collation as the index in order for your query to be case-insensitive.
You can use
MongoDB Compass
(MongoDB's desktop GUI) or the
MongoDB Shell
(MongoDB's command-line tool) to test if a query is returning the results you'd expect, see its execution time, and determine if it's using an index.

Example

Let's revisit the example we saw in the
Unnecessary Indexes Anti-Pattern
and the
Bloated Documents Anti-Pattern
posts. Leslie is creating a website that features inspirational women. She has created a database with information about 4,700+ inspirational women. Below are three documents in her InspirationalWomen collection.
Leslie decides to add a search feature to her website since the website is currently difficult to navigate. She begins implementing her search feature by creating an index on the first_name field. Then she starts testing a query that will search for women named "Harriet."
Leslie executes the following query in the MongoDB Shell:
She is surprised to only get one document returned since she has two Harriets in her database: Harriet Tubman and Harriet Beecher Stowe. She realizes that Harriet Beecher Stowe's name was input in all uppercase in her database. Her query is case-sensitive, because it is not using a case-insensitive index.
Leslie runs the same query with
.explain("executionStats")
to see what is happening.
The Shell returns the following output.
She can see that the winningPlan is using an IXSCAN (index scan) with her first_name_1 index. In the executionStats, she can see that only one index key was examined (executionStats.totalKeysExamined) and only one document was examined (executionStats.totalDocsExamined). For more information on how to interpret the output from .explain(), see
Analyze Query Performance
.
Leslie opens Compass and sees similar results.
MongoDB Compass shows that the query is examining only one index key, examining only one document, and returning only one document. It also shows that the query used the first_name_1 index.
MongoDB Compass shows that the query is examining only one index key, examining only one document, and returning only one document. It also shows that the query used the first_name_1 index.
Leslie wants all Harriets—regardless of what lettercase is used—to be returned in her query. She updates her query to use $regex with option i to indicate the regular expression should be case-insensitive. She returns to the Shell and runs her new query:
This time she gets the results she expects: documents for both Harriet Tubman and Harriet Beecher Stowe. Leslie is thrilled! She runs the query again with .explain("executionStats") to get details on her query execution. Below is what the Shell returns:
She can see that this query, like her previous one, uses an index (IXSCAN). However, since $regex queries cannot efficiently utilize case-insensitive indexes, she isn't getting the typical benefits of a query that is covered by an index. All 4,704 index keys (executionStats.totalKeysExamined) are being examined as part of this query, resulting in a slightly slower query (executionStats.executionTimeMillis: 3) than one that fully utilizes an index.
She runs the same query in Compass and sees similar results. The query is using her first_name_1 index but examining every index key.
MongoDB Compass shows that the query is returning two documents as expected. The $regex query is using the first_name_1 index but examining every index key.
MongoDB Compass shows that the query is returning two documents as expected. The $regex query is using the first_name_1 index but examining every index key.
Leslie wants to ensure that her search feature runs as quickly as possible. She uses Compass to create a new case-insensitive index named first_name-case_insensitive. (She can easily create indexes using other tools as well like the Shell or
MongoDB Atlas
or even programmatically.) Her index will be on the first_name field in ascending order and use a custom collation with a locale of en and a strength of 2. Recall from the
previous section
that the
collation strength
must be set to 1 or 2 in order for the index to be case-insensitive.
Creating a new index in MongoDB Compass with a custom collation that has a locale of en and a strength of 2.
Creating a new index in MongoDB Compass with a custom collation that has a locale of en and a strength of 2.
Leslie runs a query very similar to her original query in the Shell, but this time she specifies the collation that matches her newly-created index:
This time she gets both Harriet Tubman and Harriet Beecher Stowe. Success!
She runs the query with .explain("executionStats") to double check that the query is using her index:
The Shell returns the following results.
Leslie can see that the winning plan is executing an IXSCAN (index scan) that uses the case-insensitive index she just created. Two index keys (executionStats.totalKeysExamined) are being examined, and two documents (executionStats.totalDocsExamined) are being examined. The query is executing in 0 ms (executionStats.executionTimeMillis: 0). Now that's fast!
Leslie runs the same query in Compass and specifies the collation the query should use.
Query with collation in Compass.
She can see that the query is using her case-insensitive index and the query is executing in 0 ms. She's ready to implement her search feature. Time to celebrate!
Leslie dances.
Note: Another option for Leslie would have been to set the default collation strength of her InspirationalWomen collection to 1 or 2 when she created her collection. Then all of her queries would have returned the expected, case-insensitive results, regardless of whether she had created an index or not. She would still want to create indexes to increase the performance of her queries.

Summary

You have three primary options when you want to run a case-insensitive query:
  1. Use $regex with the i option. Note that this option is not as performant because $regex cannot fully utilize case-insensitive indexes.
  2. Create a case-insensitive index with a collation strength of 1 or 2, and specify that your query uses the same collation.
  3. Set the default collation strength of your collection to 1 or 2 when you create it, and do not specify a different collation in your queries and indexes.
Alternatively,
MongoDB Atlas Search
can be used for more complex text searches.
This post is the final anti-pattern we'll cover in this series. But, don't be too sad—this is not the final post in this series. Be on the lookout for the next post where we'll summarize all of the anti-patterns and show you a brand new feature in MongoDB Atlas that will help you discover anti-patterns in your database. You won't want to miss it!
When you're ready to build a schema in MongoDB, check out
MongoDB Atlas
, MongoDB's fully managed database-as-a-service. Atlas is the easiest way to get started with MongoDB and has a generous, forever-free tier.
Check out the following resources for more information:

Copy Link
facebook icontwitter iconlinkedin icon
Rate this article
star-empty
star-empty
star-empty
star-empty
star-empty
Related
Article
Paginations 2.0: Why I Would Choose MongoDB

May 19, 2022
Article
Generating MQL Shell Commands Using OpenAI and New mongosh Shell

May 10, 2022
Article
MongoDB & Mongoose: Compatibility and Comparison

May 12, 2022
Article
Massive Arrays

May 31, 2022
Table of Contents