Unnecessary Indexes
Lauren Schaefer, Daniel Coupal • 4 min read • Published Feb 12, 2022 • Updated May 31, 2022
So far in this MongoDB Schema Design Anti-Patterns series, we've discussed avoiding massive arrays as well as a massive number of collections.
Today, let's talk about indexes. Indexes are great (seriously!), but it's easy to get carried away and make indexes that you'll never actually use. Let's examine why an index may be unnecessary and what the consequences of keeping it around are.
Before we go any further, we want to emphasize that indexes are good. Indexes allow MongoDB to efficiently query data. If a query does not have an index to support it, MongoDB performs a collection scan, meaning that it scans every document in a collection. Collection scans can be very slow. If you frequently execute a query, make sure you have an index to support it.
Now that we have an understanding that indexes are good, you might be wondering, "Why are unnecessary indexes an anti-pattern? Why not create an index on every field just in case I'll need it in the future?"
We've discovered three big reasons why you should remove unnecessary indexes:
- Indexes take up space. Each index is at least 8 kB and grows with the number of documents associated with it. Thousands of indexes can begin to drain resources.
- Indexes can impact the storage engine's performance. As we discussed in the previous post in this series about the Massive Number of Collections Anti-Pattern, the WiredTiger storage engine (MongoDB's default storage engine) stores a file for each collection and for each index. WiredTiger will open all files upon startup, so performance will decrease when an excessive number of collections and indexes exist.
- Indexes can impact write performance. Whenever a document is created, updated, or deleted, any index associated with that document must also be updated. These index updates negatively impact write performance.
In general, we recommend limiting your collection to a maximum of 50 indexes.
To avoid the anti-pattern of unnecessary indexes, examine your database and identify which indexes are truly necessary. Unnecessary indexes typically fall into one of two categories:
- The index is rarely used or not used at all.
- The index is redundant because another compound index covers it.
Consider Leslie from the incredible TV show Parks and Recreation. Leslie often looks to other powerful women for inspiration.
Let's say Leslie wants to inspire others, so she creates a website about her favorite inspirational women. The website allows users to search by full name, last name, or hobby.
Leslie chooses to use MongoDB Atlas to create her database. She creates a collection named InspirationalWomen. Inside of that collection, she creates a document for each inspirational woman. Below is a document she created for Sally Ride.
Leslie eats several sugar-filled Nutriyum bars, and, riding her sugar high, creates an index for every field in her collection.
She also creates a compound index on the last_name and first_name fields, so that users can search by full name. Counting the default index on _id, Leslie now has one collection with eight indexes:
_id is indexed by default
{ first_name: 1 }
{ last_name: 1 }
{ birthday: 1 }
{ occupation: 1 }
{ quote: 1 }
{ hobbies: 1 }
{ last_name: 1, first_name: 1}
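As a rough sketch of how indexes like these could be created from mongosh: the helper name createIndexes is hypothetical, and the code assumes a collection handle that exposes createIndex(keys), as db.InspirationalWomen does in mongosh. With the Node.js driver, each call would return a Promise that you would need to await.

```javascript
// Index key specifications from the list above. The _id index is created
// automatically by MongoDB, so it is not listed here.
const indexSpecs = [
  { first_name: 1 },
  { last_name: 1 },
  { birthday: 1 },
  { occupation: 1 },
  { quote: 1 },
  { hobbies: 1 },
  { last_name: 1, first_name: 1 },
];

// Hypothetical helper: creates one index per key specification and returns
// whatever createIndex returns for each (in mongosh, the index name).
// `collection` is assumed to expose createIndex(keys).
function createIndexes(collection, specs) {
  return specs.map((keys) => collection.createIndex(keys));
}

// In mongosh, usage might look like:
//   createIndexes(db.InspirationalWomen, indexSpecs);
```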
Leslie launches her website and is excited to be helping others find inspiration. Users are discovering new role models as they search by full name, last name, and hobby.
Leslie decides to fine-tune her database and wonders if all of those indexes she created are really necessary.
She opens the Atlas Data Explorer and navigates to the Indexes pane. She can see that the only two indexes that are being used are the compound index named last_name_1_first_name_1 and the hobbies_1 index. She realizes that this makes sense.
Her queries for inspirational women by full name are covered by the last_name_1_first_name_1 index. Additionally, her query for inspirational women by last name is covered by the same last_name_1_first_name_1 compound index since the index has a last_name prefix. Her queries for inspirational women by hobby are covered by the hobbies_1 index. Since those are the only ways that users can query her data, the other indexes are unnecessary.
In the Data Explorer, Leslie has the option of dropping all of the other unnecessary indexes. Since MongoDB requires an index on the _id field, she cannot drop this index.
In addition to using the Data Explorer, Leslie also has the option of using MongoDB Compass to check for unnecessary indexes. When she navigates to the Indexes pane for her collection, she can once again see that the last_name_1_first_name_1 and the hobbies_1 indexes are the only indexes being used regularly. Just as she could in the Atlas Data Explorer, Leslie has the option of dropping each of the indexes except for _id.
Leslie decides to drop all of the unnecessary indexes. After doing so, her collection now has the following indexes:
_id is indexed by default
{ hobbies: 1 }
{ last_name: 1, first_name: 1}
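The same cleanup could also be scripted instead of clicked through. The sketch below is one possible approach, not the method the article's authors used: dropAllExcept is a hypothetical helper, and it assumes a collection handle exposing getIndexes() and dropIndex(name), as collections do in mongosh.

```javascript
// Names of the indexes Leslie wants to keep.
const indexesToKeep = new Set(["hobbies_1", "last_name_1_first_name_1"]);

// Hypothetical helper: drops every index whose name is not in `keepNames`
// and returns the names it dropped. The default index on _id (named "_id_")
// is always skipped because MongoDB does not allow dropping it.
function dropAllExcept(collection, keepNames) {
  const dropped = [];
  for (const index of collection.getIndexes()) {
    if (index.name === "_id_" || keepNames.has(index.name)) {
      continue;
    }
    collection.dropIndex(index.name);
    dropped.push(index.name);
  }
  return dropped;
}

// In mongosh, usage might look like:
//   dropAllExcept(db.InspirationalWomen, indexesToKeep);
```

Dropping an index is not easily undone on a large collection, since rebuilding it takes time, so it is worth reviewing the returned list on a test deployment first.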
Creating indexes that support your queries is good. Creating unnecessary indexes is generally bad.
Unnecessary indexes reduce performance and take up space. An index is considered to be unnecessary if (1) it is not frequently used by a query or (2) it is redundant because another compound index covers it.
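To make the second case concrete, here is a minimal sketch of a prefix check over index key specifications. The function names are hypothetical, and the check looks only at the keys: it assumes plain indexes and ignores options such as unique, sparse, partial filters, or TTL, any of which can make an apparently redundant index worth keeping.

```javascript
// Returns true when `candidate` is a strict prefix of `other`: the same
// fields, in the same order, with the same sort direction, and `other`
// has at least one additional field.
function isPrefixOf(candidate, other) {
  const a = Object.entries(candidate);
  const b = Object.entries(other);
  return (
    a.length < b.length &&
    a.every(([field, direction], i) => b[i][0] === field && b[i][1] === direction)
  );
}

// Returns the key specifications that are prefixes of some other
// specification in the same list.
function findRedundantIndexes(specs) {
  return specs.filter((spec) => specs.some((other) => isPrefixOf(spec, other)));
}

const specs = [
  { first_name: 1 },
  { last_name: 1 },
  { hobbies: 1 },
  { last_name: 1, first_name: 1 },
];

// { last_name: 1 } is a prefix of { last_name: 1, first_name: 1 }.
// { first_name: 1 } is not, because first_name is the second field.
const redundant = findRedundantIndexes(specs);
```

Note that { first_name: 1 } is not flagged as redundant here. In Leslie's case it is unnecessary for the other reason: no query uses it.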
You can use the Atlas Data Explorer or MongoDB Compass to help you discover how frequently your indexes are being used. When you discover an index is unnecessary, remove it.
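If you prefer the shell, the $indexStats aggregation stage reports similar usage information, including an accesses.ops counter for each index. The sketch below filters that output for rarely used indexes. The helper name and the threshold are assumptions for illustration, and the counters reset when the server restarts and are tracked per node, so judge them over a representative period.

```javascript
// Hypothetical helper: given the documents returned by the $indexStats
// stage, returns the names of indexes used fewer than `minOps` times.
// The default _id index is skipped because it cannot be dropped.
// Number() is used because accesses.ops may arrive as a 64-bit Long.
function findRarelyUsedIndexes(indexStats, minOps) {
  return indexStats
    .filter((stat) => stat.name !== "_id_")
    .filter((stat) => Number(stat.accesses.ops) < minOps)
    .map((stat) => stat.name);
}

// In mongosh, usage might look like:
//   const stats = db.InspirationalWomen.aggregate([{ $indexStats: {} }]).toArray();
//   findRarelyUsedIndexes(stats, 1);
```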
Be on the lookout for the next post in this anti-patterns series!