What determines which index is used when multiple indexes (with common fields) are defined on a collection?

Let's say I have 3 indexes defined on a collection:

  1. (user_email, 1)
  2. (list_type, 1)
  3. (user_email, 1) + (list_type, 1)

Considering that there can be multiple documents with the same user_email, and that list_type has only 2 values (“bulk” and “single”), if I run the query:

find({user_email: "johndoe@example.com", list_type: "bulk"})

Which one of the indexes would be used? And on what basis does mongo determine which one to use?

Hello @Amin_Memon, here are some clarifications.

Which one of the indexes would be used?

In theory, any of the indexes could be used. If there were also a compound index on (list_type, 1) + (user_email, 1), it would be a candidate too.

In your case, ideally, a compound index on the two fields should be created, and that is the one that would be used. Since the two fields can be ordered in two ways in a compound index, you need to determine which order is best, and that depends on your data. For a given query on both fields, find how many documents each field selects on its own. The best option is the order whose first field selects the fewest documents. See Query Selectivity for details.
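
As a rough illustration of that selectivity check, here is a minimal sketch in plain JavaScript. It works on a small in-memory array rather than a real collection, so the sample documents and the helper names `countMatching` and `orderBySelectivity` are made up for this example. Against real data you would run count queries on each field instead.

```javascript
// Hypothetical sample documents standing in for the collection.
const docs = [
  { user_email: "johndoe@example.com", list_type: "bulk" },
  { user_email: "johndoe@example.com", list_type: "single" },
  { user_email: "janedoe@example.com", list_type: "bulk" },
  { user_email: "alice@example.com", list_type: "bulk" },
  { user_email: "bob@example.com", list_type: "single" },
  { user_email: "carol@example.com", list_type: "bulk" },
];

// Count the documents that match an equality condition on a single field.
function countMatching(documents, field, value) {
  return documents.filter((doc) => doc[field] === value).length;
}

// Order the query's fields from most selective (fewest matches) to least.
function orderBySelectivity(documents, query) {
  return Object.keys(query)
    .map((field) => ({
      field,
      matches: countMatching(documents, field, query[field]),
    }))
    .sort((a, b) => a.matches - b.matches);
}

const ranking = orderBySelectivity(docs, {
  user_email: "johndoe@example.com",
  list_type: "bulk",
});
console.log(ranking);
```

With this sample data, user_email matches fewer documents than list_type, which suggests putting user_email first in the compound index. Your own data may of course point the other way.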

In practice, the way to find out which index is used is to generate a query plan for the find query with the explain() method. You can analyze the plan to see which index was used and how, along with statistics such as how long the query took to execute. See Analyze Query Performance for details.
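
To make reading the plan a bit more concrete, here is a small sketch in plain JavaScript that pulls the index names out of an explain-style document. In the shell you would get the real output with something like `db.lists.find({...}).explain("executionStats")`, where `lists` is an assumed collection name. The `sampleExplain` object below is a hand-written, abbreviated stand-in with made-up values, not real server output, and the helpers `indexesUsed` and `summarizeExplain` are names invented for this example. The exact output layout can also differ between server versions, so treat this as a starting point.

```javascript
// Walk a plan tree and collect the names of the indexes its IXSCAN stages use.
function indexesUsed(plan) {
  if (!plan) return [];
  const names =
    plan.stage === "IXSCAN" && plan.indexName ? [plan.indexName] : [];
  // A stage has either a single child (inputStage) or several (inputStages).
  const children = plan.inputStage
    ? [plan.inputStage]
    : plan.inputStages || [];
  return names.concat(...children.map(indexesUsed));
}

// Summarize which indexes the winning and rejected plans rely on.
function summarizeExplain(explainOutput) {
  const planner = explainOutput.queryPlanner;
  return {
    winning: indexesUsed(planner.winningPlan),
    rejected: (planner.rejectedPlans || []).map(indexesUsed),
  };
}

// Hand-written stand-in for explain output (hypothetical values; which plan
// "wins" here is illustrative only).
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "user_email_1_list_type_1" },
    },
    rejectedPlans: [
      { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "user_email_1" } },
      { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "list_type_1" } },
    ],
  },
};

const summary = summarizeExplain(sampleExplain);
console.log(summary);
```

If the winning plan shows a COLLSCAN stage instead of an IXSCAN, no index was used at all, and the `winning` list here would come back empty.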


And on what basis does mongo determine which one to use?

MongoDB’s query optimizer generates query plans for all the indexes that could be used for a given query. In the example scenario, all three indexes are possible candidates.

The optimizer runs each candidate plan for a trial period and chooses the best-performing one (based on factors such as which plan returned the most results for the least amount of work). It then reports the winning plan and the rejected plans, which you can see in the explain output.

MongoDB caches the winning plan for a given query shape. This way, the candidate plans do not have to be generated and compared against each other every time a query with that shape is executed.
