Mongo query planner using inefficient index

Hi all,

This is my query with explained Statistics.

> db.pt.holdings_aggregation.find(
... {
...     "author": ObjectId("6223bdd4dbd482bef8cd5c4d"),
...     "company": ObjectId("6223bdd4dbd482bef8cd5c3f"),
...     "ticker": "ABCR",
...     "cusip": "000752105",
...     "as_of_date": {"$gte":ISODate('2020-06-03')},
...     "issuer": "ABC-NACO Inc.",
...     "holdings_type": "eq",
... }).explain("executionStats")["executionStats"];

{
	"executionSuccess" : true,
	"nReturned" : 2563,
	**"executionTimeMillis" : 2675,**
	"totalKeysExamined" : 402194,
	"totalDocsExamined" : 402194,
	"executionStages" : {
		"stage" : "FETCH",
		"filter" : {
			"$and" : [
				{
					"author" : {
						"$eq" : ObjectId("6223bdd4dbd482bef8cd5c4d")
					}
				},
				{
					"company" : {
						"$eq" : ObjectId("6223bdd4dbd482bef8cd5c3f")
					}
				},
				{
					"cusip" : {
						"$eq" : "000752105"
					}
				},
				{
					"holdings_type" : {
						"$eq" : "eq"
					}
				},
				{
					"issuer" : {
						"$eq" : "ABC-NACO Inc."
					}
				},
				{
					"as_of_date" : {
						"$gte" : ISODate("2020-06-03T00:00:00Z")
					}
				}
			]
		},
		"nReturned" : 2563,
		"executionTimeMillisEstimate" : 2650,
		"works" : 402195,
		"advanced" : 2563,
		"needTime" : 399631,
		"needYield" : 0,
		"saveState" : 3157,
		"restoreState" : 3157,
		"isEOF" : 1,
		"invalidates" : 0,
		"docsExamined" : 402194,
		"alreadyHasObj" : 0,
		"inputStage" : {
			"stage" : "IXSCAN",
			"nReturned" : 402194,
			"executionTimeMillisEstimate" : 50,
			"works" : 402195,
			"advanced" : 402194,
			"needTime" : 0,
			"needYield" : 0,
			"saveState" : 3157,
			"restoreState" : 3157,
			"isEOF" : 1,
			"invalidates" : 0,
			"keyPattern" : {
				"ticker" : 1
			},
			"indexName" : "ticker_1",
			"isMultiKey" : false,
			"multiKeyPaths" : {
				"ticker" : [ ]
			},
			"isUnique" : false,
			"isSparse" : false,
			"isPartial" : false,
			"indexVersion" : 2,
			"direction" : "forward",
			"indexBounds" : {
				"ticker" : [
					"[\"ABCR\", \"ABCR\"]"
				]
			},
			"keysExamined" : 402194,
			"seeks" : 1,
			"dupsTested" : 0,
			"dupsDropped" : 0,
			"seenInvalidated" : 0
		}
	}
}

As you can see above, Execution Time of query is 2675 milli seconds.

Now below, I will be using hint to force this query to use Compound Index.

> db.pt.holdings_aggregation.find(
... {
...     "author": ObjectId("6223bdd4dbd482bef8cd5c4d"),
...     "company": ObjectId("6223bdd4dbd482bef8cd5c3f"),
...     "ticker": "ABCR",
...     "cusip": "000752105",
...     "as_of_date": {"$gte":ISODate('2020-06-03')},
...     "issuer": "ABC-NACO Inc.",
...     "holdings_type": "eq",
... }).hint("author_1_company_1_ticker_1_cusip_1_as_of_date_1").explain("executionStats")["executionStats"];
{
	"executionSuccess" : true,
	"nReturned" : 2563,
	**"executionTimeMillis" : 358,**
	"totalKeysExamined" : 27343,
	"totalDocsExamined" : 27343,
	"executionStages" : {
		"stage" : "FETCH",
		"filter" : {
			"$and" : [
				{
					"holdings_type" : {
						"$eq" : "eq"
					}
				},
				{
					"issuer" : {
						"$eq" : "ABC-NACO Inc."
					}
				}
			]
		},
		"nReturned" : 2563,
		"executionTimeMillisEstimate" : 350,
		"works" : 27344,
		"advanced" : 2563,
		"needTime" : 24780,
		"needYield" : 0,
		"saveState" : 213,
		"restoreState" : 213,
		"isEOF" : 1,
		"invalidates" : 0,
		"docsExamined" : 27343,
		"alreadyHasObj" : 0,
		"inputStage" : {
			"stage" : "IXSCAN",
			"nReturned" : 27343,
			"executionTimeMillisEstimate" : 10,
			"works" : 27344,
			"advanced" : 27343,
			"needTime" : 0,
			"needYield" : 0,
			"saveState" : 213,
			"restoreState" : 213,
			"isEOF" : 1,
			"invalidates" : 0,
			"keyPattern" : {
				"author" : 1,
				"company" : 1,
				"ticker" : 1,
				"cusip" : 1,
				"as_of_date" : 1
			},
			"indexName" : "author_1_company_1_ticker_1_cusip_1_as_of_date_1",
			"isMultiKey" : false,
			"multiKeyPaths" : {
				"author" : [ ],
				"company" : [ ],
				"ticker" : [ ],
				"cusip" : [ ],
				"as_of_date" : [ ]
			},
			"isUnique" : false,
			"isSparse" : false,
			"isPartial" : false,
			"indexVersion" : 2,
			"direction" : "forward",
			"indexBounds" : {
				"author" : [
					"[ObjectId('6223bdd4dbd482bef8cd5c4d'), ObjectId('6223bdd4dbd482bef8cd5c4d')]"
				],
				"company" : [
					"[ObjectId('6223bdd4dbd482bef8cd5c3f'), ObjectId('6223bdd4dbd482bef8cd5c3f')]"
				],
				"ticker" : [
					"[\"ABCR\", \"ABCR\"]"
				],
				"cusip" : [
					"[\"000752105\", \"000752105\"]"
				],
				"as_of_date" : [
					"[new Date(1591142400000), new Date(9223372036854775807)]"
				]
			},
			"keysExamined" : 27343,
			"seeks" : 1,
			"dupsTested" : 0,
			"dupsDropped" : 0,
			"seenInvalidated" : 0
		}
	}
}

This query’s execution time is 358.

Any clue, Why Mongo is not automatically using compound index as execution time difference is huge.

Hi @Rajat_Goyal ,

MongoDB uses an empirical method to compare the indexes and might get a wrong index if by emperical sampling it yeilds same or better results and than get cached.

What you can do is to try and clear query cache for this query:

See if that lets you choose the correct index…

Do you need the ticker index on its own? Another possibility is to remove this index to force the query to use the other index…

Thanks
Pavel

Welcome to the MongoDB Community Forums @Rajat_Goyal !

For some more context on your environment can you also share:

  • the specific version of MongoDB server as reported by db.version() in the MongoDB shell

  • the total number of docs via db.pt.holdings_aggregation.estimatedDocumentCount()

  • the relevant index definitions

  • whether this deployment is a standalone, replica set, or sharded cluster

Thanks,
Stennie

1 Like

Yes, Other indexes are required for other queries.
I did try clearing cache for query planner for whole collection.
Although I did run the same query using hint several times, still without hint it picks other indexes.
For now, I am unsure if I should continue with explicitly providing hint method for better results.

I created this dummy data through scripts for performance analysis.
I am using:
DB Version : 3.6.23
Collection Count: 100009269 ( 100 Million Documents)
Local Standalone Environment with optimal CPU and SSD.

Hi @Rajat_Goyal ,

You can use hint or you can configure the following cachePlanSetFilter: https://docs.mongodb.com/reference/command/planCacheSetFilter/#plancachesetfilter

This will force the optimizer to evaluate only specific indexes for a query

Now it is not persistent through a restart of the node.

Please note that 3.6 is no longer supported and is end of life… I suggest that you upgrade your database to a latest version to get optimised improved index selections

Thanks
Pavel