Covered Queries with multikey and regular expression

Rosy · January 4, 2024, 2:19am

Hello,

I have below 2 single indexes:
Index 1 - {“merchant.item.sku”:1}
Index 2 - {“merchant.item.name”:1}

Original Query used by Index 2 is {“merchant.item.name”:/^STA/} .

Now, I am trying to combine these multikey indexes in one and have a covered query.

New Index:

{“merchant.item.sku”:1, “merchant.item.name”:1}

But the catch is I don’t have any specific value available for “merchant.item.sku”

I tried below but surely $ne not taking up with the index above.
{“merchant.item.sku”:{$ne:null}, “merchant.item.name”:/^STA/}

However if I update it to {“merchant.item.sku”:{$ne:null}, “merchant.item.name”:“STA”} removing regular expression it still picks the index.

But I need name to be a regular expression. How can I achieve this? Please suggest!

Kaushal_Kishor · January 4, 2024, 3:56am

Hello , can you try with Wildcard index i.e. ({ “merchant.$**” : 1 })

Artur_57972 · January 4, 2024, 11:44am

Covered queries won’t work in multikey indexes when the fields to be covered are in an array, you can check this in this part of the MongoDB documentation.

Rosy · January 4, 2024, 1:50pm

Thanks for your reply.

“merchant.item” is an array of objects and has around 30 fields inside. Also, documents are in millions in the collection.
I am not quite sure if adding wildcard indexes will be performance efficient. What you suggest?

Rosy · January 4, 2024, 1:56pm

Thanks @Artur_57972
However for below scenarios updated index was picked as seen in explain plan.

Giving name value which leads to $eq instead of regular expression:
{“merchant.item.sku”:{$ne:null}, “merchant.item.name”:“STANLEY”}
Giving value for sku (random value gave just to test as exact value is not known)
{“merchant.item.sku”:“2323”, “merchant.item.name”:/^STA/}

Here “merchant.item” is actually array of objects.

Artur_57972 · January 4, 2024, 2:13pm

The index will be used, but the query won’t be of type Covered Query

Rosy · January 4, 2024, 2:18pm

Okay. Got it.

But in this case index is not even getting picked
{“merchant.item.sku”:{$ne:null}, “merchant.item.name”:/^STA/}

Is there any way I can get the updated index being picked up by updating this query.
Its just that “merchant.item.name” has to be regular expr.

Artur_57972 · January 4, 2024, 2:24pm

You can use the hint option to force the query to use a specific index. In this page you can find some information about regex and index use.

Rosy · January 4, 2024, 5:17pm

Yes, I tried this option but there is no performance improvement which is expected after applying index as it still scans all documents.

Artur_57972 · January 4, 2024, 5:27pm

Can you provide some example documents that I can try on my machine?

Rosy · January 4, 2024, 9:12pm

Sharing 1 sample document:

{
record: “MER_DTLS”,
“merchant”:{
“merchantId”: “MER12435”,
“merchantName”: “Merchant Adam”,

	"item":[
	{
		"sku": "SKU1",
		"name": "Name 1 of the item",
		"referenceNo": "REF1235",
		"details" :"SKU 1 details to be entered",
		"uuid" : "76743647wererererr545",
		"currency": "CAD",
		...
		...
		...
	},
	{
		"sku": "SKU 2",
		"name": "Name 2 of the item",
		"referenceNo": "REF56574684",
		"details" :"SKU 2 details to be entered",
		"uuid" : "yty20244758yuu-999",
		"currency": "CAD",
		...
		...
		...
	},
	{
		...
		...
	}
],
"comments": "Hi I am new merchant",
}

}

Kaushal_Kishor · January 4, 2024, 9:14pm

I would suggest try this option and see if the index being used or not . if does not help , you can drop… as per the documentation , this index option for array of objects only.

Please let me know if it works or does not work.

Artur_57972 · January 5, 2024, 3:49pm

I executed the following code in the Mongosh, and it worked, the find operation used the index. You can try it by just copying and paste to your Mongosh.

const database = "test";
const collection = "merchants";

use(database);

db.dropDatabase();

db.getCollection(collection).createIndex({ "merchant.item.sku": 1, "merchant.item.name": 1 });

db.getCollection(collection).insertMany([
  {
    record: "MER_DTLS",
    merchant: {
      id: "MER12345",
      name: "Merchant Adam",
      item: [{ sku: "SKU0", name: "Name 0 of the item" }],
    },
  },
  {
    record: "MER_ABCD",
    merchant: {
      id: "MER67890",
      name: "Merchant Suzy",
      item: [{ sku: "SKU2", name: "STAN" }],
    },
  },
]);

db.getCollection(collection)
  .find({
    "merchant.item.sku": { $exists: true, $ne: null },
    "merchant.item.name": { $regex: /^STA/ },
  })
  .explain("executionStats");

Rosy · January 10, 2024, 4:39pm

Thanks @Artur_57972 your code worked.

But the performance wise it ended up almost same to COLLSCAN.

Rosy · January 10, 2024, 4:41pm

Thanks @Kaushal_Kishor
This one worked. I still need to check further on performance and also how it will impact existing related indexes.
Will keep you posted.

Rosy · January 11, 2024, 5:54pm

I have a question.
What is the difference in wildcard indexes for “merchant.$” and “merchant.item.$”
During find, both indexes will look for path “merchant.item" and “merchant.item.name”
or
“merchant.item" and “merchant.item.sku”

So why we selected “merchant.$” and not this “merchant.item.$”

Artur_57972 · January 12, 2024, 3:53pm

The improvement in performance will be significant when you have enough documents so it doesn’t fit on the server cache, for small data sets, the time taken to execute the query will be very close. To have a better idea of the performance of a query, you can take a look at this page of the documentation.