My data is very dynamic with the possibility to add custom attributes and events.
This is how my dataset looks like:
{
"_id" : ObjectId("5eccf898ac7ff694845f1ccf"),
"attributes" : [
{
"k" : "first_name",
"v" : "John"
},
{
"k" : "last_name",
"v" : "Doe"
},
{
"k" : "email",
"v" : "john.doe@example.net"
},
{
"k" : "gender",
"v" : "Male"
}
],
"events" : {
"event" : "add_to_cart",
"event_data" : [
{
"k" : "product_name",
"v" : "T-Shirt"
},
{
"k" : "price",
"v" : 25
},
{
"k" : "variants",
"v" : [
{
"k" : "color",
"v" : "red"
},
{
"k" : "size",
"v" : "xl"
},
{
"k" : "matherials",
"v" : [
[
{
"k" : "name",
"v" : "Cotton"
}
],
[
{
"k" : "name",
"v" : "Wool"
}
]
]
}
]
}
]
},
"created_at" : "2020-05-25 16:12:58",
"updated_at" : "2020-05-25 16:12:58"
}
I definitely can create the following index:
db.clients.ensureIndex({"events.event_data.k" : 1, "events.event_data.v" : 1 })
and it performs great. However, if you take a look at my dataset, you’ll notice that values can be very nested (material e.g.).
This query works great:
db.clients.find({
"events.event_data": {
"$elemMatch": {
"k": "product_name",
"v": "T-Shirt"
}
}
})
anyway, when I need to query a deeper level then it scans the whole size. Here is how to scan clients who bought cotton T-Shirt:
db.clients.find({
"events.event_data.v.v":
{
"$elemMatch": {
"$elemMatch" : {
"k": "name", "v": "Cotton"
}
}
}
})
However in this case it does COLLSCAN which is, obviously, something I would like to avoid?
Thank you.