Hi all!
I have a collection with rather large documents (265,000 documents occupying 78 GB).
Users can query documents using any combination and order of search fields.
Creating an index takes a long time, because the server has to move all 78 GB of documents through memory.
To avoid restricting users to only certain combinations of search fields, I came up with the following concept:
Add a metaInfos array to each document.
In it, create one object of the form { name: "", value: "" } for each field that can occur in a search.
Create a SINGLE compound index on metaInfos.name + metaInfos.value.
Query any combination of search fields like this:

{
    "$and" : [
        {
            "metaInfos" : {
                "$elemMatch" : {
                    "name" : "nameOfField1",
                    "value" : "valueOfField1"
                }
            }
        },
        {
            "metaInfos" : {
                "$elemMatch" : {
                    "name" : "nameOfField2",
                    "value" : "valueOfField2"
                }
            }
        }
    ]
}
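To make this concrete, a document in the collection looks roughly like this (field names and the collection name "test" are just examples), and the compound index is created accordingly:

```javascript
// Simplified document shape; the real documents are much larger
{
    "_id" : ObjectId("..."),
    "metaInfos" : [
        { "name" : "state", "value" : "ACTIVE" },
        { "name" : "customer", "value" : "ACME" }
    ]
    // ... remaining (large) payload fields
}

// Single compound index covering all searchable name/value pairs
db.getCollection("test").createIndex({ "metaInfos.name" : 1, "metaInfos.value" : 1 })
```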
explain() shows me that the index is used for each of the queries I tested.
Now I also want to use aggregation on it.
For example, I have a state field containing the document's current state as a string.
I would like to group on it like this:
db.getCollection("test").aggregate(
    [
        {
            "$group" : {
                "_id" : "$state",
                "count" : { "$sum" : NumberInt(1) }
            }
        }
    ],
    { "allowDiskUse" : false }
);
But since the state is only available inside the metaInfos array, I have to do this instead:
db.getCollection("test").aggregate(
    [
        {
            "$unwind" : { "path" : "$metaInfos" }
        },
        {
            "$match" : { "metaInfos.name" : "state" }
        },
        {
            "$group" : {
                "_id" : "$metaInfos.value",
                "count" : { "$sum" : NumberInt(1) }
            }
        }
    ],
    { "allowDiskUse" : false }
);
explain(), however, shows that this does NOT use the index and performs a collection scan instead.
Is there any other way to aggregate on a specific array item without addressing it by position (like .0)?
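One idea I had was to extract the matching element with $filter/$arrayElemAt instead of $unwind, with a $match first so the index could at least narrow down the documents — but I don't know whether this actually avoids the collection scan (this is just a sketch, untested regarding index usage):

```javascript
db.getCollection("test").aggregate([
    // Hopefully lets the index pre-select documents that have a "state" entry
    { "$match" : { "metaInfos.name" : "state" } },
    // Pull out the single array element whose name is "state"
    { "$project" : {
        "state" : {
            "$arrayElemAt" : [
                { "$filter" : {
                    "input" : "$metaInfos",
                    "as" : "mi",
                    "cond" : { "$eq" : [ "$$mi.name", "state" ] }
                } },
                0
            ]
        }
    } },
    { "$group" : { "_id" : "$state.value", "count" : { "$sum" : NumberInt(1) } } }
]);
```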
Thanks for your help
Andreas