Hi, we are leveraging mongoDB capabilities to process a large amount of data, yet performance is much slower than expected.
We have 4.600.000 records with position data. Each record has 38 properties, of which one is the priority (the lower the priority number, the higher it should be prioritised) and one is the date time it has been issued at.
About 1 of every 10.000 updates is for the same domain object and we want to have the latest state per domain object.
To do this we save incoming data per field and created an aggregate to get the latest state for a given domain object
db.barProperty.aggregate([
{ $match: {
domainObjectId : "e1f2a6ba-de58-4548-a40f-990bdbf57d48"
}},
{ $sort: {
priority: 1,
issuedAt: -1
}},
{ $group: {
"_id": "$field",
"value": {"$last": "$value"},
"issuedAt": {"$last": "$issuedAt"}
}},
{
"$group": {
"_id": null,
"issuedAt": {"$max": "$issuedAt"},
"array": {
"$push": {
"k": "$_id",
"v": "$value"
}
}
}},
{ "$replaceRoot": {
"newRoot": {
"$mergeObjects": [
{"$arrayToObject": "$array"},
{"issuedAt": "$issuedAt"}
]
}
}}
],
{ allowDiskUse: true });
There is an index on
db.barProperty.createIndex(
{ domainObjectId: 1, priority: 1, issuedAt: -1 }
)
Following group step is taking up to 30 seconds for a given identifier:
{
"$group":{
"_id":"$field",
"value":{
"$last":"$value"
},
"issuedAt":{
"$last":"$issuedAt"
}
}
}
The second group and last mergeObjects is relatively quick since at that point it only handles 38 documents.
Any help with this is highly appreciated.