Hello,
I am struggling with an aggregation pipeline that requires allowDiskUse for its sort stage, although I think it should not need it.
Here is the code for Mongo shell:
db.four_monthly.aggregate(
  [
    { $match: { } },
    { $project: { scatter_plot: 0, fit_plot: 0 } },
    { $sort: { date_end: -1 } },
    { $group: { _id: { road: "$road", direction: "$direction", pk: "$pk" }, doc: { $first: "$$ROOT" } } }
  ],
  { allowDiskUse: true }
)
So basically what I am looking for is the newest document (by date_end) for each unique combination of road+direction+pk.
This query does not work without allowDiskUse, which is not what I want.
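To make the expected result concrete, here is a small plain JavaScript sketch of the semantics I am after, run in memory rather than in MongoDB. The sample documents and the helper name newestPerKey are made up for illustration; only the field names come from the pipeline above.

```javascript
// Hypothetical sample documents; only the field names match the real collection.
const docs = [
  { road: "A1", direction: "N", pk: 10, date_end: new Date("2019-01-31") },
  { road: "A1", direction: "N", pk: 10, date_end: new Date("2019-05-31") },
  { road: "A1", direction: "S", pk: 10, date_end: new Date("2019-03-31") },
];

// Keep the document with the greatest date_end for each road+direction+pk key.
function newestPerKey(documents) {
  const newest = new Map();
  for (const doc of documents) {
    const key = JSON.stringify([doc.road, doc.direction, doc.pk]);
    const current = newest.get(key);
    if (current === undefined || doc.date_end > current.date_end) {
      newest.set(key, doc);
    }
  }
  return [...newest.values()];
}

const result = newestPerKey(docs);
console.log(result);
```

With this sample, the two A1/N/10 documents collapse into the one with the later date_end, and the A1/S/10 document is returned unchanged.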
The following indexes are defined:
[
  {
    "v" : 2,
    "unique" : true,
    "key" : {
      "date_end" : -1,
      "road" : 1,
      "direction" : 1,
      "pk" : 1
    },
    "name" : "date_end_-1_road_1_direction_1_pk_1",
    "ns" : "flow-speed-calculator.four_monthly"
  },
  {
    "v" : 2,
    "key" : {
      "_id" : 1
    },
    "name" : "_id_",
    "ns" : "flow-speed-calculator.four_monthly"
  },
  {
    "v" : 2,
    "key" : {
      "road" : 1,
      "direction" : 1,
      "pk" : 1,
      "date_end" : -1
    },
    "name" : "road_1_direction_1_pk_1_date_end_-1"
  }
]
My guess is that it sorts all documents first and then goes through the sorted list to extract one document for each road+direction+pk combination, and that is why it requires disk use. However, what I expected MongoDB to do is first find every unique combination of road+direction+pk, then sort each one on date_end and return the newest document for it.
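The order I expected could be sketched as a pipeline that sorts on the group keys first and on date_end last, so that the sort has the same key order as the road_1_direction_1_pk_1_date_end_-1 index listed above. This is only a sketch under assumptions: it is not verified that the planner uses that index or that it removes the need for allowDiskUse on a given server version. Dropping the empty $match and moving the $project after the $group (so that the $sort is the first stage) are also guesses on my part, not something the documentation told me to do.

```javascript
// Sketch only: sort in the same key order as the
// road_1_direction_1_pk_1_date_end_-1 index, then take the first
// (newest) document of each road+direction+pk group.
const pipeline = [
  { $sort: { road: 1, direction: 1, pk: 1, date_end: -1 } },
  {
    $group: {
      _id: { road: "$road", direction: "$direction", pk: "$pk" },
      doc: { $first: "$$ROOT" },
    },
  },
  // Exclude the large fields from the single document kept per group.
  { $project: { "doc.scatter_plot": 0, "doc.fit_plot": 0 } },
];

// The db object only exists inside the Mongo shell, so guard the call.
if (typeof db !== "undefined") {
  printjson(db.four_monthly.aggregate(pipeline).toArray());
}
```

Running db.four_monthly.explain().aggregate(pipeline) in the shell should show whether the sort is served by the index or done in memory.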
Is that possible? If so, how do I fix my aggregation?
Thanks a lot in advance,
Best regards, Adam.