In case someone else has a similar problem, I’ve come up with at least a solution that works, not necessarily the solution:
db.profiles.aggregate([
// Match only profiles with given location
{
"$match": {
"locations.adm1_id": "new-york"
}
},
// Unwind locations
{
"$unwind": "$locations"
},
// The unwinded working set might have profiles with non-relevant locations.
// Match only unwinded profiles with given location.
{
"$match": {
"locations.adm1_id": "new-york"
}
},
// Sort profiles with highlighted location first, second order by
// orderKey-field, so that the next stage picks up the highlighted profile.
{
"$sort": {
"locations.highlight": -1,
"orderKey": 1
}
},
// Now we might have duplicates if a profile had two locations with same
// adm1_id, so group by _id, grabbing the `$first` values
{
"$group": {
"_id": "$_id",
"name": {
"$first": "$name"
},
"locations": {
"$first": "$locations"
},
"orderKey": {
"$first": "$orderKey"
}
}
},
// $group messes up the sort ordering, so we sort again!
{
"$sort": {
"locations.highlight": -1,
"orderKey": 1
}
},
// We want to skip & limit our results, but are also interested in the total amount of results.
{
"$facet": {
"paginatedResults": [
{
"$skip": 24 // skip whatever is needed
},
{
"$limit": 24 // limit to page size
},
// use project to make sure we only return safe fields,
// $group stage is optional and used only if adm1_id was provided.
{
"$project": {
"name": 1,
"locations": 1,
"orderKey": 1
}
}
],
// Get total count of result documents
"totalCount": [
{
"$count": "count"
}
]
}
}
]);
It is obviously slower than a plain .find() on the index, but a typical query within a collection of 5000 documents takes approx. ~12-20ms, or ~20-30ms, measured from the node client.
Any ideas how to achieve the same, but better and faster, are very much welcome. My aggregate-knowledge is very limited.