How to use $unwind > $group > $sort > <?> in an aggregate pipeline?

Oliver_Browne · December 15, 2022, 1:48pm

I have an aggregate pipeline that is transforming documents. I’m at a point where the document contains a field like this…

{
  ...,
  notSorted: [
    { code: "a", index: 1, <other data> },
    { code: "c", index: 1, ...},
    { code: "a", index: 2, ...},
    { code: "c", index: 2, ...},
    { code: "b", index: 2, ...},
    { code: "b", index: 1, ...}
  ],
  ...
}

I want to keep that field and add a new field containing a sorted version, like so…

{
  ...,
  notSorted: [ <as before> ],
  sorted: [
    {
      codes: "A",
      elements: [
        { index: 1, <other data> },
        { index: 2, ... }
      ]
    },
    {
      codes: "B",
      elements: [
        { index: 1, ...},
        { index: 2, ... }
      ]
    },
    {
      codes: "C",
      elements: [
        { index: 1, ...}, 
        { index: 2, ...}
      ]
    },
  ],
  ...
}

What I’m currently trying…

I think what I should do within my aggregate pipeline is something like:

1 - an $unwind stage to split the array apart into separate docs,
2 - then a $group stage to pull those new docs together based on the ‘code’ value,
3 - a $sort stage to get the docs sorted by code in alphabetical order,
4 - and an as-yet-unknown last step to pull them all back together in one document.

Does that sound like it would work?
If so, what would you do for step 4?

Problems I’m running into with this approach…

When I do the $unwind, things seem okay: I get 6 documents, one for each element of the notSorted array…

[ aggregate pipeline: $unwind stage ]
{
  path: "$notSorted"
}

Then I can use $group to pull them together by the “code” id:

[ aggregate pipeline: $group stage ]
{ 
  _id: "$notSorted.code",
}

But I then lose all data other than the code. How do I keep the rest? I could use $mergeObjects inside the $group, but then I can’t see how to keep the existing object structure. I don’t want to merge the various elements of A; I just want them grouped together, still as separate objects…

Urgh. I feel stupid. What am I missing?
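
For reference, here is a minimal sketch of how the $unwind > $group > $sort plan above could be completed. It assumes $push is used in the first $group to keep each element whole, and that a second $group serves as "step 4". The name unwindPipeline and the collection name in the usage comment are hypothetical. The sketch also relies on $push seeing documents in the order the preceding $sort produced.

```javascript
// Sketch only: completes the unwind/group/sort plan with $push.
const unwindPipeline = [
  // One output document per element of notSorted.
  { $unwind: { path: "$notSorted" } },
  // Sort first so that $push collects elements in index order.
  { $sort: { "notSorted.index": 1 } },
  // Group per original document and code; $push keeps each element
  // as a separate object instead of merging them.
  {
    $group: {
      _id: { doc: "$_id", code: "$notSorted.code" },
      elements: { $push: "$notSorted" }
    }
  },
  // Order the per-code groups alphabetically by code.
  { $sort: { "_id.code": 1 } },
  // "Step 4": pull the groups back together, one document per original _id.
  {
    $group: {
      _id: "$_id.doc",
      sorted: { $push: { code: "$_id.code", elements: "$elements" } }
    }
  }
];

// Hypothetical usage in mongosh:
// db.myCollection.aggregate(unwindPipeline)
```

As written, the result contains only _id and sorted. The other top-level fields, including the original notSorted array, would have to be carried through the $group stages separately (for example with $first accumulators on a copy made before the $unwind).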

steevej · December 17, 2022, 2:16pm

Please publish your sample documents in usable form. Three little dots and placeholders like <other data> cannot be cut-n-pasted directly into our system for experimentation. Values and keys are case sensitive, so when you use code:a, make sure you use code:a rather than code:A elsewhere.

My approach will involve multiple stages.

1 - A $set stage that uses $reduce on notSorted to compute with $addToSet an array of codes, as:_codes.

2 - A $set stage that uses $map on _codes with $filter on notSorted to collect the elements of a given code as:_grouped

3 - A final $set stage that uses $map on _grouped that uses $sortArray on _grouped.elements to produce as:sorted.
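
The three stages above might look roughly like the following sketch. It is an interpretation, not a tested solution, and it makes a few assumptions. $addToSet is an accumulator rather than an expression operator, so step 1 uses $setUnion inside $reduce instead. $setUnion does not guarantee order, so the codes are sorted with $sortArray. $sortArray requires MongoDB 5.2 or later. The closing $unset stage and the names setPipeline and myCollection are additions for illustration.

```javascript
// Sketch only: in-document grouping and sorting with $set stages.
const setPipeline = [
  // 1 - Collect the distinct codes, sorted alphabetically, as _codes.
  {
    $set: {
      _codes: {
        $sortArray: {
          input: {
            $reduce: {
              input: "$notSorted",
              initialValue: [],
              in: { $setUnion: ["$$value", ["$$this.code"]] }
            }
          },
          sortBy: 1
        }
      }
    }
  },
  // 2 - For each code, filter notSorted down to its elements, as _grouped.
  {
    $set: {
      _grouped: {
        $map: {
          input: "$_codes",
          as: "code",
          in: {
            code: "$$code",
            elements: {
              $filter: {
                input: "$notSorted",
                as: "el",
                cond: { $eq: ["$$el.code", "$$code"] }
              }
            }
          }
        }
      }
    }
  },
  // 3 - Sort each group's elements by index, as sorted.
  {
    $set: {
      sorted: {
        $map: {
          input: "$_grouped",
          as: "g",
          in: {
            code: "$$g.code",
            elements: {
              $sortArray: { input: "$$g.elements", sortBy: { index: 1 } }
            }
          }
        }
      }
    }
  },
  // Extra cleanup (not part of the three steps): drop the helper fields.
  { $unset: ["_codes", "_grouped"] }
];

// Hypothetical usage in mongosh:
// db.myCollection.aggregate(setPipeline)
```

Because every stage works inside a single document, notSorted and all other fields are left untouched. Each entry in elements still carries its code field in this sketch; removing it would need an additional $map.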