Sort collection on document ordernumber and sort subarray on subarray objectorder

Hello,
I have this collection that I want to sort:

[{
  "_id": {
    "$oid": "61b1da8abab5f8eefe9e1589"
  },
  "documentordernumber": 2,
  "documentname": "IT",
  "sublinks": 
    [
      {
        "sublinkid": 2,
        "sublinkname": "Apple"
      },
      {
        "sublinkid": 1,
        "sublinkname": "Microsoft"
      },
      {
        "sublinkid": 3,
        "sublinkname": "Google"
      }
    ]
  },
  {
    "_id": {
      "$oid": "61b1da8abab5f8eefe9e1589"
    },
    "documentordernumber": 1,
    "documentname": "Furniture",
    "sublinks": 
      [
        {
          "sublinkid": 3,
          "sublinkname": "Cabinet"
        },
        {
          "sublinkid": 1,
          "sublinkname": "Chair"
        },
        {
          "sublinkid": 2,
          "sublinkname": "Table"
        }
      ]
    }
]

I would like to get the collection and sort the documents on documentordernumber (easy part) and within each document I want to sort the sublinks on their sublinkid, all in one query in Node.js or in MongoDB Compass with aggregation) Thank you!

I’m afraid that’s a huge anti-pattern as once you do the $unwind both the $sort and the $group will need to be in memory or worse still useDisk operations and use huge amounts of RAM.

It is possible to do this on the server where you sort the array in place using aggregation although it’s fairly advanced , most good Aggregation code needs code to generate it.

The code at [https://github.com/asya999/bits-n-pieces/blob/master/scripts/sortArray.js] will generate the aggregation pipeline you need, you can run it in a mongo shell or in node. @Asya_Kamsky also did a talk about this which you can find online.

So in your case you basically want to (a) sort by documentordernumber using an index then use { $set : { sublinks : X } } where X is the output of running sortArray("$sublinks","sublinkid")

If you can’t run the JS function the value for X in yoru case is below…

{
  "$cond": [{
      "$eq": [
        0,
        {
          "$size": "$sublinks"
        }
      ]
    },
    [],
    {
      "$slice": [{
          "$reduce": {
            "input": "$sublinks",
            "initialValue": [{
                "sublinkid": {
                  "$maxKey": 1
                }
              },
              {
                "sublinkid": {
                  "$minKey": 1
                }
              }
            ],
            "in": {
              "$let": {
                "vars": {
                  "rv": "$$value",
                  "rt": "$$this"
                },
                "in": {
                  "$let": {
                    "vars": {
                      "idx": {
                        "$reduce": {
                          "input": {
                            "$range": [
                              0,
                              {
                                "$size": "$$rv"
                              }
                            ]
                          },
                          "initialValue": 9999999,
                          "in": {
                            "$cond": [{
                                "$gt": [
                                  "$$rt.sublinkid",
                                  {
                                    "$arrayElemAt": [
                                      "$$rv.sublinkid",
                                      "$$this"
                                    ]
                                  }
                                ]
                              },
                              {
                                "$min": [
                                  "$$value",
                                  "$$this"
                                ]
                              },
                              "$$value"
                            ]
                          }
                        }
                      }
                    },
                    "in": {
                      "$concatArrays": [{
                          "$cond": [{
                              "$eq": [
                                0,
                                "$$idx"
                              ]
                            },
                            [],
                            {
                              "$slice": [
                                "$$rv",
                                0,
                                "$$idx"
                              ]
                            }
                          ]
                        },
                        [
                          "$$rt"
                        ],
                        {
                          "$slice": [
                            "$$rv",
                            "$$idx",
                            {
                              "$size": "$$rv"
                            }
                          ]
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        },
        1,
        {
          "$size": "$sublinks"
        }
      ]
    }
  ]
}
4 Likes

I agree with John, the accepted answer is NOT the best way to do it at all. You never want to $unwind an array when it’s going to be followed by $group back into original documents.

In this case it causes additional issues like it won’t be able to use an index on documentordernumber if there is such an index (there should be to avoid having to sort in memory).

By the way, upcoming version 5.2.0 will have an implementation of $sortArray expression in aggregation so the workarounds John and I describe will not be necessary in the future. (See https://jira.mongodb.org/browse/SERVER-60967 and https://jira.mongodb.org/browse/SERVER-29425)

Once you’re on 5.2 the pipeline should be:

[ 
  {$sort:{ documentsordernumber : 1}}, 
  {$set:{ sublinks: {$sortArray:{ input:"$sublinks", sortBy:{sublinkid:1}}}}}
]

Asya

4 Likes

If I may add. Sometimes some operations are better left to be done in the application back end rather than at the data access layer. This data looks like some web page generator where you want to present sub-links in a certain order. Sorting those small arrays in the application back end is quick and easy and can be more flexible, if someone wants to see them in alphabetic order instead. It might even scale better when done in the back end because it is easier to run multiple back end instances.

3 Likes

We just launched mongodb 5.2 and it appears it has $sortArray() in aggregation for just this.

1 Like