Aggregate $sort multiple fields in order

I know that the mongo shell docs say $sort parses the fields in left to right order, but what happens when it’s in a language/runtime that doesn’t guarantee object key order?

db.users.aggregate(
   [
     { $sort : { age : -1, posts: 1 } } // age may not be the first key when serialized in a language/runtime that doesn't guarantee serialized key ordering
   ]
)

See: https://docs.mongodb.com/manual/reference/operator/aggregation/sort/#ascending-descending-sort

Can we specify $sort in an array for example to ensure that fields are sorted in the desired order?

Also of concern: the documented behaviour does not necessarily hold for the shell either, as evidenced by this open bug from 2013: https://jira.mongodb.org/browse/SERVER-11358 (which is still reproducible when tested on MongoDB 4.2.6)

> x = {a:1, 10:1}
{ "10" : 1, "a" : 1 }

Welcome to the MongoDB Community @juniorprogrammer!

JavaScript objects (and analogous data structures such as dictionaries and hashes in other programming languages) do not guarantee ordering of keys. This is definitely a consideration, because field order can be significant in the BSON format used by the MongoDB server.

The mongo shell will preserve the order of alphanumeric keys in JavaScript objects, but numeric keys (or strings that look like numbers) will be re-ordered to the front of the object as per your example and the open issue you referenced. This behaviour is part of the JavaScript language implementation used by the mongo shell (currently the MozJS runtime).
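The reordering can be reproduced in plain JavaScript without a MongoDB connection. This is a minimal sketch, assuming a current engine such as V8 in Node.js; the variable names are illustrative only:

```javascript
// Keys that are not integer-like keep their insertion order.
const namedSort = { age: -1, posts: 1 };

// Integer-like keys ("10") are moved ahead of the other keys.
const numericSort = { a: 1, 10: 1 };

const namedKeys = Object.keys(namedSort);
const numericKeys = Object.keys(numericSort);

console.log(namedKeys);   // [ 'age', 'posts' ]
console.log(numericKeys); // [ '10', 'a' ]
```

So a sort specification such as { age: -1, posts: 1 } keeps the order it was written in, while one that includes a numeric key does not.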

Most languages provide an order-preserving data type which can be used as an alternative (for example, Map in modern versions of Node.js). Official MongoDB drivers include a helper class if there isn’t a native data type (such as the SON class in the Python driver).
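To illustrate the difference, here is a small sketch in plain JavaScript (no driver calls, illustrative variable names) showing that a Map keeps insertion order even for numeric-looking keys, and that the order is lost again if the Map is converted back to a plain object:

```javascript
// A Map iterates in insertion order, regardless of what the keys look like.
const sortSpec = new Map();
sortSpec.set('a', 1);
sortSpec.set('10', 1);

const mapKeys = [...sortSpec.keys()];

// Converting back to a plain object reintroduces the reordering.
const asObject = Object.fromEntries(sortSpec);
const objectKeys = Object.keys(asObject);

console.log(mapKeys);    // [ 'a', '10' ]
console.log(objectKeys); // [ '10', 'a' ]
```

The ordering guarantee only holds while the specification stays in the Map, so it should be handed over as a Map rather than being spread or copied into an object first.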

The mongo shell (as at MongoDB 4.2) has a historical implementation of Map that differs from modern JavaScript, so this is a lingering issue that still needs to be resolved.

However, I strongly recommend avoiding numeric key names, as this is both a straightforward workaround for this issue and avoids some syntactic ambiguity between array references and embedded fields. Using dot notation, a reference like data.2 could refer either to the 3rd element of the 0-based data array or to a field named 2 embedded within data. Using numeric field names may also result in unexpected outcomes (such as backfilling an array) with the right combination of update syntax and documents.
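The ambiguity can be sketched in plain JavaScript. Note that resolvePath below is a hypothetical helper written for this illustration; it is not part of MongoDB or any driver, and it only mimics the general idea of walking a dotted path:

```javascript
// Hypothetical helper: walks a dotted path through nested objects and arrays.
function resolvePath(doc, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Same path, two different document shapes.
const withArray = { data: ['x', 'y', 'z'] };   // data.2 is the 3rd array element
const withField = { data: { 2: 'embedded' } }; // data.2 is a field named "2"

const fromArray = resolvePath(withArray, 'data.2');
const fromField = resolvePath(withField, 'data.2');

console.log(fromArray); // z
console.log(fromField); // embedded
```

The path alone does not say which of the two shapes is intended, which is why descriptive field names are easier to reason about.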

Regards,
Stennie

Hi @Stennie_X,

Thanks for the quick reply. If I’m understanding you correctly, you are suggesting that in Node.js we use a Map object rather than a plain old JS object? For example, something like the below, to ensure “a” gets sorted before “10” in case of some strange JS runtime:

const sortMap = new Map();
sortMap.set('a', 1);
sortMap.set('10', 1);
users.aggregate(
   [
     { $sort : sortMap } 
   ]
)

Thanks for the guidance once again!

Hi,

I’m actually suggesting you avoid using numeric field names so a workaround is not required, but Map() is the correct approach for an order-preserving data structure in JavaScript.

You don’t typically see this used in examples since the default behaviour works as expected in JavaScript unless you use numeric field/key names. Semantic names describing the field context are much more common (and helpful for future readers of your data model / code).

Regards,
Stennie

Hi @Stennie_X,

Thanks! This is super helpful!

I totally understand what you’re saying about using alphanumeric rather than numeric field names, and we’re currently using meaningful alphabetic field names. I was just made aware that the ES standard didn’t necessarily dictate any specific order for enumerating the properties of objects, so I wanted to be prepared with backup options in case the current V8 engine behavior changes, and this has been very helpful for that! I’m glad to know that the driver supports the order-preserving Map data structure.

Reference for anyone else looking at this in the future: Property ordering of [[Enumerate]] / getOwnPropertyNames()

Thanks again!