MongoDB Bytes #5: Arrays and their interaction with Range Queries

Hey folks,

Today, we are going to discuss how range queries work when a field is an array. Non Array elements in a document must match each clause of a query’s criteria.

For example, if we query for

{ <field_name> : { “$gt” : 10, “$lt” : 20 }}

the field_name would have to be both greater than 10 and less than 20.

However, if a document’s field is an array, the document matches if there is an element of that array that matches each part of the criteria and not all. Let’s see this through an example.

Suppose we have a collection containing the following documents:

{ _id: ObjectId("633545c15e008f3fb4cf3a5a"), a: 2 },

{ _id: ObjectId("633545c15e008f3fb4cf3a5b"), a: 12 },

{ _id: ObjectId("633545c15e008f3fb4cf3a5c"), a: 18 },

{ _id: ObjectId("633545c15e008f3fb4cf3a5d"), a: 25 },

{ _id: ObjectId("633545c15e008f3fb4cf3a5e"), a: [ 5, 25 ]},

{ _id: ObjectId("63354b395e008f3fb4cf3a5f"), a: [ 10, 14 ] }

}

Now, before we move ahead, I just want to point out that the above collection is not an ideal way to model your data. We should always be very mindful when mixing arrays and non-array values for a field.

If we want to find all the documents that have a value of between 5 and 20, then running our range query:

db.collection.find({"a":{$gt:5,$lt:20}})

returns the following output:

db.collection.find({"a":{$gt:5,$lt:20}})

[

{ _id: ObjectId("633545c15e008f3fb4cf3a5e"), a: [ 5, 25 ] },

{ _id: ObjectId("63354b395e008f3fb4cf3a5f"), a: [ 10, 14 ]},

{ _id: ObjectId("633545c15e008f3fb4cf3a5b"), a: 12 },

{ _id: ObjectId("633545c15e008f3fb4cf3a5c"), a: 18 }

]

The output contains four documents. It contains a:12 as well as a:18, and the array a: [ 10, 14 ], all of which are expected. But we also see that the output contains array a: [ 5, 25 ] as well. Neither 5 nor 25 is between 5 and 20 but still, the array is returned.:thinking::thinking: This is because 25 is greater than 5 (the $gt:5 condition), while 5 is less than 20 (the $lt:20 ) condition. This shows us that the normal range queries may behave differently if an array is involved

So, how do we get the expected output?:thinking:

Well, the easiest solution is to use $elemMatch in our queries. The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria. It follows the given syntax:

{ <field>: { $elemMatch: { <query1>, <query2>, ... } } }

However, the thing to remember is that $elemMatch won’t match non-array elements. Using it for our collection:

db.collection.find({"a":{"$elemMatch":{$gt:5,$lt:20}}})

We get the following output:

{

_id: ObjectId("63354b395e008f3fb4cf3a5f"), a: [ 10, 14]

} 

As we can see, the document containing array a: [ 5, 25 ] is no more in the output. And so are documents with fields a: 12 and a: 18.

To summarise,

  1. Range queries work differently when a field is an array. The query matches if there is an element of that array that matches each part of the criteria and not all.
  2. We can use $elemMatch in our range queries if the field is an array. It forces MongoDB to compare all clauses with each array element.
  3. $elemMatch can also make use of indexes.
  4. However, this operator will not match for non-array fields.
4 Likes