Request for explanation! -- Chapter 4: Query Operators Lecture

Hi folks!

I’m having a great time going through M001 and am hoping someone can do a little extra explaining to help me get why things work the way they do in the “Query Operators - Logic” lecture (in Chapter 4).

Our instructor, Yuliya, is explaining the distinction between implicit vs. explicit $and, and the situations where explicit $and is crucial. As an example, we head to the routes collection to try and find out how many CR2 and A81 airplanes come through the KZN airport.

It’s stated in the lecture that when we run the following query (without the explicit $and)…

{"$or": [{dst_airport: "KZN"}, {src_airport: "KZN"}] },
{"$or": [{airplane: "CR2"}, {airplane: "A81"}] }

the above may get us results with the right airport but the wrong airplane.

Why does this happen? (And why, by contrast, does usage of explicit $and ensure we get only the right airplanes in our results?) Any color you can add is much appreciated. :grinning_face_with_smiling_eyes:

Thanks,
Ana

2 Likes

A query is a JSON document.

In principal, each field has a different name. Some drivers do not enforce that. But most do. Some ignore repeated fields.

This means that if you have a query that involve 2 or more conditions on the same field such as

{ "a" : { "$lt" : 10 } , "a" : { "$gt" : 2 } }

you lose one of the condition. If you run the following in the mongo shell you will see that the query

> q = { "a" : { "$lt" : 10 } , "a" : { "$gt" : 2 } }
{ "a" : { "$gt" : 2 } }

However if you use explicit $and you get

> q = { "$and" : [ { "a" : { "$lt" : 10 } } , { "a" : { "$gt" : 2 } } ] }
{
	"$and" : [
		{
			"a" : {
				"$lt" : 10
			}
		},
		{
			"a" : {
				"$gt" : 2
			}
		}
	]
}

This means that if you have a query like

{ "$or" : [ ... ] , "$or" : [ ... ] }

you lose one of the $or just like you lose of the a clause above. You need explicit $and to keep both clause.

That’s probably the other way around, correct airplane and wrong airport since the query is really:

> q= {"$or": [{dst_airport: "KZN"}, {src_airport: "KZN"}] , "$or": [{airplane: "CR2"}, {airplane: "A81"}] }
{ "$or" : [ { "airplane" : "CR2" }, { "airplane" : "A81" } ] }

Thankfully there is short cut for some of the usual case. The query

{ "$and" : [ { "a" : { "$lt" : 10 } } , { "a" : { "$gt" : 2 } } ] }

can be written as (using implicit and on $gt and $lt

{ "a" : { "$lt" : 10 , "$gt" : 2 } }

and

{"$or": [ { "airplane" : "CR2"}, { "airplane" : "A81" } ] }

is equivalent to

{ "airplane" : { "$in" : [ "CR2", "A81" ] }
5 Likes

Hi! I just listened to that lecture again. And here’s what I learned. The implicit and operator is already implied in a query by default. When using the two or operators as above, we’re just saying give me the documents matching either one of these two field’s values. By placing the explicit and operator in the query we are then saying, only give me the results if the field values match both of the parameters we have specified. I hope that makes sense. It just did for me!

3 Likes

[ comment removed as I missed some of the subtility of what was written ]

That’s why I am editing the quotes from Jason_Nutt so that others do not mis-read was is written, like I did.

Yes

Yes that’s correct.

2 Likes

Sorry about not having formatted that correctly…I am not able to access the community on my laptop… it gives me the error

Request blocked. We can’t connect to the server for this app or website at this time. There might be too much traffic or configuration error. If you provide content to customers through cloudfront, you can find steps to troubleshoot and help prevent this error by reviewing the cloudfront documentation

and am unclear on how to correct this. Until I can get into community on the actual computer I appreciate the correcting for clarity’s sake. @steevej

1 Like

Thank you @Jason_Nutt for revisiting and explaining this material and thank you @steevej for fine-tuning @Jason_Nutt’s answer!!

2 Likes

Greatly appreciate you stepping through this from the ground up @steevej! I will likely return to your above explanation in the future. :smiley:

@steevej:

That’s probably the other way around, correct airplane and wrong airport since the query is really:

> q= {"$or": [{dst_airport: "KZN"}, {src_airport: "KZN"}] , "$or": [{airplane: "CR2"}, {airplane: "A81"}] }
{ "$or" : [ { "airplane" : "CR2" }, { "airplane" : "A81" } ] }

Gotcha.

Finally, to paraphrase, explicit $and creates sets of parameters in your query which must be met (as @Jason_Nutt said “only give me the results if the field values match both of the parameters”) . By contrast, when you just chain $or clauses together without explicit $and, anything after the first $or clause will get dropped because one of your parameters has already been met, and that’s all you’ve required in your query… Have I got that right?

In my understanding of this, the last part of how you are understanding this has been misinterpreted.

Both of the matches will be returned in the $or query. The $and is the operator in this case that limits the returns. $or says, “return the airports that match OR the airplanes that match”, $and, in this particular case, says, “return ONLY the airplanes AND the airports that fall into BOTH of these matched field value pairs”. I believe this is the solution that you are looking for. Let me know if not and I will look at it again. I love to figure things out. Thanks for the opportunity to enhance my growth and learning. Happy coding @Ana_Hevesi !

1 Like

Hi @Ana_Hevesi, something similar has also been discussed here:

The conclusion was: since an object should never have duplicate keys, implicit $and will never work in that case.
And it seems like this stands true for operators (like $or in this case) as well.
Therefore, the following can never leverage implicit $and.

{
  "$or": ["query logic 1"],
  "$or": ["query logic 2"]
}
{
  "a": "query logic 1",
  "a": "query logic 2"
}

Instead, only query logic 2 will be taken into consideration every time.

This is incorrect, in fact, only the last $or will be considered.

Thanks and Regards.
Sourabh Bagrecha,
Curriculum Services Engineer

3 Likes

@Jason_Nutt and @SourabhBagrecha Thank you both for the clarification!

2 Likes

Absolutely @AnaNoemi ! Here to help, here to learn. Each one, teach one. :deciduous_tree:

Hi Folks,

This is the gist of the issue: when MongoDB documents are represented as JavaScript objects, Python dictionaries or similar nested key:value formats any keys at the same level in the document must be unique.

Setting the same key multiple times may have unexpected outcomes depending on the language implementation. For example, JavaScript engines typically use the last value that is set for a unique key but historical JavaScript specs (prior to ES6) did not explicitly define the order of keys in an object so you could not rely on this behaviour.

When do you need to use $and?

The explicit $and operator is used for queries with multiple expressions specifying the same field or operator at the same level in a query document.

Using an explicit $and resolves the issue of duplicate keys at the same level in a document by converting them into an array of subdocuments. The $and operator supports correct construction of queries in your driver/client when expressions need to duplicate fields or operators at the same level in a document, but does not change any server-side logic.

As @steevej noted earlier, a query using multiple expressions specifying the same field (for example with $gt and $lt comparisons) can also be rewritten to use implicit AND by moving the duplicated expressions into a subdocument.

How are documents really stored by MongoDB?

If you want to follow the rabbit hole a bit deeper: the underlying storage format in MongoDB is BSON, which is a JSON-like binary serialisation format with additional data types. While it is convenient to think of MongoDB as a JSON database, it is actually a BSON database and there are some important differences.

BSON documents are ordered objects, not dictionaries, so have some different behaviour than JSON: order of keys is significant for embedded documents and technically you can have duplicate keys in BSON (although it is a bad idea to do so!).

For more background, see my response on Why does MongoDB check for order of keys when matching embedded documents?.

Regards,
Stennie

6 Likes

Great explanation with documentation. Thanks @Stennie.

1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.