Lab - Using $lookup / Is my approach wrong?

In the homework:
Chapter 3: Core Aggregation - Combining Information
Lab - Using $lookup

Using this as a solution:

from: “air_alliances”,
localField: “”,
foreignField: “airlines”,
as: “AlNm”

I get an answer (maximum value for AlCnt), but I am told this is wrong. Why is that? I hope someone can’t point out my mistake?

Second and side question:
This homework has 3 possible answers and 3 trials.
Trying all possibilities without even thinking will inevitably lead to the solution. Am I missing something? (For example because my browser wouldn’t show it all?)



Overall, the query is quite alright. But this:

Better be:

airplane:{$in:["380", "747"]},

In both cases though. we will have trouble. Let’s type this command: db.air_routes.find({},{_id:0, airplane:1}) to look at the data:

We get

{ "airplane" : "CR2" }
{ "airplane" : "142 146" }
{ "airplane" : "E70 CRJ M87" }

Planes are a string, not an array.

$in won’t go through the string word by word.

Still we could:

  • split the string, and get an array, and use $in

If we don’t split the string, something like this:
"CR2 747 380" won’t be matched.

The general, and better approach is to use regex, like so:


I’m not saying this solves the query, just some ideas.


I see. This is interesting, I had indeed not looked that closely to the shape of the data. I was only thinking of an array and took it for granted. Well, I was wrong. Thank you for pointing out the miss.

1 Like

@Michel_Bouchet, I have tried the same way you did and seems my approach is wrong. I am getting 10,8,5 planes for the given airlines in the same way you are getting. Are we missing anything?

Yes we are missing something. Santiago_Miranda rightly pointed out the mistake, read his answer above, you will see where the issue is. We are dealing with a string, not an array.

1 Like


I do think the shape of the data in airplane e.g. “747 380”, should have been mentioned in the Lab notes, since once a person goes down the $in route, it doesn’t matter what they do, their aggregation pipeline is wrong.

They will never see the documents they are missing and their totals will be wrong.

Of course if it’s a lesson in beware your data, then fine, but I am not sure that should be the point.

PS - I know there were only three possible answers, and you had three try’s but no one likes to get things wrong.

1 Like

Hi @NMullins!

You can use the @ to call in any of the content engineers. The course will probably be updated soon though. But I guess there is enough statistics (users reporting this problem) to argue you are right.


I am not sure who would be the appropriate person to @ in these cases, especially for a course that is about to be refreshed.

This is the second time I have undertaken this particular course, and I try to avoid looking at my previous answers, but I fell foul of this regex search this time, I went with $in this time, and regex last time, probably because I spotted the slight data complexity with air_routes.airplane previously.

Try to warn the user of any “Data shape” gotcha’s, in any future labs.

Cheers :slight_smile:

The one I remember is @Sonali_Mamgain (sorry!). You can look em up in the users table under ☰, or as answerers on many questions.

Have a good day :slight_smile:

One thought, and I hit a similar issue, is to introduce regex at some point earlier in the chapter. I also tried $in and then split up that string in order for $in to work. It was only after I got the right answer could I see that a regex was used.

A comment on regex, in Python the regex is often slower than a “startswith”, “endswith” or “in”. Is there any similar slowdown on a Mongo Server?

I went through the same issue, in my case, as the there are three options and three attempts, I was able to select a correct answer. Later I went through the detailed answer to find the string pattern of airplane.

Yes, I tried regex and it works now.

If the route document in air_routes collection accepts more than one airplane per route, in my opnion the data model should be changed from string to array. I lost more than 2 hours on this problem :frowning:

Thanks guy!

Definitively, an array would be better. However, in real life we have to deal with legacy data that is sub-optimal. Features like $regex are there to help us in cases like that.

The morale is that you have to know your data. The schema analysis tools of Compass is great for that.