Query with $group Chapter 3

I used the following query just to group records after a $match stage, and got records containing only the rated field (as _id), like this:
{ "_id" : "X" }
{ "_id" : "M" }
{ "_id" : "R" }

What am I doing wrong?

db.movies.aggregate([{$match: {"awards": {$regex: /Won \d Oscar/}}}, {$group: {_id: "$rated"}}]).pretty()


Your $group stage only specifies _id, so each output document holds just one distinct value of rated. To get anything more, you need at least one accumulator such as $sum.
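As a rough sketch of what adding an accumulator looks like: the $match stage below is copied unchanged from the question, and the output field name numFilms is just a made-up name for illustration, not something the lab prescribes.

```javascript
// Sketch only: same $match as in the question, plus one accumulator.
// "numFilms" is a hypothetical field name chosen for this example.
const pipeline = [
  { $match: { awards: { $regex: /Won \d Oscar/ } } },
  // $sum: 1 adds 1 for every document that falls into the group,
  // so each output document carries a count next to its _id.
  { $group: { _id: "$rated", numFilms: { $sum: 1 } } }
];

// In the mongo shell this would be run as:
//   db.movies.aggregate(pipeline)
```

With this pipeline each result has the shape { "_id" : <rated value>, "numFilms" : <count> } instead of only _id.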

Okay. So the group stage is used to make calculations for additional fields, right?

That’s the whole point of chapter 3. I’m sorry to write that, but it looks like you are trying to do the labs without following the lectures.

Point taken. However, asking questions is how you find out the depth of your understanding, right?


You are right. Sorry for the useless comment.

Please note, there are some films with more than 9 Oscars, so \d* is required.
NOTE - Recommended to use \d+ (See later comment)

Not being familiar with Regex, that was the challenge in this lab.

This was a useful website to practice the regex commands on.


Command at the top and test data at the bottom e.g.

Won 9 Oscars
won 2 Oscar
Won 2 Oscar
Won Oscar
Won 12 Oscars


Thank you for the regex web link. That really helped.

You would need to use \d+, as \d* would also match when there are no digits at all, e.g. Won Oscar with nothing between the two spaces. That would not be a problem for this lab, but in the real world it might be.

Probably the point was to use a $cond statement somehow, but I missed it, so I dug into regex too, despite never having worked with it. Anyway, it worked, but I used \d\d? (two digits, with the second one optional). Thanks for regexr.

Thanks for the \d+ note, I have modified my code to use that.

The regular expression \d\d? will work for Won 12 Oscar, but not for Won 123 Oscar or Won 1234 Oscar. So, like \d*, \d\d? works for this data set by giving the same result as \d+, but it does not in every context. To summarize:

\d matches exactly one digit
\d* matches zero or more digits
\d+ matches one or more digits
\d\d? matches one or two digits
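The four quantifiers can be compared side by side with a small script. This is only a sketch: the sample strings are based on the test data posted earlier in this thread, plus two made-up ones (Won 123 Oscars, and Won  Oscar with two spaces), and matchesFor is a hypothetical helper name.

```javascript
// Sample strings; "Won  Oscar" (two spaces) and "Won 123 Oscars"
// are invented here to show the edge cases.
const samples = [
  "Won 9 Oscars",
  "won 2 Oscar",
  "Won 2 Oscar",
  "Won Oscar",
  "Won  Oscar",
  "Won 12 Oscars",
  "Won 123 Oscars"
];

// Return the samples that a given regular expression matches.
function matchesFor(regex) {
  return samples.filter((s) => regex.test(s));
}

console.log(matchesFor(/Won \d Oscar/));    // exactly one digit
console.log(matchesFor(/Won \d* Oscar/));   // zero or more digits
console.log(matchesFor(/Won \d+ Oscar/));   // one or more digits
console.log(matchesFor(/Won \d\d? Oscar/)); // one or two digits
```

Note that with a space on both sides of the quantifier, \d* accepts the two-space Won  Oscar but not the single-space Won Oscar, and that none of the patterns match won 2 Oscar because the match is case sensitive.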

In the awards field I found data like this:
1 win., 2 wins, 1 nomination, etc.
But the lab instructions mention values such as
Won 13 Oscars
Won 1 Oscar
I am a bit confused by this. Could you please clarify?
I used the split function and then filtered only the Oscar-winning documents. I got a total document count of 2154. Is this process right?

