M121 regex - odd behavior?

Hello folks –

I’m puzzling my way through the M121 Chapter 1 - Optional Lab - Expressions with $project.

In the movies collection, the field “writers” has many entries that contain strings like “John Smith (play)”, and I guess I will be using “$map … in … as …” to reduce them to just the name, as in the preceding lab.

I was curious whether anything in the array fields “directors” and “cast” also contains the extra “( … )” that would have to be handled ($map-ped) so that $setIntersection could be used reliably.
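
To make the concern concrete, here is a minimal sketch in plain JavaScript (not the aggregation pipeline itself) of why a trailing “(play)” would defeat an exact-match intersection. The sample document, `stripRole`, and `intersect` are hypothetical names made up for illustration; they only approximate what $map with $split and $setIntersection would do on the server.

```javascript
// Hypothetical sample document shaped like one in the movies collection.
const movie = {
  writers: ["William Shakespeare (play)", "Kenneth Branagh (screenplay)"],
  directors: ["Kenneth Branagh"],
  cast: ["Kenneth Branagh", "Derek Jacobi"],
};

// Plain-JS analogue of $map + $split: keep only the text before " (".
const stripRole = (name) => name.split(" (")[0];

// Plain-JS analogue of $setIntersection: values present in every array.
const intersect = (...arrays) =>
  [...new Set(arrays[0])].filter((x) => arrays.every((a) => a.includes(x)));

// Without cleaning, "Kenneth Branagh (screenplay)" never equals "Kenneth Branagh".
const raw = intersect(movie.writers, movie.directors, movie.cast);

// After stripping the suffix, the shared name is found.
const cleaned = intersect(movie.writers.map(stripRole), movie.directors, movie.cast);

console.log(raw);     // []
console.log(cleaned); // [ 'Kenneth Branagh' ]
```

If “directors” or “cast” carried the same kind of suffix, they would need the same cleaning step before the intersection, which is why it seemed worth checking them first.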

So I ran this query:

db.movies.aggregate({ $match: { writers: { $regex: '(.*)', $options: 'i' }}}).pretty()

and got many results that do not contain “(play)”, “(novel)”, etc.

So I tested the regex at https://regex101.com/ (in PCRE2 mode) and it passed.

But no luck in my query. I get ordinary strings like “Thomas White”, as well as “Georges Méliès”, and “William Shakespeare (play)”.

I’m not a regular regex user so there’s probably something about $regex that I’m missing. Can you tell me what it is?


OK – it’s making sense now (or I’m making sense …)

This works

db.movies.aggregate({ $match: { writers: { $regex: /\(.*\)/, $options: 'i' }}}).itcount()

and counts 22595


db.movies.aggregate({ $match: { cast: { $regex: /\(.*\)/, $options: 'i' }}}).itcount()

reports 0

As does this

db.movies.aggregate({ $match: { directors: { $regex: /\(.*\)/, $options: 'i' }}}).itcount()
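
For anyone who hits the same thing, here is my best guess at the explanation, sketched in plain JavaScript rather than run against the real collection. Unescaped parentheses form a capture group, so the pattern (.*) matches any string at all. To match literal parentheses they have to be escaped, and inside a quoted string a single backslash before “(” is silently dropped, so it must be doubled there or the pattern written as a regex literal. The sample names are the ones from my earlier results; the variable names are only for illustration.

```javascript
// Sample strings taken from the results quoted earlier in this thread.
const names = ["Thomas White", "Georges Méliès", "William Shakespeare (play)"];

// Unescaped parentheses are a capture group, so this matches any string.
const unescaped = new RegExp("(.*)", "i");

// Escaped parentheses match a literal "(" followed later by a literal ")".
const escaped = /\(.*\)/i;

// In a quoted string, "\(" is just "(", so the backslash must be doubled
// to reach the regex engine.
const fromString = new RegExp("\\(.*\\)", "i");

console.log(names.filter((n) => unescaped.test(n)).length); // 3
console.log(names.filter((n) => escaped.test(n)));          // [ 'William Shakespeare (play)' ]
console.log("\(.*\)" === "(.*)");                           // true
console.log(fromString.source === escaped.source);          // true
```

The same reasoning should carry over to $regex, since the shell evaluates the string or regex literal as JavaScript before the query is sent.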

Now I can get on with the setIntersection question.

Thanks for listening :)
