Regex keeps avoiding some values

String : "Won 3 Oscars. Another 2 wins & 1 nomination. "

I want to find all documents whose awards field starts with Won, followed by a number and then Oscars.
For example:
“Won 3 Oscars. Another 2 wins & 1 nomination.”
“Won 2 Oscars. Another 3 wins & 1 nomination.”

I am using

    {$match : {
        $and : [
            {awards : {$regex : /^Won/i}},
            {awards : {$regex : /Oscars/i}}
        ]
    }},

    {$project : {
        _id : 0,
        val : {$toInt :
            {$arrayElemAt : [{$split : ["$awards", " "]}, 1]}
        }
    }},

    {$match : {
        val : {$gte : 1}
    }},

    {$sort : {val : 1}}

For some reason it keeps skipping Won 1 Oscars.
I also tried splitting the string and matching the first and third elements.

The same problem arises: it skips Won 1 Oscars.

It looks like you want us to give you the $regex to solve:

Won 13 Oscars
Won 1 Oscar

from Chapter 3 - Lab - $group and Accumulators.

In your title you asked "Is there a better way?", but you did not share the way you currently have, so it is hard to say whether what we can propose is better or not.

Anyway the documentation for $regex is at https://docs.mongodb.com/manual/reference/operator/query/regex/.

In the documentation, you will find that the caret is an operator that anchors the match at the beginning of a string. The plus sign allows you to match one or more of the previous pattern. Numbers can be matched in 2 ways.
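
As a small illustration of those building blocks, here is a sketch in plain JavaScript, whose regex literals are written the same way as the ones passed to $regex in the mongo shell. The sample strings and variable names are made up for illustration, and the sketch deliberately stops short of the full lab pattern.

```javascript
// Hypothetical sample strings, not taken from the lab data set.
const samples = ["Won 13 awards", "He Won 2 awards", "Won many awards"];

// ^ anchors the match at the beginning of the string.
const startsWithWon = /^Won/;

// + means "one or more of the previous pattern"; a digit can be written
// either as \d or as the character class [0-9].
const digitsShorthand = /\d+/;
const digitsClass = /[0-9]+/;

const anchored = samples.map((s) => startsWithWon.test(s));
const withDigitsA = samples.map((s) => digitsShorthand.test(s));
const withDigitsB = samples.map((s) => digitsClass.test(s));

console.log(anchored);    // [ true, false, true ]
console.log(withDigitsA); // [ true, true, false ]
console.log(withDigitsB); // [ true, true, false ]
```

Both ways of writing digits give the same result on these strings, so which one to use is mostly a matter of taste.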

I am using
    {$match : {
        $and : [
            {awards : {$regex : /^Won/i}},
            {awards : {$regex : /Oscars/i}}
        ]
    }}

For some reason it keeps skipping Won 1 Oscars.
I also tried splitting the string and matching the first and third elements.

The same problem arises: it skips Won 1 Oscars.

Because it is

Won 1 Oscar

rather than

Won 1 Oscars
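
To see the difference concretely, here is a quick check in plain JavaScript using the same /Oscars/i pattern as in the pipeline. The singular sample string is an assumed example of what such a document might contain, not a value copied from the data set.

```javascript
// The second pattern from the $and clause in the question.
const plural = /Oscars/i;

// Assumed singular example; the plural one is the string from the question.
const singularText = "Won 1 Oscar. Another 2 wins.";
const pluralText = "Won 3 Oscars. Another 2 wins & 1 nomination.";

const results = {
  singular: plural.test(singularText), // "Oscar." has no trailing "s"
  plural: plural.test(pluralText),
};

console.log(results); // { singular: false, plural: true }
```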


Thanks.

If the string is “Won 2 Golden Globes. Nominated for Oscar”, how can I avoid matching it?

My regex only checks for “Won” at the start of the text and for “Oscar” anywhere after it, so it treats the string above as a match.

I could use an array, but that seems tedious. Is there a better way with regex?

You need a single regular expression that reads like:

  1. anchored at the beginning
  2. has the word Won
  3. has a space
  4. allows 1 or more digits with backslash d or the character class [0-9]
  5. has another space
  6. has the word Oscar.
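
One possible reading of the six steps above, sketched in plain JavaScript. The names wonOscars, awards, matches and matchStage are illustrative, the singular sample string is an assumption, and this is only one pattern that fits the description rather than the official lab solution.

```javascript
// Steps 1-6: anchor, "Won", space, one or more digits, space, "Oscar".
// Stopping at "Oscar" lets the pattern match both "Oscar" and "Oscars".
const wonOscars = /^Won [0-9]+ Oscar/;

const awards = [
  "Won 3 Oscars. Another 2 wins & 1 nomination.",
  "Won 1 Oscar. Another 2 wins.", // assumed singular example
  "Won 2 Golden Globes. Nominated for Oscar",
];

// Keep only the strings the pattern accepts.
const matches = awards.filter((s) => wonOscars.test(s));
console.log(matches);

// The same pattern placed in a single $match stage, replacing the $and.
const matchStage = { $match: { awards: { $regex: wonOscars } } };
```

Because the digits must sit directly between "Won " and " Oscar", the Golden Globes string is rejected even though it contains both words.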

Thanks, it solved the problem.