Can't complete Chapter 2: Basic Aggregation - Utility Stages Lab: Using Cursor-like Stages

This is the pipeline I came up with so far:

‘’'var pipeline = [{$match : {“tomatoes.viewer.rating” : {$gte : 3.0}}}, {$addFields : {num_favs: {$setIntersection: [[“Sandra Bullock”,“Tom Hanks”, “Julia Roberts”, “Kevin Spacey”, “George Clooney”], “$cast”]}}}, {$addFields : {num_favs : {$size : “$num_favs”}}}, {$sort : {“num_favs” : -1, “tomatoes.viewer.rating” : -1, “title” : -1}}, {$limit : 25}]

db.movies.aggregate(pipeline)’’’

I am getting the following error when executing the above pipeline:

uncaught exception: Error: command failed: {
   "operationTime" : Timestamp(1620020717, 1),
   "ok" : 0,
   "errmsg" : "The argument to $size must be an array, but was of type: null",
   "code" : 17124,
   "codeName" : "Location17124",
   "$clusterTime" : {
      "clusterTime" : Timestamp(1620020717, 1),
      "signature" : {
         "hash" : BinData(0,"uGzb6vDpGxtcESLU1KCIkxm1pcg="),
         "keyId" : NumberLong("6902062171803353090")
      }
   }
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:639:17
assert.commandWorked@src/mongo/shell/assert.js:729:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1

I also tried testing the pipeline by removing the last three stages, and $setIntersection is not working: I am getting an empty array for num_favs in every document.

That’s the way to go. B-)

What database are you using?

Try to replace $addFields with a $match that uses $in to see if you have any documents with the desired actors. Something like (untested):

{ $match :  { cast : { $in : [ ... actors ] } } }
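
A minimal sketch of that check, assuming the course's movies collection and the five actors from the original post. The names favorites and checkPipeline, and the $project and $limit stages, are only illustrative additions to keep the output short.

```javascript
// Favorite actors, copied from the pipeline in the original post.
var favorites = [
  "Sandra Bullock", "Tom Hanks", "Julia Roberts",
  "Kevin Spacey", "George Clooney"
];

// Keep only documents whose cast contains at least one favorite,
// then show a few titles with their cast for a quick visual check.
var checkPipeline = [
  { $match: { cast: { $in: favorites } } },
  { $project: { _id: 0, title: 1, cast: 1 } },
  { $limit: 5 }
];

// In the mongo shell (assumes the course's movies collection):
// db.movies.aggregate(checkPipeline)
```

If this returns documents, the actor names are spelled the way the collection stores them and the problem lies in a later stage.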

Here is my new pipeline:

var pipeline = [{$match : {$and : [{“tomatoes.viewer.rating” : {$gte : 3.0}}, {cast : {$in : [“Sandra Bullock”, “Tom Hanks”, “Julia Roberts”, “Kevin Spacey”, “George Clooney”]}}]}}, {$addFields : {num_favs: {$setIntersection: [[“Sandra Bullock”,“Tom Hanks”, “Julia Roberts”, “Kevin Spacey”, “George Clooney”], “$cast”]}}}, {$addFields : {num_favs : {$size : “$num_favs”}}}, {$sort : {“num_favs” : -1, “tomatoes.viewer.rating” : -1, “title” : -1}}, {$limit : 25}]

Now I get a result when aggregating, but it is still not the right answer. I assume the 25th document corresponds to the first record in the result, for which I get Gravity. How do I get the right result?

You are missing one of the requirements, something like the release country.

This is my new pipeline. Still not getting the right result:

var pipeline = [{$match : {$and : [{“tomatoes.viewer.rating” : {$gte : 3.0}},{countries: {$in : [“USA”]}}, {cast : {$in : [“Sandra Bullock”, “Tom Hanks”, “Julia Roberts”, “Kevin Spacey”, “George Clooney”]}}]}}, {$addFields : {num_favs: {$setIntersection: [[“Sandra Bullock”,“Tom Hanks”, “Julia Roberts”, “Kevin Spacey”, “George Clooney”], “$cast”]}}}, {$addFields : {num_favs : {$size : “$num_favs”}}}, {$sort : {“num_favs” : -1, “tomatoes.viewer.rating” : -1, “title” : -1}}, {$limit : 25}]

Please repost your pipeline with proper formatting, using triple backticks or the HTML pre or code tags. We cannot copy your pipeline directly into our systems because the quotes are all wrong when it is posted as normal text.

Hi @Daniel_Martinez,

A few things to check in your aggregation pipeline:

  1. In the $match stage, comma-separated conditions are ANDed by default, so an explicit $and is not needed.

  2. To check for “movies released in USA”, you can use the condition directly: "countries": "USA"

  3. You can predefine an array, favorites, to reduce the complexity of your pipeline, as below:

    var favorites = ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]

  4. You can use a single $addFields stage to compute the $size of num_favs.

I would recommend checking the output of each stage before adding any subsequent stage. If you still face any issues, please feel free to reach out.
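
Put together, the four points might look like the sketch below. It assumes the course's movies collection and reuses the field names and the final $limit from the pipelines posted earlier in this thread; it is one possible tidy-up, not necessarily the expected lab answer.

```javascript
// Point 3: predefine the favorites once and reuse them.
var favorites = [
  "Sandra Bullock", "Tom Hanks", "Julia Roberts",
  "Kevin Spacey", "George Clooney"
];

var pipeline = [
  // Points 1 and 2: comma-separated conditions are ANDed implicitly,
  // and "USA" can be matched directly against the countries array.
  {
    $match: {
      "tomatoes.viewer.rating": { $gte: 3 },
      countries: "USA",
      cast: { $in: favorites }
    }
  },
  // Point 4: a single $addFields computes the intersection and its size.
  {
    $addFields: {
      num_favs: { $size: { $setIntersection: ["$cast", favorites] } }
    }
  },
  { $sort: { num_favs: -1, "tomatoes.viewer.rating": -1, title: -1 } },
  { $limit: 25 }
];

// In the mongo shell (assumes the course's movies collection):
// db.movies.aggregate(pipeline)
```

To follow the advice about checking each stage, run the aggregation with only the first stage, then the first two, and so on.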

Kind Regards,
Sonali Mamgain

I think the problem is that I shouldn’t have the expression {cast : {$in : favorites}} in the $match stage, since the expectation is that I also get num_favs: 0 for some documents. However, when I remove the expression and aggregate, I get the following error:

Error: command failed: {
   "operationTime" : Timestamp(1620885318, 1),
   "ok" : 0,
   "errmsg" : "The argument to $size must be an array, but was of type: null",
   "code" : 17124,
   "codeName" : "Location17124",
   "$clusterTime" : {
      "clusterTime" : Timestamp(1620885318, 1),
      "signature" : {
         "hash" : BinData(0,"90Ejp9zItF7OPD8BHdCBnv97kRw="),
         "keyId" : NumberLong("6931036270290272257")
      }
   }
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:639:17
assert.commandWorked@src/mongo/shell/assert.js:729:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1

That means $setIntersection is returning null for some documents when it shouldn’t. Help!

Hi @Daniel_Martinez,

Using the condition cast: { $in: favorites } in the $match stage will ensure that you do not get null as a result of $setIntersection.

I would highly recommend tidying up your aggregation pipeline first and then debugging the same. I am able to get the right answer using the aggregation pipeline that you have shared.
Alternatively, you could use a condition to keep null values from reaching the $size operator. Refer to this example: https://docs.mongodb.com/manual/reference/operator/aggregation/size/#example
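
A sketch of such a guard, modeled on the $cond and $isArray pattern from the linked $size example. The field names come from this thread; falling back to 0 when cast is not an array is an assumption made here, not something the lab specifies.

```javascript
var favorites = [
  "Sandra Bullock", "Tom Hanks", "Julia Roberts",
  "Kevin Spacey", "George Clooney"
];

// Only call $size when cast is an array; otherwise fall back to 0
// (the fallback value is an assumption for this sketch).
var guardedStage = {
  $addFields: {
    num_favs: {
      $cond: {
        if: { $isArray: "$cast" },
        then: { $size: { $setIntersection: ["$cast", favorites] } },
        else: 0
      }
    }
  }
};

// In the mongo shell (assumes the course's movies collection):
// db.movies.aggregate([guardedStage, { $limit: 5 }])
```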

If you still face any issues, can you please share the edited pipeline?

Kind Regards,
Sonali Mamgain

Hi everyone,
I have the same issue as Daniel: "errmsg" : "The argument to $size must be an array, but was of type: null". The query I wrote is the following:
db.movies.aggregate([
	{
		$match: {"countries": {$all: ["USA"]}, "tomatoes.viewer.rating": {$gte: 3}, cast: {$in: ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]}}
	},
	{
		$addFields: {num_favs: {$size: {$setIntersection: ["$cast", ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]]}}}
	},
	{
		$match: {"num_favs": {$ne: []}}
	},
	{
		$sort: {num_favs: -1, "tomatoes.viewer.rating": -1, title: -1}
	},
	{
		$skip: 24
	},
	{
		$limit: 1
	}])

My question is: why does the error message appear only when I add the $sort stage? If I run the query without it, there is no error and I get a result.

Thanks in advance

First, the error means that $setIntersection does not output an array, most likely because some documents do not have a cast array.

Second, why does the error appear only with $sort? Without $sort, $skip and $limit let the pipeline stop after the first 25 documents, and those all have a cast array. With $sort, every matching document has to be processed before any can be skipped, including the ones without a cast array.

Most likely you can filter out documents with a non-existent cast in the $match stage.
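
One way to do that filtering, as a sketch: require cast to be an array in the $match stage. Using $type: "array" is a choice made for this example (it assumes MongoDB 3.6 or later, where it matches array-valued fields), and the other conditions are copied from the pipelines in this thread.

```javascript
var favorites = [
  "Sandra Bullock", "Tom Hanks", "Julia Roberts",
  "Kevin Spacey", "George Clooney"
];

var pipeline = [
  // Drop documents where cast is missing, null, or otherwise not an array,
  // so that $setIntersection always receives two arrays.
  {
    $match: {
      "tomatoes.viewer.rating": { $gte: 3 },
      countries: "USA",
      cast: { $type: "array" }
    }
  },
  {
    $addFields: {
      num_favs: { $size: { $setIntersection: ["$cast", favorites] } }
    }
  },
  { $sort: { num_favs: -1, "tomatoes.viewer.rating": -1, title: -1 } }
];

// In the mongo shell (assumes the course's movies collection):
// db.movies.aggregate(pipeline)
```

Unlike cast: { $in: favorites }, this filter keeps movies with none of the favorites, so num_favs: 0 can still appear in the output.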

Hi steevej,
$setIntersection always outputs an array; if there are no common elements, it outputs an empty array. As described in the MongoDB docs: "Takes two or more arrays and returns an array that contains the elements that appear in every input array."
I think the error message says that $size does not want an empty array.
My question was: why do I get the error message only when the $sort stage is present, and not otherwise?
Specifically, this pipeline:

db.movies.aggregate([
	{
		$match: {"countries": {$all: ["USA"]}, "tomatoes.viewer.rating": {$gte: 3}, cast: {$in: ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]}}
	},
	{
		$addFields: {num_favs: {$size: {$setIntersection: ["$cast", ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]]}}}
	},
	{
		$match: {"num_favs": {$ne: []}}
	},
	{
		$sort: {num_favs: -1, "tomatoes.viewer.rating": -1, title: -1}
	},
	{
		$skip: 24
	},
	{
		$limit: 1
	}])

This one produces the error message.

db.movies.aggregate([
	{
		$match: {"countries": {$all: ["USA"]}, "tomatoes.viewer.rating": {$gte: 3}, cast: {$in: ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]}}
	},
	{
		$addFields: {num_favs: {$size: {$setIntersection: ["$cast", ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]]}}}
	},
	{
		$match: {"num_favs": {$ne: []}}
	},
	{
		$skip: 24
	},
	{
		$limit: 1
	}])

This one produces no error message, and I get response data.

Why does it behave this way?

Thanks

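That would be a very bad API if the message

The argument to $size must be an array, but was of type: null

meant that $size does not want an empty array.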

Here is an experiment, starting with these documents:

{ _id: 1 }
{ _id: 2, array: [] }
{ _id: 3, array: [ 1 ] }
{ _id: 4, array: null }

We will apply $addFields to each document, one at a time. The pipeline looks like this:

[ { '$match': { _id: 1 } },  // _id will be changed for each test case.
  { '$addFields': { size_of_array: { '$size': '$array' } } } ]

The results

// _id:1
The argument to $size must be an array, but was of type: missing
// _id:2
[ { _id: 2, array: [], size_of_array: 0 } ]
// _id:3
[ { _id: 3, array: [ 1 ], size_of_array: 1 } ]
// _id:4
The argument to $size must be an array, but was of type: null

So $size of an empty array returns 0, and it generates the error message you are getting when its argument is null. So for some documents, the result of $setIntersection must be null. Let’s try it using the same documents and this pipeline:

[ { '$addFields': { intersection: { '$setIntersection': [ '$array', [ 9 ] ] } } } ]

which provides the results:

{ _id: 1, intersection: null },
{ _id: 2, array: [], intersection: [] },
{ _id: 3, array: [ 1 ], intersection: [] },
{ _id: 4, array: null, intersection: null }

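So my conclusions remain: the error means that $setIntersection does not output an array for documents without a cast array, and it appears only with $sort because, without it, $skip and $limit apply the pipeline only to the first 25 documents, which all have a cast array.

It looks like 1032 documents in the collection have cast: null or no cast field at all.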
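
For anyone who wants to check this on their own copy of the data, here is a sketch with two pieces: a pipeline that counts the documents whose cast is null or missing, and an $addFields stage that substitutes an empty array for such a cast before intersecting. Using $ifNull is an alternative suggested here, not something proposed earlier in the thread, and the names countMissingCast and safeNumFavs are only illustrative.

```javascript
var favorites = [
  "Sandra Bullock", "Tom Hanks", "Julia Roberts",
  "Kevin Spacey", "George Clooney"
];

// A query for { cast: null } matches documents where cast is null
// as well as documents that have no cast field.
var countMissingCast = [
  { $match: { cast: null } },
  { $count: "missing_cast" }
];

// $ifNull replaces a null or missing cast with [], so $setIntersection
// returns an array and $size returns a number for every document.
var safeNumFavs = {
  $addFields: {
    num_favs: {
      $size: { $setIntersection: [{ $ifNull: ["$cast", []] }, favorites] }
    }
  }
};

// In the mongo shell (assumes the course's movies collection):
// db.movies.aggregate(countMissingCast)
// db.movies.aggregate([safeNumFavs, { $limit: 5 }])
```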

I just need a place to express how frustrated this lab made me feel. I feel like I wasn’t taught the knowledge to complete it. It felt like intentional torture, and it made me miserable. Even more than a forum to seek the actual skills to do the labs, I need a place to express my frustrations without being judged.

Couldn’t agree more. I have been stuck on this for the past day with no results!

Hi @priyanka_prasad & @Katie_Guill, thank you for your feedback.
I will forward this feedback to the appropriate channel, and please rest assured that our team will make a fix as soon as possible.
Having said that, we are working on a complete revamp of this course from scratch and will soon be opening it up for beta testing. Please let us know if you would like to be a part of that beta program.

In case you have any doubts, please feel free to reach out to us.

Thanks and Regards.
Sourabh Bagrecha,
Curriculum Services Engineer

@SourabhBagrecha Thank you for taking this up! I would be glad to be a part of the beta program. Count me in!

Hi @Sonali_Mamgain. Any idea what’s wrong with my pipeline?

To tell the truth, I have been struggling for a while.

For some reason I cannot reference the num_favs array in the $match stage for checking the cast.
I thought $addFields would create a “variable” with some value, so that we could use that variable in the following stages.
Isn’t that the case?

In your $sort stage you use tomatoes.viewer.rating, but this field is not present anymore since it is not part of the previous $project stage.
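
The pipeline being discussed is not shown in the thread, so the following is only a sketch of the general point: a later stage can only use fields that the preceding $project kept or computed. The stage contents are assumptions based on the other pipelines in this thread.

```javascript
var favorites = [
  "Sandra Bullock", "Tom Hanks", "Julia Roberts",
  "Kevin Spacey", "George Clooney"
];

var pipeline = [
  {
    $match: {
      "tomatoes.viewer.rating": { $gte: 3 },
      countries: "USA",
      cast: { $in: favorites }
    }
  },
  // Keep every field that a later stage needs: title and the rating
  // are passed through explicitly, num_favs is computed here.
  {
    $project: {
      _id: 0,
      title: 1,
      "tomatoes.viewer.rating": 1,
      num_favs: { $size: { $setIntersection: ["$cast", favorites] } }
    }
  },
  // All three sort keys exist in the documents produced by $project.
  { $sort: { num_favs: -1, "tomatoes.viewer.rating": -1, title: -1 } }
];

// In the mongo shell (assumes the course's movies collection):
// db.movies.aggregate(pipeline)
```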

You are right, but that’s not a solution. Thanks anyway.