M220 Chapter 1 Driver Setup Ticket

I am looking at the query in the notebook and also online at W3Schools… I have some questions.

First… in the notebook they set the movies variable in the connection string… Where I am confused is that, according to the notebook, I don’t need to stipulate any aggregation pipeline (hence I shouldn’t have to use $match)… It appears that Python assumes the match portion as well as the projection, as is evident below.

# find all movies with Salma Hayek, but only project the "_id" and "title" fields
cursor = movies.find( { "cast": "Salma Hayek" }, { "title": 1 } )
print(dumps(cursor, indent=2))

Where I am confused is in the return statement in the db.py file.

return list(db.movies.find()).limit(1)

In the notebook mflix.movies is assigned to the variable movies…

My question is do I need to start my query with any of the following?

I am currently using this query

movie = {"countries": {"$in": countries}}, {"title": 1}

Do I need to start the query with
movie = db.movies.find(…), It seems a little redundant since it is in the return statement. I should just be able to define the query…

or perhaps movie = db.movies.aggregate([…])??

Where can I go to read on how to do this, or am I overthinking this?

Hi @David_Thompson,

This assumption is not correct. I believe you are confusing MongoDB Query Language (MQL) with MongoDB Aggregation. The notebook as well as this function use the .find() method of MQL. As you proceed with the course, in other labs, you will learn and implement aggregation pipelines as well.

Unlike the Jupyter notebook code, the Python function here has a return statement. The task is to write the query inside the find() of this statement so that it is a valid query. You can certainly, as a Python developer, explore other ways too, like writing the query, assigning it to a variable, and then returning it, but keep in mind what the function’s return statement expects, i.e., the result of the query as a list.
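To make the find() shape concrete, here is a minimal sketch of how the two arguments might be built. It assumes PyMongo's find(filter, projection) call shape; build_find_args is a hypothetical helper name, not part of the lab's db.py, and the actual find() call is left as a comment because it needs a live connection.

```python
def build_find_args(countries):
    """Return the (filter, projection) pair for a find() call.

    Hypothetical helper for illustration; not part of the lab's db.py.
    """
    # Filter: which documents to return (the role $match plays in a pipeline).
    query_filter = {"countries": {"$in": countries}}
    # Projection: which fields to return; "_id" is included by default.
    projection = {"title": 1}
    return query_filter, projection


query_filter, projection = build_find_args(["Russia", "Japan"])
print(query_filter)
print(projection)

# With a live connection the pair would be passed as two separate arguments:
#   cursor = db.movies.find(query_filter, projection)
```

Note that writing movie = {...}, {...} on one line creates a Python tuple of two dicts, so it would have to be unpacked, for example find(*movie), rather than passed as a single argument.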

I would advise you to properly read the lab instructions as well as the commented-out portions of the db.py functions to properly understand the tasks.

Let us know if the doubt still persists.

Regards,
Satyam

Ok… My understanding of the ticket is that I am matching the “countries” array to the contents of the countries input…

The ticket states I can use the $in: operator to look for multiple entries in the countries variable. I am then using $project: to project the title and _id field.

My question is: I think I am matching from two arrays. Does this mean I can use one $in: operator, or do I need to use the $in: operator twice?

If it’s twice, can I combine the statements into one $in: statement? Like this

{"$match": {"$in": {“countries”, countries}}}

This would match the contents of the countries array in mongo with the countries variable. Or do I need to separate them out?

{"$match": {"$in": {“countries”}}: {"$in":{countries}}}

Same syntax as the other courses you took.

See https://docs.mongodb.com/manual/reference/operator/query/in/

@Satyam ,
Thanks… i see my error… []…

@steevej ,
So… Correct me if I am wrong, but I am going to write my query and break it down… Please tell me where I am going wrong so I can see my error.

I want to match the countries array with the comma-separated items in the variable string countries.

movie = {"$match"": “countries” : {"$in": { [ countries ] } } }

This should match the “countries” in the record with the contents in the countries variable… I cannot get an output in the db.py file once it is saved and I run pytest -m projection. (I just noticed it may be because I am missing the projection portion.) Is there a way to break down the query and ensure that I am writing the match portion correctly?

One other question… Do I need to use $elemMatch to match the elements of the variable to the field?

My current query is:

return list(db.movies.find([{
            "$match": {"countries": {"$in": {[countries]}}}},
                                    {"$project": {"$title": 1}}])).limit(1)

Do I need to switch $match with $elemMatch?

Are you doing an aggregation or calling find()?

If find(), then specifying $match is wrong.

If aggregation, then your $match does not follow the $match syntax.

You have

{ "$match"" : "countries" ... }

while you should have

{ "$match" : { "countries" : ... } }

So you have 2 things wrong. You have an extra double quote. The query part of the match is not in an object. Queries are JSON documents and must follow JSON syntax.
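For comparison only, the aggregation form of the same query could look like the sketch below. It illustrates the $match syntax described above; the lab function itself calls find(). The country values are example inputs assumed for illustration, and the aggregate() call is shown as a comment since it needs a live connection.

```python
import json

countries = ["Russia", "Japan"]  # example input, assumed for illustration

# Each stage is its own document; the query part of $match is itself an object.
pipeline = [
    {"$match": {"countries": {"$in": countries}}},
    {"$project": {"title": 1}},
]

# Serializing confirms the pipeline is well-formed as JSON.
print(json.dumps(pipeline, indent=2))

# With a live connection: cursor = db.movies.aggregate(pipeline)
```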

As for the $in part. The first example from the documentation I provided reads:

field: { $in: [<value1>, <value2>, ... <valueN> ] }

but you have

"countries" : { "$in" : { [ countries ] } } 

The curly braces around [ countries ] do not match the syntax. Also, if countries, the variable inside the brackets, is already an array, then you should not put brackets around it.
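A small sketch of the difference in plain Python, with no database needed. It assumes countries arrives as a list, as in the lab's tests; the variable names good, bad and curly_failed are only for illustration.

```python
countries = ["Kosovo"]  # already a list, as passed in by the test

good = {"countries": {"$in": countries}}   # $in receives a flat list
bad = {"countries": {"$in": [countries]}}  # $in receives a list inside a list

print(good)  # {'countries': {'$in': ['Kosovo']}}
print(bad)   # {'countries': {'$in': [['Kosovo']]}}

# In Python, {[countries]} is a set literal holding a list, which is not
# hashable, so the expression fails before any query is sent.
try:
    {"countries": {"$in": {[countries]}}}
    curly_failed = False
except TypeError:
    curly_failed = True

print(curly_failed)  # True
```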

You edited your post while I was replying so some of the things became more clear.

You are calling find(), so you do not use $match.

@steevej ,
I don’t want the answer, but I need some advice. I have looked at the link you sent, along with going onto Compass and doing the query (minus the variable) and looking at it in Python 3. Here is my issue.

I know that I can use the $in operator to look at the comma separated list of variables contained in the countries argument. I also know that $in should be used to look into an array (if I remember right).

Where I get confused is whether I can combine the array I need to match with the $in operator for the variable. It would change the syntax considerably for the query.

I am also confused as to whether I need to enclose the $in operator for the variable in [] or {}… It is technically a comma separated list (not an array)… or does Python treat lists as arrays?

This is my current query

return list(db.movies.find({"countries": {"$in":[countries]}},{"$title":1})).limit(1)

This does not return anything… I get a len() error… This leads me to believe that I am not returning on the query…

Where I am confused is, from what I remember… to match an entry in an array, you need to use the $in operator… That changes the structure of the query and confuses me on how to structure the $in statement since it needs to match an item in an array to a list… Is it just {"$in": [“countries”, countries]}?

No. But did you try it? Sometimes just trying something is more useful than asking.

You should not put brackets around it means

{"$in":countries}

rather than

{"$in":[countries]}

I have looked at this all the ways I know how… I have even entered the query onto Compass and exported it to Python3… Entering the query in Compass (I used the countries in the py.test) I get the following.

Exporting it to Python3 I get

Now… Adding the other test (“Russia”, “Japan”) I get the following

Exporting this to Python 3…

So… Logically all I should have to do is to replace the “Kosovo”, “Russia”, “Japan” respectively with the variable name countries. This doesn’t work.

return list(db.movies.find({"countries": {"$in":[countries]}}, {"title": 1})).limit(1)

This returns the following when I run pytest -m projection

(mflix) bossmandave@Davids-MacBook-Pro mflix-python % pytest -m projection
==================================================================== test session starts =====================================================================
platform darwin -- Python 3.8.12, pytest-3.3.0, py-1.8.0, pluggy-0.6.0
rootdir: /Users/bossmandave/Desktop/mflix-python, inifile: pytest.ini
plugins: flask-0.11.0
collected 43 items                                                                                                                                           

tests/test_projection.py FF                                                                                                                            [100%]

========================================================================== FAILURES ==========================================================================
________________________________________________________________ test_basic_country_search_db ________________________________________________________________

client = <FlaskClient <Flask 'mflix.factory'>>

    @pytest.mark.projection
    def test_basic_country_search_db(client):
        countries = ['Kosovo']
        result = get_movies_by_country(countries)
>       assert len(result) == 2
E       TypeError: object of type 'AttributeError' has no len()

tests/test_projection.py:15: TypeError
_____________________________________________________________ test_basic_country_search_shape_db _____________________________________________________________

client = <FlaskClient <Flask 'mflix.factory'>>

    @pytest.mark.projection
    def test_basic_country_search_shape_db(client):
        countries = ['Russia', 'Japan']
        result = get_movies_by_country(countries)
>       assert len(result) == 1237
E       TypeError: object of type 'AttributeError' has no len()

tests/test_projection.py:22: TypeError
==================================================================== 41 tests deselected =====================================================================
========================================================== 2 failed, 41 deselected in 2.80 seconds ===========================================================
(mflix) bossmandave@Davids-MacBook-Pro mflix-python % 

I changed the [ ] to { } and I get the same result… Thoughts?

Oh as a side note I am running the test from the terminal inside the VS Code program… It shouldn’t matter since I have the whole project folder inside the program.

Hey @David_Thompson,

Kindly remove the square brackets around the countries variable in your query. After doing so, your return statement should become:

return list(db.movies.find({'countries': {'$in': countries}},{'title':1}))

Regards,
Satyam

@Satyam ,
I changed it and re ran pytest -m projection… It still fails

This

and this

Is the same recommendation.

Absolutely not. You see, in your code you receive countries (the variable) exactly equal to [ “Kosovo” ], an array of 1 value. The brackets are already there.

Now your issue is that it looks like you missed the following from the instructions in the TODO comment.

Do not include a limit in your own implementation, it is included here to avoid sending 46000 documents down the wire.

But your return statement still has a limit. Yours is

return list(db.movies.find({'countries': {'$in': countries}},{'title':1})).limit(1)

@Satyam ,
That limit was in the original return statement

I didn’t add it. It was in the original db.py file.

@David_Thompson,
Yes, originally the limit(1) has been put to avoid sending many documents, as written in the comments of the function and rightly pointed out by @steevej.

Kindly try removing the limit() in your implementation and then try running.
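The "object of type 'AttributeError' has no len()" message in the test output is consistent with .limit(1) being called on the result of list(...) rather than on the cursor. Below is a minimal sketch without a database. It assumes the lab function catches exceptions and returns them, which is inferred from the test output and not confirmed from db.py; fake_get_movies is a hypothetical stand-in.

```python
def fake_get_movies(rows):
    """Hypothetical stand-in for the lab function; `rows` replaces the cursor."""
    try:
        # .limit() is a cursor method; a Python list has no such attribute.
        return list(rows).limit(1)
    except Exception as exc:
        # Assumed behaviour: the error object is returned instead of raised.
        return exc


result = fake_get_movies([{"title": "A"}, {"title": "B"}])
print(type(result).__name__)  # AttributeError

try:
    len(result)
except TypeError as exc:
    print(exc)  # object of type 'AttributeError' has no len()
```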

~Satyam

@Satyam , @steevej ,
I got it… I misunderstood the directions. I thought that the .limit(1) was supposed to be there because it was in the original file. LOL (that was how the .net course was structured)…

Thank you for your patience… I will try and interpret the instructions better (although, in my opinion, it could be explained better). Also, the examples contained in the notebook are completely useless. (they don’t accurately reflect the MQL syntax)…

Just pointing that out…
