M220P Migration Ticket

I am unsure whether I understand the ticket, so I am asking for a pointer…

The ticket states

# TODO: Create the proper predicate and projection
# add a predicate that checks that the "lastupdated" field exists, and then
# checks that its type is a string
# a projection is not required, but may help reduce the amount of data sent
# over the wire!

So I created my predicate and projection as follows:

predicate = {"$lastupdated": {"$exists" : True, "$type": "string"}}
projection = {"_id": 1, "lastupdated": 1}

The predicate checks for the field lastupdated and then makes sure it exists and is of type string.

I then use projection to project the _id and the field lastupdated…

Then in the update I use the set operator for the lastupdated field.
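
For reference, a query filter names the field directly and reserves the "$" prefix for operators such as $exists and $type. The sketch below shows one way the predicate and projection could look. The commented find() call assumes a connected pymongo database handle named mflix, which is not created here.

```python
# Field names are written plainly; only operators carry the "$" prefix.
predicate = {"lastupdated": {"$exists": True, "$type": "string"}}

# Keep _id so each matched document can be targeted by the later update.
projection = {"_id": 1, "lastupdated": 1}

# Assumption: mflix is a connected pymongo database handle.
# cursor = mflix.movies.find(predicate, projection)
print(predicate)
print(projection)
```

Setting "_id": 0 in the projection would drop the identifier that the update step needs in order to match each document.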

My question is how do I run this… Do I now run conda activate mflix from the mflix-python folder… Then cd into the migration folder and run the script? Once the script runs successfully, do I then run the pytest -m migration test inside the migration folder or do I run it in the mflix-python folder?

Hey @David_Thompson,

Yes, activate the conda environment and then run the movie_last_updated_migration.py in the migration folder using the command:

python movie_last_updated_migration.py

Once you see the script running successfully with no errors, you can run the unit tests the same way you ran for previous labs. The command for this is:

pytest -m migration

Once the tests pass, run the application and proceed to the status page for the validation code.

Let us know if you are facing any issues when doing this.

Regards,
Satyam

Thanks… I am getting an error at line 35… It is the for loop after the query…

This leads me to believe my query is wrong… But I am unsure where it is wrong…

This is my predicate and projection

predicate = {"lastupdated": {"$exists" : True, "$type": "string"}}
projection = { "_id": 0, "lastupdated": 1 }

I believe it is correct… It identifies the field to check “lastupdated” and uses the “$exists” : True to check if it exists, then it checks the field type by using “$type” : “string”…

I believe I did this right…

When I run the script this is what I get

(mflix) bossmandave@Davids-MacBook-Pro-2 migrations % python3 movie_last_updated_migration.py
Traceback (most recent call last):
  File "movie_last_updated_migration.py", line 35, in <module>
    for doc in cursor:
  File "/usr/local/lib/python3.8/site-packages/pymongo/cursor.py", line 1189, in next
    if len(self.__data) or self._refresh():
  File "/usr/local/lib/python3.8/site-packages/pymongo/cursor.py", line 1087, in _refresh
    self.__session = self.__collection.database.client._ensure_session()
  File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1563, in _ensure_session
    return self.__start_session(True, causal_consistency=False)
  File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1516, in __start_session
    server_session = self._get_server_session()
  File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1549, in _get_server_session
    return self._topology.get_server_session()
  File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line 424, in get_server_session
    self._select_servers_loop(
  File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line 198, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: localhost:27017: [Errno 61] Connection refused
(mflix) bossmandave@Davids-MacBook-Pro-2 migrations % 

What connection am I supposed to use? Is this the cluster connection string?

# ensure you update your host information below!
host = "mongodb://127.0.0.1:5000"

I currently have it set to my local host that I open when I run python3 run.py

------------UPDATE------------

I fixed the host issue, but when I run the script I get the following output:

(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
23530 documents to migrate
Cannot encode object: {'lastupdated'}
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % 

I thought that the program ran successfully, but when I ran pytest -m migration, it fails…

Got the migration to work… I ran it once and it changed the documents… But the pytest -m migration fails…

I ran the script again it passes but doesn’t modify any documents… Test still fails… Here is my output:

(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
23530 documents to migrate
23529 documents updated
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % pytest -m migration                               
============================================== test session starts ==============================================
platform darwin -- Python 3.8.12, pytest-3.3.0, py-1.8.0, pluggy-0.6.0
rootdir: /Users/bossmandave/Desktop/mflix-python, inifile: pytest.ini
plugins: flask-0.11.0
collected 43 items                                                                                              

tests/test_migration.py F                                                                                 [100%]

=================================================== FAILURES ====================================================
_______________________________________________ test_proper_type ________________________________________________

client = <FlaskClient <Flask 'mflix.factory'>>

    @pytest.mark.migration
    def test_proper_type(client):
        result = get_movie("573a13b8f29313caabd4c8c5")
>       assert isinstance(result.get('lastupdated'), datetime)
E       AssertionError: assert False
E        +  where False = isinstance('2015-09-06 00:17:54.620000000', datetime)
E        +    where '2015-09-06 00:17:54.620000000' = <built-in method get of dict object at 0x111a1e640>('lastupdated')
E        +      where <built-in method get of dict object at 0x111a1e640> = {'_id': ObjectId('573a13b8f29313caabd4c8c5'), 'awards': {'nominations': 29, 'text': '5 wins & 29 nominations.', 'wins'...11, 13, 28), 'email': 'carice_van_houten@gameofthron.es', 'movie_id': ObjectId('573a13b8f29313caabd4c8c5'), ...}], ...}.get

tests/test_migration.py:15: AssertionError
============================================== 42 tests deselected ==============================================
==================================== 1 failed, 42 deselected in 0.35 seconds ====================================
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
23530 documents to migrate
0 documents updated
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % 

@David_Thompson,

Please check this post for the Migration Ticket. It should be able to resolve your error: Ticket: Migration

Regards,
Satyam

Thanks… I ran it under the mflix-python directory… Got the following:

(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
Traceback (most recent call last):
  File "migrations/movie_last_updated_migration.py", line 35, in <module>
    for doc in cursor:
  File "/usr/local/lib/python3.8/site-packages/pymongo/cursor.py", line 1189, in next
    if len(self.__data) or self._refresh():
  File "/usr/local/lib/python3.8/site-packages/pymongo/cursor.py", line 1087, in _refresh
    self.__session = self.__collection.database.client._ensure_session()
  File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1563, in _ensure_session
    return self.__start_session(True, causal_consistency=False)
  File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1516, in __start_session
    server_session = self._get_server_session()
  File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1549, in _get_server_session
    return self._topology.get_server_session()
  File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line 424, in get_server_session
    self._select_servers_loop(
  File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line 198, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: localhost:27017: [Errno 61] Connection refused
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % 

As far as checking the datatype of the lastupdated field… I believe it did in the predicate… Can you verify?

predicate = {"lastupdated": {"$exists" : True, "$type": "string"}}
projection = { "_id": 0, "lastupdated": 1 }

Hey @David_Thompson,

From your terminal output, your script movie_last_updated_migration.py is failing because it is trying to connect to localhost:27017:

pymongo.errors.ServerSelectionTimeoutError: localhost:27017: [Errno 61] Connection refused

In the migration file movie_last_updated_migration.py , line 17 sets the URI string:

16 # ensure you update your host information below!
17 host = "mongodb://localhost:27017"

To use your Atlas cluster, replace this string with the URI for your Atlas cluster, so that a connection can be established.
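
One quick way to see which server a host string points at, before running the migration, is to inspect it with the standard library. This is only a sketch: describe_host is a hypothetical helper, and the SRV URI uses made-up placeholder credentials and a made-up cluster name.

```python
from urllib.parse import urlsplit


def describe_host(uri):
    """Return (scheme, hostname, port) for a connection string."""
    parts = urlsplit(uri)
    return parts.scheme, parts.hostname, parts.port


# Default value in the migration script: a MongoDB server on this machine.
print(describe_host("mongodb://localhost:27017"))

# Placeholder Atlas-style SRV URI; user, password, and cluster name are made up.
print(describe_host("mongodb+srv://user:password@cluster0.example.mongodb.net/"))
```

A "Connection refused" on localhost:27017 means nothing is listening on that port locally, so the script needs the cluster's URI instead.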

Yes I found that error and corrected it… I got it to connect, and I got the migration to pass… (See above)
But the test pytest -m migration fails…

If I run the script again, it registers the records, but there are 0 modified… I checked my mflix movie collection and the lastupdated field is still a string…

In the for loop that builds movies_to_migrate, I believe parser.parse converts the lastupdated string to a Date (ISODate).

In the bulk update, I formatted it like this: {'$set': {'lastupdated': lastupdated}}.

-------------UPDATE------------

Now I really need help… I changed my bulk update to {'$set': {'lastupdated': '$lastupdated'}}.

Now the field has $lastupdated in it… LOL

I need to drop and reload the movies collection… Can I get a link on how to do this?

@Satyam,
I messed up the data on my cluster… I changed my $set to set lastupdated to "$lastupdated", and that value is in every document now… Do I just drop mflix and reload the entire cluster?

------------UPDATE----------

Fixed it … I just reloaded the sample dataset

bossmandave@Davids-MacBook-Pro-2 mflix-python % conda activate mflix
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
23530 documents to migrate
23529 documents updated
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % pytest -m migration                               
============================================== test session starts ==============================================
platform darwin -- Python 3.8.12, pytest-3.3.0, py-1.8.0, pluggy-0.6.0
rootdir: /Users/bossmandave/Desktop/mflix-python, inifile: pytest.ini
plugins: flask-0.11.0
collected 43 items                                                                                              

tests/test_migration.py F                                                                                 [100%]

=================================================== FAILURES ====================================================
_______________________________________________ test_proper_type ________________________________________________

client = <FlaskClient <Flask 'mflix.factory'>>

    @pytest.mark.migration
    def test_proper_type(client):
        result = get_movie("573a13b8f29313caabd4c8c5")
>       assert isinstance(result.get('lastupdated'), datetime)
E       AssertionError: assert False
E        +  where False = isinstance('2015-09-12 09:51:28.903000000', datetime)
E        +    where '2015-09-12 09:51:28.903000000' = <built-in method get of dict object at 0x10f60f440>('lastupdated')
E        +      where <built-in method get of dict object at 0x10f60f440> = {'_id': ObjectId('573a13b8f29313caabd4c8c5'), 'awards': {'nominations': 29, 'text': '5 wins & 29 nominations.', 'wins'...11, 13, 28), 'email': 'carice_van_houten@gameofthron.es', 'movie_id': ObjectId('573a13b8f29313caabd4c8c5'), ...}], ...}.get

tests/test_migration.py:15: AssertionError
============================================== 42 tests deselected ==============================================
==================================== 1 failed, 42 deselected in 0.31 seconds ====================================
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % 

Output from running the script and then running the test… Don’t know why it failed as it modified all but one document… However, if I go to the cluster, all documents still have the lastupdated as type string… Not sure if this is expected…

Where do I go from here?

Hey @David_Thompson,

Your test failure is related to the type of the value: the assertion checks whether lastupdated is a datetime. The test fails because the field was never converted to that type.

The next step is to fix your code and then re-run the test. Can you share your code here so that we can see if something's not correct in it?

# ensure you update your host information below!
# host = "https://127.0.0.1:5000"
host = "mongodb+srv://m220student:m220password@mflix.bcxn7.mongodb.net/"

# don't update this information
MFLIX_DB_NAME = "sample_mflix"
mflix = MongoClient(host)[MFLIX_DB_NAME]

# TODO: Create the proper predicate and projection
# add a predicate that checks that the "lastupdated" field exists, and then
# checks that its type is a string
# a projection is not required, but may help reduce the amount of data sent
# over the wire!
predicate = {"lastupdated": {"$exists": "True", "$type": "string"}}
projection = { "_id": 1, "lastupdated": 1 }

cursor = mflix.movies.find(predicate, projection)

# this will transform the "lastupdated" field to an ISODate() from a string
movies_to_migrate = []
for doc in cursor:
    doc_id = doc.get('_id')
    lastupdated = doc.get('lastupdated', None)
    movies_to_migrate.append(
        {
            "doc_id": ObjectId(doc_id),
            "lastupdated": parser.parse(lastupdated)
        }
    )

print(f"{len(movies_to_migrate)} documents to migrate")

try:
    # TODO: Complete the UpdateOne statement below
    # build the UpdateOne so it updates the "lastupdated" field to contain
    # the new ISODate() type
    bulk_updates = [UpdateOne(
        {"_id": movie.get("doc_id")},
        {"$set": {"lastupdated": lastupdated}}
    ) for movie in movies_to_migrate]

    # here's where the bulk operation is sent to MongoDB
    bulk_results = mflix.movies.bulk_write(bulk_updates)
    print(f"{bulk_results.modified_count} documents updated")

except InvalidOperation:
    print("no updates necessary")
except Exception as e:
    print(str(e))

In lines 34 to 44 I didn't change anything because the directions didn't tell me to… The only other thing I think I may have done wrong is in the $set command…

If I understand the code right the ‘lastupdated’ : parser.parse(lastupdated) should change the field to a Date object…
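
The conversion step can be checked on its own, without touching the database. The script above calls parser.parse (presumably from dateutil); the sketch below swaps in a standard-library stand-in instead. parse_lastupdated is a hypothetical helper, and the format string is an assumption based on the sample value shown in the failing test output.

```python
from datetime import datetime


def parse_lastupdated(value):
    """Convert a 'YYYY-MM-DD HH:MM:SS.fffffffff' string to a datetime."""
    head, _, frac = value.partition(".")
    # strptime's %f accepts at most 6 fractional digits, so trim any extras.
    return datetime.strptime(f"{head}.{frac[:6] or '0'}", "%Y-%m-%d %H:%M:%S.%f")


raw = "2015-09-06 00:17:54.620000000"
converted = parse_lastupdated(raw)

print(type(raw).__name__, type(converted).__name__)
print(isinstance(converted, datetime))
```

The parsed value only reaches the database if it is the value passed to $set; parsing it into the list and then writing a different variable leaves the stored field a string.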

@David_Thompson,

  1. Can you add a print statement after line 21, right after mflix is declared, like this:
    mflix = MongoClient(host)[MFLIX_DB_NAME]
    print(mflix.list_collection_names())

    This confirms that the script is connected to the expected database, since you said that in Atlas all documents still show lastupdated as a string. When you run the script, check that the expected collection names are printed.
  2. Remove the quotes around True in the predicate. It would look like this:
    predicate = {"lastupdated": {"$exists": True, "$type": "string"}}
    projection = {"lastupdated": 1, "_id": 1}

  3. Change your bulk_updates to this:
     bulk_updates = [UpdateOne(
         {"_id": movie.get("doc_id")},
         {"$set": {"lastupdated": movie.get("lastupdated")}}
     ) for movie in movies_to_migrate]
    
     # here's where the bulk operation is sent to MongoDB
     bulk_results = mflix.movies.bulk_write(bulk_updates)
     print(f"{bulk_results.modified_count} documents updated")
    
    
    Notice the movie.get("lastupdated") in $set.
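
The difference between the two $set values can be reproduced with plain Python, no database needed. In this sketch the two documents are made up, and strptime stands in for the script's parser.parse. The bare name lastupdated is whatever the for loop assigned last, so it is the same raw string for every entry.

```python
from datetime import datetime

# Two made-up documents standing in for the cursor results.
docs = [
    {"_id": 1, "lastupdated": "2015-09-06 00:17:54"},
    {"_id": 2, "lastupdated": "2015-09-12 09:51:28"},
]

movies_to_migrate = []
for doc in docs:
    lastupdated = doc.get("lastupdated")  # raw string, reassigned on every pass
    movies_to_migrate.append(
        {
            "doc_id": doc["_id"],
            "lastupdated": datetime.strptime(lastupdated, "%Y-%m-%d %H:%M:%S"),
        }
    )

# Stale loop variable: every entry gets the last document's raw string.
stale = [lastupdated for movie in movies_to_migrate]

# Per-document lookup: every entry gets its own parsed datetime.
per_doc = [movie.get("lastupdated") for movie in movies_to_migrate]

print(stale)
print(per_doc)
```

This would be consistent with the earlier runs: writing one string to every document modifies all but the document that already holds it, and a second run modifies none.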

Hopefully this helps. If not, don't worry; we should be able to figure something out from the next round of errors.

Adding the print statement after declaring mflix results in an error.

(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
Traceback (most recent call last):
  File "migrations/movie_last_updated_migration.py", line 23, in <module>
    print(mflix.list_collecion_names())
  File "/usr/local/lib/python3.8/site-packages/pymongo/collection.py", line 3313, in __call__
    raise TypeError("'Collection' object is not callable. If you "
TypeError: 'Collection' object is not callable. If you meant to call the 'list_collecion_names' method on a 'Database' object it is failing because no such method exists.
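
The traceback shows the method name spelled list_collecion_names (missing a "t"). As the error message hints, an unknown attribute on a database handle is treated as a collection, and calling a collection fails. The toy classes below are hypothetical stand-ins that imitate that behavior. They are not pymongo code, and the collection names are made up.

```python
class ToyCollection:
    """Hypothetical stand-in for a collection handle (not pymongo code)."""

    def __init__(self, name):
        self.name = name

    def __call__(self, *args, **kwargs):
        raise TypeError(
            f"'Collection' object is not callable; no method named {self.name!r}"
        )


class ToyDatabase:
    """Hypothetical stand-in for a database handle (not pymongo code)."""

    def list_collection_names(self):
        return ["movies", "comments"]

    def __getattr__(self, name):
        # Runs only when normal lookup fails, e.g. for a misspelled method.
        return ToyCollection(name)


db = ToyDatabase()
print(db.list_collection_names())  # correctly spelled: runs normally

try:
    db.list_collecion_names()  # misspelled: resolved as a collection, then called
except TypeError as exc:
    print(exc)
```

Correcting the spelling to list_collection_names is enough to make the check run.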

So I took a stab that the problem was in the quotes around True in line 30, and I changed line 55 to {'$set': {'lastupdated': movie.get('lastupdated')}}

Saved it and ran the script again.

(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % python3 migrations/movie_last_updated_migration.py
23530 documents to migrate
23530 documents updated

Because the documents now matched (23530 to migrate and 23530 updated) I ran pytest -m migration.

It passes now.


(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % pytest -m migration                               
============================================== test session starts ==============================================
platform darwin -- Python 3.8.12, pytest-3.3.0, py-1.8.0, pluggy-0.6.0
rootdir: /Users/bossmandave/Desktop/mflix-python, inifile: pytest.ini
plugins: flask-0.11.0
collected 43 items                                                                                              

tests/test_migration.py .                                                                                 [100%]

============================================== 42 tests deselected ==============================================
==================================== 1 passed, 42 deselected in 0.25 seconds ====================================
(mflix) bossmandave@Davids-MacBook-Pro-2 mflix-python % 

Question… Where I went wrong in the movie.get('lastupdated') area is that I didn't know what to access… It is bulk_updates… and also movies_to_migrate

When do I know to access the field by just the collection name? movie.get?

Hey @David_Thompson,

Super happy to see the ticket's test cases passed!

The movie.get() method returns the value stored under the specified key, which is why we use it here: each movie is one of the dictionaries in movies_to_migrate. Bulk writes were also discussed in the lecture preceding this lab. I am linking some documentation to help you cement and extend your knowledge of these:

  1. MongoDB BulkWrite
  2. UpdateOne
  3. Pymongo Bulk Write Operations
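
On the movie.get question: in the list comprehension, movie is just the loop variable, and each value it takes is one dictionary from movies_to_migrate rather than the collection. A short sketch of how dict.get behaves, using a made-up entry:

```python
# A made-up entry shaped like the ones appended to movies_to_migrate.
movie = {"doc_id": 1, "lastupdated": "2015-09-06"}

print(movie.get("lastupdated"))     # value stored under the key
print(movie.get("missing"))         # None when the key is absent
print(movie.get("missing", "n/a"))  # explicit default instead of None
print(movie["lastupdated"])         # same value; raises KeyError if absent
```

Whatever name appears after "for" in the comprehension is the one to call .get on; the name of the collection plays no part in it.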

Feel free to reach out for anything else as well.

Regards,
Satyam
