Replace entire nested array in document

The data in this post is hypothetical, but similar to a problem I’m dealing with.
I am working with Python using pymongo.
I have a document that looks something like this:

_id: ObjectId("123456789")
continent_name: "Europe"
continent_id: "001"
countries: Array
    country_name: "France"
    country_id: "011"
    cities: Array
        city_name: "Paris"
        city_id: "101"
        city_name: "Marseille"
        city_id: "102"
    country_name: "England"
    country_id: "012"
    cities: Array
        city_name: "London"
        city_id: "201"
        city_name: "Bath"
        city_id: "202"
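
Written out as the Python dict that pymongo would hand back, the same document would look roughly like the sketch below. The nesting (each entry of countries and cities being its own sub-document) is my reading of the listing above, and _id is left out because the value shown is only a placeholder.

```python
# Sketch of the example document as a plain Python dict.
# Assumption: each country and each city is its own sub-document in an array.
# The _id field is omitted because the value in the listing is a placeholder.
continent_doc = {
    "continent_name": "Europe",
    "continent_id": "001",
    "countries": [
        {
            "country_name": "France",
            "country_id": "011",
            "cities": [
                {"city_name": "Paris", "city_id": "101"},
                {"city_name": "Marseille", "city_id": "102"},
            ],
        },
        {
            "country_name": "England",
            "country_id": "012",
            "cities": [
                {"city_name": "London", "city_id": "201"},
                {"city_name": "Bath", "city_id": "202"},
            ],
        },
    ],
}
```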

I’ve run my data collection and received an array of cities in England, including cities already in the document, e.g. London, Bath, Manchester, Liverpool.
Given the nature of the data I am collecting, when I (infrequently) run my data collection, I want to either create a new country with its own cities array, or replace the entire existing cities array with a new one, but I have no idea how to go about it.

I received a suggestion that looks something like this:

  .updateOne({"continent_name": "Europe", "countries.country_name": "England", "countries.cities.city_name" :  {$ne: "Manchester"}}, 
             {$push: {"countries.$.cities": {"city_name": "Manchester", "city_id":"whatever"}}})

However this results in duplicate cities.
As I said, I need to replace the entire nested array if it already exists, or create a new continent/country object if it doesn’t.

I THINK what I need to do is something like the following:

    {"continent_name": "Europe", "countries_name": "England"},
    {$set: {"countries.cities.$[]": [ARRAY OF CITIES]}},
    upsert: True

However I can’t figure out the correct dot notation in the $set section, receiving errors with everything I’ve tried.

Anyone know a potential solution?

If the country already exists, replace the cities array.
If the continent exists but not the country, create the country with a new cities array.
If the continent doesn’t exist already, create a new document entirely.
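
To make those three rules concrete, here is a minimal sketch of the behaviour I’m after, written against a plain Python list of dicts rather than MongoDB. The function name apply_city_update and the shape of data (continent_id, continent_name, country_id, country_name, cities) are assumptions for illustration only.

```python
def apply_city_update(continents, data):
    """Apply the three rules to an in-memory list of continent dicts.

    Hypothetical helper: it only illustrates the desired outcome and does
    not talk to MongoDB.
    """
    # Rule 3: create the continent document if it does not exist yet.
    continent = next(
        (c for c in continents if c["continent_id"] == data["continent_id"]),
        None,
    )
    if continent is None:
        continent = {
            "continent_id": data["continent_id"],
            "continent_name": data["continent_name"],
            "countries": [],
        }
        continents.append(continent)

    # Rule 2: create the country inside the continent if it is missing.
    countries = continent.setdefault("countries", [])
    country = next(
        (c for c in countries if c["country_id"] == data["country_id"]),
        None,
    )
    if country is None:
        country = {
            "country_id": data["country_id"],
            "country_name": data["country_name"],
        }
        countries.append(country)

    # Rule 1: replace the whole cities array; never append to it.
    country["cities"] = [
        {"city_id": city["city_id"], "city_name": city["city_name"]}
        for city in data["cities"]
    ]
    return continents
```

Running it twice with the same data should leave one continent, one country, and one copy of each city.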

Having spoken to a friend, I think the solution (although I’m unable to try right now) will be this:

{$set: {"countries.cities": [ARRAY OF CITIES]}}

But I would check for the CONTINENT first, creating it if it doesn’t exist, then check for the COUNTRY, creating it if it doesn’t exist. That way, by the time we get to the $set step, I can guarantee both CONTINENT and COUNTRY exist.

Further testing has led me to this:

{"$set": {"countries.$[country].cities": [{"city_id": city["city_id"], "city_name": city["city_name"]} for city in data["cities"]]}},
array_filters=[{"country.country_id": data["country_id"]}]

However, this is creating duplicate country objects inside the continent, rather than replacing the city array inside the existing country object.


Upsert the continent document. This preserves existing sub-documents like country arrays if they already exist.

    {"$set": {
        "continent_id": data["continent_id"],
        "continent_name": data["continent_name"]
    }}

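Add the country object if it isn’t there yet. The $ne guard in the filter means an existing country, along with its cities array, is left untouched.

    # $push guarded by $ne: a plain $set on "countries" would replace the whole array.
    {"continent_id": data["continent_id"],
     "countries.country_id": {"$ne": data["country_id"]}},
    {"$push": {
        "countries": {
            "country_id": data["country_id"],
            "country_name": data["country_name"]
        }
    }}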

Upsert the cities objects. This time we’re using array filters to match the specific country object, then using $set to replace its cities array.

    {"countries": {"$elemMatch": {"country_id": data["country_id"]}}},
    {"$set": {
        "countries.$[country].cities": [
            {
                "city_id": city["city_id"],
                "city_name": city["city_name"]
            } for city in data["cities"]
        ]
    }},
    array_filters=[{"country.country_id": data["country_id"]}]

The result is exactly as desired: the entire cities array is replaced if it exists, and everything required is created if it doesn’t.
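
Putting the three steps together, a minimal sketch might look like the function below. The name replace_cities is made up, and collection is assumed to be a pymongo Collection (anything with a compatible update_one would do). The country step here uses a $ne-guarded $push, because a plain $set on countries would replace the whole array, other countries included.

```python
def replace_cities(collection, data):
    """Run the three updates in order (hypothetical helper).

    `collection` is assumed to be a pymongo Collection; `data` is assumed to
    carry continent_id, continent_name, country_id, country_name and cities.
    """
    continent_filter = {"continent_id": data["continent_id"]}

    # 1. Create the continent if it is missing; other fields are untouched.
    collection.update_one(
        continent_filter,
        {"$set": {
            "continent_id": data["continent_id"],
            "continent_name": data["continent_name"],
        }},
        upsert=True,
    )

    # 2. Add the country only when no element has this country_id yet.
    collection.update_one(
        {**continent_filter,
         "countries.country_id": {"$ne": data["country_id"]}},
        {"$push": {"countries": {
            "country_id": data["country_id"],
            "country_name": data["country_name"],
        }}},
    )

    # 3. Replace the cities array of the matching country in one go.
    cities = [
        {"city_id": city["city_id"], "city_name": city["city_name"]}
        for city in data["cities"]
    ]
    collection.update_one(
        {**continent_filter,
         "countries": {"$elemMatch": {"country_id": data["country_id"]}}},
        {"$set": {"countries.$[country].cities": cities}},
        array_filters=[{"country.country_id": data["country_id"]}],
    )
```

The three calls are not atomic as a group, which seems acceptable for an infrequent import but is worth knowing if several writers can touch the same continent at once.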