Filter for Null after Lookup

Hi there,
I am new to Mongo and I have a question regarding aggregation. How can I return a document that contains null value in the date field after running a lookup?

Here is my code;

pipeline = [
            {"$match": {'_id': user_id}},
            {"$lookup": {"from": "merchant", "localField": "merchantsid",
                         "foreignField": "_id", "as": "merchants"}},
            {"$project": {"email" : 1, "name" : 1,"merchants" : 1, "_id" : 0,
                           "merchants": {"$filter": { "input": "$merchants", "as": "merchant",
                                                    "cond": { "$eq": [ "merchant.date", "null"]}}}}},
            {"$sort": {"merchants.teid": -1}},
            ]
return await database.User.aggregate(pipeline).to_list(int(limit))

The above gives me an empty array but the desired result should be something like this;

[
    {
        "email": "johndoe@email.com",
        "name": "John Doe",
        "role": "user",
        "merchants": [
            {
                "_id": "913c5df4-581c-4fa5-9ba5-53c2f255d220",
                "teid": 906220009,
                "merchant_name": "SHOWCASE LTD",
                "model": "Gateway",
                "package": "1N5",
                "date": null,
                "created_at": "2023-01-22T21:44:07.789000",
                "updated_at": "2023-01-22T21:44:07.789000"
            }
        ]
    }
]

Any help is appreciated.

Thanks

I have found a solution but its not the best one. It requires that the null date field be filled with a default date value and filter against this.
At the moment if i try to filter by “” or “null” I get no documents.

Here the code but I am still seeking solutions that will filter for dates that have null value.

pipeline = [
            {"$match": {'_id': user_id}},
            {"$lookup": {"from": "merchant", "localField": "merchantsid",
                         "foreignField": "_id", "as": "merchants"}},
            { "$project": {"email": 1,"name": 1,
                    "merchantsid": {"$filter": {"input": "$merchants",
                    "as": "merchant",
                    "cond": { "$eq": [ "$merchant.date", "1970-01-01T00:00:00"]}}}}},
            {"$sort": {"merchants.teid": -1}},
            ]

The result is as follows;

[
    {
        "_id": "8aa6c533-d98b-4557-be97-a95990becf1c",
        "email": "JohnDoe@email.com",
        "name": "John Doe",
        "merchantsid": [
            {
                "_id": "4adaadf8-25d9-48dd-b7d3-2494b0d4006a",
                "teid": 3662200014,
                "merchant_name": "SHOWCASE LTD",
                "model": "Gateway",
                "package": "1N5",
                "date": "1970-01-01T00:00:00",
                "created_at": "2023-01-23T15:27:00.759000",
                "updated_at": "2023-01-23T15:27:00.759000"
            }
        ]
    }
]

For context, I have 2 collections, a user and merchant one. There merchant has a reference in the user collection. ie. A user can have many merchants. The lookup gets a user’s and any child merchants as an array. What I want to do now is get users’ who have merchants with null dates.

Null written like the following

is not null but a string with the text null.

Hi Steeve,

Thanks for your reply.
I have tried your suggestion but the result is an empty array.

Here is the result;

[
    {
        "_id": "8aa6c533-d98b-4557-be97-a95990becf1c",
        "email": "JohnDoe@email.com",
        "name": "Patrick Lou",
        "merchantsid": []
    }
]

I have also tried using type like so;

{"$eq": [ "merchant.date", {"$type":10} ]}

But I get the same result as above.

Any ideas why?

Thanks.

If your

it is not null anymore.

Share sample documents from both collections.

Your $sort stage is useless since the field merchants is not present anymore.

Hi Steve,

OK. Got it. I will remove $sort.

Here is the Merchant Collection;

[
    {
        "_id": "1c7844ac-ebbb-4520-afb2-6374a4f1a3ac",
        "teid": 366220001,
        "merchant_name": "Hartnells Fresh Food",
        "model": "Pro",
        "package": "1N5",
        "date": "2022-11-10T00:00:00",
        "created_at": "2023-01-22T15:56:10.746000",
        "updated_at": "2023-01-22T15:56:10.746000"
    },
    {
        "_id": "3b4a31f5-9fbd-448f-863a-ae1987678d03",
        "teid": 366220002,
        "merchant_name": "ABC LTD",
        "model": "Pro",
        "package": "456",
        "date": "2022-10-10T00:00:00",
        "created_at": "2023-01-22T15:56:42.010000",
        "updated_at": "2023-01-22T15:56:42.010000"
    },
    {
        "_id": "46c9af68-7361-4cc2-9ee4-6bf4a80da2b8",
        "teid": 366220003,
        "merchant_name": "SURVEY LTD",
        "model": "Pro",
        "package": "456",
        "date": "2022-09-10T00:00:00",
        "created_at": "2023-01-22T15:56:58.822000",
        "updated_at": "2023-01-22T15:56:58.822000"
    },
    {
        "_id": "913c5df4-581c-4fa5-9ba5-53c2f255d220",
        "teid": 366220009,
        "merchant_name": "SHOWCASE LTD",
        "model": "Gateway",
        "package": "123",
        "date": null,
        "created_at": "2023-01-22T21:44:07.789000",
        "updated_at": "2023-01-22T21:44:07.789000"
    },
    {
        "_id": "7c7584c0-19c9-4c6c-80a0-ee27cddc2b7e",
        "teid": 3662200011,
        "merchant_name": "JACINTA LTD",
        "model": "Gateway",
        "package": "123",
        "date": "1970-01-01T00:00:00",
        "created_at": "2023-01-23T13:18:01.415000",
        "updated_at": "2023-01-23T13:18:01.415000"
    },
    {
        "_id": "1d3bc919-4894-412e-9a67-e9450aefe598",
        "teid": 3662200012,
        "merchant_name": "Vimto LTD",
        "model": "Gateway",
        "package": "1N5",
        "date": null,
        "created_at": "2023-01-23T13:18:55.027000",
        "updated_at": "2023-01-23T13:18:55.027000"
    },
    {
        "_id": "4adaadf8-25d9-48dd-b7d3-2494b0d4006a",
        "teid": 3662200014,
        "merchant_name": "Drake Services LTD",
        "model": "Gateway",
        "package": "1N5",
        "date": "1970-01-01T00:00:00",
        "created_at": "2023-01-23T15:27:00.759000",
        "updated_at": "2023-01-23T15:27:00.759000"
    }
]

Here is the user collection;

[
{
        "_id": "8aa6c533-d98b-4557-be97-a95990becf1c",
        "email": "JohnDoe@email.com",
        "name": "John Doe",
        "disabled": false,
        "sellerid": 101,
        "role": "user",
        "created_at": "2023-01-22T15:41:26.645000",
        "updated_at": "2023-01-22T15:41:40.200000",
        "merchantsid": [
            "1c7844ac-ebbb-4520-afb2-6374a4f1a3ac",
            "46c9af68-7361-4cc2-9ee4-6bf4a80da2b8",
            "913c5df4-581c-4fa5-9ba5-53c2f255d220"
        ]
    }
]

Thanks.

Your logic was correct since the beginning but you are making a lot of syntax error.

As already mentioned "null" with quotes is not null as in your data it is the string with the 4 characters n, u, l and l.

Also, which gives me doubts about what you share when you write that

gives

It cannot since the syntax

is wrong. You are missing a $. It should be $$merchant.date, look at https://www.mongodb.com/docs/manual/reference/operator/aggregation/filter/. What is funny is that when you were using the string “null” you had no $ but you added one when testing with the special date. The correct cond: that will work should be:

cond: { $eq : [ "$$merchant.date" , null ] }

Now that I understand your intent you would be better off adding a pipeline: to your $lookup with a

{$match:{date:null}}

which should be more efficient than doing $filter once the $lookup is done.

Your $sort on teid in this $lookup pipeline will then make sense to sort the merchant within the merchantsid result.

Thanks for your solution.

I tried;
cond: { $eq : [ "$$merchant.date" , null ] }

I also tried:
{$match:{date:null}}

Both did not work. So I used them on Studio 3T for Mongo and it worked correctly. However, since I am using python the null value is encapsulated in quotes it does not work for the date field.

Here is the modified code again.

async def some_function(limit=100,):
    user_id = "8aa6c533-d98b-4557-be97-a95990becf1c"
    
    stage_match_user_id = {"$match": {"_id": user_id}}
      
    stage_lookup_related = {"$lookup": {"from": "merchant", "localField": "merchantsid", 
                            "foreignField": "_id", "as": "related_merchants"}}
      
    stage_project_required = {"$project": {"email": 1,"name": 1, 
                            "related": {"$filter": {"input": "$related_merchants",
                            "as": "merchant",
                            "cond": { "$eq" : [ "$$merchant.date" , "null" ]}}}}}
    
    pipeline = [stage_match_user_id,stage_lookup_related,stage_project_required]
  
    return await database.User.aggregate(pipeline).to_list(int(limit))

I used the same code above to fetch model or merchant_name and it worked as expected in Python. The only issue is that it does not work with dates in code.
I have stored the dates as datetime and I have used a model as shown below;

class MerchantBaseSchema(BaseModel):
    id: str = Field(default_factory=uuid.uuid4, alias="_id")
    teid: int
    merchant_name: str
    model: str
    package: str 
    date: dt | None = None
    created_at: dt | None = None
    updated_at: dt | None = None
        
    class Config:
        orm_mode = True
        allow_population_by_field_name = True
        arbitrary_types_allowed = True
        schema_extra = {
                        "example": {
                                "teid": 366220001,
                                "merchant_name": "Hartnells Fresh Food",
                                "model": "Pro",
                                "package": "1N5",
                                "date": "2022-11-10T00:00:00",
                                "created_at": "2022-11-10T00:00:00",
                                "updated_at": "2022-11-10T00:00:00",
                                }
                        }

The issue seems to lie in python I think. Your code works when used elsewhere but not in python.

Do you have any idea why this is the case?

Patrick

I have found the answer.

In python null is represented as None.
There if i change null to None it works like so.

"cond": { "$eq" : [ "$$merchant.date" , None]}}}}}

Oh my. Why didn’t I think of this.

Thanks for your help and validating the code.

It would have been good to know that you are using python right from the start.

In python you have to use None rather than null. What is funny is that you seem to know that because in your schema you have

So in python, your cond: has to be

"cond": { "$eq" : [ "$merchant.date" , None ]}

Like I wrote

But make it date:None since you are using python.

Thanks Steve,

Sorry I didn’t say what language I was using. Now I know. I will add the match pipeline to make it more efficient. Thank you for your prompt replies.

1 Like

For completeness here is the final code taking into consideration @steevej advice on efficiency;

It works well now. Thanks again Steevej.

stage_match_user_id = {"$match": {"_id": user_id}}
  
stage_match_null_related = {"$lookup": {"from": "merchant",
                        "pipeline": [{"$match":{"date": None}, }], 
                        "localField": "merchantsid",
                        "foreignField": "_id", "as": "related_merchants"}}
stage_project_required = {"$project": {"email": 1,"name": 1, "related_merchants": 1}}
  
pipeline = [stage_match_user_id, stage_match_null_related, stage_project_required]
  
return await database.User.aggregate(pipeline).to_list(int(limit)) 
1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.