Conditional $lookup

Is there a way to do a conditional $lookup?

I have records in a collection that have an optional (null) field. I’d rather not waste processing on the records for which the lookup field is null. Not to mention that, depending on indexing, doing a $lookup on a null field could still be expensive.

With this $lookup syntax,

{
   $lookup:
     {
       from: <collection to join>,
       let: { <var_1>: <expression>, …, <var_n>: <expression> },
       pipeline: [ <pipeline to execute on the collection to join> ],
       as: <output array field>
     }
}

you can specify condition(s) on the join in the pipeline. In your case, that would be a null check condition (in addition to the join conditions).

Yes, the problem with the solution as you posted is this bug:
https://jira.mongodb.org/browse/SERVER-40362

So when you get to nulls on a lookup join, the query runs SLOWWWWW. Now, one thing I could do is an ifNull, and then change the null to be a value that I am certain won’t be in in the lookup collection… but this seems… jenky. So, was hoping for alternative solution.

You can try using a $match before the $lookup, to exclude the documents with the non-existing field:

{ $match: { someField: { $exists: true } } }

The alternative workaround to use the $ifNull to substitute a null for non-existing field is fine too. Substitute let: { someField : '$someField' } with let: { someField: { $ifNull: ["$someField", null ] } }

In applications it is not unusual to have unusual cases - data or logic. It just happens this case is a program shortcoming (and temporary only). It is not that bad to have a case by covering with an additional condition, if it helps running the program efficiently. Some documentation around this workaround helps for reference and apply the issue fix later on when available.

1 Like

Hello. This conditional lookup does not work for me
MongoDB version: 5
GUI: robo3t-snap


updatedFieldsByContractor.updatedValues.cities there is not in the original document. it just appears here because of that lookup. I mean when i run this command: db.getCollection('contractors').find({ _id: ObjectId('htat document') }) to get the same document it had not updatedFieldsByContractor.updatedValues.cities field. what should i do to lookup conventionally?

Hello @Kasir_Barati, you cannot refer the contractors collection document fields directly in the $lookup's pipeline $match stage. Please refer the above linked MongoDB documentation for the $lookup and find correct syntax and usage.

1 Like

@Kasir_Barati, was @Prasad_Saya’s answer help you solve your issue? If it has, please mark the post as the solution.

This is not my question. Therefore I cannot select it as the answer

1 Like

Here I am, a few years later. The original answer is not what I’m looking for. I don’t want to filter out results. I want the entire result set, but only do lookups on certain records.

For example, collection fruit and child_fruit. If I do db.child_fruit.aggregate(pipeline), I only want child_fruit records to perform $lookup against fruit collection if child_fruit HAS a value for a field “fruit_name”. For example:

##fruit collection
db.fruit.insert({name: ‘Orange’});

##child_fruit collection
db.child_fruit.insert({name: ‘Something’, fruit_name: ‘Orange’});
db.child_fruit.insert({name: ‘Another’});

db.child_fruit.aggregate([
{$lookup: {
from: ‘fruit’,

}}
]);

The problem is, given that I still want the “Another” record back in my final result set, I don’t believe there is a way to “ignore” the $lookup for child_fruit records without “fruit_name” field…

In this small example it’s irrelevant, but when I’m querying millions of records, many of which do not have a fruit_name field, it makes a huge difference. I want to just “bypass” the lookup operation in this case for these records. Maybe @Asya_Kamsky has an idea, she is a genius.

One idea that comes to mind is that you could use $facet.

One will start with

$match : { "fruit_name" : { "$exists" : 1 } }

and do the $lookup while the other will start with

$match : { "fruit_name" : { "$exists" : 0 } }

The issue I see is that each facet is limited to 16Gb which may or may not be an issue.

The other thing you could try is do 2 aggregations in a transaction. But I am not sure it is possible to do aggregations inside transactions.

Why 2 normal aggregations do not work?

Have you looked at the new $lookup variants with which you can have conditions and sub-pipeline?

Because the example is part of a much larger aggregation doing other pipeline operations. So running it as two separate aggregations is much less optimal. And yes, $facet hitting that limit would be an issue (it’s 16mb, not Gb… Gb would be great lol).

But I appreciate the thought!

It would be awesome if there was a way to only conditionally $lookup. Oh well!

1 Like

$facet is definitely not the way to go. However, if the field you are doing $lookup on is an empty array then it won’t match anything in the foreign collection (because array is used sort of as {$in:[]} which matches nothing.

So the solution to your problem may just be to fill in something like: {$set:{fruit_name:{$ifNull:["$fruit_name", [] ]}}} before the $lookup and the empty [] as localField won’t match anything, unlike null or missing field.

Asya

2 Likes

And since $unionWith was added in 4.4 you can also do two aggregations within the same aggregation pipeline, btw.

Asya

2 Likes

Awesome, thank you, I’ll give that a go. Isn’t this a more common occurrence? I would think that doing a $lookup against a related collection defined by an optional field happens quite a bit, so it seems to be there should be a more straightforward way to do this? Or am I just bonkers?

Thanks again.

1 Like

i am stuck with same case where i am trying to do a lookup with same collection as ben but i have the fruit_name fields already as an empty array [].
but it doesnt returns the another document

my aggregate query
db.child_fruits.aggregate([
{ $match: {
_id: ObjectId(‘624c577bc98e24f398b012ad’)
}
},
{ $lookup: {
from: ‘fruit’,
localField: ‘fruit_name’,
foreignField: ‘name’,
as: ‘fruitDetails’
}
},
{ $unwind: { path: ‘$fruitDetails’ } }
])