How to use variables with mongodb aggregate $lookup's from field

how can i join collections based on a field value?
this is my collection named “content” which actually just has references to “audios” and “videos” collections

[
  {
    createdAt: "2022-01-24T18:46:28.781+00:00",
    refId: ObjectId("6262d0d9302ff436aa2028f3"),
    collection: "audios",
    ...
  },
  {
    createdAt: "2022-01-25T18:46:28.781+00:00",
    refId: ObjectId("625007153e8aaaa2b275f475"),
    collection: "videos",
    ...
  },
]

what i want is just joining collections using the “collection” field value in $lookup’s from field like this

db.content.aggregate([
    {
      $lookup: {
        from: '$collection',
        let: { col: '$refId' },
        pipeline: [
          { $match: { $expr: { $eq: ['$_id', '$$col'] } } },
          { $project: { _id: 0, title: 1, description: 1 } },
        ],
        as: 'col',
      },
    },
   { $unwind: { path: '$col', preserveNullAndEmptyArrays: true } },
])

what i was expecting as a results is:

[
  {
    createdAt: "2022-01-24T18:46:28.781+00:00",
    col: {
        title: "NEW AUDIO",
        description: "description...."
    },
    collection: "audios",
    ...
  },
  {
    createdAt: "2022-01-25T18:46:28.781+00:00",
    col: {
        title: "NEW VIDEO",
        description: "description...."
    },
    collection: 'videos',
    ...
  },
]

but it does not work in this way and not joining collections based on the “collection” field value

You cannot do the following

The $ sign is ignored. As per documentation, the from: field needs to be a collection name.

At first glance, the only way I can see a way to do something similar is with:

Untested

match = { $match: { $expr: { $eq: ['$_id', '$$col'] } } }

project = { $project: { _id: 0, title: 1, description: 1 } }

lookup_pipeline = [ match , project ]

videos_lookup = { "$lookup" : {
  "from" : "videos" ,
  "let" : { "col" : "$refId" } ,
  "pipeline" : lookup_pipeline ,
  "as" : "col" 
} }

audios_lookup = { "$lookup" : {
  "from" : "audios" ,
  "let" : { "col" : "$refId" } ,
  "pipeline" : lookup_pipeline ,
  "as" : "col" 
} }

unwind_col = { $unwind: { path: '$col', preserveNullAndEmptyArrays: true } }

videos_facet = [ { "$match" : { "collection" : "videos" } } ,
  videos_lookup ,
  unwind_col
]

audios_facet = [ { "$match" : { "collection" : "audios" } } ,
  audios_lookup ,
  unwind_col
]

db.content.aggregate( [
  { "$facet" : {
    "audios" : audios_facet ,
    "videos" : videos_facet
  } }
] )

The above does not give you the format you want but it gives you all the data you want. You may use other stages after the $facet to exactly produce the format you want. But it might be easier and more efficient to do that in the application layer.

But, I do not see the title or description to change very often, probably much less often than the above query, so why don’t you keep title and description in the content collection.

And why have 3 collections when 1 is sufficient and more efficient. Your 3 collections schema looks like an SQL derivative where you do not have access to a flexible schema.

Your content collection could look like:

[
  {
    createdAt : "2022-01-24T18:46:28.781+00:00" ,
    // refId is not needed because it is our _id
    _id : ObjectId("6262d0d9302ff436aa2028f3") ,
    "title" : "title field from the audios collection" ,
    "description" : "description field from the audios collection" ,
    // other fields that are common to both audios and videos
    "audio" : {
      // attributes specific to documents from the audios collection
    }
  },
  {
    createdAt: "2022-01-25T18:46:28.781+00:00",
    _id: ObjectId("625007153e8aaaa2b275f475"),
    "title" : "title field from the videos collection" ,
    "description" : "description field from the videos collection" ,
    // other fields that are common to both audios and videos
    "video" : {
      // attributes specific to documents from the video collection
    }
  },
]

And the next time you publish sample documents, please do not put 3 dots to indicate more fields. With the dots we cannot simply cut-n-paste the document into our system to experiement, we need to edit the dots out. Putting the dots inside a JS comment is however acceptable since cut-n-paste should work.

2 Likes

thanks for the reply steevej,
so i guess this is the only solution at the moment, i tested it and it works as an alternative to what i want.

this was just an example, but in my case I need to join a collection to map a list and then produce a calculation based on it. most of the time this list changes value, so I guess I don’t have to change this field from two collections.

about the 3 dots, yes i will, thanks for letting me know

1 Like