How to join multiple fields and filter data by date

So I have the following query, where I join 2 collections by a single field mainId:

const interval = moment.utc().add(-5, "days"); // number of days may vary
return this.aggregate([
  {
    $lookup: {
      from: 'collectionB',
      localField: 'mainId',
      foreignField: 'mainId',
      as: 'validd',
    },
  },
  {
    $match: {
      $or: [
        { "validd": [] },
        { "validd.createdAt": { $lt: new Date(interval) } }
      ]
    }
  },
]);

it works perfectly! But now I need to join those collection by another field named vid .
I could make this join with two fields work like this:

return this.aggregate([
  {
    $lookup: {
      from: "CollectionB",
      let: {
        mainIdField: "$mainId",
        vidField: "$vid"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                {
                  $eq: ["$mainId", "$$mainIdField"]
                },
                {
                  $eq: [ "$vid", "$$vidField"]
                }
              ]
            }
          }
        }
      ],
      as: "joined_results"
    }
  },
  {
    $unwind: { path: "$joined_results" }
  }
])

But I can’t make both cases work toghether. I need to use both filters.

1 - Make the join with two fields;
2 - Apply the first example, where I only return values if:
2.1 - Itens from collectionA doesnt exists in collectionB
2.2 - row from collectionA exists in collectionB, but the field createdAt is greather than X.

Is it possible?

Post sample documents of your collections.

The documents shared in your other thread do not have a field named vid. You have typos errors between

and

May be they are different collection but it is not obvious.

What about vid vs validd? As a human I can see some resemblance but the $lookup will surely failed using vid if it is validd like your other pipeline.

Is this a pipeline that is applied after the original pipeline? Or is the original pipeline, the solution that was provided in your other thread, the solution of an incomplete requirement?

YES

Same solution as first example once you got your joined_results.

You have to state your requirement completely right at the beginning so that we do not work and provide partial solutions that requires extra work with additional questions.

1 Like

Hello again, Steeve :slight_smile:
Sorry for the confusion and thanks again for taking some type to help.

1 - The miss typo was a mistake I made while typing here instead of copying and pasting.
2 - vidd is a string, different from valid.

CollectionA

name: "first"
mainId: a2345e87-a388-4b72-ae2a-1cd69b7e1330
vid: "abc"

name: "second"
mainId: b2345e87-a388-4b72-ae2a-1cd69b7e1330
vid: "def"

name: "third"
mainId: c2345e87-a388-4b72-ae2a-1cd69b7e1330
vid: "ghi"

CollectionB

name: "first"
mainId: a2345e87-a388-4b72-ae2a-1cd69b7e1330
vid: "abc"
createdAt:2022-05-11T20:44:14.885+00:00

name: "third"
mainId: c2345e87-a388-4b72-ae2a-1cd69b7e1330
vid: "ghi"
createdAt:2022-05-08T20:44:14.885+00:00

So, if I run this query correctly and considering that I want records from 3 days ago what I want as a output is:

1- name = second should be returned, because it does not exists on CollectionB.
2 - name = first should be ignored, because it was created only one day ago.
3 - name = third should be returned, because it was created 4 days ago.

My last failed attempt of querying this solution:

return this.aggregate([
  {
    $lookup: {
      from: 'collectionB',
      localField: 'mainId',
      foreignField: 'mainId',
      let: { vidField: '$vid', mainIdField: '$mainId'},
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                { $eq: ["$mainId", "$$mainIdField"] },
                { $eq: ["$vid", "$$vidField"] }
              ]
            }
          }
        }
      ],
      as: 'validd',
    },
  },
  {
    $match: {
      $or: [
        { "validd": [] },
        { "validd.createdAt": { $lt: new Date(interval) } }
      ]
    }
  },
]);

The variable interval is the “age” of the records that I want in days.

That pipeline works for me.

But I do not use moment. Somehow you must be using the wrong date.

With your data I get:

// We get second with an empty array because it is not in collectionB.
{ _id: 'second',
  mainId: 'b2345e87-a388-4b72-ae2a-1cd69b7e1330',
  vid: 'def',
  validd: [] }
// We also get third, with an element dated before the wanted date.
{ _id: 'third',
  mainId: 'c2345e87-a388-4b72-ae2a-1cd69b7e1330',
  vid: 'ghi',
  validd: 
   [ { _id: 'third',
       mainId: 'c2345e87-a388-4b72-ae2a-1cd69b7e1330',
       vid: 'ghi',
       createdAt: 2022-05-08T04:00:00.000Z } ] }

But note again. You write

but then you do $lookup

If you mistype the collection name, then the lookup will return an empty array for all and all documents will match the empty array test of your $or.