Lookup & populate objects in array

I have 2 collections (users and tenants). Simplified, the user looks like this:

{
  "_id": ObjectId("..."),
  "associatedTenants": [{
    "tenantId": ObjectId("A"),
    "role": "foo"
  }, {
    "tenantId": ObjectId("B"),
    "role": "bar"
  }]
}

Simplified tenant model looks like this:

{
  "_id": ObjectId("A"),
  "name": "Lorem"
}

And I’d need to query a user with aggregated tenants, so ideally add a new field tenant next to the tenantId with the actual tenant document, like this:

{
  "_id": ObjectId("..."),
  "associatedTenants": [{
    "tenantId": ObjectId("A"),
    "role": "foo",
    "tenant": {
      "name": "Lorem"
    }
  }, {
    "tenantId": ObjectId("B"),
    "role": "bar",
    "tenant": {
      "name": "Ipsum"
    }
  }]
}

I know it can be done - it seems so simple, but I’ve been banging my head agains the wall because of it :smiley:

So far I’ve tried various combinations of $lookup, $project with $map, $mergeObject with $arrayElemAt, but nothing seems to cut it… The closest answer I cloud find was this: https://stackoverflow.com/questions/60342985, but I just don’t seem to be able to adapt it. Can anyone see a way, please? :slight_smile:

Btw. (a bit unrelated) - if I $lookup the tenants and save them into a new field:

$lookup: {
  from: "tenants",
  localField: "associatedTenants.tenantId",
  foreignField: "_id",
  as: "aggregatedAssociatedTenants"
}

The order of aggregatedAssociatedTenants is different than the order of associatedTenants. Doesn’t $lookup preserve the order of the original array?

have you check the documentation? About half the page starts the examples.
$lookup (aggregation) — MongoDB Manual

It would be better if you give 2-3 sample documents of each collection and your expected output from them. Replace all ObjectId with a number or simple string: {_id:123} or {_id:"abc"} (flexible types). Also add 1-2 more fields if you are not doing full merges. This will enable us to work on real (sample) data and find a solution faster.

1 Like

Yes, I have seen all the examples on the $lookup stage Manual page. That’s how got to try variations on $lookup + $replaceRoot + $mergeObjects + $arrayElemAt. Didn’t help… (I wouldn’t be asking if a haven’t spent a good amount of time reading the docs, looking for answer and trying it on my own first.)

Alright, both the user and tenant schemas are exactly as I have them in db and the desired result is also exactly as it should be, I just removed some unrelated fields and ObjectIds for better readability. I can repost it with actual ObjectIds:
User document:

{
  "_id": ObjectId("63732cb4919eee473ec5cec7"),
  "email": "user@domain.com",
  "associatedTenants": [{
    "tenantId": ObjectId("636ad8dd185b079cf4cc3014"),
    "role": "foo"
  }, {
    "tenantId": ObjectId("636a8ba3185b079cf4cc3012"),
    "role": "bar"
  }]
}

Tenant documents:

[{
  "_id": ObjectId("636ad8dd185b079cf4cc3014"),
  "name": "Lorem",
  "domains": [{ "name": "lorem.example.com" }]
}, {
  "_id": ObjectId("636a8ba3185b079cf4cc3012"),
  "name": "Ipsum",
  "domains": [{ "name": "ipsum.example.com" }]
}]

Desired output:

{
  "_id": ObjectId("63732cb4919eee473ec5cec7"),
  "email": "user@domain.com",
  "associatedTenants": [{
    "tenantId": ObjectId("636ad8dd185b079cf4cc3014"),
    "role": "foo",
    "tenant": {
      "_id": ObjectId("636ad8dd185b079cf4cc3014"),
      "name": "Lorem",
      "domains": [{ "name": "lorem.example.com" }]
    }
  }, {
    "tenantId": ObjectId("636a8ba3185b079cf4cc3012"),
    "role": "bar",
    "tenant": {
      "_id": ObjectId("636a8ba3185b079cf4cc3012"),
      "name": "Ipsum",
      "domains": [{ "name": "ipsum.example.com" }]
    }
  }]
}

I’ve added real ObjectIds. I’ve also added some totally unrelated fields as well (although I don’t see the point - as showcased in the original post, there’s no merge, only adding a new field tenant with the full tenant document next to the tenantId - but in each item of array, that’s the point I can’t seem to get around).

Thank you :slight_smile:

We have users from all backgrounds and at all levels and we are not psychics, so please bear with us and do not expect anyone to know everything about you at first glance. especially, when you are new to a community, this is something unavoidable.

and here is why I had those requests:

  • the difference is not visible right away. I use a “diff” tool to see what you change because you said “tried everything” and it is confusing our expectations.
  • for that purpose, I copy data to JSON files but formatters don’t like having functions in it. Plus it is easy to work with short-and-clear data. so no ObjectId. all we need is field names unless values are related too.
  • “exact” ness in your eyes is not a measure to us. I may find a working solution to what you showed but it may not work on your whole data. hence unrelated fields.

The solution might come in a very simple form, but I can’t promise I will be the one to deliver it. Also, if you tackle it yourself before we do, please share it with the rest of the community.

PS: also in JSON format, it should be {"$oid":"63732cb4919eee473ec5cec7"} to be easily inserted into Atlas to test, for example.

1 Like

Ok, Took me a bit but here it is. As I said, your choice of words, especially “tried everything” confuses us about your level of understanding. So let me explain it so to improve you on what you might have missed on $lookup.

  • “from” other documents, “match” its “foreignField” to “localField” of this document. (or use a $match in pipeline) instead
  • define variables by "let"ting them be the values of local fields before going into the pipeline.
  • think of the pipeline as a fully disconnected query. it is just a query that is run on “tenants” collection with the exception of variables we defined before entering the pipeline.
    • all “$$” variables are special. either defined by us elsewhere (above step, for example), or MongoDB defined them. “$$ROOT” is the document currently under inspection down the pipeline.
  • put the result back into local collection “as” having a new name, or “replace” existing one with the same name.

And the tricky part is this:

  • We need values separately from the array since they are objects with fields we need later (role). so we need an “$unwin” first.
  • “$lookup” will give back an array even when it has 1 item inside, so need to:
    • “$unwind” it again.
    • then “$group” the resulting documents.
[
  {
    "$unwind": {
      "path": "$associatedTenants"
    }
  },
  {
    "$lookup": {
      "from": "tenant",
      "foreignField": "_id",
      "localField": "associatedTenants.tenantId",
      "let": {
        "tId": "$associatedTenants.tenantId",
        "tRole": "$associatedTenants.role"
      },
      "pipeline": [
        {
          "$project": {
            "_id": 0,
            "tenantId": "$_id",
            "role": "$$tRole",
            "tenant": "$$ROOT"
          }
        }
      ],
      "as": "associatedTenants"
    }
  },
  {
    "$unwind": {
      "path": "$associatedTenants"
    }
  },
  {
    "$group": {
      "_id": "$_id",
      "associatedTenants": {
        "$push": "$associatedTenants"
      }
    }
  }
]

PS: a solution to a similar problem seems to exists here: mongodb - $lookup on ObjectId's in an array - Stack Overflow

2 Likes

Sorry for temporarily deleting my above post, if you were around at that time. It had a hiccup I failed to see.

I corrected the problem and edited it. However, keep in mind the solution may not work if you change your field types too far from your example documents. if you will, then follow my explanation to adapt it to new situations or try searching the internet again with your now increased knowledge.

You’re amazing. Thank you! :slight_smile: Especially for the explanation, that’s more valuable as I prefer to understand it than just “blindly” solve it.

And yeah, you’re right (object ids, additional fields, everything), I’m sorry if I sounded antagonistic or ungrateful… Thanks again for the help.

2 Likes

Thanks, as long as we keep understanding of each other, we are cool :stuck_out_tongue_winking_eye:

By the way, I went ahead and asked on another topic if we can do it without $unwind. Though I am a bit lost in the answer, there is another approach you can try (you will need to adapt field names):

See you around :slight_smile:

2 Likes

Similar approach but as per the expected result, you can also use $indexOfArray and $arrayElemAt instead of $filter and $first,

  • $indexOfArray will return an array index of matching tenantId
  • $arrayElemAt to get tenant name of specific index that found by above operator
db.users.aggregate([
  {
    "$lookup": {
      "from": "tenants",
      "localField": "associatedTenants.tenantId",
      "foreignField": "_id",
      "as": "tenants"
    }
  },
  {
    "$addFields": {
      "associatedTenants": {
        "$map": {
          "input": "$associatedTenants",
          "in": {
            "$mergeObjects": [
              "$$this",
              {
                "tenant": {
                  "name": {
                    "$arrayElemAt": [
                      "$tenants.name",
                      { "$indexOfArray": ["$tenants._id", "$$this.tenantId"] }
                    ]
                  }
                }
              }
            ]
          }
        }
      },
      "tenants": "$$REMOVE"
    }
  }
])

Playground

2 Likes

Thanks guys!
Both solutions work quite nicely, but think I’ll go with turivishal’s approach as it’s surprisingly simple (and requires slightly less tuning - the need to explicitly select tenant’s fields is also nice as I was gonna $project it anyway).

2 Likes

Hey,

Thanks for the example. I’m trying something similar, but I need the entire document, not the subset. So in this example, I would need the entire tenant document. I’m just getting to grips with the aggregation pipeline and could do with some help.

Appreciated,

James

I think this is pretty easy, actually. I just use arrayElemAt without the name prefix.

H @James_N_A2

Here, we use “$project” in the pipeline of the “$lookup” to select fields. If you want the whole document, you can use just a “$match” stage instead, and use the “tenantId” we define by “let” before entering the pipeline.

It is also possible without “let” and “pipeline”. check the usual examples for this way.

In case you fail, then please open a new topic, give a minimal dataset to try and the expected result, along with what you tried so far, and also put a link to this post so others can see where else you looked.

1 Like

Hey,

Thanks for the reply! I’ve created a new post here:

Hopefully, that shines some light on what I’m trying to do.

J