Is it possible, in one query, to get documents based on the data in an array field of another document in the same collection?

Hi,

I have a collection that contains user information, as follows:

[
  {
    "userId": "1",
    "hasShare": true,
    "name": "aaa",
    "follow": [
      {
        "userId": "2"
      }
    ]
  },
  {
    "userId": "4",
    "hasShare": false,
    "name": "bbb",
    "follow": [
      {
        "userId": "5"
      }
    ]
  },
  {
    "userId": "3",
    "hasShare": false,
    "name": "ccc",
    "follow": [
      {
        "userId": "1"
      },
      {
        "userId": "4"
      }
    ]
  }
]

The follow array in each document holds the userIds of the users who follow this user.

For a certain user, e.g. userId = "3", is it possible in one query to get the names of that user's followers that have hasShare = true?
Note that the follow array holds only the followers' ids; their details are in other documents of the same collection.

In this example, userId 3 has two followers, 1 and 4. Only userId 1 has hasShare = true (first document), so the query result should be: {"name": "aaa"}

So the case here is similar to a subquery in a relational database: we have to find the users whose userId appears in an array, and that array is stored in one of the documents of the same collection.
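
To pin down the intended semantics, here is a minimal plain JavaScript sketch that computes the expected result in memory, with no MongoDB involved. The function name followersWithShare is hypothetical and only used for illustration; the data is the sample collection above.

```javascript
// Sample documents copied from the collection shown above.
const users = [
  { userId: "1", hasShare: true,  name: "aaa", follow: [{ userId: "2" }] },
  { userId: "4", hasShare: false, name: "bbb", follow: [{ userId: "5" }] },
  { userId: "3", hasShare: false, name: "ccc", follow: [{ userId: "1" }, { userId: "4" }] }
];

// Hypothetical helper: names of the followers of `userId` that have hasShare = true.
function followersWithShare(docs, userId) {
  // "Outer query": find the document of the user we are interested in.
  const target = docs.find((doc) => doc.userId === userId);
  if (!target) {
    return [];
  }
  // Collect the follower ids stored in that user's follow array.
  const followerIds = new Set(target.follow.map((f) => f.userId));
  // "Subquery": documents whose userId is in that set and that have hasShare = true.
  return docs
    .filter((doc) => followerIds.has(doc.userId) && doc.hasShare === true)
    .map((doc) => ({ name: doc.name }));
}

const result = followersWithShare(users, "3");
console.log(result); // expected: a single element with name "aaa"
```

The question is whether the same two-step logic can be expressed as a single query on the server.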

Thanks

I think a simple $lookup with a pipeline that has a $match stage should work.

Something like the following (untested):

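// Placeholder (assumption): replace "users" with the name of your collection.
collection_name = "users"

// Stage 1: select the user whose followers we want.
match_userId = { "$match" : {
    "userId" : "3"
} }

// Stage 2: self-join on follow.userId and keep only followers with hasShare = true.
// Combining localField/foreignField with pipeline needs MongoDB 5.0 or newer.
lookup_hasShare = { "$lookup" : {
    "from" : collection_name ,
    "localField" : "follow.userId" ,
    "foreignField" : "userId" ,
    "as" : "_result" ,
    "pipeline" : [
        { "$match" : { "hasShare" : true } } ,
        { "$project" : { "name" : 1 , "_id" : 0 } }
    ]
} }

db.getCollection( collection_name ).aggregate( [ match_userId , lookup_hasShare ] )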

This should give you the following document:

{
    "userId": "3",
    "hasShare": false,
    "name": "ccc",
    "follow": [
      {
        "userId": "1"
      },
      {
        "userId": "4"
      }
    ] ,
    "_result" : [
        { "name" : "aaa" }
    ]
}

I will leave the final cosmetic stages as an exercise to the reader.
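
For readers who want those last stages spelled out, here is one possible sketch, also untested against a real deployment. It assumes the collection is named "users" (a placeholder, not something stated in the thread) and MongoDB 5.0 or newer, which is needed to combine localField/foreignField with pipeline inside $lookup. The two extra stages turn each element of _result into its own top-level document.

```javascript
// Assumption: placeholder collection name; replace with your own.
const collectionName = "users";

const pipeline = [
  // Select the user whose followers we want.
  { $match: { userId: "3" } },
  // Self-join: fetch the followers' documents, keeping only those with hasShare = true.
  { $lookup: {
      from: collectionName,
      localField: "follow.userId",
      foreignField: "userId",
      pipeline: [
        { $match: { hasShare: true } },
        { $project: { name: 1, _id: 0 } }
      ],
      as: "_result"
  } },
  // Cosmetic stage 1: emit one document per matching follower.
  { $unwind: "$_result" },
  // Cosmetic stage 2: promote the follower sub-document to the top level.
  { $replaceRoot: { newRoot: "$_result" } }
];

// In mongosh, run it with:
// db.getCollection(collectionName).aggregate(pipeline)
```

With the sample data from the question, the intent is that the only output document is { "name" : "aaa" }. If the user has no follower with hasShare = true, $unwind drops the document and the result is empty.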

Thanks a lot @steevej

Do you think $lookup is mature enough to work with sharded data?
For large data, e.g. 1M users in the sample collection above, do you think using $lookup might have performance implications?

Thanks

You might have performance issues in 2 cases.

1 - when the load you have is too high for the resources you have

OR

2 - when the resources you have are too small for the load you have

You should be happy if the load you have is high. Then you should have the means to increase the resources.

First, make it work correctly then make it work fast.

Continuous improvement is better than delayed perfection. – Mark Twain

Thank you very much @steevej
