Best way to create a View from these two collections

Hello there, I am trying to evaluate whether I can create a Mongo view which joins two collections and inspects two arrays of Guids. Ideally, I would like to use a view, since I have a C# API layer which is leveraging IQueryable and expressions through the MongoDB C# driver. If need be, I could also probably use the aggregation pipeline and aggregate().

In a nutshell here is an example of the two collections:

FirstCollection: (probably can have 500,000-600,000 documents max or so)

{
  "_id": "2010-0001"
  "ArrayOfGuids":  [ Guid1, Guid2, Guid8]
  ..other fields not important
},
{
  "_id": "2010-0002"
  "ArrayOfGuids":  [ Guid7, Guid10, Guid5]
  ..other fields not important
},
{
  "_id": "2010-0003"
  "ArrayOfGuids":  [ Guid14, Guid11, Guid3]
  ..other fields not important
}

SecondCollection: (probably will not contain more then 1000 documents)

{
  "_id": "SomeId",
  "name": "SomeName"
  "modified":  DateTime
  "ArrayOfGuids":  [ Guid1, Guid2, Guid3]
  ..other fields not important
}

The “intersection” I need to do is between the 2 collections on the “ArrayOfGuids” properties. For every document in the SecondCollection, if the ArrayOfGuids has any match in FirstCollection, return a projection of:

SecondCollection._id,
SecondCollection.name,
SecondCollection.modified,
FirstCollection._id (as array)

From the data sample above, one projection returned would be

{
   "_id": "SomeId",
   "name": "SomeName"
   "modified":  DateTime
   "mappedOtherIds":  ["2010-0001","2010-0003"]
}

So, since FirstCollection _id “2010-0001” and _id “2010-0003” contain a matching Guid in its corresponding array, it is included in the result. The smaller collection is more or less the primary data to return, but I need to project the array of ids from the FirstCollection where an array item match exists.

The larger FirstCollection does have an index on the “ArrayOfGuids” field and its _id field. Unfortunately, the smaller, second collection does not have an index on the “ArrayOfGuids” column, but the addition of another index is potentially possible.

I have looked and $unwind and flattening both arrays but that didn’t seem to lead me to a proper intersection query. I am also looking to create a view which is as optimal as possible, given I don’t want to revert to doing this in memory at the app tier and using auxiliary caches and such. If this query would be too intensive and slow, I probably have to abandon the idea. Also, restructuring these collections, minus adding an index or so, is not possible at the moment.

If anyone can provide a sample or advice to nudge me in the right direction, that would be great. Thanks in advance.

@Martin_Koslof I recommend using the “aggregate” method with the “$lookup” stage to join the collections based on a matching field. You can then use other stages such as “$project” or “$group” to shape the output of the view as needed. Here is an example of how to create a view from two collections:

db.createView("myView", "collection1", [
 {
   $lookup: {
     from: "collection2",
     localField: "commonField",
     foreignField: "commonField",
     as: "joinedData"
   }
 },
 {
   $unwind: {
     path: "$joinedData",
     preserveNullAndEmptyArrays: true
   }
 }
])

In the above example, the “myView” view is being created based on “collection1”. The “$lookup” stage is used to join “collection1” with “collection2” based on a common field called “commonField”. The resulting documents are then flattened using “$unwind” to create one document per matching pair of documents from the two collections.
Note that in order to create a view in MongoDB, you must have the “createView” privilege on the database.

Hope this was helpful :slight_smile: