Search 2 collections with same text search

Hi,

I have a collection by name businesses. Document in business looks like below

[{
  "_id": {
    "$oid": "60a34c76d8b10300111325ea"
  },
  "services": [
    {
      "$oid": "60a34da258d8b50011859c4d"
    },
    {
      "$oid": "60a34dc7d8b10300111325f7"
    }
  ],
  "description": "Dev",
  "businessName": "Amul Milk Center",
  "businessId": "VD627523"
}]

We have 3 services of the business as below in the services collection. The service collection document looks like below

[{
  "_id": {
    "$oid": "60a34da258d8b50011859c4d"
  },
  "service": "Amul Milk Cow",
  "businessId": {
    "$oid": "60a34c76d8b10300111325ea"
  }
},{
  "_id": {
    "$oid": "60a34dc7d8b10300111325f7"
  },
  "service": "Amul Milk Buffalo",
  "businessId": {
    "$oid": "60a34c76d8b10300111325ea"
  }
},{
  "_id": {
    "$oid": "620ea45b9a94ea160049cfc5"
  },
  "service": "Amul Ice Cream",
  "businessId": {
    "$oid": "60a34c76d8b10300111325ea"
  }
}]

If I Search “Amul Milk” then I should get “businessName”: “Amul Milk Center” from the business collection and the 2 services of the business named as “service”: “Amul Milk Buffalo” & “service”: “Amul Milk Cow” from the services collection.

How can I achieve this?

Hi @Omkar_Lagu,

Thanks for providing the collection document and service collection document samples.

Could you also provide the following information:

  1. MongoDB Version
  2. If the deployment is self-hosted or on Atlas
  3. The expected output
  4. What you have attempted so far
  5. Further information / context regarding the use case

In saying so, the below may be able to achieve what you are after:

/// match text search stage on "Amul Milk". Please note that I have created a text index on the businessName field on the business collection.
stage1 =
{
  '$match':{
    $text:{
      $search:"Amul Milk"
      }
    }
}

/// lookup stage to join business collection on services collection using business collection _id field joined to service collection documents businessId field. In addition to that, the pipeline field contains an array which only matches services with regex "Amul Milk"
stage2 = 
{
  '$lookup': {
    from: 'services',
    localField: '_id',
    foreignField: 'businessId',
    pipeline:[
        {"$match":{
            "$expr": {
                     "$regexMatch":{
                        input:"$service", 
                        regex:"Amul Milk"
                    }
                }
            }}],
    as: 'services'
  }
}

Output when the above stages are used in the aggregation:

businessdb> db.businesses.aggregate([stage1,stage2])
db.businesses.aggregate([stage1,stage2])
[
  {
    _id: ObjectId("60a34c76d8b10300111325ea"),
    services: [
      {
        _id: ObjectId("60a34da258d8b50011859c4d"),
        service: 'Amul Milk Cow',
        businessId: ObjectId("60a34c76d8b10300111325ea")
      },
      {
        _id: ObjectId("60a34dc7d8b10300111325f7"),
        service: 'Amul Milk Buffalo',
        businessId: ObjectId("60a34c76d8b10300111325ea")
      }
    ],
    description: 'Dev',
    businessName: 'Amul Milk Center',
    businessId: 'VD627523'
  }
]

Projecting only the business name and services (for demo purposes) using the above pipeline with an additional $project stage:

/// additional $project stage
stage3 = {$project:{"businessName":1,"services.service":1}}

/// output
businessdb> db.businesses.aggregate([stage1,stage2,stage3])
[
  {
    _id: ObjectId("60a34c76d8b10300111325ea"),
    services: [ { service: 'Amul Milk Cow' }, { service: 'Amul Milk Buffalo' } ],
    businessName: 'Amul Milk Center'
  }
]

For this particular use case it may be better if the services are embedded in the business documents. This is because I am assuming that the number of services are limited, and less likely to be updated frequently (multiple times in a second by multiple processes). This would provide further optimisation and simplification of the aggregation by removing the $lookup stage. However, this is just an assumption based off the current information provided. Of course this may not be the case depending on your particular workload / scenario regarding these two collections.

You may also want to go over the $lookup array considerations notes as well.

Regards,
Jason

1 Like