MongoDB Aggregation Take more than 4 sec And also assume plans can be more (i.e. 1000 Plans) in database and briefs can be 7x compare to the plans

I have 2 collection:-
First one is plan collection
{"_id":{"$oid":"615d54ecd0d7763ba6c05819"},"h2":["6040e39b7429ab001221df6f_6040e39b7429ab001221df70_6040e39b7429ab001221df71_6040e39b7429ab001221df72_6040e39b7429ab001221df73"],"isArchive":false,"isCreateTeam":false,"dueDate":{"$date":"2021-10-28T00:00:00.000Z"},"endDate":{"$date":"2021-11-30T00:00:00.000Z"},"startDate":{"$date":"2021-10-30T00:00:00.000Z"},"name":"EX_PLAN","linkedBrief":"5feb2ecc63f7cb001292c80b","totalBudget":12345,"totalPlanned":0,"buyerAccepted":0,"status":"inDraft","h1":"6040e35522b8b80012668c80_6040e35522b8b80012668c81_6040e35522b8b80012668c82_6040e35522b8b80012668c83","brandLevel":"6040e39b7429ab001221df6f_6040e39b7429ab001221df70_6040e39b7429ab001221df71_6040e39b7429ab001221df72_6040e39b7429ab001221df73","teamSize":"large","marketCode":"GB","currencyCode":"GBP","uid":"dev_000008","clientCode":"dev","tenantId":"dev","createdBy":"00uanynrxPgBNVKsr416","originalId":"615d5474d0d7763ba6c057dc","createdAt":{"$date":"2021-10-06T07:47:00.849Z"},"updatedAt":{"$date":"2021-10-06T07:49:00.274Z"},"updatedBy":"00uanynrxPgBNVKsr416","__v":0}

and second one is briefs collection which are linked to the plans:-

{"_id":{"$oid":"615d54ecd0d7763ba6c05817"},"activeUserType":["planner"],"buyers":[],"image":"","isRecalled":false,"owner":"planner","createdBy":"00uanynrxPgBNVKsr416","buyerAccepted":0,"h3":"6040e4567429ab001221df82_6040e4567429ab001221e09e_6040e4567429ab001221e09f","name":"Digital","plan":{"$oid":"615d5474d0d7763ba6c057dc"},"status":"inDraft","totalPlanned":12345,"originalId":"615d5474d0d7763ba6c057de","clientCode":"dev","tenantId":"dev","createdAt":{"$date":"2021-10-06T07:47:00.861Z"},"updatedAt":{"$date":"2021-10-06T07:49:00.258Z"},"updatedBy":"00uanynrxPgBNVKsr416","currentPlanned":12,"__v":0}

where First collection have original Id present in the form of plan key in second collection.
and here is the aggregation:-
ChannelGroupData is equivalent to Briefs

[
  {
    '$match': {
      'deletedAt': {
        '$exists': false
      }, 
      'linkedBrief': {
        '$exists': true
      }
    }
  }, {
    '$addFields': {
      'planOriginalId': {
        '$toObjectId': '$originalId'
      }, 
      'id': '$originalId'
    }
  }, {
    '$lookup': {
      'from': 'Briefs', 
      'let': {
        'planObjectId': '$planOriginalId'
      }, 
      'pipeline': [
        {
          '$match': {
            '$expr': {
              '$eq': [
                '$plan', '$$planObjectId'
              ]
            }, 
            'deletedAt': {
              '$exists': false
            }, 
            'cloneId': {
              '$exists': false
            }
          }
        }
      ], 
      'as': 'channelGroupData'
    }
  }, {
    '$unwind': '$channelGroupData'
  }, {
    '$group': {
      '_id': {
        'id': '$id', 
        'linkedBrief': '$linkedBrief', 
        'currencyCode': '$currencyCode'
      }, 
      'channelGroupData': {
        '$push': '$channelGroupData'
      }, 
      'channelGroupsForFilter': {
        '$push': '$channelGroupData'
      }
    }
  }, {
    '$project': {
      'channelGroupData': {
        '_id': 1, 
        'name': 1, 
        'totalPlanned': 1, 
        'status': 1
      }, 
      'channelGroupsForFilter': {
        'status': 1
      }
    }
  }, {
    '$addFields': {
      'status': {
        '$cond': {
          'if': {
            '$setEquals': [
              {
                '$setIntersection': [
                  '$channelGroupsForFilter', [
                    {
                      'status': 'complete'
                    }
                  ]
                ]
              }, '$channelGroupsForFilter'
            ]
          }, 
          'then': 'complete', 
          'else': 'inDraft'
        }
      }
    }
  }, {
    '$project': {
      '_id': 0, 
      'id': '$_id.id', 
      'linkedBrief': '$_id.linkedBrief', 
      'currencyCode': '$_id.currencyCode', 
      'channelGroupData': {
        '_id': 1, 
        'name': 1, 
        'totalPlanned': 1
      }, 
      'status': 1
    }
  }
]

Aggregation takes more than 4 sec. Please some one help to optimize this aggregation.

Hi @Siddhant_Jain1 ,

The aggregation seem to be very complex with no selective first match stage ($exists true and false is not selective or using indexes effecintely)

Additionally, you adding fields and doing a $lookup with $expr. This in general is not considered well optimized stages and you should consider 5.0 version where it has improved.

Looking at the aggregation without knowing the actual business logic of the app and perhaps redesign the documents to have some better match criteria and avoid joining is optimising is very complex.

I would say the only good advice I have for the current process is turning this query into a materialsed view using a $merge statement or $out to persist the data and application to query that view periodically updating the data.

Thanks
Pavel

@Pavel_Duchovny if i create an index for the key which i am using in both collection for join or can u just tell me the alternate aggregation using lookup. we cannot change lookup stage but we can modify it

You are joining using calculated fields from the $addFields so an index won’t really help here.

Perhaps this syntax got performance improvements in server 5.0 so consider upgrading your instance to 5.0 version if its not there yet.

Otherwise, consider persisting the data using materialized views.

1 Like

@Pavel_Duchovny can u show me any demo for this actually i m not able to apply this concept (i.e. material design view) on this?

@Pavel_Duchovny can u provide me left outer join aggregation in both collection using merge and aggregation stage as both collection have common key which planObjectid in addfield and plan in brief collection?

Why not to add a merge stage oby the end of the current pipeline and do a merge condition on the relevant fields.

I don’t know the objects and the topology of data through the stags so I hava hard time providing a specific query.

Thanks

@Pavel_Duchovny By adding merge stage it create new Collection and interact with new collection i need to modify the code. Is there an alternative of merge stage? Like i can modify lookup stage any how

Can u try to modify the currenct pipeline so that we dont have to make new collection and work with old collections as well

You can merge back to a collection not necessarily to a new one.