Need in help in optimising the query

we are using this aggregation:

const shotsPipeline = [
            {
              $match: {
                _id: { $nin: req?.alreadyViewedShots },
                restaurants: {
                  $elemMatch: { $eq: restId },
                },
              },
            },
            {
              $group: {
                _id: '$VideoCategory', // Group by 'VideoCategory'
                shots: { $push: '$_id' }, // Collect all documents in an array for each 'VideoCategory'
              },
            },
            {
              $project: {
                shots: {
                  $slice: ['$shots', 1], // Limit the array to 1 elements per 'VideoCategory'
                },
              },
            },
            {
              $project: {
                _id: 0, // Exclude the _id field at this stage
                shots: 1, // Include the 'shots' array
              },
            },
            {
              $unwind: '$shots', // Unwind the array to separate documents
            },
          ];

Our requirement : First remove all alreadyViewedShots shots and find shots where the field “restaurants” in Shots object contain at-least one restId. In next stage it groups all shots based on “VideoCategory” field. in next stage it takes just 1 item from each group of VideoCategory, and then it unwinds the shots. In the first stage we are using restId to get the shots where the field “restaurants” in Shots object contain at-least one restId, we are looping through this pipeline multiple times(with different restIds) until we find a nonempty array of shots obtained from this pipeline, with just 30 documents in our collection the above logic takes min 700-800 milliseconds. we have also used indexing in “VideoCategory” and “restaurants” fields. We want to reduce the latency as our production environment will be having min 1000+ documents in the collection.

can someone please give their inputs/advice on this.

Hi there,
Will try to provide a viable solution (providing schema can provide better insight)
So breaking things at each
At the match step, we can have a separate flag field to view for alreadyViewedShots and can check if the index provides the same or different performance.
At the group step, we can combine group and project with $first to have only one shot.
If you don’t have a pagination issue then unwind at the API level

Hope it helps!

1 Like