I am getting a performance issue with this aggregation pipeline. It takes almost 20 seconds to return a response of around 30,000 records.

Hi, the aggregation pipeline takes almost 20 to 25 seconds to execute and return a response. Despite creating indexes it still takes 20 to 22 seconds. I suspect the lookups are what take the time, but why, and how can I solve this issue?
Note: it fetches roughly 30,400 records. My MongoDB version is 5.0.20.

I have created these indexes:

createdAt: 1
businessUnitId: 1
commodityId: 1
commodityVariantId: 1
createdBy: 1
isDeleted: 1
isSLCMQcInspection: 1
commodityDetail.CIDNumber_text
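
For reference, roughly how those were created (a sketch; it assumes separate single-field indexes on the qcinspections collection, and that the last entry is the auto-generated name of a text index on commodityDetail.CIDNumber):

db.qcinspections.createIndex({ createdAt: 1 });
db.qcinspections.createIndex({ businessUnitId: 1 });
db.qcinspections.createIndex({ commodityId: 1 });
db.qcinspections.createIndex({ commodityVariantId: 1 });
db.qcinspections.createIndex({ createdBy: 1 });
db.qcinspections.createIndex({ isDeleted: 1 });
db.qcinspections.createIndex({ isSLCMQcInspection: 1 });
db.qcinspections.createIndex({ 'commodityDetail.CIDNumber': 'text' });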

Strangely, whenever I use explain() with this pipeline I get a “MongoInvalidArgumentError: Option “explain” cannot be used on an aggregate call with writeConcern” error.
My syntax for explain() was:

const result = await this.qcInspectionModel.aggregate([],{ explain: true })

Aggregation pipeline:

async getQCResultSLcm(
    filters: GetAllQcInspectionWithFilterSlcmDto,
    businessUnitId: string,
  ) {
    let startDateQuery = {};
    let endDateQuery = {};
    let commoditySearchQuery = {};
    let variantSearchQuery = {};
    const statusQuery = {};
    let cidNumberSearchQuery = {};
    let lotNoSearchQuery = {};
    const businessUnitFilterQuery = {};
    let generalSearchQuery = {};

    if (filters.startDate) {
      startDateQuery = {
        $expr: {
          $gte: [
            '$createdAt',
            {
              $dateFromString: {
                dateString: filters.startDate,
                timezone: '+05:30',
                format: '%m-%d-%Y',
              },
            },
          ],
        },
      };
    }

    if (filters.endDate) {
      endDateQuery = {
        $expr: {
          $lt: [
            '$createdAt',
            {
              $dateAdd: {
                startDate: {
                  $dateFromString: {
                    dateString: filters.endDate,
                    timezone: '+05:30',
                    format: '%m-%d-%Y',
                  },
                },
                unit: 'day',
                amount: 1,
              },
            },
          ],
        },
      };
    }

    // if (filters.startDate) {
    //   startDateQuery = { createdAt: { $gte: DateTime.fromFormat(filters.startDate, "MM-dd-yyyy").setZone("+05:30").toJSDate() } }
    // }
    // if (filters.endDate) {
    //   endDateQuery = { createdAt: { $lt: DateTime.fromFormat(filters.startDate, "MM-dd-yyyy").plus({ days: 1 }).setZone("+05:30").toJSDate() } }
    // }

    if (filters.searchByCommodity) {
      commoditySearchQuery = {
        'commodityData.name': {
          $regex: `${filters.searchByCommodity}`,
          $options: 'i',
        },
      };
    }
    if (filters.searchByVariant) {
      variantSearchQuery = {
        'commodityVariantData.name': {
          $regex: `${filters.searchByVariant}`,
          $options: 'i',
        },
      };
    }

    if (filters.searchByStatus) {
      statusQuery['status'] = filters.searchByStatus;
    }

    if (filters.searchByCIDNumber) {
      cidNumberSearchQuery = {
        $or: [
          {
            'commodityDetail.CIDNumber': {
              $regex: `${filters.searchByCIDNumber}`,
              $options: 'i',
            },
          },
          // {
          //   'businessUnitData.name': {
          //     $regex: `${filters.searchByCIDNumber}`,
          //     $options: 'i',
          //   },
          // },
          // {
          //   'commodityDetail.LOTNumber': {
          //     $regex: `${filters.searchByLotNo}`,
          //     $options: 'i',
          //   },
          // },
          // {
          //   'qcId': {
          //     $regex: `${filters.searchByCIDNumber}`,
          //     $options: 'i',
          //   },
          // },
        ],
      };
    }
    if (filters.searchByLotNo) {
      lotNoSearchQuery = {
        $or: [
          {
            'commodityDetail.LOTNumber': {
              $regex: `${filters.searchByLotNo}`,
              $options: 'i',
            },
          },
        ],
      };
    }
    if (filters.searchByGeneralSearch) {
      generalSearchQuery = {
        $or: [
          {
            qcId: {
              $regex: `${filters.searchByGeneralSearch}`,
              $options: 'i',
            },
          },
          {
            'businessUnitData.name': {
              $regex: `${filters.searchByGeneralSearch}`,
              $options: 'i',
            },
          },
          {
            'userData.name': {
              $regex: `${filters.searchByGeneralSearch}`,
              $options: 'i',
            },
          },
        ],
      };
    }

    if (businessUnitId) {
      businessUnitFilterQuery['businessUnitId'] = new mongoose.Types.ObjectId(
        businessUnitId,
      );
    }
    // const startTime = Date.now();
    const result = await this.qcInspectionModel.aggregate([
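      // First $match: filter the base collection on date range, status, business unit, isDeleted and isSLCMQcInspection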
      {
        $match: {
          $and: [
            startDateQuery,
            endDateQuery,
            statusQuery,
            businessUnitFilterQuery,
            { isDeleted: false },
            { isSLCMQcInspection: true },
          ],
        },
      },
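      // $lookup/$unwind chain: pull display names from mastercommodities, commodityvariants, businessunits (with a nested businesses lookup) and users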
      {
        $lookup: {
          from: 'mastercommodities',
          localField: 'commodityId',
          pipeline: [
            {
              $project: {
                name: 1,
              },
            },
          ],
          foreignField: '_id',
          as: 'commodityData',
        },
      },
      {
        $unwind: '$commodityData',
      },
      {
        $lookup: {
          from: 'commodityvariants',
          localField: 'commodityVariantId',
          pipeline: [
            {
              $project: {
                name: 1,
              },
            },
          ],
          foreignField: '_id',
          as: 'commodityVariantData',
        },
      },
      {
        $unwind: '$commodityVariantData',
      },
      {
        $lookup: {
          from: 'businessunits',
          localField: 'businessUnitId',
          pipeline: [
            {
              $lookup: {
                from: 'businesses',
                localField: 'businessId',
                foreignField: '_id',
                as: 'businessClientName',
              },
            },
            {
              $unwind: '$businessClientName',
            },
            {
              $project: {
                name: 1,
                businessClientName: '$businessClientName.displayName',
              },
            },
          ],
          foreignField: '_id',
          as: 'businessUnitData',
        },
      },
      {
        $unwind: {
          path: '$businessUnitData',
          preserveNullAndEmptyArrays: true,
        },
      },
      {
        $lookup: {
          from: 'users',
          localField: 'createdBy',
          foreignField: '_id',
          as: 'userData',
          pipeline: [
            {
              $project: {
                firstName: 1,
                lastName: 1,
                _id: 0,
                name: { $concat: ['$firstName', ' ', '$lastName'] },
              },
            },
          ],
        },
      },
      {
        $unwind: {
          path: '$userData',
          preserveNullAndEmptyArrays: true,
        },
      },
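      // Second $match: case-insensitive regex search filters on base fields and on fields produced by the lookups above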
      {
        $match: {
          $and: [
            commoditySearchQuery,
            variantSearchQuery,
            generalSearchQuery,
            cidNumberSearchQuery,
            lotNoSearchQuery
          ],
        },
      },
      {
        $sort: {
          createdAt:
            filters.sortOrder && filters.sortOrder != SortOrder.Ascending
              ? SortOrder.Descending
              : SortOrder.Ascending,
        },
      },
      {
        $project: {
          _id: 1,
          status: 1,
          commodityData: 1,
          commodityDetail: 1,
          commodityVariantData: 1,
          createdAt: 1,
          qcId: 1,
          sampleName: 1,
          businessUnitData: 1,
          userData: 1,
          location: 1,
          middlewareStatus: 1,
        },
      },
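      // Paginate the requested page and count the total in a single $facet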
      {
        $facet: {
          records: [
            { $skip: (filters.pageNumber - 1) * filters.count },
            { $limit: filters.count * 1 },
          ],
          total: [{ $count: 'count' }],
        },
      }
    ]);
    // const endTime = Date.now();
    // const executionTimeMs = endTime - startTime;
    // console.log('Execution time:', executionTimeMs, 'ms');
    return result;
  }

I’m not sure putting “Solve ASAP” on your message title is going to get people running to try and resolve the issue.

Have you tried commenting out pipeline stages and trying to debug what’s causing the timeouts yourself?

There are a load of filters you're passing into the aggregate query that aren't documented, so we can't see what's going on. You also haven't included example documents in case someone wants to try to reproduce the issue.

Have you tried running the query in compass or the shell to get around the explain issue you’re having?
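
For example, in mongosh (a sketch; it assumes the collection is named qcinspections and that pipeline holds the same stages as in getQCResultSLcm):

// Run the same stages directly in the shell and ask for execution stats,
// which sidesteps the driver-level explain/writeConcern conflict.
const pipeline = [ /* same stages as in the service code */ ];
db.qcinspections.explain('executionStats').aggregate(pipeline);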


Hi @John_Sewell, thanks for your reply. Yes, I have tried commenting out pipeline stages; most of the lookups take 5 to 7 seconds to execute. I have now updated my code above, so you can find the filters there. But why am I unable to use explain()? I get the ‘MongoInvalidArgumentError: Option “explain” cannot be used on an aggregate call with writeConcern’ error.

Does your Node code set a default write concern at the connection level, either in code or in the connection string?
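
(For reference, the usual places a default write concern comes from; the Mongoose call below is illustrative, not taken from your code:)

import mongoose from 'mongoose';

// 1. In the connection string itself, e.g. ...?retryWrites=true&w=majority
// 2. In the driver/Mongoose connection options:
await mongoose.connect('mongodb+srv://user:password@host/db', {
  writeConcern: { w: 'majority' },
});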

No, there is no code that sets the writeConcern.

What’s your full connection string, with redacted username / password?

mongodb+srv://user:password.mongodb.net/test?retryWrites=true&w=majority

Remove the write concern from the connection string and try again
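
That is, drop the w=majority parameter, so (with the same redactions as above) the string becomes:

mongodb+srv://user:password.mongodb.net/test?retryWrites=true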

Yes, now I'm getting:

{
  explainVersion: '1',
  stages: [
    {
      '$cursor': [Object],
      nReturned: 30104,
      executionTimeMillisEstimate: 4925
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 8556
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 11726
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 18457
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 23009
    },
    {
      '$sort': [Object],
      totalDataSizeSortedBytesEstimate: 55157785,
      usedDisk: false,
      nReturned: 30103,
      executionTimeMillisEstimate: 23014
    },
    {
      '$project': [Object],
      nReturned: 30103,
      executionTimeMillisEstimate: 23154
    },
    {
      '$facet': [Object],
      nReturned: 1,
      executionTimeMillisEstimate: 23232
    }
  ],
  serverInfo: {
    host: '********',
    port: 27017,
    version: '5.0.20',
    gitVersion: '2cd626d8148120319d7dca5824e760fe220cb0de'
  },
  serverParameters: {
    internalQueryFacetBufferSizeBytes: 104857600,
    internalQueryFacetMaxOutputDocSizeBytes: 104857600,
    internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
    internalDocumentSourceGroupMaxMemoryBytes: 104857600,
    internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
    internalQueryProhibitBlockingMergeOnMongoS: 0,
    internalQueryMaxAddToSetBytes: 104857600,
    internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600
  },
  command: {
    aggregate: 'qcinspections',
    pipeline: [
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object]
    ],
    cursor: {},
    '$db': 'agriReach-test'
  },
  ok: 1,
  '$clusterTime': {
    clusterTime: new Timestamp({ t: 1694197055, i: 1 }),
    signature: {
      hash: new Binary(Buffer.from("e6895df3583ae070a1f20e275eb9a3f094981121", "hex"), 0),
      keyId: new Long("7226035368871591938")
    }
  },
  operationTime: new Timestamp({ t: 1694197055, i: 1 })
}

Some of those stages are taking a long time. I don't have time to take apart and rework the logic of the pipeline, but you may want to think about reworking the relationships and storage rather than storing things across multiple collections.
Things to think about:

  • Embed data within the document you’re reading
  • Create a materialised view of the data (see the sketch after this list)
  • Keep copies of data on the parent and update them on a write
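
A rough sketch of the materialised-view idea (the match and collection names come from your pipeline; the qcInspectionReport output collection is a made-up name): run the joins once, on a schedule or after bulk writes, write the result out with $merge, and let the API read the pre-joined collection directly.

// Build the pre-joined report collection once, outside the request path.
db.qcinspections.aggregate([
  { $match: { isDeleted: false, isSLCMQcInspection: true } },
  // ...the same $lookup / $unwind / $project stages as in getQCResultSLcm...
  {
    $merge: {
      into: 'qcInspectionReport',       // hypothetical output collection
      on: '_id',
      whenMatched: 'replace',
      whenNotMatched: 'insert',
    },
  },
]);

The request-time query against qcInspectionReport then only needs the date/status/search $match, the $sort and the $skip/$limit, so the cost of the lookups is paid once per refresh instead of on every call.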

If this is a heavily used query then reshape the data to match it; if it isn't, leave it as is and take the performance hit.

The original subject said it was slow with 30,000 records, but it isn't really: it's slow to process 30,000 records plus ALL the data you're linking in (and unwinding, sorting, etc.).

We have queries that are massive, basically end-of-month reconciliation-type reports, and we either accepted the time they took to run (after all, they ran once a month) or, for more frequently used queries, we kept the data up to date on writes so it was in a fast format to query. There is no point recalculating data for 30,000 records every time when only three of them change every few days.
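
A minimal sketch of that "keep copies up to date on writes" approach, assuming you can change the insert path (the commodityModel and commodityVariantModel names are illustrative, not from the posted code): copy the display names onto the inspection document when it is created, so the report query no longer needs those lookups.

// At insert time, denormalise the names the report needs (null checks omitted).
const commodity = await this.commodityModel.findById(dto.commodityId, { name: 1 }).lean();
const variant = await this.commodityVariantModel.findById(dto.commodityVariantId, { name: 1 }).lean();

await this.qcInspectionModel.create({
  ...dto,
  commodityData: { _id: commodity._id, name: commodity.name },         // embedded copy
  commodityVariantData: { _id: variant._id, name: variant.name },      // embedded copy
});

If a commodity or variant name can change, the embedded copies need updating in the same write path, or you accept slightly stale names in the report.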

Sorry I haven't been more helpful, but I think that's your next approach: take a step back, work out what you really want this query to do, and optimise the process rather than the query.

Maybe someone else has an idea of what to change easily that I’ve missed.


Okay, thank you @John_Sewell for your time.

Is there no one who can solve this question?

Did you make any progress on your side?

What did you try?

@John_Sewell presented a few things to try, like embedding and a materialised view. Were any of them useful? Which ones did you try? What worked and what failed?

Other people cannot keep working on your issue if you do not follow up on the previous advice.