Aggregation query inconsistently failing

My aggregation queries have been failing due to exceeding the time limit. I’ve tried to optimize where I can. If you have the time, I would greatly appreciate it if you could give me some guidance on how I can optimize it further so that I can complete the query. No worries if you do not.

This is the shape of the model I am working with:

    export class AbyssBattle {
      @Field(() => String)
      _id: MongooseSchema.Types.ObjectId;

      @Field(() => String)
      @Prop({ required: true })
      floor_level: string;

      @Field(() => Number)
      @Prop({ required: true })
      battle_index: number;

      @Field(() => Number)
      @Prop({ required: true })
      star: number;

      @Field(() => [String])
      @Prop({
        type: [MongooseSchema.Types.ObjectId],
        ref: 'PlayerCharacter',
        required: true,
      })
      party: MongooseSchema.Types.ObjectId[];
    }

Where party is an array of four PlayerCharacter ObjectId. The shape of the PlayerCharacter model is the following:

    export class PlayerCharacter {
      @Field(() => String)
      _id: MongooseSchema.Types.ObjectId;

      @Field(() => String)
      @Prop({
        type: MongooseSchema.Types.ObjectId,
        ref: 'Character',
        required: true,
      })
      character: MongooseSchema.Types.ObjectId;

      @Field(() => String)
      @Prop({
        type: MongooseSchema.Types.ObjectId,
        ref: 'ArtifactSetBuild',
        required: true,
      })
      artifactSetBuild: MongooseSchema.Types.ObjectId;

      @Field(() => String)
      @Prop({
        type: MongooseSchema.Types.ObjectId,
        ref: 'Weapon',
        required: true,
      })
      weapon: MongooseSchema.Types.ObjectId;
    }

I aggregate on a lot of these fields including character, artifactSetBuild, and weapon, though I do query them individually to split the work.

This is one of the aggregation queries that keep failing. There are currently 556887 AbyssBattle documents, and they are spread at varying proportions throughout each respective floor_level and battle_index.

        return this.abyssBattleModel
          .aggregate([
            {
              $match: {
                floor_level,
                battle_index,
              },
            },
            {
              $lookup: {
                from: 'playercharacters',
                localField: 'party',
                foreignField: '_id',
                as: 'party',
              },
            },
            {
              $unwind: '$party',
            },
            {
              $project: {
                artifactSets: '$party.artifactSetBuild',
                star: 1,
              },
            },
            {
              $group: {
                _id: '$artifactSets',
                count: {
                  $sum: 1,
                },
                avgStar: {
                  $avg: '$star',
                },
                winCount: {
                  $sum: {
                    $cond: { if: { $eq: ['$star', 3] }, then: 1, else: 0 },
                  },
                },
              },
            },
            {
              $lookup: {
                from: 'artifactsetbuilds',
                localField: '_id',
                foreignField: '_id',
                as: 'artifactSets',
              },
            },
            {
              $project: {
                artifactSets: '$artifactSets.sets',
                count: 1,
                avgStar: 1,
                winCount: 1,
              },
            },
            {
              $sort: {
                count: -1,
              },
            },
            {
              $limit: limit,
            },
          ])

In the very first step I filter the documents with floor_level and battle_index. I have created a compound index on both of these fields to optimize this process. I then use a $lookup in order to populate the party field with the PlayerCharacter fields.

Example of the process:

Data sample:

    [
      {
        floor_level: '11-1',
        battle_index: 1,
        party: [ObjectId(1a), ObjectId(1b), ObjectId(1c), ObjectId(1d)],
        star: 3,
      },
      {
        floor_level: '11-1',
        battle_index: 1,
        party: [ObjectId(1a), ObjectId(1b), ObjectId(1c), ObjectId(1d)],
        star: 2,
      },
      {
        floor_level: '11-1',
        battle_index: 1,
        party: [ObjectId(1c), ObjectId(1d), ObjectId(1e), ObjectId(1f)],
        star: 3,
      },
    ];

After $lookup:

      {
        floor_level: '11-1',
        battle_index: 1,
        party: [
          {
            character: ObjectId(2a)
            artifactSetBuild: ObjectId(3a),
            weapon: ObjectId(4a)
          }, 
          {
            character: ObjectId(2b)
            artifactSetBuild: ObjectId(3b),
            weapon: ObjectId(4b)
          },
          {
            character: ObjectId(2c)
            artifactSetBuild: ObjectId(3c),
            weapon: ObjectId(4c)
          },
          {
            character: ObjectId(2d)
            artifactSetBuild: ObjectId(3d),
            weapon: ObjectId(4d)
          },
        ],
        star: 3,
      },

I perform $unwind and use $group to aggregate data on the various fields. Around this stage is when I reach the exceeded time limit error for some of the floor_level and battle_index. I have even tried filtering after the first $lookup on PlayerCharacter by Character but that did not resolve the issue. I cannot currently think of a way to further optimize this query, and I was wondering what could be my next steps. Thank you.

Hello, I cannot tell for sure what the issue is right away, but I see that the above $project stage is not required (it is an unnecessary stage to process). In the case with the $project stage after the second $lookup - move it to after the $limit stage. Try these and see if there is any improvement.

Also, see this for some tips on: Aggregation Pipeline Optimization

1 Like