I want to use indexes on the secondary collection in mongodb

Hi @Aditya_Sharma7

Thank you for sharing the sample data and the necessary information.

There are a few points which I would like to mention after we have triaged the issue.

  1. From the sample data posted and the query posted, the query does not seem to return any data as a response. Can you confirm, if the query is expected to perform in the similar way?

  2. The aggregation query mentioned in he post looks complicated. I noted that you used $exists: false , and $ne: true . Note that in general, indexes and databases don’t perform best when you ask it to find things that doesn’t exist or doesn’t match. For example, instead of $ne: true , would it be possible to use $eq: false ? Can you help me understand to breakdown the query?

  3. Based on the sample data and the expected response posted, I tried to create the aggregation pipeline with a new index created on contact_id.
    The aggregation query looks lie:

db.users.aggregate([
    {
      '$lookup': {
        'from': 'users_data', 
        'localField': '_id', 
        'foreignField': 'contact_id', 
        'as': 'audience_id'
      }
    }, {
      '$unwind': {
        'path': '$audience_id'
      }
    }, {
      '$addFields': {
        'audience_id': '$audience_id.audience_id'
      }
    }
  ])

and return the response as:

....
{
    _id: ObjectId("63fc6a7ae47bdd0489381512"),
    first_name: 'Last',
    last_name: 'Test',
    full_name: 'Last Test',
    phone_number: '9816428369',
    email: 'adityasharmauna340@gmail.com',
    created_on: 1677486714,
    date: ISODate("2023-02-27T08:31:54.881Z"),
    is_new: 0,
    note: '',
    is_bounced: false,
    not_valid: false,
    is_verified: true,
    audience_id: ObjectId("62199c94eb3129e94bdeacb5")
  },
....

As mentioned in the SERVER-22622: Improve $lookup explain to indicate query plan on the “from” collection, the indexUsed parameter of the explain output defines the index used in the $lookup stage of the pipeline.

{
      '$lookup': {
        from: 'users_data',
        as: 'audience_id',
        localField: '_id',
        foreignField: 'contact_id',
        unwinding: { preserveNullAndEmptyArrays: false }
      },
      totalDocsExamined: Long("8"),
      totalKeysExamined: Long("8"),
      collectionScans: Long("0"),
      indexesUsed: [ 'contact_id_1' ],
      nReturned: Long("9"),
      executionTimeMillisEstimate: Long("1")
    },

Can you help me by confirming if the above query help in improving the performance of the query for your use case. If not, please share more details on what you are trying to achieve

Let us know if you have any further questions.
Regards
Aasawari

3 Likes