Finding ID of documents with duplicated elements in nested array

I would like to extract from the collection the IDs of documents that have duplicate IDs of “drives” objects that are nested in the array that is in “streetModel”.
This is my typical document :

  {
    "_id": {
        "$oid": "61375bec4fa522001b608568"
    },
    "name": "Streetz",
    "statusDetail": {},
    "streetModel": {
        "_id": "3.7389-51.0566",
        "name": "Kosheen - Darude - Swedish - Trynidad - Maui",
        "countryCode": "DEN",
        "drives": [{
            "_id": -903500698,
            "direction": "WEST"
            }, {
            "_id": 1915399546,
            "direction": "EAST"
            }, {
            "_id": 1294835467,
            "direction": "NORTH"
            }, {
            "_id": 1248969937,
            "direction": "EAST"
            }, {
            "_id": 1248969937,
            "direction": "EAST"
            }, {
            "_id": 1492411786,
            "direction": "SOUTH"
                }]
    },
    "createdAt": {
            "$date": "2021-09-07T12:32:44.238Z"
        }
    }

In this particular document with the ID 61375bec4fa522001b608568, in “streetModel”, in “drives” array I have got duplicated drives objects with id 1248969937.
I would like to create a query to the database that will return the ID of all documents with such a problem (duplicate “drives”).
Right now I have got this:

db.streets.aggregate([
  {
    $unwind: "$streetModel"
  },
  {
    $unwind: "$drives"
  },
  {
    $group: {
      _id: {
        id: "$_id"
      },
      sum: {
        $sum: 1
      },

    }
  },
  {
    $match: {
      sum: {
        $gt: 1
      }
    }
  },
  {
    $project: {
      _id: "$_id._id",
      duplicates: {
        drives: "$_id"
      }
    }
  }
])

but that’s not it.
I try in many ways to rewrite this query, but unfortunately it doesn’t work.

  • don’t need to $unwind for streetModel field because it is an object and unwind will work only on array fields
  • directly $unwind the object field streetModel.drives
  • $group by main _id and drives._id and get total sum
  • $match the greater than condition
  • you can use $project stage for formatting result
db.streets.aggregate([
  { $unwind: "$streetModel.drives" },
  {
    $group: {
      _id: {
        _id: "$_id",
        drives_id: "$streetModel.drives._id"
      },
      sum: { $sum: 1 }
    }
  },
  { $match: { sum: { $gt: 1 } } }
])

Result would be:

[
  {
    "_id": {
      "_id": ObjectId("61375bec4fa522001b608568"),
      "drives_id": 1.248969937e+09
    },
    "sum": 2
  }
]
1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.