$addToSet with custom comparator

Hi everyone.
I have the following document:

{
    "arr": [
        {
            "id": 1,
            "title": "title1"
        },
        {
            "id": 2,
            "title": "title2"
        }
    ]
}

I need to add a new object to the arr field, but only if it’s not already present there.
I need to be able to specify what “present” means. In most languages that’s done with “comparators”. How do I provide some sort of comparator to Mongo’s $addToSet?

In my case I want to compare objects by the id field.
So the following objects should be considered equal:

{id: 1, title: "whatever"}
{id: 1, title: "doesn't matter"}

The official documentation of $addToSet provides only trivial examples where the array consists of simple elements like strings or numbers, which obviously don’t need any comparators.

Hello, @Mykola_Ilminsky ! Welcome to the MongoDB community! :wave:

You can’t do that. $addToSet treats an object as one big single value. So, from the $addToSet operator’s point of view, two objects are equal only if they have the same set of fields, in the same order, with exactly the same values. To compare only selected fields of the objects, you need a different solution.
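
To make the difference concrete, here is a rough plain-JavaScript model of the two notions of equality. It is not MongoDB code: JSON.stringify only approximates the server's whole-value comparison, and addToSetWhole / addToSetBy are hypothetical helpers written for this illustration.

```javascript
// Rough model of $addToSet: an object counts as a duplicate only if the
// whole value matches (same fields, same order, same values).
function addToSetWhole(arr, item) {
  const seen = arr.some((el) => JSON.stringify(el) === JSON.stringify(item));
  return seen ? arr : [...arr, item];
}

// What the question asks for: a duplicate is decided by a key function.
function addToSetBy(arr, item, keyOf) {
  const seen = arr.some((el) => keyOf(el) === keyOf(item));
  return seen ? arr : [...arr, item];
}

const arr = [
  { id: 1, title: 'title1' },
  { id: 2, title: 'title2' },
];
const candidate = { id: 1, title: 'whatever' };

// Whole-value comparison: the titles differ, so the candidate is appended.
const whole = addToSetWhole(arr, candidate);
// Comparison by id: id 1 already exists, so the array is left unchanged.
const byId = addToSetBy(arr, candidate, (x) => x.id);

console.log(whole.length, byId.length);
```

The second function is the behaviour the rest of this answer tries to reproduce with server-side operators.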

Let me demonstrate how it can be done without $addToSet.

First, I will create a sample dataset to work with:

db.comparators.insertMany([
  {
    _id: 'A',
    arr: [
      {
        id: 1,
        title: 'title 1'
      },
      {
        id: 2,
        title: 'title 2'
      }
    ]
  },
  {
    _id: 'B',
    arr: [
      {
        id: 1,
        title: 'title 1'
      },
      {
        id: 2,
        title: 'title 2'
      },
      {
        id: 1,
        title: 'title 3'
      }
    ]
  }
]);

Use case 1. Update the arr field of a document with a new, but “unique” object.

Let’s suppose we want to add brandNewItem to the array of document A. Looking at the dataset, an item with id=2 is already in the array, so we should check for its existence and not insert an item with a duplicate id.

const brandNewItem = {
  id: 2,
  title: 'title 2',
};

This is how it can be done in the aggregation pipeline:

db.comparators.aggregate([
  {
    $match: {
      _id: 'A', // match specific document to update
    }
  },
  {
    // check if array already contains object with the same id
    $addFields: {
      totalFound: {
        $reduce: {
          input: '$arr',
          initialValue: 0,
          in: {
            $cond: [
              // brandNewItem variable is used
              { $eq: ['$$this.id', brandNewItem.id] },
              { $add: ['$$value', 1] }, // match: increment the counter
              '$$value' // no match: keep the counter as is
            ]
          },
        }
      }
    }
  },
  {
    $project: {
      arr: {
        $cond: [
          { $eq: ['$totalFound', 0 ] },
          { 
            // brandNewItem variable is used
            $concatArrays: ['$arr', [brandNewItem]]
          },
          '$arr', // return 'arr' array as it was initially
        ],
      }
    }
  }
]);

The aggregation pipeline above won’t add the item with id=2, but if you change its id to 3, the item will be added. Note that aggregate() only returns the result; to persist it, add a $merge stage at the end of the pipeline.
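
Another way to persist the change is an update with an aggregation pipeline, which does the check and the append in one operation. The sketch below only builds the pipeline; buildAddIfAbsentById is a hypothetical helper name, and it assumes MongoDB 4.2 or later and that arr already exists as an array in the matched document.

```javascript
// Builds an update pipeline that appends `item` to `arr` only when no
// existing element has the same id. Hypothetical helper, not a driver API.
function buildAddIfAbsentById(item) {
  return [
    {
      $set: {
        arr: {
          $cond: [
            // '$arr.id' resolves to the array of ids already stored in arr
            { $in: [{ $literal: item.id }, '$arr.id'] },
            '$arr', // id already present: keep the array as it is
            // $literal keeps values inside `item` from being parsed as expressions
            { $concatArrays: ['$arr', [{ $literal: item }]] },
          ],
        },
      },
    },
  ];
}

const pipeline = buildAddIfAbsentById({ id: 3, title: 'title 3' });
console.log(JSON.stringify(pipeline, null, 2));

// In mongosh, under the assumptions stated above:
// db.comparators.updateOne({ _id: 'A' }, pipeline);
```

With id 3 the item would be appended to document A; with id 2 the $cond would return the array unchanged.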

Use case 2. Remove duplicates inside a $group stage
What if you want to remove items with a duplicated id from your arr array?
It can be done like this:

db.comparators.aggregate([
  {
    $match: {
      _id: 'B', // match specific document
    }
  },
  {
    $unwind: '$arr'
  },
  {
    $group: {
      _id: {
        docId: '$_id',
        arrItemId: '$arr.id'
      },
      arr: {
        $addToSet: {
          id: '$arr.id',
          title: '$arr.title'
        }
      }
    }
  },
  {
    $group: {
      _id: '$_id.docId',
      arr: {
        $push: 
          { $arrayElemAt: ['$arr', 0] }
      }
    }
  }
]);
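
Since $addToSet and $group do not guarantee element order, the pipeline above may return the items in a different order, and which title survives for a duplicated id is not fixed. If that matters, a single $reduce over the array is one possible alternative. The sketch below is an assumption-laden illustration: buildDedupeByIdStage is a hypothetical helper that builds the stage, and dedupeById is a plain-JavaScript mirror of the intended "first occurrence wins" logic.

```javascript
// Builds a $set stage that keeps only the first element seen for each id.
// Hypothetical helper; the stage is meant for an aggregation pipeline.
function buildDedupeByIdStage() {
  return {
    $set: {
      arr: {
        $reduce: {
          input: '$arr',
          initialValue: [],
          in: {
            $cond: [
              // '$$value.id' is the array of ids collected so far
              { $in: ['$$this.id', '$$value.id'] },
              '$$value', // id already collected: skip this element
              { $concatArrays: ['$$value', ['$$this']] },
            ],
          },
        },
      },
    },
  };
}

// Plain-JavaScript mirror of the same logic, for reasoning about the result.
function dedupeById(arr) {
  return arr.reduce(
    (value, item) =>
      value.some((el) => el.id === item.id) ? value : [...value, item],
    []
  );
}

const deduped = dedupeById([
  { id: 1, title: 'title 1' },
  { id: 2, title: 'title 2' },
  { id: 1, title: 'title 3' },
]);
console.log(deduped);

// In mongosh:
// db.comparators.aggregate([{ $match: { _id: 'B' } }, buildDedupeByIdStage()]);
```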

OR
You can go crazy and use $accumulator for the same purpose. Note that the solution with $accumulator may be slower than the previous one, as it runs custom JavaScript code. Only use the $accumulator operator if the provided pipeline operators cannot fulfill your application’s needs.

db.comparators.aggregate([
  {
    $match: {
      _id: 'B', // match specific document
    }
  },
  {
    $unwind: '$arr'
  },
  {
    $group: {
      _id: '$_id',
      arr: {
        $accumulator: {
          init: function () {
            return {};
          },
          accumulate: function (state, arrItem) {
            state[arrItem.id] = arrItem;
            return state;
          },
          accumulateArgs: ['$arr'],
          merge: function (state1, state2) {
            // combine two partial states instead of discarding them
            return Object.assign(state1, state2);
          },
          finalize: function (state) {
            return Object.keys(state).map(function (key) {
              return state[key];
            });
          },
          lang: 'js'
        }
      }
    }
  }
]);
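
The merge function matters whenever the server combines partial states, for example on a sharded cluster: returning an empty object there would throw away everything accumulated so far. Here is a plain-JavaScript simulation of the accumulator lifecycle, assuming a merge that combines both states with Object.assign. The acc object and the split into two parts are made up for the illustration; the server decides by itself when merge is called.

```javascript
// The four accumulator callbacks, as plain functions
const acc = {
  init: function () {
    return {};
  },
  accumulate: function (state, arrItem) {
    state[arrItem.id] = arrItem; // a later item with the same id overwrites
    return state;
  },
  merge: function (state1, state2) {
    return Object.assign(state1, state2); // entries of state2 win on conflict
  },
  finalize: function (state) {
    return Object.keys(state).map(function (key) {
      return state[key];
    });
  },
};

// Pretend the items of document B were accumulated in two separate parts
const partA = [
  { id: 1, title: 'title 1' },
  { id: 2, title: 'title 2' },
].reduce(acc.accumulate, acc.init());
const partB = [{ id: 1, title: 'title 3' }].reduce(acc.accumulate, acc.init());

const result = acc.finalize(acc.merge(partA, partB));
console.log(result);
// [ { id: 1, title: 'title 3' }, { id: 2, title: 'title 2' } ]
```

In this simulation the last item seen for an id wins. On a real server the order in which items reach the accumulator is not guaranteed, so do not rely on which title is kept.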

An alternative for very simple cases: rather than $addToSet, use $push, and add the following condition to your query filter.

{ "arr.id" : { "$ne" : item.id } } 

Using the collection:

{
  _id: 101,
  arr: [
    { id: 1, title: 'title1' },
    { id: 2, title: 'title2' }
  ]
}
{
  _id: 102,
  arr: [
    { id: 1, title: 'title1' },
    { id: 3, title: 'title3' }
  ]
}

Using the same item as before:

const brandNewItem = {
  id: 2,
  title: 'title 2',
};

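The following update (c is the collection) will not modify anything, because document 101 already contains an item with id 2, so the filter matches no document: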

c.updateOne( { "_id" : 101 , "arr.id" : { "$ne" : brandNewItem.id } } , { "$push" : { "arr" : brandNewItem } } )

while the following will push the item, because document 102 has no item with id 2:

c.updateOne( { "_id" : 102 , "arr.id" : { "$ne" : brandNewItem.id } } , { "$push" : { "arr" : brandNewItem } } )
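
The two calls differ only in the _id, so the pattern can be wrapped in a small helper. This is a minimal sketch: buildPushIfAbsent is a hypothetical name, and newItem is a stand-in for the item being added.

```javascript
// Builds the filter and update for "push only if no element has this id".
// Hypothetical helper, not part of any driver.
function buildPushIfAbsent(docId, item) {
  return {
    // matches the document only when no element of arr has item.id
    filter: { _id: docId, 'arr.id': { $ne: item.id } },
    update: { $push: { arr: item } },
  };
}

const newItem = { id: 2, title: 'title 2' };
const { filter, update } = buildPushIfAbsent(102, newItem);
console.log(JSON.stringify(filter), JSON.stringify(update));

// In mongosh, with c being the collection:
// const res = c.updateOne(filter, update);
// res.matchedCount is 0 when the id is already present
// (or when no document has that _id), and 1 when the item was pushed.
```

Because an update to a single document is atomic, the existence check and the push cannot be separated by a concurrent write to the same document.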