Aggregation with transactional data should return unique value only

Hi everyone,

seeking help with an aggregations query. I have a collection Matches which has transactional data in it, identified by kit1 and kit2.

This is a sample doc (other fields hidden for simplicity):

{ “_id” : “P5M5NzYGJ5aKk5FSv”, “kit1” : “7jVr2Hul1Vae3bqRVhyKA”, “kit2” : “7lljS20fzXF-uv2aAh3WE” }

I’m querying for all Matches docs where either kit1 or kit2 is equal to a certain identifier (kitRemove') and want to return the identifier that is not kitRemove` (using Meteor framework in my app - hence the use of JavaScript code):

const kitsWithRelationArray = await MatchesRaw.aggregate([
        { $match: { $or: [{ kit1: kitRemove }, { kit2: kitRemove }] } },
        { $project: { kit: { $cond: { if: { $eq: ["$kit1", kitRemove] }, then: "$kit2", else: "$kit1" } }, _id: 0 } },
    ], { session: mongoSession }).toArray();

This is the result I get from the above query:
MongoDb aggregate result

I’m facing two problems with the result:

  1. I get some kit identifiers as duplicates because more than 1 Matches doc can exist for the same kit1 vs kit2 combination.

Example:

{ “_id” : “troqybYEBCPB97cDr”, “kit1” : “7jVr2Hul1Vae3bqRVhyKA”, “kit2” : “85n_Re9XRQCiYQd-VxzhV” }
{ “_id” : “YKXKhaNf7xCrihFPM”, “kit1” : “7jVr2Hul1Vae3bqRVhyKA”, “kit2” : “85n_Re9XRQCiYQd-VxzhV” }

Question 1: How can avoid getting duplicate values like this?

  1. I’m currently getting an array of objects back. While I can certainly handle that, it would be nicer to just get an array of those unique identifiers like this:

[“7lljS20fzXF-uv2aAh3WE”, “7pgnWX288a3V0RdIF-zlq”, “85n_Re9XRQCiYQd-VxzhV”]

Question 2: How can I alter the query to get an array of strings (kit) back?

Many thanks in advance, still learning all the bits and pieces of MongoDb’s powerful functions.

Hello @Andreas_West, here are some ideas and solutions.

Question 1: How can avoid getting duplicate values like this?

In your query, after the $project stage, include this new $group stage, to get the distinct kit values.

{ 
  $group: { 
      _id: null, 
      "distinct_kits": { $addToSet: "$kit" } 
  } 
}

Then extract the distinct_kits from the kitsWithRelationArray:

var distinct_kits_array = kitsWithRelationArray[0].distinct_kits

The distinct_kits_array will be an array of distinct kit values - your final result.


Question 2: How can I alter the query to get an array of strings (kit) back?

There is no way to get an array of strings from the aggregation directly - this is because the aggregation result is always a cursor of document(s) (or object(s)). You need to extract the values as per your application needs within the application code (as I had shown above).


NOTE: I had noticed that in your query’s $match stage you are checking for kit1 or kit2 matching. I am assuming you don’t need the check for kit1 and kit2 are matching.

Thanks @Prasad_Saya for taking time to look at my problem and come up with a solution.

Will implement the additional $group stage step!

It gets rid of the duplicate values and I think I can work around the fact that it’s now an array of length 1 with an object which then has the final array in it that I want:

Screenshot 2021-04-15 at 22.09.15

That is correct, they can never be the same.

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.