Double summation in group stage after 2 unwind ops


Here is an illustration of the problem:

Initial Document Structure: Suppose a document has two array fields, array1 and array2.

  1. First $unwind: After unwinding array1, if array1 has n elements, you get n documents.
  2. Second $unwind: If you then unwind array2, and if array2 has m elements, each of the n documents will be multiplied by m, resulting in n * m documents.

When a $group stage is then applied to sum a field, this multiplication of documents can lead to overcounting, as each original document is now represented multiple times.
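To make the overcounting concrete, here is a minimal sketch in plain JavaScript that imitates the two $unwind stages and the $group sum outside of MongoDB. The document, the field name `price`, and the helper `unwind` are assumptions made up for illustration; the helper is a simplified stand-in, not the server's implementation.

```javascript
// Hypothetical document: the field names (price, array1, array2) are illustrative only.
const docs = [{ _id: 1, price: 10, array1: ["a", "b"], array2: ["x", "y", "z"] }];

// Simplified stand-in for $unwind: emit one copy of the document per array element.
function unwind(documents, field) {
  return documents.flatMap((doc) =>
    doc[field].map((element) => ({ ...doc, [field]: element }))
  );
}

// Two consecutive unwinds produce n * m = 2 * 3 = 6 copies of the one document.
const unwound = unwind(unwind(docs, "array1"), "array2");

// Stand-in for { $group: { _id: null, total: { $sum: "$price" } } }.
const total = unwound.reduce((sum, doc) => sum + doc.price, 0);

console.log(unwound.length); // 6
console.log(total); // 60, although the original document's price is 10
```

The sum is inflated by exactly the factor n * m, which is why any top-level field summed after both unwinds comes out too large.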

Is there a generally accepted practice or pattern to avoid this? Perhaps the choice of using two $unwind stages is the issue? Or can the $group stage be modified to handle it?

It would be easier for us to understand your use-case if you could share sample documents and expected results.

Definitely, using $unwind to do array computation within a single document is a waste of time and space. You should look at $reduce instead.

Once the array in each document has been reduced with $reduce, you can use $group for inter-document computations.
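Since no sample documents were shared, the following is only a sketch of that two-step approach under an assumed schema: each document has a `category` field, and `array1` and `array2` hold subdocuments with a numeric `amount`. All of these names, and the collection name `orders`, are hypothetical.

```javascript
// Assumed document shape (hypothetical):
// { category: "A", array1: [{ amount: 5 }, { amount: 7 }], array2: [{ amount: 1 }] }
const pipeline = [
  // Intra-document step: fold each array into one number without creating extra documents.
  {
    $addFields: {
      array1Total: {
        $reduce: {
          input: "$array1",
          initialValue: 0,
          in: { $add: ["$$value", "$$this.amount"] },
        },
      },
      array2Total: {
        $reduce: {
          input: "$array2",
          initialValue: 0,
          in: { $add: ["$$value", "$$this.amount"] },
        },
      },
    },
  },
  // Inter-document step: each original document now contributes exactly once.
  {
    $group: {
      _id: "$category",
      array1Total: { $sum: "$array1Total" },
      array2Total: { $sum: "$array2Total" },
    },
  },
];

// In mongosh this could be run as: db.orders.aggregate(pipeline)
```

Because neither array is unwound, the number of documents reaching $group equals the number of original documents, so no n * m multiplication occurs.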

@steevej Your reply made me wonder. When is $unwind not a waste of time and space and a good choice?

I am not really sure, but I would guess that you use $unwind when what you want to do cannot be done with $map, $filter or $reduce. It is a good choice when you do not have any other choice.
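One case that seems to fit this description is grouping by the array elements themselves across documents, since the group key has to come from inside the array. A minimal sketch, assuming a hypothetical collection `posts` whose documents carry a `tags` array of strings:

```javascript
// Assumed document shape (hypothetical): { title: "...", tags: ["mongodb", "aggregation"] }
const tagCounts = [
  // One document per tag, so that the tag itself can serve as the group key.
  { $unwind: "$tags" },
  // Count how many documents mention each tag across the whole collection.
  { $group: { _id: "$tags", count: { $sum: 1 } } },
  // Most frequent tags first.
  { $sort: { count: -1 } },
];

// In mongosh this could be run as: db.posts.aggregate(tagCounts)
```

Only one array is unwound here, and what is counted is the elements rather than a top-level field, so the n * m overcounting does not arise.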

