Double summation in group stage after 2 unwind ops

Vladimir · November 15, 2023, 3:21pm

Hello,

To illustrate the problem:

Initial Document Structure: Suppose a document has two array fields, array1 and array2.

First $unwind: After unwinding array1, if array1 has n elements, you get n documents.
Second $unwind: If you then unwind array2, and if array2 has m elements, each of the n documents will be multiplied by m, resulting in n * m documents.

When a $group stage is then applied to sum a field, this multiplication of documents can lead to overcounting, as each original document is now represented multiple times.
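
To make the overcounting concrete, here is a minimal Python sketch that emulates the two $unwind stages on a single in-memory document. It does not talk to MongoDB; the field names (`amount`, `tag`) and the values are hypothetical and only serve the illustration.

```python
from itertools import product

# Hypothetical document: field names and values are illustrative only.
doc = {
    "_id": 1,
    "array1": [{"amount": 10}, {"amount": 20}],            # n = 2
    "array2": [{"tag": "a"}, {"tag": "b"}, {"tag": "c"}],  # m = 3
}


def unwind_twice(d):
    """Emulate $unwind on array1 followed by $unwind on array2."""
    return [
        {"_id": d["_id"], "array1": a, "array2": b}
        for a, b in product(d["array1"], d["array2"])
    ]


rows = unwind_twice(doc)

# The total we actually want: each array1 element counted once.
true_total = sum(item["amount"] for item in doc["array1"])

# What a $group with $sum over the unwound rows would see.
grouped_total = sum(row["array1"]["amount"] for row in rows)

print(len(rows))      # 6, i.e. n * m
print(true_total)     # 30
print(grouped_total)  # 90, i.e. true_total * m
```

Every element of array1 appears once per element of array2, so the sum is inflated by a factor of m.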

I would like to kindly ask for a generally accepted practice or pattern to avoid this. Is the use of two $unwind stages itself the issue, or can the $group stage be modified to handle it?

steevej · November 15, 2023, 6:14pm

It would be easier for us to understand your use-case if you could share sample documents and expected results.

Doing $unwind for array computation within a single document is definitely a waste of time and space. You should definitely look at $reduce.

Once the array in each document has been reduced with $reduce, you can use $group for inter-document computations.
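
A rough sketch of that two-step idea, since no sample documents were shared: the pipeline below is written as the list of dicts that pymongo's `aggregate` accepts, and the plain-Python function next to it mirrors the same arithmetic so it can be checked without a database. The field name `amount`, the output names `array1Total` and `total`, and the sample documents are all assumptions.

```python
from functools import reduce

# Step 1: collapse array1 inside each document with $reduce (no $unwind).
# Step 2: sum the per-document totals across documents with $group.
# Field names here are hypothetical.
pipeline = [
    {"$project": {
        "array1Total": {
            "$reduce": {
                "input": "$array1",
                "initialValue": 0,
                "in": {"$add": ["$$value", "$$this.amount"]},
            }
        },
    }},
    {"$group": {"_id": None, "total": {"$sum": "$array1Total"}}},
]


def per_document_total(doc):
    """Plain-Python mirror of the $reduce expression above."""
    return reduce(lambda acc, item: acc + item["amount"], doc["array1"], 0)


# Hypothetical sample data; array2 is present but never multiplies the rows.
docs = [
    {"_id": 1,
     "array1": [{"amount": 10}, {"amount": 20}],
     "array2": [{"tag": "a"}, {"tag": "b"}, {"tag": "c"}]},
    {"_id": 2,
     "array1": [{"amount": 5}],
     "array2": [{"tag": "a"}, {"tag": "b"}]},
]

# Mirror of the $group stage: one value per document, summed once.
total = sum(per_document_total(d) for d in docs)
print(total)  # 35
```

Because each document contributes exactly one value to the $group stage, the size of array2 no longer affects the sum.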

Vladimir · November 15, 2023, 7:22pm

@steevej Your reply made me wonder. When is $unwind not a waste of time and space and a good choice?

steevej · November 15, 2023, 8:09pm

I am not really sure, but I would guess it is when what you want to do cannot be done with $map, $filter, or $reduce. It is a good choice when you have no other choice.
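
One case that seems to fit this description, offered as an illustration rather than something stated in the thread: grouping across documents by the value of an array element, for example counting how many documents carry each tag. The per-document operators work inside one document, whereas here the array element becomes the $group key. The `tags` field and the sample data are hypothetical, and the Python function only emulates the pipeline in memory.

```python
from collections import Counter

# Pipeline in the shape pymongo's aggregate accepts; `tags` is a hypothetical field.
pipeline = [
    {"$unwind": "$tags"},
    {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
]

# Hypothetical sample documents.
docs = [
    {"_id": 1, "tags": ["red", "blue"]},
    {"_id": 2, "tags": ["blue"]},
]


def count_tags(documents):
    """Emulate $unwind on tags, then $group by tag with a count."""
    unwound = [{"_id": d["_id"], "tags": t} for d in documents for t in d["tags"]]
    return Counter(row["tags"] for row in unwound)


tag_counts = count_tags(docs)
print(dict(tag_counts))  # {'red': 1, 'blue': 2}
```

Only one array is unwound here, so there is no n * m multiplication of rows.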
