Two $unwind stages for two document fields in same pipeline

I have two fields - both arrays of objects.
I need to unwind both and run a match on each. Right now I have them split into two separate aggregations. $unwind is a mutating stage so is it possible to to do the second $unwind in the same pipeline ?

Working in Node, I can run them separately as promises and using promise.all get both aggregations in one response. I still want to see if I can do this in one aggregation.

It will be easier to help you if you supply sample documents.

You may have as many unwind stage as you have array in your documents. Example:

// Starting with the following collection with 2 arrays
c.find()
{ _id: 1, a: [ 1, 2, 3 ], b: [ 4, 5, 6 ] }

// A pipeline with 2 unwind stages
c.aggregate(  [ { '$unwind': '$a' } , { '$unwind': '$b' } ] )

// Gives the following
{ _id: 1, a: 1, b: 4 }
{ _id: 1, a: 1, b: 5 }
{ _id: 1, a: 1, b: 6 }
{ _id: 1, a: 2, b: 4 }
{ _id: 1, a: 2, b: 5 }
{ _id: 1, a: 2, b: 6 }
{ _id: 1, a: 3, b: 4 }
{ _id: 1, a: 3, b: 5 }
{ _id: 1, a: 3, b: 6 }

unwinds before my match stage didn’t work out well for me - let me explain further:
This is my document, which is representative of a music album.

{
_id: ObjectId('generated by Mongo')
path: 'C:/Music/name_of_album,
root: 'C:/Music'
name: 'name_of_album'
created: (this is just the birthtime of the album folder.
files: [
{
file: 'C:/Music/name_of_album/track,
ext: '.mp3',
artist: // taken from the files metadata,
album: // taken from the file's metadata,
title: // taken from the file's metadata,
genre: // taken from the file's metadata
},
.......
]
// some albums may have subdirectory folders for (multi 'cd' albums)
subdirectories: [
{
directory: 'C:/Music/name_of_album/cd1,
dirFiles: [
{
file: 'C:/Music/name_of_album/cd1/track,
ext: '.mp3',
artist: // taken from the files metadata,
album: // taken from the file's metadata,
title: // taken from the file's metadata,
genre: // taken from the file's metadata
}, ....
]
}
]

I want to aggregate a list of files that are audio only, from both the ‘top level’ files array and the subdirectories dirFiles array.

I used this pipeline and it worked great (individually) -

const pipeline = [
    {
      $unwind: {
        path: '$files',
      },
    },
    {
      $match: {
        $or: [
          { 'files.ext': { $eq: '.ape' } },
          { 'files.ext': { $eq: '.mp3' } },
          { 'files.ext': { $eq: '.flac' } },
          { 'files.ext': { $eq: '.m4a' } },
          { 'files.ext': { $eq: '.ogg' } },
        ],
      },
    },
    {
      $project: {
        file: '$files.file',
        artist: '$files.artist',
        title: '$files.title',
        artists: '$files.artists',
        genre: '$files.genre',
      },
    },
  ];

However when I do all the $unwinds first, then add the match I get a hot mess of files that were not in the match, plus repeats. I kind of understand why that is happening.

I also tried just using a $project stage with $filter - the problem is $filter (unlike JS filter) returns an empty array (along with the _id) where the condition wasn’t met. I’d prefer something like a JS array filter method where ONLY the conditions matched are returned. Maybe it’s possible with $filter :man_shrugging:t5:

My goal is to have another collection with just the files and metadata, using $out to accomplish it.

I tried two separate pipelines (one for the files and another for the subdirectory files) and then put them into a $facet stage. However I got an error complaining about ‘empty objects’ are not permitted in $facet. Some of the metadata objects are empty so perhaps that is the problem.

My solution at this point is having the two separate pipelines and running them separately in a Promise.all.

1 Like

Just wanted to update that I did wind up getting $filter working well. Still haven’t added in the subdirectories yet. I’ll probably go down the $filter path for now and compare. I do like the returned format from unwind but I think I can achieve something similar with $filter.

const pipeline = [
    {
      $set: {
        items: {
          $filter: {
            input: '$files',
            as: 'item',
            cond: {
                 $in: ['$$item.ext', ['.ape', '.flac', '.mp3', '.m4a', '.ogg']],
            },
          },
        },
      },
    },
    {
      $match: {
        items: { $gte: [{ $size: '$items' }, 1] },
      },
    },
    {
      $project: {
        'items.file': 1,
        'items.artist': 1,
        'items.album': 1,
        'items.title': 1,
        'items.genre': 1,
      },
    },
  ];