get distinct values from a group aggregation pipeline

Hello,

I need your help !
In the same aggreation pipeline, I want to retrieve a list of wines and the list of region attached :
The exepected output is :

 wineList : {
        "_id": 47,
        "product_name": "Les menuts",
        "domaine": "Chateau Grace Dieu",
        "vintage": 2015,
        "color": "Rouge",
        "region": "Bordeaux",
        "price": null,
        "quantity": 1,
        "stock": 1,
        "wines": [
            {
                "id": "652008ea98d388c0f1a75f7d",
                "wine_id": 63,
                "quantity": 1,
                "cellar_location": "LF"
            }
        ]
    },
    {
        "_id": 45,
        "product_name": "La cupa",
        "domaine": "Chateau Laroque",
        "vintage": 2020,
        "color": "Rouge",
        "region": "Languedoc-Roussillon",
        "price": null,
        "quantity": 0,
        "stock": 0,
        "wines": [
            {
                "id": "652008ea98d388c0f1a75fb5",
                "wine_id": 61,
                "cellar_location": "LF"
            }
        ]
    },
    {
        "_id": 49,
        "product_name": "La Moniale",
        "domaine": "domaine le portail",
        "vintage": 2021,
        "color": "Rouge",
        "region": "Loire",
        "price": null,
        "quantity": 0,
        "stock": 0,
        "wines": [
            {
                "id": "652008ea98d388c0f1a75fa7",
                "wine_id": 65,
                "quantity": 0,
                "cellar_location": "LF"
            }
        ]
    }
,
{

regionList : {
    "Languedoc-Roussillon",
    "Bordeaux",
    "Loire"
}

}

I use this query to generate the wine List :


WineModel.aggregate(
[{ $match:filter},
  {
    $group: {
      _id: '$main_reference',
      product_name: {
        $first: "$product_name",
      },
      domaine: {
        $first: "$domaine",
      },
      vintage: {
        $first: "$vintage",
      },
      color: {
        $first: "$color",
      },
      region: {
        $first: "$region",
      },
      price: {
        $first: "$price",
      },
      quantity: {
        $sum: "$quantity",
      },
      stock: {
        $first: {
       $cmp: [{$sum: "$quantity"},0]
     }
      },
      wines: {
        $push: {
          id: '$_id',
          wine_id: '$wine_id',
          quantity: '$quantity',
          cellar_location: '$cellar_location'
        }
      },
    }
  }
],
{ maxTimeMS: 60000, allowDiskUse: true })

Thanks a lot for your help !

Your aggregation starts with _id:$main_reference.

The field main_reference which seems to be the main reference as it it use to define the group _id is nowhere to be found in the sample data you provided.

We cannot help.

Ok thanks. Let me add more details :
I use $main_reference as _id because it is the key that differentiates my records. What else should I use ?

The initial aggregation returns this extract :

[
    {
        "_id": 49,
        "product_name": "La Moniale",
        "domaine": "domaine le portail",
        "vintage": 2021,
        "color": "Rouge",
        "region": "Loire",
        "price": null,
        "quantity": 0,
        "stock": 0,
        "wines": [
            {
                "id": "652008ea98d388c0f1a75fa7",
                "wine_id": 65,
                "quantity": 0,
                "cellar_location": "LF"
            }
        ]
    },
    {
        "_id": 36,
        "product_name": "Clos roussots",
        "domaine": "Franck Lamargue",
        "vintage": 2020,
        "color": "Rouge",
        "region": "Bourgogne",
        "price": null,
        "quantity": 2,
        "stock": 1,
        "wines": [
            {
                "id": "652008ea98d388c0f1a75f95",
                "wine_id": 47,
                "quantity": 1,
                "cellar_location": "LF"
            },
            {
                "id": "652008ea98d388c0f1a75f8f",
                "wine_id": 49,
                "quantity": 1,
                "cellar_location": "LF"
            }
        ]
    },
    {
        "_id": 50,
        "product_name": "Chateauneuf-du-Pape",
        "domaine": "Xavier Vignon",
        "vintage": 2018,
        "color": "Rouge",
        "region": "Vallée du Rhône",
        "price": null,
        "quantity": 1,
        "stock": 1,
        "wines": [
            {
                "id": "652008ea98d388c0f1a75fb6",
                "wine_id": 66,
                "quantity": 1,
                "cellar_location": "LF"
            }
        ]
    },

based on the aggregation pipeline in the initial post, how can I add the list of region returned in the first group i.e :
{ "Loire", "Bourgogne","Vallée du Nord"}

Hope this helps !

I really do not understand the above. The field main_reference is not even present in the documents you shared. How can it be the key that differentiates you records?

The values

are the values of the field region so that should be your _id.

The words

make me think that there is something else at stake and that the documents you share are not the original documents from your collection.

You will need to share the origin documents and what ever pipeline is giving you the documents you already share.

Hi @steevej ,

Here an extract of the documents

[{
  "_id": {
    "$oid": "652008ea98d388c0f1a75f94"
  },
  "wine_id": 68,
  "product_name": "Closerie d'Ibry - chemin partagé",
  "domaine": "Saint-Georges d'Ibry",
  "vintage": 2021,
  "color": "Rouge",
  "product_type": "Vin",
  "price": 10,
  "quantity": 1,
  "cellar_location": "LF",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "region": "Languedoc-Roussillon",
  "main_reference": 39
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75f91"
  },
  "wine_id": 56,
  "product_name": "Cos D'estournel",
  "domaine": "Cos D'estournel",
  "vintage": 2004,
  "appelation": "Saint-Estèphe",
  "region": "Bordeaux",
  "color": "Rouge",
  "product_type": "Vin",
  "quantity": 1,
  "cellar_location": "LF",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "main_reference": 41
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75fb3"
  },
  "wine_id": 6,
  "product_name": "Cuvée Prestige Guillaume ESTREM",
  "domaine": "Chateau Laroque",
  "vintage": 2021,
  "appelation": "Pic Saint-Loup",
  "region": "Languedoc-Roussillon",
  "color": "Rouge",
  "product_type": "Vin",
  "purchase_place": "cadeau",
  "quantity": 1,
  "cellar_location": "LF",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "main_reference": 5
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75f7b"
  },
  "wine_id": 59,
  "product_name": "Chateau Beaumont",
  "domaine": "Chateau Beaumont",
  "vintage": 2015,
  "appelation": "Haut-Médoc - Cru bourgeois",
  "region": "Bordeaux",
  "color": "Rouge",
  "product_type": "Vin",
  "quantity": 0,
  "wineAdvisor": "o",
  "main_reference": 44
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75fa2"
  },
  "wine_id": 44,
  "product_name": "La Gerbotte",
  "domaine": "domaine de l’Arlot",
  "vintage": 2020,
  "appelation": "Nuits-Saint-Georges",
  "region": "Bourgogne",
  "color": "Blanc",
  "product_type": "Vin",
  "quantity": 1,
  "cellar_location": "LF",
  "wineAdvisor": "o",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "main_reference": 33
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75f95"
  },
  "wine_id": 47,
  "product_name": "Clos roussots",
  "domaine": "Franck Lamargue",
  "vintage": 2020,
  "appelation": "Maranges 1er Cru",
  "region": "Bourgogne",
  "color": "Rouge",
  "product_type": "Vin",
  "quantity": 1,
  "cellar_location": "LF",
  "wineAdvisor": "o",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "main_reference": 36
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75f64"
  },
  "wine_id": 43,
  "product_name": "Hospice de Beaune",
  "domaine": "Hospice de Beaune",
  "vintage": 2000,
  "appelation": "Pommard",
  "region": "Bourgogne",
  "color": "Rouge",
  "product_type": "Vin",
  "quantity": 1,
  "cellar_location": "LF",
  "etiquette": "bon etat",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "main_reference": 32
},
{
  "_id": {
    "$oid": "652008ea98d388c0f1a75f8f"
  },
  "wine_id": 49,
  "product_name": "100%",
  "domaine": "Xavier Vignon",
  "vintage": 2020,
  "appelation": "Cote du rhone",
  "region": "Vallée du Rhône",
  "color": "Rouge",
  "product_type": "Vin",
  "quantity": 1,
  "cellar_location": "LF",
  "__v": 2,
  "cellar": [
    {
      "$oid": "658d9a5dd2a3f55f1ae7ca16"
    }
  ],
  "main_reference": 36
},
{
  "_id": {
    "$oid": "6536960bed33fb918bb6dffd"
  },
  "wine_id": 95,
  "product_name": "Les benoites",
  "domaine": "domaine Dominique lahaye",
  "vintage": 2020,
  "appelation": "Chassagne-Montrachet",
  "region": "Bourgogne",
  "color": "Blanc",
  "product_type": "Vin",
  "quantity": 1,
  "cellar_location": "NSG",
  "etiquette": "Neuf",
  "__v": 1,
  "cellar": [
    {
      "$oid": "658d9a2fd2cf56da75c8126f"
    }
  ],
  "main_reference": 67
}]

I am still not sure exactly of want you want but I will give it a go.

I think you want a group of group where the top level group is main_reference and the sub-groups are region. Do achieve that you have to group on a compound _id such as

{ $group:
  {
    _id : { main_reference : "$main_reference" , region : "$region" }
} }

/* and in later stages */

{ $group: {
    _id: "$_id.main_reference" ,
    regions: { $push: {
        region : "$_id.region" ,
        /* what ever data you got in the first $group */
    } }
} }

Thanks @steevej for your help. I finally did a separated query and join results in an array manually on the backend side.

1 Like

Sometimes it is indeed better this way. With $group there is always a risk of reaching the size limit especially with $push.