Getting lost with $ObjectToArray

Hello community,

Despite the few hours spent searching here and there for a solution, I cannot get around. Here is my document structure:

"CalculationDetails": [
        {
          "Code": "STEP_1_LABEL",
          "Parameters": {
            "PARAM_1_LABEL": {
              "string": "5.88"
            },
            "PARAM_2_LABEL": {
              "string": "5.0"
            },
            "PARAM_3_LABEL": {
              "string": "0.005"
            }
          },
          "Value": 0.004900000058114529
        },
        {
          "Code": "STEP_2_LABEL",
          "Parameters": {"PARAM_1_LABEL": {
              "string": "5.88"
            },
            "PARAM_4_LABEL": {
              "string": "0.001234"},
          "Value": 5.881234
        },

Main problematic: using the aggregation framework, I aim to build the following strucuture:

"steps":
        {
          "STEP_1_LABEL": 0.004900000058114529,
          "STEP_2_LABEL": 5.881234,
        }

I tried my way around $ObjectToArray, $ArrayToObject and $map, but couldn’t get to a conclusive result because my original document doesn’t have a “k”, “v” shape. I tried renaming “Code” into “k” and “Value” into “v”, but unsuccessfully.

Other information: I’m trying to get to my result without using $unwind, because my collection is already 2M documents large, and each document may contain up to 25 steps. It’s not a problem for MongoDB per se, but it is for PowerBI when I export data using the BI connector (very poor performance).

Also, I’m trying to avoid using $unwind + $group, because Compass can’t deal with such an amount of data with $group, at least with an M30 cluster.

Any idea how to reach the desired output given the constraints above?

Bonus question: ideally, I also wish my output to contain a second object:

"params":
        {
          "PARAM_1_LABEL": "5.88",
          "PARAM_2_LABEL": "5.00",
          "PARAM_3_LABEL": "0.005",
          "PARAM_4_LABEL": "0.001234",
        }

Parameters may appear multiple times (up to one time per step), however I require to output them only once.

For the same reason as stated above, I wish to avoid using an $unwind + $group (even though it would do the job) because the amount of data crashes Compass

Thank you very much for your help and happy new year everyone!

Is the way to go.

First aggregation stage is a $set with a $map to extract the appropriate values from CalculationDetails

map = { "$map" : {
  "input" : "$CalculationDetails" ,
  "in" : { "k" : "$$this.Code" , "v" : "$$this.Value" }
} }
set_1 = { "$set" : { "_steps" : map } }

The second stage is another $set that uses $arrayToObject on the temp. _steps to produce the final steps object.:

set_2 = { "$set" : { "steps" : { "$arrayToObject" : "$_steps" } } }

I will have to think a little bit more for the params part.

You may of course do set_1 and set_2 in a single stage by replacing $_steps by the map but it is easier to understand, to debug and to modify when the stages are kept simple.

1 Like

As for params, I would use $reduce in a $set stage with $mergeObjects to assembles the Parameters in the params result. Something like:

set_params = { "$set" : {
    "params" : { "$reduce" : {
        "input" : "$CalculationDetails" ,
        "initialValue" : { } ,
        "in" : { "$mergeObjects" : [ "$$value" , "$$this.Parameters" ] }
    } }
} }

The field Parameters should be an array. Dynamic field names like PARAM_1_LABEL, … is never a good idea.

2 Likes

Hello @steevej,

Thank you for taking the time to look into my issue! Your solution works perfecty, thanks a lot! As the question was originally about an aggregation, I’ll repost your answer using that syntax, in case it could benefit other people:

[
  {
    $addFields: {
      results: {
        $map: {
          input:
            "$CalculationDetails",
          in: {
            k: "$$this.Code",
            v: "$$this.Value",
          },
        },
      },
    },
  },
  {
    $addFields: {
      results: {
        $arrayToObject: "$results",
      },
    },
  }
]

I decided to use the same variable in both aggregation stages, knowing that the result is saved as a view and that I can return to its aggregation builder in case of debug

1 Like

Thank you @steevej for this insight as well. It gives me almost the output I’m looking for:

params: {
   "PARAM_1_LABEL": {
      string: "5.88"
      },
   "PARAM_2_LABEL": {
      string: "5.00"
      },
(...)

To get the expected result, I added a few tweaks to your proposal:

[
  {
    $addFields: {
      params: {
        $reduce: {
          input:
            "$CalculationDetails",
          initialValue: {},
          in: {
            $mergeObjects: ["$$value", "$$this.Parameters"],
          }
        }
      }
    }
  },
  {
    $addFields: {
      params: {
        $objectToArray: "$params",
      }
    }
  },
  {
    $addFields: {
      params: {
        $map: {
          input: "$params",
          in: {
            k: "$$this.k",
            v: "$$this.v.string",
          }
        }
      }
    }
  },
  {
    $addFields: {
      params: {
        $arrayToObject: "$params",
      }
    }
  }
]

Output:

"params":
        {
          "PARAM_1_LABEL": "5.88",
          "PARAM_2_LABEL": "5.00",
        }
1 Like

I always leave cosmetic stage as an exercise to the reader.

1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.