Aggregate two groups on same collection

Given is a collection with following fields:

  • token_bought_address
  • token_sold_address
  • token_bought_amount
  • token_sold_amount
  • taker
  • maker

I’d like to group the data by taker and maker separately.
Every group has a added field amount.
In the taker group the field is a sum of token_bought_amount and in the maker group a sum of token_sold_amount.
In the last step the two groups are merged where amount of taker group is called bought and amount of maker group is called sold.

I can do this in SQL with ease, but have my problems with MongoDB.
Appreciate your help!

How many groups are you looking at in the dataset? One option is facets but be wary that this will then impose a document size limit which is 16mb.

Let’s take this dataset with just two entries as an example

[
  {
    "token_bought_address": "0xAA",
    "token_sold_address": "0xBB",
    "token_bought_amount": 10,
    "token_sold_amount": 5,
    "maker": "0x11",
    "taker": "0x22"
  },
  {
    "token_bought_address": "0xBB",
    "token_sold_address": "0xAA",
    "token_bought_amount": 3,
    "token_sold_amount": 6,
    "maker": "0x22",
    "taker": "0x11"
  }
]

Where taker is the one who bought the tokens and maker the one who sold and I’m only interested in the token 0xBB

The result I’m looking for would be this

[
  {
    "wallet": "0x11",
    "amount": 3,
  },
   {
    "wallet": "0x22",
    "amount": 2,
   }
]

I don’t see how the sample data above gives the result you posted, unless I’m being stupid (which isn’t the first time and won’t be the last…)

Can you post your SQL that you normally do this with as that’ll be a concrete definition.

Sure

with token_balances as (
    select -- tokens sold
        -sum(token_sold_amount) as amount
        , maker as wallet
    from trades
    where token_sold_address = '0xBB'
    group by 2
    
    union all
    
    select -- tokens bought
        sum(token_bought_amount) as amount
        , taker as wallet
    from trades
    where token_bought_address = '0xBB'
    group by 2
)

select
	wallet
        , sum(amount) as amount
    from token_balances
    group by 1

This is the kind of thing I was thinking of, obviously will need a supporting index for the initial match stage and be wary of data volumes that hit the $facet as they cannot make use of indexes:

db.getCollection("token_balances").aggregate([
{
    $match:{
        $or:[
            {"token_bought_address":"0xBB"},
            {"token_sold_address":"0xBB"}
        ]
    }
},
{
    $facet:{
        sold:[
            {
                $match:{
                    "token_sold_address":"0xBB"
                }
            },
            {
                $group:{
                    _id:'$maker',
                    total:{$sum:{$multiply:[-1, '$token_sold_amount']}}
                }
            }
        ],
        bought:[
            {
                $match:{
                    "token_bought_address":"0xBB"
                }
            },
            {
                $group:{
                    _id:'$taker',
                    total:{$sum:'$token_bought_amount'}
                }
            }
        ],
    }
},
{
    $project:{
        allItem:{
            $setUnion:["$sold","$bought"]
        }
    }
},
{
    $unwind:"$allItem"
},
{
    $replaceRoot:{
        newRoot:"$allItem"
    }
},
{
    $group:{
        _id:'$_id',
        total:{$sum:"$total"}
    }
},
{
    $project:{
        _id:0,
        "wallet":"$_id",
        "amount":"$total"
    }
}
])

With you data, I pushed it into a local MariaDB instance:
image
and ran your query on it:

with token_balances as (
    select -- tokens sold
        -sum(token_sold_amount) as amount
        , maker as wallet
    from trades
    where token_sold_address = '0xBB'
    group by 2
    
    union all
    
    select -- tokens bought
        sum(token_bought_amount) as amount
        , taker as wallet
    from trades
    where token_bought_address = '0xBB'
    group by 2
)

select
	wallet
        , sum(amount) as amount
    from token_balances
    group by 1

Which gave this:
image

The Mongo equivalent gave:
image

You could eliminate the $repalceRoot stage from the aggregation pipeline as it’s just to simplify the next stage.

I obviously don’t have your dataset, but have a play with volumes and see how it performs on your data.

1 Like

Thanks you very much

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.