In the first post of this series, we learned how to group currency data based on given time intervals to generate candlestick charts. In the second article, we learned how to calculate simple moving average and exponential moving average on the currencies based on a given time window. Now, in this post we’ll learn how to calculate more complex technical indicators.

## #MACD Indicator

MACD (Moving Average Convergence Divergence) is another trading indicator and provides visibility of the trend and momentum of the currency/stock. MACD calculation fundamentally leverages multiple EMA calculations with different parameters.

As shown in the below diagram, MACD indicator has three main components: MACD Line, MACD Signal, and Histogram. (The blue line represents MACD Line, the red line represents MACD Signal, and green and red bars represent histogram):

- MACD Line is calculated by subtracting the 26-period (mostly, days are used for the period) exponential moving average from the 12-period exponential moving average.
- After we get the MACD Line, we can calculate the MACD Signal. MACD Signal is calculated by getting the nine-period exponential moving average of MACD Line.
- MACD Histogram is calculated by subtracting the MACD Signal from the MACD Line.

We can use the MongoDB Aggregation Framework to calculate this complex indicator.

In the previous blog posts, we learned how we can group the second-level raw data into five-minutes intervals through the `$group`

stage and `$dateTrunc`

operator:

1 db.ticker.aggregate([ 2 { 3 $match: { 4 symbol: "BTC-USD", 5 }, 6 }, 7 { 8 $group: { 9 _id: { 10 symbol: "$symbol", 11 time: { 12 $dateTrunc: { 13 date: "$time", 14 unit: "minute", 15 binSize: 5, 16 }, 17 }, 18 }, 19 high: { $max: "$price" }, 20 low: { $min: "$price" }, 21 open: { $first: "$price" }, 22 close: { $last: "$price" }, 23 }, 24 }, 25 { 26 $sort: { 27 "_id.time": 1, 28 }, 29 }, 30 { 31 $project: { 32 _id: 1, 33 price: "$close", 34 }, 35 } 36 ]);

After that, we need to calculate two exponential moving averages with different parameters:

1 { 2 $setWindowFields: { 3 partitionBy: "_id.symbol", 4 sortBy: { "_id.time": 1 }, 5 output: { 6 ema_12: { 7 $expMovingAvg: { input: "$price", N: 12 }, 8 }, 9 ema_26: { 10 $expMovingAvg: { input: "$price", N: 26 }, 11 }, 12 }, 13 }, 14 }

After we calculate two separate exponential moving averages, we need to apply the `$subtract`

operation in the next stage of the aggregation pipeline:

1 { $addFields : {"macdLine" : {"$subtract" : ["$ema_12", "$ema_26"]}}}

After we’ve obtained the `macdLine`

field, then we can apply another exponential moving average to this newly generated field (`macdLine`

) to obtain MACD signal value:

1 { 2 $setWindowFields: { 3 partitionBy: "_id.symbol", 4 sortBy: { "_id.time": 1 }, 5 output: { 6 macdSignal: { 7 $expMovingAvg: { input: "$macdLine", N: 9 }, 8 }, 9 }, 10 }, 11 }

Therefore, we will have two more fields: `macdLine`

and `macdSignal`

. We can generate another field as `macdHistogram`

that is calculated by subtracting the `macdSignal`

from `macdLine`

value:

1 { $addFields : {"macdHistogram" : {"$subtract" : ["$macdLine", "$macdSignal"]}}}

Now we have three derived fields: `macdLine`

, `macdSignal`

, and `macdHistogram`

. Below, you can see how MACD is visualized together with Candlesticks:

This is the complete aggregation pipeline:

1 db.ticker.aggregate([ 2 { 3 $match: { 4 symbol: "BTC-USD", 5 }, 6 }, 7 { 8 $group: { 9 _id: { 10 symbol: "$symbol", 11 time: { 12 $dateTrunc: { 13 date: "$time", 14 unit: "minute", 15 binSize: 5, 16 }, 17 }, 18 }, 19 high: { $max: "$price" }, 20 low: { $min: "$price" }, 21 open: { $first: "$price" }, 22 close: { $last: "$price" }, 23 }, 24 }, 25 { 26 $sort: { 27 "_id.time": 1, 28 }, 29 }, 30 { 31 $project: { 32 _id: 1, 33 price: "$close", 34 }, 35 }, 36 { 37 $setWindowFields: { 38 partitionBy: "_id.symbol", 39 sortBy: { "_id.time": 1 }, 40 output: { 41 ema_12: { 42 $expMovingAvg: { input: "$price", N: 12 }, 43 }, 44 ema_26: { 45 $expMovingAvg: { input: "$price", N: 26 }, 46 }, 47 }, 48 }, 49 }, 50 { $addFields: { macdLine: { $subtract: ["$ema_12", "$ema_26"] } } }, 51 { 52 $setWindowFields: { 53 partitionBy: "_id.symbol", 54 sortBy: { "_id.time": 1 }, 55 output: { 56 macdSignal: { 57 $expMovingAvg: { input: "$macdLine", N: 9 }, 58 }, 59 }, 60 }, 61 }, 62 { 63 $addFields: { macdHistogram: { $subtract: ["$macdLine", "$macdSignal"] } }, 64 }, 65 ]);

## #RSI Indicator

RSI (Relativity Strength Index) is another financial technical indicator that reveals whether the asset has been overbought or oversold. It usually uses a 14-period time frame window, and the value of RSI is measured on a scale of 0 to 100. If the value is closer to 100, then it indicates that the asset has been overbought within this time period. And if the value is closer to 0, then it indicates that the asset has been oversold within this time period. Mostly, 70 and 30 are used for upper and lower thresholds.

Calculation of RSI is a bit more complicated than MACD:

- For every data point, the gain and the loss values are set by comparing one previous data point.
- After we set gain and loss values for every data point, then we can get a moving average of both gain and loss for a 14-period. (You don’t have to apply a 14-period. Whatever works for you, you can set accordingly.)
- After we get the average gain and the average loss value, we can divide average gain by average loss.
- After that, we can smooth the value to normalize it between 0 and 100.

### #Calculating Gain and Loss

Firstly, we need to define the gain and the loss value for each interval.

The gain and loss value are calculated by subtracting one previous price information from the current price information:

- If the difference is positive, it means there is a price increase and the value of the gain will be the difference between current price and previous price. The value of the loss will be 0.
- If the difference is negative, it means there is a price decline and the value of the loss will be the difference between previous price and current price. The value of the gain will be 0.

Consider the following input data set:

1 {"_id": {"time": ISODate("20210101T17:00:00"), "symbol" : "BTC-USD"}, "price": 35050} 2 {"_id": {"time": ISODate("20210101T17:05:00"), "symbol" : "BTC-USD"}, "price": 35150} 3 {"_id": {"time": ISODate("20210101T17:10:00"), "symbol" : "BTC-USD"}, "price": 35280} 4 {"_id": {"time": ISODate("20210101T17:15:00"), "symbol" : "BTC-USD"}, "price": 34910}

Once we calculate the Gain and Loss, we will have the following data:

1 {"_id": {"time": ISODate("20210101T17:00:00"), "symbol" : "BTC-USD"}, "price": 35050, "previousPrice": null, "gain":0, "loss":0} 2 {"_id": {"time": ISODate("20210101T17:05:00"), "symbol" : "BTC-USD"}, "price": 35150, "previousPrice": 35050, "gain":100, "loss":0} 3 {"_id": {"time": ISODate("20210101T17:10:00"), "symbol" : "BTC-USD"}, "price": 35280, "previousPrice": 35150, "gain":130, "loss":0} 4 {"_id": {"time": ISODate("20210101T17:15:00"), "symbol" : "BTC-USD"}, "price": 34910, "previousPrice": 35280, "gain":0, "loss":370}

But in the MongoDB Aggregation Pipeline, how can we refer to the previous document from the current document? How can we derive the new field (`$previousPrice`

) from the previous document in the sorted window?

MongoDB 5.0 introduced the `$shift`

operator that includes data from another document in the same partition at the given location, e.g., you can refer to the document that is three documents before the current document or two documents after the current document in the sorted window.

We set our window with partitioning and introduce new field as previousPrice:

1 { 2 $setWindowFields: { 3 partitionBy: "$_id.symbol", 4 sortBy: { "_id.time": 1 }, 5 output: { 6 previousPrice: { $shift: { by: -1, output: "$price" } }, 7 }, 8 }, 9 }

`$shift`

takes two parameters:

`by`

specifies the location of the document which we’ll include. Since we want to include the previous document, then we set it to`-1`

. If we wanted to include one next document, then we would set it to`1`

.`output`

specifies the field of the document that we want to include in the current document.

After we set the `$previousPrice`

information for the current document, then we need to subtract the previous value from current value. We will have another derived field “`diff`

” that represents the difference value between current value and previous value:

1 { 2 $addFields: { 3 diff: { 4 $subtract: ["$price", { $ifNull: ["$previousPrice", "$price"] }], 5 }, 6 }, 7 }

We’ve set the `diff`

value and now we will set two more fields, `gain`

and `loss,`

to use in the further stages. We just apply the gain/loss logic here:

1 { 2 $addFields: { 3 gain: { $cond: { if: { $gte: ["$diff", 0] }, then: "$diff", else: 0 } }, 4 loss: { 5 $cond: { if: { $lte: ["$diff", 0] }, then: { $abs: "$diff" }, else: 0 }, 6 }, 7 }, 8 }

After we have enriched the symbol data with gain and loss information for every document, then we can apply further partitioning to get the moving average of gain and loss fields by considering the previous 14 data points:

1 { 2 $setWindowFields: { 3 partitionBy: "$_id.symbol", 4 sortBy: { "_id.time": 1 }, 5 output: { 6 avgGain: { 7 $avg: "$gain", 8 window: { documents: [-14, 0] }, 9 }, 10 avgLoss: { 11 $avg: "$loss", 12 window: { documents: [-14, 0] }, 13 }, 14 documentNumber: { $documentNumber: {} }, 15 }, 16 }, 17 }

Here we also used another newly introduced operator, `$documentNumber`

. While we do calculations over the window, we give a sequential number for each document, because we will filter out the documents that have the document number less than or equal to 14. (RSI is calculated after at least 14 data points have been arrived.) We will do filtering out in the later stages. Here, we only set the number of the document.

After we calculate the average gain and average loss for every symbol, then we will find the relative strength value. That is calculated by dividing average gain value by average loss value. Since we apply the divide operation, then we need to anticipate the “divide by 0” problem as well:

1 { 2 $addFields: { 3 relativeStrength: { 4 $cond: { 5 if: { 6 $gt: ["$avgLoss", 0], 7 }, 8 then: { 9 $divide: ["$avgGain", "$avgLoss"], 10 }, 11 else: "$avgGain", 12 }, 13 }, 14 }, 15 }

Relative strength value has been calculated and now it’s time to smooth the Relative Strength value to normalize the data between 0 and 100:

1 { 2 $addFields: { 3 rsi: { 4 $cond: { 5 if: { $gt: ["$documentNumber", 14] }, 6 then: { 7 $subtract: [ 8 100, 9 { $divide: [100, { $add: [1, "$relativeStrength"] }] }, 10 ], 11 }, 12 else: null, 13 }, 14 }, 15 }, 16 }

We basically set `null`

to the first 14 documents. And for the others, RSI value has been set.

Below, you can see a one-minute interval candlestick chart and RSI chart. After 14 data points, RSI starts to be calculated. For every interval, we calculated the RSI through aggregation queries by processing the previous data of that symbol:

This is the complete aggregation pipeline:

1 db.ticker.aggregate([ 2 { 3 $match: { 4 symbol: "BTC-USD", 5 }, 6 }, 7 { 8 $group: { 9 _id: { 10 symbol: "$symbol", 11 time: { 12 $dateTrunc: { 13 date: "$time", 14 unit: "minute", 15 binSize: 5, 16 }, 17 }, 18 }, 19 high: { $max: "$price" }, 20 low: { $min: "$price" }, 21 open: { $first: "$price" }, 22 close: { $last: "$price" }, 23 }, 24 }, 25 { 26 $sort: { 27 "_id.time": 1, 28 }, 29 }, 30 { 31 $project: { 32 _id: 1, 33 price: "$close", 34 }, 35 }, 36 { 37 $setWindowFields: { 38 partitionBy: "$_id.symbol", 39 sortBy: { "_id.time": 1 }, 40 output: { 41 previousPrice: { $shift: { by: -1, output: "$price" } }, 42 }, 43 }, 44 }, 45 { 46 $addFields: { 47 diff: { 48 $subtract: ["$price", { $ifNull: ["$previousPrice", "$price"] }], 49 }, 50 }, 51 }, 52 { 53 $addFields: { 54 gain: { $cond: { if: { $gte: ["$diff", 0] }, then: "$diff", else: 0 } }, 55 loss: { 56 $cond: { if: { $lte: ["$diff", 0] }, then: { $abs: "$diff" }, else: 0 }, 57 }, 58 }, 59 }, 60 { 61 $setWindowFields: { 62 partitionBy: "$_id.symbol", 63 sortBy: { "_id.time": 1 }, 64 output: { 65 avgGain: { 66 $avg: "$gain", 67 window: { documents: [-14, 0] }, 68 }, 69 avgLoss: { 70 $avg: "$loss", 71 window: { documents: [-14, 0] }, 72 }, 73 documentNumber: { $documentNumber: {} }, 74 }, 75 }, 76 }, 77 { 78 $addFields: { 79 relativeStrength: { 80 $cond: { 81 if: { 82 $gt: ["$avgLoss", 0], 83 }, 84 then: { 85 $divide: ["$avgGain", "$avgLoss"], 86 }, 87 else: "$avgGain", 88 }, 89 }, 90 }, 91 }, 92 { 93 $addFields: { 94 rsi: { 95 $cond: { 96 if: { $gt: ["$documentNumber", 14] }, 97 then: { 98 $subtract: [ 99 100, 100 { $divide: [100, { $add: [1, "$relativeStrength"] }] }, 101 ], 102 }, 103 else: null, 104 }, 105 }, 106 }, 107 }, 108 ]);

## #Conclusion

MongoDB Aggregation Framework provides a great toolset to transform any shape of data into a desired format. As you see in the examples, we use a wide variety of aggregation pipeline stages and operators. As we discussed in the previous blog posts, time-series collections and window functions are great tools to process time-based data over a window.

In this post we've looked at the $shift and $documentNumber operators that have been introduced with MongoDB 5.0. The `$shift`

operator includes another document in the same window into the current document to process positional data together with current data. In an RSI technical indicator calculation, it is commonly used to compare the current data point with the previous data points, and `$shift`

makes it easier to refer to positional documents in a window. For example, price difference between current data point and previous data point.

Another newly introduced operator is `$documentNumber`

. `$documentNumber`

gives a sequential number for the sorted documents to be processed later in subsequent aggregation stages. In an RSI calculation, we need to skip calculating RSI value for the first 14 periods of data and $documentNumber helps us to identify and filter out these documents at later stages in the aggregation pipeline.

More from this series