This topic is all about what is more performant. I am new to MongoDB & NoSQL and this may be obvious to some, but I am trying to figure out the most efficient way of making my back-end allow the front-end to fetch paginated data that combines two fields. I essentially want to know whether plan A or B is best - that is, either storing more data in the database vs storing less data but having to amend, merge and sort the data before returning it to the front-end.
To provide some context, I want the back-end to merge the affiliatePayments
and affiliateCommissions
fields in date order and expose the pagination-friendly result to the front-end via an API.
Current user model (with an example user):
_id: ObjectId('631a5067hcfeb5940686f144')
email: “test@user.com”
hashedPassword: “test”
dateCreated: 2022-09-08T20:28:23.766+00:00
plan: "FREE"
directAffiliateSignup: false
referralPromoCode: "f2f11fed-ab02-46df-8eff-ba3d8cc3fb93"
affiliateRate: 0.3
affiliatePayments: [{date: 2022-07-27, amount: 2, id: someMongoId}, {date: 2022-08-23, amount: 3, id: someMongoId}]
affiliateCommissions: [{date: 2022-08-01, amount: 10, id: someMongoId}, {date: 2022-09-01, amount: 4, id: someMongoId}]
It’ll also need to be able to differentiate between each affiliatePayment and affiliateCommission, so each affiliatePayment will need a type
field with a payment
value and the affiliateCommission will need a type
field with a commission
value.
So the API should return something like this example data:
[{date: 2022-07-27, amount: 2, id: someMongoId, type: ‘payment’}, {date: 2022-08-01, amount: 10, id: someMongoId, type: ‘commission’}, {date: 2022-08-23, amount: 3, id: someMongoId, type: ‘payment’}, {date: 2022-09-01, amount: 4, id: someMongoId, type: ‘commission’}]
As the affiliatePayments
& affiliateCommissions
fields don’t have a type field stored in the database, I assumed the plan to return the correct data would be something like this:
Plan A:
- fetch all affiliatePayments & affiliateCommissions
- add ‘type’ key to each affiliate payment record as ‘payment’
- add ‘type’ key to each affiliate commission record as ‘commission’
- merge affiliatePayments & affiliateCommissions into single array
- sort (merged array) in date order
- limit & skip (just for pagination)
- return data to FE
Plan B, however, would involve simpler processing but more data fields stored in the database: the affiliatePayments
and affiliateCommissions
would each have a type
stored in each array entry, and an affiliateTransactions
field would exist in the database that would already contain the example data as above - which is essentially duplicates of affiliatePayments
and affiliateCommissions
but just merged into an additional affiliateTransactions
field.
Plan B:
- Fetch affiliateTransactions
- limit & skip (just for pagination)
- return data to FE
So my key question is this: which plan is more performant?
Is it better to store a merged affiliateTransactions
list (and extra type fields on affiliatePayments
and affiliateCommissions
) in the database? It’ll essentially copy affiliatePayments
and affiliateCommissions
. Or is it better to store less data in the database & accept the more expensive query processing that’ll presumably use more bandwidth?
Am I correct in thinking I should be reluctant to add extra fields to the database? It would make it much easier, but isn’t it unclean, inefficient and more expensive? I don’t know the scale of the tradeoff between the two options. I could be making wrong assumptions, i.e. does it cost bandwidth to query data more, does it cost more to store more data, etc.
Any advice would be greatly appreciated here. Thank you
Note: I also wonder whether I should get rid of the affiliatePayments
and affiliateCommissions
fields and just have one affiliateTransactions
field (that contains the payments and commissions data)? Note that I have a separate affiliate payments query for just getting the payments, and another for commissions. Having them in separate lists would prevent the need to filter from a single affiliateTransactions
field.