Hi all,
I’m new to data modelling in MongoDB. I have a query regarding the Extended Reference Pattern (ERP).
I’m building a task manager application. As part of modelling, I’ve identified two entities: Project and Task, which have a one-to-many relationship (i.e., one project has many tasks).
A Task document features many fields (e.g., cost/time estimate, title, description, comments, activity, …).
In the project board view, I visualise tasks as cards. The card previews show only a subset of the fields contained in a Task (i.e., title and cost/time estimates). The entire Task info is loaded only when the user clicks on a card.
To get the Task fields I need for the project board, instead of performing a join on Project and Task, I thought I would use ERP to speed up retrieval.
My idea is to embed the subset of the Task fields directly in Project, as follows:
Project:
{
_id: x;
title: bla;
…;
tasks: [
{task_id: x1; title: y; estimate: {cost: 10; time: 10} };
{task_id: x2; title: y; estimate: {cost: 10; time: 10} }
]
}
I have a couple of doubts regarding this design.
First, the estimate of a task may change sometimes. I’ve learnt that ERP is suggested in case the embedded data is stable. On the one hand, I’m tempted to embed the task data, because the project board query is the most important operation of the app and would like to optimise it. On the other, I feel I’ll need numerous writes to keep the duplicated information up-to-date over time.
Second, from the Data Modelling course on MongoDB University I seem to remember that when using ERP the embedding should happen on the many side. In my case, I embed data on the one side.
I’d really appreciate some expert advice on whether it makes sense to use ERP in this instance. Also, I’d like to ask if there is an actionable heuristic to decide when data is too “unstable” to be embedded using ERP?
Thanks a lot.
Valerio