I’m new to MongoDB and I’m trying to create a simple kanban project, but I’m kind of lost on how I should structure the schemas. Basically, I will have some one-to-many relationships:
A user can have multiple boards;
A board can have multiple columns;
A column can have multiple tasks;
A task can have multiple subtasks.
What is the best approach for this scenario?
Should I create 5 collections (Users, Boards, Columns, Tasks and Subtasks) and add the parent _id to each child, or should I have 1 collection (User) and embed everything?
A general rule of thumb for schema design in MongoDB is to design the database so that the most common queries can be satisfied by querying a single collection, even when this means having some redundancy in your data. Based on what you described, you have a clear hierarchy of entities: a user owns multiple boards, each board contains multiple columns, each column contains multiple tasks, and each task contains multiple subtasks.
It is usually best to start from the queries you need and let the schema follow the query pattern, but based on the structure you described, embedded documents might make more sense. Thinking about the application a little, the basic entity in a kanban board is the “task card”. Each card can be a document that embeds all the other information pertaining to it, like this:
{
  _id: <card_id>,
  user: <user_id>,
  column: <column_id>,
  card_index_in_column: <integer>,
  card_type: <contains sub-task or not>,
  card_content: <text or references to sub-tasks>
}
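As a concrete illustration of the template above, here is a minimal Python sketch of two card documents written as plain dictionaries: one plain card and one that embeds its sub-tasks. All values, the card_type strings "simple" and "with_subtasks", the sub-task fields title and done, and the helper open_subtasks are assumptions made up for this example; they are not part of the schema above.

```python
# Hypothetical card documents following the template above.
# Field values, the card_type strings, and the sub-task fields
# ("title", "done") are assumptions made for illustration only.

simple_card = {
    "_id": "card-1",
    "user": "user-1",
    "column": "col-todo",
    "card_index_in_column": 0,
    "card_type": "simple",
    "card_content": "Write the project README",
}

card_with_subtasks = {
    "_id": "card-2",
    "user": "user-1",
    "column": "col-todo",
    "card_index_in_column": 1,
    "card_type": "with_subtasks",
    # Sub-tasks are embedded in the card, so a single read returns everything.
    "card_content": [
        {"title": "Design schema", "done": True},
        {"title": "Create indexes", "done": False},
    ],
}


def open_subtasks(card):
    """Return the titles of unfinished sub-tasks, or [] for a simple card."""
    if card["card_type"] != "with_subtasks":
        return []
    return [s["title"] for s in card["card_content"] if not s["done"]]


print(open_subtasks(card_with_subtasks))
print(open_subtasks(simple_card))
```

Embedding the sub-tasks this way is only one option; if sub-tasks need to be queried on their own, storing references in card_content may suit better.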
You can then create an index on {user: 1, column: 1, card_index_in_column: 1}, since boards are most likely per-user. This index makes it faster to fetch a given column for a given user and to return the cards in each column in sorted order.
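To illustrate what such an index supports, the sketch below reproduces in plain Python the lookup it serves: filter on user and column, then order by card_index_in_column. The sample cards and the helper names index_key and cards_in_column are hypothetical. The pymongo calls in the trailing comments show roughly what the equivalent would look like against a real server, assuming a database handle db and a collection named cards; they are not executed here.

```python
# In-memory stand-in for a "cards" collection (sample data is made up).
cards = [
    {"_id": 1, "user": "u1", "column": "todo", "card_index_in_column": 2},
    {"_id": 2, "user": "u1", "column": "todo", "card_index_in_column": 0},
    {"_id": 3, "user": "u1", "column": "done", "card_index_in_column": 0},
    {"_id": 4, "user": "u2", "column": "todo", "card_index_in_column": 0},
    {"_id": 5, "user": "u1", "column": "todo", "card_index_in_column": 1},
]

# Same key order as the compound index.
INDEX_KEYS = ("user", "column", "card_index_in_column")


def index_key(card):
    """Build the tuple a compound index on INDEX_KEYS would order by."""
    return tuple(card[k] for k in INDEX_KEYS)


def cards_in_column(user, column):
    """Return the cards of one column for one user, in display order."""
    matching = [c for c in cards if c["user"] == user and c["column"] == column]
    return sorted(matching, key=index_key)


print([c["_id"] for c in cards_in_column("u1", "todo")])

# With pymongo against a real server (assumed setup, not run here):
#   db.cards.create_index(
#       [("user", 1), ("column", 1), ("card_index_in_column", 1)])
#   db.cards.find({"user": "u1", "column": "todo"}).sort("card_index_in_column", 1)
```

Because the two equality fields come first in the index and the sort field comes last, a query of this shape can generally be answered by walking the index in order rather than sorting the results afterwards.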
Note that the points above are general ideas, not strict rules. I’m sure there are exceptions and counterexamples to any of them. In general, it’s about designing the schema around what suits the use case best (i.e. how the data will be queried and/or updated), not around how the data is stored.
Please feel free to reach out for anything else as well.