Help with data modeling

Hello everyone,

I’m new to MongoDB and I’m trying to build a simple kanban project, but I’m kind of lost on how I should structure the schemas. Basically, I will have some one-to-many relationships:

  • A user can have multiple boards;
  • A board can have multiple columns;
  • A column can have multiple tasks;
  • A task can have multiple subtasks.

What is the best approach for this scenario?
Should I create 5 collections (Users, Boards, Columns, Tasks and Subtasks) and add the parent _id to each child, or should I have 1 collection (User) and embed everything?

e.g.:

user: {
  ...,
  boards: [
    {
      ...,
      columns: [...]
    }
  ]
}

Thanks in advance!

Hey @Leandro_Araujo1,

Welcome to the MongoDB Community Forums!

A general rule of thumb for schema design in MongoDB is to design your database so that the most common queries can be satisfied by querying a single collection, even if this introduces some redundancy. Based on what you described, you have a clear hierarchy of entities: a user owns multiple boards, each board contains multiple columns, each column contains multiple tasks, and each task contains multiple subtasks.

It is usually best to work from the required queries first and let the schema design follow the query pattern. Based on the structure you have described, though, embedded documents might make more sense. Thinking about the application a little, the basic entity we deal with in a kanban board is the “task card”. Each card can be a document that embeds all the other information pertaining to it, like this:

{
  _id: <card_id>,
  user: <user_id>,
  column: <column_id>,
  card_index_in_column: <integer>,
  card_type: <contains sub-task or not>,
  card_content: <text or references to sub-tasks>
}

You can then create a compound index on {user: 1, column: 1, card_index_in_column: 1}, since “boards” are most likely per-user. This index makes it faster to display a given column for a given user, and it lets the cards in that column be returned already sorted by their position.
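As a rough sketch of how that index and the matching query could look: the collection name `cards` and all sample values below are assumptions for illustration, not something from the original post. The mongosh calls are shown as comments because they need a running server; the runnable part only imitates the same filter and ordering in memory.

```javascript
// Index key pattern from the post; field order matters in a compound index.
const cardIndexKeys = { user: 1, column: 1, card_index_in_column: 1 };

// In mongosh, against a hypothetical `cards` collection, this would be:
//   db.cards.createIndex(cardIndexKeys)
//   db.cards.find({ user: "u1", column: "todo" }).sort({ card_index_in_column: 1 })

// Made-up sample cards, only for illustration.
const cards = [
  { _id: "c1", user: "u1", column: "todo", card_index_in_column: 2 },
  { _id: "c2", user: "u1", column: "todo", card_index_in_column: 0 },
  { _id: "c3", user: "u1", column: "done", card_index_in_column: 0 },
  { _id: "c4", user: "u2", column: "todo", card_index_in_column: 1 },
  { _id: "c5", user: "u1", column: "todo", card_index_in_column: 1 },
];

// In-memory stand-in for the find + sort above: equality match on user and
// column, then ascending order by position within the column.
function cardsInColumn(allCards, user, column) {
  return allCards
    .filter((card) => card.user === user && card.column === column)
    .sort((a, b) => a.card_index_in_column - b.card_index_in_column);
}

const todoIds = cardsInColumn(cards, "u1", "todo").map((card) => card._id);
console.log(todoIds);
```

Because the query uses equality on the first two indexed fields and sorts on the third, the server can typically walk the index in order instead of sorting the results separately.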

In general, you should favor embedding when:

  • You have small subdocuments
  • Your data does not change very frequently
  • Your documents grow by a small amount over time
  • You often need to query this data
  • You want relatively faster reads

and favor normalization when:

  • You have large subdocuments
  • Your data changes very frequently
  • Your documents grow by a large amount over time
  • Your data is often excluded from your results
  • You want relatively faster writes
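To make the trade-off a bit more concrete, here is a small sketch of the same task card modeled both ways. The field names (`title`, `done`, `subtasks`, `subtask_ids`) and the sample data are assumptions for illustration only, and plain arrays stand in for collections so the snippet runs without a server.

```javascript
// Embedded: subtasks live inside the card, so one read returns everything.
const embeddedCard = {
  _id: "card1",
  title: "Ship release",
  subtasks: [
    { title: "Write changelog", done: true },
    { title: "Tag version", done: false },
  ],
};

// Normalized: the card stores only ids; subtasks are separate documents.
const referencedCard = {
  _id: "card1",
  title: "Ship release",
  subtask_ids: ["s1", "s2"],
};
const subtasksCollection = [
  { _id: "s1", title: "Write changelog", done: true },
  { _id: "s2", title: "Tag version", done: false },
  { _id: "s3", title: "Unrelated subtask", done: false },
];

// With references, the application (or a $lookup stage) needs a second
// lookup to resolve the ids into subtask documents.
function loadSubtasks(card, allSubtasks) {
  const byId = new Map(allSubtasks.map((subtask) => [subtask._id, subtask]));
  return card.subtask_ids.map((id) => byId.get(id));
}

const openEmbedded = embeddedCard.subtasks.filter((s) => !s.done).length;
const openReferenced = loadSubtasks(referencedCard, subtasksCollection)
  .filter((s) => !s.done).length;
console.log(openEmbedded, openReferenced);
```

The embedded form answers "show this card with its subtasks" in a single read, while the referenced form keeps each card small and lets a subtask be updated on its own, at the cost of the extra lookup.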

To deepen your understanding of schema design in MongoDB, you can read the following documentation:

  • Data Model Design
  • Factors to consider when data modeling in MongoDB
  • Compound Indexes

Note that the points above are general guidelines, not strict rules. I’m sure there are exceptions and counterexamples to each of them. In the end, it’s about designing the schema to suit the use case (i.e. how the data will be queried and/or updated) rather than how the data is stored.

Please feel free to reach out for anything else as well.

Regards,
Satyam
