Schema design - ref vs embedding

Embedding or referencing

Hello everyone,

I have a project where I need to create development plans for students, and I am unsure how I should model the data.

Every student has a plan, each plan has many phases, and each phase contains multiple goals. Goals can have comments.

From my studies I have learned that embedding works best when the workload is mostly create and get, and referencing works better when there are many updates and deletes. We have also learned that when we “update” a document with a new value, it is treated more like a create than an actual update, even though the operation in the C# MongoDB driver is an UpdateOne, hmmm…

I started by embedding everything in the plan, since my use cases are mostly create / get. But when I worked more on the mock, I found it inefficient to update a goal nested “deeply” within the plan. I am therefore considering a hybrid approach, where I embed the phases in the plan and reference the goals from the phases. My collections would look something like this:

Developmentplan (collection)

  • Phases

  • Reference (goals)

Goals (collection)

  • Comments
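To make the layout above concrete, here is a rough sketch of the two document shapes as plain Python dicts. All field names and ids (studentId, goalIds, planId, status and so on) are made up for illustration and are not part of the actual project.

```python
# Sketch of the hybrid layout as plain Python dicts.
# All field names and ids below are assumptions made for illustration.

# One document in the Developmentplan collection: phases are embedded,
# goals are only referenced by id.
plan = {
    "_id": "plan-1",
    "studentId": "student-1",
    "phases": [
        {"name": "Phase 1", "goalIds": ["goal-1", "goal-2"]},
        {"name": "Phase 2", "goalIds": ["goal-3"]},
    ],
}

# One document in the Goals collection: comments are embedded in the goal.
goal = {
    "_id": "goal-1",
    "planId": "plan-1",  # optional back-reference to the owning plan
    "title": "Read one book per month",
    "status": "open",
    "comments": [
        {"author": "teacher-1", "text": "Good start"},
    ],
}
```

With this shape, "update goal" and "add comment" only touch a single small goal document, while "get plan" needs the plan plus the goals it references.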

The most common use cases are:

  • Create plan

  • Get plan

  • Update goal

  • Add comment

What are the best tips for deciding between embedding and referencing? I find it hard to choose…

Thank you in advance for your time.

Hi, if there are lots of phases with lots of goals and lots of comments (by “lots” I mean thousands, or a number that keeps growing), then it makes sense to reference as you suggest, so that “update goal” and “add comment” do not have to rewrite the whole plan document (the update is applied in place in memory, but the whole document is written to disk at checkpoint).

But, also depending on those numbers, a “get plan” may have to look up many goals. So it depends on the numbers, but also on your performance priorities (update vs. query).
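As a rough illustration of the difference on the write side, the sketch below builds the filter and update documents for both layouts without talking to a server. The helper names and the fields (status, name, goals, comments) are hypothetical; `$set`, `$push` and the `$[identifier]` array filters are standard MongoDB update syntax. With PyMongo these would be passed to `update_one(filter, update, array_filters=...)`, and the C# driver's UpdateOne takes the equivalent pieces.

```python
# Update specs for the two layouts, built as plain dicts (nothing is executed).
# Helper names and field names are assumptions made for illustration.

def embedded_goal_update(plan_id, phase_name, goal_id, status):
    """Goal embedded inside plan.phases[].goals[]: needs array filters."""
    filter_doc = {"_id": plan_id}
    update_doc = {"$set": {"phases.$[p].goals.$[g].status": status}}
    # One filter per positional identifier used in the update path.
    array_filters = [{"p.name": phase_name}, {"g._id": goal_id}]
    return filter_doc, update_doc, array_filters


def referenced_goal_update(goal_id, status):
    """Goal stored in its own collection: a plain update by _id."""
    return {"_id": goal_id}, {"$set": {"status": status}}


def add_comment_update(goal_id, comment):
    """Append a comment to the goal's embedded comments array."""
    return {"_id": goal_id}, {"$push": {"comments": comment}}
```

The referenced versions target one small document by its `_id`, which is the case described above where the large plan document does not have to be written again.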
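To show what the extra work for “get plan” could look like with referenced goals, here is a minimal in-memory sketch. It assumes each phase stores a hypothetical goalIds array; the function names are made up. One option is to collect all ids and fetch the goals with a single `$in` query instead of one query per goal, then stitch them back into the phases in application code.

```python
# In-memory sketch of "get plan" when goals are referenced.
# Field names (phases, goalIds) and function names are assumptions.

def goal_ids(plan):
    """Collect every referenced goal id across all phases, in order."""
    return [gid for phase in plan["phases"] for gid in phase["goalIds"]]


def goals_query(plan):
    """Filter document for fetching all goals of a plan in one query."""
    return {"_id": {"$in": goal_ids(plan)}}


def assemble_plan(plan, goals):
    """Replace goalIds in each phase with the fetched goal documents."""
    by_id = {g["_id"]: g for g in goals}
    phases = [
        {
            "name": phase["name"],
            # Skip ids whose goal document was not returned.
            "goals": [by_id[gid] for gid in phase["goalIds"] if gid in by_id],
        }
        for phase in plan["phases"]
    ]
    return {**plan, "phases": phases}


plan = {
    "_id": "plan-1",
    "phases": [
        {"name": "Phase 1", "goalIds": ["goal-1", "goal-2"]},
        {"name": "Phase 2", "goalIds": ["goal-3"]},
    ],
}
fetched = [{"_id": "goal-1", "title": "A"}, {"_id": "goal-3", "title": "C"}]
full_plan = assemble_plan(plan, fetched)
```

The number of ids in the `$in` list grows with the number of goals, which is why the read cost depends on the numbers mentioned above.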
