Many-to-many relationship schema for a YouTube-like application


Let’s say I have a YouTube-like application with two entities, video and user. I want to store information about likes and comments on videos, so there would be a many-to-many relationship between users and videos. How should I design the schema for this use case?

  1. Add references to users in the video document as an array of user IDs? In this case the array could grow into the billions.
  2. Add references to videos in the user document as an array of video IDs? In this case the videos array would be limited in size.
  3. Any other solution?
    Please help me with the solution.



It is an excellent question.
The first thing to remember when deciding on a model is to understand how you will query and write the data.
That is why the first phase of our modeling process is to describe the workload.

If the main query retrieves the information about a video and, from there, needs to show the users who liked it, then the references probably belong in the video document.
As you noted, the list could be too long for a given array.
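In this case, the Subset Pattern may help: keep only a small subset of the users (for example, the most recent ones) in the video document, and store the complete list elsewhere.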
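
To make the Subset Pattern idea concrete, here is a minimal sketch in plain Python, with dictionaries standing in for documents and a list standing in for a separate collection. The field name `recent_likers`, the `likes` list, and the cap of 3 are all assumptions chosen for illustration; they do not come from the question.

```python
RECENT_LIKERS_LIMIT = 3  # hypothetical cap on how many likers the video keeps


def new_video(video_id, title):
    """Build a video document that holds only a bounded subset of likers."""
    return {"_id": video_id, "title": title, "recent_likers": []}


def like_video(video, likes, user_id):
    """Record a like.

    `likes` stands in for a separate collection with one small document
    per like, so the complete list is never stored inside the video.
    """
    likes.append({"video_id": video["_id"], "user_id": user_id})
    # Newest liker goes first; anything beyond the cap is dropped
    # from the video document (it is still present in `likes`).
    video["recent_likers"].insert(0, user_id)
    del video["recent_likers"][RECENT_LIKERS_LIMIT:]


def all_likers(likes, video_id):
    """Return every user who liked the video, read from the full list."""
    return [like["user_id"] for like in likes if like["video_id"] == video_id]
```

In MongoDB itself, a bounded array like this is usually maintained with `$push` combined with the `$each` and `$slice` modifiers, so the trimming happens in the same update as the insert.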

Alternatively, if you query the database for a given user to show which videos they like, then the references would be kept in the user documents.

Based on my mental model of YouTube, I think the workload would lead us to keep the complete list of liked videos in the user document and to use the Subset and Computed Patterns in the video document.
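
As a rough sketch of that combined design, again in plain Python with dictionaries standing in for documents: the user document carries the complete list of liked videos, while the video document carries a precomputed counter (Computed Pattern) and a short list of recent likers (Subset Pattern). The field names `liked_video_ids`, `like_count`, and `recent_likers`, and the cap of 2, are assumptions for illustration only.

```python
MAX_RECENT_LIKERS = 2  # hypothetical size of the subset kept on the video


def like(user, video):
    """Record that `user` likes `video`. Returns False if already liked."""
    if video["_id"] in user["liked_video_ids"]:
        return False  # keep the operation idempotent

    # Complete list lives on the user: bounded by what one person can like.
    user["liked_video_ids"].append(video["_id"])

    # Computed Pattern: maintain the total at write time instead of
    # counting likes on every read of the video page.
    video["like_count"] += 1

    # Subset Pattern: only the most recent likers stay on the video.
    video["recent_likers"] = (
        [user["_id"]] + video["recent_likers"]
    )[:MAX_RECENT_LIKERS]
    return True
```

Against a real MongoDB deployment, the same three writes would typically map to `$addToSet` on the user document and `$inc` plus a sliced `$push` on the video document. Whether the two documents are updated in a transaction or allowed to be briefly inconsistent is a separate design choice that depends on the workload.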