There are 2 collections in my app, user and friend. The friend collection stores the friendship relationships between the users of the app.
In a relational database, the friend table would have one row for each friendship, which involves 2 users.
In MongoDB, I am advised to store all of a user’s friends in an array field.
What is wrong with storing one document per friendship? Is there really no scenario where this schema would be preferred over storing friends in an array?
The goal is fast queries. The friend collection will be used as the from collection in a $lookup stage of an aggregation on the user collection, and it will have a compound index on the friends’ user ids.
A general rule of thumb while doing schema design in MongoDB is that you should design your database in a way that the most common queries can be satisfied by querying a single collection, even when this means that you will have some redundancy in your database. Hence, you may have been advised to store all of a user’s friends in an array field.
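Storing one document per friendship in MongoDB means that reading a user’s friend list always needs a second query or a $lookup. It can also increase the number of write operations: if each friendship is stored once per direction, so that a user’s friends can be found through a single indexed field, every new friendship costs two inserts and every removal two deletes. The collection and its indexes grow with every friendship, which can slow things down over time.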
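As a minimal sketch of what that write path can look like, assuming each friendship is stored once per direction: the field names userId and friendId and the helper below are hypothetical, not something taken from your schema. The code only builds plain Python dicts; the commented lines show how they might be passed to pymongo.

```python
def friendship_docs(user_id, friend_id):
    """Return the two directional documents for one mutual friendship."""
    if user_id == friend_id:
        raise ValueError("a user cannot befriend themselves")
    return [
        {"userId": user_id, "friendId": friend_id},
        {"userId": friend_id, "friendId": user_id},
    ]


# Compound index keys in the (field, direction) form that pymongo's
# create_index accepts; 1 means ascending.
FRIEND_INDEX_KEYS = [("userId", 1), ("friendId", 1)]

docs = friendship_docs("alice", "bob")

# With pymongo and a running server (assumed, not executed here):
#   db.friend.create_index(FRIEND_INDEX_KEYS, unique=True)
#   db.friend.insert_many(docs)
```

Making the index unique is a design choice in this sketch: it would stop the same pair from being inserted twice in the same direction.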
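Using an array to store friends is usually more efficient for reads, because the friend list comes back with the user document itself. On the write side, a single update operation ($addToSet or $pull) adds or removes a friend on one user document; if the friendship is mutual and both users keep the other in their array, that is one update per user.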
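The array updates could be sketched like this, assuming the user documents keep their friends' ids in an array field called friends (a hypothetical name) and that friendships are mutual. The helpers only build the filter and update documents; $addToSet and $pull are the standard MongoDB update operators.

```python
def add_friend_ops(user_id, friend_id):
    """Return (filter, update) pairs that add a mutual friendship."""
    return [
        # $addToSet appends only if the id is not already in the array.
        ({"_id": user_id}, {"$addToSet": {"friends": friend_id}}),
        ({"_id": friend_id}, {"$addToSet": {"friends": user_id}}),
    ]


def remove_friend_ops(user_id, friend_id):
    """Return (filter, update) pairs that remove a mutual friendship."""
    return [
        # $pull removes every array element equal to the given id.
        ({"_id": user_id}, {"$pull": {"friends": friend_id}}),
        ({"_id": friend_id}, {"$pull": {"friends": user_id}}),
    ]


add_ops = add_friend_ops("alice", "bob")

# With pymongo and a running server (assumed, not executed here):
#   for flt, upd in add_ops:
#       db.user.update_one(flt, upd)
```

If the relationship is one-directional (a "follow" rather than a friendship), only the first pair in each list would be needed.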
On the other hand, if a single user can have a very large number of friends, storing them in an array can run into the document size limit. MongoDB has a hard limit of 16MB per document, and a write that would push a document past that limit fails, so any schema design in which a document can grow indefinitely will hit the limit sooner or later. A very large array can also make each read and update of the user document more expensive well before that point. In this case, it may be better to implement a separate collection for friends and use a referenced relationship instead, and then you can use the $lookup approach that you mentioned.
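A rough sketch of the referenced approach with $lookup, assuming the friend collection holds one document per direction with hypothetical fields userId and friendId, and that the pipeline runs on the user collection. The function only builds the pipeline as a list of dicts.

```python
def friends_lookup_pipeline(user_id):
    """Aggregation pipeline: one user plus their friendship documents."""
    return [
        # Select the single user whose friends are wanted.
        {"$match": {"_id": user_id}},
        # Join friend documents whose userId equals this user's _id.
        {
            "$lookup": {
                "from": "friend",
                "localField": "_id",
                "foreignField": "userId",
                "as": "friendships",
            }
        },
    ]


pipeline = friends_lookup_pipeline("alice")

# With pymongo and a running server (assumed, not executed here):
#   results = list(db.user.aggregate(pipeline))
```

With this layout, an index on the friend collection whose first key is userId should let the join find matching documents without scanning the whole collection.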
In conclusion, the array-based schema is usually preferred for fast friend-list reads in MongoDB, but one document per friendship can be the better choice in certain scenarios, such as when a single user may accumulate a very large or unbounded number of friends. It ultimately depends on the specific requirements and constraints of your project. I have a similar post where we discussed some of the things to keep in mind when deciding on a schema and whether to favor embedding or references: Schema design: Many-to-many relationships and normalization - #3 by Satyam