What is the best schema for a blog post that stores the blog content, likes, shares, and comments?

I am building an API backend for a social media platform where users can post blogs and other users can like, share, and comment on them.

Along with the basic details of the blog post (title, image, author name and photo), the API should return

  • the last few comments and their authors' profile photos
  • the like and comment counts
  • the profile pics of a few users who liked the post

I would like to know the best schema design for this requirement. I am planning to store blog content, likes, and comments in one collection. Please suggest.

Hi @Ummer_Irshad ,

Schema design varies with the requirements and with how the application accesses its data.

The main guidance is that data that is queried together should be stored together in one document, as long as that does not run into a known antipattern.

From the example you describe and the limited information given, it sounds like you can use the following schema (the extended reference pattern):

Blogs collection

{
  _id : ...,
  Title : ...,
  PostTime : ...,
  nLikes : ...,
  NumComments : ...,
  Author : {
    UserId : ...,
    ProfilePic : ...,
    Name : ...
  },
  ...
}

Comments collection

{
  _id : ...,
  BlogId : ..., // Reference to blogs
  PostTime : ...,
  Text : ...,
  Author : {
    UserId : ...,
    ProfilePic : ...,
    Name : ...
  },
  nLikes : ...
}

Likes collection :

{
  _id : ...,
  ReferenceId : ..., // Reference to blogs or comments
  LikeTime : ...,
  Author : {
    UserId : ...,
    ProfilePic : ...,
    Name : ...
  }
}
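
With the three collections above, a single blog view (the post, its last few comments, and a few recent likers) could be fetched with one aggregation. The following is a minimal sketch, not a definitive implementation: the collection names "comments" and "likes", the output field names LastComments and RecentLikers, and the helper name buildBlogViewPipeline are assumptions made for illustration. The counts are not computed here because nLikes and NumComments are already stored on the blog document.

```javascript
// Sketch only. Assumed (not from the thread): collection names "comments"
// and "likes", output fields LastComments / RecentLikers, and this helper.
function buildBlogViewPipeline(blogId, commentLimit = 3, likerLimit = 5) {
  return [
    // Pick the blog document; it already embeds Author, nLikes, NumComments.
    { $match: { _id: blogId } },
    {
      // Attach the most recent comments, each with its embedded Author.
      $lookup: {
        from: "comments",
        let: { blogId: "$_id" },
        pipeline: [
          { $match: { $expr: { $eq: ["$BlogId", "$$blogId"] } } },
          { $sort: { PostTime: -1 } },
          { $limit: commentLimit },
        ],
        as: "LastComments",
      },
    },
    {
      // Attach a few recent likers, keeping only id and profile pic.
      $lookup: {
        from: "likes",
        let: { blogId: "$_id" },
        pipeline: [
          { $match: { $expr: { $eq: ["$ReferenceId", "$$blogId"] } } },
          { $sort: { LikeTime: -1 } },
          { $limit: likerLimit },
          { $project: { _id: 0, "Author.UserId": 1, "Author.ProfilePic": 1 } },
        ],
        as: "RecentLikers",
      },
    },
  ];
}
```

The returned array would be passed to aggregate() on the blogs collection. Indexes on BlogId plus PostTime and on ReferenceId plus LikeTime would likely be needed for the two sub-pipelines to stay cheap.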

Let me know if that makes sense.

Thanks
Pavel

Thanks Pavel for your quick response.

One drawback I can see here is embedding the Author info in every collection. Whenever a user updates their profile pic, do we have to update it in all of their blogs, comments, and likes? Is there any way to handle this?

What about the schema below? Here we store blogs and comments in a single collection, with a “type” field to distinguish them. If a document is a comment, we keep the blogId in its “parent_id” field. We then need a separate “Users” collection and, of course, a second query to get the profile pic.

Any suggestions?

const mongoose = require('mongoose');

// Posts and comments share one collection; "type" tells them apart.
// (The variable name postSchema is illustrative.)
const postSchema = new mongoose.Schema({
	id: { type: String },
	type: { type: String, required: true, enum: ['post', 'comment'] },
	post_uid: { type: String, required: true },
	content: { type: String, required: false },
	author_id: { type: Number, required: true },
	timestamp: { type: Date, required: true },
	vote_count: { type: Number, required: true },
	comment_count: { type: Number, required: true },
	parent_id: { type: String, required: false }, // blog id when type is 'comment'
	votes: [
		{
			vote_id: { type: String },
			voted_user_id: { type: Number, required: true },
			timestamp: { type: String, required: true },
		},
	],
});

Hi @Ummer_Irshad ,

Well, whether the documents live in a single collection or in two separate collections does not matter much, as long as you need the same number of reads/queries to fetch them.

You can potentially store just pointers to the profile pic of each user in the users collection.

However, another idea is to keep each user's current profile pic under a generic, fixed file name and to overwrite that picture with the new one each time. This way you will not need to update the embedded documents at all, and you can get a history of pictures as well.

https://hosting.com/user_123456/profile_current.jpg
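
With a fixed name like this, the picture URL can be derived from the UserId alone, so nothing stored in blogs, comments, or likes changes when the picture does. A minimal sketch, assuming the host and path layout of the example URL above; the function name currentProfilePicUrl is hypothetical:

```javascript
// Sketch only: host and path layout copy the example URL in the post;
// the function name is an assumption made for illustration.
function currentProfilePicUrl(userId) {
  // encodeURIComponent guards against ids that contain URL-unsafe characters.
  const safeId = encodeURIComponent(String(userId));
  return `https://hosting.com/user_${safeId}/profile_current.jpg`;
}
```

For example, currentProfilePicUrl(123456) yields the URL shown above.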

Regarding the votes (likes) in an array, just note that this can become an unbounded array if a highly popular post or comment collects thousands of likes or more. Consider whether that is acceptable or whether you need the outlier pattern.
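
To make the outlier idea concrete, here is a small in-memory sketch: votes are embedded up to a cap, and anything beyond it goes to a separate overflow store while the post is flagged. The cap value, the has_overflow flag, the overflowVotes array, and the function name addVote are all assumptions made for illustration; in a real system the overflow store would be its own collection.

```javascript
// Sketch only. Hypothetical names: MAX_EMBEDDED_VOTES, has_overflow,
// overflowVotes, addVote. Plain objects stand in for database documents.
const MAX_EMBEDDED_VOTES = 1000;

function addVote(post, overflowVotes, vote, maxEmbedded = MAX_EMBEDDED_VOTES) {
  // The counter always grows, so the like count never needs the array.
  post.vote_count += 1;
  if (post.votes.length < maxEmbedded) {
    // Normal case: the vote is embedded in the post document.
    post.votes.push(vote);
  } else {
    // Outlier case: flag the post and store the vote elsewhere.
    post.has_overflow = true;
    overflowVotes.push({ post_id: post.id, ...vote });
  }
  return post;
}
```

Readers of a post only need to consult the overflow store when has_overflow is set, which keeps the common case to a single document read.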

Thanks
Pavel

@Pavel_Duchovny
I think he means embedding the Author in the documents of the Blogs collection.
So what about this situation: if an author's name changes, for example, we then have to query all of that author's blogs and update them.
Is there a better way of doing this?
Could you explain this idea further: "You can potentially store just pointers to the profile pic of each user in the users collection."

Hi @Khammassi_HoussemEdd ,

How many scenarios actually require changing the author name on previous posts? Does that make sense for your application?

In case you still want to follow this scenario, you have several options:

  1. Since posts also embed the UserId, you can still fetch the user details and mention that the user was previously known as “OldUserName”.
  2. Asynchronously update the user's posts, starting from the most recent.

The idea is to avoid any kind of lookup for user details for a generic posts view.
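
Option 2 can be sketched as a job that walks the user's posts newest first, in small batches. This is an in-memory illustration only: the function name propagateAuthorChange and the batchSize parameter are assumptions, and in a real deployment each batch would be an update against the database issued by a background worker.

```javascript
// Sketch only. propagateAuthorChange and batchSize are hypothetical;
// the array of plain objects stands in for the Blogs collection.
function propagateAuthorChange(posts, userId, newAuthorFields, batchSize = 100) {
  // Select this user's posts and order them newest first
  // (filter returns a copy, so the caller's array order is untouched).
  const mine = posts
    .filter((p) => p.Author.UserId === userId)
    .sort((a, b) => b.PostTime - a.PostTime);

  const updatedIds = [];
  for (let i = 0; i < mine.length; i += batchSize) {
    // One batch = one unit of background work.
    for (const post of mine.slice(i, i + batchSize)) {
      Object.assign(post.Author, newAuthorFields);
      updatedIds.push(post._id);
    }
  }
  return updatedIds; // ids in the order they were updated
}
```

Because the newest posts are rewritten first, the posts most likely to be viewed show the new name soonest, while older ones catch up later.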

Ty

You can use Mongoose's populate().
With it, we only have to store a reference to the author in the blogs.
You can read more about it in the Mongoose populate documentation.

I’m not a specialist in this domain; however, I think you should get in touch with content creators with a large fanbase that can help you with your problem. Social media platforms are used for various purposes, from entertainment to business, and if you want to become famous, you should post quality content.

It’s been a hot minute since this thread was active. Anyway, here’s my two cents: I’d recommend splitting your data into separate collections - one for ‘Posts’, another for ‘Comments’, and finally ‘Likes’. This would make your model more flexible and efficient.

Hey all, I have a question. When we have separate collections for comments and likes, but also want to keep track of the number of likes and comments on each Blog document (as in @Pavel_Duchovny's example), what would be a good way to make inserting/deleting a comment or like document and incrementing/decrementing the corresponding counter atomic, so that the two stay in sync?
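
One possible approach, offered as a sketch rather than a definitive answer, is a multi-document transaction with the MongoDB Node.js driver, so that the insert and the counter update commit or abort together. The database name "app", the collection names "blogs" and "comments", and the function name addComment are assumptions made for illustration; transactions also require a replica set or sharded cluster.

```javascript
// Sketch only. Assumed: database "app", collections "blogs"/"comments",
// and the helper name addComment. `client` is a connected MongoClient.
async function addComment(client, comment) {
  const db = client.db("app");
  const session = client.startSession();
  try {
    // Both writes run in one transaction: either both apply or neither does.
    await session.withTransaction(async () => {
      await db.collection("comments").insertOne(comment, { session });
      await db.collection("blogs").updateOne(
        { _id: comment.BlogId },
        { $inc: { NumComments: 1 } },
        { session }
      );
    });
  } finally {
    await session.endSession();
  }
}
```

Deleting a comment would mirror this with deleteOne and an increment of -1, and likes would follow the same shape against nLikes.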

@Evdokim_N_A, good try hiding your spam within your ChatGPT generated answer. Your post has been flagged as spam. I have started to follow you to make sure I can flag your spam as soon as possible.
