What is the best schema for a blog post that stores the blog content, likes, shares and comments?

I am building an API backend for a social media platform where users can post blogs and other users can like, share and comment on them.

Along with the basic details of the blog post (title, image, author name and photo), the API should return

  • the last few comments and their authors' profile photos
  • like and comment counts
  • profile pics of a few users who liked the post

I would like to know the best schema design for this requirement. I am planning to store blog content, likes and comments in one collection. Please advise.


Hi @Ummer_Irshad ,

Schema design varies with the requirements and with how the application accesses its data.

The main guidance is that data that is queried together should be stored together in a document while not hitting any known antipattern.

Given the example you describe and the limited information available, it sounds like you can use the following schema (extended reference pattern):

Blogs collection

{
  _id : ...,
  Title : ...,
  PostTime : ...,
  nLikes : ...,
  NumComments : ...,
  Author : {
      UserId : ...,
      ProfilePic : ...,
      Name : ...
  },
  ...
}

Comments collection

{
  _id : ...,
  BlogId : ..., // Reference to blogs
  PostTime : ...,
  Text : ...,
  Author : {
      UserId : ...,
      ProfilePic : ...,
      Name : ...
  },
  nLikes : ...
}

Likes collection :

{
  _id : ...,
  ReferenceId : ..., // Reference to blogs or comments
  LikeTime : ...,
  Author : {
      UserId : ...,
      ProfilePic : ...,
      Name : ...
  }
}
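
As a rough sketch of how one post view could be read from these three collections, the helper below builds the filter and options objects that would be passed to `findOne()` and `find()` in the MongoDB Node.js driver. The function name `buildPostViewQueries`, the default limits and the collection names in the usage comment are assumptions for illustration; the field names follow the schemas above.

```javascript
// Hypothetical helper: builds the three reads needed for one post view.
// Field names (BlogId, ReferenceId, PostTime, LikeTime, Author) follow the
// schemas above; the default limits are illustrative only.
function buildPostViewQueries(blogId, { lastComments = 3, likedUsers = 5 } = {}) {
  return {
    // 1. The post itself: title, counters and embedded author in one document.
    blog: { filter: { _id: blogId } },
    // 2. The most recent comments, each carrying its embedded author.
    comments: {
      filter: { BlogId: blogId },
      options: { sort: { PostTime: -1 }, limit: lastComments },
    },
    // 3. A few recent likes; only the embedded author fields are needed.
    likes: {
      filter: { ReferenceId: blogId },
      options: {
        sort: { LikeTime: -1 },
        limit: likedUsers,
        projection: { "Author.UserId": 1, "Author.ProfilePic": 1 },
      },
    },
  };
}

// Assumed usage with the Node.js driver (db is a connected Db instance):
//   const q = buildPostViewQueries(blogId);
//   const blog = await db.collection("blogs").findOne(q.blog.filter);
//   const comments = await db.collection("comments")
//     .find(q.comments.filter, q.comments.options).toArray();
//   const likes = await db.collection("likes")
//     .find(q.likes.filter, q.likes.options).toArray();
```

Because the author details are embedded, none of these reads needs a lookup into a users collection, and the like and comment counts come straight from `nLikes` and `NumComments` on the blog document.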

Let me know if that makes sense.


Thanks
Pavel


Thanks Pavel for your quick response.

One drawback I can see here is embedding the Author info inside each collection. Whenever a user updates their profile pic, we would have to update it in all of their blogs, comments and likes. Is there any way to handle this?

What about the schema below? Here, we store blogs and comments in a single collection, with a “type” field to distinguish them. If a document is a comment, we keep the blog's id in its “parent_id” field. We then need a separate collection for “Users” and, of course, a second query to get the profile pic.

Any suggestions?

{
	id: { type: String },
	type: { type: String, required: true, enum: ['post', 'comment'] },
	post_uid: { type: String, required: true },
	content: { type: String, required: false },
	author_id: { type: Number, required: true },
	timestamp: { type: Date, required: true },
	vote_count: { type: Number, required: true },
	comment_count: { type: Number, required: true },
	parent_id: { type: String, required: false },
	votes: [
		{
			vote_id: { type: String },
			voted_user_id: { type: Number, required: true },
			timestamp: { type: String, required: true },
		},
	],
}
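
To make the second query concrete, here is a minimal sketch of what it might look like: collect the distinct `author_id` values from one page of posts and comments, fetch those users in a single query, and merge the results in application code. The helper names and the `name` and `profile_pic` fields on the Users documents are assumptions, as is using the numeric user id as `_id` in Users.

```javascript
// Hypothetical helpers for the "second query" approach: posts and comments
// carry only author_id, and profile data lives in a separate Users collection.

// Collect the distinct author ids from a page of posts/comments and build the
// filter for one Users query (assumes Users._id is the numeric user id).
function buildAuthorFilter(docs) {
  const ids = [...new Set(docs.map((d) => d.author_id))];
  return { _id: { $in: ids } };
}

// Merge the fetched user documents back onto the posts/comments.
// `name` and `profile_pic` are assumed field names on the Users documents.
function attachAuthors(docs, users) {
  const byId = new Map(users.map((u) => [u._id, u]));
  return docs.map((d) => {
    const u = byId.get(d.author_id);
    return {
      ...d,
      author: u ? { name: u.name, profile_pic: u.profile_pic } : null,
    };
  });
}

// In-memory data standing in for the results of the two queries.
const docs = [
  { id: "p1", type: "post", author_id: 1 },
  { id: "c1", type: "comment", author_id: 2, parent_id: "p1" },
  { id: "c2", type: "comment", author_id: 1, parent_id: "p1" },
];
const users = [
  { _id: 1, name: "A", profile_pic: "a.jpg" },
  { _id: 2, name: "B", profile_pic: "b.jpg" },
];

console.log(buildAuthorFilter(docs)); // filter asks for user ids 1 and 2 only
console.log(attachAuthors(docs, users));
```

This keeps the cost at two queries per page regardless of how many comments are shown, since each distinct author is fetched once.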

Hi @Ummer_Irshad ,

Keeping the documents in a single collection or in two separate collections does not make much difference, as long as you need the same number of reads/queries to fetch them.

You can potentially store just pointers to the profile pic of each user in the users collection.

However, another idea is to keep each user's current profile pic under a generic file name and to overwrite that file with the new picture each time it changes. This way the stored URL never changes, so no documents need to be updated. You can keep a history of the pictures as well.

https://hosting.com/user_123456/profile_current.jpg
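
With a stable file name like this, the URL can be derived from the user id alone, so it does not even have to be stored in each document. A minimal sketch, assuming the host and path layout of the example URL above (the function name and the `baseUrl` parameter are hypothetical):

```javascript
// Hypothetical helper: derives the stable profile-picture URL from a user id.
// The host and path layout mirror the example URL above and are assumptions.
function currentProfilePicUrl(userId, baseUrl = "https://hosting.com") {
  return `${baseUrl}/user_${encodeURIComponent(userId)}/profile_current.jpg`;
}

console.log(currentProfilePicUrl(123456));
// https://hosting.com/user_123456/profile_current.jpg
```

One thing to keep in mind with this approach is that browsers and CDNs may cache the old image under the unchanged URL, so cache lifetimes would need to be chosen accordingly.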

Regarding the votes (likes) stored in an array: note that this can become an unbounded array if a highly popular post or comment collects thousands of votes or more. Consider whether that is acceptable, or whether you need the outlier pattern.
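
A minimal in-memory sketch of the outlier pattern for votes might look like this: votes are embedded in the post until a cap is reached, after which the post is flagged and further votes go to separate overflow documents. The cap value, the `has_extras` flag and the overflow document shape are assumptions for illustration.

```javascript
// Minimal in-memory sketch of the outlier pattern for votes.
// MAX_EMBEDDED_VOTES, has_extras and the overflow document shape are assumptions.
const MAX_EMBEDDED_VOTES = 1000;

// Adds a vote to a post. Votes are embedded until the cap is reached; after
// that the post is flagged and further votes go to overflow documents.
function addVote(post, overflowDocs, vote, cap = MAX_EMBEDDED_VOTES) {
  post.vote_count += 1; // the counter always stays on the post itself
  if (post.votes.length < cap) {
    post.votes.push(vote);
    return;
  }
  post.has_extras = true; // tells readers to also check the overflow documents
  let bucket = overflowDocs.find(
    (d) => d.post_id === post.id && d.votes.length < cap
  );
  if (!bucket) {
    bucket = { post_id: post.id, votes: [] };
    overflowDocs.push(bucket);
  }
  bucket.votes.push(vote);
}

// Example: with a cap of 2, five votes end up as 2 embedded + 2 + 1 overflow.
const post = { id: "p1", vote_count: 0, votes: [] };
const overflow = [];
for (let i = 1; i <= 5; i++) {
  addVote(post, overflow, { voted_user_id: i }, 2);
}
console.log(post.votes.length, overflow.length, post.vote_count);
```

Typical posts stay as a single document, and only the rare popular post pays for extra reads when the full vote list is needed.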

Thanks
Pavel

@Pavel_Duchovny
I think he means embedding Author in the document (the one in the Blogs collection).
So what about this situation?
If an author's name changes, for example, we then have to query all of that author's blogs and update them.
Is there a better way of doing this?
Could you explain this idea further: "You can potentially store just pointers to the profile pic of each user in the users collection."

Hi @Khammassi_HoussemEdd ,

How often do you need to change the author name on previous posts? Does that make sense for your application?

If you still want to support this scenario, you have several options:

  1. Since each post also embeds the userId, you can still fetch the user details and mention that the user was "previously known as OldUserName".
  2. Asynchronously update the user's posts, starting from the most recent.
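
Option 2 could be sketched roughly as follows, using an in-memory array in place of the Blogs collection. Each pass picks the most recent posts that still carry the old name and updates one batch, so the newest posts are corrected first. The function names and the batch size are assumptions; against a real database each batch would be a sorted, limited query followed by an update.

```javascript
// Minimal in-memory sketch: propagate a changed author name to the user's
// posts in small batches, newest first. `posts` stands in for the Blogs
// collection; the function names and batchSize are assumptions.

// Updates at most batchSize of the user's most recent stale posts and
// returns how many were updated.
function updateNextBatch(posts, userId, newName, batchSize = 100) {
  const batch = posts
    .filter((p) => p.Author.UserId === userId && p.Author.Name !== newName)
    .sort((a, b) => b.PostTime - a.PostTime) // most recent first
    .slice(0, batchSize);
  for (const p of batch) {
    p.Author.Name = newName;
  }
  return batch.length;
}

// Runs batches until nothing is left, yielding between batches so the
// backfill does not block other work.
async function backfillAuthorName(posts, userId, newName, batchSize = 100) {
  let total = 0;
  let n;
  do {
    n = updateNextBatch(posts, userId, newName, batchSize);
    total += n;
    await new Promise((resolve) => setTimeout(resolve, 0));
  } while (n === batchSize);
  return total;
}
```

Until the backfill finishes, older posts may still show the previous name, which is usually acceptable for a generic posts view.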

The idea is to avoid any kind of lookup for user details when rendering a generic posts view.

Ty