Storing (possibly millions of) comments in a single collection?

kevinadi · August 30, 2022, 2:29am

My plan is to store all comments in a single collection

I think this should be fine. It’s probably a better option than putting the comments in an array of sub-documents inside, e.g., a “post” document. If a post attracts a lot of comments, that “post” document keeps growing and can eventually run into MongoDB’s 16 MB document size limit, which is probably not what you want.
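To make the difference concrete, here is a rough sketch of the two shapes as plain Python dicts. Every field name except resourceId is my own placeholder, and the capacity figure is only a back-of-the-envelope bound based on MongoDB's 16 MB per-document limit, not a measurement.

```python
from datetime import datetime, timezone

# MongoDB caps a single BSON document at 16 MB.
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def make_comment(resource_id, author, text):
    """Option A: one document per comment in a dedicated collection.

    Field names other than resourceId are hypothetical.
    """
    return {
        "resourceId": resource_id,
        "author": author,
        "text": text,
        "createdAt": datetime.now(timezone.utc),
    }


def make_post_with_embedded_comments(post_id, comments):
    """Option B: comments embedded in the parent document (grows per comment)."""
    return {"_id": post_id, "comments": list(comments)}


def embedded_comment_capacity(avg_comment_bytes):
    """Rough upper bound on embedded comments before hitting the size limit."""
    return MAX_BSON_DOCUMENT_BYTES // avg_comment_bytes
```

With an assumed average of 500 bytes per comment, `embedded_comment_capacity(500)` comes out a bit over 33,000 comments for a single post, and the document gets more expensive to read and rewrite long before that. With option A, each comment is its own small document, so no single document grows with the number of comments.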

Question 1: Is this schema good? Will I be able to query these comments (by resourceId) fast enough even when the collection grows into the millions?

Well, “good” is relative. I believe that as long as the collection is indexed properly (see Create Indexes to Support Your Queries) and the working set fits in RAM, it should be fast enough. Of course, this also depends on the hardware spec and on whether the hardware can handle the workload.
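As a minimal sketch of what “indexed properly” could look like for the resourceId lookup: the helpers below expect a PyMongo collection object to be passed in, and the createdAt field and the newest-first ordering are my assumptions rather than something from your schema.

```python
# Same values as pymongo.ASCENDING / pymongo.DESCENDING.
ASCENDING, DESCENDING = 1, -1

# Compound index: equality match on resourceId, then sort on createdAt.
# createdAt is a hypothetical field; drop it if you never sort by time.
COMMENT_INDEX = [("resourceId", ASCENDING), ("createdAt", DESCENDING)]


def ensure_comment_index(comments):
    """Create the index if it does not exist yet.

    `comments` is expected to be a pymongo Collection.
    """
    return comments.create_index(COMMENT_INDEX)


def latest_comments(comments, resource_id, limit=20):
    """Fetch the newest comments for one resource, one page at a time."""
    cursor = (
        comments.find({"resourceId": resource_id})
        .sort("createdAt", DESCENDING)
        .limit(limit)
    )
    return list(cursor)
```

If the index is being used, calling `.explain()` on the same cursor should show an IXSCAN stage in the winning plan instead of a COLLSCAN. Keeping the `limit` small also helps, since each request then touches only a handful of index entries no matter how large the collection gets.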

Question 2: I’m also planning to add comments to my blog posts. Instead of the resourceId the comment belongs to, I would need an identifier for the specific post.

To me that use case doesn’t sound too different from the first one you mentioned. If it’s serving the same purpose and you’re expecting a similar usage pattern, I don’t see why you can’t reuse the same schema with minor modifications.
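One possible “minor modification”, sketched below, is to replace resourceId with a generic parent reference so that the same collection can hold comments for both resources and blog posts. The parentType and parentId field names and the two type strings are hypothetical; this is just one way to do it, and two separate collections with the same shape would work as well.

```python
from datetime import datetime, timezone

# Hypothetical type names for the two kinds of parent.
ALLOWED_PARENT_TYPES = {"resource", "post"}

# Index that would support lookups by parent, newest first (assumed ordering).
GENERIC_COMMENT_INDEX = [("parentType", 1), ("parentId", 1), ("createdAt", -1)]


def make_generic_comment(parent_type, parent_id, author, text):
    """Build a comment that points at either a resource or a blog post."""
    if parent_type not in ALLOWED_PARENT_TYPES:
        raise ValueError(f"unknown parent type: {parent_type!r}")
    return {
        "parentType": parent_type,
        "parentId": parent_id,
        "author": author,
        "text": text,
        "createdAt": datetime.now(timezone.utc),
    }


def comments_filter(parent_type, parent_id):
    """Query filter for all comments under one parent."""
    return {"parentType": parent_type, "parentId": parent_id}
```

The filter from `comments_filter("post", some_post_id)` can be passed straight to `find()`, so the read path stays the same as for resources.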

Obligatory caveat: I’m not 100% familiar with the use case you have in mind, so these are just generalized opinions on my part. Before committing to any one solution, I’d recommend simulating the workload first to see whether the design works.
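For the simulation, something along these lines might be a starting point. It generates synthetic comments with a skew toward a few “hot” resources, loads them in batches, and times the lookup by resourceId. The helpers expect a PyMongo collection to be passed in; the field names, the skew, and the batch size are all assumptions you would want to adjust to your real traffic.

```python
import random
import time
from datetime import datetime, timedelta, timezone

BASE_TIME = datetime(2022, 1, 1, tzinfo=timezone.utc)


def synthetic_comments(total, resource_count, seed=0):
    """Yield fake comments; squaring the random value skews toward low ids."""
    rng = random.Random(seed)
    for i in range(total):
        resource_no = int(resource_count * rng.random() ** 2)
        yield {
            "resourceId": f"resource-{resource_no}",
            "author": f"user-{rng.randrange(1000)}",
            "text": f"synthetic comment {i}",
            "createdAt": BASE_TIME + timedelta(seconds=i),
        }


def load_comments(comments, docs, batch_size=1000):
    """Insert docs in batches via insert_many; returns the number inserted."""
    batch, inserted = [], 0
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            comments.insert_many(batch)
            inserted += len(batch)
            batch = []
    if batch:
        comments.insert_many(batch)
        inserted += len(batch)
    return inserted


def time_queries(comments, resource_ids, limit=20):
    """Time the newest-first lookup for each resource id (seconds)."""
    timings = []
    for rid in resource_ids:
        start = time.perf_counter()
        list(comments.find({"resourceId": rid}).sort("createdAt", -1).limit(limit))
        timings.append(time.perf_counter() - start)
    if not timings:
        return {}
    timings.sort()
    return {"median": timings[len(timings) // 2], "max": timings[-1]}
```

Running this against a test deployment with hardware similar to production, once with the index and once without, should give a reasonable feel for how the design behaves at a few million documents.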

Best regards,
Kevin