Populate and ref performance vs. saving the data directly

Sun_23 · November 26, 2023, 9:57pm

Hello,

I am building a social media app. I am a beginner with databases and only recently discovered the populate functionality. I like it a lot, especially in combination with the ref functionality in schemas.

For example: when someone writes a text that also contains pictures, I only want to put references to the image documents in the text document (the images are hosted in an S3 bucket, and the image documents only contain the links, metadata, etc.).

So the TextSchema looks like:
…

images: [
  {
    type: mongoose.Schema.Types.ObjectId,
    ref: "Image",
  },
],

…

Now when someone wants to get the text, I just populate the references, take the URLs from the images, and put them into the server's response to the frontend. So you get the text body and the image links with one request.

I like this approach a lot because I have a clear structure, I don't save the same data in two different places, when I change the image document the change automatically shows up wherever the text document is populated, etc.
AND performance-wise: I only query the database once (is this even true?).

The other approach would be:
2. Put the data directly into the text document (so don't use ref and don't use populate). But then I would have the same data in two different places.

So now my beginner-question is:
Can I use this a lot (ref the comments of a text in the TextSchema and populate them, ref the texts of a user in the UserSchema and populate them, etc.)?
Or is it not so simple under the hood (possibly even doing more than one request)? Does the server have to do a lot of work and processing?

I could not find a satisfactory answer to this question on Google.

Thanks a lot for the help. If someone has good resources on this topic (text or video), that would also be nice.

Kushagra_Kesav · November 27, 2023, 2:31pm

Hey @Sun_23,

Thank you for reaching out to MongoDB Community!

When structuring your schema in MongoDB, a general rule of thumb is to prioritize designing your database to fulfill the most common queries through a single collection. This might result in some redundancy within the database, but optimizing for common queries takes precedence. It’s beneficial to start by identifying the required queries, simplifying them as much as possible, and then shaping the schema around these query patterns.

Yes, this is true. When you store the text together with its image data in a single collection, you can request everything in one go without using $lookup or Mongoose's populate (which fetches the referenced documents with an additional query), making the process more efficient.
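To make the difference concrete, here is a minimal sketch in plain JavaScript. It is not real Mongoose or MongoDB code: the arrays stand in for collections, and `find`, `getTextWithPopulate`, `getTextEmbedded`, and `countQueries` are hypothetical helpers invented for this illustration. It assumes, as is my understanding of Mongoose, that populating a path costs one additional query for the referenced documents.

```javascript
// In-memory stand-ins for two collections (assumption: not a real database).
const images = [
  { _id: "img1", url: "https://example.com/a.jpg" },
  { _id: "img2", url: "https://example.com/b.jpg" },
];

// Referenced design: the text stores only the image ids.
const textsReferenced = [{ _id: "t1", body: "Hello", images: ["img1", "img2"] }];

// Embedded design: the text stores the image data itself.
const textsEmbedded = [
  {
    _id: "t1",
    body: "Hello",
    images: [
      { url: "https://example.com/a.jpg" },
      { url: "https://example.com/b.jpg" },
    ],
  },
];

let queryCount = 0;

// Hypothetical helper: every call counts as one round trip to the database.
function find(collection, predicate) {
  queryCount += 1;
  return collection.filter(predicate);
}

// Roughly what a populate-style read does: one query for the text,
// then one more query for all referenced images (like { _id: { $in: ids } }).
function getTextWithPopulate(id) {
  const [text] = find(textsReferenced, (t) => t._id === id);
  const docs = find(images, (img) => text.images.includes(img._id));
  return { ...text, images: docs };
}

// Embedded read: a single query returns everything.
function getTextEmbedded(id) {
  const [text] = find(textsEmbedded, (t) => t._id === id);
  return text;
}

// Runs a read and reports how many simulated queries it needed.
function countQueries(fn) {
  queryCount = 0;
  const result = fn();
  return { result, queries: queryCount };
}

const referenced = countQueries(() => getTextWithPopulate("t1"));
const embedded = countQueries(() => getTextEmbedded("t1"));
console.log(referenced.queries, embedded.queries); // prints: 2 1
```

The trade-off is the one described in the question: the embedded version is cheaper to read but duplicates image data, while the referenced version keeps one copy of each image at the cost of an extra query per populated path.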

However, I would suggest you use mgeneratejs to create sample documents quickly in any number, so the design can be tested easily.
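mgeneratejs has its own template format, which is not shown here. Purely to illustrate the idea of generating test documents in bulk, the following plain JavaScript sketch builds sample text documents with embedded images. The function name `makeSampleTexts`, the field names, and the example.com URLs are assumptions made up for this example.

```javascript
// Hypothetical generator: builds `count` text documents, each with
// `imagesPerText` embedded image entries. This is a stand-in for a tool
// like mgeneratejs, not its API.
function makeSampleTexts(count, imagesPerText) {
  return Array.from({ length: count }, (_, i) => ({
    _id: `text-${i}`,
    body: `Sample text ${i}`,
    images: Array.from({ length: imagesPerText }, (_, j) => ({
      url: `https://example.com/images/${i}-${j}.jpg`,
    })),
  }));
}

const samples = makeSampleTexts(1000, 3);
console.log(samples.length); // prints: 1000
console.log(samples[0]);
```

Documents like these could then be inserted into a test collection to compare how the embedded and referenced designs behave under your most common queries.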

You can further refer to the following documentation to cement your knowledge of Schema Design in MongoDB.

Hope this helps!

Best regards,
Kushagra