I need to design a simple structure where the user’s click actions on a web application are recorded. Example document,
I also have a user_detail collection where user detail is stored.
If a user updates their user_detail, say the first name, the change should be reflected in the user_action data.
Which approach is best?
Update the user_action collection whenever user detail is updated. user_action is the largest collection in the DB, since all actions are recorded there. Is it OK to modify such a large number of documents, even if it does not happen frequently?
Join-fetch user_detail while querying user_action. Is this approach right?
Consider redesigning the model. If I need to redesign the model, what are some suggestions?
Is there any other way to approach this?
Your help would be appreciated.
Thanks in advance!
Hello @Santhosh_Kumar, welcome to the community.
Here is a similar post with some answers:
I suppose, for your case,
email props, that you embed in your
user_actions collection, are not updated frequently. Thus, a denormalized data model would be the best option.
So, you have 2 collections:
// user document example
// user_actions documents example
Actually, it is more a question about how consistent the user details in
user_actions collection should be with the data in
- if it is NOT critical - you can sync the user details in the user_actions collection in the background.
- if it is critical - consider syncing inside a transaction (not recommended).
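To make the background-sync option more concrete, here is a minimal sketch in plain JavaScript. It uses in-memory arrays as stand-ins for the two collections, so it runs without a database. The field names (`firstName`, `email`, the embedded `user` object) and the helper names `updateFirstName` and `syncUserActions` are assumptions for illustration, not taken from the original documents.

```javascript
// In-memory stand-ins for the two collections (hypothetical field names).
const userDetail = [
  { _id: "u1", firstName: "Ann", email: "ann@example.com" },
];
const userActions = [
  { _id: 1, action: "click", user: { _id: "u1", firstName: "Ann" } },
  { _id: 2, action: "scroll", user: { _id: "u1", firstName: "Ann" } },
  { _id: 3, action: "click", user: { _id: "u2", firstName: "Bob" } },
];

// Step 1: update the source of truth (the user_detail document).
function updateFirstName(userId, firstName) {
  const user = userDetail.find((u) => u._id === userId);
  if (!user) return false;
  user.firstName = firstName;
  return true;
}

// Step 2: copy the new value into every embedded copy. This can run later,
// e.g. from a background job. Against a real MongoDB collection this loop
// would typically be one updateMany call that filters on the embedded user
// id and uses $set on the embedded field.
function syncUserActions(userId) {
  const user = userDetail.find((u) => u._id === userId);
  if (!user) return 0;
  let modified = 0;
  for (const doc of userActions) {
    if (doc.user._id === userId && doc.user.firstName !== user.firstName) {
      doc.user.firstName = user.firstName;
      modified += 1;
    }
  }
  return modified;
}

updateFirstName("u1", "Anna");
const modifiedCount = syncUserActions("u1");
```

Between step 1 and step 2 the action documents briefly show the old name, which is exactly the consistency trade-off described above. An index on the embedded user id would likely help the real update find its documents quickly.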
Hi @slava and @Santhosh_Kumar
Depending on the queries you use to retrieve data, the subset pattern might be interesting. You can keep the latest, frequently retrieved action data embedded in the user document. Implicitly, you gain ACID guarantees, since you only deal with one (user) document. Whenever the embedded data reaches a certain amount, you move it to a second collection; the “amount” is defined by your process.
If you can use, e.g., the userId as the Id for the second collection, you save the extra linking step and cascaded deletes…
Just some thoughts. The first and most important thing to do when you design a schema is to identify the quantity, quality, and size of your workload, and how you are going to query the data. I based this on the assumption that you have a user-centric, not an action-centric, setup.
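As a rough illustration of the subset pattern described above, here is a small sketch in plain JavaScript, with `Map` objects standing in for the two collections. The threshold `MAX_EMBEDDED`, the `recentActions` field, and the `recordAction` helper are hypothetical names chosen for this example.

```javascript
// Hypothetical threshold; in practice the "amount" is defined by your process.
const MAX_EMBEDDED = 3;

const users = new Map();         // user documents with recent actions embedded
const actionArchive = new Map(); // second collection, keyed by the same userId

function recordAction(userId, action) {
  const user = users.get(userId);
  if (!user) throw new Error(`unknown user ${userId}`);
  user.recentActions.push(action);
  // Once the embedded array exceeds the limit, move the oldest entries out.
  while (user.recentActions.length > MAX_EMBEDDED) {
    const oldest = user.recentActions.shift();
    if (!actionArchive.has(userId)) {
      // Reusing the userId as the key means no separate link field is needed.
      actionArchive.set(userId, { _id: userId, actions: [] });
    }
    actionArchive.get(userId).actions.push(oldest);
  }
}

users.set("u1", { _id: "u1", firstName: "Ann", recentActions: [] });
for (const type of ["click", "scroll", "click", "submit", "click"]) {
  recordAction("u1", { type });
}
```

With this layout, a change to `firstName` touches a single user document, and the latest actions come back in the same read. In MongoDB itself, `$push` combined with the `$slice` modifier can cap an embedded array in a similar way; moving the overflow to the second collection would still be up to the application.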