Data Modeling - User Action Relationship

Santhosh_Kumar · August 7, 2020, 6:22am

I need to design a simple structure that records users' click actions in a web application. Example document:

user_action

{
  objectId: '90rt93jkfd92',
  action: 'analyze',
  app: 'reply',
  user: {
    id: 5,
    name: 'Sandy',
    email: 'sandy@xyz.com'
  }
}

I also have a user_detail collection where the user details are stored.

If a user updates their user_detail, say the first name, the change should be reflected in the user_action data.

Which approach is best?

1. Update the user_action collection whenever a user detail is updated. user_action is the largest collection in the DB, since all actions are recorded there. Is it OK to modify such a large number of documents, even if not frequently?
2. Join (fetch) user_detail while querying user_action. Is this approach right?
3. Consider redesigning the model. If I need to redesign the model, what are some suggestions?
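
For reference, the join option could be sketched as an aggregation pipeline with a $lookup stage. This is only a rough sketch: the helper name buildActionsWithUserPipeline is made up for illustration, and it assumes that the _id of a user_detail document equals the user.id stored in user_action, which the post does not state.

```javascript
// Hypothetical helper: builds a pipeline that joins user_detail into
// user_action at read time. Collection and field names are assumptions.
function buildActionsWithUserPipeline(userId) {
  return [
    // Filter to one user's actions first so the join touches fewer documents.
    { $match: { 'user.id': userId } },
    {
      $lookup: {
        from: 'user_detail',     // collection that holds the current user data
        localField: 'user.id',   // reference stored in each user_action document
        foreignField: '_id',     // assumed key of user_detail
        as: 'user_detail',
      },
    },
    // $lookup produces an array; unwind it into a single embedded document.
    // Note: actions without a matching user_detail are dropped by this stage.
    { $unwind: '$user_detail' },
  ];
}

const pipeline = buildActionsWithUserPipeline(5);
console.log(JSON.stringify(pipeline, null, 2));
// In mongosh this would be run as: db.user_action.aggregate(pipeline)
```

With this approach user_action would only need to store user.id, so a name change never touches the large collection; the cost moves to every read instead.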

Is there any other way to approach this?

Your help would be appreciated.
Thanks in advance!

Prasad_Saya · August 7, 2020, 6:27am

Hello @Santhosh_Kumar, welcome to the community.

Here is a similar post with some answers:

https://stackoverflow.com/questions/63295525/mongodb-data-modeling/63295997#63295997

slava · August 7, 2020, 11:36am

Welcome, @Santhosh_Kumar!

I assume that, in your case, the name and email properties that you embed in your user_actions collection are not updated frequently. If so, a denormalized data model would be the best option.

So, you have 2 collections: user_actions and users:

// user document example
{
  _id: 'U1',
  name: 'Sandy',
  age: 22,
  address: 'addr',
  email: 'sandy@xyz.com',
}

// user_actions documents example
[
  {
    _id: 'A1',
    action: 'login',
    app: 'store',
    user: {
      id: 'U1',
      name: 'Sandy',
      email: 'sandy@xyz.com',
    },
  },
  {
    _id: 'A2',
    action: 'logout',
    app: 'store',
    user: {
      id: 'U1',
      name: 'Sandy',
      email: 'sandy@xyz.com',
    },
  },
]

Really, the question is how consistent the user details in the user_actions collection need to be with the data in the users collection.

If it is NOT critical, you can sync the user details in the user_actions collection in the background.
If it is critical, consider syncing with transactions (not recommended).
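
A minimal sketch of the background sync, assuming the document shapes from the example above. The helper name buildUserSyncUpdate is hypothetical; it only builds the filter and update documents, which a background job could then pass to updateMany.

```javascript
// Hypothetical helper: given the updated user document, build the arguments
// for an updateMany call that refreshes the embedded copy in user_actions.
function buildUserSyncUpdate(user) {
  return {
    // Match every action recorded for this user.
    filter: { 'user.id': user._id },
    // Overwrite only the denormalized fields; the rest of the action is untouched.
    update: { $set: { 'user.name': user.name, 'user.email': user.email } },
  };
}

const { filter, update } = buildUserSyncUpdate({
  _id: 'U1',
  name: 'Sandy K',
  email: 'sandy@xyz.com',
});
console.log(JSON.stringify({ filter, update }));
// A background job would then run something like:
//   db.user_actions.updateMany(filter, update)
```

Since user_actions is the largest collection, an index on user.id would probably be needed to keep this update from scanning every document.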

michael_hoeller · August 7, 2020, 4:39pm

Hi @slava and @Santhosh_Kumar

Depending on the queries you use to retrieve data, the subset pattern might be interesting. You can keep the latest, frequently retrieved action data embedded in the user document. You implicitly gain ACID guarantees, since you only deal with a single (user) document. Whenever the embedded data reaches a certain amount, you move it to a second collection; the “amount” is defined by your process.
If you can use, e.g., the userId as the ID for the second collection, you save the extra linking step and cascaded deletes…
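
To make the idea concrete, here is a small in-memory sketch of that flow in plain JavaScript (no database calls). The names recordAction, recent_actions, and overflow are made up for illustration, and the limit of 3 is arbitrary.

```javascript
// In-memory sketch: keep at most `limit` recent actions embedded in the user
// document and move the oldest ones to a separate "second collection" array.
function recordAction(userDoc, overflow, action, limit) {
  userDoc.recent_actions.push(action);
  // Once the embedded array exceeds the limit, move the oldest entries out,
  // tagging each one with the user's id so it can be found again later.
  while (userDoc.recent_actions.length > limit) {
    const oldest = userDoc.recent_actions.shift();
    overflow.push({ userId: userDoc._id, ...oldest });
  }
}

const user = { _id: 'U1', name: 'Sandy', recent_actions: [] };
const overflow = [];
for (const action of ['login', 'analyze', 'reply', 'logout']) {
  recordAction(user, overflow, { action, app: 'store' }, 3);
}
console.log(user.recent_actions.map((a) => a.action)); // [ 'analyze', 'reply', 'logout' ]
console.log(overflow.length); // 1
```

In MongoDB itself, an embedded array can be capped with $push combined with the $each and $slice modifiers; moving the removed entries to the second collection would still be a separate step in your process.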

Just some thoughts. The first and most important thing to do when you design a schema is to identify the quantity, quality, and size of your workload, and how you are going to query the data. I based this on the assumption that you have a user-centric rather than an action-centric setup.

Cheers,
Michael