Modelling for beginners: different owner entities

LJacobs · December 9, 2021, 2:24pm

I came across an interesting task today. Let’s assume I have an application that has Users, Items and Organizations.

A user can be a member of multiple organizations, but doesn’t have to be a member of any organizations. Items can be owned by either users or organizations. How would a good MongoDB schema for this look like?

Most commonly Items are queried. Usually (but not always!) the user querying the item is either the owner of the item(s) queried or a member of the organization that owns the item(s).

I’m not very experienced with document oriented databases and thought about this for a while, but did not come to a satisfying conclusion. The only thing that very quickly occured to me: Oh my god, I’m happy I don’t have to do this in a relational database

tapiocaPENGUIN · December 9, 2021, 4:07pm

I would look at this page for a One to Many - Embedded or One to Many - Reference to get an idea. But here is something you could do:

You could create an items collection (especially if you have “common” items multiple users could own

{
item_id: 1234
name: Item 1
description: item 1 description
}

Then you can just query the item collection to get information about the item.

Then in either the owner/user that has the item you could create an array of items:

{
user_id: 55555
items: [1234, 1445]
}

Also, let’s say an item description changes you just have to change it in the items collection and when ever you reference the “item_id” it will be the updated info.

You can do the same thing with the organizations users

{
org_id: 9999
users: [55555,77777]
}

LJacobs · December 9, 2021, 8:16pm

Wouldn’t that be a bad practice? I thought having some entities being direct children of others would be a lot more performant than having them seperate using references and júst “joining” them together?

tapiocaPENGUIN · December 9, 2021, 9:05pm

No. It is a design pattern on the MongoDB website

But a better answer would be it depends more on the type of data/use case you have, if the items are the same but multiple people can own them (ie items from a supermarket are all the same name,description)
From the mongodb docs example of books:

Consider the following example that maps publisher and book relationships. The example illustrates the advantage of referencing over embedding to avoid repetition of the publisher information.

This allows you to not have to repeat putting the entire item object into each user/org that owns it. This is more efficient in space because the user documents are smaller. Also if you need to update an item, you just need to update it in the items collection and the references will be updated as well when ever the id is queried.

The other way to handle one to many is embedding and here the items are “unique” to a user. This way only requires one query instead of two but there is no “items” collection so to get an items description you have to query on a user.items.

   "user_id": 55555,
   "name": "Sample User",
   "items": [
                {
                  "item_id": "1234",
                  "description": "Item 1234 description"
                },
                {
                  "item_id": "5678",
                  "description": "Item 5678 description"
                }
              ]
 }

Both handle one to many relationships (which is the nice part and challenging part) of MongoDB.

LJacobs · December 9, 2021, 9:45pm

OK, that makes sense. Interesting stuff, but it’s difficult to wrap your head around if you’ve been developing with relational databases for years.
This is mostly an academic example to understand how things are working, so for the sake of argument let’s say that all items are unique and specific to that user or org that owns it. If I understand correctly it would probably be the best to just split this into 3 different collections and have the id’s as references?

tapiocaPENGUIN · December 9, 2021, 10:01pm

If they are unique and specific to the user/org I would embed the items in the user/org document.

   "user_id": 55555,
   "name": "Sample User",
   "items": [
                {
                  "item_id": "1234",
                  "description": "Item 1234 description"
                },
                {
                  "item_id": "5678",
                  "description": "Item 5678 description"
                }
              ]
 }

I would do this because the items are unique to the user, so to find the items you need to know the user. This way you are also only returning a single document (the user/org document) and you have all their items/descriptions. Plus if they are unique you won’t need to update them for multiple users you can just find/update on the user document.

LJacobs · December 9, 2021, 10:13pm

Let’s say I wanted to query all the items a user has access to (either directly or through an org) in this case. That would mean I query the user first, take the items embedded there, then take the org_ids from the user, query those orgs and take all the items found there as well. Is this correct?

tapiocaPENGUIN · December 9, 2021, 10:20pm

Yes that is correct, but with proper indexes on the user_id and org_id field this should be quick. Because you are looking for specific user_id or org_id values.

LJacobs · December 9, 2021, 10:28pm

OK, thank’s a lot! I think I will still need quite some time to get good at MongoDB, but this helped a lot to understand things better