I came across an interesting task today. Let’s assume I have an application that has Users, Items and Organizations.
A user can be a member of multiple organizations, but doesn’t have to be a member of any organizations. Items can be owned by either users or organizations. How would a good MongoDB schema for this look like?
Most commonly Items are queried. Usually (but not always!) the user querying the item is either the owner of the item(s) queried or a member of the organization that owns the item(s).
I’m not very experienced with document oriented databases and thought about this for a while, but did not come to a satisfying conclusion. The only thing that very quickly occured to me: Oh my god, I’m happy I don’t have to do this in a relational database
Wouldn’t that be a bad practice? I thought having some entities being direct children of others would be a lot more performant than having them seperate using references and júst “joining” them together?
But a better answer would be it depends more on the type of data/use case you have, if the items are the same but multiple people can own them (ie items from a supermarket are all the same name,description)
From the mongodb docs example of books:
Consider the following example that maps publisher and book relationships. The example illustrates the advantage of referencing over embedding to avoid repetition of the publisher information.
This allows you to not have to repeat putting the entire item object into each user/org that owns it. This is more efficient in space because the user documents are smaller. Also if you need to update an item, you just need to update it in the items collection and the references will be updated as well when ever the id is queried.
The other way to handle one to many is embedding and here the items are “unique” to a user. This way only requires one query instead of two but there is no “items” collection so to get an items description you have to query on a user.items.
OK, that makes sense. Interesting stuff, but it’s difficult to wrap your head around if you’ve been developing with relational databases for years.
This is mostly an academic example to understand how things are working, so for the sake of argument let’s say that all items are unique and specific to that user or org that owns it. If I understand correctly it would probably be the best to just split this into 3 different collections and have the id’s as references?
I would do this because the items are unique to the user, so to find the items you need to know the user. This way you are also only returning a single document (the user/org document) and you have all their items/descriptions. Plus if they are unique you won’t need to update them for multiple users you can just find/update on the user document.
Let’s say I wanted to query all the items a user has access to (either directly or through an org) in this case. That would mean I query the user first, take the items embedded there, then take the org_ids from the user, query those orgs and take all the items found there as well. Is this correct?