So I’ve read the blog posts linked by @Norberto and @brianhei, and in the absence of the upcoming course on schema design and data modeling, I thought I’d check my understanding here. I hope it’s not straying too off-topic.
Taking a collection of recipes as an example, because it’s used as an example in one of those posts, most of the data in a recipe consists of a list of ingredients and a list of instructions. One of the instructions or one of the ingredients doesn’t make a great deal of sense on its own outside the context of the recipe, so in a document database it makes sense to embed both of them into the recipe document.
However, in my personal data model for a recipe, each ingredient consists of 3 things; what the ingredient is (e.g. flour), how much of it (e.g. 250) and the unit of measurement (e.g. grams). OK so far, my recipe document can have an array of ingredient child documents, each of which has those 3 fields.
But we humans are inconsistent in the way we measure things. Some people measure liquids in multiples or fractions of pints, some in millilitres, some in cups. Some people measure solid ingredients in grams, some in ounces, some in cups (which I’m sure is strictly speaking a liquid measure, but when I get my recipes by searching the web I keep finding myself measuring out a 1/2 pint of flour, what’s that all about? But that’s definitely out of scope for this forum).
So, say I’m collecting recipes from all over the world, written by people who use different units of measurement, and one of my requirements for the recipe application is that it’ll display the ingredients using my preferred units of measurement rather than the units of measurement in the original recipe.
Forget for now how we convert from grams to ounces etc, I’m sure I can work that bit out on my own. But how should we store that relationship between the ingredient quantity in (for example) grams and the ingredient quantity in (for example) ounces? I’m reasonably sure we don’t want to embed the quantity in both grams and ounces in every ingredient in every recipe, because the relationship between a gram and an ounce isn’t something which is specific to that recipe, it’s common to all recipes.
So I suspect that the unit of measurement, along with its relationship with other units of measurement, actually belongs in a separate collection, with all the extra $lookups required to support my requirement to see the ingredients in my preferred units of measurement?
There are other things to think about here, such as the fact that MongoDB doesn’t enforce foreign key constraints between collections (e.g. to prevent someone entering a recipe which specifies the amount of some ingredient in units which the application doesn’t recognise) and that if I go down that route then my application will need to do that sort of validation itself rather than relying on the database to do it, but at this point I’m not too worried about that.
I just want to know, in this scenario, is it a good idea to put units of measurement into their own collection and have the overhead of extra $lookups, or would it be better to embed the quantity of ingredients in all known units of measurements directly in my recipe document and have the overhead of using more disk space?
Sorry, that’s a rather long question, but I think it’s quite an important one for people like me who are on the journey from working with relational databases to working with document databases. And I hope I’ve framed it in a way which will be useful for other people on the same journey.