Advice needed in design

Hello there,

I would like to kindly ask for an advice about how should one go when designing a databases/collections around such a scenario:

Let’s suppose we are working with 3 data collections:

PRODUCTS, CATEGORIES and POLICIES

PRODUCTS belong to CATEGORIES. CATEGORIES have POLICIES attached that govern how PRODUCTS are handled.

An example use case:

PRODUCT: Milk
CATEGORY that the product is assigned to: Dairy products
POLICIES assigned: 1. Storage policy 2. Handling and shipping policy

The goal is to be able to attach POLICIES to CATEGORIES and then attach PRODUCTS to the said CATEGORIES so that they would have access to POLICIES and would be handled accordingly.

Now, one of the immediately available solutions online would be to create a reference to the CATEGORY in the PRODUCT document and then create another reference in the CATEGORY document pointing to specfic POLICIES assigned.

However, I cannot seem to find a solution to the following issues:

  1. If an existing POLICY that is referenced inside CATEGORY is modified, how could one make the changes permeate already existing PRODUCTS attached to the CATEGORY and thus inheriting the existing attached POLICIES?

  2. And what if we have thousands or millions of PRODUCTs belonging to a given CATEGORY? What would be a rational way of dealing with the problem 1?

Thanks for the help!

Hello @David_Aames, Welcome to the MongoDB Community forum!

Lets consider this sample data:

product:

{
  name: 'Milk',
  category: 'CAT-A',
  description: 'Cow milk'
}

category:

{
  name: 'CAT-A'
  policies: [ 'P-1', 'P-2' ],
  description: 'Category A'
}

policies:

{
  name: 'P-1',
  description: 'Cold storage',
  other_details: '...'
}

Now, if the policy P-1 changes its description to “Very cold storage”, what happens?

In such scenario, all the products with category CAT-A (which has policy P-1) has the updated policy. So, when you access (or query) this product it will have the updated policy.

The query on the product is a “join” operation on these two collections using the $lookup aggregation pipeline stage.

Hi @David_Aames ,

When I think about it, I don’t see a justification for a category connection collection. What I think is that you can have a category indexed field and add the associated policies to that , potentially embedding low frequncy changed attributes of a policy in that array.

Products

{
 name: 'Milk',
 category: 'Dairy products',
 policies : [ [{ id : 'P-1', name : "Storage"}  , { id :  'P-2' , name : "Shipping" }],
 description: 'Cow milk'
}

Policies

{
  id: 'P-1',
 name. : "Storage",
  description: 'Cold storage',
 
  other_details: '...'
}

Obviously, when you change things on policies, your application will re query the indexed policyId when you need to show it to the user or run flows that require this data. For example when you report on a product to a a supplier you will query a product whithin a category listing available policies. Once they require to look into each policy you query the data from the policies collection with all the latest policy data…

Thanks
Pavel

Hello @Pavel_Duchovny ,
Thank you for your warm welcome and detailed reply. Could you kindly elaborate on:

  1. Why you don’t see a justification for a category connection collection?
  2. Could you please explain what do you mean by
    a) Category indexed field
    b) Adding associated policies to that and embedding low frequency.

Thank you!

@David_Aames ,

I don’t see a reason to have an additional collection just to connect products and policies… If you have this collection you will need to do $lookups in your query and I want to avoid that…

MongoDB does not need to have this relational normalized concepts and we need to get as much data in a document as we can for our queries to just run on a single collection doing indexed access.

  1. Could you please explain what do you mean by
    a) Category indexed field
    b) Adding associated policies to that and embedding low frequency.

So in the products collection if we index the category we can fetch all products related to a specific category easily :

db.products.createIndex( { category : 1 });

db.products.find({ categotry : "Dairy products'"});

For the second comment. If we have fields that should be presented on a product main page/area and they are not highly updated , for example in my opnion a policy name should rarely change therefore it might make sense to have it in the product “policies” array. Another field we can think of is how many policies there are per product, this can be also added to the product document.

Whenever we insert a new policy we create it in the “policies” collection and then we can push it using $push or $addToSet update into a specific product “policies” array to gether with an increase of numberOfPolicies in that same product document…

Thanks
Pavel