Retroactively updating related documents when another collection is modified

RENOVATIO · May 30, 2022, 12:29pm

Hello,

Could you kindly advise what would be the best way, performance-wise, to retroactively update documents in one collection when a change is applied to a related field in a different collection?

An example:

If we were to create a new product:

{
    "name": "John Doe",
    "address": "some street,24",
    "groupName": "group-a"
}

Now let’s suppose Mr. John Doe was assigned to group-a because of his address. group-a refers to a document in a separate collection:

{
    "name": "Group A",
    "nameKey": "group-a",
    "settings": []
}

The problem I am experiencing is what to do should the “name” of our Group A be changed, meaning that the field “nameKey” would also be regenerated. For example, if the name is changed to Group B, the nameKey is regenerated to group-b, which means that what is saved on John Doe’s document is outdated and no longer valid.

Could you kindly tell me what approach you would take, please?

Thank you very much

steevej · May 30, 2022, 12:46pm

You simply should not refer to documents from other collections with dynamic data. The _id field is unique, it is indexed and cannot be modified.

You may use a transaction to update both collections, so that changing a referenced document also updates the documents that refer to it.
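As a rough sketch of the transaction approach, assuming PyMongo and a deployment that supports multi-document transactions (a replica set or sharded cluster): the collection names "groups" and "people" and the function name rename_group are made up for illustration, and the field names come from the example documents above.

```python
def rename_group(client, db_name, old_key, new_name, new_key):
    """Rename a group and repoint every document that refers to it.

    `client` is expected to be a pymongo.MongoClient. The collection
    names "groups" and "people" are assumptions, not from the thread.
    """
    db = client[db_name]
    with client.start_session() as session:
        # Both writes commit together; if an exception is raised inside
        # the block, the transaction is aborted and neither is applied.
        with session.start_transaction():
            db["groups"].update_one(
                {"nameKey": old_key},
                {"$set": {"name": new_name, "nameKey": new_key}},
                session=session,
            )
            result = db["people"].update_many(
                {"groupName": old_key},
                {"$set": {"groupName": new_key}},
                session=session,
            )
    # Number of referring documents that were rewritten.
    return result.modified_count
```

It would be called as rename_group(client, "mydb", "group-a", "Group B", "group-b"), where "mydb" is a placeholder database name. An index on groupName in the referring collection would probably be needed to keep the update_many cheap.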

A change stream can also be used to do a delayed update of the referrers.

I think groupName in the first collection would be better named groupNameKey.

RENOVATIO · May 30, 2022, 5:31pm

Thanks for your reply and for the detailed answer. However, I have some comments regarding your first 2 sentences:

You simply should not refer to documents from other collections with dynamic data.

Could you tell me what should be done instead? If 2 documents have their reason to exist separately (e.g. each one serves other components on its own) and yet they both depend on the correctness of each other, what can be done?

The _id field is unique, it is indexed and cannot be modified.

I guess yours is a very good insight here. I could obviate this need by simply using the _id key; however, the more legible group-a is very alluring.

Finally, you mentioned transactions. What about using updateMany instead? Is it a poor choice?

EDIT: In general, I have seen many APIs providing unique names generated from entered names to access resources, as they are easier to remember than database _ids. Based on your extensive experience, how would you say these systems solve the problem of having a name change?

Thank you!

steevej · May 31, 2022, 1:38am

I did not write that you should not have 2 documents. I wrote that one document should not refer to another document using a field that is mutable, dynamic, modifiable. Since the field groupName can be changed, it is not a good choice for maintaining a relation from collection 1 to collection 2. The _id field is a good choice, since it cannot be changed, is unique and is indexed. As you see, making the relation field modifiable causes update difficulties.

updateMany works within a single collection. What you want is for a change to groupName in 1 collection to be reflected in another collection.

The problem of having a name change is not related to the way you get the name. It is related to the fact that your use-case allows it to change. There is no issue in having groupName:group-a from one collection refer to another document in another collection using nameKey:group-a, the issue is

The problem is there because you want to change it and use it as a key. Yes, showing group-a is more user-friendly than showing an _id, but users are not supposed to see your internal structure. Hopefully, when you display a group, you show its name: “Group A” rather than its nameKey: “group-a”.
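To illustrate the idea of relating documents through an immutable id while keeping the readable key for display and lookups, here is a minimal in-memory sketch. Plain Python dicts stand in for the two collections, and the field name groupId and all function names are hypothetical, so treat it as an outline of the design rather than actual MongoDB code.

```python
import re
import uuid


def slugify(name):
    """Derive a readable key such as 'group-a' from 'Group A'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


# Two in-memory "collections", keyed by an immutable id (standing in for _id).
groups = {}
people = {}


def create_group(name):
    group_id = uuid.uuid4().hex
    groups[group_id] = {"_id": group_id, "name": name, "nameKey": slugify(name)}
    return group_id


def create_person(name, address, group_id):
    person_id = uuid.uuid4().hex
    # The person stores the group's immutable id, not its mutable nameKey.
    people[person_id] = {
        "_id": person_id,
        "name": name,
        "address": address,
        "groupId": group_id,  # hypothetical field replacing groupName
    }
    return person_id


def rename_group(group_id, new_name):
    # Only the group document changes; no person document is touched.
    groups[group_id]["name"] = new_name
    groups[group_id]["nameKey"] = slugify(new_name)


def find_group_by_key(name_key):
    """Resolve a readable key (e.g. from a URL) to the group document."""
    return next((g for g in groups.values() if g["nameKey"] == name_key), None)


def group_of(person_id):
    """Follow the person's groupId to the current group document."""
    return groups[people[person_id]["groupId"]]


gid = create_group("Group A")
pid = create_person("John Doe", "some street,24", gid)
rename_group(gid, "Group B")
print(group_of(pid)["nameKey"])  # prints "group-b"
```

With this layout a rename is a single-document update, and a readable key such as group-b is only translated to the id at the boundary (for example when handling a URL), which is one way an API could expose friendly names without using them as the relation.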

RENOVATIO · June 1, 2022, 8:46pm

Thank you for your detailed and quality reply!
