One-to-One Relationship: Embedded Document vs Rolled-Up Data Elements in the Same Collection

In a document, if I have a one-to-one relationship with a specific data set, there are two ways we can define the collection:

  1. Option 1
{
	_id: ObjectId("6159e58a11f8dcbed7883d56"),
	CustomerID: 123456,
	FirstName: 'John',
	LastName: 'Smith',
	MiddleName: 'Jr.',
	UserType: 'Business',
	address: [
	  {
		addresstype: 'Primary',
		address1: '33rd Street',
		city: 'New York',
		state: 'NY',
		zipcode: '45875'
	  }
	],
	createduser: 'system',
	createddate: ISODate("2021-10-03T17:16:58.748Z"),
	updateduser: 'system',
	updateddate: ISODate("2021-10-03T17:16:58.749Z")
}
  2. Option 2
{
	_id: ObjectId("6159e5ae11f8dcbed7883d57"),
	CustomerID: 123456,
	FirstName: 'John',
	LastName: 'Smith',
	MiddleName: 'Jr.',
	UserType: 'Business',
	addresstype: 'Primary',
	address1: '33rd Street',
	city: 'New York',
	state: 'NY',
	zipcode: '45875',
	createduser: 'system',
	createddate: ISODate("2021-10-03T17:17:34.507Z"),
	updateduser: 'system',
	updateddate: ISODate("2021-10-03T17:17:34.507Z")
}

I understand that how you model a one-to-one relationship is up to you, as both ways work fine. The use case for an address subdocument in Option 1 could be to make the data look cleaner and more segregated.

When it comes to using the address data in an aggregation, you just need to use dot notation in Option 1 to get at your data elements.
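As a rough sketch of that difference, the two pipelines below select customers in New York and project the city for each shape. The collection name `customers` and the variable names are assumptions for illustration only; the stages use the standard `$match`, `$project` and `$arrayElemAt` operators.

```javascript
// Option 1: address is an array of subdocuments, so dot notation reaches into it.
// In an aggregation expression, "$address.city" yields an array of cities,
// hence $arrayElemAt to pick the first (and only) one.
const option1Pipeline = [
  { $match: { "address.city": "New York" } },
  { $project: { _id: 0, CustomerID: 1, city: { $arrayElemAt: ["$address.city", 0] } } }
];

// Option 2: the address fields sit at the top level of the document.
const option2Pipeline = [
  { $match: { city: "New York" } },
  { $project: { _id: 0, CustomerID: 1, city: 1 } }
];

// In mongosh this would be run as, for example (collection name assumed):
//   db.customers.aggregate(option1Pipeline)
```

Either pipeline should return the same logical result for the sample documents above; only the field paths differ.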

So my main question is: if we have a huge number of documents in this collection, does Option 1, with the address as a subdocument, have any downside whatsoever in terms of how MongoDB stores the data internally and how easy it is to access the data?

Or does this have no bearing at all, so that either way works exactly the same in all aspects?

Thanks

Hi @Vikram_Bade ,

Great question. When a user is expected to have only one address, it doesn’t really matter much whether it is stored as:

  • separate field/s
  • nested object
  • nested array object.

The only difference in that case is that an index on a field inside an array is a multikey index, which has its own advantages and limitations. An index on a field of a plain nested object is a regular index.
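To make the multikey point concrete, here is a conceptual sketch in plain JavaScript. It is not how MongoDB implements indexes; the helper `indexedValues` is hypothetical and only illustrates that an indexed path through an array needs one entry per array element, while a flat field or a nested object needs one entry per document.

```javascript
// Hypothetical helper: list the values an index on `path` would have to
// cover for a single document. Arrays along the path are expanded, which
// mirrors the "one index key per array element" idea behind multikey indexes.
function indexedValues(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    current = current
      .flatMap(value => {
        if (value === null || typeof value !== "object") return [];
        const next = value[key];
        return Array.isArray(next) ? next : [next];
      })
      .filter(value => value !== undefined);
  }
  return current;
}

// Array of subdocuments: one entry per element.
const withArray = { address: [{ city: "New York" }, { city: "Boston" }] };
// Plain nested object and flat field: a single entry.
const withObject = { address: { city: "New York" } };
const flat = { city: "New York" };

const arrayEntries = indexedValues(withArray, "address.city");
const objectEntries = indexedValues(withObject, "address.city");
const flatEntries = indexedValues(flat, "city");
```

Note that MongoDB treats an index as multikey as soon as an indexed field holds an array, even when the array has a single element, as in Option 1 above.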

Having said that, if a user is expected to have several addresses over the life of a document, which is highly likely, keeping an array of addresses makes much more sense in every respect.

An embedded array is mostly useful for defining a one-to-few relationship (up to several hundred subdocuments).

I suggest reading:

Thanks
Pavel

Thanks @Pavel_Duchovny for your response.

I guess address was the wrong example, as we can have multiple addresses. :)

I was referring more to data elements that are one-to-one, and to the advantages, disadvantages, or lack of difference in how MongoDB works with the types of documents below.

These are better examples:

    {
        _id: ObjectId("615aa51fa7ee0b0fbcd8c0c2"),
        CustomerID: 123456,
        FirstName: 'John',
        LastName: 'Smith',
        MiddleName: 'Jr.',
        UserType: 'Business',
        moreinfo: [
          {
            gender: 'Male',
            age: 46,
            dateofbirth: ISODate("1975-08-01T00:00:00.000Z")
          }
        ],
        createduser: 'system',
        createddate: ISODate("2021-10-04T06:54:23.329Z"),
        updateduser: 'system',
        updateddate: ISODate("2021-10-04T06:54:23.329Z")
      }


    {
        _id: ObjectId("615aa572a7ee0b0fbcd8c0c3"),
        CustomerID: 123456,
        FirstName: 'John',
        LastName: 'Smith',
        MiddleName: 'Jr.',
        UserType: 'Business',
        gender: 'Male',
        age: 46,
        dateofbirth: ISODate("1975-08-01T00:00:00.000Z"),
        createduser: 'system',
        createddate: ISODate("2021-10-04T06:55:46.644Z"),
        updateduser: 'system',
        updateddate: ISODate("2021-10-04T06:55:46.644Z")
      }

My thought was about whether the above options make any difference in terms of accessibility, replication, storage, etc.

@Vikram_Bade, there are no obvious benefits to embedding for single-field data.

It is more of a readability or programming consideration. For example, if the data is embedded and you need to pass it to another component, you can pass it as a whole object: someFunction(doc.moreinfo).
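A small sketch of that convenience, in plain JavaScript: `describePerson` is a hypothetical stand-in for someFunction, and the documents are trimmed versions of the two examples above. For simplicity `moreinfo` is shown here as a single object; with the one-element array from the example you would pass `doc.moreinfo[0]` instead.

```javascript
// Hypothetical consumer that only cares about the personal details.
function describePerson(info) {
  return `${info.gender}, age ${info.age}`;
}

// Embedded shape: the details travel together as one object.
const embeddedDoc = { CustomerID: 123456, moreinfo: { gender: "Male", age: 46 } };
const fromEmbedded = describePerson(embeddedDoc.moreinfo);

// Flat shape: the caller has to pick the relevant fields out by name.
const flatDoc = { CustomerID: 123456, gender: "Male", age: 46 };
const fromFlat = describePerson({ gender: flatDoc.gender, age: flatDoc.age });
```

Both calls produce the same result; the embedded shape just saves the caller from knowing which top-level fields belong together.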

You see my point?

Best regards,
Pavel

@Pavel_Duchovny Understood, and that makes sense. I wanted to be sure I am not missing anything. Thanks for your time.

I would like to add my 2 cents by recommending that you read Building with Patterns: The Attribute Pattern | MongoDB Blog.

Consider applying it to the createduser, createddate, updateduser and updateddate fields, which would become:

"history" :
[
 { "action" : "create" ,
   "user" : "system" ,
   "date" : ISODate("2021-10-04T06:55:46.644Z")
 } ,
 { "action" : "update" ,
   "user" : "system" ,
   "date" : ISODate("2021-10-04T06:55:46.644Z")
 }
]

Advantages are:

  1. a single multikey index can serve all queries; otherwise you need one index on createduser, createddate and another on updateduser, updateddate.
  2. it is easier to add an action type (like “delete”) without changing the schema or the indexes.
  3. queries like "what did user steevej do to my documents" become simpler:
    { "history.user" : "steevej" } lets you extract all documents I created or updated. Otherwise you have to query both updateduser and createduser, and your query has to change the day you add a new action type.
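The query side of this can be sketched as follows. The filter objects use the standard dot-notation and `$elemMatch` query syntax; the sample documents, CustomerID values and variable names are made up for illustration, and the in-memory `filter` calls are only a plain JavaScript stand-in to show which documents each filter is meant to select.

```javascript
// Sample documents using the "history" array shape shown above (dates omitted).
const docs = [
  { CustomerID: 1, history: [{ action: "create", user: "system" },
                             { action: "update", user: "steevej" }] },
  { CustomerID: 2, history: [{ action: "create", user: "steevej" },
                             { action: "update", user: "system" }] },
  { CustomerID: 3, history: [{ action: "create", user: "system" }] }
];

// Filter for find(): any action at all by steevej.
const touchedBySteevej = { "history.user": "steevej" };

// Filter for one specific action by that user. $elemMatch ties both
// conditions to the same array element.
const updatedBySteevej = {
  history: { $elemMatch: { action: "update", user: "steevej" } }
};

// In-memory equivalents of the two filters.
const touched = docs
  .filter(d => d.history.some(h => h.user === "steevej"))
  .map(d => d.CustomerID);
const updated = docs
  .filter(d => d.history.some(h => h.action === "update" && h.user === "steevej"))
  .map(d => d.CustomerID);
```

With the flat createduser/updateduser fields, the first filter would instead need an `$or` over both fields, and it would grow again with each new action type.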

Thanks @steevej for the advice and the article link. Much appreciated.