What is the best way to implement simple relationships (with query speed in mind)

Stephan_06935 · March 25, 2022, 10:21am

I just want to get some advice on the best way to set up simple references:
I am trying to decide whether I should:

Create 2x collections and have the documents from col1 reference documents in col2 using a $lookup or rather
create 1x collection and embed the reference data within the main document?

Note, the relationship is a one to many relationships. As an example, collection 1 is a “bank” and collection 2 is “cars”. The bank can finance many cars, but a car can only be referenced to one bank.

The reason for my question is in terms of execution speed. If I have 1x bank document but half a million car documents.
When I run a query, will the query be:
a) faster with and array of embedded documents or
b) $lookup (assuming cars is a collection of its own)?

Appreciate any effort and assistance.

Prasad_Saya · March 25, 2022, 1:22pm

Hello @Stephan_06935,

The solution also depends upon what kind of query (or queries) you are thinking about. When there are two related entities and you are querying them - one of the points to consider is what are the important queries.

Another point is that in case the bank is financing millions of cars over a period of time - this is about the many side of the relationship growing all the time. This rules out the embedding the car data within the bank document itself as an array of nested documents.

So, the next point is that you can store the car references only, for example the id value, within the bank document. This is also a case of the same unending array.

That points to another option, that is store the bank information in the car document. This will allow query the cars for specific bank, a car with its bank info, etc. You can make it an extended reference - store not just the reference id, but also include some of the bank information like name, and branch which is often used and not changed often within the car document.