Unsure about the efficacy of $lookup versus embedding for my scenario

Ben_Gibbons · July 25, 2022, 9:20am

I have a scenario where my app displays widgets, each of which exists 1 to 1 with a user enrolment. Each widget is also linked 1 to 1 with a piece of learning material. If I reference the material with a $lookup, (A) sharding becomes impossible and (B) searches on these widgets will be slow (an alternative would be populate, but that seems less efficient with larger sets).
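For readers following along, the referenced design described above could be sketched roughly as the aggregation pipeline below. This is only an illustration: the collection name `materials` and the field names `userId` and `materialId` are assumptions, since the thread does not show the actual schema. The pipeline is built as plain Python data so it can be inspected without a running database.

```python
from typing import Any, Dict, List


def widgets_with_material_pipeline(user_id: str) -> List[Dict[str, Any]]:
    """Build a pipeline that joins one user's widgets to their learning material.

    Collection and field names here are hypothetical placeholders.
    """
    return [
        # Filter to one user's widgets first so the join touches fewer documents.
        {"$match": {"userId": user_id}},
        # Join each widget to the material document it references.
        {"$lookup": {
            "from": "materials",
            "localField": "materialId",
            "foreignField": "_id",
            "as": "material",
        }},
        # The relation is 1 to 1, so flatten the single-element array.
        {"$unwind": "$material"},
    ]


pipeline = widgets_with_material_pipeline("user-123")
```

With PyMongo, a pipeline like this would typically be passed to `db.widgets.aggregate(pipeline)`, where `widgets` is again an assumed collection name.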

Alternatively I can embed the learning cover image / description and control details into the widget documents which will speed up the search but slow down the generation of the document and result in lots of redundancy and the need to keep this nested data from being stale. This also seems poor. Are these really the only two routes to go or am I missing something?

Thanks

kevinadi · July 28, 2022, 6:24am

Hi @Ben_Gibbons

Are these really the only two routes to go or am I missing something?

It’s hard to say without some example documents and an idea of how you’re expecting them to grow over time.

There are advantages & disadvantages for both approaches, which are discussed in these links:

Alternatively I can embed the learning cover image / description and control details into the widget documents which will speed up the search but slow down the generation of the document and result in lots of redundancy

Perhaps a question to answer is: which operation is more common? If you’re expecting search to be >90% of the operations, then there’s a definite advantage in embedding, with the tradeoff of more housekeeping. About housekeeping: are you expecting to process a lot of documents (i.e. millions) to keep them up to date? If it’s not too many, it might be ok.
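To make the housekeeping cost concrete, here is a minimal sketch of what refreshing the embedded copies might look like when a material changes. It assumes each widget embeds the material under a `material` subdocument with `_id`, `coverImage` and `description` fields; those names are hypothetical and not taken from the original schema.

```python
from typing import Any, Dict, Tuple


def embedded_refresh(material: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return a (filter, update) pair that refreshes one material's embedded copies.

    Field names are hypothetical placeholders for the duplicated data.
    """
    # Match every widget that embeds this material.
    filter_doc = {"material._id": material["_id"]}
    # Overwrite only the duplicated display fields, leaving the rest of the widget alone.
    update_doc = {"$set": {
        "material.coverImage": material["coverImage"],
        "material.description": material["description"],
    }}
    return filter_doc, update_doc


filter_doc, update_doc = embedded_refresh(
    {"_id": "mat-1", "coverImage": "cover-v2.png", "description": "Updated intro"}
)
```

With PyMongo, the pair would typically be passed to `db.widgets.update_many(filter_doc, update_doc)`. The number of widgets matched by the filter is the figure to estimate when judging whether this housekeeping is affordable.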

Thus I would say that there’s not really one correct answer here. You might find that embedding works well for your use case, while I could have an almost identical schema design and find that my day-to-day workload doesn’t support embedding so well. I believe no one can answer this for you definitively, since it largely depends on your specific situation and your workload projection.

Best regards
Kevin

Ben_Gibbons · July 29, 2022, 9:42am

The issue is that the data will see heavy search usage AND semi-regular editing. Additionally, the embedding causes a large upload latency, to the point where I am having to use service workers to handle part of the endpoint. This seems like such a weird constraint. Why not just have decent lookup functionality?