Queries could go both ways: I might need to look up all the players at a club, or all the clubs a player is a member of.
On which side should I store the linked field? Is there a method to it? Is there a means of measuring performance, particularly as datasets get very large?
While it depends somewhat on the queries you’ll be running, my inclination would be to store clubs in player records, but that’s because I’m assuming you’ll have a lot more players in each club than clubs for each player.
As long as things are indexed correctly, it might not matter much either way.
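As a rough sketch of that layout (not a definitive design), here is what the two lookups could look like with the club ids stored on each player. The field name `club_ids`, the sample documents, and the helper names are all made up for illustration; `players` is assumed to be a pymongo `Collection`.

```python
# Hypothetical document shapes; the field name "club_ids" is an assumption.
player = {"_id": "p1", "name": "Dan", "club_ids": ["c1", "c2"]}
club = {"_id": "c1", "name": "Chess Club"}


def players_in_club_filter(club_id):
    # An equality match against an array field matches every document
    # whose array contains that value.
    return {"club_ids": club_id}


def clubs_of_player_filter(player_doc):
    # Look up the club documents by the ids stored on the player.
    return {"_id": {"$in": player_doc["club_ids"]}}


def ensure_membership_index(players):
    # `players` is expected to be a pymongo Collection. Indexing an array
    # field creates a multikey index, which supports players_in_club_filter.
    return players.create_index("club_ids")
```

With real collections this would be used as `players.find(players_in_club_filter("c1"))` and `clubs.find(clubs_of_player_filter(player))`. The second lookup goes through `_id`, which is always indexed, so only the first one needs the extra index.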
Your thinking is similar to mine. But the usage may end up more like WhatsApp, where you are added to dozens, maybe even a three-digit number, of chats (or clubs) per player. Some will manage and prune the list over time. Others will just leave it.
Do you know any good published videos or tutorials around the performance aspects of document design, which would cover indexes? Introductory level, as opposed to guru! Forewarned is forearmed, and it's all part of the learning…
I have actually been through the M320 course, and similar materials.
I wasn’t sure if there were other tutorials or articles specifically related to this question of collection structure and performance, and an introduction to measuring that. Is it “Plans” or something similar?
Hello @Dan_Burt,
There is no definitive best schema. It all depends on how the data is queried and updated.
This article may help
For performance, you can use explain():
to get the winning plan for a query without executing the query (the "queryPlanner" verbosity mode)
to execute the winning plan for a query and get metrics about the execution ("executionStats")
to investigate every candidate plan ("allPlansExecution")
The query optimizer chooses the winning plan empirically: for example, the plan that is fastest to return roughly the first 100 documents is the winner. The winning plan is then cached.
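To make the explain() options above concrete, here is a minimal sketch from Python. It assumes `db` is a pymongo `Database`; the helper names are invented for this example. It runs the server's `explain` command through `db.command` so the verbosity can be chosen explicitly, then pulls out the counters that are usually worth comparing.

```python
def explain_find(db, collection_name, query_filter, verbosity="executionStats"):
    # `db` is expected to be a pymongo Database. This sends the server's
    # explain command for a find with the chosen verbosity:
    # "queryPlanner", "executionStats", or "allPlansExecution".
    return db.command(
        "explain",
        {"find": collection_name, "filter": query_filter},
        verbosity=verbosity,
    )


def summarize_execution(explain_output):
    # Pull the headline counters out of the executionStats section.
    # Docs examined far above docs returned usually points to a missing
    # or poorly matched index.
    stats = explain_output["executionStats"]
    return {
        "returned": stats["nReturned"],
        "keys_examined": stats["totalKeysExamined"],
        "docs_examined": stats["totalDocsExamined"],
        "millis": stats["executionTimeMillis"],
    }
```

A call such as `summarize_execution(explain_find(db, "players", {"club_ids": "c1"}))` (collection and field names hypothetical) can be run before and after creating an index to see how the counters change as the dataset grows.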