Hello,
I’m currently working on my first full-stack application and I’m building the backend with Node.js + typegoose.
I have a good understanding of the concepts and limits, but I’m really not confident in my abilities and end up questioning every step I take. The MongoDB schema has been one of those things that kept changing and delaying the release of the project, mostly because it’s dynamic and quite easy to remodel.
So the project:
- I will communicate with an open API (the Wargaming API for World of Tanks) and store player statistics for data aggregation and visualisation.
- The API doesn’t provide player session data (daily games played), so this will be calculated and stored as well.
- There are millions of players in each of the 4 clusters (EU, NA, RU, ASIA).
Initially, I thought I would create 1 big document per player and store everything there:
- Player details (~30 fields)
- Player’s tank details (array of max 700 objects that have 15 fields and one nested object)
- Player’s session stats (array of sessions on each update, e.g. it will have 1 object per day if updated daily)
- Historical data (array of objects with 5-6 player fields to keep historical info)
Now I already understand that player sessions need their own collection, but I can’t decide between the 2 models I have in mind:
- One document per player: the document _id is the same as the player _id, and it has a single field, sessions, that contains an array of sessions.
- One document per session: each update creates a new record with its own unique _id, so a player has a bunch of separate session documents instead of an embedded array.
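To make the two options concrete, here is a rough sketch of both shapes in plain TypeScript (no typegoose decorators). The field names `date`, `battles` and `wins`, the `playerId` link and the `playerId:date` style _id are placeholders I made up for illustration, not real API fields.

```typescript
// Hypothetical session shape; field names are placeholders.
interface Session {
  date: string; // ISO day, e.g. "2020-05-01"
  battles: number;
  wins: number;
}

// Option 1: one document per player, _id equals the player id.
interface PlayerSessionsDoc {
  _id: number; // player id
  sessions: Session[]; // gains one entry per update
}

// Option 2: one document per session, linked back to the player.
interface SessionDoc extends Session {
  _id: string; // unique per document (assumed format: "playerId:date")
  playerId: number;
}

// Option 1 update: append to the embedded array
// (in MongoDB this would be a $push on the player's document).
function addEmbedded(doc: PlayerSessionsDoc, s: Session): PlayerSessionsDoc {
  return { ...doc, sessions: [...doc.sessions, s] };
}

// Option 2 update: build a standalone document
// (in MongoDB this would be a plain insert).
function toSessionDoc(playerId: number, s: Session): SessionDoc {
  return { _id: `${playerId}:${s.date}`, playerId, ...s };
}
```

With option 1 the array keeps growing on every update and the whole thing has to stay under MongoDB's 16 MB document limit. With option 2 each document stays small, but reading a player's history means querying by `playerId`.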
The other question is about the player’s tank statistics: should I keep them on the player document, or also move them to their own collection keyed by player_id + tank_id? I ask because a lot of aggregation will be based on this data, and I suspect a nested array within the player collection might not be the best choice.
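As a sketch of what the split would look like, the snippet below turns the embedded tank array into one document per player_id + tank_id pair and runs a simple per-tank total over the result. Again this is plain TypeScript with invented field names (`tankId`, `battles`, `wins`), and using an embedded object as the _id is just one assumed way to model the compound key.

```typescript
// Hypothetical per-tank stats; the real objects have ~15 fields.
interface TankStats {
  tankId: number;
  battles: number;
  wins: number;
}

// Current idea: tanks embedded in the player document.
interface PlayerDoc {
  _id: number; // player id
  tanks: TankStats[];
}

// Alternative: one document per (player, tank) pair.
interface PlayerTankDoc {
  _id: { playerId: number; tankId: number }; // compound key
  battles: number;
  wins: number;
}

// Flatten the embedded array into standalone documents.
function splitTanks(player: PlayerDoc): PlayerTankDoc[] {
  return player.tanks.map(({ tankId, ...stats }) => ({
    _id: { playerId: player._id, tankId },
    ...stats,
  }));
}

// Example aggregation: total battles per tank across all players.
function battlesPerTank(docs: PlayerTankDoc[]): Map<number, number> {
  const totals = new Map<number, number>();
  for (const d of docs) {
    totals.set(d._id.tankId, (totals.get(d._id.tankId) ?? 0) + d.battles);
  }
  return totals;
}
```

With the embedded version, the same per-tank total would need every player's array to be unwound first, which is the part I'm worried about at millions of players.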
Sorry for the long post and thanks for any help!
I have attached a sample “unoptimized” document.