Schema design for Ratings system

Piers_Ebdon · September 8, 2020, 7:05pm

Hi Everyone

I am working on an app which has a ratings system, where one set of users can rate a set of players on a per match basis.

I want to be able to aggregate the data from all the users on a per match basis coupled with long term averages for each player.

I am struggling to work out the best way to model this using MongoDB. I have come up with the following solution.

For the averages on a per match basis:
I would have two fields for each player sub document within a match document consisting of totalRatingsScore and totalNumberOfRatings. These would be updated each time a user submits a rating for a player using the $inc operator with +1 for the totalNumberOfRatings and adding the user rating that is between 1 - 10 for the totalRatingsScore. To get the average for that match I could then simply use $avg.

Does this solution scale if in the dream scenario, thousands of people are submitting ratings, that are likely to happen at around the same time after a match finishes?

For the long term averages:
I would have a ratings field on a Player document that would have sub documents consisting of a matchID, totalNumberOfRatings and totalRatingsScore that would be updated in a similar fashion to the per match basis. To get the long term average for the player would then involve calculating the average for each match and then getting the average from all the match averages.
Both scenarios would be updated using a transaction.

What do you think about this solution? How would you tackle it?

Cheers

Joe_Drumgoole · September 9, 2020, 1:59pm

Some questions to think about:

Are there a large number of players per game?
What is the typical ratio of users to players?
What about outliers? Can you have a game with a very large number or users?

The challenge with a document per game or a document per player is if the number of players is large or the number of users is large everyone will be trying to update that document at once which will mean writes are queued up on the document.

Better to have a Ratings collection which captures the follow for each rating:

{
   "user" : <username or id> # The user who is giving a rating
   "game" : <game id> # The game that was being played
  "rating" : <rating value> # rating score
 "player" : <username or id> # rating score
}

Now to get all the ratings for a game you can query on game. To get all the ratings per player you can query on player.

You could just run aggregations on this collection to get averages. As the collection grows you could collection the data for a single game or a single player in a document in another collection.

Piers_Ebdon · September 11, 2020, 1:45pm

HI @Joe_Drumgoole

Firstly thanks for responding to my question. I really appreciate how helpful MongoDB employees and dev advocates are within these forums.

Apologies for the slow response. I am in the process of moving cities within the UK and haven’t had any quality time outside of work to respectfully reply to your response.

Are there a large number of players per game?
So I am talking about a soccer match, which has 11 starting players per team and up to 7 subs, depending on the competition, who can also participate in the game. In my scenario, each match only focuses on a user’s preferred team, so there are up 18 players per game and up tp 30 players who are likelyto play for a team in a whole season.

What is the typical ratio of users to players?
I currently don’t have any users but hope to very soon! Aiming to be out for when iOS 14 is released which I reckon will be the middle of next month.

What about outliers? Can you have a game with a very large number or users?
There could be a large numbers of users for each game. I am building a platform where the people creating the games already have large social media followings which range from tens of thousands to hundreds of thousands. I haven’t got anyone on board yet but theoretically there could be a large number.
This is my biggest concern, which may turn out to be a complete pipe dream, but I would really lke to be able to structure the database correctly to handle this, as in my day job I am an iOS develper and so don’t have commerical experience handling backends and databases.

In regards to your answer of having a ratings collection, one of the calls I need to be able make is to be able to fetch the long term average for every player within a team who has been rated, from every user who follows that team for the particular person who has created that match.
Would it be possible and efficient to fetch this within a potentially large number of documents, say 1m +?
This seems like a quick win but as someone who is fairly new to MongoDB, how easy would it be to then move away from this structure if it is no longer performant?

Joe_Drumgoole · September 11, 2020, 2:59pm

So every player has a a list of ratings. Those ratings are created by users and appended to the ratings collection.

To calculate a rolling average for each player you just need the current average and the previous number of entries. So a change stream on the ratings could update the player document with the current new average and the fields required to calculate the next rolling average.

That should work for millions of entries and peformance is capped by how fast new rattings are added. I would suggest you collect the change streams entries and update the averages once a second, a minute an hour. The period can be controlled by the load and the number of ratings arriving.

Piers_Ebdon · September 17, 2020, 7:35pm

Hi Joe

This sounds great, although I haven’t used a change stream before and will need to dig into that.

I’ve got the whole weekend to give this a go and will let you know how it goes.

Thanks again for the help

Piers_Ebdon · September 21, 2020, 6:00pm

Hey @Joe_Drumgoole

So I was able to spend a bit of time looking into using change streams over the weekend just gone.

Like you said I just need a change stream to watch inserts on a ratings collection. I just want to confirm, the change stream would be setup when my app begins and is not triggered by an api call right?

Joe_Drumgoole · September 21, 2020, 7:53pm

You have to call watch to setup a change stream. Then you loop over the cursor getting change events. Make sure you keep track of the resume token so when you restart you start where you finished.