How would one design a database to model a dating App like Tinder?

In a dating app like Tinder, users swipe left and right to indicate whom they have rejected and whom they have liked.

The problem is that most users will easily accumulate thousands of rejects and thousands of likes.

How would you store these many-to-many relationships? For most users, they will not fit in one document, so the outlier pattern doesn't apply here.

Hi @Big_Cat_Public_Safety_Act ,

The outlier pattern may still apply, via a matches collection where you keep the id of the swiping user and the array of users they swiped on in the past x seconds. Since users of apps like Tinder can swipe very fast, we won't want to do an insert or update call on each swipe; instead, we either accumulate swipes in an array or write them as separate match documents into a matches collection.

{
  matcher: "id1",
  matched: [ "id2", "id5", ... ],   // up to 100 ids, or the last 10s of swipes
  matchSeq: 1,
  writeDate: ISODate("2023-01-10"),
  ...
}
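
For the array-accumulation option, a minimal sketch of the flush could look like this (collection and field names follow the example document above; the 100-id cap and the flush cadence are assumptions):

// Buffer swipes in application memory and flush one bucket document
// per interval instead of issuing one write per swipe.
const buffer = [];

function onSwipe(matchedId) {
  buffer.push(matchedId);
}

function flush(matcherId, seq) {
  if (buffer.length === 0) return;
  db.matches.insertOne({
    matcher: matcherId,
    matched: buffer.splice(0, 100),   // drain at most 100 ids per bucket
    matchSeq: seq,
    writeDate: new Date()
  });
}

// e.g. call flush() every 10 seconds, or whenever the buffer reaches 100 ids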

Alternatively, hold a small match document per pair:

{
  matchId: "aaa",
  match: [
    { matcher: "id1", matched: "id5" },
    { matcher: "id5", matched: "id1" }
  ],
  matchedDate: ...,
  isMatched: true
}

With the small match documents, I would likewise write them periodically, using a bulk upsert.
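
A rough sketch of that periodic bulk upsert (pendingMatches is an assumed in-memory list of pair documents shaped like the example above):

// Flush the accumulated pair documents in one unordered bulk call.
db.matches.bulkWrite(
  pendingMatches.map(m => ({
    updateOne: {
      filter: { matchId: m.matchId },
      update: {
        $setOnInsert: { match: m.match, matchedDate: new Date() },
        $set: { isMatched: m.isMatched }
      },
      upsert: true
    }
  })),
  { ordered: false }
);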

You can find a mutual match using an upsert: check whether a document where the matched user is me already exists, and if not, create one with $setOnInsert.
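
One way to read that upsert, as a sketch (ids are placeholders, and flat matcher/matched fields are used here for simplicity; assume myId just liked otherId):

// If otherId already swiped on me, mark the match as mutual;
// otherwise $setOnInsert records my swipe as a pending document.
const res = db.matches.updateOne(
  { matcher: otherId, matched: myId },
  { $set: { isMatched: true, matchedDate: new Date() } }
);
if (res.matchedCount === 0) {
  db.matches.updateOne(
    { matcher: myId, matched: otherId },
    { $setOnInsert: { isMatched: false, matchedDate: new Date() } },
    { upsert: true }
  );
}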

Obviously, index the query predicates, and consider a TTL index to expire old matches.
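
For example (the matches collection name and the 90-day window are assumptions):

// Index the lookup predicates used above.
db.matches.createIndex({ matcher: 1, matched: 1 });
// TTL index: automatically delete match documents ~90 days after matchedDate.
db.matches.createIndex({ matchedDate: 1 }, { expireAfterSeconds: 7776000 });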

Thanks
Pavel

So then we have to put all of the match data in a collection separate from the users collection, correct?

I would vote for that, yes.

But this is based on the design I have in mind for the CRUD operations and queries of a system like that.

You might find something else useful; there is no one model that fits all…

So when the user logs in and the app queries for a list of suggestions, it must determine the lists of rejected and matched users so that the query won't show them again. It must do something like:

db.col.aggregate([
  {
    $match: {
      userID: { $nin: [ /* already rejected or already matched users */ ] }
    }
  }
])

Since the array, [ /* already rejected or already matched users */ ], is expected to contain thousands of ids and can easily exceed the 16MB limit, what is the solution to this problem that this query must handle?

Hi @Big_Cat_Public_Safety_Act ,

There are several ways to achieve that. What I prefer is for the API endpoint that delivers the batch of profiles to render in the application to run a two-phase query:

var matchedProfiles = db.matches.find({ matcher: myId })

var profiles = db.profiles.aggregate([
  // $geoNear must be the first stage (the aggregation equivalent of $near)
  { $geoNear: { near: /* geolocation ... */, distanceField: "distance", query: { gender: /* ... */ } } },
  // sort on a compound index with the geo one to add some randomisation and a better score
  { $sort: { lastModified: -1, score: 1 } },
  { $limit: 1000 }
])

// Optimization: turn matchedProfiles into a dictionary keyed by matched id

profiles = profiles.filter(p => !matchedProfiles[p.userID])

while (profiles.length < 200) {
  // randomise the sort and search again
  profiles = profiles.concat(/* search again */)
}

Run an async process to get the next batch ready for the next access.

Send 200 results back.


So in this process we first fetch the list of users to avoid, then we get the profiles based on our general location, with some index-based sorting for fundamental smart randomisation.

Only then do we limit to, let's say, 1000 results, while the app needs only 200 for the first init batch.

We filter out matched or rejected users and pass 200 back. In case we miss the 200 mark, we run another randomised search. This can be done in server memory or in some cache service; it's not a complex task for modern languages.
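
As a minimal in-memory sketch of that filter-and-pad step (field names follow the earlier examples; searchAgain() is a hypothetical helper that re-runs the aggregation with a different sort seed):

// Build a lookup of users to exclude, filter, and pad up to 200.
const excluded = new Set(matchedProfiles.toArray().map(m => m.matched));
let batch = profiles.toArray().filter(p => !excluded.has(p.userID));

let tries = 0;
while (batch.length < 200 && tries++ < 3) {
  batch = batch.concat(searchAgain().filter(p => !excluded.has(p.userID)));
}
batch = batch.slice(0, 200);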

Once we decide to return, we already asynchronously prepare the next batch for when the user scrolls through the 200 users (probably not within the next few seconds, hopefully).
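
In a Node.js API server, that prefetch might look roughly like this (buildBatch() and the cache client are hypothetical):

// After responding, prepare the next batch in the background so it is
// ready before the user exhausts the current 200 profiles.
setImmediate(async () => {
  const next = await buildBatch(myId, excluded);   // wraps the two-phase query above
  await cache.set("nextBatch:" + myId, JSON.stringify(next));
});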

Ty
Pavel

To be honest, I believe the really big systems have sophisticated caching mechanisms or even dedicated caching stores.

So they might do this filtering there, or even pre-prepare candidates for you once you log out, for a fast login next time. And if you change location drastically, they can buy time by prompting you to confirm your location while preparing a new batch of users for you.

Thanks