How to set up and configure MongoDB Realm? What partitionValue should I use on the client?

Hello guys!

I’m finally starting to move my app from Realm.io to MongoDB Realm and I’d like to discuss my goals with you and how to reach them properly.

GOAL

Design a MongoDB schema and a Realm configuration for a react-native and GraphQL web app.

DATA MODEL

UserData

Only the owner can read/write.

UserPermission

Only the system can read/write. It’s used for user_custom_data.

Content

Each content has an id. The user can’t create/update/delete any document. The user can read a document only if it’s in a list. This list could be stored in user_custom_data and should not be editable by the user.

A content also contains the total number of likes, which is updated by a trigger on every “Like” (see below) insertion/deletion.

Like

Users can insert documents and delete their own documents.

WHAT I DID UNTIL NOW

I started by creating a new react-native app and installing realm@beta. After struggling a little bit I set everything up and tried some CRUD operations with some random models. Everything worked fine. On Realm Cloud I used the query-based sync model, so I could meet every specification using a single realm and the fine-grained permissions system. Here it seems a little bit different.

I created a collection of Content. Each document has a specific _pid (partition_id).

Then I set up the user_custom_data, creating the UserPermissions collection in the Atlas UI and associating it to the user in the Realm UI. In UserPermissions I have a list (the array “readable_contents”) of ids of the Content documents the user can read.
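To give an idea, a UserPermissions document looks roughly like this (only a sketch: the _id link field and the id values are just examples):

{
  "_id": "5f7e9d1ab2c3d4e5f6a7b8c9",        // the user id this custom data is linked to
  "readable_contents": ["c_001", "c_002"]   // ids of the Content documents this user can read
}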

I set up Sync to use _pid as the partition key. So, as the documentation says, I should be able to use something like this in the Sync rules:

{ "%%user.custom_data.readable_contents" : "%%partition" }
*in the read rules

Perfect! It should work. I have to insert an “or” in the rule to also include every content in the class UserData and every Like the user owns, and I should be good (ok, I also have to write the trigger for syncing the likes to the “total likes” field in the corresponding “Content” document).

So in the end it should be something like this:

{ "%OR" : { "%%user.custom_data.readable_contents" : "%%partition", "%%user.id" : "%%partition", } }
*in the read rules

I don’t even need a function for this. But here is the big problem. When I went back to my react-native app, I was faced with this:

sync: {
  user: user,
  partitionValue:  **AND NOW?**,
} 

Now that I use a rule like that in the Realm Sync panel, what partition value should I use? It seems it gives an error if I omit it, but it accepts only “numbers, strings and objectId” as values, so I can use neither the user.id nor the user_custom_data.readable_contents (which is an array). Anyway, I can’t figure out what partitionValue represents in this Sync rules scenario.

What am I doing wrong?

@Aurelio_Petrone You can only open a realm with a single partitionKey value - however, you can open multiple realms in your client side app - each with a different partitionKey value

@Ian_Ward so this is not the right way of approaching the problem.

I thought I could just do something like this.

  1. Create just two partition values: one is the user id and the other could be “public”.

  2. I still need a way to prevent the user from accessing all the “public content”. So, if I understand correctly how it works, I’d need a rule like this:

{ "%%user.custom_data.readable_contents" : "%%document.id" }
*in the read rules

I tried to use %%root and %%this but in both cases I get an error (that I read from the logs in the admin panel) that says “Don’t know how to expand %%root (or %%this)”.

Thanks for the feedback

@Aurelio_Petrone You are approaching this problem through the lens of legacy Realm Sync with query-based sync. MongoDB Realm does not have QBS; the partitioning strategy of MongoDB Realm is analogous to “full-sync” in legacy Realm.

So it’s impossible to prevent users from reading just a part of the content if the partition key value is the same? What should I do in this case? There are about 3k different contents and I need a different “part of partition” for each user (and there are thousands of them, so a lot of possible “partitions”).

@Ian_Ward

Well, that depends on what your public schema looks like and which queries you are using to seed the QBS client with the legacy realm model. If you’d like to share them I’d be happy to comment. There are workarounds for migrating to the new partition model which I am happy to suggest.

This is the schema I have in mind (at least part of it, but I believe it’s enough for this example).

UserData{
   id : string, // user id
   partitionId : string,  // it's the user id again, since I can choose only one partition key and I still want access by id
   readContents: string[], // when a user reads a content, its id goes here to track the read contents
   name: string // just the user name
}

User can read and write documents in this collection

UserCustomData {
   _id :  string  // the user id, for linking the user custom data to this document
   canReadDocument:  string[] // a list of content ids the user has access to
}

User can only read documents in this collection

Content {
    id : string // The id of the content
    partitionId : string // Partition id, in this case it's "public"
    title:  string // Just the title
}

Users can only read some of the documents. I mean that there should be a realm on the server, but the user should only have access to part of it (like QBS, which I hope to see soon also on MongoDB Realm). The content ids a user has access to are written in user_custom_data.

This way I’d be able to use just two partition values: “public” and “user_id”.

Thanks @Ian_Ward

So there are two workarounds that support a more flexible syncing method in the new MongoDB Realm partition sync model.

One of these is that you can use a different partitionKey field for two separate realms that are opened on the client side. A MongoDB Realm cloud app only allows for a single partitionKey field when you enable sync; however, there is nothing preventing you from creating a second cloud app with a different partitionKey field and connecting it to the same Atlas backend. You would need to clone your configuration over and you would have to auth twice, but this enables some more flexibility. For instance, you can imagine an app where there are salespeople in the field and you only want salespeople to see their own leads and contacts. However, their manager should see an amalgamation of the leads and contacts of all the salespeople that report to them. A manager would log in to the managerCloudRealmApp and use the managerId as the partitionKey, whereas a salesperson would log in to the salespersonCloudRealmApp and use the salespersonId as the partitionKey. For example, your Lead document could look like this -

Lead {
    _id : string // The id of the lead
    salespersonId : string // The id of the salesperson who owns this Lead
    managerId:  string // The id of the manager whom the salesperson reports to
}
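On the client, each role would then open the same data through its own cloud app. Just a sketch - the app ids, the isManager flag, email/password, managerId/salespersonId and LeadSchema are all placeholders here:

// Pick the cloud app and partition based on the user's role (app ids are made up)
const appId = isManager ? "manager-cloud-realm-app-abcde" : "salesperson-cloud-realm-app-abcde";
const app = new Realm.App({ id: appId });

const user = await app.logIn(Realm.Credentials.emailPassword(email, password));

const realm = await Realm.open({
    schema: [LeadSchema],
    sync: {
        user: user,
        // managers partition on managerId, salespeople on salespersonId
        partitionValue: isManager ? managerId : salespersonId,
    },
});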

Another option is to denormalize the data and create copies of the data in each individual user’s realm. You can use Database Triggers - https://docs.mongodb.com/realm/triggers/database-triggers/ - to copy changes from one content document that receives sync changes to any other user’s realm which also has that content document. You can use a contentId or similar as a stable identifier: when changes occur, you query for any other documents that also have that contentId and apply the same changes to them. You can imagine optimizations to this, such as a lookup table or embedding metadata in the content document if needed.

I hope this helps


I think I need both of the workarounds. I think I need:

1 realm, read-only, for the content (it contains a synced copy of the root content stored in a separate collection), partitioned by user_id
1 realm, read/write, for the user_data, partitioned by user_id

So, just to confirm: since I have two different behaviours (read vs. read/write) I need two Realm Apps.

I also have a settings object for the common app settings that I used to store in a collection with a single document. I think it is better to use some kind of service like Firebase Remote Config since we are already using Firebase.

@Aurelio_Petrone I don’t think you need two different Realm Apps; you can just use partitionId for a single cloud app. For your UserData collection you would have partitionId : "user=Ian" and for the Content collection you would have partitionId : "content=Ian". Both are different partitions but they correspond to the same user. You would then duplicate the content as needed. You can see more about this strategy here -
https://docs.mongodb.com/realm/sync/partitioning/#example-partitioning-data-across-groups
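For example, based on the schema you posted above (the values here are made up), the two collections could hold documents like:

// UserData collection - the user's read/write partition
{ "id": "Ian", "partitionId": "user=Ian", "readContents": ["c_001"], "name": "Ian" }

// Content collection - this user's copy of a piece of content, read-only partition
{ "id": "c_001", "partitionId": "content=Ian", "title": "Some title" }

The client would then open two realms, one with partitionValue "user=Ian" and one with "content=Ian".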

@Ian_Ward

May I ask for a bit of clarity on option #1, a different partitionKey field for two separate realms?

Is the suggestion to create two different apps in the console and have them both point to the same Atlas dataset? The Salesperson App has a salesperson_partition_key and the Manager App has a manager_partition_key. So then all the Lead objects would have these associated properties:

class LeadClass: Object {
   @objc dynamic var some_data = "" //whatever data for this lead object
   @objc dynamic var salesperson_partition_key = ""
   @objc dynamic var manager_partition_key = ""
}

Then when a salesperson logs in the salesperson app using their salesperson_partition_key, they only see their data.

When a manager logs into the manager app with their partition key, they can see all of the data from all the salespeople.

Or is there another component where each salesperson has their own distinct Realm? Or something else?

@Jay You got it. It’s the same data just accessed through separate cloud apps based on permissions. So depending on the user’s role, you would have different code paths that injected the correct appId and partitionKey value when opening the realm.

Thanks, in the end I went with just 1 App and 1 partitionKey.

I used a field called “_partition” that could be:

  • “reader=xxxx”, for read-only documents
  • “owner=xxxx”, for read/write documents

*where xxxx is the user id.

So the Sync rules are now

// read rules

{  
  "%%true": {
    "%function": {
      "name": "canRead",
      "arguments": [
        "%%partition"
      ]
    }
  }
}

// write rules

{
  "%%true": {
    "%function": {
      "name": "canWrite",
      "arguments": [
        "%%partition"
      ]
    }
  }
}

These are the two functions:

// canRead

exports = async (partition) => {

  console.log("Evaluating read permissions");

  if(partition == 'owner='+context.user.id || partition == 'reader='+context.user.id){
    return true
  }else{
    return false
  }

}

// canWrite

exports = async (partition) => {

  console.log("Evaluating write permissions");

  if(partition == 'owner='+context.user.id){
    return true
  }else{
    return false
  }

}

In my example I have different collections:

  • contents, a collection of all the contents of a single “project” (or Realm App). This collection is not included in Sync.
  • contents*, a collection of copies of the original documents from the contents collection, each with a partition key whose value is “reader=user_id”. This way the user has read-only access to these documents. It also means there will be n copies of a document, one per user who can read it. For example, if the average number of contents a user can read is 150 and there are 10,000 users, this collection will contain 1,500,000 documents. I’m pretty confident these numbers can be handled pretty well by Atlas. (See the example documents after this list.)
  • users, a collection of user data. Each document has a partition key with the value “owner=user_id”, so users are able to read/write their own data.
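To make the layout concrete, the documents across the three collections look roughly like this (ids and values are made up):

// contents - the master copy, not synced
{ "_id": "c_001", "title": "Lesson 1" }

// contents* - one synced copy per user who can read it
{ "_id": { "$oid": "5f7e9d1ab2c3d4e5f6a7b8c9" }, "_eid": "c_001", "_partition": "reader=user_123", "sync": true, "title": "Lesson 1" }

// users - the user's own read/write data
{ "_id": "user_123", "_partition": "owner=user_123", "privateData": "some private data" }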

The collections “contents” and “contents*” are kept in sync by a Database Trigger. This is the code I used to sync them (it could maybe be written a little better :smile:). It’s simple behavior but it works for me. When a content document is deleted, the synced copies just lose their “sync” flag instead of being deleted too.

 exports = function (changeEvent) {

     const docId = changeEvent.documentKey._id;
     const collection = context.services.get("eDojo").db("Project_1").collection("contents*");

     switch (changeEvent.operationType) {

         // If you make an update from the MongoDB Atlas web GUI the system considers it a "replace" operation,
         // so both cases propagate the new title to every copy that is still flagged as synced.
         case "replace":
         case "update": {

             const query = { "_eid": docId, "sync": true };
             const update = { $set: { title: changeEvent.fullDocument.title, sync: true } };

             collection.updateMany(query, update).then(({ matchedCount }) => {
                 console.log(matchedCount);
             });

             break;
         }

         case "delete": {

             // Don't delete the copies; just mark them as no longer synced.
             const query = { "_eid": docId };
             const update = { $set: { "sync": false } };

             collection.updateMany(query, update);

             break;
         }
     }

 };

Now there shouldn’t be big problems with the trigger either. I read somewhere that the maximum number of triggers that can be executed is 1000/s. Is this true? Can I extend this number by upgrading my plan?

If so, let’s take this example. I could store the number of “likes” directly in the “content” document. When a user likes something, I execute a cloud function that updates the number of likes in the document. So if 10,000 users have access to that content, it means I have to update 10,000 documents in “contents*” to sync that like. If 1000 triggers/s is right, this means 10 seconds. But what if 1000 users do that at the same time? It would generate a very long queue. Should I worry about that or is Atlas capable of handling it?

So, two questions please @Ian_Ward:

  • Is Atlas capable of managing such long queues?
  • If so, should I use this kind of practice, or should I try to design something better (like storing this info in a user-related collection of “content meta”, decreasing the number of documents from N(number of users) x N(number of documents per user) to N(number of users))?

Btw, now I can connect to two realms using this code (where you can also see the schema I’m using). It’s partial.

const credentials = Realm.Credentials.emailPassword("xxx", "xxxx");

const user = await Db.app.logIn(credentials);

const config = {
    schema: [contentsSchema],
    sync: {
        user: user,
        partitionValue: "reader=" + user.id
    },
};

const config2 = {
    schema: [userSchema],
    sync: {
        user: user,
        partitionValue: "owner=" + user.id
    },
};

try {
    Db.realm = await Realm.open(config);
    Db.realm2 = await Realm.open(config2);
} catch (error) {
    console.log("Error:", error.message)
}

The schema:

 export const contentsSchema = {
     name: 'contents*',
     properties: {
         _id: 'objectId?',
         _eid: 'string?',
         _partition: 'string?',
         sync: 'bool?',
         title: 'string?',
     },
     primaryKey: '_id',
 };

 export const userSchema = {
     name: 'user',
     properties: {
         _id: 'string?',
         _partition: 'string?',
         privateData: 'string?',
     },
     primaryKey: '_id',
 };

Hope this helps somebody.


@Ian_Ward @kraenhansen any feedback? Could this sync-via-triggers mechanism work?

We currently cap Triggers at 1k ops/s but we can raise this for an individual app and are looking into raising the overall limit in the future. But one thing you should consider is that you can update a high number of documents in a Trigger, so a single Trigger should be able to update 10k documents.
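As a rough sketch of that fan-out, using the collection and service names from your app above (the contentId field on the Like document and the totalLikes counter are assumptions):

 exports = async function (changeEvent) {
     // Database Trigger on the "likes" collection; only inserts are handled in this sketch.
     if (changeEvent.operationType !== "insert") return;

     const contentId = changeEvent.fullDocument.contentId; // assumed field on the Like document
     const collection = context.services.get("eDojo").db("Project_1").collection("contents*");

     // A single updateMany touches every user's synced copy of this content,
     // so one Trigger invocation can update 10k documents at once.
     await collection.updateMany(
         { "_eid": contentId },
         { $inc: { "totalLikes": 1 } }
     );
 };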

We think that a single trigger should likely be able to update all the documents. It might be simpler to do this if you kept a metadata document for each piece of content that stored a list of users who had it saved. Those metadata documents would only need to be updated/read by Triggers, so partitioning wouldn’t be an issue. This wouldn’t cut down on the number of Triggers – it’s more for simplicity/efficiency of code if you had a single trigger updating all documents based on a new like, or if the data was structured in such a way that it was time consuming/not scalable to get all the users who had liked a certain piece of content within the Trigger.

Depending on the volume of likes you expect, you may want to have a user’s like add +1 to their copy of the content, then add/increment a separate document tracking pending likes for that content, and only fire a Trigger to update all users’ likes when that counter hits certain thresholds. That would cut down on the write amplification quite a bit. It would also cut down on the actual number of triggers by firing less frequently, but the cutdown is only meaningful if there are lots of likes to the same content within a certain time period.
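A rough sketch of that thresholding idea (collection names, Like field names, the totalLikes counter and the threshold value are all just illustrative):

 exports = async function (changeEvent) {
     // Trigger on the "likes" collection; only inserts are handled in this sketch.
     if (changeEvent.operationType !== "insert") return;

     const { contentId, userId } = changeEvent.fullDocument; // assumed fields on the Like document
     const db = context.services.get("eDojo").db("Project_1");

     // 1. Show the like immediately on the liking user's own copy (a single cheap update).
     await db.collection("contents*").updateOne(
         { "_eid": contentId, "_partition": "reader=" + userId },
         { $inc: { "totalLikes": 1 } }
     );

     // 2. Keep an authoritative total plus a "pending" count in a small side document.
     const counter = await db.collection("like_counters").findOneAndUpdate(
         { _id: contentId },
         { $inc: { total: 1, pending: 1 } },
         { upsert: true, returnNewDocument: true }
     );

     // 3. Only fan the authoritative total out to every copy once enough likes have piled up.
     const THRESHOLD = 50; // arbitrary value for the sketch
     if (counter.pending >= THRESHOLD) {
         await db.collection("like_counters").updateOne({ _id: contentId }, { $set: { pending: 0 } });
         await db.collection("contents*").updateMany(
             { "_eid": contentId },
             { $set: { "totalLikes": counter.total } }
         );
     }
 };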
