Migrating from Legacy Realm Sync to MongoDB Realm Guide

All the objects have primary keys - I left that out in the examples. We use a base class as shown below.

It is full sync.

class BaseObject: Object {

   @objc dynamic var id = UUID().uuidString
   @objc dynamic var tstamp: Date = Date()

   override static func primaryKey() -> String? {
      return "id"
   }
}

Is there any plan to provide additional guidance for migrating existing data?

As mentioned in one of my previous posts, there is no explanation of how one would migrate existing data if there are any relationships. The trivial example shows the migration of a single table that has no relationships.

How is one supposed to migrate existing data and existing relationships? Could you include an example of how to migrate two tables, for example a simple master-detail relationship like an invoice with line items?

Am I correct in assuming that one would have to write custom scripts to migrate each object and each relationship?

And if so, what is the correct way to create the relationships in MongoDB? For example, how does one recreate the RealmSwift List property in MongoDB?

@Duncan_Groenewald Relationships are stored as the _id value of the target document cast as a string. A List property would be an array of these _ids - if you are interested in seeing how this would look, I’d recommend going through a tutorial to familiarize yourself with some of the concepts.
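For example, a hypothetical Invoice with a List of LineItem objects (echoing the master-detail question above) would be persisted in Atlas roughly like this - the ids and field names here are made up for illustration:

// Invoice document - lineItems is an array of the linked LineItem _id strings
{
    "_id": "9c4f...",
    "_partition": "default",
    "number": 1001,
    "lineItems": ["b7a2...", "c3d9..."]
}

// LineItem document
{
    "_id": "b7a2...",
    "_partition": "default",
    "description": "Widget",
    "qty": 2
}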

@Ian_Ward - I must be missing something - that tutorial doesn’t really tell me anything new.

  1. Is there any way to generate the MongoDB schema from an existing RealmSwift client schema? If so, how? If not, does this mean one has to manually generate a MongoDB schema that corresponds with the client schema?

  2. Similarly, is there any way to load data into MongoDB from an existing realm file? For example, open a local realm file as well as a synced realm and copy objects to, or create new objects in, the synced realm? Although it is not clear how one might do that, given it would not be possible to have two versions of the same Realm object in the same application - one being the original object and the other being the object with the new “_id” primary key. Unless the document or table could somehow be renamed in MongoDB after the data has been loaded.

@Duncan_Groenewald If you ran the tutorial and then inspected the way the data is persisted using the Atlas collection viewer you would see how relationships are persisted in Atlas. Based on your previous question, I believe this would be something new for you?

You can use Development Mode to instantiate your Sync schema on the server side from your client-side schema - https://docs.mongodb.com/realm/sync/enable-development-mode
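For example, with Development Mode enabled, simply opening a synced realm with your client-side models is enough to create the matching server-side schema. A minimal sketch in Realm JS - the App ID, model and partition value are placeholders:

const Realm = require('realm');

const InvoiceSchema = {
    name: 'Invoice',
    primaryKey: '_id',
    properties: {
        _id: 'string',
        number: 'int',
    },
};

async function main() {
    const app = new Realm.App({ id: 'myapp-abcde' }); // placeholder App ID
    const user = await app.logIn(Realm.Credentials.anonymous());

    // With Development Mode on, this open call instantiates/updates the
    // server-side schema to match the client-side schema passed here.
    const realm = await Realm.open({
        schema: [InvoiceSchema],
        sync: { user: user, partitionValue: 'default' },
    });

    realm.close();
}

main().catch(console.error);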

I did the same, but doing this throws an exception on save if the partition is not set explicitly in code. I would prefer the partition to be inferred automatically on save. Is that possible?

This may not address that question, but based on prior posts and our current app, you don’t need to explicitly set the partitionKey property on the object in code. You can actually omit the partition key from the object completely and it will automatically be assigned when written, based on the partition key specified when opening the realm.

This behavior is not outlined very well in the documentation, IMO. For example, this task object has no partitionKey (in Swift):

class TaskClass: Object {
    @objc dynamic var _id: ObjectId = ObjectId.generate()

    @objc dynamic var name = ""
    @objc dynamic var status = TaskStatus.Open.rawValue
}

and when written to Realm, it has the required _partitionKey automatically populated:

let task = TaskClass()
task.name  = "Task No Partition Property"

and when written, the resulting document in Atlas has the _partitionKey field populated with the partition value the realm was opened with.
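For what it’s worth, the same behavior can be sketched in the JS SDK - the model omits the partition field entirely and the written document still comes out stamped with the partition value used to open the realm (the schema and partition value here are hypothetical):

const TaskSchema = {
    name: 'TaskClass',
    primaryKey: '_id',
    properties: {
        _id: 'objectId',
        name: 'string',
        // note: no partition key property declared here
    },
};

// assuming `app` is a Realm.App with a logged-in currentUser, as elsewhere in this thread
const realm = await Realm.open({
    schema: [TaskSchema],
    sync: { user: app.currentUser, partitionValue: 'myPartition' },
});

realm.write(() => {
    // The synced document in Atlas gets its partition key from the
    // partitionValue the realm was opened with.
    realm.create('TaskClass', {
        _id: new Realm.BSON.ObjectId(),
        name: 'Task No Partition Property',
    });
});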

@Ian_Ward - Can you tell me if the following approach for converting existing Realm Cloud data is likely to work:

  1. All objects have the property ‘id’: String as the primary key.

  2. We are converting all objects using a modified version of the original script for loading a local realm file into Realm Cloud, except we are creating another local file and converting the ‘id’ property to ‘_id’. This appears to work fine, and we ignore any objects with a ‘__’ prefix, e.g. ‘__Class’, ‘__User’…

    var target_realm_schema = JSON.parse(JSON.stringify(source_realm_schema));

    target_realm_schema.forEach((objSchema) => {
        for (var key in objSchema.properties) {
            const prop = objSchema.properties[key];

            // Rename the 'id' primary key to '_id', skipping system classes ('__' prefix)
            if (key === 'id' && prop.type === 'string' && objSchema.name[0] !== '_') {
                var target_prop = JSON.parse(JSON.stringify(prop));
                target_prop.name = '_id';
                target_prop.mapTo = '_id';
                objSchema.primaryKey = '_id';
                objSchema.properties['_id'] = target_prop;
                delete objSchema.properties['id'];
            }
        }
    });
    

and then copy the id value to the _id property

        if (key === 'id') {
            copy['_id'] = obj[key];
        } else {
            copy[key] = obj[key];
        }
  3. We convert the RealmSwift client to use ‘_id’ instead of ‘id’ and open the realm file created in step 2. The app seems to work fine.

  4. Now, using the same script for loading Realm Cloud, we modify it to do essentially the same thing, copying the data into a v10.1.1 synced Realm.

We seem to have run into a problem with the system tables with the ‘__’ prefix since they don’t have ‘_id’ primary keys - are these required for the v10.1.1 SDK, or can we remove them from the v10.1.1 schema?

Regards

OK, I have removed the system tables from the schema and now get past the schema validation. The schema now appears in MongoDB Atlas, but there appears to be no data in Atlas.

Opening the realm and querying data from the node.js client returns data.

Any idea why data is not visible in Atlas?

Script below:

const Realm = require('realm');


// Copy the object properties
// If it is an object then set the property to null - we update this in second cycle
// If it is a list then set the list to empty - we update these in second cycle 
var copyObject = function(obj, objSchema, targetRealm) {
    const copy = {};
    for (var key in objSchema.properties) {
        const prop = objSchema.properties[key];
        if (!prop.hasOwnProperty('objectType')) {
            copy[key] = obj[key];
        }
        else if (prop['type'] == "list") {
            copy[key] = [];
        }
        else {
            copy[key] = null;
        }
    }

    // Add this object to the target realm
    targetRealm.create(objSchema.name, copy);
}

var getMatchingObjectInOtherRealm = function(sourceObj, source_realm, target_realm, class_name) {
    const allObjects = source_realm.objects(class_name);
    const ndx = allObjects.indexOf(sourceObj);

    // Get object on same position in target realm
    return target_realm.objects(class_name)[ndx];
}

var addLinksToObject = function(sourceObj, targetObj, objSchema, source_realm, target_realm) {
    for (var key in objSchema.properties) {
        const prop = objSchema.properties[key];
        if (prop.hasOwnProperty('objectType')) {
            if (prop['type'] == "list") {
                var targetList = targetObj[key];
                sourceObj[key].forEach((linkedObj) => {
                    const obj = getMatchingObjectInOtherRealm(linkedObj, source_realm, target_realm, prop.objectType);
                    targetList.push(obj);
                });
            }
            else {
                // Find the position of the linked object
                const linkedObj = sourceObj[key];
                if (linkedObj === null) {
                    continue;
                }

                // Set link to object on same position in target realm
                targetObj[key] = getMatchingObjectInOtherRealm(linkedObj, source_realm, target_realm, prop.objectType);
            }
        }
    }
}

var copyRealm = async function(app, local_realm_path) {
    // Open the local realm
    const source_realm = new Realm({path: local_realm_path});
    const source_realm_schema = source_realm.schema;

    // Create the new realm (with same schema as the source)

    var target_realm_schema = JSON.parse(JSON.stringify(source_realm_schema));
    console.log("target: ", target_realm_schema)
    // Remove system tables and other classes we don't want to migrate.
    // Guard against findIndex returning -1 so we never splice the wrong entry.
    ['BindingObject', '__Class', '__Permission', '__Realm', '__Role', '__User'].forEach((name) => {
        const ix = target_realm_schema.findIndex(v => v.name === name);
        if (ix !== -1) target_realm_schema.splice(ix, 1);
    });

    console.log("target: ", target_realm_schema)
    
    const target_realm = await Realm.open({
        schema: target_realm_schema,
        sync: {
          user: app.currentUser,
          partitionValue: "default",
        },
      });

    target_realm.write(() => {
        // Copy all objects but ignore links for now
        source_realm_schema.forEach((objSchema) => {
            console.log("copying objects:", objSchema['name']);
            const allObjects = source_realm.objects(objSchema['name']);

            allObjects.forEach((obj) => {
                copyObject(obj, objSchema, target_realm)
            });
        });

        // Do a second pass to add links
        source_realm_schema.forEach((objSchema) => {
            console.log("updating links in:", objSchema['name']);
            const allSourceObjects = source_realm.objects(objSchema['name']);
            const allTargetObjects = target_realm.objects(objSchema['name']);

            for (var i = 0; i < allSourceObjects.length; ++i) {
                const sourceObject = allSourceObjects[i];
                const targetObject = allTargetObjects[i];

                addLinksToObject(sourceObject, targetObject, objSchema, source_realm, target_realm);
            }
        });
    });
}


async function run() {
    try {
        // appId, username, password and source_realm_path are assumed to be defined above
        const app = new Realm.App({ id: appId });

        const credentials = Realm.Credentials.emailPassword(username, password);
        await app.logIn(credentials);

        await copyRealm(app, source_realm_path);
    } catch (error) {
        console.log("Error: ", error)
    }
}

run().catch(err => {
    console.error("Failed to open realm:", err)
});

There doesn’t appear to be any call to MongoDB via the Node.js driver - how are you inserting data into Mongo? This appears to only be opening the realm on the legacy Realm Cloud - where/how are you inserting it into Atlas?

I am not using the MongoDB Realm migration guide script here since it does not seem to handle the relationships correctly when data is converted to JSON. The script I modified is the original one provided for migration to Realm Cloud from a local file. I am just opening a synced MongoDB Realm using SDK v10.1.1 instead and then copying the data from the local realm file into the ‘target_realm’.

The ‘source_realm’ is just a local realm file. The ‘target_realm’ is the MongoDB synced realm.

Shouldn’t it be possible to open the new synced realm and then add data to it from the client? It seems to have created all the same synced realm subdirectories and files on the local machine - the top-level directory is called “~/mongodb-realm/appID”.

const target_realm = await Realm.open({
        schema: target_realm_schema,
        sync: {
          user: app.currentUser,
          partitionValue: "default",
        },
      });

Oh I see - yeah that should be possible. What do the logs say?

Which logs? On the Atlas portal it just shows connect and disconnect and session start and session end.

The console output from Node.js is as follows:

Enter to quit
> ERROR: Connection[1]: Reading failed: Broken pipe
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50279')
ERROR: Connection[1]: Reading failed: End of input
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50280')
ERROR: Connection[1]: Writing failed: Broken pipe
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50281')
ERROR: Connection[1]: Reading failed: Broken pipe
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50282')

So what is going on here? Is there some limit on the size of sync data? If so, shouldn’t the sync client create smaller packets automatically?

UPDATE:

OK I just tried loading a single table using the same script and that data is now showing up in Atlas.

@Ian_Ward is there some kind of problem with loading data using the method I have outlined above if the total amount of data exceeds 16MB? Even if I break up the write(){} transactions into smaller chunks (see the sketch below), the sync still seems to be failing. Is there any way for the client to control things so that the sync limit of 16MB does not get exceeded?
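For reference, by “smaller chunks” I mean something along these lines - batchSize is an arbitrary number I picked, and the second pass that adds links is omitted for brevity:

// Instead of one big write(), copy the objects in batches per transaction.
const batchSize = 500; // arbitrary

source_realm_schema.forEach((objSchema) => {
    const allObjects = source_realm.objects(objSchema.name);
    for (let i = 0; i < allObjects.length; i += batchSize) {
        target_realm.write(() => {
            const end = Math.min(i + batchSize, allObjects.length);
            for (let j = i; j < end; j++) {
                copyObject(allObjects[j], objSchema, target_realm);
            }
        });
    }
});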

I also keep getting the following message

@Ian_Ward - I have raised a support ticket with details of the MongoDB Realm appID and the scripts that were used to load the data. It might be easier for you guys to look at the scripts and logs yourselves, as there are multiple errors that don’t mean much to me.

Just a follow-up - it seems that if I add a delay between each write to the synced realm, the data gets synced correctly. Initially I had a 1 second delay but seemed to get a bunch of errors, and the data never showed up in Atlas; then some data showed up but not other data.

Anyway, I bumped the delay to 10 seconds and now all the data seems to have loaded successfully, even though there appear to be a bunch of errors in the logs. It’s not really clear whether these errors are an indication of fatal sync errors or not. In any event, the correct number of records exists in the synced realm, and if I create a new client and connect to the same Realm App, the data that gets synced to the client also contains the correct number of records.

That’s a good start, but relying on a client delay between writes to the realm for successful sync of newly created records is a rather fragile approach - if this is indeed the reason why data was not being synced originally.

I will try a few more tests since it is possible I had done something wrong when setting up Atlas/MongoDB Realm during my initial attempts.

FYI I am just deleting the Realm App and then the Atlas database and recreating them, setting them to Dev Mode, and then running the client load script. It seems I have to keep the JavaScript client application running for some time after the copy process completes to allow the sync to finish. I am not that familiar with the JavaScript APIs, but some way to check when the sync has completed would be a more elegant solution than simply waiting on console input to keep the application from exiting.
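Looking at the SDK docs, something like the v10 syncSession API might be that more elegant option - a sketch, if I understand it correctly:

// After the copy completes, wait for the sync session to upload all local
// changes before closing the realm and letting the process exit.
await target_realm.syncSession.uploadAllLocalChanges();
target_realm.close();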

I have checked the migration guide and have made the changes accordingly in the app-side code.
The old data is populated in Atlas.
Users will not have their auth credentials in Realm - how should we handle them?
E.g.:
User data is migrated and is present in the Atlas DB.
When an old user tries to log in, it will fail because that user is not present in Sync.
Should we forcefully register that user in auth and then call login again?
Apart from this, is there anything else I should be aware of?

@spam_mail There is an admin API available on Realm Cloud which enables you to programmatically create user accounts - Atlas App Services API
For this though, you would need the user’s password… which I hope you do not have! :smiley:

If there is user-specific data - data that only certain users should have access to - then you will need to set up permissions. One way to do this is to create a collection in Atlas that has a mapping of users to the data they have access to. Users of the new Realm Cloud will need to re-register, and when they do you could use an Auth trigger to set the permissions.

The trigger fires when a new user registers; it performs a lookup on the collection where you have a mapping of users to permissions, and sets the user’s custom user data - https://docs.mongodb.com/realm/users/enable-custom-user-data/
The custom user data is what contains the information on which partitions that user has access to on MongoDB Realm; it is pulled from legacy Realm Cloud and stored in Atlas.
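A rough sketch of what that trigger function could look like - the database and collection names here are placeholders for your own mapping:

// Authentication trigger, configured to fire on user CREATE events.
exports = async function(authEvent) {
    const newUser = authEvent.user;
    const mongodb = context.services.get("mongodb-atlas"); // linked cluster name assumed

    // Placeholder collections: a permissions mapping exported from legacy
    // Realm Cloud, and the collection configured as custom user data.
    const legacyPerms = mongodb.db("migration").collection("user_permissions");
    const customData = mongodb.db("app").collection("custom_user_data");

    // Look up the partitions this user could access in legacy Realm Cloud...
    const mapping = await legacyPerms.findOne({ email: newUser.data.email });

    // ...and store them as custom user data keyed by the new user's id.
    await customData.insertOne({
        userId: newUser.id,
        partitions: mapping ? mapping.partitions : [],
    });
};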

Of course, the above is for when you have user-specific permissions; if not, then your re-registration flow can be a lot easier.

Hope this helps

-Ian


Hi Ian,

Still stuck in the conversion step.
You wrote “A List property would be an array of these _ids” - so I converted my data accordingly, and now lists are represented as arrays of strings, but this raises an error because in the Realm schema the property is still defined as a list of another table (schema).

  1. Should I change the Realm schema to be an array of strings instead of a list?
  2. If the answer to 1 is yes, do I need to change my code and add another query to fetch the records for these ids from the connected table when needed? Because until now I could simply access them via the main table object (e.g. User.userDevices[0].DeviceType).

This is an example of my Realm User schema with a list property (userDevices):
User.schema = {
    name: 'User',
    primaryKey: '_id',
    properties: {
        _id:              {type: 'objectId', optional: false},
        _partition:       {type: 'string', optional: false},
        id:               'string',
        userID:           {type: 'string'},
        registrationDate: {type: 'date'},
        userDevices:      {type: 'UserDeviceInfo[]'},
    }
}

Thank you

Yuval

No - you should still use a list of relationships in your schema, but the new Realm Sync will store that data as an array of strings in the MongoDB document. This is because there is no concept of relationships in MongoDB, so this is what we landed on. For your schema, both on the Realm client and on the Realm server, the field will still be a List of objects.
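To make that concrete, the server-side App Services schema generated for such a field typically pairs an array of the target’s _id type with a relationship definition pointing at the target collection - roughly like this, where the data source, database and collection names are placeholders:

{
    "userDevices": {
        "ref": "#/relationship/mongodb-atlas/mydb/UserDeviceInfo",
        "foreign_key": "_id",
        "is_list": true
    }
}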

Hi Ian,

Just to make sure I’m not confused -
I have the Realm schema which is in my code, and the Realm Sync schema which is defined in MongoDB.
In the Realm schema I keep the List, and in the matching schema on the Realm Sync side I set the same property as an array of strings, as in the example below?

"userDevices": {
    "bsonType": "array",
    "items": {
        "bsonType": "string"
    }
},

If I do that, I get an error that the schemas do not match:

The following changes cannot be made in additive-only schema mode:

  • Property ‘User.userDevices’ has been changed from ‘array< string >’ to ‘array< UserDeviceInfo >’.