Migrating from Legacy Realm Sync to MongoDB Realm Guide

Migration finding #1

It seems that the Realm App does not accept Realm Objects that have “float” data types, even though the documentation at https://docs.mongodb.com/realm/dotnet/objects/ says otherwise.

I changed my types from float to double, deleted and re-created the Realm App (I had to do this!), and was then able to continue.

I am using Beta 1 on .NET, trying to migrate a system currently in production on Realm Cloud.

Migration finding #2

The migration guide states that the partition field could be defined as:

[MapTo("_partition")]
public string _partition { get; set; }

and the documentation recommends that the partition field be required.

In order for that to work and avoid exceptions, I had to add the [Required] attribute and ended up with:

[Required]
[MapTo("_partition")]
public string Partition { get; set; }

I believe the Migration guide and the documentation should mention that.

I am using Beta 1 on .NET, trying to migrate a system currently in production on Realm Cloud.

Be sure to wipe your local state/simulator if you are making type changes like this, then reconnect in developer mode. You may need to delete the old schema on the sync server side if you already saved it with the unsupported float type.
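If you are testing from the Node SDK, a minimal sketch of wiping that local state (assuming realm-js v10, an already-initialized app object, and an illustrative partition value):

const Realm = require('realm');

// Illustrative sync config matching the realm whose local copy should be reset
const config = {
    sync: {
        user: app.currentUser,
        partitionValue: 'default',
    },
};

// Deletes the local realm file and its auxiliary files so the next open starts clean
Realm.deleteFile(config);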


I have had a look at the guide, but it does not appear to explain how existing data should be migrated from Realm Cloud to MongoDB Realm, particularly when there are multiple relationships like the following:

Do we have to write custom scripts to extract the data from the existing Realm Cloud file and reload it into the new format?

Are there any suggestions as to how existing data should be migrated? It’s a non-trivial task to change the primary key on every database object and to recreate each object, list and object reference.

// Old Schema

class Parent: Object {
    @objc dynamic var id = UUID().uuidString
    @objc dynamic var name: String = ""

    let children = List<Child>()
}

class Child: Object {
    @objc dynamic var id = UUID().uuidString
    @objc dynamic var name: String = ""
    @objc dynamic var parent: Parent?
}


// New Schema

class Parent: Object {
    @objc dynamic var _id: ObjectId = ObjectId.generate()
    @objc dynamic var name: String = ""

    let children = List<Child>()
}

class Child: Object {
    @objc dynamic var _id: ObjectId = ObjectId.generate()
    @objc dynamic var name: String = ""
    @objc dynamic var parent: Parent?
}

@Duncan_Groenewald There is a private API you can use to map the user-facing code to the underlying column name. You can see it used here:

We will look to expose this in a public API in an upcoming release but you can still use it today if you’d like.

You’ve lost me.

What do you mean by “user facing code”?
What do you mean by “underlying column name”?
What does this have to do with migrating existing data to MongoDB Realm?

You don’t appear to have answered my questions:

  1. Do we have to write custom scripts to extract the data from the existing Realm Cloud file and reload it into the new format?

  2. Are there any suggestions as to how existing data should be migrated? It’s a non-trivial task to change the primary key on every database object and to recreate each object, list and object reference.

@Duncan_Groenewald

I think Ian responded to the wrong thread, as the code he posted was for PermissionUser objects, which was not what you asked. That being said, a couple of things:

User-facing code: the code that we see and the functions we use are documented in the API guide. Realm has a number of ‘hidden’ functions, like _realmColumnNames on the PermissionUser object, which are not documented in the API or exposed - but you can still use them temporarily (not needed in your case).

The underlying column name point is not relevant to your question (as far as I can tell).

Looking at what you posted, I am not seeing a primary key on your objects. You need to add the _id property (which is an additive change) as shown, but other than that, the objects should transition pretty easily.

Is your project using Partial Sync or Full Sync, or is it local only?

All the objects have primary keys - I left that out in the examples. We use a base class as shown below.

It is full sync.

class BaseObject: Object {
   @objc dynamic var id = UUID().uuidString
   @objc dynamic var tstamp: Date = Date()

   override static func primaryKey() -> String? {
      return "id"
   }
}

Is there any plan to provide additional guidance for migrating existing data?

As mentioned in one of my previous posts, there is no explanation of how one would migrate existing data if there are any relationships. The trivial example shows migration of a single table that has no relationships.

How is one supposed to migrate existing data and existing relationships? Could you include an example of how to migrate two tables, for example a simple master-detail relationship like an invoice with line items?

Am I correct in assuming that one would have to write custom scripts to migrate each object and each relationship?

And if so, what is the correct way to create the relationships in MongoDB? For example, how does one recreate the RealmSwift List property in MongoDB?

@Duncan_Groenewald Relationships are stored as the _id value of the target document, cast as a string. A List property would be an array of these _ids - if you are interested in seeing how this would look, I’d recommend going through a tutorial to familiarize yourself with some of the concepts.
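To make that concrete, here is a rough sketch (illustrative values only) of how the Parent/Child objects from the earlier schemas might be persisted in Atlas, assuming string _id values:

// Parent document in Atlas (values made up)
{
    "_id": "3f1d...",
    "name": "Parent A",
    "children": ["a7c2...", "b9e4..."]  // the List<Child> stored as an array of Child _ids
}

// Child document in Atlas (values made up)
{
    "_id": "a7c2...",
    "name": "Child 1",
    "parent": "3f1d..."                 // the to-one link stored as the Parent's _id
}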

@Ian_Ward - I must be missing something - that tutorial doesn’t really tell me anything new.

  1. Is there any way to generate the MongoDB schema from an existing RealmSwift client schema? If so, how? If not, does this mean one has to manually generate a MongoDB schema that corresponds with the client schema?

  2. Similarly, is there any way to load data into MongoDB from an existing realm file? For example, open a local realm file as well as a synced realm and copy objects to, or create new objects in, the synced realm? It is not clear how one might do that, given it would not be possible to have two versions of the same Realm object in the same application - one being the original object and the other being the object with the new “_id” primary key - unless the document or table could somehow be renamed in MongoDB after the data has been loaded.

@Duncan_Groenewald If you ran the tutorial and then inspected the way the data is persisted using the Atlas collection viewer, you would see how relationships are persisted in Atlas. Based on your previous question, I believe this would be something new for you?

You can use developer mode to instantiate your Sync schema on the server side using your schema from the client side - https://docs.mongodb.com/realm/sync/enable-development-mode

I did the same, but doing this throws an exception on save if the Partition is not set explicitly in code. I would prefer the Partition to be inferred automatically on save. Is that possible?

This may not address that question, but based on prior posts and our current app, you don’t need to explicitly set the partitionKey property on the object in code. You can actually omit the partition key from the object completely, and it will automatically be assigned when written, based on the partition key specified when opening the realm.

This behavior is not outlined very well in the documentation IMO. For example, this task object has no partition key (in Swift):

class TaskClass: Object {
    @objc dynamic var _id: ObjectId = ObjectId.generate()

    @objc dynamic var name = ""
    @objc dynamic var status = TaskStatus.Open.rawValue
}
and when written to Realm (realm below being a synced realm opened with a partition value), the required _partitionKey is automatically populated:

let task = TaskClass()
task.name = "Task No Partition Property"
try! realm.write { realm.add(task) }


@Ian_Ward - Can you tell me if the following approach for converting existing Realm Cloud data is likely to work:

  1. All objects have the property ‘id’ : String as the primary key.

  2. We are converting all objects using a modified version of the original script for loading a local realm file to Realm Cloud, except that we are creating another local file and converting the ‘id’ property to ‘_id’. This appears to work fine, and we ignore any objects with a ‘__’ prefix, e.g. ‘__Class’, ‘__User’…

    var target_realm_schema = JSON.parse(JSON.stringify(source_realm_schema));

    target_realm_schema.forEach((objSchema) => {
        for (var key in objSchema.properties) {
            const prop = objSchema.properties[key];

            if (key === 'id' && prop.type === 'string' && objSchema.name[0] !== '_') {
                var target_prop = JSON.parse(JSON.stringify(prop));
                target_prop.name = '_id';
                target_prop.mapTo = '_id';
                objSchema.primaryKey = '_id';
                objSchema.properties['_id'] = target_prop;
                delete objSchema.properties['id'];
            }
        }
    });

and then copy the id value to the _id property:

           // Inside the property-copy loop, renaming the primary key as the values are copied:
           for (var key in objSchema.properties) {
               if (key === 'id') {
                   copy['_id'] = obj[key];
               } else {
                   copy[key] = obj[key];
               }
           }
  3. We convert the RealmSwift client to use ‘_id’ instead of ‘id’ and open the realm file created in 2). The app seems to work fine.

  4. Now, using the same script for loading Realm Cloud, we modify it to do essentially the same thing, copying the data into a v10.1.1 synced Realm.

We seem to have run into a problem with the system tables with the ‘__’ prefix, since they don’t have ‘_id’ primary keys - are these required for the v10.1.1 SDK, or can we remove them from the v10.1.1 schema?

Regards

OK, I have removed the system tables from the schema and now get past schema validation; the schema now appears in MongoDB Atlas, but there appears to be no data in Atlas.

Opening the realm and querying data from the node.js client returns data.

Any idea why the data is not visible in Atlas?

Script below:

const Realm = require('realm');

// Placeholders - set these to your own values before running
const appId = '<your-realm-app-id>';
const username = '<email>';
const password = '<password>';
const source_realm_path = '<path-to-local-realm-file>';

// Copy the object properties
// If it is an object then set the property to null - we update this in second cycle
// If it is a list then set the list to empty - we update these in second cycle 
var copyObject = function(obj, objSchema, targetRealm) {
    const copy = {};
    for (var key in objSchema.properties) {
        const prop = objSchema.properties[key];
        if (!prop.hasOwnProperty('objectType')) {
            copy[key] = obj[key];
        }
        else if (prop['type'] == "list") {
            copy[key] = [];
        }
        else {
            copy[key] = null;
        }
    }

    // Add this object to the target realm
    targetRealm.create(objSchema.name, copy);
}

var getMatchingObjectInOtherRealm = function(sourceObj, source_realm, target_realm, class_name) {
    const allObjects = source_realm.objects(class_name);
    const ndx = allObjects.indexOf(sourceObj);

    // Get object on same position in target realm
    return target_realm.objects(class_name)[ndx];
}

var addLinksToObject = function(sourceObj, targetObj, objSchema, source_realm, target_realm) {
    for (var key in objSchema.properties) {
        const prop = objSchema.properties[key];
        if (prop.hasOwnProperty('objectType')) {
            if (prop['type'] == "list") {
                var targetList = targetObj[key];
                sourceObj[key].forEach((linkedObj) => {
                    const obj = getMatchingObjectInOtherRealm(linkedObj, source_realm, target_realm, prop.objectType);
                    targetList.push(obj);
                });
            }
            else {
                // Find the position of the linked object
                const linkedObj = sourceObj[key];
                if (linkedObj === null) {
                    continue;
                }

                // Set link to object on same position in target realm
                targetObj[key] = getMatchingObjectInOtherRealm(linkedObj, source_realm, target_realm, prop.objectType);
            }
        }
    }
}

var copyRealm = async function(app, local_realm_path) {
    // Open the local realm
    const source_realm = new Realm({path: local_realm_path});
    const source_realm_schema = source_realm.schema;

    // Create the new realm with the same schema as the source,
    // minus the classes excluded below

    var target_realm_schema = JSON.parse(JSON.stringify(source_realm_schema));
    console.log("source schema: ", target_realm_schema);

    // Exclude the app-specific BindingObject and the legacy '__' system tables -
    // they have no '_id' primary key and are not wanted in the new sync schema
    // (filtering also avoids the splice(-1, 1) bug when a name is not found)
    const excludedNames = ['BindingObject', '__Class', '__Permission', '__Realm', '__Role', '__User'];
    target_realm_schema = target_realm_schema.filter((objSchema) => !excludedNames.includes(objSchema.name));

    console.log("target schema: ", target_realm_schema);
    
    const target_realm = await Realm.open({
        schema: target_realm_schema,
        sync: {
            user: app.currentUser,
            partitionValue: "default",
        },
    });

    target_realm.write(() => {
        // Copy all objects but ignore links for now.
        // Iterate the filtered target schema so excluded classes are skipped.
        target_realm_schema.forEach((objSchema) => {
            console.log("copying objects:", objSchema['name']);
            const allObjects = source_realm.objects(objSchema['name']);

            allObjects.forEach((obj) => {
                copyObject(obj, objSchema, target_realm)
            });
        });

        // Do a second pass to add links
        target_realm_schema.forEach((objSchema) => {
            console.log("updating links in:", objSchema['name']);
            const allSourceObjects = source_realm.objects(objSchema['name']);
            const allTargetObjects = target_realm.objects(objSchema['name']);

            for (var i = 0; i < allSourceObjects.length; ++i) {
                const sourceObject = allSourceObjects[i];
                const targetObject = allTargetObjects[i];

                addLinksToObject(sourceObject, targetObject, objSchema, source_realm, target_realm);
            }
        });
    });
}


async function run() {
    try {
        const app = new Realm.App({ id: appId });

        // Log in with the email/password credentials defined above
        const credentials = Realm.Credentials.emailPassword(username, password);
        await app.logIn(credentials);

        await copyRealm(app, source_realm_path);
    } catch (error) {
        console.log("Error: ", error);
    }
}

run().catch(err => {
    console.error("Failed to open realm:", err);
});

There doesn’t appear to be any call to MongoDB via the Node.js driver - how are you inserting data into Mongo? This appears to only be opening the realm on the legacy Realm Cloud - where/how are you inserting it into Atlas?

I am not using the MongoDB Realm migration guide script here, since it does not seem to handle the relationships correctly when the data is converted to JSON. The script I modified is the original one provided for migrating a local file to Realm Cloud. I am just opening a synced MongoDB Realm using SDK v10.1.1 instead and then copying the data from the local realm file into the ‘target_realm’.

The ‘source_realm’ is just a local realm file. The ‘target_realm’ is the MongoDB synced realm.

Shouldn’t it be possible to open the new synced Realm and then add data to the synced realm from the client? It seems to have created all the same synced-realm subdirectories and files on the local machine - the top-level directory is called “~/mongodb-realm/appID”.

const target_realm = await Realm.open({
    schema: target_realm_schema,
    sync: {
        user: app.currentUser,
        partitionValue: "default",
    },
});

Oh I see - yeah that should be possible. What do the logs say?

Which logs? On the Atlas portal it just shows connect and disconnect, and session start and session end.

The console output from Node.js is as follows:

Enter to quit
> ERROR: Connection[1]: Reading failed: Broken pipe
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50279')
ERROR: Connection[1]: Reading failed: End of input
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50280')
ERROR: Connection[1]: Writing failed: Broken pipe
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50281')
ERROR: Connection[1]: Reading failed: Broken pipe
Connection[1]: Connection closed due to error
Connection[1]: Connected to endpoint '3.210.32.164:443' (from '10.0.1.171:50282')

So what is going on here? Is there some limit on the size of the sync data? If so, shouldn’t the sync client create smaller packets automatically?

UPDATE:

OK, I just tried loading a single table using the same script, and that data is now showing up in Atlas.

@Ian_Ward is there some kind of problem with loading data using the method I have outlined above if the total amount of data exceeds 16MB? Even if I break up the write(){} transactions into smaller chunks, the sync still seems to be failing. Is there any way for the client to control things so that the 16MB sync limit does not get exceeded?
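For reference, the chunking I tried looks roughly like this - a sketch only, reusing the copyObject helper from the script above with an illustrative BATCH_SIZE (the second link-fixing pass would need the same treatment):

const BATCH_SIZE = 500; // illustrative; tune to keep each changeset small

target_realm_schema.forEach((objSchema) => {
    const allObjects = source_realm.objects(objSchema.name);

    // Commit each batch in its own write transaction so the changesets stay small
    for (let start = 0; start < allObjects.length; start += BATCH_SIZE) {
        const end = Math.min(start + BATCH_SIZE, allObjects.length);
        target_realm.write(() => {
            for (let i = start; i < end; i++) {
                copyObject(allObjects[i], objSchema, target_realm);
            }
        });
    }
});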

I also keep getting the following message