What's the best migration strategy to handle duplicate primary keys across different Realms?

In my legacy realm sync instance, I’m assigning each user their own realm and saving some common API data (like Food info) to those realms using the IDs from the API as the primary key (like a foodId). That means I have multiple legacy realms with objects that share a primary key.

This was fine before MongoDB Realm, but now that all user data is stored in collections, I can’t reuse those primary keys without running into sync issues. Unfortunately, users were able to mutate this data, so migrating it to a common partition is not an option.

What’s a good approach to migrating this data over without conflicts?

I’ve considered making my new schema contain both the API ID and an autogenerated _id. This would require me to run queries and updates client-side using only the API ID, and I wouldn’t get the benefits of primary key indexing. Is this my only option?

“This data”? Does this mean the user can change the primary key on objects?

I think it would help if we knew how your primary keys were being used; did you directly reference the primary key in code or were references made to the object itself instead? Did you use the primary key to perform updates? That kind of information will determine how the data is transitioned.

Primary keys were never mutated. Other fields, like the food name or nutrition info, could be mutated though.

The primary key is directly referenced in code when querying objects by ID. Since all objects have primary keys, updates are performed using realm.add(object, update: .modified)

Example:

In a legacy Realm, User A and User B both get a Food object from the API. Both users save that object to their respective realms with a primary key of foodId.

// on successful fetch from API
let foodEntity = FoodEntity()
foodEntity.name = json.foodName
foodEntity.id = json.foodId // 'id' is the primary key

try! realm.write {
    realm.add(foodEntity, update: .modified)
}

Their realms now both look like this:

// FoodEntity:

{
    "name" : "Eggs",
    "id" : "food_id"
}

User B decides to edit the name of the food they just saved

let foodEntity = realm.objects(FoodEntity.self).filter("id == 'food_id'")[0]

try! realm.write {
    foodEntity.name = "Organic Eggs"
    realm.add(foodEntity, update: .modified)
}

And now User A’s realm looks like:

// FoodEntity:

{
    "name" : "Eggs",
    "id" : "food_id"
}

But User B’s realm looks like:

// FoodEntity:

{
    "name" : "Organic Eggs",
    "id" : "food_id"
}

So now when I’m migrating their legacy data, I run into an issue where both of those objects have the same primary key but different field values.

I hope this example was clear.
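To make the conflict concrete, here is a minimal sketch in plain Swift. It does not use Realm at all: `LegacyFood` is a hypothetical stand-in for FoodEntity, each legacy realm is modeled as a dictionary keyed by primary key, and the shared collection is assumed to behave as last-write-wins. Real sync behavior may differ (it may surface as an error rather than an overwrite), but the underlying key collision is the same.

```swift
import Foundation

// Hypothetical stand-in for a legacy FoodEntity; not a Realm object.
struct LegacyFood: Equatable {
    let id: String      // API food ID, used as the primary key in each legacy realm
    var name: String
}

// Each user's legacy realm, modeled as a dictionary keyed by primary key.
let userARealm = ["food_id": LegacyFood(id: "food_id", name: "Eggs")]
let userBRealm = ["food_id": LegacyFood(id: "food_id", name: "Organic Eggs")]

// Merge both realms into one collection keyed only by the API ID.
// The second write replaces the first because the keys are identical.
var sharedCollection: [String: LegacyFood] = [:]
for realm in [userARealm, userBRealm] {
    for (key, food) in realm {
        sharedCollection[key] = food
    }
}

// Only one entry remains for two users, and it holds User B's edit.
print(sharedCollection.count)
print(sharedCollection["food_id"]!.name)
```

In this model User A's unedited copy is lost, which is why the API ID alone cannot serve as the primary key once the data lives in one collection.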

This is a great example of why disassociating an object’s primary key from other data is a good idea. Going forward, this is a good design pattern:

@objc dynamic var _primaryKey = ObjectId.generate() // or UUID().uuidString

How were the primary keys (id) for each user’s FoodEntity generated in the first place? Your object looks like this:

let foodEntity = FoodEntity()
foodEntity.name = json.foodName
foodEntity.id = json.foodId //  <- where did this come from? How generated?

Looking at your write, it’s not based on a primary key; it’s just updating that object regardless of what the primary key is:

try! realm.write {
    foodEntity.name = "Organic Eggs"
    realm.add(foodEntity, update: .modified)
}

Because of that it’s not clear how you’re using the primary key of id.

As a side note, there’s no reason to filter for objects that have primary keys if you know the key; the object can be accessed directly. So instead of this

let foodEntity = realm.objects(FoodEntity.self).filter("id == 'food_id'")[0]

you can do this

let specificFood = realm.object(ofType: FoodEntity.self, forPrimaryKey: "1234")

The primary keys come from a 3rd party API. In this example, there is a common database of foods each with their own unique IDs. I poorly architected this codebase a few years ago, unfortunately.

I’m not sure what you mean by this. The primary key never gets generated or updated client-side. We set the primary key as the ID of an object in a 3rd party database. Every other field except the primary key can be updated client-side. The primary key is really just used for performing queries and upserts.

Thanks for taking the time to help out with this btw.

Are you saying you’re taking the primary key from the ‘master food database’ and making that the primary key of the FoodEntity for each user? In other words, suppose your JSON database contains these foods:

food_id_1111 {  <- the key
   foodName = "Pizza"
}

food_id_2222 {
   foodName = "Burger"
}

food_id_3333 {
   foodName = "Tacos"
}

Then when User_A creates a food item it’s this

FoodEntity {
   id = food_id_1111 <- matches the key from the food database?
}

and when User_B creates a food item it’s the same data with the same primary key?

FoodEntity {
   id = food_id_1111 <- matches the key from the food database?
}

Yes, that’s exactly what I’m doing

Well, ouch.

Obviously primary keys must be unique, but my question from above still stands: how are you currently using the primary keys? Are you storing the actual primary keys in Lists, or are you storing references to the objects (which is how it should be)?

Suppose a user has a List of his favorite foods. Here are some food items based on the FoodEntity stored in the ‘master food database’

food_id_1111 {
   foodName = "Pizza"
}

food_id_2222 {
   foodName = "Burger"
}

food_id_3333 {
   foodName = "Tacos"
}

Then the user looks like this:

class UserClass: Object {
    @objc dynamic var _id = UUID().uuidString
    let favoriteFoods = List<FoodEntity>()
}

so what’s stored in favoriteFoods are the FoodEntity objects, not the primary keys.

I know you said this

let foodEntity = realm.objects(FoodEntity.self).filter("id == food_id")[0]

above but where is the food_id coming from in that code?

I am attempting to come up with a migration path here, but it depends on what specifically you’re doing and how you’re storing the actual primary keys.

I’m storing references to the objects exactly as you described like this:

class UserClass: Object {
    @objc dynamic var _id = UUID().uuidString
    let favoriteFoods = List<FoodEntity>()
}

The food ID is coming from an API. A user can query the API for a list of foods using a search query.

FoodAPI.getFoodsForQuery(query: "Eggs") { foodJsons, error in
    // Populate list with foodJsons
}

When a user clicks on one of those foods, we query their realm first to see if they have an edited version of that food saved locally.

let foodEntity = realm.object(ofType: FoodEntity.self, forPrimaryKey: foodJson.foodId)
if let food = foodEntity {
    // use food entity
} else {
    // use food json 
}

Got it. I think the best solution (kind of what you said in your original question) is to change the local model and the query.

class FoodClass: Object {
    @objc dynamic var _id = ObjectId.generate() // this is the new primary key
    @objc dynamic var _partition = "" // the user's uid
    @objc dynamic var foodId = "" // this will be a copy of the old primary key
    @objc dynamic var someProperty = ""

    override static func primaryKey() -> String? {
        return "_id"
    }

    override static func indexedProperties() -> [String] {
        return ["foodId"]
    }
}

and the query would change to

if let maybeFood = realm.objects(FoodClass.self).filter("foodId == %@", foodIdFromDatabase).first {
   //do something with maybeFood
}

The migration should be pretty straightforward: the code would iterate over each user’s existing FoodEntity objects and create a new FoodClass (per above) for each one, assigning the user’s uid to the object’s _partition property and copying the old primary key to the foodId property.

The new objects can then be written to a collection for syncing and will be specific to each user by their partition key.
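The migration loop described above can be sketched roughly as follows. This is plain Swift with hypothetical stand-in structs (`LegacyFood`, `MigratedFood`) instead of Realm objects so that it runs on its own, and it uses a UUID string for the new _id as an assumption; in the real app the reads and writes would go through Realm.

```swift
import Foundation

// Hypothetical plain-Swift stand-ins for the Realm models; names are assumptions.
struct LegacyFood {
    let id: String          // old primary key (the API food ID)
    var name: String
}

struct MigratedFood {
    let _id: String         // new unique primary key (a UUID string in this sketch)
    let _partition: String  // the owning user's uid
    let foodId: String      // copy of the old primary key
    var name: String
}

// Convert one user's legacy foods, giving each a fresh _id and the user's partition.
func migrate(_ legacyFoods: [LegacyFood], forUser uid: String) -> [MigratedFood] {
    return legacyFoods.map { old in
        MigratedFood(_id: UUID().uuidString,
                     _partition: uid,
                     foodId: old.id,
                     name: old.name)
    }
}

let migratedA = migrate([LegacyFood(id: "food_id", name: "Eggs")], forUser: "user_a")
let migratedB = migrate([LegacyFood(id: "food_id", name: "Organic Eggs")], forUser: "user_b")
let merged = migratedA + migratedB

// Both copies survive: same foodId, but different _id and _partition.
let uniqueIds = Set(merged.map { $0._id })
print(merged.count, uniqueIds.count)
```

Because every migrated object gets its own _id, User A's "Eggs" and User B's "Organic Eggs" can coexist in one collection while still being findable by the shared foodId.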

Indexing can be a bit tricky and situational. In this case, however, it seems you’ll be running equality queries on foodId frequently, so indexing that property would be appropriate and should improve query performance.
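As a rough illustration of why an index helps equality queries, the sketch below compares a linear scan with a prebuilt lookup table. It is only a conceptual model in plain Swift: the `Food` struct and the dictionary are assumptions for illustration and say nothing about how Realm implements its indexes internally.

```swift
import Foundation

// Hypothetical stand-in for the migrated model; not a Realm object.
struct Food {
    let _id: String
    let foodId: String
    var name: String
}

let foods = [
    Food(_id: "a1", foodId: "food_id_1111", name: "Pizza"),
    Food(_id: "a2", foodId: "food_id_2222", name: "Burger"),
    Food(_id: "a3", foodId: "food_id_3333", name: "Tacos"),
]

// Without an index: check each object in turn until one matches.
let scanned = foods.first { $0.foodId == "food_id_2222" }

// With an index: build a foodId -> object table once,
// after which each equality query is a single keyed lookup.
let index = Dictionary(foods.map { ($0.foodId, $0) },
                       uniquingKeysWith: { first, _ in first })
let indexed = index["food_id_2222"]

// Both approaches find the same object.
print(scanned?.name ?? "none", indexed?.name ?? "none")
```

The trade-off is the usual one: the index costs extra storage and a little work on every write, which is why it pays off mainly for properties that are queried often.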
