Hello, I’m using flexible sync with dev mode off and I added a required bool, isArchived, to the Service object, both in the iOS app and in the schema.
After the automatic client reset, the new realm has some issues: all Service objects have new ids and the Appointment objects, which hold an optional reference to a Service object, now have null instead.
How can I recover the user data? (I have a backup of the realm file)
I would also like to understand how this whole schema change and client reset process works, to avoid future issues. I read all the documentation, many times actually, but I still have some open questions:
- What exactly happens during an automatic client reset? I looked at the afterReset callback and the after Realm has some data but different ids, which in my opinion defeats the purpose of recovery since the Appointment objects are now unusable with the null references.
- Why is there a need for a client reset at all when adding a new property with a default value? And why can’t I do a simple migration like with a local Realm?
- I tried to remove the isArchived property and I got a client reset error. In the documentation there is no guidance (or example) on what to do in this situation to recover the data. There is just this code sample:
func handleClientReset() {
// Report the client reset error to the user, or do some custom logic.
}
- In the docs I’m instructed to call SyncSession.immediatelyHandleError after handling the client reset but I don’t understand what this does exactly besides making a copy of the old Realm file and creating a new, empty one.
- I tried to use the afterReset callback to delete everything from the after Realm and copy everything from before, but then I got another sync error. Is this a valid approach?
- Are schemaVersion and migrationBlock used at all for synced Realms?
- I tried to writeCopy on the before realm but I got this error: All client changes must be integrated in server before writing copy.
- How can I avoid issues like this which lead to data loss in production if I make simple changes to the schema? I read about the partner collection strategy but it didn’t seem necessary at this point when I also updated the iOS client with the new Bool.
Please help me with these questions, it’s a very serious issue and I already spent two days on it trying everything I could.
Also, what is the relationship between the before and after realms with the realm I open initially? Is after the same as the original one? What happens if I try to add objects to it? And how come the filePath is the same?
Update: I tried to drop the database then copy all objects from the before realm (using the backup file) to the newly created realm, and after a few seconds I’m getting this error:
2023-06-20 14:19:02.576016+0300 BeePalApp[47583:35385783] Sync: Connection[5]: Session[5]: A previous reset was detected of type: 'Recover' at: 2023-06-20 11:18:07
2023-06-20 14:19:02.579105+0300 BeePalApp[47583:35385783] Sync: Connection[5]: Session[5]: A fatal error occured during client reset: 'A previous 'Recover' mode reset from 2023-06-20 11:18:07 did not succeed, giving up on 'Recover' mode to prevent a cycle'
Error Domain=io.realm.sync Code=7 "A fatal error occured during client reset: 'A previous 'Recover' mode reset from 2023-06-20 11:18:07 did not succeed, giving up on 'Recover' mode to prevent a cycle'" UserInfo={error_action_token=<RLMSyncErrorActionToken: 0x60000089cbd0>, NSLocalizedDescription=A fatal error occured during client reset: 'A previous 'Recover' mode reset from 2023-06-20 11:18:07 did not succeed, giving up on 'Recover' mode to prevent a cycle', recovered_realm_location_path=/Users/madalin/Library/Developer/CoreSimulator/Devices/22B46ACA-DB24-43FF-9C1F-5E599264E822/data/Containers/Data/Application/B11DB35D-D738-41CC-BA7B-F02186FD8466/Documents/mongodb-realm/billy-jgeoz/recovered-realms/recovered_realm-20230620-141902-dW0mnokH} Optional(<RLMSyncSession: 0x6000006a9180> {
state = 1;
connectionState = 0;
realmURL = wss://ws.eu-central-1.aws.realm.mongodb.com/api/client/v2.0/app/billy-jgeoz/realm-sync;
user = 64917413171d5b33b3615b0c;
})
2023-06-20 14:19:02.586948+0300 BeePalApp[47583:35385783] Sync: Connection[5]: Disconnected
I also tried the following:
- drop the database, restart sync
- delete synced realm file
- launch the app with a clean slate, no errors
- delete everything from the synced realm and copy everything from the local backup
Then I got this error in the syncManager errorHandler:
2023-06-20 16:17:06.429906+0300 BeePalApp[60784:35596271] Sync: Connection[2]: Session[2]: Received: ERROR "Bad client file identifier (IDENT)" (error_code=208, try_again=false, error_action=ClientReset)
2023-06-20 16:17:06.446290+0300 BeePalApp[60784:35596271] Sync: Connection[2]: Disconnected
Error Domain=io.realm.sync Code=7 "Bad client file identifier (IDENT)" UserInfo={Server Log URL=https://realm.mongodb.com/groups/64591410ffb83f492c3916c7/apps/645915213a82d1d7fbafefc9/logs?co_id=6491a6d29a5ae54fdc0db8b6, recovered_realm_location_path=/Users/madalin/Library/Developer/CoreSimulator/Devices/22B46ACA-DB24-43FF-9C1F-5E599264E822/data/Containers/Data/Application/FCF45155-184D-401F-890D-FB42DE2F941F/Documents/mongodb-realm/billy-jgeoz/recovered-realms/recovered_realm-20230620-161706-s0XP5egR, error_action_token=<RLMSyncErrorActionToken: 0x6000026ab9f0>, NSLocalizedDescription=Bad client file identifier (IDENT)} Optional(<RLMSyncSession: 0x60000289bbc0> {
state = 1;
connectionState = 0;
realmURL = wss://ws.eu-central-1.aws.realm.mongodb.com/api/client/v2.0/app/billy-jgeoz/realm-sync;
user = 64917413171d5b33b3615b0c;
})
Adding properties does not require a client reset.
Did you add the isArchived field to all of your documents on the server prior to adding it as a required field in the server-side schema? Objects disappearing as part of the initial client reset with recovery means that the object creations had been synchronized to the server, but the documents for them either no longer exist or are unsyncable - such as if they’re missing a required field.
Automatic client resets work by downloading a fresh copy of the Realm, and then modifying the existing file to make it compatible with the server-side state. The before Realm passed to the callback is the Realm frozen before making these changes, while the after Realm is a live view of the same file. It should normally not be necessary to do anything in the callback; it is mostly just informational.
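As a rough illustration of callbacks that only observe the reset, here is a configuration sketch for the Swift SDK. The Service model and the makeSyncConfiguration helper are assumptions made for this example (the thread only names Service and isArchived), and it needs the RealmSwift package plus a logged-in user to actually run.

```swift
import RealmSwift

// Hypothetical model: the thread names only `Service` and `isArchived`.
final class Service: Object {
    @Persisted(primaryKey: true) var _id: ObjectId
    @Persisted var isArchived: Bool = false
}

// Hypothetical helper: builds a flexible sync configuration whose
// client reset callbacks log what happened and change nothing.
func makeSyncConfiguration(for user: User) -> Realm.Configuration {
    var config = user.flexibleSyncConfiguration(
        clientResetMode: .recoverUnsyncedChanges(
            beforeReset: { before in
                // `before` is a frozen snapshot taken before the reset is applied.
                let count = before.objects(Service.self).count
                print("Client reset starting, Service count: \(count)")
            },
            afterReset: { before, after in
                // `after` is the live Realm at the same path. Read from it,
                // but do not delete or re-create its contents here.
                let oldCount = before.objects(Service.self).count
                let newCount = after.objects(Service.self).count
                print("Client reset finished, Service count: \(oldCount) -> \(newCount)")
            }
        )
    )
    config.objectTypes = [Service.self]
    return config
}
```

Logging the counts is enough to notice objects that went missing during a reset without interfering with the recovery itself.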
Schema versions and migration blocks are not used for synchronized Realms.
Deleting everything in the after Realm and attempting to copy over the data is an extremely bad idea. At best, it deletes all objects created by other clients which had not yet been synchronized to the current one and results in a ton of extra network usage.
If your current state is that you have a Realm file which you need to recover data from and are okay with discarding all server-side state, then you should do each of the steps you did (drop database, restart sync, delete local synced Realm file), but instead of trying to copy the backup file into place, you’ll need to open the backup file in read-only mode (with no sync configuration) and then copy the objects from it into a newly created synchronized Realm.
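A minimal sketch of that last step, assuming the RealmSwift package, hypothetical Service and Appointment models that match the schema stored in the backup file, and a synced Realm that is already open with flexible sync subscriptions covering both types. The restore function name is made up for this example.

```swift
import Foundation
import RealmSwift

// Hypothetical models: property lists are assumptions and must match
// the schema that is actually stored in the backup file.
final class Service: Object {
    @Persisted(primaryKey: true) var _id: ObjectId
    @Persisted var isArchived: Bool = false
}

final class Appointment: Object {
    @Persisted(primaryKey: true) var _id: ObjectId
    @Persisted var service: Service?
}

// Copies every Service and Appointment from a backup file into an
// already opened synced Realm. The backup is opened read-only and
// without any sync configuration.
func restore(from backupURL: URL, into syncedRealm: Realm) throws {
    let backupConfig = Realm.Configuration(
        fileURL: backupURL,
        readOnly: true,
        objectTypes: [Service.self, Appointment.self]
    )
    let backup = try Realm(configuration: backupConfig)

    try syncedRealm.write {
        // Upsert by primary key so the original _id values are kept.
        for service in backup.objects(Service.self) {
            syncedRealm.create(Service.self, value: service, update: .modified)
        }
        // Services go first so each Appointment's link resolves to an
        // object with the same _id in the synced Realm.
        for appointment in backup.objects(Appointment.self) {
            syncedRealm.create(Appointment.self, value: appointment, update: .modified)
        }
    }
}
```

Because the objects are created with their original primary keys, the references from Appointment to Service should survive the copy. Writes to types that are not covered by a subscription are rejected by the server, so the subscriptions need to exist before calling this.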
Thank you for the explanations, this helped. In the end that worked (drop db, restart sync etc) but is this a good practice for a production app? If I drop the database then all the data is distributed on the client apps, which might get deleted and clients won’t be able to recover their data on reinstall/login.
Is there a better way to handle breaking schema changes in production? I’m not sure what “perform client reset” means in the documentation.
Right now I created another app+cluster to isolate some of these use cases and test how to best handle these scenarios.
I have a Client object and I added a new required parameter isArchived to the backend schema. After restarting sync and the client app, the beforeReset and afterReset callbacks were called and the after realm was missing the Client object I had created earlier. I don’t think this should be happening, right?
Maybe the reason is (as you mentioned) that I didn’t add the required field to all documents in the database beforehand. Is there an easy way to do that? And what is the order of operations which also minimizes downtime? For example, it could be: terminate sync, run the pipeline to add the new property to all documents, deploy schema change, start sync. Or maybe I could do all these things in a single deployment using a draft from CLI.
It still seems weird since now I can create Client documents from the iOS app without adding the isArchived property to the Swift Object. Why is this working but the client reset doesn’t?
Lastly, can you give me some details on the following?
- In the docs I’m instructed to call SyncSession.immediatelyHandleError after handling the client reset but I don’t understand what this does exactly besides making a copy of the old Realm file and creating a new, empty one.
Dropping the database and restarting sync will generally result in all data being lost, so it is not something I would particularly recommend doing.
When adding properties to the schema you should not be terminating or restarting sync. That is the step which causes client resets, and it is not required when adding properties. You should run a pipeline to add the required field to the documents* and then deploy the schema change without ever terminating sync, and no downtime at all is required.
If you don’t restart sync but add a required field which isn’t present in your documents you’ll see similar behavior, just without the client reset. The documents missing the field will be marked as unsyncable and the client-side objects corresponding to them will be deleted. After the server-side schema change has been synced to the client, objects created by the client will contain the new field even without any updates to the client app (with a default value of zero/nil/false/empty as appropriate for the type).
* There is a race condition here in active production usage: between when you run the pipeline and when the schema change is completed new documents may be created by clients which are missing the new field. If a temporary bit of weirdness is acceptable (some objects may disappear and then reappear on the client) you can rerun the pipeline after changing the schema. If not, you can use triggers to add the field to newly created documents during the migration period.
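To make the ordering concrete, here is a toy model in plain Swift. It is not the Atlas pipeline or any Realm API: documents are plain dictionaries, the "name" field is invented for the example, and the "syncable" check only mimics the rule that a document missing a required field is not synced.

```swift
// Toy model: a document is a dictionary of field names to values.
typealias Document = [String: Any]

// A document counts as syncable only if every required field is present.
func isSyncable(_ doc: Document, requiredFields: [String]) -> Bool {
    requiredFields.allSatisfy { doc[$0] != nil }
}

// Backfill: set `field` to `defaultValue` wherever it is missing,
// leaving documents that already have a value untouched.
func backfill(_ docs: [Document], field: String, defaultValue: Any) -> [Document] {
    docs.map { doc in
        var copy = doc
        if copy[field] == nil { copy[field] = defaultValue }
        return copy
    }
}

let existing: [Document] = [
    ["_id": 1, "name": "Haircut"],                       // created before the change
    ["_id": 2, "name": "Massage", "isArchived": true],   // already has the field
]
let required = ["_id", "name", "isArchived"]

// Making the field required without a backfill leaves document 1 unsyncable.
let syncableWithoutBackfill = existing.filter { isSyncable($0, requiredFields: required) }

// Backfilling first keeps every document syncable.
let backfilled = backfill(existing, field: "isArchived", defaultValue: false)
let syncableWithBackfill = backfilled.filter { isSyncable($0, requiredFields: required) }

print(syncableWithoutBackfill.count) // 1
print(syncableWithBackfill.count)    // 2
```

Rerunning the backfill after the schema change, as the footnote suggests, is just a second call to the same step: it only touches documents that are still missing the field.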
That function is for manual client resets and you shouldn’t call it from the automatic client reset handler. The documentation for manual client resets is somewhat vague because it proved to be something which was very difficult to implement in apps, and even more difficult to provide any sort of general guidance on how to implement them. The flow for manual resets roughly is:
1. Make a backup copy of the current local file.
2. Delete the local file and redownload it from the server.
3. Do some sort of data recovery to extract unsynchronized changes from the backup file.
Step 2 requires that the existing Realm be closed, which is a somewhat complicated thing (we can’t just unilaterally close the file as you may be reading from it on another thread). As a result, we normally do most of this the next time the Realm is opened rather than immediately upon receiving the error. SyncSession.immediatelyHandleError tells us to instead do it right away. This requires that you’ve ensured that you no longer have any references to the Realm, which in practice is a rather difficult thing to do.
This makes sense, so basically whenever I make a schema change that might mess with existing documents, I should have a pipeline and a trigger in place to make sure there is no data loss.
Now if I want to remove a required property, can I avoid a client reset error and the need to handle it manually by running a pipeline to update the documents to match the schema? Or is it better to employ the partner collection strategy?
Can you also tell me if there is a possibility of automatic client reset creating the corresponding objects in the new realm with different _id values? This happened during my previous tests but I’m not sure if I did something wrong or if this is a valid scenario. Basically, the objects that had the schema change had different ids in the after realm compared to before, and also the objects that were directly referencing them had null values instead.
Removing a required field is currently not supported and is where you’d need the partner collection approach. We’re currently working on adding support for removing fields and switching fields between optional and required without restarting sync, but no ETA on that being released.
The concept of a “corresponding object” with a different _id value doesn’t make much sense. The _id field is an object’s unique identifier, so two objects with different _ids are just two unrelated objects. Are you doing some sort of data initialization where you check if any objects exist, and if none do, create a set of expected objects? If so then what you saw happen could make sense: Client A creates objects, and then some other objects linking to those objects. Schema is updated on the server without adding the required field to the existing documents, so all existing documents are marked as non-syncable. Client B starts and creates a new set of objects. Client A starts and gets a client reset due to sync being restarted. The fresh data downloaded doesn’t contain any of the objects it created (as they’re unsyncable) but does have the objects Client B created after the schema change. All of the links are pointing to null because the objects they linked to have been deleted.
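The sequence above can be traced with a small plain Swift simulation. It is only a model of the reasoning, not Realm code: ServiceDoc, AppointmentDoc and the download function are invented for the example, and integer ids stand in for ObjectId values.

```swift
// Toy stand-ins for the documents in the scenario.
struct ServiceDoc {
    let id: Int
    var isArchived: Bool?   // nil means the required field is missing
}

struct AppointmentDoc {
    let id: Int
    var serviceID: Int?     // optional link to a ServiceDoc
}

// Models what a fresh download contains: services missing the required
// field are unsyncable and dropped, and links to them become nil.
func download(
    services: [ServiceDoc],
    appointments: [AppointmentDoc]
) -> (services: [ServiceDoc], appointments: [AppointmentDoc]) {
    let syncable = services.filter { $0.isArchived != nil }
    let syncableIDs = Set(syncable.map(\.id))
    let relinked = appointments.map { appointment -> AppointmentDoc in
        var copy = appointment
        if let target = copy.serviceID, !syncableIDs.contains(target) {
            copy.serviceID = nil
        }
        return copy
    }
    return (syncable, relinked)
}

// Client A seeds services before the schema change (no isArchived yet)
// and creates an appointment that links to one of them.
var serverServices = [ServiceDoc(id: 1, isArchived: nil), ServiceDoc(id: 2, isArchived: nil)]
let serverAppointments = [AppointmentDoc(id: 10, serviceID: 1)]

// The schema changes, then Client B sees no services and seeds its own set.
serverServices += [ServiceDoc(id: 101, isArchived: false), ServiceDoc(id: 102, isArchived: false)]

// Client A goes through a client reset and downloads fresh state.
let afterReset = download(services: serverServices, appointments: serverAppointments)

print(afterReset.services.map(\.id))          // [101, 102]
print(afterReset.appointments[0].serviceID as Any) // nil
```

From Client A's point of view the services appear to have "new ids", but they are Client B's unrelated objects, and the appointment's link is null because its original target is no longer syncable.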