Large initial transfer size

I posted a question earlier about time to complete initial sync.

Moving to an M10 was helpful in speeding this up. However, on further investigation, it appears that the total bytes transferred is around 5x larger than my napkin calculations initially indicated and about 2x larger than the “Uncompressed data size” reported in my Atlas connection.

It’s not clear to me why. Is there an Atlas setting I can trigger to do some compacting?


I can recommend taking a look through this advanced guide. I think you may want to try playing with the max offline time option once you understand the tradeoffs. It might help you reduce the incoming changeset size if you decide it is acceptable for your application.


Thanks @James_Stone. We are still in development so lots of room to experiment.

Backend compaction runs regularly on all synced clusters as part of Realm Sync.

Can I or MongoDB Support trigger this manually? I see no reference to it in the guide. Otherwise I’ll have to try setting the max offline time to 1 day.

We just changed our models to move the _id to be a UUID (from String) and to remove a String field from the model. This resulted in the size of the data being synced doubling (!), which we did not expect, but the Uncompressed data size shrank, which we did expect.

I don’t understand what the additional data is that the client SDK pulls down. I’d assumed it would be the uncompressed data size plus some overhead for structure, but the delta between the Atlas size and the transferred bytes is wide. I have been through the guide, but is there some documentation that might explain more of this (outside of history)?

Can I or Mongo Support trigger this manually

The history compaction takes place automatically for any changes that fall outside the max offline time and are eligible for compaction. But depending on the pattern of changes your app makes, your history may not have very many eligible changes. Setting this to 1 day is a good idea for experimentation to observe the effects, but it is very aggressive for a production app.
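The eligibility point can be illustrated with a toy model (this is purely illustrative, not the actual sync compaction algorithm): changesets older than the max-offline window that touch the same object can be squashed into one, so history that repeatedly updates a few objects compacts well, while insert-heavy history barely shrinks.

```python
from datetime import datetime, timedelta

def compact_history(changes, now, max_offline_days):
    """Toy model: outside the max-offline window, keep only the
    latest change per object; inside the window, keep everything."""
    cutoff = now - timedelta(days=max_offline_days)
    old = [c for c in changes if c["at"] < cutoff]
    recent = [c for c in changes if c["at"] >= cutoff]
    latest_old = {}
    for c in old:                      # later entries win
        latest_old[c["obj"]] = c
    return list(latest_old.values()) + recent

now = datetime(2022, 6, 1)
# 10 updates to the same object, all 40 days old, plus one recent change.
changes = [{"obj": "a", "at": now - timedelta(days=40)} for _ in range(10)]
changes.append({"obj": "b", "at": now - timedelta(days=1)})

compacted = compact_history(changes, now, max_offline_days=30)
print(len(changes), "->", len(compacted))  # 11 -> 2
```

If instead every change were an insert of a distinct object, nothing would squash, which is why the observed savings depend so heavily on your app's write pattern.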

A UUID is always going to consume 16 bytes. How long were your strings previously?
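To put rough numbers on that (the string values below are hypothetical examples, not your data): a binary UUID is a fixed 16 bytes, while a string _id costs its UTF-8 byte length, so the direction of the change depends on what your strings looked like.

```python
import uuid

# A UUID _id is stored as a fixed 16 bytes.
uuid_size = len(uuid.uuid4().bytes)

# A string _id costs its UTF-8 length; two common shapes:
objectid_hex = "64b0f1c2a9e77d3f5c8b4a21"   # 24-char ObjectId-style hex string
uuid_as_string = str(uuid.uuid4())          # 36-char textual UUID

print(uuid_size)                            # 16
print(len(objectid_hex.encode("utf-8")))    # 24
print(len(uuid_as_string.encode("utf-8")))  # 36
```

Note that a migration that rewrites every object's _id also appends all of those rewrites to the sync history, which is one plausible way the transferred bytes can grow even while the stored state shrinks.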

How are you measuring bytes transferred to the client? Are you observing a sync progress listener, or some other way? If you are comparing the size of the Realm file on disk to the state in Atlas, be aware that a Realm file may contain multiple versions of the state to allow readers at different versions, so unless you are comparing a “compacted” Realm (which can be achieved by Realm.writeCopyTo) then the comparison is apples to oranges.

As you correctly noted, the data that the client pulls down is the history of changes for that partition, not the final state. This is necessary for conflict resolution, but it means that observing the bytes transferred is not the same as the bytes stored to disk on the client. Is this a sufficient explanation of the delta between Atlas size and transferred bytes? I am not aware of any existing documentation on this, but I think it is meant to be an implementation detail that you don’t have to worry about.
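A toy calculation (with made-up byte counts) shows why replaying history can cost more than the final state: an object that was updated several times is paid for once per changeset during sync, but stored only once in the end state.

```python
# Hypothetical per-object history, in bytes per changeset:
# one insert (100 B) followed by five small updates (30 B each).
history = [100] + [30] * 5

bytes_transferred_on_sync = sum(history)  # client replays the full history
final_state_bytes = 100                   # state stores one copy of the object

print(bytes_transferred_on_sync)  # 250
print(final_state_bytes)          # 100
```

The more update-heavy the partition's history, the wider this gap gets, which matches the 2x-5x delta observed above.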

If you are concerned about optimizing the bytes transferred, consider how over time it is much more efficient to only be sending the delta of changes to objects (how Realm sync works by sending history) compared to sending the entire state every time (how you might go about implementing this via some HTTP service).
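The trade-off in that last paragraph can also be put in rough, hypothetical numbers: once the initial sync is paid for, shipping only deltas quickly beats re-sending the full state on every change.

```python
state_bytes = 1_000_000   # hypothetical full partition state
delta_bytes = 200         # hypothetical typical small changeset
num_changes = 50

# Naive "send everything each time" approach (e.g. a plain HTTP service):
resend_full_state = num_changes * state_bytes

# History-based approach: only the per-change delta crosses the wire.
send_deltas_only = num_changes * delta_bytes

print(resend_full_state)  # 50000000
print(send_deltas_only)   # 10000
```

So the history mechanism that inflates the initial transfer is the same one that keeps steady-state traffic small.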

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.