
Realm Partitioning Strategies

Published: Apr 30, 2021

  • Realm
  • Mobile
  • Swift

By Andrew Morgan


Realm partitioning can be used to control what data is synced to each mobile device, ensuring that your app is efficient, performant, and secure. This article will help you pick the right partitioning strategy for your app.

MongoDB Realm Sync stores the superset of your application data in the cloud using MongoDB Atlas. The simplest strategy is that every instance of your mobile app contains the full database, but that quickly consumes a lot of space on the users' devices and makes the app slow to start while it syncs all of the data for the first time. Alternative strategies include partitioning by:

  • User
  • Group/team/store
  • Channel/room/topic
  • Geographic region
  • Bucket of time
  • Any combination of these


#Prerequisites

The first part of the article has no prerequisites.

The second half shows how to set up partitioning and open Realms for a partition. If you want to try this in your own apps and you haven't worked with Realm before, then it would be helpful to try a tutorial for iOS or Android first.

#Introduction to MongoDB Realm Sync Partitioning

MongoDB Realm Sync lets a "user" access their application data from multiple mobile devices, whether they're online or disconnected from the internet. The data for all users is stored in MongoDB Atlas. When a user is logged into a device and has a network connection, the data they care about (what that means and how you control it is the subject of this article) is synchronized. When the device is offline, changes are stored locally and then synced when it's back online.

There may be cases where all data should be made available to all users, but I'd argue that it's rare that there isn't at least some data that shouldn't be universally shared. E.g., in a news app, the user may select which topics they want to follow, and set flags to indicate which articles they've already read—that data shouldn't be seen by others.

In this article, I'm going to refer to "users," but for some apps, you could substitute "store," "meeting room," "device," "location," and so on.

Why bother limiting what data is synced to a mobile app? There are a couple of reasons:

  • Capacity: Why waste limited resources on a mobile device to store data that the user has no interest in?
  • Security: If a user isn't entitled to see a piece of data, it's safest not to store it on their device.

The easiest way to understand how partitions work in MongoDB Realm Sync is to look at an example.

Figure: MongoDB Realm Sync Partitions. Shape collections (circles, stars, and triangles) in the Atlas database. Each shape in those collections also has a color. Smaller Realm databases each contain different shapes with the same color.

This example app works with shapes. The mobile app defines classes for circles, stars and triangles. In Atlas, each type of shape is stored in a distinct collection (circles, stars and triangles). Each of the shapes (regardless of which of the collections it's stored in) has a color attribute.

When using the mobile app, the user is interested in working with a color. It could be that the user is only allowed to work with a single color, or it could be that the user can pick what color they currently want to work with. The backend Realm app gets to control which colors a given user is permitted to access.

The developer implements this by designating the color attribute as the partition key.

A view in the mobile app can then open a synced Realm by specifying the color it wants to work with. The backend Realm app will then sync all shapes of that color to the mobile Realm, or it will reject the request if the user doesn't have permission to access that partition.
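As a rough illustration (this is a model of the behavior, not how Realm Sync is implemented), the exact-match selection can be mimicked in plain JavaScript. The `atlas` object, its sample documents, and the `documentsForPartition` helper are all made up for this sketch:

```javascript
// Hypothetical sample data: three collections, and every document carries
// the partition key attribute (`color`).
const atlas = {
  circles: [{ _id: 1, color: "red" }, { _id: 2, color: "blue" }],
  stars: [{ _id: 3, color: "blue" }, { _id: 4, color: "green" }],
  triangles: [{ _id: 5, color: "red" }, { _id: 6, color: "blue" }]
};

// For each collection, keep only the documents whose partition key exactly
// matches the requested value. The result models one synced Realm.
function documentsForPartition(database, partitionValue) {
  const realm = {};
  for (const [name, docs] of Object.entries(database)) {
    realm[name] = docs.filter(doc => doc.color === partitionValue);
  }
  return realm;
}

const blueRealm = documentsForPartition(atlas, "blue");
console.log(blueRealm);
```

The "blue" Realm contains shapes from all three collections, but only the blue ones.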

There are some constraints on the partition key:

  • The app must specify a single, exact partition key value when opening a synced Realm. It can open a Realm containing the blue shapes, or one containing the green shapes, but it cannot open a single synced Realm that contains both the red and green shapes.
  • The app cannot open a synced Realm for a range or pattern of partition key values. E.g., it can't specify "all colors except red" or "all dates in the last week".
  • Every collection must use the same partition key. In other words, you can't use color as the partition key for collections in the shapes database and username for collections in the user database. You'll see later that there's a technique to work around this.
  • You can change the value of the partition key (convert a red triangle into a green triangle), but it's inefficient as it results in the existing document being deleted and a new one being inserted.
  • The partition key must be one of these types:

    • String
    • ObjectID
    • Int
    • Long

The mobile app can ask to open a Realm using any value for the partition key, but it might be that the user isn't allowed access to that partition. For security, that check is performed in the backend Realm application. The developer can provide rules to decide if a user can access a partition, and the decision could be any one of:

  • No.
  • Yes, but only for reads.
  • Yes, for both reads and writes.

The permission rules can be anything from a simple expression that matches the partition key value, to a complex function that cross-references other collections.

In reality, the rules don't need to be based on the user. For example, the developer could decide that the "happy hour" chat room (partition) can only be opened on Fridays.
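Here's a minimal sketch of that kind of decision in plain JavaScript. In a real backend Realm app, the read and write rules are separate true/false expressions; this sketch collapses them into one hypothetical `accessFor` helper that returns "none", "read", or "readWrite". The "announcements" partition is invented for the example:

```javascript
// Hypothetical helper: decides what access a partition gets at time `now`.
// Returns "none", "read", or "readWrite".
function accessFor(partition, now) {
  const FRIDAY = 5; // Date.getUTCDay(): 0 = Sunday ... 6 = Saturday
  if (partition === "happy hour") {
    // The rule depends on the day of the week, not on who the user is.
    return now.getUTCDay() === FRIDAY ? "readWrite" : "none";
  }
  if (partition === "announcements") {
    return "read"; // assumed read-only partition
  }
  return "none";
}

console.log(accessFor("happy hour", new Date("2021-04-30T17:00:00Z"))); // a Friday
console.log(accessFor("happy hour", new Date("2021-04-29T17:00:00Z"))); // a Thursday
```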

#Choosing the Right Strategy(ies) for Your App

This section takes a deeper look at some of the partitioning strategies that you can adopt (or that may inspire you to create a bespoke approach). As you read through these strategies, remember that you can combine them within a single app. This is the meta-strategy we'll look at last.

#Firehose

This is the simplest strategy. All of the documents/objects are synced to every instance of the app. This is a decision not to partition the data.

You might adopt this strategy for an NFL (National Football League) scores app where you want everyone to be able to view every result from every game in history—even when the app is offline.

Consider the two main reasons for partitioning:

  • Capacity: There have been fewer than 20,000 NFL games ever played, and the number is growing by fewer than 300 per year. The data for each game contains only the date, names of the two teams, and the score, and so the total volume of data is modest. It's reasonable to store all of this data on a mobile device.
  • Security/Privacy: There's nothing private in this data, and so it's safe to allow anyone on any mobile device to view it. We don't allow the mobile app to make any changes to the data. These are simple Realm Sync rules to define in the backend Realm app.

Even though this strategy doesn't require partitioning, you must still designate a partition key when configuring Realm Sync. We want all of the documents/objects to be in the same partition, and so we can add an attribute named visible and always set it to the same value (for example, the string "true", because the key must be one of the supported types listed earlier).

#User

User-based partitioning is a common strategy. Each user has a unique ID (that can be automatically created by MongoDB Realm). Each document contains an attribute that identifies the user that owns it. This could be a username, email address, or the ID generated by MongoDB Realm when the user registers. That attribute is used as the partition key.

Use cases for this strategy include financial transactions, order history, game scores, and journal entries.

Consider the two main drivers for partitioning:

  • Capacity: Only the data that's unique to the users is stored in the mobile app, which minimizes storage.
  • Security/Privacy: Users only have access to their own data.

There is often a subset of the user's data that should be made available to team members or to all users. In such cases, you may break the data into multiple collections, perhaps duplicating some data, and use different partition key values for the documents in those collections. You can see an example of this with the User and Chatster collections in the RChat app.

#Team

This strategy is used when you need to share data between a team of users. You can replace the term "team" with "agency," "store," or any other grouping of users or devices. Examples include all of the point-of-sale devices in a store or all of the employees in a department. The team's name or ID is used as the partition key and must be included in all documents in each synced collection.

The WildAid O-FISH App uses the agency name as the partition key. Each agency is the set of officers belonging to an organization responsible for enforcing regulations in one or more Marine Protected Areas. (You can think of an MPA as an ocean-based national park.) Every officer in an agency can create new reports and view all of the agency's existing reports. Agencies can customize the UI by controlling what options are offered when an officer creates a new report. E.g., an agency controlling the North Sea would include "cod" in the list of fish that could have been caught, but not "clownfish". The O-FISH menus are data-driven, with that data partitioned based on the agency.

  • Capacity: The "team" strategy consumes more space on the mobile device than the "user" partitioning strategy, but it's a good fit when all members of the team need to access the data (even when offline).
  • Security/Privacy: This strategy is used when all team members are allowed to view (and optionally modify) their team's data.

#Channel

With this strategy, a user is typically entitled to open/sync Realms from a choice of channels. For example, a sports news app might have channels for soccer, baseball, etc., a chat app would offer multiple chat rooms, and an issue tracker might partition based on product. The channel name or ID should be used as the partitioning key.

  • Capacity: The mobile app can minimize storage use on the device by only opening a Realm for the partition representing the channel that the user is currently interacting with.
  • Security/Privacy: Realm Sync permissions can be added so that a user can only open a synced Realm for a partition if they're entitled to. For example, this might be handled by storing an array of allowed channels as part of the user's data.
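A minimal sketch of that permission check, assuming the user's data holds a `channels` array (the field name, the `canOpenChannel` helper, and the sample user are all hypothetical):

```javascript
// `userData` stands in for whatever the backend stores about the user.
// The `channels` array of allowed partition values is an assumed field.
function canOpenChannel(userData, partition) {
  return Array.isArray(userData.channels) && userData.channels.includes(partition);
}

const sampleUser = { name: "Andrew", channels: ["soccer", "baseball"] };
console.log(canOpenChannel(sampleUser, "soccer"));
console.log(canOpenChannel(sampleUser, "cricket"));
```

Because the check is an exact match against the requested partition value, a user with no `channels` attribute can't open any channel.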

#Region

There are cases where you're only currently interested in data for a particular geographic area. Maps, cycle hire apps, and tourist guides are examples.

If you recall, when opening a Realm, the application must specify an exact match for the partition key, and that value needs to match the partition value in any document that is part of that partition. This restricts what you can do with location-based partitioning:

  • You can open a partition containing all documents where location is set to "London".
  • You can't open a partition containing all documents where location is set to "either London or South East England".
  • The partition key can't be an array.
  • You can't open a partition containing all documents where location is set to coordinates within a specified range.

The upshot of this is that you need to decide on geographic regions and assign them IDs or names. Each document can only belong to one of these regions. If you decided to use the state as your region, then the app can open a single synced Realm to access all of the data for Texas, but if the app wanted to be able to show data for all states in the US then it would need to open 50 synced Realms.

  • Capacity: Storage efficiency is dependent on how well your choice of regions matches how the application needs to work with the data. For example, if your app only ever lets the user work with data for a single state, then it would waste a lot of storage if you used countries as your regions.
  • Security/Privacy: In the cases that you want to control which users can access which region, Realm Sync permissions can be added.
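One way to turn coordinates into a discrete region is to check them against a list of region definitions when the document is written. The sketch below uses rough, made-up bounding boxes and a hypothetical `regionFor` helper; the first matching region wins, so each document lands in exactly one partition:

```javascript
// Very rough, illustrative bounding boxes (not authoritative boundaries).
// Order matters: the more specific region is listed first.
const regions = [
  { id: "London", minLat: 51.28, maxLat: 51.70, minLon: -0.51, maxLon: 0.33 },
  { id: "South East England", minLat: 50.70, maxLat: 52.10, minLon: -1.90, maxLon: 1.50 }
];

// Returns the ID of the first region containing the point, or "Other".
function regionFor(lat, lon, regionList) {
  const match = regionList.find(r =>
    lat >= r.minLat && lat <= r.maxLat && lon >= r.minLon && lon <= r.maxLon);
  return match ? match.id : "Other";
}

console.log(regionFor(51.5074, -0.1278, regions)); // a point in central London
console.log(regionFor(50.8225, -0.1372, regions)); // a point in Brighton
```

The returned ID would then be stored in the document's location attribute and used as the partition value.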

In some cases, you may choose to duplicate some data in the backend (Atlas) database in order to optimise the frontend storage, where resources are more constrained. A useful analogy is old-world (paper) travel guides. Lonely Planet produced a guide for Southeast Asia, in addition to individual guides for Vietnam, Thailand, Cambodia, etc. The guide for Cambodia contained 500 pages of information. Some of that same information (enough to fill 50 pages) was also printed in the Southeast Asia guide. The result was that the library of guides (think Atlas) contained duplicate information, but it had plenty of space on its shelves. When I go on vacation, I can choose which region/partition I want to take with me in my small backpack (think mobile app). If I'm spending a month trekking around the whole of Southeast Asia, then I take that guide. If I'm spending the whole month in Vietnam, then I take that guide.

If you choose to duplicate data in multiple regions, then you can set up Atlas database triggers to automate the process.
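The core of such a trigger could be a helper that produces one copy of a document per region. This is only a sketch: the `copiesForRegions` helper, the sample document, and the scheme for deriving a unique `_id` for each copy are assumptions, and `location` is taken to be the partition key as in the examples above:

```javascript
// Builds one copy of `doc` per region. Each copy needs its own _id, so this
// sketch appends the region name to the original _id (an assumed scheme).
function copiesForRegions(doc, regionIds) {
  return regionIds.map(region => ({
    ...doc,
    _id: `${doc._id}:${region}`,
    location: region // partition key value for this copy
  }));
}

const templeGuide = { _id: "angkor-wat", title: "Angkor Wat", location: "Cambodia" };
const copies = copiesForRegions(templeGuide, ["Cambodia", "Southeast Asia"]);
console.log(copies);
```

A database trigger could then upsert each of these copies whenever the source document changes.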

#Time Bucket

As with location, it doesn't make sense to use the exact time as the partition key as you typically would want to open a synced Realm for a range of times. The result is that you'd typically use discrete time ranges for your partition key values. A compatible set of partition values is "Today," "Earlier this week," "This month (but not this week)," "Earlier this year (but not this month)," "2020," "2000-2019," and "Twentieth Century."

You can use Atlas scheduled and database triggers to automatically move documents between time buckets (e.g., at midnight, find all documents with time == "Today" and set time = "Earlier this week"). Note that changing the value of a partition key is expensive as it's implemented as a delete and insert.

  • Capacity: Storage efficiency is dependent on how well your choice of time buckets matches how the application needs to work with the data. That probably sounds familiar: time bucket partitioning is analogous to region-based partitioning (with the exception that a city is unlikely to move from Florida to Alaska). As with regions, you may decide to duplicate some data, perhaps having two documents for today's data: one with time == "Today" and the other with time == "This week".
  • Security/Privacy: In the cases that you want to control which users can access which time period, Realm Sync permissions can be added.
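To make the bucketing concrete, here's a sketch of a function that maps a timestamp to one of the discrete bucket labels. It's a simplification of the list above (anything older than the current year is just labeled with its year), it works in UTC, and it assumes weeks start on Monday. The `timeBucket` name is made up:

```javascript
// Returns the bucket label for `date`, relative to `now` (both Date objects).
// All comparisons are in UTC; weeks are assumed to start on Monday.
function timeBucket(date, now) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  const startOfDay = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  // getUTCDay(): 0 = Sunday ... 6 = Saturday; shift so that Monday = 0.
  const daysSinceMonday = (now.getUTCDay() + 6) % 7;
  const startOfWeek = startOfDay - daysSinceMonday * DAY_MS;
  const startOfMonth = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1);
  const startOfYear = Date.UTC(now.getUTCFullYear(), 0, 1);

  const t = date.getTime();
  if (t >= startOfDay) return "Today";
  if (t >= startOfWeek) return "Earlier this week";
  if (t >= startOfMonth) return "This month (but not this week)";
  if (t >= startOfYear) return "Earlier this year (but not this month)";
  return String(date.getUTCFullYear());
}

const now = new Date("2021-04-30T12:00:00Z"); // a Friday
console.log(timeBucket(new Date("2021-04-27T09:00:00Z"), now));
```

A scheduled trigger could run logic like this at midnight and update only the documents whose label has changed, which keeps the number of expensive partition key changes down.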

Note that slight variations on the Region and Time Bucket strategies can be used whenever you need to partition on ranges—age, temperature, weight, exam score...

#Combination/Hybrid

For many applications, no single partitioning strategy that we've looked at meets all of its use cases.

Consider an eCommerce app. You might decide to have a single read-only partition for the entire product catalog. But, if the product catalog is very large, then you could choose to partition based on product categories (sporting goods, electronics, etc.) to reduce storage size on the mobile device. When that user browses their order history, they shouldn't drag in orders for other users, and so user-id would be a smart partitioning key. Unfortunately, the same key has to be used for every collection.

This can be solved by using partition as the partition key. partition is a String and its value is always made up of a key-value pair. In our eCommerce app, documents in the productCatalog collection could contain partition: "category=sports" and documents in the orders collection would include partition: "user=andrew@acme.com".

When the application opens a synced Realm, it provides a value such as "user=andrew@acme.com" as the partition. The Realm sync rules can parse the value of the partition key to determine if the user is allowed to open that partition by splitting the key to find the sub-key (user) and its value (andrew@acme.com). The rule knows that when key == "user", it needs to check that the current user's email address matches the value.

  • Capacity: By using an optimal partitioning sub-strategy for each type of data, you can fine-tune what data is stored in the mobile app.
  • Security/Privacy: Your backend Realm app can apply custom rules based on the key component of the partition to decide whether the user is allowed to sync the requested partition.
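The parsing and checking described above can be sketched in plain JavaScript. The helper names are hypothetical, and this version splits on the first = only, so that values which themselves contain = survive intact. The policy for the category key (readable by everyone) is an assumption for the eCommerce example:

```javascript
// Builds a partition value such as "user=andrew@acme.com".
function makePartition(key, value) {
  return `${key}=${value}`;
}

// Splits a partition value on the FIRST "=" into its sub-key and value.
// Returns null if the key or the value is missing.
function parsePartition(partition) {
  const idx = partition.indexOf("=");
  if (idx <= 0 || idx === partition.length - 1) return null;
  return { key: partition.slice(0, idx), value: partition.slice(idx + 1) };
}

// Decides whether the current user may sync the requested partition.
function canSync(partition, currentUserEmail) {
  const parsed = parsePartition(partition);
  if (!parsed) return false;
  switch (parsed.key) {
    case "user":
      return parsed.value === currentUserEmail;
    case "category":
      return true; // assumed: the product catalog is readable by everyone
    default:
      return false;
  }
}

console.log(canSync(makePartition("user", "andrew@acme.com"), "andrew@acme.com"));
```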

You can see an example of how this is implemented for a chatroom app in Building a Mobile Chat App Using Realm – Data Architecture.

#Setting Up Partitions in the Backend Realm App

You need to set up one backend Realm app, which can then be used by both your iOS and Android apps. You can also have multiple iOS and Android apps using the same back end.

#Set Partition and Enable MongoDB Realm Sync

From the Realm UI, select the "Sync" tab. From that view, you select whether you'd prefer to specify your schema through the back end or have it automatically derived from the Realm Objects that you define in your mobile app. If you don't already have data in your Atlas database, then I'd suggest the second option which turns on "Dev Mode," which is the quickest way to get started:

Screenshot of the Realm UI. We select "Sync" from the left-hand menu and then click the "Define Data Models" button

On the next screen, select your key, specify the attribute to use as the partition key (in this case, a new string attribute named "partition"), and the database. Click "Turn Dev Mode On":

Screenshot of the Realm UI. Select the name, partition, and database name before clicking the "Turn Dev Mode On" button

Click on the "REVIEW & DEPLOY" button. You'll need to do this every time you change the Realm app, but this is the last time that I'll mention it:

Screenshot of the Realm UI. Click on the "REVIEW & DEPLOY" button

Now that Realm sync has been enabled, you should ensure that you set the partition attribute in all documents in any collections to be synced.

#Sync Rules

Realm Sync rules control whether the user/app is permitted to sync a partition or not.

A common misconception is that sync rules can control which documents within a partition will be synced. That isn't the case. They simply determine (true or false) whether the user is allowed to sync the entire partition.

The default behaviour is that the app can sync whichever partition it requests, and so you need to change the rules if you want to increase security/privacy—which you probably do before going into production!

To see or change the rules, select the "Configuration" tab and then expand the "Define Permissions" section:

Screenshot of the Realm UI. Select the "Configuration" tab and then expand the "Define Permissions" section

Both the read and write rules default to true.

You should click "Pause Sync" before editing the rules and then re-enable sync afterwards.

The rules are JSON expressions that have access to the user object (%%user) and the requested partition (%%partition). If you're using the user ID as your partitioning key, then this rule would ensure that a user can only sync the partition containing their documents: { "%%user.id": "%%partition" }.

For more complex partitioning schemes (e.g., the combination strategy), you can provide a JSON expression that delegates the true/false decision to a Realm function:

{
  "%%true": {
    "%function": {
      "arguments": [
        "%%partition"
      ],
      "name": "canReadPartition"
    }
  }
}

It's then your responsibility to create the canReadPartition function. Here's an example from the RChat app:

exports = function(partition) {
  console.log(`Checking if can sync a read for partition = ${partition}`);

  const db = context.services.get("mongodb-atlas").db("RChat");
  const userCollection = db.collection("User");
  const user = context.user;
  let partitionKey = "";
  let partitionValue = "";

  // The partition has the form "key=value"
  const splitPartition = partition.split("=");
  if (splitPartition.length == 2) {
    partitionKey = splitPartition[0];
    partitionValue = splitPartition[1];
    console.log(`Partition key = ${partitionKey}; partition value = ${partitionValue}`);
  } else {
    console.log(`Couldn't extract the partition key/value from ${partition}`);
    return false;
  }

  switch (partitionKey) {
    case "user":
      // Users can only read their own partition
      console.log(`Checking if partitionValue(${partitionValue}) matches user.id(${user.id}): ${partitionValue === user.id}`);
      return partitionValue === user.id;
    case "conversation":
      // Users can only read conversations listed in their own User document
      console.log(`Looking up User document for _id = ${user.id}`);
      return userCollection.findOne({ _id: user.id })
        .then(userDoc => {
          if (userDoc.conversations) {
            let foundMatch = false;
            userDoc.conversations.forEach(conversation => {
              console.log(`Checking if conversation.id (${conversation.id}) === ${partitionValue}`);
              if (conversation.id === partitionValue) {
                console.log(`Found matching conversation element for id = ${partitionValue}`);
                foundMatch = true;
              }
            });
            if (foundMatch) {
              console.log(`Found Match`);
              return true;
            } else {
              console.log(`Checked all of the user's conversations but found none with id == ${partitionValue}`);
              return false;
            }
          } else {
            console.log(`No conversations attribute in User doc`);
            return false;
          }
        }, error => {
          console.log(`Unable to read User document: ${error}`);
          return false;
        });
    case "all-users":
      console.log(`Any user can read all-users partitions`);
      return true;
    default:
      console.log(`Unexpected partition key: ${partitionKey}`);
      return false;
  }
};

The function splits the partition string, taking the key from the left of the = symbol and the value from the right side. It then runs a specific check based on the key:

  • user: Checks that the value matches the current user's ID.
  • conversation: This is used for the chat messages. Checks that the value matches one of the conversations stored in the user's document (i.e. that the current user is a member of the chat room.)
  • all-users: This is used for the Chatster collection which provides a read-only view of a subset of each user's data, such as their name and presence state. This data is readable by anyone and so the function always returns true.

RChat also has a canWritePartition function which has a similar structure but applies different checks. You can view that function in the RChat source code.

#Triggers

MongoDB Realm provides three types of triggers:

  • Authentication: Often used to create a user document when a new user registers.
  • Database: Invoked when your nominated collection is updated. You can use database triggers to automate the duplication of data so that it can be shared through a different partition.
  • Scheduled: Similar to a cron job, scheduled triggers run at a specified time or interval. They can be used to move documents into different time buckets (e.g., from "Today" into "Earlier this week").

In the RChat app, only the owner is allowed to read or write their User document, but we want the user to be discoverable by anyone and for their presence state to be visible to others. We add a database trigger that mirrors a subset of the User document to a Chatster document which is in a publicly visible partition.

The first step is to create a database trigger by selecting "Triggers" and then clicking "Add a Trigger":

Screenshot of the Realm UI. Select "Triggers" and then click "Add a Trigger"

Fill in the details about the collection that invokes the new trigger, specify which operations we care about (all of them), and then indicate that we'll provide a new function to be executed when the trigger fires:

Screenshot of the Realm UI. Fill in details for the trigger such as the collection that it is triggered by and that it should invoke a function that hasn't yet been created

After saving that definition, you're taken to the function editor to add the logic. This is the code for the trigger on the User collection:

exports = function(changeEvent) {
  const db = context.services.get("mongodb-atlas").db("RChat");
  const chatster = db.collection("Chatster");
  const userCollection = db.collection("User");
  const docId = changeEvent.documentKey._id;
  const user = changeEvent.fullDocument;
  let conversationsChanged = false;

  console.log(`Mirroring user for docId=${docId}. operationType = ${changeEvent.operationType}`);
  switch (changeEvent.operationType) {
    case "insert":
    case "replace":
    case "update": {
      console.log(`Writing data for ${user.userName}`);
      // Mirror the publicly visible subset of the User document
      const chatsterDoc = {
        _id: user._id,
        partition: "all-users=all-the-users",
        userName: user.userName,
        lastSeenAt: user.lastSeenAt,
        presence: user.presence
      };
      if (user.userPreferences) {
        const prefs = user.userPreferences;
        chatsterDoc.displayName = prefs.displayName;
        if (prefs.avatarImage && prefs.avatarImage._id) {
          console.log(`Copying avatarImage`);
          chatsterDoc.avatarImage = prefs.avatarImage;
          console.log(`id of avatarImage = ${prefs.avatarImage._id}`);
        }
      }
      chatster.replaceOne({ _id: user._id }, chatsterDoc, { upsert: true })
        .then(() => {
          console.log(`Wrote Chatster document for _id: ${docId}`);
        }, error => {
          console.log(`Failed to write Chatster document for _id=${docId}: ${error}`);
        });

      // Activate pending invites and copy the conversation to the invited users
      if (user.conversations && user.conversations.length > 0) {
        for (let i = 0; i < user.conversations.length; i++) {
          let membersToAdd = [];
          if (user.conversations[i].members.length > 0) {
            for (let j = 0; j < user.conversations[i].members.length; j++) {
              if (user.conversations[i].members[j].membershipStatus == "User added, but invite pending") {
                membersToAdd.push(user.conversations[i].members[j].userName);
                user.conversations[i].members[j].membershipStatus = "Membership active";
                conversationsChanged = true;
              }
            }
          }
          if (membersToAdd.length > 0) {
            userCollection.updateMany({userName: {$in: membersToAdd}}, {$push: {conversations: user.conversations[i]}})
              .then(result => {
                console.log(`Updated ${result.modifiedCount} other User documents`);
              }, error => {
                console.log(`Failed to copy new conversation to other users: ${error}`);
              });
          }
        }
      }
      if (conversationsChanged) {
        userCollection.updateOne({_id: user._id}, {$set: {conversations: user.conversations}})
          .then(() => {
            console.log(`Updated conversations for _id: ${docId}`);
          }, error => {
            console.log(`Failed to update conversations for _id=${docId}: ${error}`);
          });
      }
      break;
    }
    case "delete":
      chatster.deleteOne({_id: docId})
        .then(() => {
          console.log(`Deleted Chatster document for _id: ${docId}`);
        }, error => {
          console.log(`Failed to delete Chatster document for _id=${docId}: ${error}`);
        });
      break;
  }
};

Note that the Chatster document is created with partition set to "all-users=all-the-users". This is what makes the document accessible by any user.

#Accessing Realm Partitions from Your Mobile App (iOS or Android)

In this section, you'll learn how to request a partition when opening a Realm. If you want more of a primer on using Realm in a mobile app, then the iOS and Android tutorials mentioned in the prerequisites are a good starting point.

First of all, note that you don't need to include the partition key in your iOS or Android Object definitions. It is handled automatically by Realm.

All you need to do is specify the partition value when opening a synced Realm:

ChatRoomBubblesView(conversation: conversation)
    .environment(
        \.realmConfiguration,
        app.currentUser!.configuration(partitionValue: "conversation=\(conversation.id)"))

#Summary

At this point, you've hopefully learned:

  • That MongoDB Realm Sync partitioning is a great way to control data privacy and storage requirements in your mobile app.
  • How Realm partitioning works.
  • A number of partitioning strategies.
  • How to combine strategies to build the optimal solution for your mobile app.
  • How to implement your partitioning strategy in your backend Realm app and in your iOS/Android mobile apps.

#Resources

If you have questions, please head to our developer community website where the MongoDB engineers and the MongoDB community will help you build your next big idea with MongoDB.


© MongoDB, Inc.