Master Detail Transactions in MongoDB

MongoDB

#Releases

In relational databases, transactions let you make reliable atomic updates to your data. Because relational schemas are often highly normalized, most logical transactions span multiple tables, so it is important to be able to do multiple updates atomically (all or nothing).

While MongoDB does not have multi-document transactions, it makes up for this in many use cases through its document oriented data model. In this post, we’ll talk about the Master-Detail design pattern that comes up very often in data modelling that almost always requires multi-statement transactions in an RDBMS, but is easily handled without cross-statement transactions in MongoDB.

Master-Detail transactions in an RDBMS

As an example of the Master-Detail pattern, consider a Purchase Order with multiple line items. In an RDBMS, we might model this as a Purchase Order table (the Master) and a Line Item table (the Detail). To get a purchase order, I need to join Purchase Order and Line Item tables together to get all of the info in the purchase order.

I might model my Purchase Orders as follows in an RDBMS:

CREATE TABLE purchase_orders ( 
	id INT NOT NULL,
	title VARCHAR(100),
	total DECIMAL(10,2)
);
<p>CREATE TABLE line_items (
id INT NOT NULL,
sku VARCHAR(100),
quantity INT,
price DECIMAL(10,2),
purchase_order_id INT,
Foreign Key (purchase_order_id) references purchase_orders(id)
);

If I want to make atomic updates to a purchase order and its line items, I need a multi-statement transaction. For example, if I am going to create a purchase order, I might follow these steps:

START TRANSACTION;

/* Create a purchase order row */
INSERT INTO purchase_orders (id,title,total) VALUES (1, ‘purchase order 1’, 10.50);

/* Create a line item, including the foreign key of the purchase order we just created */
INSERT INTO line_items(id,sku,quantity,price,purchase_order_id) VALUES (2, ‘a’, 1, 10.50 1);

COMMIT;

With this update, there is never a time where the Purchase Order exists but has no Line Items in it. The whole object and its details are committed in a single transaction.

Now if I need to update that Purchase Order, say to add a few more line items, then I would perform another transaction.

START TRANSACTION;
<p>/* Add some new line item to the PO */
INSERT INTO line_items(id,sku,quantity,purchase_order_id) VALUES (3, ‘b’, 1, 12.34, 1);
INSERT INTO line_items(id,sku,quantity,purchase_order_id) VALUES (4, ‘c’, 1, 15.25, 1);</p>
<p>/* Update the “total” field of the purchase order to reflect the added line items */
UPDATE purchase_orders SET total = (total + 12.34 + 15.25) WHERE id = 1;</p>
<p>COMMIT;

This time I’ve ensured that my two new Line Items appear at the same time (or not at all) and that the total field of the Purchase Order is updated at the same time. No client will ever see a state in which only one of those Line Items exists nor any state where the total does not match the sum of the line items.

Master-Detail in MongoDB

Working with MongoDB is a bit different. While we don’t have the ability to perform multi-documents transactions (at least so far). However this Master-Detail pattern can be handled without multi-statement transactions thanks to MongoDB’s richer data model.

MongoDB can store data as documents with nested objects, arrays, and other rich data types. Thus we don’t need to store Master-Detail entities in multiple collections, or even in more than one document. A common way of modeling such objects with MongoDB is using nested documents. So our Purchase Order model might look like this:

var purchase_order = { 
  _id: 1
  title: ‘Purchase order 1’,
  total: 10.50,
  line_items: [ 
    { sku: ‘a’, quantity: 1, price: 10.50 }
  ]
} 

Let’s look at how we can get the same level of atomicity from MongoDB without needing multi-statement transactions!

First, we want to be able to create a new purchase order and its first line items atomically.

db.purchase_orders.save( purchase_order )

This atomically creates the purchase order and its initial items in a single operation. Just as with the SQL scenario, clients will never see a point in time where the purchase order is empty. It all succeeds in a single step.

Now what about modifying that purchase order? If we want to add some items to the PO, we can do so like this:

db.purchase_orders.update( { _id: 1234 }, { 
  $pushAll: { line_items: [
      { sku: ‘c’, quantity: 1, price: 12.34 },
      { sku: ‘d’, quantity: 1, price: 15.25 } 
    ],
  $inc: { total: 27.59 } }
});

The $pushAll operator atomically appends values onto an array attribute. Just as with our RDBMS scenario, this update is atomic and the whole command either succeeds or fails. Meanwhile the $inc operator atomically increments the “total” field of the purchase order. All of these updates happen atomically and they succeed or fail as a group so another client will never see an inconsistent state of the order.

Summary

It turns out that most of the time where you find yourself with a Master-Detail pattern in an RDBMS, you can achieve the same level of consistency in MongoDB by modelling your object as a rich, nested document, rather than multiple joined tables. Combine this with MongoDB’s atomic update operators, and you can solve most of what you would traditionally do with multi-statement transactions in an RDBMS.

An RDBMS needs multi-statement transactions for these scenarios because the only way it has to model these types of objects is with multiple tables. By contrast, MongoDB can go much further with single-statement transactions because there’s no need to join on simple updates like this.

This is not to say that multi-statement transactions are not useful. If you need to perform cross-entity transactions (e.g. move a line item from one purchase order to another) or if you need to modify a purchase order and inventory objects in a single step, then you may still need multi-statement transactions. Or else you would have to take some alternate approach.

But it turns out that many of the cases where we traditionally need multi-statement transactions go away when we can model objects as documents and perform atomic updates on those documents.

Another aspect of transactions and ACID is isolation. MongoDB does not support fully generalized snapshotting. We haven’t discussed that here; it’s probably a good topic for another blog post in the future.

Jared Rosoff (@forjared)