GIANT Stories at MongoDB

MongoSV Recap

Last week over 1,100 developers came together for MongoSV, the largest MongoDB conference to date. 10gen kicked off MongoSV with our inaugural MongoDB Masters program, which brought together MongoDB evangelists from around the world.

At the opening keynote, 10gen CTO Eliot Horowitz demoed a Twitter app for #mongoSV tweets, featuring the new aggregation framework expected in the MongoDB 2.2 release. The app gathers all of the tweets sent out with the hashtag #mongoSV and organizes them by recency and retweet count. Get the source code for the demo app here.
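To give a flavor of the aggregation framework, a pipeline in the spirit of the demo might look like the following sketch; the collection and field names here are hypothetical, not taken from the demo app's actual code.

db.tweets.aggregate(
  { $match: { hashtags: "mongosv" } },               // only tweets tagged #mongoSV
  { $sort: { retweet_count: -1, created_at: -1 } },  // most retweeted, most recent first
  { $limit: 10 }                                     // top ten
)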

Highlights from MongoSV include presentations on X.commerce’s new open source developer platform, MongoDB’s integration with Azure, MongoDB’s new aggregation framework, how Disney manages its deployment of 1,400 MongoDB instances, and more.

the 10gen booth at MongoSV



10gen President Max Schireson welcomes the Speakers and Masters to MongoSV

MongoDB 2.0 Released

The MongoDB development team is pleased to announce the release of version 2.0.0. Version 2.0 is the latest stable release, following the March 2011 release of version 1.8. This release includes many new features, improvements to existing features, and performance enhancements.

Please note that version 2.0 is a significant new release, but it is numbered 2.0 simply because 1.8 + 0.2 = 2.0; the upgrade from 1.6 to 1.8, for example, was similar in scope.

Highlights of the 2.0 release:

Concurrency improvements in 2.0 are just the start of a much larger concurrency roadmap we are working on. In 2.0, we are beginning to address one of the biggest issues: holding locks during a page fault. 2.0 tracks memory caching and has the ability to yield a lock and service the fault outside of it. This is hooked in a number of places, notably: updates by _id, removes, and long table scans.

Being able to keep the working index set in memory is an important factor in overall performance, and we overhauled indexes to make this easier. Indexes in 2.0 are about 25% smaller and faster, meaning that you can fit more in memory.
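As a rough way to check whether your indexes fit in memory, you can compare their reported size against your RAM from the shell; the collection name here is illustrative.

// index size, in bytes, for one collection
db.users.stats().totalIndexSize

// total index size for the current database
db.stats().indexSize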

Replica sets get two important new features in 2.0: priorities and tagging. Priorities let you mark the nodes you prefer to be primary if you have a non-homogeneous environment. Tagging lets you guarantee that writes hit certain groups of servers. One use case for this is guaranteeing that a new user registration is written to two data centers before acknowledging the write to the user.
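As a sketch of how tags can back that guarantee (the host names and the mode name here are illustrative): tag each member with its data center, then define a getLastErrorModes rule requiring a write to reach two distinct dc values before it is acknowledged.

config = {
  _id: "myset",
  members: [
    { _id: 0, host: "ny1.example.com", priority: 2, tags: { dc: "ny" } },
    { _id: 1, host: "ny2.example.com", tags: { dc: "ny" } },
    { _id: 2, host: "sf1.example.com", tags: { dc: "sf" } }
  ],
  settings: {
    // a write acknowledged with w: "multiDC" must reach two different dc tags
    getLastErrorModes: { multiDC: { dc: 2 } }
  }
}
rs.initiate(config)

// register the user, then block until the write is in both data centers
db.users.insert({ username: "newuser" })
db.runCommand({ getLastError: 1, w: "multiDC" })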

There were many other improvements, so we encourage those interested to look at the change log. Overall, this release added a large number of small performance and concurrency improvements, greater stability to sharding, and better replica set operations.

Downloads: http://www.mongodb.org/downloads

Release Notes: http://www.mongodb.org/display/DOCS/2.0+Release+Notes

Full change log: https://jira.mongodb.org/secure/IssueNavigator.jspa?requestId=10140

For the full scoop on what’s new in MongoDB version 2.0, register for our live webinar on Thursday, September 15th. We will have two sessions: the first at 10am GMT, and another at 1:30pm ET/10:30am PT.

Thank you to the MongoDB community for your continued feedback and testing through the 1.9 development series.

-Eliot and the MongoDB Team

Master Detail Transactions in MongoDB

In relational databases, transactions let you make reliable atomic updates to your data. Because relational schemas are often highly normalized, most logical transactions span multiple tables, so it is important to be able to do multiple updates atomically (all or nothing).

While MongoDB does not have multi-document transactions, its document-oriented data model makes up for this in many use cases. In this post, we’ll talk about the Master-Detail design pattern, which comes up very often in data modeling: it almost always requires multi-statement transactions in an RDBMS, but is easily handled without cross-statement transactions in MongoDB.

Master-Detail transactions in an RDBMS

As an example of the Master-Detail pattern, consider a Purchase Order with multiple line items. In an RDBMS, we might model this as a Purchase Order table (the Master) and a Line Item table (the Detail). To get a purchase order, I need to join Purchase Order and Line Item tables together to get all of the info in the purchase order.

I might model my Purchase Orders as follows in an RDBMS:

CREATE TABLE purchase_orders ( 
    id INT NOT NULL,
    title VARCHAR(100),
    total DECIMAL(10,2)
);

CREATE TABLE line_items (
    id INT NOT NULL,
    sku VARCHAR(100),
    quantity INT,
    price DECIMAL(10,2),
    purchase_order_id INT,
    FOREIGN KEY (purchase_order_id) REFERENCES purchase_orders(id)
);

If I want to make atomic updates to a purchase order and its line items, I need a multi-statement transaction. For example, if I am going to create a purchase order, I might follow these steps:

START TRANSACTION;

/* Create a purchase order row */
INSERT INTO purchase_orders (id,title,total) VALUES (1, 'purchase order 1', 10.50);

/* Create a line item, including the foreign key of the purchase order we just created */
INSERT INTO line_items(id,sku,quantity,price,purchase_order_id) VALUES (2, 'a', 1, 10.50, 1);

COMMIT;

With this update, there is never a time where the Purchase Order exists but has no Line Items in it. The whole object and its details are committed in a single transaction.

Now if I need to update that Purchase Order, say to add a few more line items, then I would perform another transaction.

START TRANSACTION;

/* Add some new line item to the PO */
INSERT INTO line_items(id,sku,quantity,price,purchase_order_id) VALUES (3, 'b', 1, 12.34, 1);
INSERT INTO line_items(id,sku,quantity,price,purchase_order_id) VALUES (4, 'c', 1, 15.25, 1);

/* Update the “total” field of the purchase order to reflect the added line items */ 
UPDATE purchase_orders SET total = (total + 12.34 + 15.25) WHERE id = 1;

COMMIT;

This time I’ve ensured that my two new Line Items appear at the same time (or not at all) and that the total field of the Purchase Order is updated at the same time. No client will ever see a state in which only one of those Line Items exists nor any state where the total does not match the sum of the line items.

Master-Detail in MongoDB

Working with MongoDB is a bit different: we don’t have the ability to perform multi-document transactions (at least so far). However, this Master-Detail pattern can be handled without multi-statement transactions thanks to MongoDB’s richer data model.

MongoDB can store data as documents with nested objects, arrays, and other rich data types. Thus we don’t need to store Master-Detail entities in multiple collections, or even in more than one document. A common way of modeling such objects with MongoDB is using nested documents. So our Purchase Order model might look like this:

var purchase_order = {
  _id: 1,
  title: 'Purchase order 1',
  total: 10.50,
  line_items: [
    { sku: 'a', quantity: 1, price: 10.50 }
  ]
}

Let’s look at how we can get the same level of atomicity from MongoDB without needing multi-statement transactions!

First, we want to be able to create a new purchase order and its first line items atomically.

db.purchase_orders.save( purchase_order )

This atomically creates the purchase order and its initial items in a single operation. Just as with the SQL scenario, clients will never see a point in time where the purchase order is empty. It all succeeds in a single step.

Now what about modifying that purchase order? If we want to add some items to the PO, we can do so like this:

db.purchase_orders.update( { _id: 1 }, {
  $pushAll: { line_items: [
      { sku: 'c', quantity: 1, price: 12.34 },
      { sku: 'd', quantity: 1, price: 15.25 }
    ] },
  $inc: { total: 27.59 }
});

The $pushAll operator atomically appends values onto an array attribute. Just as with our RDBMS scenario, this update is atomic and the whole command either succeeds or fails. Meanwhile the $inc operator atomically increments the “total” field of the purchase order. All of these updates happen atomically and they succeed or fail as a group so another client will never see an inconsistent state of the order.

Summary

It turns out that most of the time where you find yourself with a Master-Detail pattern in an RDBMS, you can achieve the same level of consistency in MongoDB by modelling your object as a rich, nested document, rather than multiple joined tables. Combine this with MongoDB’s atomic update operators, and you can solve most of what you would traditionally do with multi-statement transactions in an RDBMS.

An RDBMS needs multi-statement transactions for these scenarios because the only way it has to model these types of objects is with multiple tables. By contrast, MongoDB can go much further with single-statement transactions because there’s no need to join on simple updates like this.

This is not to say that multi-statement transactions are not useful. If you need to perform cross-entity transactions (e.g. move a line item from one purchase order to another) or if you need to modify a purchase order and inventory objects in a single step, then you may still need multi-statement transactions. Or else you would have to take some alternate approach.

But it turns out that many of the cases where we traditionally need multi-statement transactions go away when we can model objects as documents and perform atomic updates on those documents.

Another aspect of transactions and ACID is isolation. MongoDB does not support fully generalized snapshotting. We haven’t discussed that here; it’s probably a good topic for another blog post in the future.

Jared Rosoff (@forjared)

Getting started with VMware CloudFoundry, MongoDB and Node.js

Listen to the recording of the Node.js Panel Discussion webinar.

Overview

Following up on our previous post, here is a quick how-to for using Node.js, CloudFoundry, and MongoDB together.

Our end goal here is to build a simple web app that records visits and provides a reporting screen for the last 10 visits.

Tools We Need

  1. Sign up for a Cloud Foundry account.
  2. Local installation of MongoDB & Node.JS.
  3. Cloud Foundry VMC tools.
  4. All of the code is available on github.
Follow the links to install & configure the various tools.

Getting Started

  1. Start the mongod process on your local computer. Use the default port.
  2. Confirm that node.js is correctly installed. You should be able to run node from the command-line and receive a basic javascript shell. You should also have NPM (node package manager) installed.
  3. Make a directory for the project and then ensure that CloudFoundry is correctly configured. Mine looked as follows:
mongo@ubuntu:~$ sudo gem install vmc
mongo@ubuntu:~$ vmc target api.cloudfoundry.com
Successfully targeted to [http://api.cloudfoundry.com]
mongo@ubuntu:~$ vmc login
Email: gates@10gen.com
Password: ********
Successfully logged into [http://api.cloudfoundry.com]

Step 1: Hello world

In your directory create a file called app.js. In that file, type the following code. This will create a basic web server on localhost:3000 or on the assigned host:port combination on the CloudFoundry server.

var port = (process.env.VMC_APP_PORT || 3000);
var host = (process.env.VCAP_APP_HOST || 'localhost');
var http = require('http');

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(port, host);

Test our file locally:

$ node app.js
$ curl localhost:3000
Hello World
# kill node with CTRL+C

Push our file to CloudFoundry and test. CloudFoundry automatically detects that we’re using Node.js, but it will ask some other configuration questions, including a name and the services we want to have running. I have named mine gvp_node_test and requested that MongoDB be run as a service.

The commands & output:

$ vmc push 
Would you like to deploy from the current directory? [Yn]: Y
Application Name: gvp_node_test
Application Deployed URL: 'gvp_node_test.cloudfoundry.com'? 
Detected a Node.js Application, is this correct? [Yn]: Y
Memory Reservation [Default:64M] (64M, 128M, 256M, 512M, 1G or 2G) 
Creating Application: OK
Would you like to bind any services to 'gvp_node_test'? [yN]: y
The following system services are available::
1. mongodb
2. mysql
3. redis
Please select one you wish to provision: 1
Specify the name of the service [mongodb-55516]: 
Creating Service: OK
Binding Service: OK
Uploading Application:
  Checking for available resources: OK
  Packing application: OK
  Uploading (0K): OK   
Push Status: OK
Staging Application: OK                                                         
Starting Application: OK                                       

At this point you should have a simple working web app. Note the URL, your_app_name.cloudfoundry.com; we can test that it is working with curl.

$ curl your_app_name.cloudfoundry.com
Hello World

Step 2: getting mongo configs

CloudFoundry has now configured a MongoDB service with its own user name, password, ip and port. To access these on the server, we will need to parse the environment variables coming into the node process.

To do this, we add the following code; note the else clause, which lets us run the app locally as well.

if(process.env.VCAP_SERVICES){
  var env = JSON.parse(process.env.VCAP_SERVICES);
  var mongo = env['mongodb-1.8'][0]['credentials'];
}
else{
  var mongo = {"hostname":"localhost","port":27017,"username":"",
    "password":"","name":"","db":"db"}
}

The code wraps this up in a generate_mongo_url function that provides a “connection string” of the form mongodb://username:password@host:port/db_name.
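One plausible implementation of that helper looks like this; the exact version lives in the step 2 code on github.

var generate_mongo_url = function(obj){
  obj.hostname = (obj.hostname || 'localhost');
  obj.port = (obj.port || 27017);
  obj.db = (obj.db || 'test');

  if(obj.username && obj.password){
    return "mongodb://" + obj.username + ":" + obj.password + "@"
         + obj.hostname + ":" + obj.port + "/" + obj.db;
  }
  else{
    return "mongodb://" + obj.hostname + ":" + obj.port + "/" + obj.db;
  }
}

var mongourl = generate_mongo_url(mongo);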

Copy in the rest of the code from step 2 on github and test locally.

$ node app.js
$ curl localhost:3000
# connection string for localhost

Once that’s working push the update to the cloud. Notice that we add the name of the project and we don’t get asked any questions:

$ vmc update your_app_name
Uploading Application:
...
Stopping Application: OK
Staging Application: OK                                                         
Starting Application: OK 
# test again
$ curl your_app_name.cloudfoundry.com
# bunch of environment variables

Step 3: now with drivers

First we need to install the node-mongodb-native driver. To do this, we use NPM.

$ npm install mongodb

You should see a new directory at the end of this process: node_modules. To enable us to include these modules on the cloud, we add this path to the require paths at the top of the code.

require.paths.unshift('./node_modules');

if(process.env.VCAP_SERVICES){ ...

Our goal here is to build a function for recording a visit. Let’s build that function.

var record_visit = function(req, res){
  /* Connect to the DB and auth */
  require('mongodb').connect(mongourl, function(err, conn){
    conn.collection('ips', function(err, coll){
      /* Simple object to insert: ip address and date */
      object_to_insert = { 'ip': req.connection.remoteAddress, 'ts': new Date() };

      /* Insert the object, then print it in the response */
      /* Note that the _id has been created */
      coll.insert( object_to_insert, {safe:true}, function(err){
        res.writeHead(200, {'Content-Type': 'text/plain'});
        res.write(JSON.stringify(object_to_insert));
        res.end('\n');
      });
    });
  });
}

Notice the .connect and .collection('ips'...). We’re telling it to store data in the ips collection.

Another nice feature is the object_to_insert. Saving a document with Node+MongoDB is as simple as creating the object and inserting it.

Let’s fix up the main createServer function.

http.createServer(function (req, res) {
  record_visit(req, res);
}).listen(port, host);

Then we can test locally and push with vmc. If this is working locally, you should be able to connect to your local mongod instance and see some data in the ips collection.

$ mongo localhost:27017/db
> db.ips.find()
...

Step 4

At this point, you’ve probably tested a few times and you’ve successfully put data in the database. Now it’s time to get that data out.

Let’s create a function to print the last 10 visits:

var print_visits = function(req, res){
  /* Connect to the DB and auth */
  require('mongodb').connect(mongourl, function(err, conn){
    conn.collection('ips', function(err, coll){
      /*find with limit:10 & sort */
      coll.find({}, {limit:10, sort:[['_id','desc']]}, function(err, cursor){
        cursor.toArray(function(err, items){
          res.writeHead(200, {'Content-Type': 'text/plain'});
          for(var i = 0; i < items.length; i++){
            res.write(JSON.stringify(items[i]) + "\n");
          }
          res.end();
        });
      });
    });
  });
}

Let’s update the createServer method to print the visit history when we request /history.

http.createServer(function (req, res) {
  var params = require('url').parse(req.url);
  if(params.pathname === '/history') {
    print_visits(req, res);
  }
  else{
    record_visit(req, res);
  }
}).listen(port, host);

Again, we test locally and then upload with vmc. If it all works, you should be able to do this:

$ vmc update your_app_name
...
$ curl your_app_name.cloudfoundry.com
{"ip":"172.30.49.42","ts":"2011-06-15T20:14:18.977Z","_id":"4df9129af354f8682d000001"}
$ curl your_app_name.cloudfoundry.com
{"ip":"172.30.49.43","ts":"2011-06-15T20:14:21.745Z","_id":"4df9129df354f8682d000002"}

Now let's test the history endpoint:

$ curl gvp_node_test.cloudfoundry.com/history
{"ip":"172.30.49.43","ts":"2011-06-15T20:14:21.745Z","_id":"4df9129df354f8682d000002"}
{"ip":"172.30.49.42","ts":"2011-06-15T20:14:18.977Z","_id":"4df9129af354f8682d000001"}
...

Going further

  1. Check out our upcoming Node.js Panel Discussion webinar.
  2. For some MongoDB wrappers take a look at
    • Mongoose, an ORM / ODM wrapper
    • MongoSkin, a layer over node-mongodb-native to help reduce callbacks.
  3. For building more complex web sites take a look at the Express framework.

– Gates Voyer-Perrault
@gatesvp

The State of MongoDB and Ruby

The state of Ruby and MongoDB is strong. In this post, I’d like to describe some of the recent developments in the Ruby driver and provide a few notes on Rails and the object mappers in particular.

The Ruby Driver

We just released v1.2 of the MongoDB Ruby driver. This release is stable and supports all the latest features of MongoDB. If you haven’t been paying attention to the driver’s development, the Cliff’s Notes are below. (Note that if you’re using an older version of the driver, you owe it to your app to upgrade.)

If you’re totally new to the driver, you may want to read Ethan Gunderson’s excellent post introducing it before continuing on.

Connections

There are now two connection classes: Connection (http://api.mongodb.org/ruby/current/Mongo/Connection.html) and ReplSetConnection (http://api.mongodb.org/ruby/current/Mongo/ReplSetConnection.html). The first simply creates a connection to a single node, primary or secondary. But you probably already knew that.

The ReplSetConnection class is brand new. It has a slightly different API and must be used when connecting to a replica set. To connect, initialize the ReplSetConnection with a set of seed nodes followed by any connection options.

ReplSetConnection.new(['db1.app.com'], ['db2.app.com'],
  :rs_name => "myapp")

You can pass the replica set’s name as a kind of sanity check, ensuring that each node connected to is part of the same replica set.

Replica sets

If you’re running replica sets (and why wouldn’t you be?), then you’ll first want to make sure you connect with the ReplSetConnection class. Why? Because this class facilitates discovery, automatic failover, and read distribution.

Discovery is the process of finding the nodes of a set and determining their roles. When you pass a set of seed nodes to the ReplSetConnection class, you may not know which is the primary node. The driver will find that node and ensure that all writes are sent to it. In addition, the driver will discover any other nodes not specified as seeds and then cache those for failover and, optionally, read distribution.

Failover works like this: your application is humming along when, for whatever reason, the primary member of the replica set goes down. Subsequent operations will fail, and the driver will raise the Mongo::ConnectionFailure exception, until the replica set has successfully elected a new primary.

We’ve decided that connection failures shouldn’t be handled automatically by the driver. However, it’s not hard to achieve the oft-sought seamless failover. You simply need to make sure that 1) all writes use safe mode and 2) all operations are wrapped in a rescue block. Details on how to do that can be found in the replica set docs.
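In sketch form, that pattern looks like this; the retry limit and sleep interval are illustrative, and the replica set docs cover the details.

# a safe-mode write wrapped in a rescue block so it survives failover
retries = 0
begin
  @collection.insert({:username => "banker"}, :safe => true)
rescue Mongo::ConnectionFailure
  retries += 1
  raise if retries > 5
  sleep(2)  # give the replica set time to elect a new primary
  retry
end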

Finally, we should mention read distribution. For certain read-heavy applications, it’s useful to distribute the read load to a number of slave nodes, and the driver now facilitates this.

ReplSetConnection.new(['db1.app.com'], ['db2.app.com'],
  :read_secondary => true)

With :read_secondary => true, the connection will send all reads to an arbitrary secondary node. When running Ruby in production, where you’ll have a whole bunch of Thins and Mongrels or forked workers (à la Unicorn and Phusion), you should get a good distribution of reads across secondaries.

Write concern (i.e., safe mode plus)

Write concern is the term we use to describe safe mode and its options. For instance, you can use safe mode to ensure that a given write blocks until it’s been replicated to three nodes by specifying :safe => {:w => 3}. For example:
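# a sketch: block until this insert has replicated to three nodes
@collection.insert({:username => "banker"}, :safe => {:w => 3})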

That gets verbose after a while, which is why the Ruby driver supports setting a default safe mode on the Connection, DB, and Collection levels as well. For instance:

@con = Connection.new("localhost", 27017, :safe => {:w => 3})
@db = @con['myapp']
@collection = @db['users']
@collection.insert({:username => "banker"})

Now, the insert will still use safe mode with w equal to 3, but it inherits this setting through the @con, @db, and @collection objects. A few more details on this can be found in the write concern docs.

JRuby

One of the most exciting advances in the last few months is the driver’s special support for JRuby. Essentially, when you run the driver on JRuby, the BSON library uses a Java-based serializer, guaranteeing the best performance for the platform.

One of the big advantages to running on JRuby is its support for native threads. So if you’re building multi-threaded apps, you may want to take advantage of the driver’s built-in connection pooling. Whether you’re creating a standard connection or a replica set connection, simply pass in a size and timeout for the thread pool, and you’re good to go.
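For instance, a pooled standard connection might look like this; the pool size and timeout values here are illustrative.

@con = Connection.new("localhost", 27017, :pool_size => 5, :timeout => 5)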

Another relevant feature that’s slated for the next month is an asynchronous facade for the driver that uses the reactor pattern. (This has been spearheaded, and is in fact used in production, by Chuck Remes. Thanks, Chuck!). You can track progress at the async branch.

Rails and the Object Mappers

Finally, a word about Rails and object mappers. If you’re a Rails user, then there’s a good chance that you don’t use the Ruby driver directly at all. Instead, you probably use one of the available object mappers.

The object mappers can be a great help, but do be careful. We’ve seen a number of users get burned because they don’t understand the data model being created. So the biggest piece of advice is to understand the underlying representation being built out by your object mapper. It’s all too easy to abuse the nice abstractions provided by the OMs to create unwieldy, inefficient mega-documents down below. Caveat programator.

That said, I get a lot of questions about which OM to use. Now, if you understand how the OM actually works, then it really shouldn’t matter which one you use. But not everyone has the time to dig into these code bases. So when I do recommend one, I recommend MongoMapper. This is, admittedly, a bit of an aesthetic judgment, but I like the API and have found the software to be simple and reliable. Long-awaited docs for the projects are imminent, and we’ll tweet about them once they’re available.

What’s next

If you want to know more about the Ruby driver, tune in to next week’s Ruby driver webcast, where I’ll talk about everything in the post, plus some.

Finally, a big thanks to all those who have contributed to the driver, to the object mapper authors, and to all users of MongoDB with Ruby.

- Kyle Banker

Node.js and MongoDB

Visit the more recent post, Getting Started with VMware Cloud Foundry, MongoDB, and Node.js. Listen to the recorded Node.js Panel Discussion webinar.

Node.js is turning out to be a framework of choice for building real-time applications of all kinds, from analytics systems to chat servers to location-based tracking services. If you’re still new to Node, check out Simon Willison’s excellent introductory post. If you’re already using Node, you probably need a database, and you just might have considered using MongoDB.

The rationale is certainly there. Working with Node’s JavaScript means that MongoDB documents get their most natural representation – as JSON – right in the application layer. There’s also significant continuity between your application and the MongoDB shell, since the shell is essentially a JavaScript interpreter, so you don’t have to change languages when moving from application to database.

Node.js MongoDB Driver

Especially impressive to us at 10gen has been the community support for Node.js and MongoDB. First, there’s Christian Kvalheim’s excellent node-mongodb-native project, a non-blocking MongoDB driver implemented entirely in JavaScript using Node.js’s system libraries. The project is a pretty close port of the MongoDB Ruby driver, making for an easy transition for those already used to the 10gen-supported drivers. If you’re just starting, there’s a helpful node-mongodb-native mailing list.

Hummingbird

Need a real-world example? Check out Hummingbird, Michael Nutt’s real-time analytics app. It’s built on top of MongoDB using Node.js and the node-mongodb-native driver. Hummingbird, which is used in production at Gilt Groupe, brings together an impressive array of technologies; it uses the express.js Node.js app framework and sports a responsive interface with the help of web sockets. Definitely worth checking out.

Mongoose

Of course, one of the admitted difficulties in working with Node.js is dealing with deep callback structures. If this poses a problem, or if you happen to want a richer data modeling library, then Mongoose is the answer. Created by Learnboost, Mongoose sits atop node-mongodb-native, providing a nice API for modeling your application.

Node Knockout

All of this goes to show that the MongoDB/Node.js ecosystem is thriving. If you need a good excuse to jump into Node.js or MongoDB development, be sure to check out next month's Node Knockout. It’s a weekend app competition for teams of up to four, and registration is now open.

MongoDB 1.4 Ready for Production

The MongoDB team is very excited to announce the release of MongoDB 1.4.0. This is the culmination of 3 months of work in the 1.3 branch and has a large number of very important changes.

Many users have been running 1.3 in production, so this release has already been very thoroughly vetted, both by our regression test systems and by real users.

Some highlights:

Core server enhancements

Replication & Sharding

  • better handling for restarting slaves offline for a while
  • fast new slaves from snapshots
  • configurable slave delay
  • replication handles clock skew on master
  • $inc replication fixes
  • sharding alpha 3 - notably 2 phase commit on config servers

Deployment & production

  • configure “slow” for profiling
  • ability to do fsync + lock for backing up raw files (see the sketch after this list)
  • option for separate directory per db
  • http://localhost:28017/_status to get serverStatus via http
  • REST interface is off by default for security (--rest to enable)
  • can rotate logs with a db command, logRotate
  • enhancements to serverStatus - counters/replication lag
  • new mongostat tool and db.serverStatus() enhancements
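As promised above, a sketch of the fsync + lock backup flow from the shell; the unlock call shown reflects the API of this era, so double-check the backup docs before relying on it.

// flush writes to disk and block further writes so the raw files can be copied
db.runCommand({ fsync: 1, lock: 1 })

// ... copy the data files ...

// release the lock
db.$cmd.sys.unlock.findOne()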

Query language improvements

Geo

Downloads: www.mongodb.org/display/DOCS/Downloads

Full Change Log: jira

Release Notes: http://www.mongodb.org/display/DOCS/1.4+Release+Notes

Thanks for all your continued support, and we hope MongoDB 1.4 works great for you.

As always, please let us know of any issues,

-Eliot and the MongoDB Team

"Partial Object Updates" will be an Important NoSQL Feature

It’s nice that in SQL we can do things like

UPDATE PERSONS SET X = X + 1

We term this a “partial object update”: we updated the value of X without sending a full row update to the server.

It seems like a very simple thing to be discussing, yet some NoSQL solutions do not support this (others do).

In these new datastores, the average stored object size (whether it be a document, a key/value blob, or a row) tends to be larger than the traditional database row. The data is not fully normalized, so we are packing more data into a single storage object than before.

This means the cost of full updates is higher. If we have a 100KB document and want to set a single value within it, passing the full 100KB in both directions over the network for the operation is expensive.

MongoDB supports partial updates in its update operation via a set of special $ operators: $inc, $set, $push, etc. More of these operators will be added in the future.
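For example, the SQL statement above maps onto a single atomic update; the collection and field names are illustrative, and the final true makes the update apply to every matching document.

// the SQL above, as a MongoDB partial update
db.persons.update( { }, { $inc: { x: 1 } }, false, true );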

There are further benefits to the technique too. First, we get easy (single document) atomicity for these operations (consider $inc). Second, replication is made cheaper: when a partial update occurs, MongoDB replicates the partial update rather than the full object changed. This makes replication much less expensive and network intensive.

Joyent

A prebuilt binary for Joyent (labeled “Solaris64”) is now available on the mongodb.org downloads page.

See http://www.mongodb.org/display/DOCS/Joyent for more information including an example of installation.

Upcoming Conferences for the MongoDB Team

We try to speak about MongoDB at as many conferences and meetups as possible. If you’re interested in learning more about MongoDB or in meeting some of the people who work on it then you should try to make it out to one. Our schedule for the next couple of months is below. If you know of (or are organizing) a conference/meetup where you’d like to hear from us shoot us an email at info@10gen.com!

  • 10/5/2009 NYC NoSQL NYC Dwight will be presenting about MongoDB and Eliot will be on a panel discussion, but all of us will be at the event

  • 10/16/2009 DC DC Hadoop Meetup Mike will be talking about MongoDB

  • 10/23/2009 St Louis Strange Loop Conference Mike will be discussing MongoDB

  • 10/24/2009 Foz do Iguaçu, Brazil Latinoware Kristina will be talking about MongoDB for web applications

  • 10/27/2009 NYC NY PHP Kristina will be talking about using MongoDB from PHP

  • 11/7/2009 Poznań, Poland RuPy Mike will be talking about using MongoDB from Ruby and Python

  • 11/14/2009 Portland OpenSQLCamp Portland Mike will be in Portland for OpenSQLCamp

  • 11/17/2009 NYC Web 2.0 Expo Eliot will be talking about shifting to non-relational databases

  • 11/19/2009 San Francisco RubyConf Mike will be talking about using MongoDB from Ruby

  • 11/19/2009 NYC Interop New York Dwight will be talking about data in the cloud