MongoDB Aggregation

This is a guest post from MongoLab

In this blog post, you run a concise set of aggregation framework examples in the mongo JavaScript shell against a MongoLab-hosted MongoDB 2.1 database. The framework includes the aggregation operators $project, $unwind, $group, and others. These operators let you calculate values across the documents in a collection, such as averages and sums. They also let you reshape documents, unpacking nested structures and regrouping them as needed.

The aggregation framework, one of the most powerful and highly anticipated features in the forthcoming MongoDB 2.2 release, lets you construct a server-side processing pipeline that runs on a collection. A rich set of operations is available for use in the pipeline, covering collection transforms that range from simple multi-document calculations (e.g., sums and averages) to complex projections and pivots.
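As a rough mental model (not a description of how the server implements it), a pipeline can be pictured as a list of stages, each taking an array of documents and returning a new array. The plain JavaScript sketch below runs outside MongoDB. The names `runPipeline`, `projectFields`, and `totalAndAverage` are hypothetical helpers, and the two sample documents are made up for illustration.

```javascript
// Conceptual model only: each stage maps an array of documents to a new array.
// runPipeline, projectFields and totalAndAverage are hypothetical names.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Stage that keeps only the listed fields (loosely analogous to $project).
function projectFields(fields) {
  return docs => docs.map(doc => {
    const out = {};
    for (const f of fields) {
      if (f in doc) out[f] = doc[f];
    }
    return out;
  });
}

// Stage that collapses all documents into a single total and average
// (loosely analogous to a $group with a constant _id).
function totalAndAverage(field) {
  return docs => {
    const total = docs.reduce((sum, doc) => sum + doc[field], 0);
    return [{ total: total, average: docs.length ? total / docs.length : null }];
  };
}

// Made-up sample documents, not the ones from the demo gist.
const sampleDocs = [
  { title: "a", pageViews: 4, author: "x" },
  { title: "b", pageViews: 6, author: "y" }
];

const summary = runPipeline(sampleDocs, [
  projectFields(["pageViews"]),
  totalAndAverage("pageViews")
]);
console.log(JSON.stringify(summary));
```

The point of the sketch is only that stages compose: the output of one becomes the input of the next, which is the shape the `pipeline` array takes in the real aggregate command.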

The framework fits nicely into the range of data manipulation tools available for MongoDB, from basic built-in functions like document counts, through map-reduce and JavaScript, to custom code and language-specific packages, including Hadoop.

  1. Create a 2.1 MongoLab database with your own unique name, say myaggdemo. You'll need your mongod username and password.
  2. On your database's home page, copy the mongo shell connection to your clipboard.
  3. git clone git://gist.github.com/1401585.git aggdemo ; cd aggdemo
  4. Edit articles.js and aggregation.js to use your db <myaggdemo> (see below).
  5. mongo <your connection> -u <mongod username> -p <mongod password> articles.js
    (inserts the data into your database, 3 documents)
  6. mongo --shell <your connection> -u <mongod username> -p <mongod password> aggregation.js
    (performs several aggregation examples and leaves you in the mongo shell)
  7. Type g1 in the mongo shell to see the first $group result discussed below.
  8. Type g1 in the mongo shell to see the first $group result discussed below.

articles.js

```javascript
/* sample articles for aggregation demonstrations */

// make sure we're using the right db; this is the same as "use mydb;" in shell
db = db.getSiblingDB("aggdb"); // Put your MongoLab database name here.

db.article.drop();

db.article.save({
    title : "this is my title",
    author : "bob",
    posted : new Date(1079895594000),
    pageViews : 5,
    tags : [ "fun", "good", "fun" ],
    comments : [
        { author : "joe", text : "this is cool" },
        { author : "sam", text : "this is bad" }
    ],
    other : { foo : 5 }
});

//...snip
```

aggregation.js

```javascript
// make sure we're using the right db; this is the same as "use aggdb;" in shell
db = db.getSiblingDB("aggdb"); // Put your MongoLab database name here.

// ...snip...

// grouping
var g1 = db.runCommand({
    aggregate : "article",
    pipeline : [
        // keep only the fields the later stages need
        { $project : { author : 1, tags : 1, pageViews : 1 } },
        // emit one document per element of the tags array
        { $unwind : "$tags" },
        // regroup by tag and compute per-tag statistics
        { $group : {
            _id : "$tags",
            docsByTag : { $sum : 1 },
            viewsByTag : { $sum : "$pageViews" },
            mostViewsByTag : { $max : "$pageViews" },
            avgByTag : { $avg : "$pageViews" }
        } }
    ]
});

// ...snip
```

g1 aggregation result

```javascript
{
    "result" : [
        //...snip...
        {
            "_id" : "fun",
            "docsByTag" : 3,
            "viewsByTag" : 17,
            "mostViewsByTag" : 7,
            "avgByTag" : 5.666666666666667
        }
    ],
    "ok" : 1
}
```

See the MongoDB blog post for more information on the aggregation framework.

The results of the aggregations are saved to convenient variables for examination. The group operations (g1 and g5) at the end of the aggregation.js file are noteworthy because they roll up three operators ($project, $unwind, and $group) into a common pivot-and-aggregate example. A diagram of the g1 data flow is available [here as a .pdf](http://blog.mongolab.com/wp-content/uploads/2012/07/Aggregation.pdf).
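To make the g1 data flow concrete, here is a minimal in-memory sketch of its three stages in plain JavaScript. It is not MongoDB code and does not reflect how the server executes the pipeline. The helper names `unwind` and `groupByTag` are hypothetical, and the input is only the one sample document shown in articles.js above, so the numbers differ from the g1 result, which covers all three documents in the collection.

```javascript
// In-memory emulation of the g1 stages; this is not MongoDB code.
// unwind and groupByTag are hypothetical helper names.

// The one sample document shown in articles.js (fields g1 does not use are omitted).
const articles = [
  { title: "this is my title", author: "bob", pageViews: 5,
    tags: ["fun", "good", "fun"] }
];

// $project: keep only author, tags and pageViews.
const projected = articles.map(({ author, tags, pageViews }) =>
  ({ author, tags, pageViews }));

// $unwind: emit one copy of the document per element of the array field.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    for (const value of doc[field]) {
      out.push(Object.assign({}, doc, { [field]: value }));
    }
  }
  return out;
}

// $group: bucket by tag and compute the same accumulators as g1.
function groupByTag(docs) {
  const buckets = new Map();
  for (const doc of docs) {
    let b = buckets.get(doc.tags);
    if (!b) {
      b = { _id: doc.tags, docsByTag: 0, viewsByTag: 0, mostViewsByTag: -Infinity };
      buckets.set(doc.tags, b);
    }
    b.docsByTag += 1;                                              // $sum : 1
    b.viewsByTag += doc.pageViews;                                 // $sum : "$pageViews"
    b.mostViewsByTag = Math.max(b.mostViewsByTag, doc.pageViews);  // $max
  }
  // $avg is derived from the running sum and count.
  return Array.from(buckets.values()).map(b =>
    Object.assign({}, b, { avgByTag: b.viewsByTag / b.docsByTag }));
}

const unwound = unwind(projected, "tags");
const g1Local = groupByTag(unwound);
console.log(JSON.stringify(g1Local, null, 2));
```

Because $unwind emits one document per array element, the repeated "fun" tag in this document is counted twice in the "fun" bucket. That is the pivot at work: a per-article array of tags becomes per-tag rows that $group can then aggregate.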

