The Modern Application Stack – Part 2: Using MongoDB With Node.js

Andrew Morgan


Introduction

This is the second in a series of blog posts examining the technologies that are driving the development of modern web and mobile applications.

"Modern Application Stack – Part 1: Introducing The MEAN Stack" introduced the technologies making up the MEAN (MongoDB, Express, Angular, Node.js) and MERN (MongoDB, Express, React, Node.js) Stacks, why you might want to use them, and how to combine them to build your web application (or your native mobile or desktop app).

The remainder of the series is focussed on working through the end-to-end steps of building a real (albeit simple) application – MongoPop.

This post demonstrates how to use MongoDB from Node.js.

MongoDB (recap)

MongoDB provides the persistence for your application data.

MongoDB is an open-source, document database designed with both scalability and developer agility in mind. MongoDB bridges the gap between key-value stores, which are fast and scalable, and relational databases, which have rich functionality. Instead of storing data in rows and columns as one would with a relational database, MongoDB stores JSON documents in collections with dynamic schemas.

MongoDB's document data model makes it easy for you to store and combine data of any structure, without giving up sophisticated validation rules, flexible data access, and rich indexing functionality. You can dynamically modify the schema without downtime – vital for rapidly evolving applications.

It can be scaled within and across geographically distributed data centers, providing high levels of availability and scalability. As your deployments grow, the database scales easily with no downtime, and without changing your application.

MongoDB Atlas is a database as a service for MongoDB, letting you focus on apps instead of ops. With MongoDB Atlas, you only pay for what you use with a convenient hourly billing model. With the click of a button, you can scale up and down when you need to, with no downtime, full security, and high performance.

Our application will access MongoDB via the JavaScript/Node.js driver which we install as a Node.js module.

Node.js (recap)

Node.js is a JavaScript runtime environment that runs your back-end application (via Express).

Node.js is based on Google's V8 JavaScript engine, which is used in the Chrome browser. It also includes a number of modules that provide features essential for implementing web applications – including networking protocols such as HTTP. Third party modules, including the MongoDB driver, can be installed using the npm tool.

Node.js is an asynchronous, event-driven engine where the application makes a request and then continues working on other useful tasks rather than stalling while it waits for a response. On completion of the requested task, the application is informed of the results via a callback (or a promise or an Observable). This enables large numbers of operations to be performed in parallel – essential when scaling applications. MongoDB was also designed to be used asynchronously and so it works well with Node.js applications.
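The non-blocking model is easy to see with a plain Node.js timer – a minimal sketch with no MongoDB involved, where `setTimeout` stands in for a slow database request:

```javascript
// A minimal sketch of Node.js's non-blocking model. The "slow" operation is
// simulated with setTimeout; the synchronous code that follows it runs first,
// just as an application keeps working while the MongoDB driver waits for the
// database to respond.

var order = [];

order.push("request issued");

setTimeout(function () {
    // This callback only fires after all of the synchronous code below
    // has finished executing
    order.push("response received");
    console.log(order.join(" -> "));
}, 0);

order.push("other useful work");
```

Running this prints `request issued -> other useful work -> response received`: the "other useful work" always happens before the callback, because the callback is only invoked once the event loop is free.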

The application – Mongopop

MongoPop is a web application that can be used to help you test out and exercise MongoDB. After supplying it with the database connection information (e.g., as displayed in the MongoDB Atlas GUI), MongoPop provides these features:

  • Accept your username and password and create the full MongoDB connect string – using it to connect to your database
  • Populate your chosen MongoDB collection with bulk data (created with the help of the Mockaroo service)
  • Count the number of documents
  • Read sample documents
  • Apply bulk changes to selected documents
Mongopop Demo

Downloading, running, and using the Mongopop application

Rather than installing and running MongoDB ourselves, it's simpler to spin one up in MongoDB Atlas:

Create MongoDB Atlas Cluster

To get the application code, either download and extract the zip file or use git to clone the Mongopop repo:

git clone git@github.com:am-MongoDB/MongoDB-Mongopop.git
cd MongoDB-Mongopop

If you don't have Node.js installed then that needs to be done before building the application; it can be downloaded from nodejs.org.

A file called package.json is used to control npm (the package manager for Node.js); here is the final version for the application:

{
  "//": "This contains the meta data to run the Mongopop application",

  "name": "Mongopop",
  "description": "Adds and manipulates data with MongoDB Atlas or other MongoDB databases",
  "author": "Andrew Morgan andrew.morgan@mongodb.com",
  "version": "0.0.1",
  "private": false,

  "//": "The scripts are used to build, launch and debug both the server and",
  "//": "client side apps. e.g. `npm install` will install the client and",
  "//": "server-side dependencies",

  "scripts": {
    "start": "cd public && npm run tsc && cd .. && node ./bin/www",
    "debug": "cd public && npm run tsc && cd .. && node-debug ./bin/www",
    "tsc": "cd public && npm run tsc",
    "tsc:w": "cd public && npm run tsc:w",
    "express": "node ./bin/www",
    "express-debug": "node-debug ./bin/www",
    "postinstall": "cd public && npm install"
  },

  "//": "The Node.js modules that are needed for the server-side application (the",
  "//": "client-side dependencies are specified in `public/package.json`).",

  "dependencies": {
    "body-parser": "~1.15.1",
    "debug": "~2.2.0",
    "ee-first": "^1.1.1",
    "ejs": "^2.5.2",
    "express": "^4.13.4",
    "external-ip": "^0.2.4",
    "jade": "~1.11.0",
    "methods": "^1.1.2",
    "mongodb": "^2.2.5",
    "morgan": "~1.7.0",
    "pug": "^0.1.0",
    "request": "^2.74.0",
    "serve-favicon": "~2.3.0",
    "tsc": "^1.20150623.0"
  }
}

The scripts section defines a set of shortcuts that can be executed using npm run <script-name>. For example, npm run debug runs the TypeScript transpiler (tsc) and then the Express framework in debug mode. start is a special case and can be executed with npm start.

Before running any of the software, the Node.js dependencies must be installed (into the node_modules directory):

npm install

Note the list of dependencies in package.json – these are the Node.js packages that will be installed by npm install. After those modules have been installed, npm will invoke the postinstall script (that will be covered in Part 4 of this series). If you later realise that an extra package is needed then you can install it and add it to the dependency list with a single command. For example, if the MongoDB Node.js driver hadn't already been included then it could be added with npm install --save mongodb – this would install the package as well as saving the dependency in package.json.

The application can then be run:

npm start

Once running, browse to http://localhost:3000/ to try out the application. When browsing to that location, you should be rewarded with the IP address of the server where Node.js is running (useful when running the client application remotely) – this IP address must be added to the IP Whitelist in the Security tab of the MongoDB Atlas GUI. Fill in the password for the MongoDB user you created in MongoDB Atlas and you're ready to go. Note that you should generate your own URL for your own data set using the Mockaroo service – this allows you to customise the format and contents of the sample data (and avoids exceeding the Mockaroo quota limit for the example URL).

What are all of these files?

  • package.json: Instructs the Node.js package manager (npm) what it needs to do; including which dependency packages should be installed
  • node_modules: Directory where npm will install packages
  • node_modules/mongodb: The MongoDB driver for Node.js
  • node_modules/mongodb-core: Low-level MongoDB driver library; available for framework developers (application developers should avoid using it directly)
  • javascripts/db.js: A JavaScript module we've created for use by our Node.js apps (in this series, it will be Express) to access MongoDB; this module in turn uses the MongoDB Node.js driver.

The rest of the files and directories can be ignored for now – they will be covered in later posts in this series.

Architecture

Using the JavaScript MongoDB Node.js Driver

The MongoDB Node.js Driver provides a JavaScript API which implements the network protocol required to read and write from a local or remote MongoDB database. If using a replica set (and you should for production) then the driver also decides which MongoDB instance to send each request to. If using a sharded MongoDB cluster then the driver connects to the mongos query router, which in turn picks the correct shard(s) to direct each request to.

We implement a shallow wrapper for the driver (javascripts/db.js) which simplifies the database interface that the application code (coming in the next post) is exposed to.

Code highlights

javascripts/db.js defines an object prototype (think class in other languages) named DB to provide access to MongoDB.

Its only dependency is the MongoDB Node.js driver:

var MongoClient = require('mongodb').MongoClient;

The prototype has a single property – db – which stores the database connection; it's initialised to null in the constructor:

function DB() {
    this.db = null;            // The MongoDB database connection
}

The MongoDB driver is asynchronous (the function returns without waiting for the requested operation to complete); there are two different patterns for handling this:

  1. The application passes a callback function as a parameter; the driver will invoke this callback function when the operation has run to completion (either on success or failure)
  2. If the application does not pass a callback function then the driver function will return a promise
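The driver's dual interface can be mimicked with a small, self-contained function; `fetchValue` below is an invented stand-in for a driver method such as `MongoClient.connect`:

```javascript
// A self-contained sketch of the driver's dual interface: invoke the callback
// if one is supplied, otherwise return a promise. `fetchValue` is invented
// for illustration only.

function fetchValue(key, callback) {
    var operation = Promise.resolve("value-for-" + key);

    if (callback) {
        // Pattern 1: deliver the outcome through the supplied callback,
        // following the Node.js (error, result) convention
        operation.then(
            function (result) { callback(null, result); },
            function (error) { callback(error); }
        );
        return;
    }

    // Pattern 2: no callback supplied, so hand back the promise
    return operation;
}

// Callback style
fetchValue("alpha", function (err, result) {
    if (!err) { console.log("callback got: " + result); }
});

// Promise style
var promised = fetchValue("alpha");
```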

This application uses the promise-based approach. This is the general pattern when using promises:

var _this = this;		// Required as `this` is no longer available in the functions invoked
						// in response to the promise being satisfied

myLibrary.myAsyncFunction(myParameters)
.then(
	function(myResults) {
		// If the asynchronous operation eventually *succeeds* then the first of the `then`
		// functions is invoked and this code block will be executed at that time.
		// `myResults` is an arbitrary name and it is set to the result data sent back
		// by `myAsyncFunction` when resolving the promise

		_this.results = myResults;
	},
	function(myError) {
		// If the asynchronous operation eventually *fails* then the second of the `then`
		// functions is invoked and this code block will be executed at that time.
		// `myError` is an arbitrary name and it is set to the error data sent back
		// by `myAsyncFunction` when rejecting the promise

		console.log("Hit a problem: " + myError.message);
	}
)

The methods of the DB object prototype we create are also asynchronous and also return promises (rather than accepting callback functions). This is the general pattern for returning and then subsequently satisfying promises:

return new Promise(function(resolve, reject) {
	// At this point, control has already been passed back to the calling function
  // and so we can safely perform operations that might take a little time (e.g.
	// a big database update) without the application hanging.

	var result = doSomethingThatTakesTime();

	// If the operation eventually succeeds then we can invoke `resolve` (the name is
	// arbitrary, it just has to match the first function argument when the promise was
	// created) and optionally pass back the results of the operation

	if (result.everythingWentOK) {
		resolve(result.documentsReadFromDatabase);
	} else {
		
		// Call `reject` to fail the promise and optionally provide error information
		
		reject("Something went wrong: " + result.error.message);
	}
})
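Here is a concrete, runnable instance of that pattern; `doubleIfPositive` is a name made up for this sketch (a real method would do database work where the `if` test is):

```javascript
// A concrete instance of the return-a-promise pattern. `doubleIfPositive`
// is invented for this sketch; a real DB method would perform database
// operations before settling the promise.

function doubleIfPositive(n) {
    return new Promise(function (resolve, reject) {
        // Control has already been returned to the caller at this point
        if (n > 0) {
            resolve(n * 2);    // Succeed, passing back a result
        } else {
            reject(new Error("input must be positive"));
        }
    });
}

doubleIfPositive(21).then(
    function (result) { console.log("Resolved with " + result); },
    function (error) { console.log("Rejected: " + error.message); }
);
```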

db.js represents a thin wrapper on top of the MongoDB driver library and so (with the background on promises under our belt) the code should be intuitive. The basic interaction model from the application should be:

  1. Connect to the database
  2. Perform all of the required database actions for the current request
  3. Disconnect from the database

Here is the method from db.js to open the database connection:

DB.prototype.connect = function(uri) {

	// Connect to the database specified by the connect string / uri
	
	// Trick to cope with the fact that "this" will refer to a different
	// object once in the promise's function.
	var _this = this;
	
	// This method returns a javascript promise (rather than having the caller
	// supply a callback function).

	return new Promise(function(resolve, reject) {
		if (_this.db) {
			// Already connected
			resolve();
		} else {
			var __this = _this;
			
			// Many methods in the MongoDB driver will return a promise
			// if the caller doesn't pass a callback function.
			MongoClient.connect(uri)
			.then(
				function(database) {
					
					// The first function provided as a parameter to "then"
					// is called if the promise is resolved successfully. The
					// "connect" method returns the new database connection
					// which the code in this function sees as the "database"
					// parameter

					// Store the database connection as part of the DB object so
					// that it can be used by subsequent method calls.

					__this.db = database;

					// Indicate to the caller that the request completed successfully;
					// no parameters are passed back.

					resolve();
				},
				function(err) {

					// The second function provided as a parameter to "then"
					// is called if the promise is rejected. "err" is set to 
					// to the error passed by the "connect" method.

					console.log("Error connecting: " + err.message);

					// Indicate to the caller that the request failed and pass back
					// the error that was returned from "connect"

					reject(err.message);
				}
			)
		}
	})
}

One of the simplest methods that can be called to use this new connection is to count the number of documents in a collection:

DB.prototype.countDocuments = function(coll) {
	
	// Returns a promise which resolves to the number of documents in the 
	// specified collection.

	var _this = this;

	return new Promise(function (resolve, reject){

		// {strict:true} means that the count operation will fail if the collection
		// doesn't yet exist

		_this.db.collection(coll, {strict:true}, function(error, collection){
			if (error) {
				console.log("Could not access collection: " + error.message);
				reject(error.message);
			} else {
				collection.count()
				.then(
					function(count) {
						// Resolve the promise with the count
						resolve(count);
					},
					function(err) {
						console.log("countDocuments failed: " + err.message);
						// Reject the promise with the error passed back by the count
						// function
						reject(err.message);
					}
				)
			}
		});
	})
}

Note that the collection method on the database connection doesn't support promises and so a callback function is provided instead.
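Callback-only methods like this can be adapted to the promise style with a small, generic wrapper. This is a sketch of the technique only – the driver does not ship a helper like this, and `addLater` is an invented callback-style function used to exercise it:

```javascript
// A generic sketch of wrapping a function that follows the Node.js
// (error, result) callback convention so that it returns a promise instead.

function promisify(fn) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        return new Promise(function (resolve, reject) {
            // Append the callback that settles the promise
            args.push(function (error, result) {
                if (error) {
                    reject(error);
                } else {
                    resolve(result);
                }
            });
            fn.apply(null, args);
        });
    };
}

// An invented callback-style function to demonstrate the wrapper
function addLater(a, b, callback) {
    callback(null, a + b);
}

var addPromise = promisify(addLater);
var pending = addPromise(2, 3);    // now returns a promise
```

The same shape could be applied by hand to `db.collection` – which is exactly what the `new Promise(...)` wrapper in `countDocuments` does.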

And after counting the documents; the application should close the connection with this method:

DB.prototype.close = function() {
	
	// Close the database connection. If the connection isn't open then
	// just ignore that; if closing the connection fails then log the fact
	// but move on. This method returns nothing – the caller can fire
	// and forget.

	if (this.db) {
		this.db.close()
		.then(
			function() {},
			function(error) {
				console.log("Failed to close the database: " + error.message)
			}
		)	
	}
}

Note that then also returns a promise (which is, in turn, resolved or rejected). The returned promise could be created in one of four ways:

  1. The function explicitly creates and returns a new promise (which will eventually be resolved or rejected).
  2. The function returns another function call which, in turn, returns a promise (which will eventually be resolved or rejected).
  3. The function returns a value – which is automatically turned into a resolved promise.
  4. The function throws an error – which is automatically turned into a rejected promise.
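All four cases can be seen in a single, self-contained chain (no MongoDB code involved):

```javascript
// A self-contained sketch of the four ways a `then` function can produce
// the next promise in a chain.

var chain = Promise.resolve(1)
    .then(function (n) {
        // Way 3: return a plain value – automatically wrapped in a
        // resolved promise
        return n + 1;
    })
    .then(function (n) {
        // Way 1: explicitly create and return a new promise
        return new Promise(function (resolve) { resolve(n * 2); });
    })
    .then(function (n) {
        // Way 2: return another call which itself returns a promise
        return Promise.resolve(n + 10);
    })
    .then(function (n) {
        // Way 4: throwing here would turn into a rejected promise for the
        // next link in the chain
        if (n !== 14) { throw new Error("unexpected value: " + n); }
        console.log("chain resolved with " + n);
        return n;
    });
```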

In this way, promises can be chained to perform a sequence of events (where each step waits on the resolution of the promise from the previous one). Using those three methods from db.js, it's now possible to implement a very simple application function:

var DB = require('./javascripts/db');

function count (MongoDBURI, collectionName) {

	var database = new DB;

	database.connect(MongoDBURI)
	.then(
		function() {
			// Successfully connected to the database
			// Make the database call and pass the returned promise to the next stage
			return database.countDocuments(collectionName);
		},
		function(err) {
			// DB connection failed, add context to the error and throw it (it will be
			// converted to a rejected promise
			throw("Failed to connect to the database: " + err);
		})
	// The following `.then` clause uses the promise returned by the previous one.
	.then(
		function(count) {
			// Successfully counted the documents
			console.log(count + " documents");
			database.close();
		},
		function(err) {
			// Could have got here by either `database.connect` or `database.countDocuments`
			// failing
			console.log("Failed to count the documents: " + err);
			database.close();
		})
}

count("mongodb://localhost:27017/mongopop", "simples");

That function isn't part of the final application – the actual code will be covered in the next post – but jump ahead and look at routes/pop.js if you're curious.

It's worth looking at the sampleCollection prototype method as it uses a database cursor. This method fetches a "random" selection of documents – useful when you want to understand the typical format of the collection's documents:

DB.prototype.sampleCollection = function(coll, numberDocs) {

	// Returns a promise which is either resolved with an array of 
	// "numberDocs" from the "coll" collection or is rejected with the
	// error passed back from the database driver.

	var _this = this;

	return new Promise(function (resolve, reject){
		_this.db.collection(coll, {strict:true}, function(error, collection){
			if (error) {
				console.log("Could not access collection: " + error.message);
				reject(error.message);
			} else {

				// Create a cursor from the aggregation request

				var cursor = collection.aggregate([
					{
						$sample: {size: parseInt(numberDocs)}
					}],
					{ cursor: { batchSize: 10 } }
				)

				// Iterate over the cursor to access each document in the sample
				// result set. Could use cursor.each() if we wanted to work with
				// individual documents here.

				cursor.toArray(function(error, docArray) {
			    	if (error) {
						console.log("Error reading from cursor: " + error.message);
						reject(error.message);
					} else {
						resolve(docArray);
					}
		    	})
			}
		})
	})
}

Note that collection.aggregate doesn't actually access the database – that's why it's a synchronous call (no need for a promise or a callback) – instead, it returns a cursor. The cursor is then used to read the data from MongoDB by invoking its toArray method. As toArray reads from the database, it can take some time and so it is an asynchronous call, and a callback function must be provided (toArray doesn't support promises).

The rest of these database methods can be viewed in db.js but they follow a similar pattern. The Node.js MongoDB Driver API documentation explains each of the methods and their parameters.

Summary & what's next in the series

This post built upon the first, introductory, post by stepping through how to install and use Node.js and the MongoDB Node.js driver. This is our first step in building a modern, reactive application using the MEAN and MERN stacks.

The blog went on to describe the implementation of a thin layer that's been created to sit between the application code and the MongoDB driver. The layer is there to provide a simpler, more limited API to make application development easier. In other applications, the layer could add extra value such as making semantic data checks.

The next part of this series adds the Express framework and uses it to implement a REST API to allow clients to send requests of the MongoDB database. That REST API will subsequently be used by the client application (using Angular in Part 4 or React in Part 5).

Continue following this blog series to step through building the remaining stages of the MongoPop application.

A simpler way to build your app – MongoDB Stitch, Backend as a Service

MongoDB Stitch is a backend as a service (BaaS), giving developers a REST-like API to MongoDB, and composability with other services, backed by a robust system for configuring fine-grained data access controls. Stitch provides native SDKs for JavaScript, iOS, and Android.

Built-in integrations give your application frontend access to your favorite third party services: Twilio, AWS S3, Slack, Mailgun, PubNub, Google, and more. For ultimate flexibility, you can add custom integrations using MongoDB Stitch's HTTP service.

MongoDB Stitch allows you to compose multi-stage pipelines that orchestrate data across multiple services; where each stage acts on the data before passing its results on to the next.

Unlike other BaaS offerings, MongoDB Stitch works with your existing as well as new MongoDB clusters, giving you access to the full power and scalability of the database. By defining appropriate data access rules, you can selectively expose your existing MongoDB data to other applications through MongoDB Stitch's API.

If you'd like to try it out, step through building an application with MongoDB Stitch.


If you're interested in learning everything you need to know to get started building a MongoDB-based app you can sign up for one of our free online MongoDB University courses.

Sign up for M101JS: MongoDB for Node.js Developers today!