GIANT Stories at MongoDB

Learn New Skills in a MongoDB World Pre-Conference Workshop

MongoDB World is just around the corner, and you might be wondering, Why should I go? By not attending, you’d be missing out on great educational opportunities!

First, there are the conference sessions. You get to connect with MongoDB engineers, who give you an insider’s look at how the database is built and what’s coming on the roadmap. You can get your detailed technical questions answered in a free consulting slot during Ask the Experts. You’ll also have the opportunity to meet other MongoDB users who showcase the projects they’ve built, talk about their pain points, and discuss where MongoDB does and doesn't fit their requirements.

Still, for me, the pre-conference workshops we offer are the icing on the cake. These full day training sessions, delivered by MongoDB engineers, provide the opportunity to experiment with MongoDB in a different context. By attending, you’ll get real, hands-on experience.

What You’ll Learn

Once again, we’re offering two of our most popular workshops on Ops Manager and data modeling. This year, we’re adding two new topics: how to keep your database secure and how to migrate an application from a relational database to MongoDB. No matter what your skill level is, we have a workshop to fit your needs.

MongoDB Security

In the MongoDB Security workshop, we will give you a set of instances and an app which you will need to secure, from top to bottom. You’ll start at the connection layer, using SSL and X.509 certificates, then move to the underlying encryption of the storage layer by configuring the cluster with MongoDB’s encrypted storage engine. We’ll also cover auditing, authentication, and role-based access control.

Cleansing Time - 99% SQL Free Applications

Because migrations from relational databases can be cumbersome, we’ll go through the different techniques and procedures that make this process as painless as possible. You might be wondering where to start, or how to break that annoying super large join into a nice MongoDB query. We’ll address these and other common obstacles in migration. This workshop is the result of several years of experience helping our customers perform these migrations.

Getting Started with MongoDB Ops Manager

The Getting Started with MongoDB Ops Manager workshop, for system administrators and DBAs, is a great crash course in MongoDB administration. Using a pre-configured Ops Manager installation and a set of AWS instances, we will set up and manage MongoDB clusters through both the Ops Manager UI and API commands. This is a great way to explore the full set of features that Ops Manager provides.

Data Modeling

Data Modeling is a complex exercise, and you want to ensure you analyze all possible options to define your database documents and operations. In this workshop, we will cover the majority of schema design patterns and trade-offs between different approaches to a common use case.

We want these workshops to be an opportunity for you to learn new skills in a context that allows you to transfer them into your day-to-day work. We limit the class size for each workshop to ensure that you’ll receive individual attention.

Sign up soon; prices increase after April 28.

Register now

How to Perform Random Queries on MongoDB

Norberto Leite

Technical

As part of my day-to-day tasks I occasionally run random queries on datasets. I might need them to pull a nice random sample of documents, or to pick a winner from a list of MUG members for a quick raffle or swag giveaway.


To run these queries in MongoDB (obviously I'm going to use MongoDB for the task!), we can take a few different approaches.

The traditional approach

Not long ago, to raffle a few conference tickets at one of the MongoDB User Groups, I invited all members to come up with an implementation on how to get a random document from MongoDB.

I was hoping for highly efficient approaches that would involve changing the MongoDB kernel and implementing a new feature in the database, but that’s not what our MUG members came up with. All proposed algorithms were instead client based, meaning that the randomization phase of the algorithm was performed entirely on the client side.

The algorithms were based on the client's random libraries and consisted of the following approach:

  • Get data from MongoDB
  • Run it through a random library
  • Spill out a winner

A simple example for this type of approach can be the following:

import random

def load_data(collection, n=100):
    # load_data_file is assumed to yield (index, document) pairs
    for i, d in load_data_file(n):
        collection.insert(d)

def get_data(collection, query={}):
    for d in collection.find(query):
        yield d

# Load the entire contents of the collection in memory.
elements = []
for e in get_data(collection):
    elements.append(e)

# randint is inclusive on both ends, so cap the index at len(elements) - 1
idx = random.randint(0, len(elements) - 1)

print "AND THE WINNER IS ..... " + elements[idx]['name']

We could adjust the above code to avoid loading the entire contents of the collection into memory at once by using reservoir sampling. Reservoir sampling is essentially equivalent to the process described above, except that we sample the winner directly from the incoming MongoDB result set, as a stream, so that we don't need to keep all documents in memory at once. The algorithm works by replacing the current winner with decreasing probability, so that every element has an equal probability of being selected, even without knowing beforehand how many elements there are. This is actually how mongodb-js/collection-sample works when communicating with older versions of the server that don't support the new $sample operator.
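Since mongodb-js/collection-sample is a JavaScript module, here is a minimal sketch of the idea in mongo shell JavaScript (it illustrates the technique rather than the module's actual code, and assumes the same names collection used in the Python examples below):

// Size-1 reservoir sampling: keep the i-th streamed document with probability 1/i,
// which leaves every document with an equal chance of being the final winner,
// without ever holding more than one document in memory.
var winner = null;
var seen = 0;
db.names.find().forEach(function(doc) {
    seen += 1;
    if (Math.random() < 1 / seen) {
        winner = doc;
    }
});
print("AND THE WINNER IS ..... " + winner.name);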

Adding a bit of salt ...

Although the previous approach can be slightly improved by adding filters and indexes, it is still client bound when choosing a random document. If we have 1M records (MongoDB User Groups are quite popular!), then iterating over that many documents becomes a burden on the application, and we are not really using any MongoDB magic to help us complete our task.


Adding an incremental value

A typical approach would be to mark our documents with a value that we would then use to randomly pick elements.

We can start by setting an incremental value on our documents and then querying based on the range of values we marked the documents with.

import random
import string

def load_data(collection, n=100):
    # for each element we will insert the `i` value
    for i in xrange(n):
        name = ''.join(random.sample(string.letters, 20))
        collection.insert({'name': name, 'i': i})

After tattooing our documents with this new element, we are now able to use some of the magic of the MongoDB query language to start collecting our well-deserved prize winner.

from pymongo import MongoClient
import random

mc = MongoClient()
db = mc.simplerandom
collection = db.names

number_of_documents = 100

load_data(collection, number_of_documents)

# randint is inclusive, so stay within the range of inserted `i` values
query = {'i': random.randint(0, number_of_documents - 1)}

winner = collection.find_one(query)

print "AND THE WINNER IS ..... " + winner['name']

While this approach seems fine and dandy, there are a few problems here:

  • we need to know the number of documents inserted
  • we need to make sure that all data is available

Double trip to mother ship

To address the first concern, we need to know the number of documents inserted, which we can get with the MongoDB count operator.

number_of_documents = collection.count()

This operator will immediately tell us the number of elements that a given collection contains.

But it does not solve all of our problems: another thread might be working concurrently, deleting or updating our documents.

We need to know the highest value of i, since it might differ from the count of documents in the collection, and we need to account for any gaps in the values of i (a deleted document, an incorrect client-side increment).

To accomplish this in a truly correct way, we would need to use distinct to make sure we are not missing any values, and then query for a document containing a randomly chosen value.

def load_data(collection, n=100):
    #let's skip some elements
    skiplist = [10, 12, 231 , 2 , 4]
    for i,d in load_data_file(n):
        d['i'] = i
        if i in skiplist:
            continue
        collection.insert( d )

load_data(collection, 100)

distinct = collection.distinct('i')

# random.sample returns a list, so take its single element
ivalue = random.sample(distinct, 1)[0]

winner = collection.find_one({ 'i': ivalue })

print "AND THE WINNER IS ..... " + winner['name']

Although we are starting to use MongoDB magic to give us some element of randomness, this is still not truly a good solution:

  • It requires multiple trips to the database
  • It becomes computationally bound on the client (again)
  • It is very error prone while the data is being modified

Making more of the same tattoo

To avoid a large computational burden on the client due to the high number of distinct i values, we could use a limited number of i values to mark our documents and randomize the occurrence of each mark:

def load_data(collection, n=100):
    #fixed number of marks
    max_i = 10
    for j,d in load_data_file(n):
        d['i'] = random.randint(0, max_i)
        collection.insert( d )

This way we limit the variation of the i value, and we limit the computational task on both the client and MongoDB:

number_of_documents = 100

load_data(collection, number_of_documents )

query = {'i': random.randint(0, 10 )  }

docs = [x for x in collection.find(query)]

winner = random.sample(docs, 1)[0]

print "AND THE WINNER IS ..... " + winner['name']

... but then again, we have a better solution!

The above implementations, however clever they might be, whether more or less simplistic, and whether they make single or multiple trips to the database, tend to:

  • Be inefficient
  • Require artificial workarounds in the data
  • Not be native, pure MongoDB implementations
  • Leave the randomization bound to the client

But fear not! Our 3.2 release brings a solution to this simple, often-wished-for task: $sample

$sample is a new aggregation framework operator that implements a native random sample operation over a collection data set:

  • no more playing with extra fields and extra indexes
  • no more double trips to the database to get your documents
  • native, optimized, sampling operator implemented in the database:
number_of_documents = 100

load_data(collection, number_of_documents)

winner = [ d for d in collection.aggregate([{'$sample': {'size': 1 }}])][0]

print "AND THE WINNER IS ..... " + winner['name']

Just beautiful!


Learn more about the latest features included in the release of MongoDB 3.2.

What's new in MongoDB 3.2


About the Author - Norberto Leite

Norberto is a Software Engineer at MongoDB working on the content and materials that make up the MongoDB University curriculum. Prior to this role, Norberto served as Developer Advocate and Solutions Architect helping professionals develop their skills and understanding of MongoDB. Norberto has several years of software development experience in large scalable systems.

Santa Claus and His Distributed Team

Norberto Leite

Company

So here it comes again, that happy season when Santa visits and brings the presents that we've earned for being so "well behaved" throughout the year.

Well… it happens that there are a lot of requests (more than 6 billion of them!), which requires the Elves to build a very scalable system to support so many gift deliveries. The system must handle not only the total number of requests, but also the concentration of those requests around the holiday season.

And of course, do not forget about the different types of websites that we need to build according to the regional variations of the requests (wool pajamas are likely to be requested next to the North Pole but not so much in the tropics!). To make sure that all requests are well served we need to have different applications serving this immense variability.

Architecture design

In order to deliver all presents in time for Christmas, Santa has asked Elf Chief Architect (ECA) to build a distributed, globally scalable platform so that regional Elves could build their apps to meet the needs of their local users.

Apparently the Elf Chief Architect has been attending several MongoDB Days conferences and will certainly be attending MongoDB World. One of the things that intrigued him was a talk on distributed container platforms backed by a MongoDB sharded cluster. He has a great deal of experience scaling databases with MongoDB, but containers have been gaining a lot of traction, so he decided to give them a go.

The first thing the ECA did was deploy a container fleet across different data centers around the world.

Schema design

Using tag-aware sharding, the Elves' team in Brazil can now build their app for South America with complete independence from the Japanese team.
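As a rough sketch of what that tag-aware setup could look like in the mongo shell (the shard names, database name, and region field are hypothetical, and assume the presents collection is sharded on a key that starts with region):

// Pin region ranges of the presents collection to regional shards.
sh.addShardTag("shardSA", "SA")        // shard hosted in the South America data center
sh.addShardTag("shardAPAC", "APAC")    // shard hosted in the Asia-Pacific data center

sh.addTagRange("santa.presents",
               { region: "SA", _id: MinKey }, { region: "SA", _id: MaxKey }, "SA")
sh.addTagRange("santa.presents",
               { region: "APAC", _id: MinKey }, { region: "APAC", _id: MaxKey }, "APAC")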

//brazil team
node.js
focus on 4 languages:
  Portuguese
  Spanish
  French
  Dutch - yes, Suriname and the Dutch Antilles are located in SA!
schema design for present requests:
{
  _id: "surf_board_12334244",
  color: "azul e amarelo",
  size: "M",
  present_for: "Ze Pequeno",
  address: {
    street: "Rua Noe 10",
    city: "Cidade de Deus",
    zip: "22773-410 - RJ",
    geo: {
        "type": "Point",
        "coordinates": [
          -43.36263209581375,
          -22.949136390313374
        ]
      }
  },
  ...
}

//japan team
ruby
focus on 2 languages:
  Japanese
  English
schema design for present requests:
{
  _id: { inc: 5535345, name: "Maneki Neko" },
  shape: "Round",
  receiver: "Shinobu Tsukasa",
  street: "〒東京都中央区 日本橋3-1-15, 1F",
  city: "Tokio",
  zip: "103-0027 - TK",
  position: {
      "type": "Point",
      "coordinates": [
          136.8573760986328,
          35.14882797801035
      ]
  },
  ...
}

As one can figure out from the above misaligned schema designs, the 2 teams have considerably different views on how to model some important base information.

Let's start with the simple but quite important format of _id.

For the Brazil team, a simple composed string is enough to uniquely identify the intended gift, while in Japan they adopted a different strategy, setting _id as a sub-document with two fields: an incremental value (inc) and the name of the object (name).

While both are valid MongoDB schemas, and can coexist in the same collection (sharded or not), this situation can cause some "interesting" side effects:

  • more complex queries to find all intended elements
  • index inefficiencies due to the multiple data types that we are indexing
  • sorting issues
  • ordering of keys in sub-documents will matter
  • specific match criterion
> db.test.find()
{ "_id" : { "a" : 1234, "b" : 5678 } }
> db.test.find( { _id : { b : 5678, a : 1234 } } ) <- No Match! Wuh?
> db.test.find( { _id : {  a : 1234, b : 5678 } } ) <- But this does? What's the diff?

This can become a very hairy situation!

Although flexibility is one of the most loved and appreciated attributes of MongoDB, one needs to be aware that "with great power comes great responsibility".

One of the basic rules of good schema design is to have a common base structure for documents stored in the same collection. This common structure should be a set of fields that are generally the same for different applications, with agreed upon data types and formats.

This uniform data structure makes everyone's life much simpler and, of course, when Santa wants to get a list of all the presents he needs to deliver (yes, Santa does his own queries with the MongoDB shell!), he does not need to build a large $or statement covering all the variations the schema might contain.
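For instance, with the two team schemas shown earlier, even a query for a single child's presents would already need something like the following (a shell sketch using the field names from the schemas above):

// Without a common structure, every query has to enumerate the schema variations.
db.presents.find({
    $or: [
        { present_for: "Ze Pequeno" },   // Brazil team's field name
        { receiver: "Ze Pequeno" }       // Japan team's field name
    ]
})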

Document Validation

Now we all know that, even with the best intentions, once we have a distributed team like Santa's regional super-expert Elf developers, sometimes we change the schema involuntarily, either by implementing a new schema that slightly changes a data type or by changing the format of a given field.

To avoid issues like these we introduced Document Validation in MongoDB 3.2. Document validation enforces guarantees on the structural definition of your documents.

Validation of incoming write operations is defined at the collection level on the server. In MongoDB this setup is very flexible, and we can adjust the validation rules as we evolve our application. All incoming write operations, whether inserts or updates, are matched against the predefined validator rules and either accepted or rejected.

The behavior of the validator can also be tuned per operation: the system can bypass the validation rule for certain write operations, or the validation action itself can be changed from the default rejection to just a warning in the mongod log.
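A minimal sketch of how such a rule could be declared in the mongo shell on MongoDB 3.2 (the collection name and the rule itself are only illustrative):

// Create the collection with a validator that requires the present_for field.
db.createCollection("presents", {
    validator: { present_for: { $exists: true } },
    validationLevel: "strict",   // apply the rule to all inserts and updates
    validationAction: "error"    // reject offending writes ("warn" only logs them)
})

// The rule can later be adjusted on the existing collection, e.g. downgraded to a warning:
db.runCommand({
    collMod: "presents",
    validator: { present_for: { $exists: true } },
    validationAction: "warn"
})

With a rule like that in place, an insert that misses the required field is rejected: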

db.presents.insert( { 
  _id: "skate_board_434", 
  color: "blue",
  for: "Mariott"
} )  -> results in an error since it's missing `present_for` field

This is particularly interesting for distributed, multi-app, multi-version environments that have multiple teams working over the same dataset in all sorts of different roles (developers, sysadmins, DBAs…).

The minute Elf Chief Architect read about this feature he jumped in his warm, comfortable, well cushioned sofa and started playing around with the existing release candidate! "What a great feature" some Elves reported hearing.

Lookup operator

Now, one of Santa's main responsibilities is to make sure that only well behaved children actually receive presents.

In the past, the Elves would overcome this problem by putting together a list of all the poorly behaved children (far fewer than the well behaved ones), marking the presents that matched this list with _deserves: false, and then filtering on this field in the list of results.

While this was efficient, since it was an in-place update over a given list, it still meant an extra write operation batched over 6 billion children (we are all children inside!). 6 billion * 16 (16 being the average number of presents each child gets at Christmas in the UK) tends to become a massive operation, though nothing MongoDB can’t handle. To avoid changing data, another option would be to filter with an aggregation operation. Since we just need a report at the end of the present submission period, which the Elves refer to as the CFP (call for presents), the ECA decided to test the new $lookup operator.

The Elves decided on the following architecture: a collection of all children and how well behaved they've been this year:

// collection children 
> db.children.find()
{
 name: "Norberto Leite",
 behaved: true,
 note: "Deserves all the presents possible!"
}
{
 name: "Ze Pequeno",
 behaved: false,
 note: "very nasty gangster!"
}
{
 name: "Shinobu Tsukasa",
 behaved: false,
 note: "japanese mafia member"
}

And another collection with all the presents that our parents and friends submitted on our behalf:

// presents collection
> db.presents.find()
{
  _id: "5535345_Maneki Neko",
  shape: "Round",
  receiver: "Shinobu Tsukasa",
  street: "〒東京都中央区 日本橋3-1-15, 1F",
  city: "Tokio",
  zip: "103-0027 - TK",
  geo: {
      "type": "Point",
      "coordinates": [
          136.8573760986328,
          35.14882797801035
      ]
  }
}
{
  _id: "surf_board_12334244",
  color: "azul e amarelo",
  size: "M",
  present_for: "Ze Pequeno",
  address: {
    street: "Rua Noe 10",
    city: "Cidade de Deus",
    zip: "22773-410 - RJ",
    geo: {
        "type": "Point",
        "coordinates": [
          -43.36263209581375,
          -22.949136390313374
        ]
      }
  }
}
...

Given these 2 collections we can perform a left outer join between them using the aggregation framework:

db.children.aggregate( {"$match": { "behaved": true }}, 
{"$lookup": {  "from": "presents", "localField": "name", "foreignField": "present_for", "as":"presents"   }  })

With the previous instruction, we collect all the presents for each well-behaved child and set those values in the presents field:

{
  name: "Norberto Leite",
  behaved: true,
  presents: [ { _id: "play_that_box_34432", ... } ]
}

Conclusion

Elf Chief Architect was really pleased with the present MongoDB has delivered this year. Not only can he make much more concise decisions around how data can be handled by the different teams across the globe, he can also accommodate some known regional challenges:

  • distribution - sharding
  • schema variation - document validation
  • enhancement of technical expertise - lots of different drivers
  • complex queries across different collections - $lookup
  • good integration with container architecture

There are many new tricks available with 3.2 that make the Elf Chief Architect happy:

  • partial indexes
  • connector for BI
  • new election protocol
  • new aggregation framework operators
  • ...

...and a full bag of other features that enable the Elves to produce great applications so that the Christmas operations run smoothly. You can learn more about all of these by downloading our What’s New in MongoDB 3.2 white paper.

With MongoDB 3.2, not only does your application get the edge required to enable large distributed teams to work on the same dataset with extra guarantees at the server level, it also keeps the flexibility and scalability that developers love.

Happy Holidays, everyone!


Learn more about MongoDB 3.2.

Read the What's New in 3.2 white paper


About the Author - Norberto

Norberto Leite is Technical Evangelist at MongoDB. Norberto has been working for the last 5 years on large scalable and distributable application environments, both as advisor and engineer. Prior to MongoDB Norberto served as a Big Data Engineer at Telefonica.

Building your first application with MongoDB: Creating a REST API using the MEAN Stack - Part 2

Updated March 2017 Since this post, other MEAN & MERN stack posts have been written: The Modern Application Stack by Andrew Morgan.

In the first part of this blog series, we covered the basic mechanics of our application and undertook some data modeling. In this second part, we will create tests that validate the behavior of our application and then describe how to set-up and run the application.

Write the tests first

Let’s begin by defining some small configuration libraries.

file name: test/config/test_config.js

module.exports = {
    url : 'http://localhost:8000/api/v1.0'
}

Our server will be running on port 8000 on localhost. This will be fine for initial testing purposes. Later, if we change the location or port number for a production system, it would be very easy to just edit this file.

To prepare for our test cases, we need to ensure that we have a good test environment. The following code achieves this for us. First, we connect to the database.

file name: test/setup_tests.js

function connectDB(callback) {
    mongoClient.connect(dbConfig.testDBURL, function(err, db) {
        assert.equal(null, err);
        reader_test_db = db;
        console.log("Connected correctly to server");
        callback(0);
    });
}

Next, we drop the user collection. This ensures that our database is in a known starting state.

function dropUserCollection(callback) {
        console.log("dropUserCollection");
        user = reader_test_db.collection('user');
        if (undefined != user) {
            user.drop(function(err, reply) {
                console.log('user collection dropped');
                callback(0);
            });
        } else {
            callback(0);
        }
    },

Next, we will drop the user feed entry collection.

    function dropUserFeedEntryCollection(callback) {
        console.log("dropUserFeedEntryCollection");
        user_feed_entry = reader_test_db.collection('user_feed_entry');
        if (undefined != user_feed_entry) {
            user_feed_entry.drop(function(err, reply) {
                console.log('user_feed_entry collection dropped');
                callback(0);
            });
        } else {
            callback(0);
        }
    }

Next, we will connect to Stormpath and delete all the users in our test application.

function getApplication(callback) {
        console.log("getApplication");
        client.getApplications({
            name: SP_APP_NAME
        }, function(err, applications) {
            console.log(applications);
            if (err) {
                log("Error in getApplications");
                throw err;
            }
            app = applications.items[0];
            callback(0);
        });
    },
    function deleteTestAccounts(callback) {
        app.getAccounts({
            email: TU_EMAIL_REGEX
        }, function(err, accounts) {
            if (err) throw err;
            accounts.items.forEach(function deleteAccount(account) {
                account.delete(function deleteError(err) {
                    if (err) throw err;
                });
            });
            callback(0);
        });
    }

Next, we close the database.

function closeDB(callback) {
    reader_test_db.close();
}

Finally, we call async.series to ensure that all the functions run in the correct order.

async.series([connectDB, dropUserCollection, dropUserFeedEntryCollection, getApplication, deleteTestAccounts, closeDB]);

Frisby was briefly mentioned earlier. We will use this to define our test cases, as follows.

file name: test/create_accounts_error_spec.js

TU1_FN = "Test";
TU1_LN = "User1";
TU1_EMAIL = "testuser1@example.com";
TU1_PW = "testUser123";
TU_EMAIL_REGEX = 'testuser*';
SP_APP_NAME = 'Reader Test';

var frisby = require('frisby');
var tc = require('./config/test_config');

We will start with the enroll route in the following code. In this case we are deliberately missing the first name field, so we expect a status reply of 400 with a JSON error that we forgot to define the first name. Let’s “toss that frisby”:

frisby.create('POST missing firstName')
    .post(tc.url + '/user/enroll',
          { 'lastName' : TU1_LN,
            'email' : TU1_EMAIL,
            'password' : TU1_PW })
    .expectStatus(400)
    .expectHeader('Content-Type', 'application/json; charset=utf-8')
    .expectJSON({'error' : 'Undefined First Name'})
    .toss()

In the following example, we are testing a password that does not have any lower-case letters. This would actually result in an error being returned by Stormpath, and we would expect a status reply of 400.

frisby.create('POST password missing lowercase')
    .post(tc.url + '/user/enroll',
          { 'firstName' : TU1_FN,
            'lastName' : TU1_LN,
            'email' : TU1_EMAIL,
            'password' : 'TESTUSER123' })
    .expectStatus(400)
    .expectHeader('Content-Type', 'application/json; charset=utf-8')
    .expectJSONTypes({'error' : String})
    .toss()

In the following example, we are testing an invalid email address. So, we can see that there is no @ sign and no domain name in the email address we are passing, and we would expect a status reply of 400.

frisby.create('POST invalid email address')
    .post(tc.url + '/user/enroll',
          { 'firstName' : TU1_FN,
            'lastName' : TU1_LN,
            'email' : "invalid.email",
            'password' : 'testUser' })
    .expectStatus(400)
    .expectHeader('Content-Type', 'application/json; charset=utf-8')
    .expectJSONTypes({'error' : String})
    .toss()

Now, let’s look at some examples of test cases that should work. Let’s start by defining 3 users.

file name: test/create_accounts_spec.js

TEST_USERS = [{'fn' : 'Test', 'ln' : 'User1',
               'email' : 'testuser1@example.com', 'pwd' : 'testUser123'},
              {'fn' : 'Test', 'ln' : 'User2',
               'email' : 'testuser2@example.com', 'pwd' : 'testUser123'},
              {'fn' : 'Test', 'ln' : 'User3',
               'email' : 'testuser3@example.com', 'pwd' : 'testUser123'}]

SP_APP_NAME = 'Reader Test';

var frisby = require('frisby');
var tc = require('./config/test_config');

In the following example, we are sending the array of the 3 users we defined above and are expecting a success status of 201. The JSON document returned would show the user object created, so we can verify that what was created matched our test data.

TEST_USERS.forEach(function createUser(user, index, array) {
    frisby.create('POST enroll user ' + user.email)
        .post(tc.url + '/user/enroll',
              { 'firstName' : user.fn,
                'lastName' : user.ln,
                'email' : user.email,
                'password' : user.pwd })
        .expectStatus(201)
        .expectHeader('Content-Type', 'application/json; charset=utf-8')
        .expectJSON({ 'firstName' : user.fn,
                      'lastName' : user.ln,
                      'email' : user.email })
        .toss()
});

Next, we will test for a duplicate user. In the following example, we will try to create a user where the email address already exists.

frisby.create('POST enroll duplicate user ')
    .post(tc.url + '/user/enroll',
          { 'firstName' : TEST_USERS[0].fn,
            'lastName' : TEST_USERS[0].ln,
            'email' : TEST_USERS[0].email,
            'password' : TEST_USERS[0].pwd })
    .expectStatus(400)
    .expectHeader('Content-Type', 'application/json; charset=utf-8')
    .expectJSON({'error' : 'Account with that email already exists.  Please choose another email.'})
    .toss()

One important issue is that we don’t know what API key will be returned by Stormpath a priori. So, we need to create a file dynamically that looks like the following. We can then use this file to define test cases that require us to authenticate a user.

file name: /tmp/readerTestCreds.js

TEST_USERS = 
[{    "_id":"54ad6c3ae764de42070b27b1",
    "email":"testuser1@example.com",
    "firstName":"Test",
    "lastName":"User1",
    "sp_api_key_id":"<API KEY ID>",
    "sp_api_key_secret":"<API KEY SECRET>"
},
{    "_id":"54ad6c3be764de42070b27b2",
    "email":"testuser2@example.com",
    "firstName":"Test",
    "lastName":"User2",
    "sp_api_key_id":"<API KEY ID>",
    "sp_api_key_secret":"<API KEY SECRET>"
}];
module.exports = TEST_USERS;

In order to create the temporary file above, we need to connect to MongoDB and retrieve user information. This is achieved by the following code.

file name: tests/writeCreds.js

TU_EMAIL_REGEX = new RegExp('^testuser*');
SP_APP_NAME = 'Reader Test';
TEST_CREDS_TMP_FILE = '/tmp/readerTestCreds.js';

var async = require('async');
var dbConfig = require('./config/db.js');
var mongodb = require('mongodb');
assert = require('assert');

var mongoClient = mongodb.MongoClient
var reader_test_db = null;
var users_array = null;

function connectDB(callback) {
     mongoClient.connect(dbConfig.testDBURL, function(err, db) {
         assert.equal(null, err);
         reader_test_db = db;
         callback(null);
     });
 }
 
 function lookupUserKeys(callback) {
     console.log("lookupUserKeys");
     user_coll = reader_test_db.collection('user');
     user_coll.find({email : TU_EMAIL_REGEX}).toArray(function(err, users) {
         users_array = users;
         callback(null);
     });
 }
 
function writeCreds(callback) {
     var fs = require('fs');
     fs.writeFileSync(TEST_CREDS_TMP_FILE, 'TEST_USERS = ');
     fs.appendFileSync(TEST_CREDS_TMP_FILE, JSON.stringify(users_array));
     fs.appendFileSync(TEST_CREDS_TMP_FILE, '; module.exports = TEST_USERS;');
     callback(0);
 }
 
 function closeDB(callback) {
     reader_test_db.close();
 }
 
 async.series([connectDB, lookupUserKeys, writeCreds, closeDB]);

In the following code, we can see that the first line uses the temporary file that we created with the user information. We have also defined several feeds, such as Dilbert and the Eater Blog.

file name: tests/feed_spec.js


TEST_USERS = require('/tmp/readerTestCreds.js');
 
var frisby = require('frisby');
var tc = require('./config/test_config');
var async = require('async');
var dbConfig = require('./config/db.js');
 
var dilbertFeedURL = 'http://feeds.feedburner.com/DilbertDailyStrip';
var nycEaterFeedURL = 'http://feeds.feedburner.com/eater/nyc';

Previously, we defined some users but none of them had subscribed to any feeds. In the following code we test feed subscription. Note that authentication is required now and this is achieved using .auth with the Stormpath API keys. Our first test is to check for an empty feed list.

function addEmptyFeedListTest(callback) {
     var user = TEST_USERS[0];
     frisby.create('GET empty feed list for user ' + user.email)
             .get(tc.url + '/feeds')
             .auth(user.sp_api_key_id, user.sp_api_key_secret)
             .expectStatus(200)
             .expectHeader('Content-Type', 'application/json; charset=utf-8')
             .expectJSON({feeds : []})
             .toss()
             callback(null);
}

In our next test case, we will subscribe our first test user to the Dilbert feed.

function subOneFeed(callback) {
     var user = TEST_USERS[0];
     frisby.create('PUT Add feed sub for user ' + user.email)
             .put(tc.url + '/feeds/subscribe', {'feedURL' : dilbertFeedURL})
             .auth(user.sp_api_key_id, user.sp_api_key_secret)
             .expectStatus(201)
             .expectHeader('Content-Type', 'application/json; charset=utf-8')
             .expectJSONLength('user.subs', 1)
             .toss()
             callback(null);
}

In our next test case, we will try to subscribe our first test user to a feed that they are already subscribed-to.

function subDuplicateFeed(callback) {
     var user = TEST_USERS[0];
     frisby.create('PUT Add duplicate feed sub for user ' + user.email)
             .put(tc.url + '/feeds/subscribe',
                  {'feedURL' : dilbertFeedURL})
             .auth(user.sp_api_key_id, user.sp_api_key_secret)
             .expectStatus(201)
             .expectHeader('Content-Type', 'application/json; charset=utf-8')
             .expectJSONLength('user.subs', 1)
             .toss()
     callback(null);
}

Next, we will subscribe our test user to a new feed. The result returned should confirm that the user is subscribed now to 2 feeds.

function subSecondFeed(callback) {
     var user = TEST_USERS[0];
     frisby.create('PUT Add second feed sub for user ' + user.email)
             .put(tc.url + '/feeds/subscribe',
                  {'feedURL' : nycEaterFeedURL})
             .auth(user.sp_api_key_id, user.sp_api_key_secret)
             .expectStatus(201)
             .expectHeader('Content-Type', 'application/json; charset=utf-8')
             .expectJSONLength('user.subs', 2)
             .toss()
     callback(null);
 }

Next, we will use our second test user to subscribe to a feed.

function subOneFeedSecondUser(callback) {
     var user = TEST_USERS[1];
     frisby.create('PUT Add one feed sub for second user ' + user.email)
             .put(tc.url + '/feeds/subscribe',
                  {'feedURL' : nycEaterFeedURL})
             .auth(user.sp_api_key_id, user.sp_api_key_secret)
             .expectStatus(201)
             .expectHeader('Content-Type', 'application/json; charset=utf-8')
             .expectJSONLength('user.subs', 1)
             .toss()
     callback(null);
}

async.series([addEmptyFeedListTest, subOneFeed, subDuplicateFeed, subSecondFeed, subOneFeedSecondUser]);

The REST API

Before we begin writing our REST API code, we need to define some utility libraries. First, we need to define how our application will connect to the database. Putting this information into a file gives us the flexibility to add different database URLs for development or production systems.

file name: config/db.js

module.exports = {
     url : 'mongodb://localhost/reader_test'
 }

If we wanted to turn on database authentication we could put that information in a file, as shown below. This file should not be checked into source code control for obvious reasons.

file name: config/security.js

module.exports = {
     stormpath_secret_key : 'YOUR STORMPATH APPLICATION KEY'
}

We can keep Stormpath API and Secret keys in a properties file, as follows, and need to carefully manage this file as well.

file name: config/stormpath_apikey.properties

apiKey.id = YOUR STORMPATH API KEY ID
apiKey.secret = YOUR STORMPATH API KEY SECRET

Express.js overview

In Express.js, we create an “application” (app). This application listens on a particular port for HTTP requests to come in. When requests come in, they pass through a middleware chain. Each link in the middleware chain is given a req (request) object and a res (results) object to store the results. Each link can choose to do work, or pass it to the next link. We add new middleware via app.use(). The main middleware is called our “router”, which looks at the URL and routes each different URL/verb combination to a specific handler function.
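As a tiny illustrative sketch of that flow (not part of the Reader application itself):

var express = require('express');
var app = express();

// First link in the chain: do some work (log the request), then hand off with next().
app.use(function(req, res, next) {
    console.log(req.method + ' ' + req.url);
    next();
});

// The "router" link: a handler for one specific URL/verb combination.
app.get('/ping', function(req, res) {
    res.json({ pong: true });
});

app.listen(8000);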

Creating our application

Now we can finally see our application code, which is quite small since we can embed handlers for various routes into separate files.

file name: server.js

var express = require('express');
var bodyParser = require('body-parser');
var mongoose = require('mongoose');
var stormpath = require('express-stormpath');
var routes = require("./app/routes");
var db     = require('./config/db');
var security = require('./config/security');
 
var app = express();
var morgan = require('morgan');
app.use(morgan('dev'));   // morgan needs a log format argument
app.use(stormpath.init(app, {
     apiKeyFile: './config/stormpath_apikey.properties',
     application: 'YOUR SP APPLICATION URL',
     secretKey: security.stormpath_secret_key
 }));
 var port = 8000;
 mongoose.connect(db.url);
 
app.use(bodyParser.urlencoded({ extended: true }));
 
routes.addAPIRouter(app, mongoose, stormpath);

We define our own middleware at the end of the chain to handle bad URLs.

app.use(function(req, res, next){
   res.status(404);
   res.json({ error: 'Invalid URL' });
});

Now our server application is listening on port 8000.

app.listen(port);

Let’s print a message on the console to the user.

console.log('Magic happens on port ' + port);
 
exports = module.exports = app;

Defining our Mongoose data models

We use Mongoose to map objects on the Node.js side to documents inside MongoDB. Recall that earlier, we defined 4 collections:

  1. Feed collection.
  2. Feed entry collection.
  3. User collection.
  4. User feed-entry-mapping collection.

So we will now define schemas for these 4 collections. Let’s begin with the user schema. Notice that we can also format the data, such as converting strings to lowercase and removing leading or trailing whitespace using trim.

file name: app/routes.js

var userSchema = new mongoose.Schema({
         active: Boolean,
         email: { type: String, trim: true, lowercase: true },
         firstName: { type: String, trim: true },
         lastName: { type: String, trim: true },
         sp_api_key_id: { type: String, trim: true },
         sp_api_key_secret: { type: String, trim: true },
         subs: { type: [mongoose.Schema.Types.ObjectId], default: [] },
         created: { type: Date, default: Date.now },
         lastLogin: { type: Date, default: Date.now },
     },
     { collection: 'user' }
);

In the following code, we can also tell Mongoose what indexes need to exist. Mongoose will also ensure that these indexes are created if they do not already exist in our MongoDB database. The unique constraint ensures that duplicates are not allowed. The “email : 1” maintains email addresses in ascending order. If we used “email : -1” it would be in descending order.

userSchema.index({email : 1}, {unique:true});
userSchema.index({sp_api_key_id : 1}, {unique:true});

We repeat the process for the other 3 collections.

var UserModel = mongoose.model( 'User', userSchema );

var feedSchema = new mongoose.Schema({
         feedURL: { type: String, trim:true },
         link: { type: String, trim:true },
         description: { type: String, trim:true },
         state: { type: String, trim:true, lowercase:true, default: 'new' },
         createdDate: { type: Date, default: Date.now },
         modifiedDate: { type: Date, default: Date.now },
     },
     { collection: 'feed' }
);
 
feedSchema.index({feedURL : 1}, {unique:true});
feedSchema.index({link : 1}, {unique:true, sparse:true});
 
var FeedModel = mongoose.model( 'Feed', feedSchema );

var feedEntrySchema = new mongoose.Schema({
         description: { type: String, trim:true },
         title: { type: String, trim:true },
         summary: { type: String, trim:true },
         entryID: { type: String, trim:true },
         publishedDate: { type: Date },
         link: { type: String, trim:true  },
         feedID: { type: mongoose.Schema.Types.ObjectId },
         state: { type: String, trim:true, lowercase:true, default: 'new' },
         created: { type: Date, default: Date.now },
     },
     { collection: 'feedEntry' }
);
 
feedEntrySchema.index({entryID : 1});
feedEntrySchema.index({feedID : 1});
 
var FeedEntryModel = mongoose.model( 'FeedEntry', feedEntrySchema );

var userFeedEntrySchema = new mongoose.Schema({
         userID: { type: mongoose.Schema.Types.ObjectId },
         feedEntryID: { type: mongoose.Schema.Types.ObjectId },
         feedID: { type: mongoose.Schema.Types.ObjectId },
         read : { type: Boolean, default: false },
     },
     { collection: 'userFeedEntry' }
 );

The following is an example of a compound index on 4 fields. Each index is maintained in ascending order.

userFeedEntrySchema.index({userID : 1, feedID : 1, feedEntryID : 1, read : 1});
 
var UserFeedEntryModel = mongoose.model('UserFeedEntry', userFeedEntrySchema );

Every route that comes in for GET, POST, PUT and DELETE needs to have the correct content type, which is application/json. Then the next link in the chain is called.

exports.addAPIRouter = function(app, mongoose, stormpath) {
 
     app.get('/*', function(req, res, next) {
         res.contentType('application/json');
         next();
     });
     app.post('/*', function(req, res, next) {
         res.contentType('application/json');
         next();
     });
     app.put('/*', function(req, res, next) {
         res.contentType('application/json');
         next();
     });
     app.delete('/*', function(req, res, next) {
         res.contentType('application/json');
         next();
     });

Now we need to define handlers for each combination of URL/verb. The link to the complete code is available in the resources section and we just show a few examples below. Note the ease with which we can use Stormpath. Furthermore, notice that we have defined /api/v1.0, so the client would actually call /api/v1.0/user/enroll, for example. In the future, if we changed the API, say to 2.0, we could use /api/v2.0. This would have its own router and code, so clients using the v1.0 API would still continue to work.

var router = express.Router();

     router.post('/user/enroll', function(req, res) {
         logger.debug('Router for /user/enroll');
         …
     });
     router.get('/feeds', stormpath.apiAuthenticationRequired, function(req, res) {
         logger.debug('Router for /feeds');
         …
     });
     router.put('/feeds/subscribe',
               stormpath.apiAuthenticationRequired, function(req, res) {
         logger.debug('Router for /feeds/subscribe');
         …
     });

     app.use('/api/v1.0', router);
}

Starting the server and running tests

Finally, here is a summary of the steps we need to follow to start the server and run the tests.

  • Ensure that the MongoDB instance is running
    • mongod
  • Install the Node libraries
    • npm install
  • Start the REST API server
    • node server.js
  • Run test cases
    • node setup_tests.js
    • jasmine-node create_accounts_error_spec.js
    • jasmine-node create_accounts_spec.js
    • node write_creds.js
    • jasmine-node feed_spec.js

MongoDB University provides excellent free training. There is a course specifically aimed at Node.js developers and the link can be found in the resources section below. The resources section also contains links to good MongoDB data modeling resources.

Resources

HTTP status code definitions

Chad Tindel’s Github Repository

M101JS: MongoDB for Node.js Developers

Data Models

Data Modeling Considerations for MongoDB Applications


Want to learn more MongoDB? Explore our Starter-Kit:

MongoDB Starter-Kit


<< Read Part 1

 


About the Author - Norberto

Norberto Leite is Technical Evangelist at MongoDB. Norberto has been working for the last 5 years on large scalable and distributable application environments, both as advisor and engineer. Prior to MongoDB Norberto served as a Big Data Engineer at Telefonica.

Building your first application with MongoDB: Creating a REST API using the MEAN Stack - Part 1

Updated March 2017 Since this post, other MEAN & MERN stack posts have been written: The Modern Application Stack by Andrew Morgan.

Introduction

In this 2-part blog series, you will learn how to use MongoDB, Mongoose Object Data Mapping (ODM) with Express.js and Node.js. These technologies use a uniform language - JavaScript - providing performance gains in the software and productivity gains for developers.

In this first part, we will describe the basic mechanics of our application and undertake data modeling. In the second part, we will create tests that validate the behavior of our application and then describe how to set-up and run the application.

No prior experience with these technologies is assumed and developers of all skill levels should benefit from this blog series. So, if you have no previous experience using MongoDB, JavaScript or building a REST API, don’t worry - we will cover these topics with enough detail to get you past the simplistic examples one tends to find online, including authentication, structuring code in multiple files, and writing test cases.

Let’s begin by defining the MEAN stack.

What is the MEAN stack?

The MEAN stack can be summarized as follows:

  • M = MongoDB/Mongoose.js: the popular database, and an elegant ODM for node.js.
  • E = Express.js: a lightweight web application framework.
  • A = Angular.js: a robust framework for creating HTML5 and JavaScript-rich web applications.
  • N = Node.js: a server-side JavaScript interpreter.

The MEAN stack is a modern replacement for the LAMP (Linux, Apache, MySQL, PHP/Python) stack that became the popular way for building web applications in the late 1990s.

In our application, we won’t be using Angular.js, as we are not building an HTML user interface. Instead, we are building a REST API which has no user interface, but could instead serve as the basis for any kind of interface, such as a website, an Android application, or an iOS application. You might say we are building our REST API on the ME(a)N stack, but we have no idea how to pronounce that!

What is a REST API?

REST stands for Representational State Transfer. It is a lighter weight alternative to SOAP and WSDL XML-based API protocols.

REST uses a client-server model, where the server is an HTTP server and the client sends HTTP verbs (GET, POST, PUT, DELETE), along with a URL and variable parameters that are URL-encoded. The URL describes the object to act upon and the server replies with a result code and valid JavaScript Object Notation (JSON).

Because the server replies with JSON, it makes the MEAN stack particularly well suited for our application, as all the components are in JavaScript and MongoDB interacts well with JSON. We will see some JSON examples later, when we start defining our Data Models.

The CRUD acronym is often used to describe database operations. CRUD stands for CREATE, READ, UPDATE, and DELETE. These database operations map very nicely to the HTTP verbs, as follows:

  • POST: A client wants to insert or create an object.
  • GET: A client wants to read an object.
  • PUT: A client wants to update an object.
  • DELETE: A client wants to delete an object.

These operations will become clear later when we define our API.

Some of the common HTTP result codes that are often used inside REST APIs are as follows:

  • 200 - “OK”.
  • 201 - “Created” (Used with POST).
  • 400 - “Bad Request” (Perhaps missing required parameters).
  • 401 - “Unauthorized” (Missing authentication parameters).
  • 403 - “Forbidden” (You were authenticated but lacking required privileges).
  • 404 - “Not Found”.

A complete description can be found in the RFC document, listed in the resources section at the end of this blog. We will use these result codes in our application and you will see some examples shortly.

Why Are We Starting with a REST API?

Developing a REST API enables us to create a foundation upon which we can build all other applications. As previously mentioned, these applications may be web-based or designed for specific platforms, such as Android or iOS.

Today, there are also many companies that are building applications that do not use an HTTP or web interface, such as Uber, WhatsApp, Postmates, and Wash.io. A REST API also makes it easy to implement other interfaces or applications over time, turning the initial project from a single application into a powerful platform.

Creating our REST API

The application that we will be building will be an RSS Aggregator, similar to Google Reader. Our application will have two main components:

  1. The REST API
  2. Feed Grabber (similar to Google Reader)

In this blog series we will focus on building the REST API, and we will not cover the intricacies of RSS feeds. However, code for Feed Grabber is available in a github repository, listed in the resources section of this blog.

Let’s now describe the process we will follow in building our API. We will begin by defining the data model for the following requirements:

  • Store user information in user accounts
  • Track RSS feeds that need to be monitored
  • Pull feed entries into the database
  • Track user feed subscriptions
  • Track which feed entry a user has already read

Users will need to be able to do the following:

  • Create an account
  • Subscribe/unsubscribe to feeds
  • Read feed entries
  • Mark feeds/entries as read or unread

Modeling Our Data

An in-depth discussion on data modeling in MongoDB is beyond the scope of this article, so see the references section for good resources on this topic.

We will need 4 collections to manage this information:

  • Feed collection
  • Feed entry collection
  • User collection
  • User-feed-entry mapping collection

Let’s take a closer look at each of these collections.

Feed Collection

Let's now look at some code. To model a feed collection, we can use the following JSON document:

{
    "_id": ObjectId("523b1153a2aa6a3233a913f8"),
    "requiresAuthentication": false,
    "modifiedDate": ISODate("2014-08-29T17:40:22Z"),
    "permanentlyRemoved": false,
    "feedURL": "http://feeds.feedburner.com/eater/nyc",
    "title": "Eater NY",
    "bozoBitSet": false,
    "enabled": true,
    "etag": "4bL78iLSZud2iXd/vd10mYC32BE",
    "link": "http://ny.eater.com/",
    "permanentRedirectURL": null,
    "description": "The New York City Restaurant, Bar, and Nightlife Blog”
}

If you are familiar with relational database technology, then you will know about databases, tables, rows and columns. In MongoDB, there is a mapping for most of these relational concepts. At the highest level, a MongoDB deployment supports one or more databases. A database contains one or more collections, which are similar to tables in a relational database. Collections hold documents. Each document in a collection is, at the highest level, similar to a row in a relational table. However, documents do not follow a fixed schema with pre-defined columns of simple values. Instead, each document consists of one or more key-value pairs where the value can be simple (e.g., a date) or more sophisticated (e.g., an array of address objects).

Our JSON document above is an example of one RSS feed for the Eater Blog, which tracks information about restaurants in New York City. We can see that there are a number of different fields but the key ones that our client application may be interested in include the URL of the feed and the feed description. The description is important so that if we create a mobile application, it would show a nice summary of the feed.

The remaining fields in our JSON document are for internal use. A very important field is _id. In MongoDB, every document must have a field called _id. If you create a document without this field, at the point where you save the document, MongoDB will create it for you. In MongoDB, this field is a primary key and MongoDB will guarantee that within a collection, this value is unique.

Feed Entry Collection

After feeds, we want to track feed entries. Here is an example of a document in the feed entry collection:

{
    "_id": ObjectId("523b1153a2aa6a3233a91412"),
    "description": "Buzzfeed asked a bunch of people...”,
    "title": "Cronut Mania: Buzzfeed asked a bunch of people...",
    "summary": "Buzzfeed asked a bunch of people that were...”,
    "content": [{
        "base": "http://ny.eater.com/",
        "type": "text/html",
        "value": ”LOTS OF HTML HERE ",
        "language": "en"
    }],
    "entryID": "tag:ny.eater.com,2013://4.560508",
    "publishedDate": ISODate("2013-09-17T20:45:20Z"),
    "link": "http://ny.eater.com/archives/2013/09/cronut_mania_41.php",
    "feedID": ObjectId("523b1153a2aa6a3233a913f8")
}

Again, we can see that there is a _id field. There are also some other fields, such as description, title and summary. For the content field, note that we are using an array, and the array is also storing a document. MongoDB allows us to store sub-documents in this way and this can be very useful in some situations, where we want to hold all information together. The entryID field uses the tag format to avoid duplicate feed entries. Notice also the feedID field that is of type ObjectId - the value is the _id of the Eater Blog document, described earlier. This provides a referential model, similar to a foreign key in a relational database. So, if we were interested to see the feed document associated with this ObjectId, we could take the value 523b1153a2aa6a3233a913f8 and query the feed collection on _id, and it would return the Eater Blog document.
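For example, in the mongo shell that lookup would be a simple equality query on _id (assuming the collection is named feed, as it is in Part 2 of this series):

// Follow the reference from the feed entry back to its feed document.
db.feed.find({ "_id": ObjectId("523b1153a2aa6a3233a913f8") })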

User Collection

Here is the document we could use to keep track of users:

{
     "_id" : ObjectId("54ad6c3ae764de42070b27b1"),
     "active" : true,
     "email" : "testuser1@example.com",
     "firstName" : "Test",
     "lastName" : "User1",
     "sp_api_key_id" : "6YQB0A8VXM0X8RVDPPLRHBI7J",
     "sp_api_key_secret" : "veBw/YFx56Dl0bbiVEpvbjF”,
     "lastLogin" : ISODate("2015-01-07T17:26:18.996Z"),
     "created" : ISODate("2015-01-07T17:26:18.995Z"),
     "subs" : [ ObjectId("523b1153a2aa6a3233a913f8"),
                                ObjectId("54b563c3a50a190b50f4d63b") ],
}

A user has an email address, first name and last name. There is also an sp_api_key_id and sp_api_key_secret - we will use these later with Stormpath, a user management API. The last field, called subs, is a subscription array. The subs field tells us which feeds this user is subscribed-to.

User-Feed-Entry Mapping Collection

The last collection allows us to map users to feeds and to track which feeds have been read.

{
     "_id" : ObjectId("523b2fcc054b1b8c579bdb82"),
     "read" : true,
     "user_id" : ObjectId("54ad6c3ae764de42070b27b1"),
     "feed_entry_id" : ObjectId("523b1153a2aa6a3233a91412"),
     "feed_id" : ObjectId("523b1153a2aa6a3233a913f8")
}

We use a Boolean (true/false) to mark the feed as read or unread.
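For example, marking one entry as read for one user would be a simple update (a shell sketch, assuming the mapping collection is named user_feed_entry, the name used by the test setup code in Part 2):

// Flip the read flag for one (user, feed entry) pair.
db.user_feed_entry.update(
    { "user_id": ObjectId("54ad6c3ae764de42070b27b1"),
      "feed_entry_id": ObjectId("523b1153a2aa6a3233a91412") },
    { "$set": { "read": true } }
)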

Functional Requirements for the REST API

As previously mentioned, users need to be able to do the following:

  • Create an account.
  • Subscribe/unsubscribe to feeds.
  • Read feed entries.
  • Mark feeds/entries as read or unread.

Additionally, a user should be able to reset their password.

The following list shows how these operations map to HTTP routes and verbs:

  • /user/enroll (POST): Register a new user. Variables: firstName, lastName, email, password
  • /user/resetPassword (PUT): Password reset. Variables: email
  • /feeds (GET): Get the feed subscriptions for the user, with description and unread count
  • /feeds/subscribe (PUT): Subscribe to a new feed. Variables: feedURL
  • /feeds/entries (GET): Get all entries for the feeds the user is subscribed to
  • /feeds/<feedid>/entries (GET): Get all entries for a specific feed
  • /feeds/<feedid> (PUT): Mark all entries for a specific feed as read or unread. Variables: read = <true | false>
  • /feeds/<feedid>/entries/<entryid> (PUT): Mark a specific entry as read or unread. Variables: read = <true | false>
  • /feeds/<feedid> (DELETE): Unsubscribe from this particular feed

In a production environment, the use of secure HTTP (HTTPS) would be the standard approach when sending sensitive details, such as passwords.

Real World Authentication with Stormpath

In robust real-world applications it is important to provide user authentication. We need a secure approach to manage users, passwords, and password resets.

There are a number of ways we could authenticate users for our application. One possibility is to use Node.js with the Passport Plugin, which could be useful if we wanted to authenticate with social media accounts, such as Facebook or Twitter. However, another possibility is to use Stormpath. Stormpath provides User Management as a Service and supports authentication and authorization through API keys. Basically, Stormpath maintains a database of user details and passwords and a client application REST API would call the Stormpath REST API to perform user authentication.

The following diagram shows the flow of requests and responses using Stormpath.

In detail, Stormpath will provide a secret key for each “Application” that is defined with their service. For example, we could define an application as “Reader Production” or “Reader Test”. This could be very useful when we are still developing and testing our application, as we may be frequently adding and deleting test users. Stormpath will also provide an API Key Properties file.

Stormpath also allows us to define password strength requirements for each application, such as:

  • Must have >= 8 characters.
  • Must include lowercase and uppercase.
  • Must include a number.
  • Must include a non-alphabetic character

Stormpath keeps track of all of our users and assigns them API keys, which we can use for our REST API authentication. This greatly simplifies the task of building our application, as we don’t have to focus on writing code for authenticating users.

Node.js

Node.js is a runtime environment for server-side and network applications. Node.js uses JavaScript and it is available for many different platforms, such as Linux, Microsoft Windows and Apple OS X.

Node.js applications are built using many library modules and there is a very rich ecosystem of libraries available, some of which we will use to build our application.

To start using Node.js, we need to define a package.json file describing our application and all of its library dependencies.

The Node.js Package Manager installs copies of the libraries in a subdirectory, called node_modules/, in the application directory. This has benefits, as it isolates the library versions for each application and so avoids code compatibility problems if the libraries were to be installed in a standard system location, such as /usr/lib, for example.

The command npm install will create the node_modules/ directory, with all of the required libraries.

Here is the JavaScript from our package.json file:

{
    "name": "reader-api",
    "main": "server.js",
    "dependencies": {
        "express" : "~4.10.0",
        "stormpath" : "~0.7.5", "express-stormpath" : "~0.5.9",
        "mongodb" : "~1.4.26", "mongoose" : "~3.8.0",
        "body-parser" : "~1.10.0", "method-override" : "~2.3.0",
        "morgan" : "~1.5.0", "winston" : "~0.8.3", "express-winston" : "~0.2.9",
        "validator" : "~3.27.0",
        "path" : "~0.4.9",
        "errorhandler" : "~1.3.0",
        "frisby" : "~0.8.3",
        "jasmine-node" : "~1.14.5",
        "async" : "~0.9.0"
    }
}

Our application is called reader-api. The main file is called server.js. Then we have a list of the dependent libraries and their versions. Some of these libraries are designed for parsing HTTP queries. The test harness we will use is called frisby, and jasmine-node is used to run the frisby test scripts.

One library that is particularly important is async. If you have never used node.js, it is important to understand that node.js is designed to be asynchronous. So, any function which does blocking input/output (I/O), such as reading from a socket or querying a database, will take a callback function as the last parameter, and then continue with the control flow, only returning to that callback function once the blocking operation has completed. Let’s look at the following simple example to demonstrate this.

function foo() {
    someAsyncFunction(params, function(err, results) {
        console.log("one");
    });
    console.log("two");
}

In the above example, we may think that the output would be:

one two

but in fact it might be:

two one

because the line that prints “one” might happen later, asynchronously, in the callback. We say “might” because if conditions are just right, “one” might print before “two”. This element of uncertainty in asynchronous programming is called non-deterministic execution. For many programming tasks, this is actually desirable and permits high performance, but clearly there are times when we want to execute functions in a particular order. The following example shows how we could use the async library to achieve the desired result of printing the numbers in the correct order:

actionArray = [
    function one(cb) {
        someAsyncFunction(params, function(err, results) {
            if (err) {
                cb(new Error("There was an error"));
                return;
            }
            console.log("one");
            cb(null);
        });
    },
    function two(cb) {
        console.log("two");
        cb(null);
    }
];

async.series(actionArray);
 

In the above code, we are guaranteed that function two will only be called after function one has completed.

Wrapping Up Part 1

Now that we have seen the basic mechanics of node.js and async function setup, we are ready to move on. Rather than move into creating the application, we will instead start by creating tests that validate the behavior of the application. This approach is called test-driven development and has two very good features:

  • It helps the developer really understand how data and functions are consumed and often exposes subtle needs like the ability to return 2 or more things in an array instead of just one thing.
  • By writing tests before building the application, the paradigm becomes “broken / unimplemented until proven tested OK” instead of “assumed to be working until a test fails.” The former is a “safer” way to keep the code healthy.


Learn more MongoDB by exploring our Starter-Kit:

MongoDB Starter-Kit


Read Part 2 >>

About the Author - Norberto

Norberto Leite is Technical Evangelist at MongoDB. Norberto has been working for the last 5 years on large scalable and distributable application environments, both as advisor and engineer. Prior to MongoDB Norberto served as a Big Data Engineer at Telefonica.