Simplifying MongoDB Testing with Mongo Orchestration

MongoDB
January 23, 2015
Category: Technical

Introduction

MongoDB is not only a database; it also comes with a rich suite of drivers that connect various languages to the database. When we develop those drivers, we need to test extensively to make sure that they work with a wide variety of MongoDB configurations. This post is about internal tools we created to make that testing much easier.

As part of the testing, we need to start, stop and reconfigure MongoDB many times, automatically. There’s no part of a software project more highly esteemed, anticipated, nor relished than the process management code in your test suite. This is only one reason why we as developers love testing so much!

If only this were the truth. In the end, software that requires the most sophisticated tests, such as a distributed system, has test suites filled with fickle process management code, leaving the developer to wonder if test failures mean bugs or simply represent noise from temperamental tests.

The complexity and nuisance only worsen when we consider writing tests in more than one language, with each test suite requiring its own interface to the processes it needs to run. MongoDB drivers, for example, need to be tested against a variety of MongoDB topologies. These topologies may even change over time: a replica set gets a new member, the primary of a replica set fails over, etc.

Originally, each driver’s test suite implemented its own way of spawning, manipulating, and shutting down MongoDB. Although demanding a lot of duplicated effort, it allowed each test suite to have its own set of bugs. Additionally, there was no guarantee one driver had the same tests as the others. Indeed, drivers did not have uniform coverage of all scenarios, since the test suites for each were developed separately. In some cases, one driver was unable to test the same situations as another, due to discrepancies in their process management code for the MongoDB server. Ultimately some drivers had different behavior than others under the same conditions because of disparities in their testing.

But that is all history. Now we’re using Mongo Orchestration.

What is Mongo Orchestration?

Mongo Orchestration (MO) is an HTTP server providing a RESTful interface to MongoDB process management running on the same machine. JSON documents specify a topology to create, and Mongo Orchestration makes it so. It also provides a way to manipulate each process individually, so that we can enhance (or utterly destroy) any kind of cluster we have created previously. Recently, we have open-sourced this project so that anyone writing applications close to MongoDB can take advantage of what it has to offer.
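
To make the idea of "a JSON document that specifies a topology" concrete, here is a minimal sketch that builds such a document for a replica set. The helper name replica_set_request and the field names "id" and "members" are assumptions chosen for illustration, not a statement of MO's actual schema; the project's documentation is the authority on the real request format.

```python
import json


def replica_set_request(num_members, set_name="repl0"):
    """Build a request body describing a replica set topology.

    The field names ("id", "members") are illustrative assumptions;
    check the Mongo Orchestration documentation for the real schema.
    """
    if num_members < 1:
        raise ValueError("a replica set needs at least one member")
    return {
        "id": set_name,
        # One empty document per member stands for "use default options".
        "members": [{} for _ in range(num_members)],
    }


if __name__ == "__main__":
    # Print the document that a test would send as the request body.
    print(json.dumps(replica_set_request(3), indent=2))
```

The point of the sketch is that the topology is plain data: any language that can produce JSON and speak HTTP can describe the same cluster in the same way.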

An Example

Let’s suppose that you’re writing a set of tools for storing files on MongoDB. The aim is to provide an API for programmers to leverage in their own applications, using their own installations of MongoDB. Because this API will be implemented in each language we hope to target, we’d like to test the API for each explicitly. Also, since we love testing so much and want to do a good job, we’ll want to test against each kind of MongoDB topology and perhaps even hit a few cases where the topology is changing state. There are a few options for how we can manipulate MongoDB:

  1. Write and re-write process management code for each language
  2. Use the deployment features of MongoDB Management Service (MMS)
  3. Set up some kind of server, perhaps exposing a RESTful service that manages everything for us, making us very happy indeed. Mongo Orchestration, for example.

As we’ve learned, Option 1 requires enormous, duplicated effort, and each development team may not agree on what tests to write or bugs to create. Option 2 would work, but it would still require us to write code to iterate over all the topologies and configurations we want to test. The winning solution is Mongo Orchestration. Not only does it allow us to create each kind of MongoDB topology we aim to test, but it also allows us to manipulate these clusters easily. This also allows us to accept management of MongoDB as a black box to be used by the various test frameworks for each language our software is written in. All that needs to happen now is to construct JSON requests to be sent to MO to control MongoDB as we need.

We have sagely chosen to use Mongo Orchestration to drive MongoDB process management for our tests. Now let’s explore how our testing will work from the point of view of our Python client.

1. Set up Mongo Orchestration

The first item on our agenda is to get Mongo Orchestration running. Installation is easy:

pip install mongo-orchestration

We can also define where various installations of MongoDB live on our system inside a configuration file. This step isn’t strictly necessary but is useful when testing multiple versions of MongoDB. Because the system under test is so tightly coupled to the server itself, testing multiple versions is probably a good idea:

mongo-orchestration.config

{
  "releases": {
    "27-nightly": "/var/mongodb/versions/master-nightly/bin",
    "26-release": "/var/mongodb/versions/26-release/bin",
    "24-release": "/var/mongodb/versions/24-release/bin",
    "22-release": "/var/mongodb/versions/22-release/bin",
    "20-release": "/var/mongodb/versions/20-release/bin"
  }
}
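
A mistyped path in this file tends to surface only later, as a confusing failure to start a server. As a small sketch, the following check loads the configuration and reports every release whose directory does not exist. The function name missing_release_dirs is hypothetical and not part of Mongo Orchestration; the only thing it assumes is the "releases" layout shown above.

```python
import json
import os


def missing_release_dirs(config_path):
    """Return {release_name: path} for each configured path that is not a directory."""
    with open(config_path) as config_file:
        config = json.load(config_file)
    # "releases" maps a release label to the directory holding its binaries.
    releases = config.get("releases", {})
    return {name: path for name, path in releases.items()
            if not os.path.isdir(path)}


if __name__ == "__main__":
    config_name = "mongo-orchestration.config"
    if os.path.exists(config_name):
        for name, path in sorted(missing_release_dirs(config_name).items()):
            print("release %r points at a missing directory: %s" % (name, path))
```

Running a check like this at the start of a test run keeps configuration mistakes separate from genuine test failures.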

Now we can start Mongo Orchestration:

mongo-orchestration start -f mongo-orchestration.config

2. Hook into the tests

The next step is to send some requests to Mongo Orchestration. Since we’re using Python as our language of choice in this example, we’ll be using the requests module to interact with the Mongo Orchestration server. First off, we need to start MongoDB. This will probably be done in our setUp method in our TestCase. We also want to take care of tearing down MongoDB when we’re done with it, which can be done in the tearDown method:

import unittest
import requests


class ExampleTestCase(unittest.TestCase):
    def setUp(self):
        # Set up a single mongod instance with MO.
        response = requests.post("http://localhost:8889/servers",
                                 json={"name": "mongod"})
        # Decode the JSON body before reading fields from it.
        info = response.json()
        self.server_id = info.get('server_id')
        self.conn_string = info.get('mongodb_uri')

    def tearDown(self):
        # Ask MO to stop the mongod that setUp started.
        requests.post("http://localhost:8889/servers/%s"
                      % self.server_id,
                      json={"action": "stop"})

3. Profit

The quantity of logic and debugging saved by encapsulating the startup of a single mongod in the one POST request in setUp above can hardly be overstated. By the time info is populated with the server’s information, that mongod process has started with the same options as in any other test and is ready to accept requests. This helps to ensure that each implementation of the API is tested under the same conditions.
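
One way to reuse that encapsulation across many test cases is to wrap the start and stop requests in a context manager. The following is a sketch under stated assumptions: the class name MongodFixture and the helper post_json are hypothetical, the URL and the field names ("name", "action", "server_id", "mongodb_uri") simply mirror the example above, and the HTTP call is injectable so the fixture can be exercised without a running MO server.

```python
import json
import urllib.request

MO_URL = "http://localhost:8889"  # address used in the example above


def post_json(url, body):
    """POST a JSON body and return the decoded JSON response ({} if empty)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw.decode("utf-8")) if raw else {}


class MongodFixture:
    """Start one mongod through MO on enter and stop it on exit."""

    def __init__(self, post=post_json, base_url=MO_URL):
        self._post = post
        self._base_url = base_url
        self.server_id = None
        self.conn_string = None

    def __enter__(self):
        info = self._post(self._base_url + "/servers", {"name": "mongod"})
        self.server_id = info.get("server_id")
        self.conn_string = info.get("mongodb_uri")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._post("%s/servers/%s" % (self._base_url, self.server_id),
                   {"action": "stop"})
        return False  # never swallow a failure raised inside the block
```

A test would then write "with MongodFixture() as fixture:" and connect to fixture.conn_string, and the stop request is sent even if the body of the block raises.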

Unifying the test process for each implementation of your API can even be taken a step further. At MongoDB, for example, we’re writing tests that describe how a driver’s perception of a MongoDB cluster changes with the cluster itself. Each test is implemented as a JSON document that describes several “phases,” each phase containing instructions for manipulating the cluster and describing how the driver should react. In other words, part of the test specification itself can be sent to Mongo Orchestration for setting everything up correctly! The general flow of a test will look like this:

  1. Set up Mongo Orchestration.
  2. Parse the JSON test. For each phase:
     a. Send the “input” as JSON to Mongo Orchestration. Mongo Orchestration takes care of changing the MongoDB cluster itself.
     b. Assert that the state of the driver matches the “output” of the phase. For instance, the driver should know that a new machine has become primary in a replica set while the old primary has left the set entirely.
  3. Tear down Mongo Orchestration.

This flow resembles the Server Discovery and Monitoring (SDAM) tests that the MongoDB Drivers team has now, with the exception that this flow tests against real running MongoDB processes rather than using mocked server responses.
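
The phase loop described above can be sketched in a few lines. This is an illustration, not the drivers team's actual harness: the function name run_phased_test, the top-level "phases" key, and the two callbacks are assumptions, with send_to_mo standing in for the HTTP request to Mongo Orchestration and observe_driver standing in for whatever the driver under test exposes about its view of the cluster.

```python
import json


def run_phased_test(test_json, send_to_mo, observe_driver):
    """Apply each phase's "input" via MO, then compare driver state to its "output".

    Returns a list of (phase_index, expected, actual) for every mismatch,
    so an empty list means the driver tracked the cluster correctly.
    """
    test = json.loads(test_json)
    mismatches = []
    for index, phase in enumerate(test["phases"]):
        # Step a: let Mongo Orchestration change the cluster.
        send_to_mo(phase["input"])
        # Step b: check what the driver now believes about the cluster.
        actual = observe_driver()
        if actual != phase["output"]:
            mismatches.append((index, phase["output"], actual))
    return mismatches
```

Because the runner only sees JSON and two callbacks, the same test documents can drive the suite for every language, which is the uniformity the flow above is after.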

Conclusion

We’ve taken a look at the confusion and frustration that can occur when duplicating process management code between projects: the duplicated effort, unique deficiencies, and respective bugs in separately maintained process management code are all reasons to try something better. We’ve also discussed briefly how one could automate tests that run on top of a changing MongoDB topology. Mongo Orchestration is a valuable tool for testing just about any software that runs on top of MongoDB, and it can save you time, trouble, and tears.

I’m maintaining Mongo Orchestration on GitHub. Please feel free to check it out and use it for your own testing and experiments. The wiki holds all documentation, including an index of REST calls with examples. For a real-world example of using Mongo Orchestration in a test suite, I invite you to view the tests for Mongo Connector, another project I maintain. Click below to access Mongo Orchestration on GitHub:

Mongo Orchestration on GitHub

Happy Orchestrating!

About Luke Lovett

Luke Lovett is a software engineer at MongoDB, where he works on the Python team. He maintains Mongo Orchestration, an HTTP server for managing MongoDB processes; Mongo Connector, a command-line tool and Python library for handling events from the MongoDB oplog; and the Hadoop connector for MongoDB. When he’s not fulfilling his passion for testing software and finding bugs, Luke sometimes creates bugs so he can find them later.
