GIANT Stories at MongoDB

Modern Distributed Application Deployment with Kubernetes and MongoDB Atlas

Jay Gordon

Technical, Cloud

Storytelling is one of the parts of being a Developer Advocate that I enjoy. Sometimes the stories are about the special moments when the team comes together to keep a system running or build it faster. But there are less than glorious tales to be told about the software deployments I’ve been involved in. And for situations where we needed to deploy several times a day, now we are talking nightmares.

For some time, I worked at a company that believed that deploying to production several times a day was ideal for project velocity. Our team was working to ensure that advertising software across our media platform was always being updated and released. One of the issues was a lack of real automation in the process of applying new code to our application servers.

What both ops and development teams had in common was a desire for improved ease and agility around application and configuration deployments. In this article, I’ll present some of my experiences and cover how MongoDB Atlas and Kubernetes can be leveraged together to simplify the process of deploying and managing applications and their underlying dependencies.

Let's talk about how a typical software deployment unfolded:

  1. The developer would send in a ticket asking for the deployment
  2. The developer and I would agree upon a time to deploy the latest software revision
  3. We would modify an existing bash script with the appropriate git repository version info
  4. We’d need to manually back up the old deployment
  5. We’d need to manually create a backup of our current database
  6. We’d watch the bash script perform this "Deploy" on about six servers in parallel
  7. Wave a dead chicken over my keyboard

Some of these deployments would fail, requiring a return to the previous version of the application code. Rolling back to a prior version meant manually reverting the repository to the older revision, performing manual database restores, and finally confirming with the team that used this system that everything was working properly. It was a real mess and I really wasn't in a position to change it.

I eventually moved into a position which gave me greater visibility into what other teams of developers, specifically those in the open source space, were doing for software deployments. I noticed that — surprise! — people were no longer interested in doing the same work over and over again.

Developers and their supporting ops teams have been given keys to a whole new world in the last few years by utilizing containers and automation platforms. Rather than doing manual work required to produce the environment that your app will live in, you can deploy applications quickly thanks to tools like Kubernetes.

What's Kubernetes?

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. Kubernetes can help reduce the amount of work your team will have to do when deploying your application. Along with MongoDB Atlas, you can build scalable and resilient applications that stand up to high traffic or can easily be scaled down to reduce costs. Kubernetes runs just about anywhere and can use almost any infrastructure. If you're using a public cloud, a hybrid cloud or even a bare metal solution, you can leverage Kubernetes to quickly deploy and scale your applications.

The Google Kubernetes Engine is built into the Google Cloud Platform and helps you quickly deploy your containerized applications.

For the purposes of this tutorial, I will upload our image to GCP and then deploy to a Kubernetes cluster so I can quickly scale up or down our application as needed. When I create new versions of our app or make incremental changes, I can simply create a new image and deploy again with Kubernetes.

Why Atlas with Kubernetes?

By using these tools together for your MongoDB Application, you can quickly produce and deploy applications without worrying much about infrastructure management. Atlas provides you with a persistent data-store for your application data without the need to manage the actual database software, replication, upgrades, or monitoring. All of these features are delivered out of the box, allowing you to build and then deploy quickly.

In this tutorial, I will build a MongoDB Atlas cluster where our data will live for a simple Node.js application. I will then turn the app and configuration data for Atlas into a container-ready image with Docker.

MongoDB Atlas is available across most regions on GCP so no matter where your application lives, you can keep your data close by (or distributed) across the cloud.

Figure 1: MongoDB Atlas runs in most GCP regions

Requirements

To follow along with this tutorial, users will need some of the following requirements to get started:

First, I will download the repository for the code I will use. In this case, it's a basic record keeping app using MongoDB, Express, React, and Node (MERN).

bash-3.2$ git clone git@github.com:cefjoeii/mern-crud.git
Cloning into 'mern-crud'...
remote: Counting objects: 326, done.
remote: Total 326 (delta 0), reused 0 (delta 0), pack-reused 326
Receiving objects: 100% (326/326), 3.26 MiB | 2.40 MiB/s, done.
Resolving deltas: 100% (137/137), done.

cd mern-crud

Next, I will run npm install to get all the required npm packages for working with our app:

> uws@9.14.0 install /Users/jaygordon/work/mern-crud/node_modules/uws
> node-gyp rebuild > build_log.txt 2>&1 || exit 0

Selecting your GCP Region for Atlas

Each GCP region includes a set number of independent zones. Each zone has power, cooling, networking, and control planes that are isolated from other zones. For regions that have at least three zones (3Z), Atlas deploys clusters across three zones. For regions that only have two zones (2Z), Atlas deploys clusters across two zones.

The Atlas Add New Cluster form marks regions that support 3Z clusters as Recommended, as they provide higher availability. If your preferred region only has two zones, consider enabling cross-region replication and placing a replica set member in another region to increase the likelihood that your cluster will be available during partial region outages.

The number of zones in a region has no effect on the number of MongoDB nodes Atlas can deploy. MongoDB Atlas clusters are always made of replica sets with a minimum of three MongoDB nodes.

For general information on GCP regions and zones, see the Google documentation on regions and zones.

Create Cluster and Add a User

In the provided image below you can see I have selected the Cloud Provider "Google Cloud Platform." Next, I selected an instance size, in this case an M10. Deployments using M10 instances are ideal for development. If I were to take this application to production immediately, I may want to consider using an M30 deployment. Since this is a demo, an M10 is sufficient for our application. For a full view of all of the cluster sizes, check out the Atlas pricing page. Once I’ve completed these steps, I can click the "Confirm & Deploy" button. Atlas will spin up my deployment automatically in a few minutes.

Let’s create a username and password for our database that our Kubernetes deployed application will use to access MongoDB.

  • Click "Security" at the top of the page.
  • Click "MongoDB Users"
  • Click "Add New User"
  • Click "Show Advanced Options"
  • We'll then add a user "mernuser" for our mern-crud app that only has access to a database named "mern-crud" and give it a complex password. We'll specify readWrite privileges for this user:

Click "Add User"

Your database is now created and your user is added. You still need your connection string, and you need to whitelist network access to the cluster.

Connection String

Get your connection string by clicking "Clusters" and then clicking "CONNECT" next to your cluster details in your Atlas admin panel. After selecting connect, you are provided several options to use to connect to your cluster. Click "connect your application."

Options for the 3.6 or the 3.4 versions of the MongoDB driver are given. I built mine using the 3.4 driver, so I will just select the connection string for this version.

I will typically paste this into an editor and then modify the info to match my application credentials and my database name:

I will now add this to the app's database configuration file and save it.
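Passwords that contain characters such as @, : or / must be percent-encoded before they go into a connection string. As a minimal sketch of that step (in Python, separate from the Node.js app itself), the helper below assembles a URI from its parts. The function name build_atlas_uri and the example.mongodb.net hostnames are placeholders made up for illustration; use the hosts and replica set name shown in your own Atlas connection dialog.

```python
from urllib.parse import quote_plus


def build_atlas_uri(user, password, hosts, database, replica_set):
    """Assemble a standard (non-SRV) MongoDB connection string.

    The user name and password are percent-encoded so that reserved
    characters in them cannot be mistaken for URI delimiters.
    """
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    host_list = ",".join(hosts)
    options = f"ssl=true&replicaSet={replica_set}&authSource=admin"
    return f"mongodb://{credentials}@{host_list}/{database}?{options}"


if __name__ == "__main__":
    # Placeholder hosts and credentials, not a real cluster.
    print(build_atlas_uri(
        "mernuser",
        "p@ss:w/rd",
        [
            "cluster0-shard-00-00-example.mongodb.net:27017",
            "cluster0-shard-00-01-example.mongodb.net:27017",
            "cluster0-shard-00-02-example.mongodb.net:27017",
        ],
        "mern-crud",
        "Cluster0-shard-0",
    ))
```

Keeping the password out of the repository (for example, by reading it from an environment variable at startup) is a separate decision that this sketch does not cover.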

Next, I will package this up into an image with Docker and ship it to Google Kubernetes Engine!

Docker and Google Kubernetes Engine

Get started by creating an account at Google Cloud, then follow the quickstart to create a Google Kubernetes Project.

Once your project is created, you can find it within the Google Cloud Platform control panel:

It's time to create a container on your local workstation:

Set the PROJECT_ID environment variable in your shell to your project ID, then point gcloud at that project and a default compute zone with the commands below:

export PROJECT_ID="jaygordon-mongodb"
gcloud config set project $PROJECT_ID
gcloud config set compute/zone us-central1-b

Next, place a Dockerfile in the root of your repository with the following:

FROM node:boron

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

COPY . /usr/src/app

EXPOSE 3000

CMD ["npm", "start"]

To build the container image of this application and tag it for uploading, run the following command:

bash-3.2$ docker build -t gcr.io/${PROJECT_ID}/mern-crud:v1 .
Sending build context to Docker daemon  40.66MB
Successfully built b8c5be5def8f
Successfully tagged gcr.io/jaygordon-mongodb/mern-crud:v1

Upload the container image to the Container Registry so we can deploy it:

bash-3.2$ gcloud docker -- push gcr.io/${PROJECT_ID}/mern-crud:v1
The push refers to repository [gcr.io/jaygordon-mongodb/mern-crud]

Next, I will test it locally on my workstation to make sure the app loads:

docker run --rm -p 3000:3000 gcr.io/${PROJECT_ID}/mern-crud:v1
> mern-crud@0.1.0 start /usr/src/app
> node server
Listening on port 3000

Great! Pointing my browser to http://localhost:3000 brings me to the site. Now it's time to create a Kubernetes cluster and deploy our application to it.

Build Your Cluster With Google Kubernetes Engine

I will be using the Google Cloud Shell within the Google Cloud control panel to manage my deployment. The Cloud Shell comes with all the required applications and tools preinstalled, so I can deploy the Docker image I uploaded to the image registry without installing any additional software on my local workstation.

Now I will create the Kubernetes cluster where the image will be deployed. I will include three nodes to help ensure uptime of our app.

Set up our environment first:

export PROJECT_ID="jaygordon-mongodb"
gcloud config set project $PROJECT_ID
gcloud config set compute/zone us-central1-b

Launch the cluster

gcloud container clusters create mern-crud --num-nodes=3

After a few minutes, the console will respond with the following output, and a three-node Kubernetes cluster will be visible in your control panel:

Creating cluster mern-crud...done.
Created [https://container.googleapis.com/v1/projects/jaygordon-mongodb/zones/us-central1-b/clusters/mern-crud].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-b/mern-crud?project=jaygordon-mongodb
kubeconfig entry generated for mern-crud.
NAME       LOCATION       MASTER_VERSION  MASTER_IP       MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
mern-crud  us-central1-b  1.8.7-gke.1     35.225.138.208  n1-standard-1  1.8.7-gke.1   3          RUNNING

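Just a few more steps left. Now we'll deploy our app with kubectl to our cluster from the Google Cloud Shell. (This walkthrough ran against Kubernetes 1.8, where kubectl run creates a deployment. On kubectl 1.18 and later, kubectl run creates only a single pod, so use kubectl create deployment there instead.)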

kubectl run mern-crud --image=gcr.io/${PROJECT_ID}/mern-crud:v1 --port 3000

The output when completed should be:

jay_gordon@jaygordon-mongodb:~$ kubectl run mern-crud --image=gcr.io/${PROJECT_ID}/mern-crud:v1 --port 3000
deployment "mern-crud" created

Now review the application deployment status:

jay_gordon@jaygordon-mongodb:~$ kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
mern-crud-6b96b59dfd-4kqrr   1/1       Running   0          1m
jay_gordon@jaygordon-mongodb:~$

We'll create a load balancer in front of the deployment so the application can be served to the web:

jay_gordon@jaygordon-mongodb:~$ kubectl expose deployment mern-crud --type=LoadBalancer --port 80 --target-port 3000 
service "mern-crud" exposed
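The two imperative commands above (kubectl run and kubectl expose) can also be captured declaratively, which makes the deployment repeatable and easy to review. The sketch below is one way to do that: a small Python script that prints a JSON manifest containing a Deployment and a LoadBalancer Service. Several things here are assumptions rather than part of the original walkthrough: the "app" label key, the choice of three replicas, and the apps/v1 API group, which requires a cluster newer than the 1.8.7 one shown in this article (Kubernetes 1.9 or later).

```python
import json


def build_manifests(name, image, replicas=3, container_port=3000, service_port=80):
    """Return a Kubernetes List holding a Deployment and a LoadBalancer Service."""
    labels = {"app": name}  # assumed label convention, not from the article
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": container_port}],
                    }],
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": dict(labels),
            # Same mapping as: --port 80 --target-port 3000
            "ports": [{"port": service_port, "targetPort": container_port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    manifests = build_manifests("mern-crud", "gcr.io/jaygordon-mongodb/mern-crud:v1")
    print(json.dumps(manifests, indent=2))
```

Saving the output to a file (the name mern-crud.json is only an example) and running kubectl apply -f mern-crud.json should create both objects, since kubectl accepts JSON manifests as well as YAML.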

Now get the IP of the load balancer so that, if needed, it can be bound to a DNS name and you can go live!

jay_gordon@jaygordon-mongodb:~$ kubectl get service
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
kubernetes   ClusterIP      10.27.240.1     <none>         443/TCP        11m
mern-crud    LoadBalancer   10.27.243.208   35.226.15.67   80:30684/TCP   2m

A quick curl test shows me that my app is online!

bash-3.2$ curl -v 35.226.15.67
* Rebuilt URL to: 35.226.15.67/
*   Trying 35.226.15.67...
* TCP_NODELAY set
* Connected to 35.226.15.67 (35.226.15.67) port 80 (#0)
> GET / HTTP/1.1
> Host: 35.226.15.67
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< X-Powered-By: Express

I have added some test data through the app, which is now deployed to GCP via Kubernetes and stores its persistent data in MongoDB Atlas.

When I am done working with the Kubernetes cluster, I can destroy it easily:

gcloud container clusters delete mern-crud

What's Next?

You've now got all the tools in front of you to build something HUGE with MongoDB Atlas and Kubernetes.

Check out the rest of the Google Kubernetes Engine's tutorials for more information on how to build applications with Kubernetes. For more information on MongoDB Atlas, click here.

Have more questions? Join the MongoDB Community Slack!

Continue to learn via high quality, technical talks, workshops, and hands-on tutorials. Join us at MongoDB World.

Optimizing for Fast, Responsive Reads with Cross-Region Replication in MongoDB Atlas

MongoDB Atlas customers can enable cross-region replication for multi-region fault tolerance and fast, responsive reads.

  • Improved availability guarantees can be achieved by distributing replica set members across multiple regions. These secondaries will participate in the automated election and failover process should the primary (or the cloud region containing the primary) go offline.
  • Read-only replica set members allow customers to optimize for local reads (minimize read latency) across different geographic regions using a single MongoDB deployment. These replica set members will not participate in the election and failover process and can never be elected primary.

In this post, we’ll dive a little deeper into optimizing for local reads using cross-region replication and walk you through the necessary configuration steps on an environment running on AWS.

Primer on read preference

Read preference determines how MongoDB clients route read operations to the members of a replica set. By default, an application directs its read operations to the replica set primary. By specifying the read preference, users can:

  • Enable local reads for geographically distributed users. Users from California, for example, can read data from a replica located locally for a more responsive experience
  • Allow read-only access to the database during failover scenarios

A read replica is simply an instance of the database that provides the replicated data from the oplog; clients will not write to a read replica.

With MongoDB Atlas, we can easily distribute read replicas across multiple cloud regions, allowing us to expand our application's data beyond the region containing our replica set primary in just a few clicks.

To enable local reads and increase the read throughput to our application, we simply need to modify the read preference via the MongoDB drivers.

Enabling read replicas in MongoDB Atlas

We can enable read replicas for a new or existing MongoDB paid cluster in the Atlas UI. To begin, we can click on the cluster “configuration” button and then find the link named "Enable cross-region configuration options."

When we click this, we’ll be presented with an option to select the type of cross-region replication we want. We'll choose "deploy read-only replicas":

As you can see above, we have our preferred region (the region containing our replica set primary) set to AWS, us-east-1 (Virginia) with the default three nodes. We can add regions to our cluster configuration based on where we think other users of our application might be concentrated. In this case, we will add additional nodes in us-west-1 (Northern California) and eu-west-1 (Ireland), providing us with read replicas to serve local users.

Note that all writes will still go to the primary in our preferred region, and reads from the secondaries in the regions we’ve added will be eventually consistent.

We’ll click "Confirm and Deploy", which will deploy our multi-region cluster.

Our default connection string will now include these read replicas. We can go to the "Connect" button and find our full connection string to access our cluster:

When the deployment of the cluster completes, we will be ready to distribute our application's data reads across multiple regions using the MongoDB drivers. We can specifically configure readPreference within our connection string to send clients to the "closest replicas". For example, the Node native MongoDB Driver permits us to specify our preference:

readPreference: Specifies the replica set read preference for this connection.

The read preference values are the following:

  • primary
  • primaryPreferred
  • secondary
  • secondaryPreferred
  • nearest

For our app, if we want to ensure the read preference in our connection string is set to the nearest MongoDB replica, we would configure it as follows:

mongodb://admin:<PASSWORD>@cluster0-shard-00-00-bywqq.mongodb.net:27017,cluster0-shard-00-01-bywqq.mongodb.net:27017,cluster0-shard-00-02-bywqq.mongodb.net:27017,cluster0-shard-00-03-bywqq.mongodb.net:27017,cluster0-shard-00-04-bywqq.mongodb.net:27017/test?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin&readPreference=nearest
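Options in a MongoDB connection string follow a single "?" and are separated from each other by "&", so it is easy to break the URI when appending one by hand. As a small sketch, the Python helper below adds or replaces the readPreference option on an existing URI. The helper name with_read_preference is made up for this example, and it assumes the URI already contains the "/database" part and that any credentials in it are percent-encoded.

```python
from urllib.parse import parse_qsl, urlencode

# The five read preference modes defined by MongoDB.
READ_PREFERENCES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


def with_read_preference(uri, mode="nearest"):
    """Return the URI with its readPreference option set to the given mode."""
    if mode not in READ_PREFERENCES:
        raise ValueError(f"unknown read preference: {mode}")
    # Everything after the first "?" is the option list.
    base, _, query = uri.partition("?")
    # Drop any existing readPreference so the option appears only once.
    options = [(key, value) for key, value in parse_qsl(query) if key != "readPreference"]
    options.append(("readPreference", mode))
    return f"{base}?{urlencode(options)}"
```

Calling with_read_preference(uri, "nearest") on a URI that ends in authSource=admin yields the same URI with &readPreference=nearest appended.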

Security and Connectivity (on AWS)

MongoDB Atlas allows us to peer our application server's VPC directly to our MongoDB Atlas VPC within the same region. This permits us to reduce the network exposure to the internet and allows us to use native AWS Security Groups or CIDR blocks. You can review how to configure VPC Peering here.

A note on VPCs for cross-region nodes:

At this time, MongoDB Atlas does not support VPC peering across regions. If you want to grant clients in one cloud region read or write access to database instances in another cloud region, you would need to permit the clients’ public IP addresses to access your database deployment via IP whitelisting.

With cross-region replication and read-only replicas enabled, your application will now be capable of providing fast, responsive access to data from any number of regions.


Get started today with a free 512 MB database managed by MongoDB Atlas here.

Improving MongoDB Performance with Automatically Generated Index Suggestions

Jay Gordon

Technical, Cloud

Beyond good data modeling, there are a few processes that teams responsible for optimizing query performance can leverage: looking for COLLSCANS in logs, analyzing explain results, or relying on third-party tools. While these efforts may help you resolve some of the problems you’re noticing, they often require digging into documentation, time, and money, all the while your application remains bogged down with issues.

MongoDB Atlas, the fully managed database service, helps you resolve performance issues with a greater level of ease by providing you with tools to ensure that your data is accessed as efficiently as possible. This post will provide a basic overview of how to access the MongoDB Atlas Performance Advisor, a tool that reviews your queries for up to two weeks and provides recommended indexes where appropriate.

Getting Started

This short tutorial makes use of the following:

  • A demo data set generated with mgodatagen
  • A dedicated MongoDB Atlas cluster (the Performance Advisor is available for M10s or larger)
  • MongoDB shell install (to create indexes)

My database has two million documents in two separate collections:

If an application tries to access these documents without the right indexes in place, a collection scan will take place. The database will scan the full collection to find the required documents, and any documents that are not in memory are read from disk. This can dramatically reduce performance and cause your application to respond slower than expected.
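To make the cost difference concrete, here is a toy model in Python. It is not how MongoDB's storage engine is implemented; it only contrasts examining every document with doing a binary search over a sorted list of (value, position) pairs, which stands in for an index. All function names are invented for the illustration.

```python
import bisect


def collection_scan(docs, field, value):
    """Examine every document, as a COLLSCAN does."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc.get(field) == value:
            matches.append(doc)
    return matches, examined


def build_index(docs, field):
    """Sorted (value, position) pairs: a toy stand-in for an index."""
    return sorted((doc[field], position) for position, doc in enumerate(docs))


def index_lookup(docs, index, value):
    """Binary-search the index, then fetch only the matching documents."""
    position = bisect.bisect_left(index, (value,))
    matches = []
    while position < len(index) and index[position][0] == value:
        matches.append(docs[index[position][1]])
        position += 1
    # Only the matching documents had to be fetched and examined.
    return matches, len(matches)
```

In this model, looking up one name among N documents examines all N documents with the scan, but only the matching documents with the index, after a search that takes on the order of log N comparisons.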

Case in point, when I try to run an unindexed query against my collections, MongoDB Atlas will automatically create an alert indicating that the query is not well targeted.

Reviewing Performance Advisor

The Performance Advisor monitors slow-running queries (anything that takes longer than 100 milliseconds to execute) and suggests new indexes to improve query performance.

To access this tool, go to your Atlas control panel and click your cluster's name. You’ll then find "Performance Advisor" at the top.

Click the link and you'll be taken to the page where you'll see any relevant index recommendations, based on the fixed time period at the top of the page.

In this example, I will review the performance of my queries from the last 24 hours. The Performance Advisor provides me with some recommendations on how to improve the speed of my slow queries:

It looks like the test collection with the field "name" could use an index. We can review the specific changes to be made by clicking the "More Info" button.

I can copy the contents of this recommendation and paste it into my MongoDB Shell to create the recommended index. You’ll notice a special option, { background: true }, is passed with the createIndex command. Using this option ensures that index creation does not block any operations. If you’re building new indexes on production systems, I highly recommend you read more about index build operations.

Now that the recommended index is created, I can review my application's performance and see if it meets the requirements of my users. The Atlas alerts I received earlier have been resolved, which is a good sign:

Noticeable slowdowns in performance from unindexed queries damage the user experience of your application, which may result in reduced engagement or customer attrition. The Performance Advisor in MongoDB Atlas gives you a simple and cost-efficient way to ensure that you’re getting the most out of the resources you’ve provisioned.

To get started, sign up for MongoDB Atlas and deploy a cluster in minutes.

Migrating your data from DynamoDB to MongoDB Atlas

Jay Gordon

Cloud

There may be a number of reasons you are looking to migrate from DynamoDB to MongoDB Atlas. While DynamoDB may be a good choice for a set of specific use cases, many developers prefer solutions that reduce the need for client-side code or additional technologies as requirements become more sophisticated. They may also want to work with open source technologies or require some degree of deployment flexibility. In this post, we are going to explore a few reasons why you might consider using MongoDB Atlas over DynamoDB, and then look at how you would go about migrating a pre-existing workload.

Get building, faster...

MongoDB Atlas is the best way to consume MongoDB and get access to a developer-friendly experience, delivered as a service on AWS. MongoDB’s query language is incredibly rich and allows you to execute complex queries natively while avoiding the overhead of moving data between operational and analytical engines. With Atlas, you can use MongoDB’s native query language to perform anything from searches on single keys or ranges, faceted searches, graph traversals, and geospatial queries through to complex aggregations, JOINs, and subqueries - without the need to use additional add-on services or integrations.

For example, MongoDB’s aggregation pipeline is a powerful tool for performing analytics and statistical analysis in real-time and generating pre-aggregated reports for dashboarding.

Additionally, MongoDB will give you a few extra things which I think are pretty cool:

  • Document Size - MongoDB handles documents up to 16MB in size, natively. In DynamoDB, you’re limited to 400KB per item, including the name and any local secondary indexes. For anything bigger, AWS suggests that you split storage between DynamoDB and S3.
  • Deployment flexibility - Using MongoDB Atlas, you are able to deploy fully managed MongoDB to AWS, Google Cloud Platform, or Microsoft Azure. If you decide that you no longer want to run your database in the cloud, MongoDB can be run in nearly any environment on any hardware so self-managing is also an option.
  • MongoDB has an idiomatic driver set, providing native language access to the database in dozens of programming languages.
  • MongoDB Atlas provides a queryable backup method for restoring your data at the document level, without requiring a full restoration of your database.
  • MongoDB Atlas provides you with over 100 different instance metrics, for rich native alerting and monitoring.
  • Atlas will assist you in finding the right indexes thanks to Performance Advisor. The Performance Advisor utility is on all the time, helping you make certain that your queries are efficient and fast.

Getting Started

In this tutorial, we'll take a basic data set from an existing DynamoDB table and migrate it to MongoDB Atlas. We'll use a free, M0 cluster so you can do this as well at no cost while you evaluate the benefits of MongoDB Atlas.

This blog post makes a couple of assumptions:

  • You've installed MongoDB on the computer you'll be importing the data from (we need the mongoimport tool which is included with MongoDB)
  • You've signed up for a MongoDB Atlas Account (the M0 instance is free and fine for this demonstration)

To begin, we'll review our table in AWS:

This table contains data on movies including the year they were released, the title of the film, with other information about the film contained in a subdocument.

We want to take this basic set of data and bring it into MongoDB Atlas for a better method of querying, indexing, and managing our data long term.

First, ensure your application has stopped your writes if you are in production to prevent new entries into your database. You'll likely want to create a temporary landing page and disable new connections to your DynamoDB. Once you've completed this, navigate to your table in your AWS panel.

Click "Actions" at the top after you've selected your table. Find the "Export to .csv" option and click it.

Now you'll have a CSV export of your data from DynamoDB. Let's take a quick look:

  
$ more ~/Downloads/Movies.csv
"year (N)","title (S)","info (M)"
"1944","Arsenic and Old Lace","{    ""actors"" : { ""L"" : [        { ""S"" : ""Cary Grant"" },        { ""S"" : ""Priscilla Lane"" },        { ""S"" : ""Raymond Massey"" }      ]    },    ""directors"" : { ""L"" : [        { ""S"" : ""Frank Capra"" }      ]    },    ""genres"" : { ""L"" : [        { ""S"" : ""Comedy"" },        { ""S"" : ""Crime"" },        { ""S"" : ""Romance"" },        { ""S"" : ""Thriller"" }      ]    },    ""image_url"" : { ""S"" : ""http://ia.media-imdb.com/images/M/MV5BMTI3NTYyMDA0NV5BMl5BanBnXkFtZTcwMjEwMTMzMQ@@._V1_SX400_.jpg"" },    ""plot"" : { ""S"" : ""A drama critic learns on his wedding day that his beloved maiden aunts are homicidal maniacs, and that insanity runs in his family."" },    ""rank"" : { ""N"" : ""4025"" },    ""rating"" : { ""N"" : ""8"" },    ""release_date"" : { ""S"" : ""1944-09-01T00:00:00Z"" },    ""running_time_secs"" : { ""N"" : ""7080"" }  }"
  

Looks good, let's go ahead and start using MongoDB's open source tools to import this to our Atlas cluster.

Import your data

Let's start moving our data into MongoDB Atlas. First, launch a new free M0 cluster (M0s are great for demos but you’ll want to pick a different tier if you are going into production). Once you have a new M0 cluster, you can then whitelist your local IP address so that you may access your Atlas cluster.

Next, you'll want to use the mongoimport utility to upload the contents of Movies.csv to Atlas. I'll provide my connection string, which I can get right from my Atlas control panel so that mongoimport can begin importing our data:

Now I can enter this into my mongoimport command along with some other important options:

  
mongoimport --uri "mongodb://admin:PASSWORD@demoabc-shard-00-00-a7nzr.mongodb.net:27017,demoabc-shard-00-01-a7nzr.mongodb.net:27017,demoabc-shard-00-02-a7nzr.mongodb.net:27017/test?ssl=true&replicaSet=demoabc-shard-0&authSource=admin"  --collection movies --file ~/Downloads/Movies.csv  --type csv --headerline
2017-11-20T14:02:45.612-0500  imported 100 documents
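One thing to be aware of: with --headerline, mongoimport takes field names straight from the CSV header, so the imported fields keep DynamoDB's type suffix (for example "year (N)"), and the "info (M)" column arrives as a string of DynamoDB-typed JSON rather than a subdocument. If you would rather import clean documents, one option is to convert the export first. The Python sketch below is a rough take on that, assuming the export format shown above; the function names are invented for this example, and only the S, N, BOOL, NULL, L and M type tags are handled.

```python
import csv
import io
import json


def to_number(text):
    """DynamoDB exports numbers as strings; return an int if possible, else a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def unwrap(attr):
    """Convert one typed value such as {"S": "x"} or {"L": [...]} to plain Python."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        return to_number(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unwrap(item) for item in value]
    if type_tag == "M":
        return {key: unwrap(item) for key, item in value.items()}
    raise ValueError(f"unhandled DynamoDB type tag: {type_tag}")


def convert_row(row):
    """Turn one CSV row (header -> cell) into a plain document."""
    doc = {}
    for header, cell in row.items():
        # Headers look like "year (N)": a field name plus a type tag.
        name, separator, type_tag = header.rpartition(" (")
        if not separator:
            name, type_tag = header, "S"
        type_tag = type_tag.rstrip(")")
        if not cell:
            continue  # skip empty or missing cells
        if type_tag == "M":
            doc[name] = {key: unwrap(item) for key, item in json.loads(cell).items()}
        elif type_tag == "N":
            doc[name] = to_number(cell)
        else:
            doc[name] = cell
    return doc


def convert_csv(text):
    """Yield one plain document per CSV data row."""
    for row in csv.DictReader(io.StringIO(text)):
        yield convert_row(row)


def write_json_lines(csv_path, out_path):
    """Write one JSON document per line, a format mongoimport reads by default."""
    with open(csv_path, newline="", encoding="utf-8") as source:
        docs = list(convert_csv(source.read()))
    with open(out_path, "w", encoding="utf-8") as target:
        for doc in docs:
            target.write(json.dumps(doc) + "\n")
```

The resulting file can then be imported with mongoimport using --type json in place of --type csv --headerline. Dates such as release_date stay as strings in this sketch.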
   

Now that our documents are uploaded, we can log into Atlas with MongoDB Compass and review our data:

This is cool, but I want to do something a little bit more advanced. Luckily, MongoDB’s aggregation pipeline gives us the power to do so.

We'll need to connect to our shell here; I can press the "CONNECT" button within my Atlas cluster’s overview panel and find the connection instructions for the shell.

Once I am logged in, I can start playing with some different aggregations; here is a basic one that tells us the total number of movies released in 1944 in our collection:

  
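db.movies.aggregate([
  // Stage 1: keep only movies released in 1944.
  // Note: a CSV import with --headerline names fields after the CSV header,
  // so the field may be "year (N)" rather than "year" in your collection.
  { $match: { year: 1944 } },

  // Stage 2: count the matching documents.
  { $group: { _id: null, count: { $sum: 1 } } }
]);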
  

With DynamoDB, we would have had to connect our database cluster to Amazon EMR, adding cost, complexity, and latency.

You can configure backups, ensure network security and configure additional user roles for your data all from the MongoDB Atlas UI.

Sign up for MongoDB Atlas today and start building better apps faster.

Enabling IP Security for MongoDB 3.6 on Ubuntu

Jay Gordon

MongoDB 3.6

MongoDB 3.6 provides developers and DevOps professionals with a secure by default configuration that protects data from external threats by denying unauthorized access on a public network. MongoDB servers will now only listen for connections on the local host unless explicitly configured to listen on another address.

This tutorial will briefly show you how to make your MongoDB node listen on IP addresses beyond localhost so that your networked servers are able to connect to your database. You will see how easily MongoDB is configured to start up and listen on specific network interfaces.

This tutorial assumes you have:

  • Installed MongoDB 3.6 (this does not handle upgrading from previous versions)
  • Multiple network interfaces on your server (we'll use an AWS EC2 instance)
  • Basic understanding of IP Networks and how to configure a private network for your data (we’ll use an AWS VPC)
  • Understanding that "localhost" refers to IP 127.0.0.1

Getting Started

I have launched an AWS EC2 instance with Ubuntu 16.04 LTS and installed MongoDB as described on the MongoDB downloads page.

I want to enable the private IP range that is part of my VPC to allow us to access our MongoDB database. By doing this, we'll ensure that only our private network and "localhost" are valid network paths to connect to the database. This will help ensure we never have outsiders poking into our database!

I first launch an Ubuntu 16.04 EC2 instance in my public subnet within my VPC. By doing this, I will allow my network interface to allow network connections to the outside world without requiring a NAT Gateway.

Next, I follow the instructions in the MongoDB documentation on how to install MongoDB on Ubuntu. I can verify which network interfaces the process is listening on by running the following command:

ubuntu@ip-172-16-0-211:~$ sudo netstat -plant | egrep mongod
tcp        0      0 127.0.0.1:27017         0.0.0.0:*               LISTEN      2549/mongod

This output means that users are only permitted to access our MongoDB instance on port 27017 via IP 127.0.0.1. If you would like to make this available to other systems on your network, you'll want to bind the local IP associated with the private network. To determine network interface configuration easily, we can just run an ifconfig from the command line:

ubuntu@ip-172-16-0-211:~$ ifconfig
eth0      Link encap:Ethernet  HWaddr 0e:5e:76:83:49:3e
          inet addr:172.16.0.211  Bcast:172.16.0.255  Mask:255.255.255.0
          inet6 addr: fe80::c5e:76ff:fe83:493e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:65521 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7358 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:94354063 (94.3 MB)  TX bytes:611646 (611.6 KB)
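
If you would rather collect the same information from a script, here is a minimal sketch using Node.js and its built-in os module. The function name nonLoopbackIPv4 is my own invention for illustration. It simply filters what os.networkInterfaces() reports down to IPv4 addresses that are not loopback.

```javascript
const os = require('os');

// Return the non-loopback IPv4 addresses found on this machine.
// `interfaces` defaults to the live os.networkInterfaces() result, but a
// plain object of the same shape can be passed in to try the function out.
function nonLoopbackIPv4(interfaces = os.networkInterfaces()) {
  const found = [];
  for (const [name, entries] of Object.entries(interfaces)) {
    for (const entry of entries || []) {
      // Node has reported `family` as the string 'IPv4' and, in some
      // releases, as the number 4; accept both.
      const isIPv4 = entry.family === 'IPv4' || entry.family === 4;
      if (isIPv4 && !entry.internal) {
        found.push({ name, address: entry.address, netmask: entry.netmask });
      }
    }
  }
  return found;
}
```

On the instance above, I would expect calling nonLoopbackIPv4() to report eth0 with address 172.16.0.211, matching the ifconfig output.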

We now have the IP address we want to add to the list of addresses MongoDB listens on. I will open the /etc/mongod.conf file and edit the net section to include the additional IP:

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,172.16.0.211
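
Before restarting, it can be worth sanity-checking the bindIp value so that a public address or 0.0.0.0 does not slip in by accident. The following is a rough sketch in Node.js, not anything MongoDB ships. The helper names are hypothetical, and it only recognizes loopback and RFC 1918 private IPv4 ranges, so hostnames and IPv6 entries are flagged for manual review rather than judged.

```javascript
const net = require('net');

// Split a bindIp value such as "127.0.0.1,172.16.0.211" into its entries.
function parseBindIp(value) {
  return value.split(',').map((s) => s.trim()).filter((s) => s.length > 0);
}

// True for loopback (127.0.0.0/8) and RFC 1918 private IPv4 ranges.
function isLoopbackOrPrivateIPv4(address) {
  if (net.isIP(address) !== 4) return false;
  const [a, b] = address.split('.').map(Number);
  return (
    a === 127 ||
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Entries that deserve a second look before restarting mongod: anything that
// is not a loopback or private IPv4 address (including 0.0.0.0, hostnames,
// and IPv6 addresses, which this sketch does not attempt to classify).
function entriesToReview(value) {
  return parseBindIp(value).filter((entry) => !isLoopbackOrPrivateIPv4(entry));
}
```

For the value used in this tutorial, entriesToReview('127.0.0.1,172.16.0.211') returns an empty list, while a value containing 0.0.0.0 would be reported.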

After changing bindIp under "net" from just 127.0.0.1 to also include the private IP address 172.16.0.211, we can restart the service and confirm with netstat that it is now listening on both addresses:

ubuntu@ip-172-16-0-211:~$ sudo service mongod stop
ubuntu@ip-172-16-0-211:~$ sudo service mongod start
ubuntu@ip-172-16-0-211:~$ sudo netstat -plnt | egrep mongod
tcp        0      0 172.16.0.211:27017      0.0.0.0:*               LISTEN      2892/mongod
tcp        0      0 127.0.0.1:27017         0.0.0.0:*               LISTEN      2892/mongod
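
If you want to automate this check, the netstat output is easy to parse. Here is a small sketch in Node.js, assuming the column layout shown above (protocol, two queue counters, local address, foreign address, state, PID/program). The function name listeningAddresses is hypothetical.

```javascript
// Extract the local addresses a program is listening on from the text
// produced by `netstat -plnt`. Lines that are not in LISTEN state, or that
// belong to another program, are ignored.
function listeningAddresses(netstatOutput, programName) {
  return netstatOutput
    .split('\n')
    .map((line) => line.trim().split(/\s+/))
    .filter((cols) => cols.length >= 7 && cols[5] === 'LISTEN')
    .filter((cols) => cols[6].split('/')[1] === programName)
    .map((cols) => cols[3]);
}
```

Fed the two lines above with the program name "mongod", this returns 172.16.0.211:27017 and 127.0.0.1:27017, which a deployment script could compare against the addresses it expects.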

Now our database is able to accept requests on both the private IP address and localhost:

Shell access via localhost

ubuntu@ip-172-16-0-211:~$ mongo localhost
MongoDB shell version v3.6.0-rc2
connecting to: mongodb://127.0.0.1:27017/localhost

Shell access via private IP

ubuntu@ip-172-16-0-211:~$ mongo 172.16.0.211
MongoDB shell version v3.6.0-rc2
connecting to: mongodb://172.16.0.211:27017/test

Next Steps

The default localhost configuration has tremendous benefits to security as you now must explicitly allow network connections, blocking attackers from untrusted networks. Keeping your MongoDB database safe from remote intrusion is extremely important. Make sure you follow our Security Checklist to configure your MongoDB database cluster with the appropriate security best practices.

Now that you understand how to configure additional IP addresses on your MongoDB 3.6 server, you're able to begin configuring replication. Don't forget backups, monitoring and all the other important parts of your MongoDB clusters' health. If you'd rather spend less time on these tasks and deploy MongoDB clusters with a click or an API call, check out MongoDB Atlas, our fully managed database as a service.

The User Guide to AWS re:Invent

This post is a mini-guide that walks through some of the things to do while you are at AWS re:Invent this year.

Building a NodeJS App with MongoDB Atlas and AWS Elastic Container Service - Part 2

In my last post, we started preparing an application built on Node.js and MongoDB Atlas for simple CRUD operations. We've completed the initial configuration of the code and are now ready to launch this into production.

Building a NodeJS App with MongoDB Atlas and AWS Elastic Container Service - Part 1

It's that time of year again! This post is part of our Road to AWS re:Invent 2017 blog series. In the weeks leading up to AWS re:Invent in Las Vegas this November, we'll be posting about a number of topics related to running MongoDB in the public cloud.

Planning for Chaos with MongoDB Atlas: Using the "Test Failover" Button

Jay Gordon

Cloud

When building an application, it's smart to consider chaos. Chaos can be introduced into an application in many different ways; some examples are:

  • Running out of disk space
  • Utilizing all connections to the cluster
  • Oversaturating the available IOPS
  • Network connectivity failure

To help you prepare for such an event, MongoDB Atlas has introduced a new feature called "Test Failover" that you can use to introduce some chaos for testing purposes.

Welcome to Chaos Engineering

One of the more popular terms to come out of the open source community has been "Chaos Engineering." On the "Principles of Chaos Engineering" site you'll find the following definition that really encapsulates why the "Test Failover" feature in MongoDB Atlas exists:

Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production...

Chaos Engineering strives to eliminate the pain points in a distributed system by introducing a failure of one of the components in a test environment and reviewing the output. The harder it is to introduce chaos that will cause an application to no longer operate, the more confidence we can place in the infrastructure our app lives on.

The team at Netflix had to ensure that their massive distributed application would still survive if chaos was introduced. Based on the demand of their customer base and the distributed nature of their systems, the engineers at Netflix needed to ensure they could handle a failure of a production system. They created an open source tool called "Chaos Monkey" which you can read more about in this blog post.

The main intent of this chaos is to ensure that if part of production fails, you don't end up with a completely out of service application. For this reason, we've nicknamed the "Test Failover" feature the Chaos Button.

Chaos Checklist

One of the more important concepts of pre-production application architecture testing is ensuring that your application will continue to work during cases of unplanned outage. I like to create pre-deployment checklists to make sure I have considered all the potential ways my app could fail. These checklists typically consist of things like backups, restore testing and disaster recovery.

Some questions I like to have answered prior to going to production are:

  • Do I know how my app will respond when access to my data is temporarily interrupted?
  • When my database recovers, will the application work as expected?
  • Did I configure my application to utilize the full connection string to ensure failover?
  • If an issue occurs with my data, will I need to do any form of intervention?
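
The connection string question in the list above is one you can check mechanically. Below is a minimal sketch in Node.js that inspects a standard mongodb:// connection string and reports whether it lists more than one host and names a replica set. The function name and the returned field names are my own, it does not handle the mongodb+srv:// form, and it assumes any special characters in the credentials are percent-encoded.

```javascript
// Pull the host list and options out of a standard mongodb:// connection
// string and report whether it looks ready for failover.
function inspectConnectionString(uri) {
  const match = /^mongodb:\/\/(?:[^@/]*@)?([^/?]+)(?:\/([^?]*))?(?:\?(.*))?$/.exec(uri);
  if (!match) {
    throw new Error('not a standard mongodb:// connection string');
  }
  const hosts = match[1].split(',');
  const options = new URLSearchParams(match[3] || '');
  return {
    hosts,
    replicaSet: options.get('replicaSet'),
    // More than one seed host plus a replicaSet option is what I mean by
    // "the full connection string" in the checklist.
    listsFullReplicaSet: hosts.length > 1 && options.has('replicaSet'),
  };
}
```

A string that names only a single host, or omits the replicaSet option, comes back with listsFullReplicaSet set to false, which is a hint to revisit the application's configuration before testing failover.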

By testing your application before going to production, you're able to review how your app will survive an incident or planned maintenance where a failover may occur. You'll be following the same practice of planning to survive chaos that the team at Netflix adopted.

How "Test Failover" works

The "Test Failover" button will reboot the instance your primary lives on. Your cluster will perform an election and select one of your secondaries that has the most complete oplog to become your new primary.

Once failover is completed, the former primary instance is placed back into your cluster with the same hostname. Your connection string will not require modification, as MongoDB drivers automatically discover which member of your Atlas cluster is now the primary.
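
To make the "most complete oplog" idea concrete, here is a deliberately simplified toy model in Node.js. It is not MongoDB's actual election protocol, which also involves voting, priorities, and terms. The member objects, their field names, and the hostnames in the usage note are all hypothetical.

```javascript
// Toy model only: among healthy secondaries, choose the one whose last
// applied oplog timestamp is the most recent. On a tie, the member listed
// first wins. Returns null when no secondary is eligible.
function pickNewPrimary(members) {
  const candidates = members.filter((m) => m.state === 'SECONDARY' && m.healthy);
  if (candidates.length === 0) return null;
  return candidates.reduce((best, m) => (m.lastOplogTime > best.lastOplogTime ? m : best));
}
```

For example, given a rebooting primary and two secondaries named node-b and node-c, where node-c has applied a later oplog entry than node-b, this model picks node-c.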

Begin your test

Note: In order to test failover, you need to be using a dedicated MongoDB Atlas cluster. This means that clusters on multi-tenant architecture will not have this feature.

To begin adding some "chaos", go to the "Clusters" menu for your organization, then find the project you'd like to work with. In the example shown below, I will use project "jg-MongoDB-Atlas-2017" to perform the chaos test.

Pick your project to perform chaos test

Once you get to the main window your cluster is listed in, you can then find the ellipsis menu, select it, and find "Test Failover."

Once you select “Test Failover”, you'll be brought to an information box that will inform you of what actions are about to happen:

Information box

Now click "RESTART PRIMARY", which will initiate the failover test as described above. You'll be shown a new window which informs you the test is underway.

You'll be able to tell what is going on by clicking on the cluster's name and reviewing the process as it occurs:

Review process as it occurs

You can see that the primary has moved to a new node and that the failed-over instance is having its data resynced from the new primary. At this point, if you are reviewing an application's stability, you may run some form of Selenium test or a curl script that hits an endpoint to confirm that connections to your database are working as expected.
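
As an alternative to a curl loop, here is a rough sketch of such a check in Node.js using only the built-in http module. It polls an endpoint once per second and reports the longest stretch of failed requests. The function names are my own, and the URL you pass in is an assumption: use whichever endpoint in your application actually reads from the database.

```javascript
const http = require('http');

// Request `url` once and resolve with a timestamped result. Any HTTP status
// below 500 counts as "up"; errors and timeouts count as "down".
function probeOnce(url, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const startedAt = Date.now();
    const req = http.get(url, { timeout: timeoutMs }, (res) => {
      res.resume(); // discard the body so the socket is released
      resolve({ at: startedAt, ok: res.statusCode < 500 });
    });
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve({ at: startedAt, ok: false }));
  });
}

// Given probe results in time order, return the longest outage in
// milliseconds, measured from the first failed probe of a run to the next
// successful probe (or to the last probe if the run never recovered).
function longestOutageMs(results) {
  let longest = 0;
  let outageStart = null;
  for (const r of results) {
    if (!r.ok && outageStart === null) outageStart = r.at;
    if (r.ok && outageStart !== null) {
      longest = Math.max(longest, r.at - outageStart);
      outageStart = null;
    }
  }
  if (outageStart !== null) {
    longest = Math.max(longest, results[results.length - 1].at - outageStart);
  }
  return longest;
}

// Probe roughly once per second, `count` times, then summarize.
async function watchFailover(url, count = 60) {
  const results = [];
  for (let i = 0; i < count; i++) {
    results.push(await probeOnce(url));
    await new Promise((done) => setTimeout(done, 1000));
  }
  return longestOutageMs(results);
}
```

Start something like watchFailover('http://localhost:8080/').then(console.log) just before clicking "RESTART PRIMARY". A result of 0 means no probe failed during the test window.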

When completed, you'll see a new primary selected and the continuation of normal service:

New primary selected

That's it — there's no need to modify connection strings or edit your app. Your cluster's backup, replication, and other services will continue with no required intervention from you.

If you’re new to managed MongoDB services, we encourage you to start with our free tier. For existing customers of third party service providers, be sure to check out our migration offerings and learn about how you can get 3 months of free service.

Upgrading Your Free MongoDB Atlas Database to Production-Ready Instances

Jay Gordon

Cloud

The MongoDB Atlas free tier provides a simple interface to create and work with MongoDB databases and is great for prototyping, early development, or learning the database. But at some point, you may need or want to make use of the features not included with the M0 instance size, including:

  • Support for datasets larger than 512 MB
  • Elastic scalability
  • Region selection
  • Managed backup and restores
  • Network isolation & VPC Peering on AWS
  • The ability to review documents and metadata via the Data Explorer
  • The ability to track performance in real-time and view the hottest collections via the Real-Time Performance Panel
  • Richer, more granular monitoring metrics (with API access)
  • And more

When you’re ready to upgrade to a customized cluster for your use case, you'll find that MongoDB Atlas provides you with a frictionless process to do so.

Tutorial / Demo: Upgrading to a customized MongoDB Atlas cluster

In this blog tutorial, we'll upgrade from a free MongoDB Atlas cluster to one that we customize for our live application.

One of my favorite apps that helps users understand the simplicity of MongoDB Atlas and NodeJS is the Scotch.io node-todo app, which provides you with a full CRUD experience for the database. You can get the full details on using this Express framework single-page app by going to the Scotch.io tutorial or reviewing the GitHub repository.

For the purposes of this tutorial, let’s say we’re running this app backed by a free M0 replica set in MongoDB Atlas. We’ve added a few documents with some tasks we need to accomplish today, but we would like to start backing up our cluster and using VPC peering with our app. Because we’re currently using a free MongoDB Atlas cluster, we won't be able to do any of these things without upgrading first.

Note that we’ll be going through this upgrade process on a MacBook, but this work can easily be done on a VM or bare metal server running Linux.

Installing this application is simple for those who have experience with NodeJS and MongoDB. For those who may be new to these technologies, check out the Getting Started with MongoDB, Node.js and Restify post for any requirements (NodeJS, npm).

So let's download our repo and install the app:

bash-3.2$ git clone https://github.com/scotch-io/node-todo
Cloning into 'node-todo'...
remote: Counting objects: 452, done.
remote: Total 452 (delta 0), reused 0 (delta 0), pack-reused 452
Receiving objects: 100% (452/452), 58.45 KiB | 0 bytes/s, done.
Resolving deltas: 100% (158/158), done.

bash-3.2$ cd node-todo/
bash-3.2$ npm install

(when completed, the following output should be displayed)

npm WARN node-todo@0.0.1 No repository field.
npm WARN node-todo@0.0.1 No license field.

Our app ships with some default MongoDB settings in the config/database.js file. We will need to modify these defaults to point at our own MongoDB servers:

Default:

module.exports = {
    remoteUrl : 'mongodb://node:nodeuser@mongo.onmodulus.net:27017/uwO3mypu',
    localUrl: 'mongodb://localhost/meanstacktutorials'
};

Modify this to contain our MongoDB Atlas cluster details, for example:

module.exports = {
    localUrl : 'mongodb://username:password@clustername0-shard-00-00-bywqq.mongodb.net:27017,clustername0-shard-00-01-bywqq.mongodb.net:27017,clustername0-shard-00-02-bywqq.mongodb.net:27017/node-todo?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin',
};
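
If you would rather not keep a username and password in the file, one option is to read the connection string from the environment. This is only a sketch of a variant of config/database.js, not how the sample app is written. The environment variable name MONGODB_URI and the helper name resolveDatabaseConfig are assumptions of mine.

```javascript
// Hypothetical variant of config/database.js: take the connection string
// from an environment variable (MONGODB_URI is an assumed name) and fall
// back to the local default that ships with the sample app.
function resolveDatabaseConfig(env = process.env) {
  return {
    localUrl: env.MONGODB_URI || 'mongodb://localhost/meanstacktutorials',
  };
}

// Export the same shape the app already expects: an object with localUrl.
module.exports = resolveDatabaseConfig();
```

With this in place, the Atlas connection string would be set in the shell before running npm start, and the repository would no longer contain credentials.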

And then we start the app:

bash-3.2$ npm start

> node-todo@0.0.1 start /Users/jaygordon/tmp/node-todo
> node server.js

Now we can go to our browser and review the app by accessing http://localhost:8080, or whichever IP we may be using.

Using Atlas is an easy experience, and of course that includes moving from our free to our paid tiers. Let's see how we can get the cluster from a free, multi-tenant deployment to a paid deployment with all the features that come along with it.

When we log into our Atlas account and find our free cluster, we’ll now see an “Upgrade” button in the lower right hand side of the cluster information window:

When we click on this button, we will be taken to a cluster overview window with upgrade options. Because we're sticking with AWS in this example, we won't have to worry about our connection string changing. If we had opted for another cloud provider such as Google or Azure, we would have had to modify our connection string in our application.

In our case, we’ll upgrade our M0 to an M30 cluster in the AWS us-east-1 region. We can click "Continue to Payment" to proceed.

We’ll enter our credit card information and then click "Confirm and Deploy."

The upgrade process will take a few minutes, during which our solution will experience downtime. In the background, our cluster is being moved from a multi-tenant deployment to a single-tenant architecture in its own virtual private cloud (VPC), new nodes are being spun up across multiple availability zones for improved resiliency, data is being synced, EBS volumes are being provisioned, and so on. We can keep track of the process by reviewing the blue bar at the top of our Atlas window:

Since the process is fully automated by MongoDB Atlas, all we have to do is wait for it to complete and verify that our application is online afterwards.

When the process completes, we will see that the information about our new cluster has been refreshed in the Atlas UI.

The app still seems to be working in our browser; let's review one of the documents from the app directly via the Data Explorer to ensure our data exists as expected. We can click on the cluster name, then select "Data Explorer." Here we'll see the admin database as well as the "node-todo" database that we used for our app.

When we click the node-todo database, we are then brought to the collection level where we can begin reviewing documents directly from the Atlas window:

Get Started

To start using the M0 free tier of MongoDB Atlas for development or learning, you can sign up here. There’s no credit card required to start and as we’ve demonstrated in this blog, you can easily upgrade to a customized cluster at any time.