GIANT Stories at MongoDB

STREAM: How MongoDB Atlas and AWS help make it easier to build, scale, and personalize feeds that reach millions of users

This is a guest post by Ken Hoff of Stream.

Stream is a platform designed for building, personalizing, and scaling activity feeds that reach over 200 million users. We offer an alternative to building app feed functionality from scratch by simplifying implementation and maintenance so companies can stay focused on what makes their products unique.

Today our feed-as-a-service platform helps personalize user experiences for some of the most engaging applications and websites. For example, Product Hunt, which surfaces new products daily and allows enthusiasts to share and geek out about the latest mobile apps, websites, and tech creations, uses our API to do so.

We’ve recently been working on an application called Winds, an open source RSS and podcast application powered by Stream, that provides a new and personalized way to listen, read, and share content.

We chose MongoDB to support the first iteration of Winds as our developers found the database very easy to work with. I personally feel that the mix of data model flexibility, scalability, and rich functionality that you get with MongoDB makes it superior to what you would get out of the box with other NoSQL databases or relational databases such as MySQL and PostgreSQL.

Our initial MongoDB deployment was managed by a vendor called Compose but that ultimately didn’t work out due to issues with availability and cost. We migrated off Compose and built our own self-managed deployment on AWS. When MongoDB’s own database as a service, MongoDB Atlas, was introduced to us, we were very interested. We wanted to reduce the operational work that our team was doing and found Atlas’s pricing much more predictable than what we had experienced with our previous MongoDB service provider. We also needed a database service that would be highly available out of the box. The fact that MongoDB Atlas sets a minimum replica set member count and automatically distributes each cluster across AWS availability zones had us sold.

The great thing about managing or scaling MongoDB with MongoDB Atlas is that almost all of the time, we don’t have to worry about it. We run our application on a deployment using M30 instances with the auto-expanding storage option enabled. When our disk utilization approaches 90%, Atlas automatically provisions more storage for us with no impact on availability. And when we experience spikes in traffic, as we have in the past, we can easily scale up or out with MongoDB Atlas, either by clicking a few buttons in the UI or by triggering a scaling event through the API.

Another benefit that MongoDB Atlas has provided us is on the cost savings side. With Atlas, we no longer need a dedicated person to worry about operations or maintaining uptime. Instead, that person can work on the projects that we’d rather have them working on. In addition, our team is able to move much faster. Not only can we make changes on the fly to our application leveraging MongoDB’s flexible data model, but we can deploy any downstream database changes on the fly or easily spin up new clusters to test new ideas. All of these can happen without impacting things in production; no worrying about provisioning infrastructure, setting up backups, monitoring, etc. It’s a real thing of beauty.

In the near future, we plan to look into utilizing change streams from MongoDB 3.6 for our Winds application, which is already undergoing some major upgrades (users can sign up for the beta here). This may eliminate the need to maintain separate Redis instances, which would further increase our savings and reduce architectural complexity.
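Conceptually, replacing Redis-based polling with change streams takes very little code. The sketch below shows the shape such a listener would have with the official Node.js driver; the collection name and fan-out logic are hypothetical, not taken from the Winds codebase, and the watch() call is left in a comment because it requires a live replica set connection.

```javascript
// Sketch of a MongoDB 3.6 change stream listener (hypothetical names).
const pipeline = [
  // Only react to newly inserted documents.
  { $match: { operationType: 'insert' } },
];

function handleChange(change) {
  // In the real application, this is where new content would be fanned
  // out to follower feeds instead of polling Redis for updates.
  console.log('new document:', change.documentKey);
}

// With the official Node.js driver (mongodb >= 3.0) this would be wired
// up roughly as follows:
//   const stream = db.collection('articles').watch(pipeline);
//   stream.on('change', handleChange);

console.log(JSON.stringify(pipeline));
```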

We’re also looking into migrating more applications onto MongoDB Atlas as its built-in high availability, automation, fully managed backups, and performance optimization tools make it a no-brainer. While there are other MongoDB-as-a-service providers available (Compose, mLab, etc.), no other solution comes close to what MongoDB Atlas can provide.


Interested in reducing costs and getting to market faster? Get started today with a free 512 MB database managed by MongoDB Atlas.

Be a part of the largest gathering of the MongoDB community. Join us at MongoDB World.

MongoDB Enterprise Running on OpenShift

Update: May 2, 2018
Our developer preview of MongoDB Enterprise Server running on OpenShift now includes a simple OpenShift Template. The mongodb-openshift-dev-preview.template.yaml template file removes the complexity and additional requirements of running OpenShift with the --service-catalog enabled and deploying the Ansible Service Broker (not to mention needing to install the apb tool on your development system in order to build and deploy the Ansible Playbook Bundle). Currently, the template can provision multiple pods, each running an automation agent configured to the same MongoDB Ops Manager Project. You can complete the deployment of a MongoDB replica set with a few quick clicks in the Ops Manager user interface. We hope the removal of these additional dependencies helps you and your organization quickly adopt this modern, flexible, and full-featured way to deploy and run MongoDB Enterprise on your OpenShift clusters. And stay tuned! This is just the tip of the iceberg for support of your cloud native workloads from MongoDB.

In order to compete and get products to market rapidly, enterprises today leverage cloud-ready and cloud-enabled technologies. Platforms as a Service (or PaaS) provide out-of-the-box capabilities that enable application developers to focus on their business logic and users instead of infrastructure and interoperability. This key ability separates successful projects from those that drown in tangential work that never ends.

In this blog post, we'll cover MongoDB's general PaaS and cloud enablement strategy as well as touch upon some new features of Red Hat’s OpenShift which enable you to run production-ready MongoDB clusters. We're also excited to announce the developer preview of MongoDB Enterprise Server running on OpenShift. This preview allows you to test out how your applications will interact with MongoDB running on OpenShift.

Integration Approach for MongoDB and PaaS

Platforms as a Service are increasingly popular, especially for those of you charged with building "cloud-enabled" or "cloud-ready" applications but required to use private data center deployments today. Integrating a database with a PaaS needs to be done appropriately to ensure that database instances can be deployed, configured, and administered properly.

There are two common components of any production-ready cloud-enabled database deployment:

  • A modern, rock-solid database (like MongoDB).
  • Tooling to enable telemetry, access and authorization, and backups (not to mention things like proactive alerting that integrates with your chosen issue tracking system, complete REST-based APIs for automation, and a seamless transition to hosted services.) For MongoDB, this is MongoDB Ops Manager.

A deep integration of MongoDB Ops Manager is core to our approach of integrating MongoDB with popular PaaS offerings. The general approach follows the "separation of concerns" design principle: the chosen PaaS handles the physical or virtual machines, CPU and RAM allotment, persistent storage requirements, and machine-level access control, while MongoDB Ops Manager controls all aspects of run-time database deployments.

This strategy enables system administrators to quickly deploy "MongoDB as a Solution" offerings within their own data centers. In turn, enterprise developers can easily self-service their own database needs.

If you haven't already, download MongoDB Ops Manager for the best way to run MongoDB.

MongoDB Enterprise Server OpenShift Developer Preview

Our "developer preview" for MongoDB on OpenShift can be found in its GitHub repository. The preview allows provisioning of both MongoDB replica sets and "agent-only" nodes (for easy later use as MongoDB instances) directly through OpenShift. The deployments automatically register themselves with an instance of MongoDB Ops Manager. All the technical details and notes on getting started can be found right in the repo. Here we'll just describe some of the functionality and technology used.

The preview requires access to an OpenShift cluster running version 3.9 or later and takes advantage of the new Kubernetes Service Catalog features. Specifically, we're using the Ansible Service Broker and have built an Ansible Playbook Bundle which installs an icon into your OpenShift console. The preview also contains an example of an OpenShift template which supports replica sets and similar functionality.

A quick tour: deploying your first cluster

Once you have your development environment ready (see notes in the developer preview GitHub repository) and have configured an instance of MongoDB Ops Manager, you're ready to start deploying MongoDB Enterprise Server.

Clusters can be provisioned through the OpenShift web console or via command line. The web console provides an intuitive "wizard-like" interface in which users specify values for various parameters, such as MongoDB version, storage size allocation, and MongoDB Ops Manager Organization/Project to name a few.
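To give a feel for what drives that wizard, here is a schematic fragment of an OpenShift template parameter section. The parameter names and default values are illustrative only, not copied from the actual developer preview template.

```yaml
# Illustrative sketch of an OpenShift Template's parameter section;
# the "objects" section (pods, services, etc.) is omitted here.
apiVersion: v1
kind: Template
metadata:
  name: mongodb-enterprise-dev-preview
parameters:
  - name: MONGODB_VERSION
    description: MongoDB Enterprise Server version to deploy
    value: "3.6.4-ent"
  - name: STORAGE_SIZE
    description: Persistent volume size for each member
    value: "10Gi"
  - name: OPS_MANAGER_PROJECT
    description: MongoDB Ops Manager Organization/Project to register with
    required: true
```

Each parameter surfaces as a field in the web console wizard, and the same names can be supplied with `-p NAME=value` on the command line.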

Command-line installs are also available, with parameter values scripted or predefined. This extensibility allows for automation and integration with various Continuous Integration and Continuous Delivery technologies.

A future post will detail cluster configuration and various management scenarios, such as upgrades, performance tuning, and troubleshooting connectivity, so stay tuned.

We're excited to introduce simple and efficient ways to manage your MongoDB deployments with tools such as OpenShift and Kubernetes. Please try out the developer preview and drop us a line on Twitter #mongodb-openshift or email for more information.

Be a part of the largest gathering of the MongoDB community. Join us at MongoDB World.

Modern Distributed Application Deployment with Kubernetes and MongoDB Atlas

Storytelling is one of the parts of being a Developer Advocate that I enjoy. Sometimes the stories are about the special moments when the team comes together to keep a system running or build it faster. But there are also less-than-glorious tales to be told about the software deployments I’ve been involved in. And for situations where we needed to deploy several times a day, we’re talking nightmares.

For some time, I worked at a company that believed that deploying to production several times a day was ideal for project velocity. Our team was working to ensure that advertising software across our media platform was always being updated and released. One of the issues was a lack of real automation in the process of applying new code to our application servers.

What both ops and development teams had in common was a desire for improved ease and agility around application and configuration deployments. In this article, I’ll present some of my experiences and cover how MongoDB Atlas and Kubernetes can be leveraged together to simplify the process of deploying and managing applications and their underlying dependencies.

Let's talk about how a typical software deployment unfolded:

  1. The developer would send in a ticket asking for the deployment
  2. The developer and I would agree upon a time to deploy the latest software revision
  3. We would modify an existing bash script with the appropriate git repository version info
  4. We’d need to manually back up the old deployment
  5. We’d need to manually create a backup of our current database
  6. We’d watch the bash script perform this "Deploy" on about six servers in parallel
  7. Wave a dead chicken over my keyboard

Some of these deployments would fail, requiring a return to the previous version of the application code. This process to "rollback" to a prior version would involve me manually copying the repository to the older version, performing manual database restores, and finally confirming with the team that used this system that all was working properly. It was a real mess and I really wasn't in a position to change it.

I eventually moved into a position which gave me greater visibility into what other teams of developers, specifically those in the open source space, were doing for software deployments. I noticed that — surprise! — people were no longer interested in doing the same work over and over again.

Developers and their supporting ops teams have been given keys to a whole new world in the last few years by utilizing containers and automation platforms. Rather than doing manual work required to produce the environment that your app will live in, you can deploy applications quickly thanks to tools like Kubernetes.

What's Kubernetes?

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. Kubernetes can help reduce the amount of work your team will have to do when deploying your application. Along with MongoDB Atlas, you can build scalable and resilient applications that stand up to high traffic or can easily be scaled down to reduce costs. Kubernetes runs just about anywhere and can use almost any infrastructure. If you're using a public cloud, a hybrid cloud or even a bare metal solution, you can leverage Kubernetes to quickly deploy and scale your applications.

The Google Kubernetes Engine is built into the Google Cloud Platform and helps you quickly deploy your containerized applications.

For the purposes of this tutorial, I will upload our image to GCP and then deploy to a Kubernetes cluster so I can quickly scale up or down our application as needed. When I create new versions of our app or make incremental changes, I can simply create a new image and deploy again with Kubernetes.

Why Atlas with Kubernetes?

By using these tools together for your MongoDB Application, you can quickly produce and deploy applications without worrying much about infrastructure management. Atlas provides you with a persistent data-store for your application data without the need to manage the actual database software, replication, upgrades, or monitoring. All of these features are delivered out of the box, allowing you to build and then deploy quickly.

In this tutorial, I will build a MongoDB Atlas cluster where our data will live for a simple Node.js application. I will then turn the app and configuration data for Atlas into a container-ready image with Docker.

MongoDB Atlas is available across most regions on GCP so no matter where your application lives, you can keep your data close by (or distributed) across the cloud.

Figure 1: MongoDB Atlas runs in most GCP regions


To follow along with this tutorial, you'll need a few things in place before getting started: a MongoDB Atlas account, a Google Cloud Platform account, and git, Node.js/npm, and Docker installed on your workstation.

First, I will download the repository for the code I will use. In this case, it's a basic record keeping app using MongoDB, Express, React, and Node (MERN).

bash-3.2$ git clone
Cloning into 'mern-crud'...
remote: Counting objects: 326, done.
remote: Total 326 (delta 0), reused 0 (delta 0), pack-reused 326
Receiving objects: 100% (326/326), 3.26 MiB | 2.40 MiB/s, done.
Resolving deltas: 100% (137/137), done.

cd mern-crud

Next, I will run npm install to get all the required npm packages installed for working with our app:

> uws@9.14.0 install /Users/jaygordon/work/mern-crud/node_modules/uws
> node-gyp rebuild > build_log.txt 2>&1 || exit 0

Selecting your GCP Region for Atlas

Each GCP region includes a set number of independent zones. Each zone has power, cooling, networking, and control planes that are isolated from other zones. For regions that have at least three zones (3Z), Atlas deploys clusters across three zones. For regions that only have two zones (2Z), Atlas deploys clusters across two zones.

The Atlas Add New Cluster form marks regions that support 3Z clusters as Recommended, as they provide higher availability. If your preferred region only has two zones, consider enabling cross-region replication and placing a replica set member in another region to increase the likelihood that your cluster will be available during partial region outages.

The number of zones in a region has no effect on the number of MongoDB nodes Atlas can deploy. MongoDB Atlas clusters are always made of replica sets with a minimum of three MongoDB nodes.

For general information on GCP regions and zones, see the Google documentation on regions and zones.

Create Cluster and Add a User

In the provided image below you can see I have selected the Cloud Provider "Google Cloud Platform." Next, I selected an instance size, in this case an M10. Deployments using M10 instances are ideal for development. If I were to take this application to production immediately, I may want to consider using an M30 deployment. Since this is a demo, an M10 is sufficient for our application. For a full view of all of the cluster sizes, check out the Atlas pricing page. Once I’ve completed these steps, I can click the "Confirm & Deploy" button. Atlas will spin up my deployment automatically in a few minutes.

Let’s create a username and password for our database that our Kubernetes deployed application will use to access MongoDB.

  • Click "Security" at the top of the page.
  • Click "MongoDB Users"
  • Click "Add New User"
  • Click "Show Advanced Options"
  • We'll then add a user "mernuser" for our mern-crud app that only has access to a database named "mern-crud" and give it a complex password. We'll specify readWrite privileges for this user:

  • Click "Add User"

Your database is now created and your user is added. You still need your connection string, and you need to whitelist access via the network.

Connection String

Get your connection string by clicking "Clusters" and then clicking "CONNECT" next to your cluster details in your Atlas admin panel. After selecting connect, you are provided several options to use to connect to your cluster. Click "connect your application."

Options for the 3.6 or the 3.4 versions of the MongoDB driver are given. I built mine using the 3.4 driver, so I will just select the connection string for this version.

I will typically paste this into an editor and then modify the info to match my application credentials and my database name:

I will now add this to the app's database configuration file and save it.

Next, I will package this up into an image with Docker and ship it to Google Kubernetes Engine!

Docker and Google Kubernetes Engine

Get started by creating an account at Google Cloud, then follow the quickstart to create a Google Kubernetes Project.

Once your project is created, you can find it within the Google Cloud Platform control panel:

It's time to create a container on your local workstation:

Set the PROJECT_ID environment variable in your shell by retrieving the pre-configured project ID on gcloud with the commands below:

export PROJECT_ID="jaygordon-mongodb"
gcloud config set project $PROJECT_ID
gcloud config set compute/zone us-central1-b

Next, place a Dockerfile in the root of your repository with the following:

FROM node:boron

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

COPY . /usr/src/app
RUN npm install

EXPOSE 3000

CMD ["npm", "start"]

To build the container image of this application and tag it for uploading, run the following command:

bash-3.2$ docker build -t${PROJECT_ID}/mern-crud:v1 .
Sending build context to Docker daemon  40.66MB
Successfully built b8c5be5def8f
Successfully tagged

Upload the container image to the Container Registry so we can deploy to it:

bash-3.2$ gcloud docker -- push${PROJECT_ID}/mern-crud:v1
The push refers to repository []

Next, I will test it locally on my workstation to make sure the app loads:

docker run --rm -p 3000:3000${PROJECT_ID}/mern-crud:v1
> mern-crud@0.1.0 start /usr/src/app
> node server
Listening on port 3000

Great: pointing my browser to http://localhost:3000 brings me to the site. Now it's time to create a Kubernetes cluster and deploy our application to it.

Build Your Cluster With Google Kubernetes Engine

I will be using the Google Cloud Shell within the Google Cloud control panel to manage my deployment. The Cloud Shell comes with all required applications and tools installed, allowing me to deploy the Docker image I uploaded to the image registry without installing any additional software on my local workstation.

Now I will create the Kubernetes cluster where the image will be deployed, which will help bring our application to production. I will include three nodes to ensure uptime of our app.

Set up our environment first:

export PROJECT_ID="jaygordon-mongodb"
gcloud config set project $PROJECT_ID
gcloud config set compute/zone us-central1-b

Launch the cluster

gcloud container clusters create mern-crud --num-nodes=3

After a few minutes, the console will respond with the following output, and you will have a three-node Kubernetes cluster visible in your control panel:

Creating cluster mern-crud...done.
Created [].
To inspect the contents of your cluster, go to:
kubeconfig entry generated for mern-crud.
mern-crud  us-central1-b  1.8.7-gke.1  n1-standard-1  1.8.7-gke.1   3          RUNNING

Just a few more steps left. Now we'll deploy our app with kubectl to our cluster from the Google Cloud Shell:

kubectl run mern-crud${PROJECT_ID}/mern-crud:v1 --port 3000

The output when completed should be:

jay_gordon@jaygordon-mongodb:~$ kubectl run mern-crud${PROJECT_ID}/mern-crud:v1 --port 3000
deployment "mern-crud" created

Now review the application deployment status:

jay_gordon@jaygordon-mongodb:~$ kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
mern-crud-6b96b59dfd-4kqrr   1/1       Running   0          1m

We'll create a load balancer for the three nodes in the cluster so they can be served properly to the web for our application:

jay_gordon@jaygordon-mongodb:~$ kubectl expose deployment mern-crud --type=LoadBalancer --port 80 --target-port 3000 
service "mern-crud" exposed
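For repeatable deployments, the two imperative commands above (kubectl run and kubectl expose) can also be captured declaratively in a single manifest. This is a sketch; the image path assumes the PROJECT_ID used earlier:

```yaml
# Declarative equivalent of the kubectl run / kubectl expose commands.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mern-crud
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mern-crud
  template:
    metadata:
      labels:
        app: mern-crud
    spec:
      containers:
        - name: mern-crud
          image:
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: mern-crud
spec:
  type: LoadBalancer
  selector:
    app: mern-crud
  ports:
    - port: 80
      targetPort: 3000
```

Applying it with `kubectl apply -f mern-crud.yaml` creates both the Deployment and the LoadBalancer Service in one step.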

Now get the IP of the load balancer so that, if needed, it can be bound to a DNS name and you can go live!

jay_gordon@jaygordon-mongodb:~$ kubectl get service
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
kubernetes   ClusterIP              443/TCP        11m
mern-crud    LoadBalancer   80:30684/TCP   2m

A quick curl test shows me that my app is online!

bash-3.2$ curl -v
* Rebuilt URL to:
*   Trying
* Connected to ( port 80 (#0)
> GET / HTTP/1.1
> Host:
> User-Agent: curl/7.54.0
> Accept: */*
< HTTP/1.1 200 OK
< X-Powered-By: Express

I have added some test data, and as we can see, it's served by the application I deployed to GCP via Kubernetes, with the persistent data stored in MongoDB Atlas.

When I am done working with the Kubernetes cluster, I can destroy it easily:

gcloud container clusters delete mern-crud

What's Next?

You've now got all the tools in front of you to build something HUGE with MongoDB Atlas and Kubernetes.

Check out the rest of the Google Kubernetes Engine's tutorials for more information on how to build applications with Kubernetes. For more information on MongoDB Atlas, click here.

Have more questions? Join the MongoDB Community Slack!

Continue to learn via high quality, technical talks, workshops, and hands-on tutorials. Join us at MongoDB World.

Longbow Advantage - Helping companies move beyond the spreadsheet for a real-time view of logistics operations

The global market in supply chain analytics is estimated at some $2.7 billion[1] — and yet, far too often supply chain leaders use spreadsheets to manage their operation, limiting the real-time visibility into their systems.

Longbow Advantage, a supply chain partner, helps companies get the maximum ROI from their supply chain software products. Moving beyond the spreadsheet and generic enterprise BI tools, Longbow developed an application called Rebus™ which allows users to harness the power of smart data and get real-time visibility into their entire supply chain. That means ingesting data in many formats from a wide range of systems, storing it for efficient reference, and presenting it as needed to users — at scale.

MongoDB Atlas is at the heart of Rebus. We talked to Alex Wakefield, Chief Commercial Officer, to find out why they chose to trust such a critical part of their business to MongoDB and how it’s panned out both technically and commercially.


Tell us a little bit about Longbow Advantage. How did you come up with the idea?

Sixteen years ago our Founder, Gerry Brady, left his job at a distribution company to build Longbow Advantage. The goal was to build a company that could help streamline warehouse and workforce management implementations, upgrades, and integrations, and put more focus on customer experience and success.

Companies of all sizes have greatly improved distribution processes but still lack real-time visibility into their systems. While there’s a desire to use BI/analytics systems, automate manual processes, and work with information in as close to real-time as possible, most companies continue to rely on manually generated spreadsheets to measure their logistics KPIs, slowing down speed to insights.

There had to be a better way to help companies address this problem. We built an application called Rebus. This SaaS-based analytics platform, used by industry leaders such as Del Monte Foods and Subaru of America, aggregates and harmonizes logistics data from any supply chain execution software to provide a near real-time view of logistics operations and deliver cross-functional insights. The idea is quite simply to provide more accurate data in as close to real-time as technically possible within a common platform that can be shared across the supply chain.

For example, one company may have a KPI around labor productivity. When that company receives a customer order to ship, there is a lot of information they want to know:

  • Was the order shipped and on-time?
  • How efficiently is the labor staff filling orders?
  • How many orders are processing?
  • How many individual lines or tasks on the order are being filled?

The list goes on. With Rebus, manufacturers, retailers and distributors can segment different business lines like ecommerce, traditional retail, direct to consumer and more, to ensure that they are being productive and meeting the appropriate deadlines. Without this information, a company may miss major deadlines, negatively impact customer satisfaction, miss out on revenue opportunities, and in some cases, incur significant financial penalties.

What are some of the benefits that your customers are experiencing?

Our customers are able to automate a manual and time-intensive metrics process and collect near real-time data in a common platform that can be used across the organization. All of this leads to more efficient decision-making and a coordinated communication effort.

Customers are also able to identify inaccurate or duplicate data that may be contributing to slow performance in their Warehouse and Labor Management software. Rebus provides an immediate way to identify data issues and improve overall performance. This is a huge benefit for customers who are shipping thousands of orders every week.

Why did you decide to use MongoDB?

Four years ago, when we first came up with the idea for Rebus, we gathered a group of employees to brainstorm the best way to build it.

In that brainstorm, one of our employees suggested that we use MongoDB as the underlying datastore. After doing some research, it was clear that the document model was a good match for Rebus. It would allow us to gather, store, and build analytics around a lot of disparate data in close to real time. We decided to build our application on MongoDB Enterprise Advanced.

When and why did you decide to move to MongoDB Atlas?

We first heard about MongoDB Atlas in July 2016 shortly after it launched, but were not able to migrate right away. We maintain strict requirements around compliance and data management, so it was not until May 2017, when MongoDB Atlas became SOC2 compliant, that we decided to migrate. Handing off our database management to the team that builds MongoDB gave us peace of mind and has helped us stay efficient and agile. We wanted to ensure that our team could remain focused on the application and not have to worry about the underlying infrastructure. Atlas allowed us to do just that.

The migration wasn’t hard. We were moving half a terabyte of data into Atlas, which took a couple of goes — the first time didn’t take. But the support team was proactive. After working with us to pinpoint the issue, one of our key technical people reconfigured an option and the process re-ran without any issues. We hit our deadline.

Why did you decide to use Atlas on Google Cloud Platform (GCP)?

Google Cloud Platform is SOC2 compliant and allows us to keep our team highly efficient and focused on developing the application instead of managing the back end. Additionally, GCP gave us great responses that we weren’t getting from other cloud vendors.

How has your experience been so far?

MongoDB Atlas has been fantastic for us. In particular, the real-time performance panel is fantastic, allowing us to see what is going on in our cluster as it’s happening.

In comparison to other databases, both NoSQL and SQL, MongoDB provides huge benefits. Despite the fact that many of our developers have worked with relational databases their entire careers, the way we can get data out of MongoDB is unlike anything they’ve ever seen. That’s even with a smaller, more efficient footprint on our system.

Additionally, the speed of MongoDB has been really helpful. We’re still looking at the results from our load tests, but the ratio of timeouts to successes was very low. Atlas outperforms what we were doing before. We know we can support at least a couple hundred users at one time. That tells us we will be able to go and grow with MongoDB Atlas for years to come.

Thank you for your time Alex.

[1] Grand View Research, Supply Chain Analytics Market Analysis, 2014 - 2025,

Rebus is a trademark of Longbow Advantage Inc.

16 Cities in 5 Months: The MongoDB team is coming to an AWS Summit near you

As our community of users continues to grow and become more diverse, we want to ensure all of our customers are fully equipped to be successful on MongoDB Atlas. To that end, we have partnered with AWS, committing to 16 of their regional Summits. These 16 events span 13 different countries and are expected to draw thousands of members of the AWS and MongoDB communities.

Powering an online community of coders with MongoDB Atlas

This is a guest post by Linda Peng (creator of CodeBuddies) and Dhaval Tanna (core contributor).

If you’re learning to code, or if you already have coding experience, it helps to have other people around, like mentors, coworkers, hackathon buddies, and study partners, to accelerate your learning, especially when you get stuck.

But not everyone can commute to a tech meetup, or lives in a city with access to a network of study partners or mentors/coworkers who can help them.

CodeBuddies started in 2014 as a free virtual space for independent code learners to share knowledge and help each other learn. It is fully remote and 100% volunteer-driven, and helps those who — due to geography, schedule or personal responsibilities — might not be able to easily attend in-person tech meetups and workshops/hackathons where they could find study partners and mentors.

The community now comprises a mix of experienced software engineers and beginning coders from countries around the world, who share advice and knowledge in a friendly Slack community. Members also use the website to start study groups and schedule virtual hangouts. We have a pay-it-forward mentality.

The platform, an open-sourced project, was painstakingly built by volunteer contributors to help members organize study groups and schedule focused hangouts to learn together. In those peer-to-peer organized remote hangouts, the scheduler of the hangout might invite others to join them in:

  • Working through a coding exercise together
  • Screen sharing and helping each other through a contribution to an open-sourced project
  • Co-working silently in a “silent” hangout (peer motivation)
  • Helping them practice their knowledge of a topic by attempting to teach it
  • Reading through a chapter of a programming tutorial together

Occasionally, the experience will be magical: a single hangout on a popular framework might have participants joining in at the same time from Australia, the U.S., Finland, Hong Kong, and Nigeria.

The site uses the MeteorJS framework, and the data is stored in a MongoDB database.

For years, with a zero budget, CodeBuddies was hosted on a sandbox instance from mLab. When we had the opportunity to migrate to MongoDB Atlas, our database was small enough that we didn’t need to use live migration (which requires a paid mLab plan), but could migrate it manually. These are the three easy steps we took to complete the migration:

1) Dump the mongo database to a local folder

Once you have stopped application writes to your old database, run:

mongodump -h <hostname> --port 15992 --db production-database -u username -p password -o Downloads/dump/production-database

2) Create a new cluster on MongoDB Atlas


3) Use mongorestore to populate the dumped DB into the MongoDB Atlas cluster


First, whitelist your droplet’s IP address on MongoDB Atlas.

Then you can restore the mLab dump you have in a local folder to MongoDB Atlas:

mongorestore --host <atlas-hostname> --port 27018 --ssl --authenticationDatabase admin -u username -p password Downloads/dump/production-database

We host our app on DigitalOcean, and use Phusion Passenger to manage our app. When we were ready to make the switchover, we stopped Phusion Passenger, added our MongoDB connection string to our nginx config file, and then restarted Phusion Passenger.
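As a rough sketch of that switchover, the nginx server block managed by Phusion Passenger might look like the following. All names here are hypothetical placeholders (domain, paths, credentials, and hostnames are not CodeBuddies’ real values); a Meteor app reads its connection string from the MONGO_URL environment variable, which Passenger can set via `passenger_env_var`:

```nginx
# Hypothetical nginx + Phusion Passenger server block for a Meteor app bundle.
server {
    listen 80;
    server_name codebuddies.example.org;        # hypothetical domain
    root /var/www/codebuddies/bundle/public;    # hypothetical app path

    passenger_enabled on;
    passenger_app_type node;
    passenger_startup_file main.js;

    # Point the app at the new Atlas cluster instead of mLab.
    passenger_env_var MONGO_URL "mongodb://username:password@cluster0-shard-00-00.mongodb.net:27017,cluster0-shard-00-01.mongodb.net:27017/production-database?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin";
}
```

Restarting nginx (and with it Passenger) then brings the app up against the new database.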


CodeBuddies is a small project now, but we do not want to be unprepared when the community grows. We chose MongoDB Atlas for its mature performance monitoring tools, professional support, and easy scaling.

How Kustomer uses MongoDB and AWS to help fill in the gaps in the customer journey

Kustomer is a SaaS-based customer relationship platform designed to integrate conversations, transactions, and a company's proprietary data in one single system, capturing all aspects of the customer journey. We sat down with Jeremy Suriel, CTO & Co-Founder of Kustomer, to learn more.

Tell us about Kustomer

My co-founder and I worked together for 20 years in customer support. Over time, we’ve seen major changes in the industry - social media gave consumers a voice, users started communicating through text, mobile computing took off - and companies weren’t listening to their customers through these new channels.

Recognizing these changes, Kustomer was launched in 2015 as a CRM platform to improve the customer experience. Our goal is to help companies compile customer information into one place, automate business processes, address the pain points behind customer support systems, and enable users to make smarter, data driven decisions.

What are you building with MongoDB?

We are building an application that allows Kustomer users to get a complete picture of their customer’s activity from the first interaction through the entire journey. This insight allows customer support representatives to provide a better, more personalized experience to the end user. With Kustomer, users are able to combine conversations, custom objects, and tracked events in an easy-to-use interface. They are able to collect historical data behind every account from every channel, get insight into customer sentiment, and more.

We could have chosen any data storage engine for this application. We briefly considered MySQL, Postgres, and DynamoDB, however, when compared to the alternatives, MongoDB was the stand out in two key areas. First, we needed to store complicated data in a simple way. MongoDB’s flexible data model allowed us to have independent tenants in our platform with the ability for each customer to define the structure of their data based on their specific requirements. Relational data stores didn’t give us this option and DynamoDB lacked some key features and flexibility like easily adding secondary compound indexes to an existing data model.
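As an illustration of that flexibility (the collection and field names below are hypothetical, not Kustomer’s actual schema), two tenants can store differently shaped records in the same collection, and a secondary compound index can be added to the existing data model with a single call:

```javascript
// Sketch only: hypothetical tenant documents, not Kustomer's real schema.
// Two tenants store customer records with different structures side by side.
const tenantA = { tenantId: "a", name: "Ada", plan: "pro", tags: ["vip"] };
const tenantB = { tenantId: "b", name: "Bo", orders: [{ sku: "x1", qty: 2 }] };

// With the Node driver (assumed: the official "mongodb" package), both insert
// into the same collection, and a compound secondary index is one call:
//
//   await db.collection("customers").insertMany([tenantA, tenantB]);
//   await db.collection("customers").createIndex({ tenantId: 1, name: 1 });
//
// No migration is needed when a tenant adds a field -- documents simply differ.
console.log(Object.keys(tenantA).length, Object.keys(tenantB).length); // → 4 3
```

This is the contrast with DynamoDB mentioned above: there, adding an equivalent secondary compound index after the fact is considerably more constrained.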

Second, we decided early on that we would be a JavaScript shop, specifically Node.js on the backend and React.js on the frontend. From a hiring perspective, we found that Node.js engineers have a lot of familiarity with MongoDB. Building our platform on MongoDB helps us get access to the top talent with the relevant set of expertise, and allow us to build our application quickly and efficiently.

We were also excited to leverage MongoDB’s WiredTiger storage engine with improved performance and concurrency. Overall, MongoDB was a no-brainer for us.

Please describe your application stack. What technologies or services are you using?

We have a microservice-based architecture with MongoDB as the primary database storing the majority of our data. Our infrastructure is running in AWS where we follow standard best practices.

  • Services are continuously deployed with zero downtime from CircleCI to Amazon Elastic Container Service (ECS), which runs our Docker-based microservice containers.
  • All services run within an AWS VPC, deployed Multi-AZ for high availability, with auto-scaling and traffic distributed through AWS ELB/ALBs.
  • API gateways sit in front of all our microservices, handling authentication, authorization, and auditing.
  • Customer Search & Segmentation, which is a core functionality of our platform, is powered by Elasticsearch.
  • We rely on AWS Kinesis Data Streams to collect and process events.
  • We use AWS Lambda functions to help customers populate AWS Redshift and create real-time dashboards. We’re also developing a Snowflake integration for other analytics use cases.
  • Finally, we use Terraform to automatically configure our cloud-based dev, qa, staging, and production environments.

We leverage MongoDB Enterprise Advanced for ongoing support and for the additional software that helps us with database operations. For example, we use the included Cloud Manager product to manage our database backups. The tool helps us upgrade our clusters, connect our alerts to Slack, and more. Our favorite feature of MongoDB Cloud Manager is the profiling/metrics dashboard that allows us to see everything that is happening within our deployment at all times and perform very specific queries to get greater insights into performance.

How is MongoDB performing for you?

MongoDB continues to perform well as our application and usage grows. We now have 1-4 millisecond reads and sub-millisecond writes. Our data volume has grown 80% since last quarter and we currently have 30+ MongoDB databases with well over 100 collections. We may explore sharding one or more of our services’ MongoDB collections and/or migrating to MongoDB Atlas in the future.

Overall we’ve experienced great benefits with MongoDB. We have great response times, are able to get the talent we need, are easily able to personalize our product to our customers’ needs, and more. Our company would not be where we are today if we had based our application on any other database.

SEGA HARDlight Migrates to MongoDB Atlas to Simplify Ops and Improve Experience for Millions of Mobile Gamers

It was way back in the summer of ‘91 that Sonic the Hedgehog first chased rings across our 2D screens. Gaming has come a long way since then: from a static TV and console setup in ‘91, to online PC gaming in the noughties, and now to mobile and virtual reality. Surprisingly, for most of those 25 years, the underlying infrastructure that powered these games hasn’t really changed much at all. It was all relational databases. But with the ever-increasing need for scale, flexibility, and creativity in games, that’s changing fast. SEGA HARDlight is leading this shift by adopting a DevOps culture and using MongoDB Atlas, the cloud-hosted MongoDB service, to deliver the best possible gaming experience.

Bringing Icons to Mobile Games

SEGA HARDlight is a mobile development studio for SEGA, a gaming company you might have heard of. Based in the UK’s Royal Leamington Spa, SEGA HARDlight is well known for bringing the much-loved blue mascot Sonic the Hedgehog to the small screen. Along with a range of Sonic games, HARDlight is also responsible for building and running a number of other iconic titles such as Crazy Taxi: City Rush and Kingdom Conquest: Dark Empire.

Sonic Dash

Earlier versions of the mobile games such as Sonic Jump and Sonic Dash didn’t require a connection to the internet and had no server functionality. As they were relatively static games, developers initially supported the releases with an in-house tech stack based around Java and MySQL and hosted in SEGA HARDlight’s own data centre.

The standard practice for launching these games involved load testing the servers to the point of breaking, then provisioning the resources to handle an acceptable failure point. This limited application functionality, and could cause service outages when reaching the provisioned resources’ breaking point. As the games started to add more online functionality and increased in popularity, that traditional stack started to creak.

Massive Adoption: Spiky Traffic

Mobile games have an interesting load pattern. People flock in extreme numbers very soon after the release. For the most popular games, this can mean many millions of people in just a few days or even hours. The peak is usually short and then quickly drops to a long tail of dedicated players. Provisioning for this kind of traffic with a dynamic game is a major headache. The graph from the Crazy Taxi: City Rush launch in 2014 demonstrates just how spiky the traffic can be.

Typical usage curve for a popular mobile game

We spoke with Yordan Gyurchev, Technical Director at SEGA HARDlight, who explained: “With these massive volumes, even minor changes in the database have a big impact. To provide a perfect gaming experience, developers need to be intimately familiar with the performance trade-offs of the database they’re using.”

Crazy Taxi : City Rush

Supersonic Scaling

SEGA HARDlight knew that the games were only going to get more online functionality and generate even more massive bursts of user activity. Much of the gaming data was also account-based, so it didn’t fit naturally into the rows and columns of relational databases. In order to address these limitations, the team searched for alternatives. After reviewing Cassandra and Couchbase, the team felt those options were either too complex to manage or lacked the mature support needed to meet the company’s SLAs, so the HARDlight engineers looked to MongoDB Atlas, the MongoDB database as a service.

Then came extensive evaluations and testing across multiple dimensions such as cost, maintenance, monitoring and backups. It was well known that MongoDB natively had the scalability and flexibility to handle large volumes and always-on deployments but HARDlight’s team had to have support on the operations side too.

Advanced operational tooling in MongoDB Atlas gave a small DevOps team of just two staffers the ability to handle and run games even as millions of people join the fray. They no longer had to worry about maintenance, upgrades or backups. In fact, one of the clinchers was the point in time backup and restore feature which meant that they can roll back to a checkpoint with the click of a button. With MongoDB Atlas and running on AWS, SEGA HARDlight was ready to take on even Boss Level scaling.

“At HARDlight we’re passionate about finding the right tool for the job. For us we could see that using a horizontally scalable document database was a perfect fit for player-account based games,” said Yordan.

“The ability to create a high traffic volume, highly scalable solution is about knowing the tiny details. To do that, normally engineers need to focus on many different parts of the stack but MongoDB Atlas and MongoDB’s support gives us a considerable shortcut. If this was handled in-house we would only be as good as our database expert. Now we can rely on a wealth of knowledge, expertise and best in class technology.”

Sonic Forces

HARDlight’s first MongoDB-powered game was Kingdom Conquest: Dark Empire, which was a frictionless launch from the start and gave the engineers their first experience with MongoDB. Then, over a weekend in late 2017, Sonic Forces: Speed Battle was launched on mobile. It’s a demanding, always-on application that requires a constant connection to the internet and maintains shared leaderboards. In the background, a three-shard cluster running on MongoDB Atlas easily scaled to handle the complex loads as millions of gamers joined the race. The database was stable with low latencies and not a single service interruption. All of this resulted in a low-stress launch, a happy DevOps team, and a very enthusiastic set of gamers.

The latest SEGA HARDlight mobile game: Sonic Forces: Speed Battle

Yordan concluded: “With MySQL, it had taken multiple game launches to get the database backend right. With MongoDB Atlas, big launches were a success right from the start. That’s no mean feat.”

Just as the gaming platforms have evolved and transformed through the years, so too has the database layer had to grow and adapt. SEGA HARDlight is now expanding its use of MongoDB Atlas to support all new games as they come online. By taking care of the operations, management and scaling, MongoDB Atlas lets HARDlight focus on building and running some of the most iconic games in the world. And doing it with confidence.

Gone is the 90s infrastructure. Replaced by a stack that is every bit as modern, powerful and fast as the famous blue hedgehog.

SEGA Hardlight is looking for talented engineers to join the team. If you are interested, check out the careers page or email:

Start your Atlas journey today for free. What are you waiting for?

Optimizing for Fast, Responsive Reads with Cross-Region Replication in MongoDB Atlas

MongoDB Atlas customers can enable cross-region replication for multi-region fault tolerance and fast, responsive reads.

  • Improved availability guarantees can be achieved by distributing replica set members across multiple regions. These secondaries will participate in the automated election and failover process should the primary (or the cloud region containing the primary) go offline.
  • Read-only replica set members allow customers to optimize for local reads (minimize read latency) across different geographic regions using a single MongoDB deployment. These replica set members do not participate in the election and failover process and can never be elected primary.

In this post, we’ll dive a little deeper into optimizing for local reads using cross-region replication and walk you through the necessary configuration steps on an environment running on AWS.

Primer on read preference

Read preference determines how MongoDB clients route read operations to the members of a replica set. By default, an application directs its read operations to the replica set primary. By specifying the read preference, users can:

  • Enable local reads for geographically distributed users. Users from California, for example, can read data from a replica located locally for a more responsive experience
  • Allow read-only access to the database during failover scenarios

A read replica is simply an instance of the database that serves data replicated from the primary’s oplog; clients never write to a read replica.

With MongoDB Atlas, we can easily distribute read replicas across multiple cloud regions, allowing us to expand our application's data beyond the region containing our replica set primary in just a few clicks.

To enable local reads and increase the read throughput to our application, we simply need to modify the read preference via the MongoDB drivers.

Enabling read replicas in MongoDB Atlas

We can enable read replicas for a new or existing MongoDB paid cluster in the Atlas UI. To begin, we can click on the cluster “configuration” button and then find the link named "Enable cross-region configuration options."

When we click this, we’ll be presented with an option to select the type of cross-region replication we want. We’ll choose to deploy read-only replicas:

In this configuration, we have our preferred region (the region containing our replica set primary) set to AWS us-east-1 (Virginia) with the default three nodes. We can add regions to our cluster configuration based on where we think other users of our application might be concentrated. In this case, we will add additional nodes in us-west-1 (Northern California) and eu-west-1 (Ireland), providing us with read replicas to serve local users.

Note that all writes will still go to the primary in our preferred region, and reads from the secondaries in the regions we’ve added will be eventually consistent.

We’ll click "Confirm and Deploy", which will deploy our multi-region cluster.

Our default connection string will now include these read replicas. We can go to the "Connect" button and find our full connection string to access our cluster:

When the deployment of the cluster completes, we will be ready to distribute our application’s reads across multiple regions using the MongoDB drivers. Specifically, we can configure readPreference within our connection string to route clients to the nearest replica. For example, the native MongoDB Node.js driver permits us to specify our preference:

readPreference: Specifies the replica set read preference for this connection.

The read preference values are the following:

  • primary — all reads go to the replica set primary (the default)
  • primaryPreferred — reads go to the primary if available, otherwise to a secondary
  • secondary — all reads go to secondary members
  • secondaryPreferred — reads go to secondaries if available, otherwise to the primary
  • nearest — reads go to the member with the lowest network latency, regardless of role

For our app, if we want to ensure the read preference in our connection string is set to the nearest MongoDB replica, we would configure it as follows:
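A minimal sketch of such a connection string follows; the hostnames, credentials, and replica set name are hypothetical placeholders, so substitute the values from your own Atlas “Connect” dialog:

```javascript
// Sketch only: hostnames and credentials below are hypothetical placeholders.
const hosts = [
  "cluster0-shard-00-00.mongodb.net:27017",
  "cluster0-shard-00-01.mongodb.net:27017",
  "cluster0-shard-00-02.mongodb.net:27017",
].join(",");

// readPreference=nearest routes each client's reads to the lowest-latency
// member, which is what sends local users to their regional read-only replica.
const uri =
  "mongodb://username:password@" + hosts + "/test" +
  "?ssl=true&replicaSet=Cluster0-shard-0" +
  "&authSource=admin&readPreference=nearest";

// With the Node driver (assumed: the official "mongodb" package):
//   const { MongoClient } = require("mongodb");
//   const client = await MongoClient.connect(uri);
console.log(uri.includes("readPreference=nearest")); // → true
```

Because writes always route to the primary regardless of read preference, this setting only affects how reads are distributed.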


Security and Connectivity (on AWS)

MongoDB Atlas allows us to peer our application server's VPC directly to our MongoDB Atlas VPC within the same region. This permits us to reduce the network exposure to the internet and allows us to use native AWS Security Groups or CIDR blocks. You can review how to configure VPC Peering here.

A note on VPCs for cross-region nodes:

At this time, MongoDB Atlas does not support VPC peering across regions. If you want to grant clients in one cloud region read or write access to database instances in another cloud region, you would need to permit the clients’ public IP addresses to access your database deployment via IP whitelisting.

With cross-region replication and read-only replicas enabled, your application will now be capable of providing fast, responsive access to data from any number of regions.

Get started today with a free 512 MB database managed by MongoDB Atlas here.

Cloud Data Strategies: Preventing Data Black Holes in the Cloud

Leo Zheng


Black holes are regions in spacetime with such strong gravitational pull that nothing can escape. Not entirely destructive as you might have been led to believe, their gravitational effects help drive the formation and evolution of galaxies. In fact, our own Milky Way galaxy orbits a supermassive black hole with 4.1 million times the mass of the Sun. Some theorize that none of us would be here were it not for a black hole.

On the flip side, black holes can also be found hurtling through the cosmos — often at millions of miles per hour — tearing apart everything in their path. It’s said that anything that makes it into their event horizons, the “point of no return”, will never be seen or heard from again, making black holes some of the most interesting and terrifying objects in space.

Why are we going on about black holes, gravitational effects, and points of no return? Because something analogous is happening right now in computing.

First coined in 2010 by Dave McCrory, the concept of “data gravity” treats data as if it were a planet or celestial object with mass. As data accumulates in an environment, applications and services that rely on that data will naturally be pulled into the same environment. The larger the “mass” of data there is, the stronger the “gravitational pull” and the faster this happens. Applications and services each have their own gravity but data gravity is by far the strongest, especially as:

  • The farther away data is, the more drastic the impacts on application performance and user experience. Keeping applications and services physically nearby reduces latency, maximizes throughput, and makes it easier for teams to build performant applications.
  • Moving data around has a cost. In most cases, it makes sense to centralize data to reduce that cost, which is why data tends to amass in one location or environment. Yes, distributed systems do allow organizations to partition data in different ways for specific purposes — for example, fencing sets of data by geographic borders to comply with regulations — but within those partitions, minimal data movement is still desirable.
  • And finally, efforts to digitize business and organizational activities, processes, and models (dubbed by many as “digital transformation” initiatives) succeed or fail based on how effectively data is utilized. If software is the engine by which digital transformation happens, then data is its fuel.

As in the real world, the larger the mass of an object, the harder it is to move, so data gravity also means that once your mass of data gets large enough, it is also harder (and in some cases, near impossible) to move. What makes this relevant now more than ever is the shift to cloud computing. As companies move to the cloud, they need to make a decision that will have massive implications down the line — where and how are they going to store their data? And how do they not let data gravity in the cloud turn into a data black hole?

There are several options for organizations moving from building their own IT to consuming it as a service in the cloud.

Proprietary Tabular (Relational) Databases

The companies behind proprietary tabular databases often penalize their customers for running these technologies on any cloud platform other than their own. This should not surprise any of us. These are the same vendors that for decades have been relying on selling heavy proprietary software with multi-year contracts and annual maintenance fees. Vendor lock-in is nothing new to them.

Organizations choosing to use proprietary tabular databases in the cloud also carry over all the baggage of those technologies and realize few cloud benefits. These databases scale vertically and often cannot take advantage of cloud-native architectures for scale-out and elasticity without massive compromises. If horizontal scale-out of data across multiple instances is available, it isn’t native to the database and requires complex configurations, app-side changes, and additional software.

Lifting and shifting these databases to the cloud does not change the fact that they’re not designed to take advantage of cloud architectures.

Open Source Tabular Databases

Things are a little better with open source tabular databases insofar as there is no vendor enforcing punitive pricing to keep you on their cloud. However, similar to proprietary tabular databases, most of these technologies are designed to scale vertically; scaling out to fully realize cloud elasticity is often managed with fragile configurations or additional software.

Many companies running these databases in the cloud rely on a managed service to reduce their operational overhead. However, feature parity across cloud platforms is nonexistent, making migrations complicated and expensive. For example, databases running on Amazon Aurora leverage Aurora-specific features not found on other clouds.

Proprietary Cloud Databases

With proprietary cloud databases, it’s very easy to get into a situation where data goes in and nothing ever comes out. These database services run only in their parent cloud and often provide very limited database functionality, requiring customers to integrate additional cloud services for anything beyond very simple use cases.

For example, many of the proprietary cloud NoSQL services offer little more than key-value functionality; users often need to pipe data into a cloud data warehouse for more complex queries and analytics. They also tend to be operationally immature, requiring additional integrations and services to address data protection and provide adequate performance visibility. And it doesn’t stop there. New features are often introduced in the form of new services, and before users know it, instead of relying on a single cloud database, they’re dependent on an ever-growing network of cloud services. This makes it all the more difficult to ever get data out.

The major cloud providers know that if they’re able to get your data in one of their proprietary database services, they’ve got you right where they want you. And while some may argue that organizations should actually embrace this new, ultimate form of vendor lock-in to get the most out of the cloud, that doesn’t leave customers with many options if their requirements, or if data regulations, change. What if the cloud provider you’re not using releases a game-changing service you need to edge out your competition? What if they open up a data center in a new geographic region you’ve prioritized and yours doesn’t have it on their roadmap? What if your main customer dictates that you should sever ties with your cloud provider? It’s happened before.

These are all scenarios where you could benefit from using a database that runs the same, everywhere.

The database that runs the same ... everywhere

As you move into the cloud, how you prevent data gravity from turning against you and limiting your flexibility is simple — use a database that runs the same in any environment.

One option to consider is MongoDB. As a database, it combines the flexibility of the document data model with sophisticated querying and indexing required by a wide range of use cases, from simple key-value to real-time aggregations powering analytics.

MongoDB is a distributed database designed for the cloud at its core. Redundancy for resilience, horizontal scaling, and geographic distribution are native to the database and easy to use.

And finally, MongoDB delivers a consistent experience regardless of where it is deployed:

  • For organizations not quite ready to migrate to the cloud, they can deploy MongoDB on premises behind their own firewalls and manage their databases using advanced operational tooling.
  • For those that are ready to migrate to the cloud, MongoDB Atlas delivers the database as a fully managed service across more than 50 regions on AWS, Azure, and Google Cloud Platform. Built-in automation of proven practices helps reduce the number of time-consuming database administration tasks that teams are responsible for, and prevents organizations from migrating their operational overhead into the cloud as well. Of course, if you want to self-manage MongoDB in the cloud, you can do so.
  • And finally, for teams that are well-versed in cloud services, MongoDB Atlas delivers a consistent experience across AWS, Azure, and Google, allowing the development of multi-cloud strategies on a single, unified data platform.

Data gravity will no doubt have a tremendous impact on how your IT resources coalesce and evolve in the cloud. But that doesn’t mean you have to get trapped. Choose a database that delivers a consistent experience across different environments and avoid going past the point of no return.


To learn more about MongoDB, check out our architecture guide.

You can also get started with a free 512 MB database managed by MongoDB Atlas here.

Header image via Paramount