GIANT Stories at MongoDB

Cloud Manager Automation with DigitalOcean

Jay Gordon

Cloud

Cloud Manager automation is an easy way to create new MongoDB deployments without having to do much of the underlying configuration work. This post shows how you can get started using automation on your DigitalOcean droplets.

A note: This does not provide details on configuring security for your system. Please review our documentation on configuring MongoDB with iptables on Linux.

Initial setup:

Let’s begin by determining the use case and requirements for our application. Today I will start with a simple three-node replica set using 1GB droplets. First, ensure you’ve got credit and that the SSH key you plan on working with has been imported to your account.

https://webassets.mongodb.com/_com_assets/blog/tblr/66.media.tumblr.com--92ba6c5e72d79436fb127ccab9a304a8--tumblr_o6xclnmiHF1sdaytmo2_1280.png

First, I’ve created the three droplets required. I am using Ubuntu 14.04 LTS x64, but feel free to use whichever of the supported versions we have agents for. Cloud Manager supports many different OSs and has a number of agents available. DigitalOcean has a Private Networking option, making it easy to configure your deployment securely. When creating these new droplets, it’s a good idea to enable this service.

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--4f6dbaa2f6410ab2cb817f8eb1a436b9--tumblr_o6xclnmiHF1sdaytmo1_1280.png

Next, select a hostname. In this case we’ll use something that identifies where we are located and what we are doing:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--133120f5f2a414abb321a2257097bf1b--tumblr_o6xclnmiHF1sdaytmo3_1280.png

Configuring the Droplets:

Your droplets will need a few post-configuration options for our automation to begin working correctly.

Hostname and Private IPs

SSH into your droplets and validate their hostname:

root@mongodb-1gb-nyc2-01:~# hostname -f
mongodb-1gb-nyc2-01

root@mongodb-1gb-nyc2-02:~# hostname -f
mongodb-1gb-nyc2-02

root@mongodb-1gb-nyc2-03:~# hostname -f
mongodb-1gb-nyc2-03

Review each host’s private IP:

root@mongodb-1gb-nyc2-01:~# /sbin/ifconfig eth1 | grep 'inet addr:' | cut -d: -f2| cut -d' ' -f1
10.128.32.63

root@mongodb-1gb-nyc2-02:~# /sbin/ifconfig eth1 | grep 'inet addr:' | cut -d: -f2| cut -d' ' -f1
10.128.32.64

root@mongodb-1gb-nyc2-03:~# /sbin/ifconfig eth1 | grep 'inet addr:' | cut -d: -f2| cut -d' ' -f1
10.128.32.66

Create a hosts file entry for each droplet and then add the entries to each server’s /etc/hosts file:

10.128.32.63 mongodb-1gb-nyc2-01
10.128.32.64 mongodb-1gb-nyc2-02
10.128.32.66 mongodb-1gb-nyc2-03
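If you end up with more than a handful of droplets, typing these entries by hand gets tedious. Here is a minimal sketch in JavaScript (runnable with Node.js) that formats the lines from a list of private IPs and hostnames. The helper name formatHostsEntries is hypothetical, and the addresses are simply the ones from this walkthrough.

```javascript
// Hypothetical helper: turn { ip, name } pairs into /etc/hosts lines.
function formatHostsEntries(hosts) {
  return hosts
    .map(function (host) { return host.ip + " " + host.name; })
    .join("\n");
}

// Private IPs and hostnames taken from the example droplets above.
var droplets = [
  { ip: "10.128.32.63", name: "mongodb-1gb-nyc2-01" },
  { ip: "10.128.32.64", name: "mongodb-1gb-nyc2-02" },
  { ip: "10.128.32.66", name: "mongodb-1gb-nyc2-03" }
];

// Print the block so it can be pasted into /etc/hosts on each server.
console.log(formatHostsEntries(droplets));
```

The output is one "IP hostname" line per droplet, in the same order as the input list.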

Our entire /etc/hosts file example for mongodb-1gb-nyc2-01:

127.0.1.1 mongodb-1gb-nyc2-01 mongodb-1gb-nyc2-01
127.0.0.1 localhost

10.128.32.63 mongodb-1gb-nyc2-01
10.128.32.64 mongodb-1gb-nyc2-02
10.128.32.66 mongodb-1gb-nyc2-03

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Test that you can ping by name from a host:

root@mongodb-1gb-nyc2-01:~# ping mongodb-1gb-nyc2-02
PING mongodb-1gb-nyc2-02 (10.128.32.64) 56(84) bytes of data.
64 bytes from mongodb-1gb-nyc2-02 (10.128.32.64): icmp_seq=1 ttl=64 time=1.75 ms

Install the automation agent on all three:

Next, we install the automation agent on all three servers. Log in to Cloud Manager, open Deployment, click the “Agents” tab, and you’ll find a tab titled “Downloads & Settings.” Here you can review the different agents available. In this case we will use the Ubuntu (12.X, 14.X) - DEB agent.

https://webassets.mongodb.com/_com_assets/blog/tblr/65.media.tumblr.com--238776b002da267be18834f80241ed8a--tumblr_o6xclnmiHF1sdaytmo4_1280.png

Once you click this link, you’ll see a set of instructions on how to install the agent, enter the appropriate API key information, and prep your data directory for the MongoDB database files. To save time, you can repeat each step of these instructions in parallel across servers with the same OS. On Mac OS X I’ve found iTerm 2 useful for this, and ClusterSSH on Linux.

Install the package as the instructions say, and then start the agent across your three droplets:

root@mongodb-1gb-nyc2-01:~# sudo dpkg -i mongodb-mms-automation-agent-manager_2.7.1.1631-1_amd64.deb
Selecting previously unselected package mongodb-mms-automation-agent-manager.
(Reading database ... 86823 files and directories currently installed.)
Preparing to unpack mongodb-mms-automation-agent-manager_2.7.1.1631-1_amd64.deb ...
Unpacking mongodb-mms-automation-agent-manager (2.7.1.1631-1) ...
Setting up mongodb-mms-automation-agent-manager (2.7.1.1631-1) ...
Adding system user `mongodb' (UID 106) ...
Adding new user `mongodb' (UID 106) with group `nogroup' ...
Not creating home directory `/home/mongodb'.
Adding group `mongodb' (GID 114) ...
Done.
Adding user `mongodb' to group `mongodb' ...
Adding user mongodb to group mongodb
Done.
Processing triggers for ureadahead (0.100.0-16) …

Open the config file, add your API and Group information in the file as recommended:

sudo vi /etc/mongodb-mms/automation-agent.config

This file may look something like this. Just paste in the key values given to you in the installer instructions in the Cloud Manager UI:

#
# REQUIRED
# Enter your Group ID - It can be found at https://cloud.mongodb.com/settings/group
#
mmsGroupId=


#
# REQUIRED
# Enter your API key - It can be found at https://cloud.mongodb.com/settings/group
#
mmsApiKey=

Save the file and close it.

Prep your data directory

The default location we normally recommend is /data. You may choose a different directory, but it must be configured the same way:

Prepare the /data directory to store your MongoDB data. This directory must be owned by the mongodb user.

sudo mkdir -p /data
sudo chown mongodb:mongodb /data 

Start your agent (startup method may depend on your OS install):

And finally let’s start the agent!

sudo start mongodb-mms-automation-agent
mongodb-mms-automation-agent start/running

Monitoring Agent

Now that we have our automation agents installed, wait a few seconds and they should begin to report back to Cloud Manager (Deployment -> Agents):

https://webassets.mongodb.com/_com_assets/blog/tblr/66.media.tumblr.com--91634a0402cef6b27eb4b8099bedc28a--tumblr_o6xclnmiHF1sdaytmo7_1280.png

As the red box says, we’re not quite ready to get started. We still need to install a monitoring agent, which we can do right here from Cloud Manager’s UI. Be aware that, unlike the automation agent, the monitoring agent only requires installation on one server in your group. For additional information on this subject, you can review this detailed blog post on agents and how many are required: http://blog.cloud.mongodb.com/post/120124825840/how-many-agents-do-i-need-and-why/

First, go to the Servers tab under Deployment. Your droplets should be visible here:

https://webassets.mongodb.com/_com_assets/blog/tblr/66.media.tumblr.com--4a0795582c093353c2543d8b06b8b33a--tumblr_o6xcm60nFG1sdaytmo4_1280.png

Since we only require this monitoring agent on one of the nodes, we can go ahead and just click the “…” and then select the option to install the monitoring agent. We’ll do this on mongodb-1gb-nyc2-01:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--211345cebeeae974d2491b1a337a268a--tumblr_o6xcm60nFG1sdaytmo1_1280.png

Click the yellow bar at the top to review and deploy the change, and allow the agent to install.

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--b203824d38443aa670652fbc77c49eb7--tumblr_o6xclnmiHF1sdaytmo6_1280.png

Automate your installation:

Now we are finally ready to configure our systems to run with MongoDB as a replica set.

Click Deployment, find the green +ADD button, and select “New Replica Set”.

https://webassets.mongodb.com/_com_assets/blog/tblr/66.media.tumblr.com--005f3b277431ce6778011d127f9cb4c2--tumblr_o6xcm60nFG1sdaytmo3_500.png

This will bring up the replica set configuration settings within your Cloud Manager UI. Set your systems and MongoDB configuration here for your new deployment, then click Apply:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--fb72c575be92558fca8aad23aa0867b9--tumblr_o6xclnmiHF1sdaytmo5_1280.png

Now we have a preview of our new replica set. Check that everything looks right and then click Review & Deploy:

https://webassets.mongodb.com/_com_assets/blog/tblr/65.media.tumblr.com--26cba3fdf6b7a819d7397ebdda4537f2--tumblr_o6xcm60nFG1sdaytmo2_1280.png

Now Wait…

Now wait and you should see your deployment begin. Within a few minutes you should see your new DigitalOcean MongoDB cluster in the Cloud Manager UI:

https://webassets.mongodb.com/_com_assets/blog/tblr/66.media.tumblr.com--4b2472ae7fef7499524ab362c9114b6f--tumblr_o6xcm60nFG1sdaytmo5_1280.png

That’s it, MongoDB has been deployed on your droplets. At this time, it’s a good idea to begin the process of configuring your MongoDB authentication and security of your network. Once you’ve done these tasks, you can begin inserting your data and working to develop your application.

Jay Gordon is a Technical Account Manager with MongoDB and is available via our chat to discuss Cloud Manager at cloud.mongodb.com

Microsoft Azure + Cloud Manager

MongoDB

Cloud

Editor’s Note: Azure provisioning from Cloud Manager is labeled as “BETA” because it’s relatively new. We are making this feature available early so that we can explore usage patterns and gather feedback. If you would like to be added to the beta program for Azure testing please send an e-mail to beta-team@mongodb.com! There are lots of great incentives such as Cloud Manager discounts, Amazon gift cards, and cool MongoDB t-shirts and stickers!

We have introduced a very cool new feature to Cloud Manager - integration with Microsoft Azure Cloud! If you use Azure, you will now be able to use Cloud Manager to automatically provision your instances in the cloud and automatically install MongoDB on them with just a few clicks.

  1. To get started, log in to Cloud Manager, and choose Add->Provision New Servers, or build a new deployment and choose New Cloud Servers. You will now be able to choose Microsoft Azure as a provider.
    https://webassets.mongodb.com/_com_assets/blog/tblr/40.media.tumblr.com--df0e6fee519347b0e9472c6bd2330f01--tumblr_nwhcxaOVSP1sdaytmo2_1280.png
  2. Before we can provision instances in Azure for you, we’ll first need to establish a secure connection with your account. Download the management certificate from Cloud Manager and upload it to your Azure account.
  3. Provide your Azure Subscription ID. You’ll find the Subscription ID for your account in the Settings->Subscription section of Azure console.
  4. Next you’ll be able to configure your Azure instances.
    1. Choose Number of Servers
    2. Azure Region
    3. Operating System
    4. Instance Type
    5. Storage Type
    6. Volume Size
    7. Network Settings (Cloud Service, Virtual Network, Subnet)
      https://webassets.mongodb.com/_com_assets/blog/tblr/36.media.tumblr.com--30e8bfa2c3aa640c71f488e06fedda1b--tumblr_nwhcxaOVSP1sdaytmo1_1280.png
  5. Cloud Manager will then provision instances in your Azure account on which to host your chosen MongoDB deployment. During this time Cloud Manager will:
    1. Create the instances in Azure
    2. Configure your server according to MongoDB production best practices
    3. Apply the latest security patches to the instances
    4. Install an Automation Agent on each instance
      Please be patient as this process may take up to 10 minutes to complete.

Thanks for reading!

How Backup pricing works in Cloud Manager

MongoDB

Cloud

Many customers have asked us how we charge for MongoDB Cloud Manager backup. Your first 1GB per replica set is always free. After that, we charge:

  • $2.50 per GB per month
  • or $30 per GB per year (for prepaid customers)
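To make the arithmetic concrete, here is a rough sketch of the monthly figure in JavaScript (runnable with Node.js). It assumes only what is stated above: the first 1GB of each replica set is free and every GB after that costs $2.50 per month. The function name estimateMonthlyBackupCost is hypothetical, sizes are assumed to be already expressed in GB, and any rounding or proration your bill applies is not modeled.

```javascript
// Assumptions taken from the pricing above; rounding and proration are not modeled.
var FREE_GB_PER_REPLICA_SET = 1;
var PRICE_PER_GB_MONTH = 2.5;

// Hypothetical helper: takes one size (in GB) per replica set and
// returns the estimated monthly charge in dollars.
function estimateMonthlyBackupCost(replicaSetSizesGb) {
  return replicaSetSizesGb.reduce(function (total, sizeGb) {
    // Only the portion above the free allowance is billable.
    var billableGb = Math.max(0, sizeGb - FREE_GB_PER_REPLICA_SET);
    return total + billableGb * PRICE_PER_GB_MONTH;
  }, 0);
}

// One 5 GB replica set: 4 GB billable at $2.50 each.
console.log(estimateMonthlyBackupCost([5]).toFixed(2)); // prints "10.00"
```

Because the free allowance applies per replica set, a replica set under 1GB contributes nothing to the total no matter how large your other replica sets are.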

How do we calculate your data size (that is being backed up)? Total data size is calculated as the sum of dataSize plus indexSize. If you go into each database and check these values from the db.stats() function you can get the total dataset size. Here is a handy script you can run from the mongo shell that automates this calculation for each of your databases:

// Sum dataSize + indexSize for every database except "local",
// which Cloud Manager Backup does not back up.
var dbNames = [];
var sum = 0;
var dblist = db.getMongo().getDBs();
for (var i = 0; i < dblist.databases.length; i++) {
    if (dblist.databases[i].name !== "local") {
        dbNames.push(dblist.databases[i].name);
    }
}
// Format a byte count using decimal (1000-based) units.
function formatSizeUnits(bytes){
  if      (bytes>=1000000000) {bytes=(bytes/1000000000).toFixed(2)+' GB';}
  else if (bytes>=1000000)    {bytes=(bytes/1000000).toFixed(2)+' MB';}
  else if (bytes>=1000)       {bytes=(bytes/1000).toFixed(2)+' KB';}
  else if (bytes>1)           {bytes=bytes+' bytes';}
  else if (bytes>=1)          {bytes='1 byte';}
  else                        {bytes='0 bytes';}
  return bytes;
}
for (var i = 0; i < dbNames.length; i++) {
  // Call stats() once per database and reuse the result.
  var stats = db.getMongo().getDB(dbNames[i]).stats();
  var indexSize = stats.indexSize;
  var dataSize = stats.dataSize;
  var total = indexSize + dataSize;
  sum += total;
  print("db name: " + dbNames[i] + " | indexSize: " + formatSizeUnits(indexSize) + " | dataSize: " + formatSizeUnits(dataSize) + " | total: " + formatSizeUnits(total));
}
print("total size of all dbs: " + formatSizeUnits(sum));

You can also find the script at this Gist.

Here is some sample output of that script:

$ mongo getTotalSizeofDatabases.js
MongoDB shell version: 3.0.1
connecting to: customerDB:
db name: course | indexSize: 1.41 MB | dataSize: 2.40 MB | total: 3.81 MB
db name: course2 | indexSize: 8.18 KB | dataSize: 448 bytes | total: 8.62 KB
db name: test | indexSize: 829.97 MB | dataSize: 1.23 GB | total: 2.06 GB
total size of all dbs: 2.06 GB

Note that we don’t include the local database here. This is because Cloud Manager Backup does not back up this database, and therefore you will not be charged for its size.

Also, please note that if you are still running Cloud Manager Classic, our old pricing with oplog processing and data size still applies. Check your Billing/Subscriptions page under Administration for an estimate on how the new Cloud Manager pricing can work for you – many customers will save money under the new pricing.

Upgrade your MongoDB Deployment from 2.6.9 to 3.0.1 with Automation

MongoDB

Cloud

Using the MongoDB Management Service, you can now upgrade your MongoDB deployment safely and quickly from 2.6.9 to 3.0.1 via the user interface. To upgrade your own deployment, follow the tutorial below. As always, if you should run into any trouble, you can reach out to MongoDB via the MMS Support page.

Step 0. Considerations Before Upgrading

  • MongoDB 3.0.1 introduces significant changes to the authorization model. If you are using authSchemaVersion 1 (the schema used in MongoDB 2.4 and earlier), you must upgrade to authSchemaVersion 3 before proceeding to upgrade to 3.0.1. If using Automation, you can do the schema upgrade by clicking the wrench for your deployment. If you haven’t imported for Automation yet, here are the manual instructions.
  • You cannot upgrade directly from 2.4 to 3.0; you must upgrade to 2.6 first.
  • Driver compatibility: As long as you leave the authSchemaVersion on 3, and are still on MMAPv1, your old driver should continue to work. However, if you upgrade the authSchemaVersion to 5, or upgrade to WiredTiger, a MongoDB 3.0 compatible driver will be necessary. Many third-party tools will not work after you upgrade to authSchemaVersion 5. Check for MongoDB 3.0 compatible versions of your utilities.
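The first two considerations above can be written down as a small pre-flight check. The following is only an illustrative sketch in JavaScript (runnable with Node.js): the function name preUpgradeBlockers and the input shape { series, authSchemaVersion } are assumptions made for this example, not part of MMS or the mongo shell, and the returned list does not imply an order in which to resolve the items.

```javascript
// Hypothetical pre-flight check encoding the rules listed above.
// The input shape is an assumption for illustration only.
function preUpgradeBlockers(deployment) {
  var blockers = [];
  // Rule: 2.4 cannot go straight to 3.0.
  if (deployment.series === "2.4") {
    blockers.push("upgrade to 2.6 before upgrading to 3.0");
  }
  // Rule: authSchemaVersion 1 must be upgraded to 3 first.
  if (deployment.authSchemaVersion === 1) {
    blockers.push("upgrade authSchemaVersion to 3 before upgrading to 3.0.1");
  }
  return blockers;
}

// A 2.6 deployment already on authSchemaVersion 3 has nothing blocking it.
console.log(preUpgradeBlockers({ series: "2.6", authSchemaVersion: 3 }).length); // prints 0
```

You would still need to look up the real values yourself, for example from the deployment page in MMS.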

This is not a guide to upgrading to WiredTiger; a separate tutorial covers using Automation to change your storage engine to WiredTiger.

Step 1. Review Your Current Deployment

On the deployment page, you should see that your current replica set has recent pings, indicated by the green light to the right of each member, and has no startup warnings. I’ve set up a sharded cluster here, but the process is the same.

On the server tile page, each server where a node is located should have an automation agent already up and running.

If each server does not have an automation agent on it, you will need to download the automation agent from the Administration -> Agents page, install it, and make sure that you have imported your deployment for automation. If one of the hosts is unreachable, you will need to resolve this issue as well before being able to upgrade the replica set.

Step 2. Select the deployment and modify.

Click the wrench for your deployment to bring up the modification panel.

Click the version drop-down and select 3.0.1 and then click “Apply”. Review and deploy your changes. Now the Automation Agents will download the proper MongoDB version and do the upgrade in the proper order for you.

If you do not see 3.0.1 in your list of MongoDB versions, then it is not enabled in your Version Manager. You can enable it by going to “Deployment” and then selecting “Version Manager”. Add the versions you wish to upgrade to and click “Review & Deploy” and then “Confirm & Deploy”.

Step 3. Enjoy your updated deployment

If you are ready with 3.0-compatible drivers and utilities, you can now change the authSchemaVersion to 5 via the modifications sidebar from Step 2, above.

MongoDB Performance Optimization with MMS

MongoDB

Cloud

This is a guest post by Michael De Lorenzo, CTO at CMP.LY.

CMP.LY is a venture-funded startup offering social media monitoring, measurement, insight and compliance solutions. Our clients include Fortune 100 financial services, automotive, consumer packaged goods companies, as well as leading brands, advertising, social and PR agencies.

Our patented monitoring, measurement and insight (MMI) tool, CommandPost, provides real-time, actionable insights into social performance. Its structured methodology, unified cross-platform reporting and direct channel monitoring capabilities ensure marketers can capture and optimize the real value of their engagement, communities and brand advocates. All of CommandPost’s products have built-in compliance solutions including plain language disclosure URLs (such as rul.es, ter.ms, disclosur.es, paid-po.st, sponsored-po.st and many others).

MongoDB at CMP.LY

At CMP.LY, MongoDB provides the data backbone for all of CommandPost’s social media monitoring services. Our monitoring services collect details about social media content across multiple platforms, track all engagements with that content and build profiles around each user that engages. This amounts to thousands of writes hitting our MongoDB replica set every second across multiple collections. While our monitoring services are writing new and updated data to the database in the background, our clients are consuming the same data in real-time via our dashboards from those same collections.

More Insights Mean More Writes

With the launch of CommandPost, we expanded the number of data points our monitoring services collected and enhanced analysis of those we were already collecting. These changes saw our MongoDB deployment come under a heavier load than we had previously seen - especially in terms of the number of writes performed.

Increasing the number of data points collected also meant we had more interesting data for our clients to access. From a database perspective, this meant more reads for our system to handle. However, it appeared we had a problem - our API was slower than ever in returning the data clients requested.

We had been diligent about adding indexes and making sure the most frequent client-facing queries were covered, but reads were still terribly slow. We turned to our MongoDB Management Service dashboard for clues as to why.

MongoDB Management Service

By turning to MMS, we knew we would have a reliable source to provide insight into what our database was doing both before and after our updates to CommandPost. Most (if not all) of the stats and charts we typically pay attention to in MMS looked normal for our application and MongoDB deployment. As we worked our way through each metric, we finally came across one that had changed significantly: Lock Percentage.

Since releasing the latest updates to CommandPost, our deployment’s primary client-serving database saw its lock percentage jump from about 10% to a constant rate of 150-175%. This was a huge jump with a very negative impact on our application - API requests timed out, queries took minutes to complete and our client-facing applications became nearly unusable.

Why is Lock Percentage important?

A quick look at how MongoDB handles concurrency tells us exactly why Lock Percentage became so important for us.

MongoDB uses a readers-writer lock that allows concurrent read access to a database but gives exclusive access to a single write operation. When a read lock exists, many read operations may use this lock. However, when a write lock exists, a single write operation holds the lock exclusively, and no other read or write operations may share the lock.

Locks are “writer greedy,” which means writes have preference over reads. When both a read and write are waiting for a lock, MongoDB grants the lock to the write. As of version 2.2, MongoDB implements locks at a per-database granularity for most read and write operations.
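To see why this matters for reads, here is a toy model of a writer-greedy readers-writer lock in JavaScript (runnable with Node.js). It is a sketch for intuition only, not MongoDB’s implementation: the WriterGreedyLock name and its methods are made up for this example, and it assumes that a new read queues behind any waiting write.

```javascript
// Toy model of a writer-greedy readers-writer lock (not MongoDB's code).
function WriterGreedyLock() {
  this.activeReaders = 0;
  this.writerActive = false;
  this.waitingWriters = [];
  this.waitingReaders = [];
  this.log = []; // order in which operations were granted the lock
}

// Grant the lock to an operation and record it.
WriterGreedyLock.prototype._grant = function (op) {
  if (op.kind === "write") { this.writerActive = true; }
  else { this.activeReaders += 1; }
  this.log.push(op.name);
};

// Request the lock: the operation is granted immediately or queued.
WriterGreedyLock.prototype.request = function (kind, name) {
  var op = { kind: kind, name: name };
  if (kind === "write") {
    // A write needs exclusive access.
    if (!this.writerActive && this.activeReaders === 0) { this._grant(op); }
    else { this.waitingWriters.push(op); }
  } else {
    // Assumption: a read shares the lock only if no write holds it or is waiting.
    if (!this.writerActive && this.waitingWriters.length === 0) { this._grant(op); }
    else { this.waitingReaders.push(op); }
  }
};

// Release a held lock, then hand it to waiters, writes first.
WriterGreedyLock.prototype.release = function (kind) {
  if (kind === "write") { this.writerActive = false; }
  else { this.activeReaders -= 1; }
  if (this.writerActive || this.activeReaders > 0) { return; }
  if (this.waitingWriters.length > 0) {
    this._grant(this.waitingWriters.shift());
  } else {
    while (this.waitingReaders.length > 0) { this._grant(this.waitingReaders.shift()); }
  }
};

var demo = new WriterGreedyLock();
demo.request("read", "r1");   // granted
demo.request("write", "w1");  // waits for r1
demo.request("read", "r2");   // waits behind w1
demo.release("read");         // r1 done, w1 granted
demo.request("write", "w2");  // waits for w1
demo.release("write");        // w1 done, w2 granted ahead of r2
demo.release("write");        // w2 done, r2 finally granted
console.log(demo.log.join(" -> ")); // prints "r1 -> w1 -> w2 -> r2"
```

Note that r2 asked for the lock before w2 did, yet it is served last. With a steady stream of writes, reads in this model can wait a very long time.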

The “greediness” of our writes was not only keeping our clients from being able to access data (in any of our collections), but causing additional writes to be delayed.

Strategies to Reduce Lock Contention

Once we identified the collections most affected by the locking, we identified three possible remedies to the issue and worked to apply all of them.

Schema Changes

The collection that saw our greatest load (in terms of writes and reads) originally contained a few embedded documents and arrays that tended to make updating documents hard. We took steps to denormalize our schema and, in some cases, customized the _id attribute. Denormalization allowed us to model our data for atomic updates. Customizing the _id attribute allowed us to simplify our writes without additional queries or indexes by leveraging the existing index on the document’s _id attribute. Enabling atomic updates allowed us to simplify our application code and reduce the time spent in application write lock.
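As an illustration of the idea (not CMP.LY’s actual schema, which is not shown in this post), the sketch below builds the filter and update documents for a single atomic upsert keyed on a custom _id. It is plain JavaScript runnable with Node.js. The helper name buildEngagementUpsert, the "platform:userId" key format and the counter field names are all hypothetical.

```javascript
// Hypothetical helper: build one atomic upsert against a custom _id.
// The "platform:userId" key format and field names are assumptions.
function buildEngagementUpsert(platform, userId, field, amount) {
  var increments = {};
  increments[field] = amount;
  return {
    // Matching on _id uses the index every collection already has.
    filter: { _id: platform + ":" + userId },
    // $inc changes the counter in place, so no read-modify-write round trip.
    update: { $inc: increments },
    // Create the document on first sight instead of querying for it first.
    options: { upsert: true }
  };
}

var op = buildEngagementUpsert("twitter", "42", "likes", 1);
console.log(JSON.stringify(op));
// With a driver, the three parts would typically be passed to an update
// call on the collection; the exact method depends on your driver version.
```

Because the key can be computed from data the writer already has, the application never needs a separate lookup or a secondary index to find the document it wants to change.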

Use of Message Queues

To manage the flow of data, we refactored some writes to be managed using a Publish-Subscribe pattern. We chose to use Amazon’s SQS service to do this, but you could just as easily use Redis, Beanstalkd, IronMQ or any other message queue.

By implementing message queuing to control the flow of writes, we were able to spread the frequency of writes over a longer period of time. This became crucially important during times where our monitoring services came under higher-than-normal load.
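The smoothing effect is easy to see with an in-memory stand-in for the queue. This is a minimal sketch in JavaScript (runnable with Node.js), not our production code: WriteQueue, publish and drain are hypothetical names, and a real deployment would use a service such as SQS with a consumer that calls the database on each tick.

```javascript
// In-memory stand-in for a message queue; names are hypothetical.
function WriteQueue(maxPerTick) {
  this.maxPerTick = maxPerTick; // cap on writes released per tick
  this.pending = [];
}

// Producers enqueue a write instead of hitting the database directly.
WriteQueue.prototype.publish = function (write) {
  this.pending.push(write);
};

// A consumer calls this on a fixed interval and applies what it gets back.
// At most maxPerTick writes are returned, in FIFO order.
WriteQueue.prototype.drain = function () {
  return this.pending.splice(0, this.maxPerTick);
};

var queue = new WriteQueue(2);
for (var i = 1; i <= 5; i++) {
  queue.publish({ seq: i }); // a burst of five writes arrives at once
}

var ticks = [];
while (queue.pending.length > 0) {
  ticks.push(queue.drain().length);
}
console.log(ticks.join(", ")); // prints "2, 2, 1"
```

The burst of five writes reaches the database as three smaller batches, which is the trade being made: a little extra latency on writes in exchange for shorter, less frequent write locks.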

Multiple Databases

We also chose to take advantage of MongoDB’s per database locking by creating and moving write-heavy collections into separate databases. This allowed us to move non-client-facing collections into databases that didn’t need to be accessed by our API and client queries.

Splitting into multiple databases meant that only the database taking on an update needed to be locked, leaving all other databases to remain available to serve client requests.

How did things change?

The aforementioned changes yielded immediate results. The results were so drastic that many of our users commented to us that the application seemed faster and performed better. It wasn’t their imaginations - as you can see from the “after” Lock Percentage chart below, we reduced the value to about 50% on our primary client-serving database.

What’s Next?

In working with MongoDB Technical Services, we also identified one more strategy we intend to implement to further reduce our Lock Percentage - Sharding. Sharding will allow us to horizontally scale our write workload across multiple servers and easily add additional capacity to meet our performance targets.

We’re excited about the possibility of not just improving the performance of our MongoDB deployment, but offering our users faster access to their data and a better overall experience using CommandPost.

If you want to learn more about how to use MongoDB Management Service to identify potential issues with your MongoDB deployment, keep it healthy and keep your application running smoothly, attend my talk “Performance Tuning on the Fly at CMP.LY” at MongoDB World in New York City, on Tuesday, June 24th at 2:20pm in the New York East Ballroom.

Increasing MMS Security via Two-Factor Authentication

MongoDB

Cloud

As of May 28th, the MongoDB Management Service (MMS) requires Two Factor Authentication (2FA) for all MMS users. Two-factor authentication requires you to know your password and have a physical item that proves your identity. In our implementation, that second factor is your phone. So when you log in, after you enter your password correctly, MMS will prompt you for a code that proves you have your phone.

There are multiple ways to receive a 2FA code in real time:

  • Google Authenticator for Android or Apple iOS on your smartphone. Google Authenticator produces time-based codes that do not require a connection to the internet. You seed the Google Authenticator app by scanning a QR code shown to you by MMS during setup. Once seeded, the Google Authenticator app will show you the current code whenever it is running.
  • Text message to a cell phone number. When you set up your MMS account, you can provide a cell phone number to receive your 2FA codes. Whenever you need to log in, MMS will send you a code via SMS. SMS works well for most users; however, certain network providers and countries may impose delays on SMS messages. If you’re using text messaging, you’ll also have to have cell service whenever you want to log in to MMS. For example, you may want to log in on an airplane or when traveling internationally. In these cases, Google Authenticator is a good alternative since it does not require a network connection.
  • Voice call to a cell phone. This option is almost exactly like text messaging. When you try to log in, you will get an automated phone call that reads out the 2FA code required to log in.
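If you are curious how an authenticator app can produce codes without a network connection, the sketch below shows the general time-based one-time password scheme (TOTP, RFC 6238, built on HOTP from RFC 4226) in JavaScript for Node.js. It illustrates the public algorithm only and says nothing about how MMS stores or provisions secrets. The secret used here is the published RFC test value, not a real key.

```javascript
var crypto = require("crypto");

// HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter,
// followed by dynamic truncation to a short decimal code.
function hotp(secret, counter, digits) {
  var counterBuf = Buffer.alloc(8);
  counterBuf.writeUInt32BE(Math.floor(counter / 4294967296), 0); // high 32 bits
  counterBuf.writeUInt32BE(counter % 4294967296, 4);             // low 32 bits
  var hmac = crypto.createHmac("sha1", secret).update(counterBuf).digest();
  // The low 4 bits of the last byte pick which 4 bytes to use.
  var offset = hmac[hmac.length - 1] & 0x0f;
  var binary = ((hmac[offset] & 0x7f) << 24) |
               (hmac[offset + 1] << 16) |
               (hmac[offset + 2] << 8) |
               hmac[offset + 3];
  var code = String(binary % Math.pow(10, digits));
  while (code.length < digits) { code = "0" + code; } // left-pad with zeros
  return code;
}

// TOTP (RFC 6238): HOTP where the counter is the number of
// fixed-length time steps elapsed since the Unix epoch.
function totp(secret, unixSeconds, digits, stepSeconds) {
  return hotp(secret, Math.floor(unixSeconds / stepSeconds), digits);
}

// Published RFC test secret, used here only to check the sketch.
var secret = Buffer.from("12345678901234567890", "ascii");
console.log(totp(secret, 59, 8, 30)); // RFC 6238 test vector: prints "94287082"
```

Both sides only need the shared secret and a reasonably accurate clock, which is why the app keeps working on an airplane while SMS does not.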

As a backup, you can also generate recovery codes when setting up 2FA within MMS. These are longer codes that can be used in place of a 2FA code when you don’t have access to a phone or your Google Authenticator app. Each recovery code can be used exactly once, and you should save these codes in a secure place. Additionally, you can re-generate your recovery codes in your Two Factor Authentication link under Settings->Profile in MMS. When you generate new recovery codes, you invalidate previously generated ones.

MMS 2FA requires a little extra work but we believe that it provides a significantly improved level of security to MMS users. If you run into any problems setting up your 2FA, please reach out to the MMS Support team.