Andrew Erlichson

2 results

Announcing the Next Generation Drivers for MongoDB

When developers get started with MongoDB, they are delighted to discover that it supports the programming languages they love. For as long as the database has been around, we’ve offered at least eight official drivers that help to make developers productive quickly. Moving in lockstep with the server to support all major changes, these drivers ensure that you’re always able to take advantage of the latest features in the database. Over the years, each driver engineering team has designed its own approach to querying the database, providing fault tolerance, exposing database options, offering an asynchronous interface, and implementing dozens of other features. Each driver has also accumulated its fair share of technical debt, having been initially designed for a much simpler version of MongoDB. Meanwhile, as MongoDB has proliferated and matured, the number of organizations that use more than one driver has increased. Although each driver strives to be idiomatic in the language it supports, there is little reason why the core CRUD API needs to vary across drivers, and after several years, there was some divergence. We needed a fresh start, driven by more formal specifications and informed by all the learning we have collectively done since MongoDB was first introduced in 2009. We are excited to announce that in the next few weeks we will be rolling out new drivers for all languages that build on everything we've learned. Stay tuned for major releases from the Java, .NET, Python, Node.js, and Ruby teams. PHP and Perl will follow soon after. An updated C++ driver is also in development. These new MongoDB drivers conform to published specifications. Server Selection - Deciding which server to send database operations to in a MongoDB deployment. Server Discovery and Monitoring - All the logic required to make a MongoDB application highly available. CRUD API - The API for how we Create, Read, Update and Delete data from MongoDB. Authentication - The rules for how to authenticate to MongoDB servers. Let us know what you think! As you are the developers using these drivers every day to access MongoDB, we both value and rely heavily on your input. You directly inform many of the choices made in the specifications. Please continue to give us feedback so we can make the best possible interface for you. If you’d like to report an issue in a specific driver, please do so in the individual driver project in JIRA (e.g. <a href="" target="_blank" JAVA, PYTHON ). See a full list of our JIRA projects here . If you’d like to make a feature request or report an issue common to all drivers, please do so in the DRIVERS project . We look forward to your feedback on this latest release. To learn more about what's new in MongoDB 3.0, download the white paper below. Discover MongoDB 3.0 About the Author - Andrew Andrew Erlichson manages the engineering teams that create the developer experience around MongoDB, including MongoDB drivers, Documentation and Education. Prior to MongoDB, Andrew was CEO and founder of Phanfare, an online photo hosting company, now part of Carbonite. Prior to Carbonite, he was founder and CEO of Flashbase, a web service that offered self-service online database forms and analysis tools. Flashbase was acquired by DoubleClick. At DoubleClick, Andrew was Vice President of Technology for the Research and Development group. He has worked at Mips Computer Systems, Silicon Graphics and BlackRock. Andrew received his A.B. from Dartmouth College and his M.S. and Ph.D. in Electrical Engineering from Stanford University.

March 25, 2015

Backup vs. Replication - Why Do You Need Both?

Sometimes when MongoDB users evaluate Cloud Manager , they question whether they need a backup if they are already using replication. Doesn’t replication already protect their data sufficiently? The answer can be reduced to a single distinction: replication is for availability and backups are for disaster recovery. Availability is a measure of percentage of time an application is working and accessible to end users at full capacity. ^1 Over time, infrastructures always suffer predictable failures that can impact this availability. All drives eventually fail, often without warning. So do power supplies, network cards, and other individual machine components. Sometimes your server runs out of space on its disk, at which point you’ll need to take it offline so you can install a bigger one. These issues impact cloud infrastructures as well, although in different ways. For example, you might decide you need to re-provision a machine with twice as much RAM. All these issues would lead to your system being unavailable. To insulate infrastructures from these completely inevitable events, we build redundancy into our systems. If a component becomes unavailable, a standby system takes over immediately and transparently. The MongoDB feature that enables redundancy is called Replication . For those not familiar, when you organize MongoDB into a replica set, all members of the set stay in sync. All database writes go to the primary member of the replica set and are quickly synced to all secondary members. In the event a primary node fails, an election takes place between the remaining members and a new primary is chosen. Members can also be removed from the set for maintenance, and reintegrated seamlessly. Replication ensures that in the event of a node failure, your application will still be available. Failover occurs automatically within seconds and with few exceptions will be invisible to end users. As a result, individual machines can fail, or be serviced, without impacting your availability. Disaster recovery, in contrast, is about dealing with events that are far less predictable, and which by their nature defeat the protections offered by redundancy. ^2 These events fall into two categories. The first category is human error and the second is catastrophic failure. In the category of human error, you have application bugs, deliberate hacking and accidental deletion or corruption of all data on the primary node. In all those cases, the errors introduced to the primary will propagate automatically to all members of your replica set, often within seconds! Given that human error is just as guaranteed as disk failure, this alone is enough reason to have backups. In the category of catastrophic failure, this includes scenarios that permanently destroy all members of your replica set. If you keep all replica set servers in the same data center, a fire that destroys the servers would qualify. So would a disgruntled employee who goes out and deliberately deletes the data. For these lower probability events, you want a relatively inexpensive solution that is well isolated from your production system. The MongoDB feature that enables protection against disasters is Backup . All backup systems share the characteristic that they offer a snapshot of your system at a past moment in time. That this snapshot is forever frozen is a critical feature of the backup. Backup snapshots should be stored far from your production system, far both physically, logically and administratively. Restoring the backup snapshot rolls back the clock on the event that caused the loss of all data across your replica set. Taking backups of MongoDB systems is incredibly easy, so you have no excuse for not developing a comprehensive backup strategy. MongoDB Cloud Manager , the hosted backup solution from MongoDB, Inc., provides a number of extra benefits. Cloud Manager further reduces the window of data loss vulnerability by continually recording your replica set oplog and providing point-in-time recovery for the most recent 24 hours (in addition to keeping the point-in-time snapshots). This is the holy grail for backup: recovery to an arbitrary moment in time before the error occurred. Cloud Manager is also separate from Amazon Web Services, a popular place to put production data. That means that failures that affect AWS will be uncorrelated to failures that might affect the Cloud Manager infrastructure (and we have our own levels of redundancy). That’s good for any insurance policy. Restoring from backup (your own, or Cloud Manager) is not instantaneous. You incur some downtime. But because backup covers catastrophes, this is acceptable. If you are restoring from backup, your application is already so broken that the cost of the downtime is less than the cost of running continuously with bad or missing data. To sum up: you want backup to cover the events that you hope to never see happen. (If it’s happening often, you need a new process.) Replication, on the other hand, offers fault tolerance against events that are fairly common and must be addressed without the user being aware of the event. Long story short, you need backup and replication. They address a different set of risks. More Information Replication in MongoDB MongoDB Backup Strategies Backing up with Cloud Manager Read More about Disaster Recovery

December 11, 2013