What’s New in MongoDB 3.2, Part 1: Extending Use Cases with New Pluggable Storage Engines
MongoDB 3.2 is now Generally Available (GA) and ready for production deployment! It is a “giant” release in every sense of the term – packing more features and enhancements than anything that has come before. This 3-part blog series aims to help you navigate everything that is new, and provide the most important resources to get you started:
- Part 1 covers the availability of new storage engines, and illustrates new use-cases served by MongoDB 3.2.
- Part 2 discusses features designed to support mission-critical applications, including document validation and the enhanced replication protocol.
- Part 3 concludes with new tools and integrations designed to support data analysts, DBAs, and operations teams working with MongoDB.
If you want to get the detail now on everything MongoDB 3.2 offers, download the What’s New white paper.
New Use Cases Served by MongoDB
For developers building increasingly complex data-driven apps, there is no longer a "one size fits all" database storage technology that will perform optimally for every type of application required by the business. Modern applications need to support a variety of services with different access patterns, security requirements and price/performance profiles – from high throughput in-memory operations, to real-time analytics, to managing highly sensitive data.
MongoDB 3.0 introduced a new flexible storage architecture, making it fast and easy for MongoDB and the ecosystem to build new pluggable storage engines that allow the database to be extended with new capabilities, and to be configured for specific workload requirements. Moving beyond the two original storage engines supported with the 3.0 release, MongoDB 3.2 now adds two new options to the mix. The supported storage engines comprise:
- The default WiredTiger storage engine. WiredTiger's granular concurrency control and native compression will provide the best all-around performance and storage efficiency for the broadest range of applications.
- The MMAPv1 engine, an improved version of the storage engine used in pre-3.x MongoDB releases.
- NEW: The Encrypted storage engine, protecting highly sensitive data, without the performance or management overhead of separate filesystem encryption.
- NEW: The In-Memory storage engine, delivering extreme performance and predictable latency coupled with real-time analytics for the most demanding applications.
MongoDB uniquely allows users to mix and match multiple storage engines within a single MongoDB cluster. This flexibility provides a simpler and more reliable approach to meeting diverse application needs for data. Traditionally, multiple database technologies would need to be managed to meet these needs, with complex, custom integration code to move data between the technologies, and to ensure consistent, secure access.
With MongoDB’s flexible storage architecture, the database automatically manages the movement of data between storage engine technologies using native replication. This approach significantly reduces developer and operational complexity when compared to running multiple distinct database technologies. Users can leverage the same MongoDB query language, data model, scaling, security, and operational tooling across different parts of their application, with each powered by the optimal storage engine.
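As a rough illustration of mixing engines within one replica set, the Python sketch below assembles `mongod` command lines for three members, two using the In-Memory engine and one using WiredTiger. The replica set name, ports and data paths are hypothetical, and the helper `mongod_args` is invented for this example; only the `mongod` flags themselves come from the server's command-line options.

```python
# Sketch: build mongod command lines for one replica set whose members
# run different storage engines. Names, ports and paths are hypothetical.

ALLOWED_ENGINES = {"wiredTiger", "mmapv1", "inMemory"}


def mongod_args(repl_set, port, dbpath, engine):
    """Return the argv list for a single replica set member."""
    if engine not in ALLOWED_ENGINES:
        raise ValueError("unknown storage engine: " + engine)
    return [
        "mongod",
        "--replSet", repl_set,
        "--port", str(port),
        "--dbpath", dbpath,
        "--storageEngine", engine,
    ]


# Two in-memory members for low latency, one disk-based member.
members = [
    mongod_args("rs0", 27017, "/data/rs0-a", "inMemory"),
    mongod_args("rs0", 27018, "/data/rs0-b", "inMemory"),
    mongod_args("rs0", 27019, "/data/rs0-c", "wiredTiger"),
]

for argv in members:
    print(" ".join(argv))
```

Replication between the members is handled by the database itself, so the application connects to the replica set in the usual way regardless of which engine each member runs.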
New Default MongoDB Storage Engine: WiredTiger
MongoDB 3.2 now uses WiredTiger as its default storage engine. When compared to the original MMAP storage engine used in earlier MongoDB releases, WiredTiger's more granular concurrency control and native compression improve performance by 7-10x, while reducing storage overhead by up to 80%. That combination makes WiredTiger a strong fit for a wide range of operational applications.
New MongoDB Encrypted Storage Engine
The frequency and severity of data breaches continue to escalate year on year. Research from PwC identified over 117,000 attacks against information systems every day in 2014, representing an increase of 48% over the previous year. With databases storing an organization’s most important information assets, securing them is top of mind for administrators.
With advanced authentication, authorization, auditing and network encryption security controls, MongoDB is widely used in regulated industries such as finance, retail, healthcare, education and government. Until now, however, protecting data stored “at-rest” on persistent storage required encryption to be implemented either at the application level, or via external filesystem and disk encryption solutions. By introducing additional technology into the stack, both of these approaches can add cost and complexity.
With the introduction of the Encrypted storage engine, protection of data at-rest now becomes an integral feature of the database. The raw database “plaintext” content is encrypted using an algorithm that takes a random encryption key as input and generates ciphertext that can only be read if decrypted with the decryption key. The process is entirely transparent to the application. MongoDB supports a variety of encryption schemes, with AES-256 (256-bit encryption) in CBC mode being the default. AES-256 in GCM mode is also supported. The encryption scheme can be configured for FIPS 140-2 compliance.
The storage engine encrypts each database with a separate key. The key-wrapping scheme in MongoDB wraps all of the individual internal database keys with one external master key for each server. The Encrypted storage engine supports two key management options – in both cases, the only key being managed outside of MongoDB is the master key:
- Local key management via a keyfile.
- Integration with a third party key management appliance via the KMIP protocol (recommended).
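The two key management options map onto different `mongod` startup flags. The following Python sketch shows one way to assemble them; the helper `encryption_args`, the file paths and the KMIP hostname are hypothetical, and the flag names should be checked against the documentation for your MongoDB Enterprise version.

```python
# Sketch: choose mongod encryption flags for either key management option.
# The helper name, paths and hostname are hypothetical.

def encryption_args(key_file=None, kmip_server=None, kmip_port=5696,
                    kmip_ca_file=None, kmip_client_cert=None):
    """Return mongod flags for local-keyfile or KMIP key management."""
    # Exactly one of the two options must be selected.
    if (key_file is None) == (kmip_server is None):
        raise ValueError("choose exactly one of key_file or kmip_server")

    args = ["--enableEncryption"]
    if key_file is not None:
        # Option 1: master key stored in a local keyfile.
        args += ["--encryptionKeyFile", key_file]
    else:
        # Option 2: master key held by a KMIP appliance.
        args += ["--kmipServerName", kmip_server, "--kmipPort", str(kmip_port)]
        if kmip_ca_file:
            args += ["--kmipServerCAFile", kmip_ca_file]
        if kmip_client_cert:
            args += ["--kmipClientCertificateFile", kmip_client_cert]
    return args


print(" ".join(["mongod"] + encryption_args(key_file="/etc/mongodb/master.key")))
print(" ".join(["mongod"] + encryption_args(
    kmip_server="kmip.example.net",
    kmip_ca_file="/etc/ssl/kmip-ca.pem",
    kmip_client_cert="/etc/ssl/mongod-client.pem",
)))
```

In both cases the flags are added to a `mongod` that runs the WiredTiger engine; encryption is enabled on top of it rather than selected as a separate `--storageEngine` value.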
Most regulatory requirements mandate that encryption keys be rotated and replaced with a new key at least once annually. MongoDB can achieve key rotation without incurring downtime by performing rolling restarts of the replica set. When using a KMIP appliance, the database files themselves do not need to be re-encrypted, thereby avoiding the significant performance overhead imposed by key rotation in other databases. Only the master key is rotated, and the internal database keystore is re-encrypted.
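To see why rotating the master key leaves the data files untouched, consider the toy Python model of key wrapping below. It is a conceptual sketch only: the `Keystore` class is hypothetical, and `toy_cipher` is a SHA-256 keystream XOR used as a stand-in for AES. It is not secure and is not how MongoDB implements encryption.

```python
import hashlib
import secrets


def toy_cipher(key, data):
    """XOR data with a SHA-256-derived keystream.

    Stand-in for AES, NOT secure. XOR is symmetric, so the same call
    both encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Keystore:
    """Hypothetical keystore: per-database keys wrapped by one master key."""

    def __init__(self):
        self.wrapped = {}  # database name -> wrapped database key

    def add_database(self, name, master_key):
        db_key = secrets.token_bytes(32)
        self.wrapped[name] = toy_cipher(master_key, db_key)
        return db_key

    def unwrap(self, name, master_key):
        return toy_cipher(master_key, self.wrapped[name])

    def rotate(self, old_master, new_master):
        # Only the small wrapped keys are re-encrypted, never the data.
        for name in list(self.wrapped):
            db_key = toy_cipher(old_master, self.wrapped[name])
            self.wrapped[name] = toy_cipher(new_master, db_key)


plaintext = b'{"order": 1, "total": 42}'
master_v1 = secrets.token_bytes(32)
store = Keystore()
orders_key = store.add_database("orders", master_v1)
data_file = toy_cipher(orders_key, plaintext)  # encrypted "database file"

master_v2 = secrets.token_bytes(32)
store.rotate(master_v1, master_v2)

# The data file bytes were never rewritten, yet remain readable
# through the keystore that is now wrapped with the new master key.
recovered = toy_cipher(store.unwrap("orders", master_v2), data_file)
print(recovered == plaintext)
```

Because each database key is unchanged by the rotation, the cost of rotating is proportional to the number of databases rather than to the volume of data stored.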
The Encrypted storage engine is based on WiredTiger, and so is designed for operational efficiency and performance:
- Document level concurrency control and compression.
- Support for Intel’s AES-NI equipped CPUs for acceleration of the encryption/decryption process.
- As documents are modified, only updated storage blocks need to be encrypted, rather than the entire database.
Based on user testing, the Encrypted storage engine minimizes performance overhead to around 15% (this can vary, based on data types being encrypted), which can be much less than the observed overhead imposed by some filesystem encryption solutions.
The Encrypted storage engine is available as part of MongoDB Enterprise Advanced. Refer to the documentation to learn more, and see a tutorial on how to configure the storage engine. Download the MongoDB Security Architecture guide for an overview of all MongoDB’s security controls.
Flexible In-Memory Computing with MongoDB
The advantages of in-memory computing are well understood. Data can be accessed in RAM nearly 100,000 times faster than retrieving it from disk, delivering orders-of-magnitude higher performance for the most demanding applications. Examples include real-time re-scoring of personalized product recommendations as users are browsing a site, or trading stocks in immediate response to market events.
With the addition of the new In-Memory engine based on WiredTiger, MongoDB users can now realize the performance advantages of in-memory computing, without trading away the rich query flexibility, real-time analytics, scalable capacity, or durability guarantees offered by conventional disk-based databases.
The benefits of storage engine flexibility extend beyond the boundaries of a single application. Unlike monolithic code bases of the past, modern applications typically comprise multiple services, each of which can have its own unique data access patterns and performance profiles. MongoDB’s storage architecture allows users to optimize for the requirements of each service. As illustrated by the e-commerce example in Figure 3, user data is managed by the In-Memory engine to provide the throughput and bounded latency essential for great customer experience. However, the product catalog’s data storage requirements exceed server memory capacity, so it is provisioned to another MongoDB replica set configured with the disk-based WiredTiger storage engine.
In this example, MongoDB’s flexible storage architecture means developers are freed from the complexity of having to use different in-memory and disk-based databases to support the e-commerce application. Administrators are freed from the complexity of having to configure and manage separate data layers. Instead, the application uses the same MongoDB database with each service powered by the storage engine best optimized for the use case.
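From the application's side, the e-commerce example might look something like the Python sketch below, where each service is pointed at the replica set running the engine that suits it. The service names, hostnames and replica set names are all hypothetical; the only thing taken as given is the standard `mongodb://` connection string format with its `replicaSet` option.

```python
# Sketch: map each service to the replica set that backs it.
# All names and hosts here are hypothetical.

SERVICE_CLUSTERS = {
    # User data: In-Memory engine for throughput and bounded latency.
    "user_data": {
        "replica_set": "rsInMemory",
        "hosts": ["mem1.example.net:27017", "mem2.example.net:27017"],
    },
    # Product catalog: larger than RAM, so disk-based WiredTiger.
    "product_catalog": {
        "replica_set": "rsCatalog",
        "hosts": ["disk1.example.net:27017", "disk2.example.net:27017"],
    },
}


def mongodb_uri(service):
    """Build the connection string for the replica set behind a service."""
    cluster = SERVICE_CLUSTERS[service]
    hosts = ",".join(cluster["hosts"])
    return "mongodb://{}/?replicaSet={}".format(hosts, cluster["replica_set"])


for name in SERVICE_CLUSTERS:
    print(name, "->", mongodb_uri(name))
```

Both services use the same driver, query language and tooling; only the connection string differs.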
The In-Memory storage engine is part of MongoDB Enterprise Advanced. It is available for beta testing now, and is scheduled to reach GA in early 2016.
That wraps up the first part of our 3-part blog series. Remember, you can get the detail now on everything MongoDB 3.2 offers by downloading the What’s New white paper.
Alternatively, if you’ve had enough of reading about it and want to get your hands on the code now, then:
To start using MongoDB 3.2 as quickly and efficiently as possible, bring in the experts. MongoDB’s consulting engineers can deliver a private training on 3.2 features tailored to your needs, then work with you to develop a customized upgrade plan for your deployment. Interested?
About the Author - Mat Keep
Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.