We can’t wait for this year’s MongoDB World and are working hard behind the scenes. We want to make sure our biggest event is a fun-filled learning experience and that attendees reflect our diverse community. As part of our commitment to changing the gender ratio in technology, we're excited to launch the Female Innovators at MongoDB World initiative.
Nominees will be notified of their nomination by April 15, 2016. The first 50 nominees to register will receive complimentary admission to MongoDB World, June 28-29 in NYC.
At the event they’ll be able to:
- Network with other female innovators through sessions and activities at the Women Who Code Lounge
- Meet the engineers behind the technology
- Learn best practices to amp up their skills and get the inside scoop on who’s doing what with MongoDB
- Explore community activities, including the Leaf Lounge, poster sessions, and our famous after party
Fill out the short nomination form to invite a woman who works or aspires to work in tech to attend our global conference. You can also nominate yourself! Make sure to act fast, as a limited number of tickets are available.
The deadline to submit a nomination is Friday, April 8. Eligible nominees will be notified of their acceptance status by April 15.
- Nominees must be 18 years of age or older and must identify as female.
- Open to new MongoDB World participants only.
- Can you nominate yourself? Absolutely!
View terms and conditions.
At-Rest Encryption in MongoDB 3.2: Features and Performance
Introduction

MongoDB 3.2 introduces a new option for at-rest data encryption. In this post, we take a closer look at the forces driving the need for increased encryption, the MongoDB features for encrypting your data, and the performance characteristics of the new Encrypted Storage Engine.

Data security is top of mind for many executives due to increased attacks as well as a series of data breaches in recent years that have negatively impacted several high-profile brands. For example, in 2015, a major health insurer was the victim of a massive data breach in which criminals gained access to the Social Security numbers of more than 80 million people, resulting in an estimated cost of $100M. In the end, one of the critical vulnerabilities was that the health insurer did not encrypt sensitive patient data stored at rest.

Data encryption is a key part of a comprehensive strategy to protect sensitive data. However, encrypting and decrypting data is potentially very resource intensive, so it is important to understand the performance characteristics of your encryption technology to accurately conduct capacity planning.

MongoDB 3.2: Delivering Native Encryption At-Rest

MongoDB 3.2 provides a comprehensive encryption solution that protects your data, both in-flight and at-rest. For encryption-in-flight, MongoDB uses SSL/TLS, which ensures secure communication between your database and client, as well as inter-cluster traffic between nodes. Learn more about MongoDB and SSL/TLS.

With version 3.2, MongoDB also includes a fully integrated encryption-at-rest solution that reduces cost and performance overhead. Encryption-at-rest is part of MongoDB Enterprise Advanced only, but is freely available for development and evaluation. We will take a closer look at this new option later in the post.

Before 3.2, the primary methods for encryption-at-rest were third-party applications that encrypt files at the application, file system, or disk level. These methods work well with MongoDB but can add extra cost, complexity, and overhead. Additionally, disk and file system encryption might not protect against all situations. While disk-level encryption protects against someone removing the physical drive from the machine, it does not protect against someone who has access to the machine and can read the mounted file system. Similarly, file system encryption protects the files themselves, but does not preclude someone from gaining unauthorized access through the application or database layer. Database encryption mitigates these problems by adding an extra layer of security: even if an administrator has access to the file system, they must first authenticate to the database before the data files can be decrypted.

MongoDB’s Encrypted Storage Engine supports a variety of encryption algorithms from the OpenSSL library. AES-256 in CBC mode is the default; other options include GCM mode, as well as FIPS mode for FIPS 140-2 compliance. Encryption is performed at the page level to provide optimal performance: instead of encrypting or decrypting the entire file or database for each change, only the modified pages need to be encrypted or decrypted.

Additionally, the Encrypted Storage Engine provides safe and secure management of the encryption keys. Each encrypted node contains an internal database key that is used to encrypt and decrypt the data files. The database key is wrapped with an external master key, which must be given to the node for it to initialize.
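The key hierarchy described above follows the familiar envelope-encryption pattern: the node keeps only a wrapped (encrypted) copy of its database key, so possession of the data files alone is not enough to read them. The following toy sketch illustrates that pattern using the Python cryptography package's Fernet recipe; it is purely illustrative and is not MongoDB's actual key-management code, and the key names and sample record are hypothetical.

```python
# Toy illustration of envelope encryption: a per-node database key encrypts
# the data, and an external master key "wraps" (encrypts) the database key.
# This is NOT MongoDB's implementation, only the general pattern it describes.
from cryptography.fernet import Fernet

master_key = Fernet.generate_key()      # held outside the node (e.g., a key appliance)
database_key = Fernet.generate_key()    # internal key used for the data files

# Wrap the database key with the master key before storing it alongside the data.
wrapped_database_key = Fernet(master_key).encrypt(database_key)

# Encrypt a "page" of data with the database key (hypothetical record).
page = b'{"patient_id": 1234, "ssn": "000-00-0000"}'
encrypted_page = Fernet(database_key).encrypt(page)

# At startup, the node is given the master key, unwraps the database key,
# and can then decrypt pages on demand.
recovered_database_key = Fernet(master_key).decrypt(wrapped_database_key)
assert Fernet(recovered_database_key).decrypt(encrypted_page) == page
```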
MongoDB uses operating system protection mechanisms, such as VirtualLock and mlock, to lock the process’s virtual memory space into memory, ensuring that keys are never written or paged to disk in unencrypted form.

Evaluating Performance

Encrypting and decrypting data requires additional resources, and administrators will want to understand the performance impact to adjust capacity planning accordingly. In our Encrypted Storage Engine benchmarking tests, we saw an average throughput overhead between 10% and 20%. Let’s take a closer look at some benchmark data for Insert-Only, Read-Only, and 50%-Read/50%-Insert workloads.

For our benchmark, we used Intel Xeon X5675 CPUs, which support the AES-NI instruction set, and ran the CPUs at high load (100%). We evaluated four different configurations: “Working Set Fits In Memory”, “Working Set Exceeds Memory”, “Encrypted”, and “Unencrypted”. The working set refers to the amount of data and indexes that are actively used by your system.

Let’s first look at an Insert-Only workload. With a high CPU load, we see an encryption overhead of roughly 16%.

Next, the Read-Only workload. We ran the benchmark under two scenarios: “Working Set Fits In Memory” and “Working Set Exceeds Memory”. From the benchmark results, the decryption overhead for a Read-Only workload ranges between 5% and 20%.

Lastly, for the 50%-Read/50%-Insert workload, the encryption overhead ranges between 12% and 20%.

In addition to throughput, latency is also a critical component of encryption overhead. In our benchmark, average latency overheads ranged from 6% to 30%. Though the average latency overhead was slightly higher than the throughput overhead, latencies were still very low, all under 1 ms.

| Workload | Unencrypted avg latency (µs) | Encrypted avg latency (µs) | % Overhead |
| --- | --- | --- | --- |
| Insert Only | 32.4 | 40.9 | -26.5% |
| Read Only, Working Set Fits In Memory | 230.5 | 245.0 | -6.3% |
| Read Only, Working Set Exceeds Memory | 447.0 | 565.8 | -26.6% |
| 50% Insert/50% Read, Working Set Fits In Memory | 276.1 | 317.4 | -15.0% |
| 50% Insert/50% Read, Working Set Exceeds Memory | 722.3 | 936.5 | -29.7% |

MongoDB Atlas Encryption At Rest

MongoDB Atlas is a database as a service that provides all the features of the database without the heavy lifting of operational tasks. Developers no longer need to worry about provisioning, configuration, patching, upgrades, backups, and failure recovery. Atlas offers elastic scalability, either by scaling up on a range of instance sizes or scaling out with automatic sharding, all with no application downtime.

MongoDB Atlas provides encryption of data in-flight over the network and at rest on disk. Data-at-rest can optionally be protected using encrypted data volumes. Encrypted data volumes secure your data without the need for you to build, maintain, and secure your own key management infrastructure.

Summary

In this post, we looked at a few workloads to determine the impact of encryption with MongoDB's new Encrypted Storage Engine. The results demonstrate that the Encrypted Storage Engine provides a secure way to encrypt your data-at-rest while maintaining exceptional performance. With the Encrypted Storage Engine and diligent capacity planning, you shouldn't have to make a tradeoff between high performance and strong security when encrypting data-at-rest. For users interested in a database as a service, MongoDB Atlas provides encrypted data volumes to ensure your data at rest is secure.
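As a quick sanity check, the overhead column in the latency table can be approximately reproduced from the two latency columns, assuming overhead is the relative change in average latency between the encrypted and unencrypted runs (small differences arise because the published percentages were presumably computed from unrounded measurements). A minimal sketch:

```python
# Recompute approximate latency overhead from the averages in the table above.
latencies_us = {  # (unencrypted, encrypted) average latency in microseconds
    "Insert only": (32.4, 40.9),
    "Read only, working set fits in memory": (230.5, 245.0),
    "Read only, working set exceeds memory": (447.0, 565.8),
    "50/50, working set fits in memory": (276.1, 317.4),
    "50/50, working set exceeds memory": (722.3, 936.5),
}

for workload, (plain, encrypted) in latencies_us.items():
    overhead = (encrypted - plain) / plain * 100  # percent increase vs. unencrypted
    print(f"{workload}: ~{overhead:.1f}% higher latency")
```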
Environment

These tests were conducted on bare-metal servers, each with the following specification:

- CPU: 3.06 GHz Intel Xeon Westmere (X5675, hex-core)
- RAM: 6 x 16 GB Kingston DDR3 2Rx4
- OS: Ubuntu 14.04 (64-bit)
- Network card: SuperMicro AOC-STGN-i2S
- Motherboard: SuperMicro X8DTN+_R2
- Document size: 1KB
- Workload: YCSB
- Version: MongoDB 3.2

Learn More

Learn more about encryption and all of the security features available for MongoDB by reading our MongoDB Security Architecture Guide.

Additional Resources

- Try MongoDB’s new Encrypted Storage Engine. Users can try the Encrypted Storage Engine free for unlimited development and evaluation.
- Read our documentation on installing MongoDB Enterprise 3.2.

About the Author - Jason Ma

Jason is a Principal Product Marketing Manager based in Palo Alto with extensive experience in technology hardware and software. He previously worked for SanDisk in Corporate Strategy, doing M&A and investments, and as a Product Manager on the InfiniFlash all-flash JBOF. Before SanDisk, he worked as a hardware engineer at Intel and Boeing. Jason has a BSEE from UC San Diego, an MSEE from the University of Southern California, and an MBA from UC Berkeley.
How to Get Started with MongoDB Atlas and Confluent Cloud
Every year, more and more applications are leveraging the public cloud and reaping the benefits of elastic scale and rapid provisioning. Forward-thinking companies such as MongoDB and Confluent have embraced this trend, building cloud-based solutions such as MongoDB Atlas and Confluent Cloud that work across all three major cloud providers.

Companies across many industries have been leveraging Confluent and MongoDB to drive their businesses forward for years. From insurance providers gaining a customer-360 view for a personalized experience to global retail chains optimizing logistics with a real-time supply chain application, the connected technologies have made it easier to build applications with event-driven data requirements. The latest iteration of this technology partnership simplifies getting started with a cloud-first approach, ultimately improving developers’ productivity when building modern cloud-based applications with data in motion.

Today, the MongoDB Atlas source and sink connectors are generally available within Confluent Cloud. With Confluent’s cloud-native service for Apache Kafka® and these fully managed connectors, setting up your MongoDB Atlas integration is simple. There is no need to install Kafka Connect or the MongoDB Connector for Apache Kafka, or to worry about scaling your deployment. All the infrastructure provisioning and management is taken care of for you, enabling you to focus on what brings you the most value: developing and releasing your applications rapidly.

Let’s walk through a simple example of taking data from a MongoDB cluster in Virginia and writing it into a MongoDB cluster in Ireland. We will use a Python application to write fictitious data into our source cluster.

Step 1: Set Up Confluent Cloud

First, if you’ve not done so already, sign up for a free trial of Confluent Cloud. You can then use the Quick Start for Apache Kafka using Confluent Cloud tutorial to create a new Kafka cluster. Once the cluster is created, you need to enable egress IPs and copy the list of IP addresses. This list of IPs will be used as an IP allow list in MongoDB Atlas. To locate this list, select “Cluster Settings” and then the “Networking” tab. Keep this tab open for future reference: you will need to copy these IP addresses into the Atlas cluster in Step 2.

Step 2: Set Up the Source MongoDB Atlas Cluster

For a detailed guide on creating your own MongoDB Atlas cluster, see the Getting Started with Atlas tutorial. For the purposes of this article, we have created an M10 MongoDB Atlas cluster using the AWS cloud in the us-east-1 (Virginia) data center to be used as the source, and an M10 MongoDB Atlas cluster using the AWS cloud in the eu-west-1 (Ireland) data center to be used as the sink.

Once your clusters are created, you will need to configure two settings in order to make a connection: database access and network access.

Network Access

You have two options for allowing secure network access from Confluent Cloud to MongoDB Atlas: you can use AWS PrivateLink, or you can secure the connection by allowing only specific IP connections from Confluent Cloud to your Atlas cluster. In this article, we cover securing via IPs. For information on setting up using PrivateLink, read the article Using the Fully Managed MongoDB Atlas Connector in a Secure Environment.

To accept external connections in MongoDB Atlas via specific IP addresses, launch the “IP Access List” entry dialog under the Network Access menu.
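If you would rather script this step than click through the UI, the project IP access list can also be managed through the Atlas Administration API. Below is a minimal sketch using Python's requests library; the project ID, API keys, and IP addresses are placeholders, and you should confirm the endpoint and payload against the current Atlas API documentation.

```python
# Sketch: add Confluent Cloud egress IPs to the Atlas project IP access list
# via the Atlas Administration API (v1.0). All values below are placeholders.
import requests
from requests.auth import HTTPDigestAuth

PROJECT_ID = "<your-atlas-project-id>"
PUBLIC_KEY = "<atlas-api-public-key>"
PRIVATE_KEY = "<atlas-api-private-key>"
confluent_egress_ips = ["192.0.2.10", "192.0.2.11"]  # copied from the Confluent Cloud Networking tab

url = f"https://cloud.mongodb.com/api/atlas/v1.0/groups/{PROJECT_ID}/accessList"
entries = [{"ipAddress": ip, "comment": "Confluent Cloud egress"} for ip in confluent_egress_ips]

# Atlas API keys authenticate with HTTP Digest.
resp = requests.post(url, auth=HTTPDigestAuth(PUBLIC_KEY, PRIVATE_KEY), json=entries)
resp.raise_for_status()
print("Added", len(entries), "access list entries")
```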
In this dialog, add all the IP addresses that were listed in Confluent Cloud in Step 1. Once all the egress IPs from Confluent Cloud are added, you can configure the user account that will be used to connect from Confluent Cloud to MongoDB Atlas. Configure user authentication in the Database Access menu.

Database Access

You can authenticate to MongoDB Atlas using username/password, certificates, or AWS identity and access management (IAM) authentication methods. To create a username and password that will be used for the connection from Confluent Cloud, select the “+ Add new Database User” option from the Database Access menu. Provide a username and password and make a note of this credential, because you will need it in Step 3 and Step 4 when you configure the MongoDB Atlas source and sink connectors in Confluent Cloud.

Note: In this article we are creating one credential and using it for both the MongoDB Atlas source and MongoDB Atlas sink connectors, because both of the clusters used in this article belong to the same Atlas project.

Now that the Atlas clusters are created, the Confluent Cloud egress IPs are added to the MongoDB Atlas allow list, and the database access credentials are defined, you are ready to configure the MongoDB Atlas source and MongoDB Atlas sink connectors in Confluent Cloud.

Step 3: Configure the Atlas Source

Now that you have two clusters up and running, you can configure the MongoDB Atlas connectors in Confluent Cloud. To do this, select “Connectors” from the menu and type “MongoDB Atlas” in the Filters textbox.

Note: When configuring the MongoDB Atlas source and MongoDB Atlas sink, you will need the connection hostname of your Atlas clusters. You can obtain this hostname from the MongoDB connection string. An easy way to do this is by clicking on the “Connect” button for your cluster, which launches the Connect dialog. You can choose any of the Connect options; for purposes of illustration, if you click on “Connect using MongoDB Compass,” you will see the full connection string, and the host portion of that string is the connection hostname you will use when configuring the source and sink connectors in Confluent Cloud.

Configuring the MongoDB Atlas Source Connector

Selecting “MongoDbAtlasSource” from the list of Confluent Cloud connectors presents you with several configuration options. The “Kafka Cluster credentials” choice is an API-based authentication that the connector will use for authentication with the Kafka broker. You can generate a new API key and secret by using the hyperlink. Recall that the connection host is obtained from the MongoDB connection string; details on how to find it are described at the beginning of this section.

The “Copy existing data” choice tells the connector, upon initial startup, to copy all the existing data in the source collection into the desired topic. Any changes to the data that occur during the copy process are applied once the copy is completed.

By default, messages from the MongoDB source are sent to the Kafka topic as strings. The connector also supports outputting messages in formats such as JSON and Avro. Recall that the MongoDB source connector reads change stream data as events, and the change stream event metadata is wrapped in the message sent to the Kafka topic. If you want just the message contents, you can set the “Publish full document only” option to true.
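To see what that setting changes, it can help to look at a raw change stream event. The short PyMongo sketch below watches a source collection directly (the URI is a placeholder, and the Stocks.StockData namespace matches the example data used in Step 4): each event carries metadata such as operationType and ns, while the fullDocument field holds the document itself, which is the portion “Publish full document only” keeps.

```python
# Peek at raw change stream events with PyMongo to see what the source
# connector consumes. Each event wraps metadata around the document; the
# "Publish full document only" setting keeps just the fullDocument field.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<source-cluster-host>/")  # placeholder URI
collection = client["Stocks"]["StockData"]

with collection.watch(full_document="updateLookup") as stream:
    for event in stream:
        print("metadata:", event["operationType"], event["ns"], event.get("documentKey"))
        print("fullDocument:", event.get("fullDocument"))
```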
Note: For source connectors, the number of tasks will always be “1”; otherwise, you run the risk of duplicate data being written to the topic, because multiple workers would effectively be reading from the same change stream. To scale the source, you could create multiple source connectors and define a pipeline that looks at only a portion of the collection; however, this capability for defining a pipeline is not yet available in Confluent Cloud.

Step 4: Generate Test Data

At this point, you can run your Python data generator application and start inserting data into the Stocks.StockData collection at your source. This will cause the connector to automatically create the topic “demo.Stocks.StockData.” To use the generator, git-clone the stockgenmongo folder in the above-referenced repository and launch the data generation as follows:

python stockgen.py -c "<MongoDB Connection URL>"

where the MongoDB connection URL is the full connection string obtained from the Atlas source cluster, for example:

mongodb+srv://kafkauser:<password>@<source-cluster-host>/

Note: You might need to pip install pymongo and dnspython first.

If you do not wish to use this data generator, you will need to create the Kafka topic before configuring the MongoDB Atlas sink. You can do this by using the Add a Topic dialog in the Topics tab of the Confluent Cloud administration portal.

Step 5: Configure the MongoDB Atlas Sink

Selecting “MongoDB Atlas Sink” from the list of Confluent Cloud connectors presents you with several configuration options. After you pick the Kafka topic to source data from, you will be presented with additional options. Because you chose to write your data in the source by using JSON, you need to select “JSON” in the input message format. The Kafka API key is an API key and secret used for connector authentication with Confluent Cloud. Recall that you obtain the connection host from the MongoDB connection string; details on how to find this are described at the beginning of Step 3.

The “Connection details” section allows you to define behavior such as creating a new document for every topic message or updating an existing document based upon a value in the message. These behaviors are known as document ID and write model strategies. For more information, check out the MongoDB Connector for Apache Kafka sink documentation. If the order of the data in the sink collection is not important, you could spin up multiple tasks to gain an increase in write performance.

Step 6: Verify Your Data Arrived at the Sink

You can verify that the data has arrived at the sink via the Atlas web interface. Navigate to the collection data via the Collections button. Now that your data is in Atlas, you can leverage many of the Atlas platform capabilities, such as Atlas Search, Atlas Online Archive for easy data movement to low-cost storage, and MongoDB Charts for point-and-click data visualization. For example, a chart can be created in about one minute using the data generated in the sink cluster.

Summary

Apache Kafka and MongoDB help power many strategic business use cases, such as modernizing legacy monolithic systems, single views, batch processing, and event-driven architectures, to name a few. Today, Confluent Cloud and MongoDB Atlas provide fully managed solutions that enable you to focus on the business problem you are trying to solve instead of spinning your wheels on infrastructure configuration and maintenance. Register for our joint webinar to learn more!