MongoDB’s Diversity Scholarship program supports members of groups who are underrepresented in the technology industry. This includes, but is not limited to, people who identify as African American, Hispanic, LGBTQ, women, low-income, and people with disabilities who may not otherwise have the opportunity to attend MongoDB events.
Eligible candidates can apply online. Hurry, the applications close this Friday, April 8!
Diversity Scholarship recipients receive:
- Complimentary admission to MongoDB World
- Complimentary admission to a pre-conference workshop of their choice
- A MongoDB certification voucher
- Three-month access to paid MongoDB University courses
Additionally, scholarship recipients may be featured in a blog post.
Applicants must be 18 years old or older, and must belong to a group that is underrepresented in the technology industry.
Scholarships are awarded based on a combination of need and impact. Selection will be made by a committee that will review each application. All application info will be kept confidential. Recipients will be notified by April 15.
Don’t qualify, but would like to help? You can contribute to the Diversity Scholarship!
While MongoDB World registration is open, we're raising funds to support Diversity Scholarship recipients. There’s a donation opportunity when registering for the conference. Contributors will be listed as Diversity Champions on our website, unless otherwise requested.
Contact email@example.com with any questions.
Running MongoDB as a Microservice with Docker and Kubernetes
Update – November 2018 This post is now 2.5 years old, and neither MongoDB nor Kubernetes have been standing still! In particular, Kubernetes has introduced StatefulSets and we've introduced the MongoDB Enterprise Operator for Kubernetes . Both of these capabilities make working with MongoDB in Kubernetes much simpler and more robust. Read this post for the state-of-the-art in running MongoDB in Kubernetes . Introduction Want to try out MongoDB on your laptop? Execute a single command and you have a lightweight, self-contained sandbox; another command removes all traces when you're done. Need an identical copy of your application stack in multiple environments? Build your own container image and let your development, test, operations, and support teams launch an identical clone of your environment. Containers are revolutionizing the entire software lifecycle: from the earliest technical experiments and proofs of concept through development, test, deployment, and support. Read the Enabling Microservices: Containers & Orchestration Explained white paper . Orchestration tools manage how multiple containers are created, upgraded and made highly available. Orchestration also controls how containers are connected to build sophisticated applications from multiple, microservice containers. The rich functionality, simple tools, and powerful APIs make container and orchestration functionality a favorite for DevOps teams who integrate them into Continuous Integration (CI) and Continuous Delivery (CD) workflows. This post delves into the extra challenges you face when attempting to run and orchestrate MongoDB in containers and illustrates how these challenges can be overcome. Considerations for MongoDB Running MongoDB with containers and orchestration introduces some additional considerations: MongoDB database nodes are stateful. In the event that a container fails, and is rescheduled, it's undesirable for the data to be lost (it could be recovered from other nodes in the replica set, but that takes time). To solve this, features such as the Volume abstraction in Kubernetes can be used to map what would otherwise be an ephemeral MongoDB data directory in the container to a persistent location where the data survives container failure and rescheduling. MongoDB database nodes within a replica set must communicate with each other – including after rescheduling. All of the nodes within a replica set must know the addresses of all of their peers, but when a container is rescheduled, it is likely to be restarted with a different IP address. For example, all containers within a Kubernetes Pod share a single IP address, which changes when the pod is rescheduled. With Kubernetes, this can be handled by associating a Kubernetes Service with each MongoDB node, which uses the Kubernetes DNS service to provide a hostname for the service that remains constant through rescheduling. Once each of the individual MongoDB nodes is running (each within its own container), the replica set must be initialized and each node added. This is likely to require some additional logic beyond that offered by off the shelf orchestration tools. Specifically, one MongoDB node within the intended replica set must be used to execute the rs.initiate and rs.add commands. If the orchestration framework provides automated rescheduling of containers (as Kubernetes does) then this can increase MongoDB's resiliency since a failed replica set member can be automatically recreated, thus restoring full redundancy levels without human intervention. It should be noted that while the orchestration framework might monitor the state of the containers, it is unlikely to monitor the applications running within the containers, or backup their data. That means it's important to use a strong monitoring and backup solution such as MongoDB Cloud Manager , included with MongoDB Enterprise Advanced and MongoDB Professional . Consider creating your own image that contains both your preferred version of MongoDB and the MongoDB Automation Agent . Implementing a MongoDB Replica Set using Docker and Kubernetes As described in the previous section, distributed databases such as MongoDB require a little extra attention when being deployed with orchestration frameworks such as Kubernetes. This section goes to the next level of detail, showing how this can actually be implemented. We start by creating the entire MongoDB replica set in a single Kubernetes cluster (which would normally be within a single data center – that clearly doesn't provide geographic redundancy). In reality, little has to be changed to run across multiple clusters and those steps are described later. Each member of the replica set will be run as its own pod with a service exposing an external IP address and port. This 'fixed' IP address is important as both external applications and other replica set members can rely on it remaining constant in the event that a pod is rescheduled. The following diagram illustrates one of these pods and the associated Replication Controller and service. **Figure 1:** MongoDB Replica Set member configured as a Kubernetes Pod and exposed as a service Stepping through the resources described in that configuration we have: Starting at the core there is a single container named mongo-node1 . mongo-node1 includes an image called mongo which is a publicly available MongoDB container image hosted on Docker Hub . The container exposes port 27107 within the cluster. The Kubernetes volumes feature is used to map the /data/db directory within the connector to the persistent storage element named mongo-persistent-storage1 ; which in turn is mapped to a disk named mongodb-disk1 created in the Google Cloud. This is where MongoDB would store its data so that it is persisted over container rescheduling. The container is held within a pod which has the labels to name the pod mongo-node and provide an (arbitrary) instance name of rod . A Replication Controller named mongo-rc1 is configured to ensure that a single instance of the mongo-node1 pod is always running. The LoadBalancer service named mongo-svc-a exposes an IP address to the outside world together with the port of 27017 which is mapped to the same port number in the container. The service identifies the correct pod using a selector that matches the pod's labels. That external IP address and port will be used by both an application and for communication between the replica set members. There are also local IP addresses for each container, but those change when containers are moved or restarted, and so aren't of use for the replica set. The next diagram shows the configuration for a second member of the replica set. **Figure 2:** Second MongoDB Replica Set member configured as a Kubernetes Pod 90% of the configuration is the same, with just these changes: The disk and volume names must be unique and so mongodb-disk2 and mongo-persistent-storage2 are used The Pod is assigned a label of instance: jane and name: mongo-node2 so that the new service can distinguish it (using a selector) from the rod Pod used in Figure 1. The Replication Controller is named mongo-rc2 The Service is named mongo-svc-b and gets a unique, external IP address (in this instance, Kubernetes has assigned 220.127.116.11 ) The configuration of the third replica set member follows the same pattern and the following figure shows the complete replica set: **Figure 3:** Full Replica Set member configured as a Kubernetes Service Note that even if running the configuration shown in Figure 3 on a Kubernetes cluster of three or more nodes, Kubernetes may (and often will) schedule two or more MongoDB replica set members on the same host. This is because Kubernetes views the three pods as belonging to three independent services. To increase redundancy (within the zone), an additional headless service can be created. The new service provides no capabilities to the outside world (and will not even have an IP address) but it serves to inform Kubernetes that the three MongoDB pods form a service and so Kubernetes will attempt to schedule them on different nodes. **Figure 4:** Headless service to avoid co-locating of MongoDB replica set members The actual configuration files and the commands needed to orchestrate and start the MongoDB replica set can be found in the Enabling Microservices: Containers & Orchestration Explained white paper . In particular, there are some special steps required to combine the three MongoDB instances into a functioning, robust replica set which are described in the paper. Multiple Availability Zone MongoDB Replica Set There is risk associated with the replica set created above in that everything is running in the same GCE cluster, and hence in the same availability zone. If there were a major incident that took the availability zone offline, then the MongoDB replica set would be unavailable. If geographic redundancy is required, then the three pods should be run in three different availability zones or regions. Surprisingly little needs to change in order to create a similar replica set that is split between three zones – which requires three clusters. Each cluster requires its own Kubernetes YAML file that defines just the pod, Replication Controller and service for one member of the replica set. It is then a simple matter to create a cluster, persistent storage, and MongoDB node for each zone. **Figure 5:** Replica set running over multiple availability zones Next Steps To learn more about containers and orchestration – both the technologies involved and the business benefits they deliver – read the Enabling Microservices: Containers & Orchestration Explained white paper . The same paper provides the complete instructions to get the replica set described in this post up and running on Docker and Kubernetes in the Google Container Engine. Interested in learning more about Microservices? Microservices Resources About the Author - Andrew Morgan Andrew is a Principal Product Marketing Manager working for MongoDB. He joined at the start last summer from Oracle where he spent 6+ years in product management, focused on High Availability. He can be contacted @andrewmorgan or through comments on his blog ( clusterdb.com ).
How DataSwitch And MongoDB Atlas Can Help Modernize Your Legacy Workloads
Data modernization is here to stay, and DataSwitch and MongoDB are leading the way forward. Research strongly indicates that the future of the Database Management System (DBMS) market is in the cloud, and the ideal way to shift from an outdated, legacy DBMS to a modern, cloud-friendly data warehouse is through data modernization. There are a few key factors driving this shift. Increasingly, companies need to store and manage unstructured data in a cloud-enabled system, as opposed to a legacy DBMS which is only designed for structured data. Moreover, the amount of data generated by a business is increasing at a rate of 55% to 65% every year and the majority of it is unstructured. A modernized database that can improve data quality and availability provides tremendous benefits in performance, scalability, and cost optimization. It also provides a foundation for improving business value through informed decision-making. Additionally, cloud-enabled databases support greater agility so you can upgrade current applications and build new ones faster to meet customer demand. Gartner predicts that by 2022, 75% of all databases will be on the cloud – either by direct deployment or through data migration and modernization. But research shows that over 40% of migration projects fail. This is due to challenges such as: Inadequate knowledge of legacy applications and their data design Complexity of code and design from different legacy applications Lack of automation tools for transforming from legacy data processing to cloud-friendly data and processes It is essential to harness a strategic approach and choose the right partner for your data modernization journey. We’re here to help you do just that. Why MongoDB? MongoDB is built for modern application developers and for the cloud era. As a general purpose, document-based, distributed database, it facilitates high productivity and can handle huge volumes of data. The document database stores data in JSON-like documents and is built on a scale-out architecture that is optimal for any kind of developer who builds scalable applications through agile methodologies. Ultimately, MongoDB fosters business agility, scalability and innovation. Key MongoDB advantages include: Rich JSON Documents Powerful query language Multi-cloud data distribution Security of sensitive data Quick storage and retrieval of data Capacity for huge volumes of data and traffic Design supports greater developer productivity Extremely reliable for mission-critical workloads Architected for optimal performance and efficiency Key advantages of MongoDB Atlas , MongoDB’s hosted database as a service, include: Multi-cloud data distribution Secure for sensitive data Designed for developer productivity Reliable for mission critical workloads Built for optimal performance Managed for operational efficiency To be clear, JSON documents are the most productive way to work with data as they support nested objects and arrays as values. They also support schemas that are flexible and dynamic. MongoDB’s powerful query language enables sorting and filtering of any field, regardless of how nested it is in a document. Moreover, it provides support for aggregations as well as modern use cases including graph search, geo-based search and text search. Queries are in JSON and are easy to compose. MongoDB provides support for joins in queries. MongoDB supports two types of relationships with the ability to reference and embed. It has all the power of a relational database and much, much more. Companies of all sizes can use MongoDB as it successfully operates on a large and mature platform ecosystem. Developers enjoy a great user experience with the ability to provision MongoDB Atlas clusters and commence coding instantly. A global community of developers and consultants makes it easy to get the help you need, if and when you need it. In addition, MongoDB supports all major languages and provides enterprise-grade support. Why DataSwitch as a partner for MongoDB? Automated schema re-design, data migration & code conversion DataSwitch is a trusted partner for cost-effective, accelerated solutions for digital data transformation, migration and modernization through a modern database platform. Our no-code and low-code solutions along with cloud data expertise and unique, automated schema generation accelerates time to market. We provide end-to-end data, schema and process migration with automated replatforming and refactoring, thereby delivering: 50% faster time to market 60% reduction in total cost of delivery Assured quality with built-in best practices, guidelines and accuracy Data modernization: How “DataSwitch Migrate” helps you migrate from RDBMS to MongoDB DataSwitch Migrate (“DS Migrate”) is a no-code and low-code toolkit that leverages advanced automation to provide intuitive, predictive and self-serviceable schema redesign from a traditional RDBMS model to MongoDB’s Document Model with built-in best practices. Based on data volume, performance, and criticality, DS Migrate automatically recommends the appropriate ETTL (Extract, Transfer, Transform & Load) data migration process. DataSwitch delivers data engineering solutions and transformations in half the timeframe of the existing typical data modernization solutions. Consider these key areas: Schema redesign – construct a new framework for data management. DS Migrate provides automated data migration and transformation based on your redesigned schema, as well as no-touch code conversion from legacy data scripts to MongoDB Atlas APIs. Users can simply drag and drop the schema for redesign and the platform converts it to a document-based JSON structure by applying MongoDB modeling best practices. The platform then automatically migrates data to the new, re-designed JSON structure. It also converts the legacy database script for MongoDB. This automated, user-friendly data migration is faster than anything you’ve ever seen. Here’s a look at how the schema designer works. Refactoring – change the data structure to match the new schema. DS Migrate handles this through auto code generation for migrating the data. This is far beyond a mere lift and shift. DataSwitch takes care of refactoring and replatforming (moving from the legacy platform to MongoDB) automatically. It is a game-changing unique capability to perform all these tasks within a single platform. Security – mask and tokenize data while moving the data from on-premise to the cloud. As the data is moving to a potentially public cloud, you must keep it secure. DataSwitch’s tool has the capability to configure and apply security measures automatically while migrating the data. Data Quality – ensure that data is clean, complete, trustworthy, consistent. DataSwitch allows you to configure your own quality rules and automatically apply them during data migration. In summary: first, the DataSwitch tool automatically extracts the data from an existing database, like Oracle. It then exports the data and stores it locally before zipping and transferring it to the cloud. Next, DataSwitch transforms the data by altering the data structure to match the re-designed schema, and applying data security measures during the transform step. Lastly, DS Migrate loads the data and processes it into MongoDB in its entirety. Process Conversion Process conversion, where scripts and process logic are migrated from legacy DBMS to a modern DBMS, is made easier thanks to a high degree of automation. Minimal coding and manual intervention are required and the journey is accelerated. It involves: DML – Data Manipulation Language CRUD – typical application functionality (Create, Read, Update & Delete) Converting to the equivalent of MongoDB Atlas API Degree of automation DataSwitch provides during Migration Schema Migration Activities DS Automation Capabilities Application Data Usage Analysis 70% 3NF to NoSQL Schema Recommendation 60% Schema Re-Design Self Services 50% Predictive Data Mapping 60% Process Migration Activities DS Automation Capabilities CRUD based SQL conversion (Oracle, MySQL, SQLServer, Teradata, DB2) to MongoDB API 70% Data Migration Activities DS Automation Capabilities Migration Script Creation 90% Historical Data Migration 90% 2 Catch Load 90% DataSwitch Legacy Modernization as a Service (LMaas): Our consulting expertise combined with the DS Migrate tool allows us to harness the power of the cloud for data transformation of RDBMS legacy data systems to MongoDB. Our solution delivers legacy transformation in half the time frame through pay-per-usage. Key strengths include: ● Data Architecture Consulting ● Data Modernization Assessment and Migration Strategy ● Specialized Modernization Services DS Migrate Architecture Diagram Contact us to learn more.