Why MongoDB for an Operational Data Layer?
MongoDB meets the required capabilities for an Operational Data Layer. When you choose MongoDB as the foundation for an ODL, you’re investing in the best technology, people, and processes for your system of innovation.
Technology
MongoDB is the best way for an ODL to work with data
The core of an Operational Data Layer is the data it contains, and MongoDB is the best way to work with that data.
Ease: MongoDB’s document model makes it simple to model — or remodel — data in a way that fits the needs of your applications. Documents are a natural way to describe data. They present a single data structure, with related data embedded as sub-documents and arrays. This allows documents to be closely aligned to the structure of objects in an application. As a result, it’s simpler and faster for developers to model how data in the application will map to data stored in the database. In addition, MongoDB guarantees the multi-record ACID transactional semantics that developers are familiar with, making it easier to reason about data.
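To make the document model concrete, here is a minimal sketch using plain Python dicts to stand in for BSON documents. All field names and values are illustrative, not a real schema:

```python
# A sketch of how a customer entity might be modeled as a single
# MongoDB-style document (shown here as a plain Python dict).
# All field names below are illustrative.
customer = {
    "_id": "cust-1001",
    "name": {"first": "Ada", "last": "Lovelace"},
    "email": "ada@example.com",
    # Related data is embedded as an array of sub-documents, so the
    # application object maps directly onto the stored structure.
    "orders": [
        {"order_id": "ord-1", "total": 120.50, "items": ["keyboard", "mouse"]},
        {"order_id": "ord-2", "total": 35.00, "items": ["cable"]},
    ],
}

# The whole entity travels together: reaching any part of it is a
# simple traversal, not a join.
print(customer["orders"][0]["total"])  # 120.5
```

Because the document mirrors the application object, there is no separate mapping layer to design or maintain.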
Flexibility: A flexible data model is essential to integrate multiple source systems into a single ODL. With MongoDB, there’s no need to pre-define a schema. Documents are polymorphic: Fields can vary from document to document within a single collection. For example, all documents that describe customers might contain the customer ID and the last date they purchased a product or service, but only some of these documents might contain the user’s social media handle or location data from a mobile app. This makes it possible to merge data from source systems storing records on overlapping but non-identical sets of entities. On the consuming side, MongoDB’s flexibility makes it easy to alter the data model as needed to meet the requirements of new applications being built on the ODL.
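The merging of overlapping source records into one polymorphic document can be sketched as follows; the source systems and field names are hypothetical:

```python
# Sketch: merging records for the same customer from two source systems
# into one polymorphic document. Field names are illustrative.
crm_record = {"customer_id": 42, "last_purchase": "2019-03-01"}
mobile_record = {"customer_id": 42, "location": {"lat": 51.5, "lon": -0.1}}

def merge_records(*records):
    """Union all fields; documents in the same collection may differ."""
    merged = {}
    for rec in records:
        merged.update(rec)
    return merged

doc = merge_records(crm_record, mobile_record)
# A customer seen only in the CRM would simply lack the 'location' field;
# no schema change is needed to accommodate either shape.
print(sorted(doc))  # ['customer_id', 'last_purchase', 'location']
```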
Speed: Using MongoDB for an ODL means you can get better performance when accessing data, and write less code to do so. In most legacy systems, accessing data for an entity, such as a customer, typically requires JOINing multiple tables together. JOINs entail a performance penalty, even when optimized — which takes time, effort, and advanced SQL skills. The situation is even worse when you consider that for a given requirement, a consuming system may need to access multiple legacy databases. In MongoDB, a document is a single place for the database to read and write data for an entity. This locality of data ensures the complete document can be accessed in a single database operation that avoids the need internally to pull data from many different tables and rows. For most queries, there’s no need to JOIN multiple records. If the MongoDB-based ODL integrates data from multiple source systems, the performance benefits of accessing that unified data are even greater.
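The locality argument can be illustrated with a small sketch contrasting normalized lookups with a single document read; the data is illustrative:

```python
# Sketch contrasting normalized lookups with single-document locality.
# In a relational system, the two structures below would be separate
# tables requiring a JOIN to assemble the customer entity.
customers = {1: {"name": "Ada"}}
orders_by_customer = {1: [{"order_id": "A", "total": 10.0}]}

# Normalized access: two lookups (a JOIN in a relational database).
name = customers[1]["name"]
total = orders_by_customer[1][0]["total"]

# Document access: one read returns the whole entity.
doc = {"_id": 1, "name": "Ada", "orders": [{"order_id": "A", "total": 10.0}]}
assert doc["name"] == name and doc["orders"][0]["total"] == total
```

The same information is retrieved either way, but the document form needs a single operation rather than one per table.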
Versatility: Building upon the ease, flexibility, and speed of the document model, MongoDB enables developers to satisfy a range of application requirements, both in the way data is modeled and how it is queried. The flexibility and rich data types of documents make it possible to model data in many different structures, representative of entities in the real world. The embedding of arrays and sub-documents makes documents very powerful for modeling complex relationships and hierarchical data, with the ability to manipulate deeply nested data without the need to rewrite the entire document. But documents can also do much more: They can be used to model flat, table-like structures, simple key-value pairs, text, geospatial data, the nodes and edges used in graph processing, and more. With MongoDB’s expressive query language, documents can be queried in many ways, from simple lookups and range queries to creating sophisticated processing pipelines for data analytics and transformations, faceted search, JOINs, geospatial processing, and graph traversals. This versatility is crucial for an Operational Data Layer. As an ODL evolves over time, it typically incorporates new source systems, with data model implications that weren’t planned from the outset. Similarly, new consuming systems that connect to the ODL will have access patterns and query requirements that haven’t been seen before. An Operational Data Layer needs to be versatile enough to meet a wide variety of requirements.
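To show the pipeline style concretely, the sketch below pairs the shape of an aggregation pipeline ($match and $group stages) with a tiny in-memory evaluator, so the example is self-contained. Real pipelines execute inside the database and support far more operators:

```python
# Sketch: a MongoDB-style aggregation pipeline, with a minimal
# in-memory evaluator for $match (equality only) and $group ($sum).
docs = [
    {"region": "EU", "amount": 10},
    {"region": "EU", "amount": 5},
    {"region": "US", "amount": 7},
]

pipeline = [
    {"$match": {"region": "EU"}},
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
]

def run(docs, pipeline):
    for stage in pipeline:
        if "$match" in stage:
            cond = stage["$match"]
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in cond.items())]
        elif "$group" in stage:
            spec = stage["$group"]
            key_field = spec["_id"].lstrip("$")
            groups = {}
            for d in docs:
                g = groups.setdefault(d[key_field], {"_id": d[key_field]})
                for name, op in spec.items():
                    if name == "_id":
                        continue
                    field = op["$sum"].lstrip("$")
                    g[name] = g.get(name, 0) + d[field]
            docs = list(groups.values())
    return docs

print(run(docs, pipeline))  # [{'_id': 'EU', 'total': 15}]
```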
Data access and APIs: Consuming systems require powerful and secure access methods to the data in the ODL. If the ODL is writing back to source systems, this channel also needs to be handled.
MongoDB’s drivers provide access to a MongoDB-based ODL from the language of your choice. Most organizations building an ODL choose to develop a common Data Access API; this layer communicates with the ODL via a driver, in turn exposing a common set of data access methods to all consumers. This API layer can be custom-built, or MongoDB Stitch can be used to expose access methods with a built-in rules engine for fine-grained security policies, giving precise control over what data each consumer can access, down to the level of individual fields.
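A field-level rule in a custom-built Data Access API layer might look like the following sketch. The consumer names and field lists are hypothetical, and this is not the Stitch rules engine, just an illustration of the idea:

```python
# Sketch of field-level access control in a custom Data Access API.
# Consumer roles and field lists are hypothetical examples.
FIELD_RULES = {
    "support_app": {"customer_id", "name", "last_purchase"},
    "analytics":   {"customer_id", "last_purchase", "location"},
}

def read_for_consumer(doc, consumer):
    """Return only the fields this consumer is allowed to see."""
    allowed = FIELD_RULES.get(consumer, set())
    return {k: v for k, v in doc.items() if k in allowed}

doc = {"customer_id": 42, "name": "Ada", "last_purchase": "2019-03-01",
       "location": {"lat": 51.5, "lon": -0.1}}
print(sorted(read_for_consumer(doc, "support_app")))
# ['customer_id', 'last_purchase', 'name']
```

Centralizing rules like these in one API layer means every consumer gets consistent, enforceable access semantics.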
When writes are issued to the ODL but must also be propagated back to source systems, this can be accomplished in one of two ways. First, the API layer described above can simultaneously send writes to the MongoDB ODL and the relevant source system. Second, writes can be issued only to the ODL, while Stitch Triggers or MongoDB Change Streams watch for changes in the ODL and notify source systems. (Alternatively, this can be accomplished with a commercial change data capture (CDC) tool.)
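The second pattern, in which writes go only to the ODL and watchers notify source systems, can be sketched with a simple observer standing in for a change stream. All classes and names here are illustrative, not MongoDB driver APIs:

```python
# Sketch: writes land only in the ODL; a subscribed watcher (standing
# in for a change stream or trigger) notifies downstream systems.
class MiniODL:
    def __init__(self):
        self.docs = {}
        self.listeners = []

    def watch(self, callback):
        """Register a callback invoked on every change event."""
        self.listeners.append(callback)

    def write(self, key, doc):
        self.docs[key] = doc
        event = {"operation": "update", "key": key, "doc": doc}
        for notify in self.listeners:
            notify(event)

notified = []
odl = MiniODL()
odl.watch(notified.append)          # a source-system connector subscribes
odl.write("cust-42", {"status": "active"})
print(notified[0]["key"])  # cust-42
```

The appeal of this pattern is that consumers write to one place; propagation back to legacy systems becomes an asynchronous concern handled by the watcher.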
Consuming systems aren’t limited to operational applications. Many data analytics tools can use MongoDB’s connectors to access the ODL. The Connector for Business Intelligence allows analysts to connect to a MongoDB ODL with their BI and visualization tools of choice. Alternatively, MongoDB Charts can connect directly to the ODL for native visualization. The Connector for Apache Spark exposes MongoDB data for use by all of Spark’s libraries, enabling advanced analytics such as machine learning processes.
MongoDB lets you intelligently distribute an ODL
Consuming systems depend on an ODL. It needs to be reliable, scalable, and offer a high degree of control over data distribution to meet latency and data sovereignty requirements.
Availability: Availability is always crucial for mission-critical applications. An ODL may need to meet even higher standards because it is often implemented precisely to provide improved availability in the event of source system outages. MongoDB maintains multiple copies of data using replica sets. Replica sets are self-healing: failover and recovery are fully automated, so there is no need to intervene manually to restore the system after a failure, or to bolt on the additional clustering frameworks and agents that many legacy relational databases require.
Scalability: Even if an ODL starts at a small scale, you need to be prepared for growth as new source systems are integrated, adding data volume, and new consuming systems are developed, increasing workload. To meet the needs of an ODL with large data sets and high throughput requirements, MongoDB provides horizontal scale-out on low-cost, commodity hardware or cloud infrastructure using sharding. Sharding automatically partitions and distributes data across multiple physical instances, or shards, all in a completely application-transparent way. To respond to fluctuating workload demand, nodes can be added or removed from the ODL in real time, and MongoDB will automatically rebalance the data accordingly, without manual intervention.
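The partitioning idea behind hashed sharding can be sketched as follows. A real cluster distributes and rebalances data transparently; the modulo scheme below is purely an illustration:

```python
# Sketch of how hashed sharding spreads documents across shards.
# The modulo-over-hash scheme is an illustration of the idea only.
import hashlib

def shard_for(key, num_shards):
    """Map a shard-key value to a shard via a stable hash."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % num_shards

placement = {}
for customer_id in range(1000):
    placement.setdefault(shard_for(customer_id, 4), []).append(customer_id)

# Each of the 4 shards holds roughly a quarter of the documents,
# so both storage and throughput scale out as shards are added.
print(sorted(len(v) for v in placement.values()))
```

Because the hash is stable, the same key always routes to the same shard, which is what lets the database serve point reads without consulting every partition.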
Workload isolation: An ODL must be able to safely serve disparate workloads. It should be able to serve analytical queries on up-to-date data without impacting production applications. This can obviate the need for a new data warehouse or data mart, or for extracting a fresh cut of data every time analysts ask for it.
MongoDB’s replication provides a foundation for combining different classes of workload on the same MongoDB cluster, each operating against its own copy of the data. With workload isolation, business analysts can run exploratory queries and generate reports, and data scientists can build machine learning models without impacting operational applications. Within a replica set, one set of nodes can be provisioned to serve operational applications, replicating data in real time to other nodes dedicated to serving analytic workloads.
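Tag-aware routing of analytical reads can be sketched as follows. The member names and the `workload` tag are hypothetical; in a real deployment, replica set members carry tags and the driver is given a read preference that selects among them:

```python
# Sketch: routing reads to dedicated nodes via tags.
# Hosts and the 'workload' tag below are hypothetical examples.
members = [
    {"host": "node1", "tags": {"workload": "operational"}},
    {"host": "node2", "tags": {"workload": "operational"}},
    {"host": "node3", "tags": {"workload": "analytics"}},
]

def eligible(members, tag_set):
    """Members whose tags satisfy the requested read-preference tag set."""
    return [m["host"] for m in members
            if all(m["tags"].get(k) == v for k, v in tag_set.items())]

# Analytical queries are routed only to the analytics node, so long
# exploratory scans never contend with operational traffic.
print(eligible(members, {"workload": "analytics"}))  # ['node3']
```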
Data locality: An ODL works best if data is distributed according to the needs of its consuming systems. MongoDB allows precise control over where data is physically stored in a single logical cluster. For example, data placement can be controlled by geographic region for latency and governance requirements, or by hardware configuration and application features to meet specific classes of service for different consuming systems. Data placement rules can be continuously refined as required, and MongoDB will automatically migrate the data to its new zone.
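Zone-based placement can be sketched as a mapping from a document attribute to a zone. The zone names and country codes are illustrative; in a real cluster, zones are defined as shard-key ranges and the balancer moves data automatically:

```python
# Sketch: zone-based data placement keyed on a document attribute.
# Zone names and country codes are illustrative examples.
ZONES = {
    "EU": {"DE", "FR", "GB"},
    "US": {"US", "CA"},
}

def zone_for(doc):
    """Pick the zone whose country set contains this document's country."""
    country = doc["country"]
    for zone, countries in ZONES.items():
        if country in countries:
            return zone
    return "default"

# European customer data lands in the EU zone, satisfying both
# latency and data-sovereignty requirements.
print(zone_for({"_id": 1, "country": "FR"}))  # EU
```

Refining the placement rules later (say, adding a new region) changes only the mapping; the cluster migrates the affected data to its new zone.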
MongoDB gives you the freedom to run anywhere
Portability: MongoDB runs the same everywhere — on-premises in your data centers, on developers’ laptops, in the cloud, or as an on-demand fully managed database as a service: MongoDB Atlas. Wherever you need to deploy MongoDB, you can make full use of its capabilities. This also offers the flexibility to change deployment strategy over time — e.g., to move from on-prem to the cloud — or even to deploy clusters in hybrid environments.
Global coverage: MongoDB’s distributed architecture allows a single logical cluster to be distributed around the world, situating data close to users. When you use MongoDB Atlas, global coverage is even easier; Atlas supports 50+ regions across all the major cloud providers. Global cluster support enables the simplified deployment and management of a single geographically distributed Operational Data Layer. Organizations can easily control where data is physically stored to allow low-latency reads and writes and meet the data sovereignty requirements demanded by new data privacy regulations like the GDPR. Data can also be easily replicated and geographically distributed, enabling multi-region fault tolerance and fast, responsive reads of all the data in an ODL, from anywhere.
No lock-in: With MongoDB, you can reap the benefits of a multi-cloud strategy. Since Atlas clusters can be deployed on all major cloud providers, you get the advantage of an elastic, fully-managed service without being locked into a single cloud provider. If business conditions require you to change the underlying infrastructure, it’s easy to do so. Of course, there’s no requirement to use MongoDB Atlas. You can always deploy MongoDB on-prem, manage it yourself in the cloud, or use MongoDB Atlas — and change between deployment methods over time if needed. When you build an Operational Data Layer with MongoDB, you’re fully in charge of how and where you run it.
For more details on MongoDB technology, read the MongoDB Architecture Guide.
People and process
Of course, implementing an Operational Data Layer does not only depend on technology. Success also relies on skilled people and carefully designed processes.
MongoDB’s Data Layer Realization methodology is a tried and tested approach to constructing an Operational Data Layer. It helps you unlock the value of data stored in silos and legacy systems, driving rapid, iterative integration of data sources for new consuming applications. Data Layer Realization offers the expert skills of MongoDB’s consulting engineers, but also helps develop your own in-house capabilities, building deep technical expertise and best practices.
This process for constructing an Operational Data Layer has been successfully implemented with many customers. The first step is to clearly define project scope and identify the required producing and consuming systems. Based on these findings, we assign data stewards to establish clear chains of responsibility, then begin data modeling and infrastructure design. Data is loaded and merged into the Operational Data Layer, data access APIs are built, and consuming systems are modified to use the ODL instead of legacy systems. We then validate both the development capabilities built on the ODL and the deployment architecture, optimize ODL usage, and implement maintenance processes, evolving the ODL to support the next wave of required functionality. The process is iterative, repeating to add new access patterns and consuming applications, or to enrich the ODL with new data sources.
A successfully implemented ODL is a springboard for agile implementation of new business requirements. MongoDB can help drive continued innovation through a structured program that facilitates prototyping and development of new features and applications.
Operational data layer success stories
HSBC
HSBC’s data assets are growing rapidly — from 56 PB in 2014 to 93 PB in 2017. Customers are demanding more, regulators are asking for more, and the business is generating more. In order to make trading data available to a multitude of new digital services, HSBC implemented an Operational Data Layer to become the single source of truth. The ODL, powered by MongoDB, enables HSBC’s development and architecture teams to meet the board’s strategy of using technology to make the bank “simpler, faster, and better.”
Learn More
RBS
RBS implemented an Operational Data Layer — which they call an Enterprise Data Fabric — in order to improve data quality, reduce duplication, and simplify architectures to become leaner. The results? Cost reduction, plans to decommission hundreds of legacy servers, an environment of collaboration and data sharing, and the ability to develop new applications in days, rather than weeks or months on the old systems.
“Data Fabric provides data storage, query, and distribution as a service, enabling application developers to concentrate on business functionality.”
— Michael Fulke, Development Team Lead, Royal Bank of Scotland
Learn More
Alight Solutions
Alight Solutions (formerly part of Aon PLC) provides outsourced benefits administration for close to 40 million employees from over 1,400 organizations, but retrieving customer data from multiple frontend and backend source systems meant high mainframe MIPS costs, scaling difficulties, and high query latency. Moving to data as a service delivered from an ODL on MongoDB reduced query latency by a factor of 250 for a better customer experience, lowered peak mainframe consumption to reduce costs, and unlocked new business innovation.
Learn More
Barclays
Barclays is solving one of the hardest challenges facing any enterprise: a true 360-degree view of the customer with an ODL that gives all support staff a complete single view of every interaction a customer has had with the bank. This is helping Barclays drive customer interactions to new digital channels and improve the customer experience.
Learn More
Conclusion
An Operational Data Layer makes your enterprise data available as a service on demand, simplifying the process of building transformational new applications. It can reduce load on source systems, improve availability, unify data from multiple systems into a single real-time platform, serve as a foundation for re-architecting a monolith into microservices, unlock the value of data trapped inside legacy systems, and more. An ODL becomes a system of innovation, allowing an evolutionary approach to legacy modernization.
Implementing an ODL can be a challenging undertaking. It requires technical skills, the right tools, and coordination among many different parts of the business. By working with MongoDB, you get access to the right technology, people, and process for success.
To discuss building an Operational Data Layer with MongoDB, contact us.
We can help
We are the MongoDB experts. Over 6,600 organizations rely on our commercial products. We offer software and services to make your life easier:
MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It's a finely-tuned package of advanced software, support, certifications, and other services designed for the way you do business.
MongoDB Atlas is a database as a service for MongoDB, letting you focus on apps instead of ops. With MongoDB Atlas, you only pay for what you use with a convenient hourly billing model. With the click of a button, you can scale up and down when you need to, with no downtime, full security, and high performance.
MongoDB Stitch is a serverless platform which accelerates application development with simple, secure access to data and services from the client — getting your apps to market faster while reducing operational costs and effort.
MongoDB Mobile lets you store data where you need it, from IoT, iOS, and Android mobile devices to your backend, using a single database and query language.
MongoDB Cloud Manager is a cloud-based tool that helps you manage MongoDB on your own infrastructure. With automated provisioning, fine-grained monitoring, and continuous backups, you get a full management suite that reduces operational overhead, while maintaining full control over your databases.
MongoDB Consulting packages get you to production faster, help you tune performance in production, help you scale, and free you up to focus on your next release.
MongoDB Training helps you become a MongoDB expert, from design to operating mission-critical systems at scale. Whether you're a developer, DBA, or architect, we can make you better at MongoDB.