Implementing an Operational Data Layer

Introduction

An Operational Data Layer (or ODL) is an architectural pattern that centrally integrates and organizes siloed enterprise data, making it available to consuming applications. It enables a range of board-level strategic initiatives such as legacy modernization and data as a service, and use cases such as single view, real-time analytics, and mainframe offload.

The simplest representation of this pattern is something like the diagram shown in Figure 1. An Operational Data Layer is an intermediary between existing data sources and consumers that need to access that data. An ODL deployed in front of legacy systems can enable new business initiatives and meet new requirements that the existing architecture can’t handle — without the difficulty and risk of a full rip and replace of legacy systems. It can reduce workload on source systems, improve availability, reduce end-user response times, combine data from multiple systems into a single repository, serve as a foundation for re-architecting a monolithic application into a suite of microservices, and more. The Operational Data Layer becomes a system of innovation, allowing the business to take an iterative approach to digital transformation.

Figure 1: Conceptual model of an Operational Data Layer

Other terms sometimes used for this architectural pattern include Operational Data Store, Data Fabric, and Operational Data Hub. It is conceptually similar to a Master Data Management or Reference Data Management system.

Figure 2: Sample reference architecture for an Operational Data Layer
Why Implement an ODL?

An Operational Data Layer is an architectural pattern, but implementing one isn’t a goal in and of itself. Organizations choose to build an ODL to unlock the value of previously siloed enterprise data and to power applications that can’t be served by existing systems. Some of the most common strategic business initiatives that benefit from an ODL include:

Legacy modernization

Companies are finding enormous value in new development methodologies, architectural patterns, and technologies, enabling them to build new business functionality three to five times faster, scale to millions of users wherever they are, cut costs by 70%, and more. Legacy systems, however, can hold them back from realizing these benefits. Monolithic code bases, complex dependencies between apps, siloed data, and outdated development processes are a drag on innovation.

There are many ways of modernizing legacy systems, and each has its place. Gradual refactoring of an application can make sense when incremental changes are needed (and feasible), or a wholesale rewrite may be appropriate when a system can safely be cut out of the environment. But in many instances, constructing an Operational Data Layer parallel to legacy systems can be the best way to achieve modernization.

An ODL can offer a “best of both worlds” approach, providing the benefits of modernization without the risk of a full rip and replace. Legacy systems are left intact — at least at first — meaning that existing applications can continue to work as usual without interruption. New or improved data consumers will access the ODL, rather than subjecting legacy data stores to new workloads that may strain their capacity and expose single points of failure. At the same time, building an ODL offers a chance to redesign the application’s data model, allowing for new development and features that aren’t possible with the rigid tabular structure of existing relational systems. With an ODL, it is possible to combine data from multiple legacy sources into a single repository where new applications, such as a customer single view or artificial intelligence processes, can access the entire corpus of data.

Gradually, existing workloads can be shifted to the ODL, with value delivered at each step. Eventually, the ODL can be promoted to a system of record, and legacy systems can be decommissioned.

Data as a service

In many organizations, the state of data access is grim. Crucial enterprise data is locked in dozens or hundreds of silos, controlled by different teams, and stuck in systems that aren’t able to serve new workloads and access patterns. This is a blocker for innovation and insight, harming the business. For example, building a new mobile app for customers to view their account data could require months just to navigate internal processes for getting access to the legacy systems that hold that data, let alone integrate with them.

The solution to this is to invest in a data platform – an Operational Data Layer – that provides data as a service. The ODL itself gathers all important data in one place, while Data Access APIs on top of the ODL provide a common set of methods for working with this data. When development teams need access to existing enterprise data, they can simply subscribe to the relevant API to access it from the ODL, with no need to open up source systems for each new requirement. An ODL can offer data as a service not only to developers, but also to analysts and data scientists; new analyses can be run and new insights generated. When data from multiple source systems is unified in an ODL, consuming applications and analytics can get the full picture of enterprise data, not available in a world of silos.

Cloud data strategy

An Operational Data Layer can be implemented on-premises, in the cloud, or in a hybrid deployment model, as part of a broader cloud data strategy. When legacy on-prem systems can’t be migrated but new applications are being deployed in the cloud, a cloud-hosted ODL can provide an intermediary layer. Existing enterprise data is duplicated in an ODL in the cloud, which makes it available to new cloud-based applications. This allows a faster and more easily secured data connection, especially if the ODL is deployed on the same cloud provider and in the same regions as its consuming systems. Source systems remain in place on-premises and can be decommissioned over time as the ODL becomes the system of record. This provides a gradual, non-disruptive approach to cloud migration.

A cloud-based ODL can also benefit from the on-demand elasticity of cloud infrastructure. It’s easy to scale the ODL when an additional set of enterprise data is added, when new consuming applications are created, or when workloads grow.

Use cases for an ODL
Single view

“Single view” describes a unified, real-time representation of all relevant information on a given entity. Often, this is a customer, and the term “customer 360” is sometimes used. But organizations may also develop single views of products, financial assets, or other entities relevant to the business.

The value of a single view comes from unifying data from multiple sources that otherwise must be accessed separately. MetLife’s customer service organization, for example, had to navigate 70 different systems across 15 different screens. By unifying all customer data into one repository, MetLife could expose a single view to customer service representatives, who now use only one screen to access all the information they need. This resulted in significantly faster customer call times and the ability to analyze customer data for cross-sell and upsell opportunities.

A single view can be exposed to whatever users, whether internal or external, will benefit from the unified view of data. Internally, this is often customer service representatives, fraud and risk systems, sales and marketing staff, quantitative analysts, and others. Externally, single views can directly power customer service: For example, the full picture of a customer’s information can be shown in an online account, and the customer can make changes as needed.

A single view is a natural fit for an Operational Data Layer. When your goal is to produce a single view, integrating data from multiple sources in a single ODL is the obvious solution. Conversely, if you build an Operational Data Layer for other reasons and already have a store of unified data, it is a simple step to expose that data as a single view for the users who would benefit from it.

Mainframe offload

Mainframes have powered backend business processes for decades, but the access patterns and 24/7 requirements of modern applications stretch their limits. As more business moves online and to mobile, exposing the mainframe directly to new digital channels rapidly drives up costs. Even more critically, it can expose a single point of failure when the mainframe periodically needs to be taken offline for maintenance. Beyond the hardware, the databases running on mainframes are typically based on rigid tabular data models, making it very hard to evolve applications to meet new digital business demands.

For all of these reasons, organizations can benefit from offloading workloads from the mainframe. Implementing an Operational Data Layer in front of legacy systems allows you to redirect consuming systems to the ODL, improving availability, helping to meet regulatory requirements, and reducing MIPS costs. At the same time, new apps — or new features built into existing apps — benefit from the option to revise the data model, supporting workloads that weren’t possible before. An ODL makes it significantly easier to serve mainframe data to new digital channels without straining legacy systems. For example, a major UK-based global bank built an ODL to offload work from its mainframe, powering web and mobile banking. The ODL became the foundation for data as a service, supporting rapid development of new features and applications. Data as a service is the advantage that lets a traditional bank move at the speed of a digital disruptor.

Analytics

The business world has come to a consensus: Decisions should be driven by data. A recent survey of senior executives found that 99% are trying to move toward a data-driven culture, but only a third have succeeded. Data should support not only long-term decisions for the business, such as what product lines or regions are the most profitable, but also real-time decisions, like what recommendations should be presented to a customer when they are on your website or using your mobile app. Advanced analytics organizations go further, conducting machine learning on enterprise data to extract new insights or improve operational efficiency.

This presents a problem: Drawing accurate insights and making the right choices based on analytics requires a complete view of the relevant enterprise data. An enterprise data warehouse (EDW) or a Hadoop-based analytics platform can unify all the relevant data from operational systems for batch analysis. But an EDW or Hadoop platform can’t meet today’s demand for real-time analytics: Data is typically loaded in daily or weekly batches, leading to a stale view, and the platform is often designed for long-running queries that take minutes or hours to complete. Immediate responses are out of the question.

An Operational Data Layer, on the other hand, contains an up-to-the-minute state of data and serves low-latency analytics queries, either as part of operational processes or for ad hoc questions that need an answer now. When you have an ODL serving other requirements, it should be a simple matter to expose that same data for analytics. A consistent real-time view of data is available for all uses, but it is critical that complex or long-running analytics workloads don’t have a performance impact on production applications that also rely on the Operational Data Layer. When the ODL is built from a distributed cluster of nodes, it’s possible to achieve workload isolation by dedicating a specific replica of data to analytics queries.

And more

An ODL is an excellent way of exposing existing data to new applications while mitigating the potential impact to legacy systems. Whenever the business calls for a new application or experience but the changes to (or additional workload on) the data sources would be prohibitively difficult, building an Operational Data Layer can be the solution. Common use cases include:

  • Developing new mobile applications to extend customer engagement channels, exposing all the information that’s otherwise found on the web, in store, or via customer service.
  • Building recommendation engines to present the best choices to users based on their past actions – whether that’s product recommendations, suggested articles or movies, or connections to other users they may know.
  • Personalizing content in real time based on user data — for example, auto-filling fields based on account information or ranking options based on proximity.
  • Adding social components to a UI, drawing on user information.

These use cases — and any others that need access to enterprise data — benefit from an ODL that aggregates all the data they require and which is built to fit the demands of modern application requirements. To meet these demands, an ODL must be resilient to failure, globally scalable, and cloud-ready, and it must enhance developer productivity to build and evolve apps faster. Once an ODL is in place, it’s often a simple matter to point new apps at it, powering multiple innovative developments with a common, enterprise-wide data layer.

Architecture

Consider again the conceptual model of an Operational Data Layer in Figure 3. Outside of the ODL itself, the chief components in this architecture are the source systems and the consuming systems.

Source systems and data producers

These are usually databases, but sometimes file systems or other data stores. Generally, they are systems of record for one or more applications, either off-the-shelf packaged apps (ERP, CRM, etc.) or internally developed custom apps.

In some cases, there may be only one source system feeding the Operational Data Layer. Usually, this is the case when the main goal of implementing an ODL is to add an abstraction layer on top of that single system. This could be for the purpose of caching or offloading queries from the source system, or it could be to create an opportunity to revise the data model for modernization or new uses that don’t fit with the structure of the existing source system. An ODL with a single source system is most useful when the source is a heavily used system of record and/or is unable to handle new demands being placed on it; often, this is a mainframe.

More often, there are multiple source systems. In this case, the ODL can unify disparate data sets, providing a complete picture of data that would not otherwise be available.

When planning an Operational Data Layer, start by identifying the applications that generate the required source data and their associated databases, along with the business and technical owners of the applications. For each data source, it can be helpful to appoint a data steward. The steward needs a deep understanding of the source database, with specific knowledge of:

  • The schema of the source data, plus an understanding of which tables store the required attributes and in what format.
  • The clients and applications that generate the source data.
  • The clients and applications that currently consume the source data.
Figure 3: Conceptual model of an Operational Data Layer
The data steward should also be able to define how the required data can be extracted from the source database to meet the ODL’s requirements (e.g., frequency of data transfer), without affecting either current producing or consuming applications.
Figure 4: Operational Data Layer model with data loading
Consuming systems

An ODL can support any consuming systems that require access to data. These can be either internal or customer-facing. Existing applications can be updated to access the ODL instead of the source systems, while new applications (often delivered as domains of microservices) will typically use the ODL first and foremost. The requirements of a single application may drive the initial implementation of an ODL, but usage usually expands to additional applications once the ODL’s value has been demonstrated to the business. An Operational Data Layer can also feed analytics, providing insights that weren’t possible without a unified data system. Ad hoc analytical tools can connect to an ODL for an up-to-the-minute view of the company — without interfering with operational workloads — while the data can also support programmatic real-time analytics to drive richer user experiences with dashboards and aggregations embedded directly into applications. When identifying consuming systems that will rely on the ODL, consider:

  • How their business processes operate, including the types of queries executed as part of their day-to-day responsibilities, and the required Service Level Agreements (SLAs).
  • The specific data they need to access.
  • The sources from which the required data is currently extracted or would be extracted.
Data loading

For a successful ODL implementation, the data must be kept in sync with the source systems. Once the source systems’ producers have been identified, it’s important to understand the frequency and quantity of data changes in producer systems. Similarly, consuming systems should have clear requirements for data currency. Once you understand these, it’s much easier to develop an appropriate data loading strategy.

  1. Batch extract and load: This is typically used for an initial one-time operation to load data from source systems. Batch operations extract all required records from the source systems and load them into the Operational Data Layer for subsequent merging. If none of the consuming systems requires up-to-the-second level data currency and overall data volumes are low, it may also suffice to refresh the entire data set with periodic (daily/weekly) data refreshes. Batch operations are also good for loading data from producers that are reference data sources, where data changes are typically less frequent — for example, country codes, office locations, tax codes, and similar. Commercial ETL tools or custom implementations are used for carrying out batch operations: Extract data from producers, transform the data as needed, and then load it into the ODL. If, after the initial load, the development team discovers that additional refinements are needed to the transformation logic, then the related data from the ODL may need to be dropped, and the initial load repeated.

  2. Delta extract and load: This is an ongoing operation that propagates incremental updates committed to the source systems into the ODL, in real time. To maintain synchronization between the source systems and the ODL, it’s important that the delta load starts immediately following the initial batch load. The frequency of delta operations can vary drastically. In some cases, they may be captured and propagated at regular intervals — for example, every few hours. In other cases, they are event-based, propagated to the ODL as soon as new data is committed to the source systems. To keep the ODL current, most implementations use Change Data Capture (CDC) mechanisms to catch the changes to source systems as they happen. After the changes are captured, ETL or custom handlers can be used to transform the data into the required format for the ODL. From there, message queues can stream the new delta changes into the ODL. Increasingly, the message queue itself transforms the data, removing the need for a separate ETL mechanism. Matching, merging, and reconciling data from disparate systems as part of the data load is a topic of its own. This is discussed in our 10-step methodology for building a single view.
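The two loading strategies above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production loader: the ODL is stood in for by an in-memory dictionary, and `ChangeEvent`, `transform`, `batch_load`, and `apply_delta` are hypothetical names invented for this example rather than part of any CDC or ETL product.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class ChangeEvent:
    """One captured change from a source system (hypothetical shape)."""
    op: str                                # "insert", "update", or "delete"
    key: str                               # primary key of the changed record
    data: Optional[Dict[str, Any]] = None  # changed fields; None for deletes


def transform(record: Dict[str, Any]) -> Dict[str, Any]:
    """Illustrative transform step: normalize field names to lowercase."""
    return {name.lower(): value for name, value in record.items()}


def batch_load(odl: Dict[str, Dict[str, Any]],
               records: Iterable[Dict[str, Any]],
               key_field: str) -> None:
    """Initial load: transform every extracted record and load it into the ODL."""
    for record in records:
        doc = transform(record)
        odl[str(doc[key_field])] = doc


def apply_delta(odl: Dict[str, Dict[str, Any]],
                events: Iterable[ChangeEvent]) -> None:
    """Delta load: apply captured changes in the order they were committed."""
    for event in events:
        if event.op == "delete":
            odl.pop(event.key, None)
        elif event.op == "insert":
            odl[event.key] = transform(event.data or {})
        elif event.op == "update":
            odl.setdefault(event.key, {}).update(transform(event.data or {}))
        else:
            raise ValueError(f"unknown operation: {event.op}")


# Initial batch load, followed immediately by the delta stream.
source_rows = [
    {"ID": 1, "NAME": "Ada", "CITY": "London"},
    {"ID": 2, "NAME": "Grace", "CITY": "New York"},
]
odl: Dict[str, Dict[str, Any]] = {}
batch_load(odl, source_rows, key_field="id")
apply_delta(odl, [
    ChangeEvent("update", "1", {"CITY": "Paris"}),
    ChangeEvent("delete", "2"),
    ChangeEvent("insert", "3", {"ID": 3, "NAME": "Alan"}),
])
```

Applying events strictly in commit order is what keeps the ODL consistent with the source; a real pipeline would also need to handle retries and duplicate delivery.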

Figure 5: Maturity model of an Operational Data Layer
Data flow and maturity model

Often, an Operational Data Layer evolves over time. From a focused and simple start, it grows in scope and strategic importance, delivering increased benefits to the business. In fact, it is a best practice to begin an ODL implementation with a limited scope and grow it over time. An Operational Data Layer that tries to incorporate all possible data sources from the beginning may stall before it proves its value; it’s much better to demonstrate the ODL’s capabilities with a small set of sources and consumers, then iterate from there, incorporating best practices that have been developed along the way.

One of our observations from working with many enterprises on their ODL projects is that as more data sources are integrated, the ODL becomes valuable for serving reads from a broadening range of consumers. Over time, it begins accepting writes, and ultimately can become a system of record in its own right.

Phase 1: Simple ODL, offloading reads

Initial use cases for an ODL are often tightly scoped, with one (or a few) source systems and one (or a few) consuming systems. The goal is usually to serve only read operations — the ODL can reduce workload on source systems to cut costs, provide high availability during source system downtime, improve performance, and handle long-running analytics queries. Whatever the motivation, the ODL has the potential to easily offload a significant amount of read traffic from overburdened source systems. Most organizations look to realize the value of always-on offloading from the beginning, improving performance and reducing costs. Sometimes, an ODL may start with a more limited scope, in which the source system serves all reads normally, and the ODL only takes over during source system downtime to preserve application availability. Usually, the ODL in these cases quickly transitions to serving certain workloads at all times. In this phase, all write traffic continues to go to source systems, and data changes are then pushed into the ODL via the delta load mechanisms described above.

Phase 2: Enriched ODL for new use cases

Once the ODL has proven its value, a logical next step is to enrich its data by adding useful metadata or integrating new (related) data sources. A typical use case for this is to enable advanced analytics across a fuller picture of data or create a single customer view. For example, credit card transactions could be enriched by categorizing purchases according to third-party information. A card issuer could, for instance, then make it much easier for customers to determine their spend on each category — e.g., travel purchases over the last n months. With this enrichment, the ODL can not only offload more reads from source systems, but also enable use cases that weren’t possible before.
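As a rough sketch of the credit card example, the Python below tags each transaction with a category and then totals spend per category over the last n months. The merchant-to-category mapping stands in for the third-party information mentioned above; the merchant names, field names, and the helpers `enrich` and `spend_by_category` are all hypothetical, and amounts are kept in integer cents to avoid rounding issues.

```python
from collections import defaultdict
from datetime import date
from typing import Any, Dict, List

# Hypothetical third-party reference data: merchant -> spending category.
MERCHANT_CATEGORIES = {
    "Skyways Air": "travel",
    "Hotel Lumen": "travel",
    "Corner Grocer": "groceries",
}


def enrich(txn: Dict[str, Any], categories: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of the transaction with a 'category' field added."""
    return {**txn, "category": categories.get(txn["merchant"], "uncategorized")}


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from 'earlier' to 'later' (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def spend_by_category(transactions: List[Dict[str, Any]],
                      today: date, n_months: int) -> Dict[str, int]:
    """Total spend in cents per category over the last n calendar months."""
    totals: Dict[str, int] = defaultdict(int)
    for txn in transactions:
        if 0 <= months_between(txn["date"], today) < n_months:
            totals[txn["category"]] += txn["amount_cents"]
    return dict(totals)


raw_transactions = [
    {"merchant": "Skyways Air", "amount_cents": 42000, "date": date(2024, 5, 3)},
    {"merchant": "Corner Grocer", "amount_cents": 5650, "date": date(2024, 5, 20)},
    {"merchant": "Hotel Lumen", "amount_cents": 18000, "date": date(2024, 4, 11)},
    {"merchant": "Skyways Air", "amount_cents": 30000, "date": date(2023, 12, 1)},
]
enriched = [enrich(txn, MERCHANT_CATEGORIES) for txn in raw_transactions]
recent_spend = spend_by_category(enriched, today=date(2024, 5, 31), n_months=3)
```

Storing the category on the document at load time, rather than computing it per query, is the design choice that lets the ODL answer this kind of question without touching the source systems.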

Phase 3: Offloading reads and writes

The ODL’s scope can be expanded by introducing a smarter architecture to orchestrate writes between both source systems and the ODL concurrently. In this phase, when a given consuming system performs a write, it goes to both ODL and source system, either directly from application logic or via a messaging system, API layer, or other intermediary from which both repositories can receive the write. This pattern is also referred to as “Y-loading” and can help lay the foundations for a more transformational shift of the ODL’s role in your enterprise architecture. Some organizations move directly to Phase 4 below, but Y-loading can allow you to run both systems in parallel and road-test the ODL before using it as the primary system for writes.
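A minimal sketch of Y-loading, assuming a simple intermediary that forwards each write to both repositories, might look like the following. The `Repository` class, `WriteError`, and `y_write` are hypothetical stand-ins for the real source system, the ODL, and the messaging or API layer; queueing failed writes for retry is one possible way to handle divergence, not the only one.

```python
from typing import Any, Dict, List, Tuple


class WriteError(Exception):
    """Raised when a repository cannot accept a write."""


class Repository:
    """In-memory stand-in for either the source system or the ODL."""

    def __init__(self, name: str, available: bool = True) -> None:
        self.name = name
        self.available = available
        self.records: Dict[str, Dict[str, Any]] = {}

    def write(self, key: str, value: Dict[str, Any]) -> None:
        if not self.available:
            raise WriteError(f"{self.name} is unavailable")
        self.records[key] = dict(value)


def y_write(key: str, value: Dict[str, Any], targets: List[Repository],
            retry_queue: List[Tuple[str, str, Dict[str, Any]]]) -> None:
    """Send the same write to every target; queue failures for a later retry."""
    for repo in targets:
        try:
            repo.write(key, value)
        except WriteError:
            retry_queue.append((repo.name, key, dict(value)))


legacy = Repository("legacy")
odl = Repository("odl")
retries: List[Tuple[str, str, Dict[str, Any]]] = []

y_write("acct-1", {"balance": 100}, [legacy, odl], retries)

odl.available = False  # simulate an ODL outage during the parallel run
y_write("acct-2", {"balance": 50}, [legacy, odl], retries)
```

Because the source system is still the system of record in this phase, it is listed first; the retry queue makes any gap between the two repositories visible while the ODL is being road-tested.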

Phase 4: ODL first

By default, all writes are directed to the ODL. Where necessary, changes are routed from the ODL back to the source systems, either so that legacy applications can continue to rely on a source system before being ported to the ODL, or merely as a fallback, in case it should be needed. The secondary write to the source system can be accomplished with a CDC system listening to the ODL or a similar system, like MongoDB Stitch Triggers.
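The ODL-first flow can be illustrated with a small in-memory sketch in which every write lands in the ODL and subscribed handlers push the change back to a legacy store. `OdlStore` and `write_back_to_legacy` are hypothetical names, the handler mechanism only stands in for a CDC tool or trigger system, and the uppercase column names are an assumed legacy layout.

```python
from typing import Any, Callable, Dict, List

ChangeHandler = Callable[[str, Dict[str, Any]], None]


class OdlStore:
    """In-memory stand-in for an ODL that notifies listeners of each write."""

    def __init__(self) -> None:
        self.documents: Dict[str, Dict[str, Any]] = {}
        self._handlers: List[ChangeHandler] = []

    def subscribe(self, handler: ChangeHandler) -> None:
        """Register a handler to be called after every write."""
        self._handlers.append(handler)

    def write(self, key: str, document: Dict[str, Any]) -> None:
        """Write to the ODL first, then notify every subscribed handler."""
        self.documents[key] = dict(document)
        for handler in self._handlers:
            handler(key, dict(document))


legacy_rows: Dict[str, Dict[str, Any]] = {}


def write_back_to_legacy(key: str, document: Dict[str, Any]) -> None:
    """Secondary write: map the document to the (assumed) legacy column names."""
    legacy_rows[key] = {name.upper(): value for name, value in document.items()}


odl_store = OdlStore()
odl_store.subscribe(write_back_to_legacy)
odl_store.write("cust-7", {"name": "Ada", "tier": "gold"})
```

In a real deployment the notification would typically be asynchronous, so the legacy system lags the ODL slightly instead of blocking each write.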

Phase 5: System of Record

Ultimately, the Operational Data Layer can evolve to serve as the System of Record. Once all consuming systems for a given legacy source have been ported to the ODL, and its stability has been proven, the source system can be decommissioned for cost savings and architectural simplicity. A model of this phase is shown in Figure 6.

Figure 6: Operational Data Layer accepting writes and optionally pushing them back to source systems, with select source systems decommissioned
Why MongoDB for an Operational Data Layer?

MongoDB meets the required capabilities for an Operational Data Layer. When you choose MongoDB as the foundation for an ODL, you’re investing in the best technology, people, and processes for your system of innovation.

Technology
MongoDB is the best way for an ODL to work with data

The core of an Operational Data Layer is the data it contains, and MongoDB is the best way to work with that data.

  1. Ease: MongoDB’s document model makes it simple to model — or remodel — data in a way that fits the needs of your applications. Documents are a natural way to describe data. They present a single data structure, with related data embedded as sub-documents and arrays. This allows documents to be closely aligned to the structure of objects in an application. As a result, it’s simpler and faster for developers to model how data in the application will map to data stored in the database. In addition, MongoDB guarantees the multi-record ACID transactional semantics that developers are familiar with, making it easier to reason about data.

  2. Flexibility: A flexible data model is essential to integrate multiple source systems into a single ODL. With MongoDB, there’s no need to pre-define a schema. Documents are polymorphic: Fields can vary from document to document within a single collection. For example, all documents that describe customers might contain the customer ID and the last date they purchased a product or service, but only some of these documents might contain the user’s social media handle or location data from a mobile app. This makes it possible to merge data from source systems storing records on overlapping but non-identical sets of entities. On the consuming side, MongoDB’s flexibility makes it easy to alter the data model as needed to meet the requirements of new applications being built on the ODL.

  3. Speed: Using MongoDB for an ODL means you can get better performance when accessing data, and write less code to do so. In most legacy systems, accessing data for an entity, such as a customer, typically requires JOINing multiple tables together. JOINs entail a performance penalty, even when optimized — which takes time, effort, and advanced SQL skills. The situation is even worse when you consider that for a given requirement, a consuming system may need to access multiple legacy databases. In MongoDB, a document is a single place for the database to read and write data for an entity. This locality of data ensures the complete document can be accessed in a single database operation that avoids the need internally to pull data from many different tables and rows. For most queries, there’s no need to JOIN multiple records. If the MongoDB-based ODL integrates data from multiple source systems, the performance benefits of accessing that unified data are even greater.

  4. Versatility: Building upon the ease, flexibility, and speed of the document model, MongoDB enables developers to satisfy a range of application requirements, both in the way data is modeled and how it is queried. The flexibility and rich data types of documents make it possible to model data in many different structures, representative of entities in the real world. The embedding of arrays and sub-documents makes documents very powerful for modeling complex relationships and hierarchical data, with the ability to manipulate deeply nested data without the need to rewrite the entire document. But documents can also do much more: They can be used to model flat, table-like structures, simple key-value pairs, text, geospatial data, the nodes and edges used in graph processing, and more. With MongoDB’s expressive query language, documents can be queried in many ways, from simple lookups and range queries to creating sophisticated processing pipelines for data analytics and transformations, faceted search, JOINs, geospatial processing, and graph traversals. This versatility is crucial for an Operational Data Layer. As an ODL evolves over time, it typically incorporates new source systems, with data model implications that weren’t planned from the outset. Similarly, new consuming systems that connect to the ODL will have access patterns and query requirements that haven’t been seen before. An Operational Data Layer needs to be versatile enough to meet a wide variety of requirements.

  5. Data access and APIs: Consuming systems require powerful and secure access methods to the data in the ODL. If the ODL is writing back to source systems, this channel also needs to be handled.

MongoDB’s drivers provide access to a MongoDB-based ODL from the language of your choice. Most organizations building an ODL choose to develop a common Data Access API; this layer communicates with the ODL via a driver, in turn exposing a common set of data access methods to all consumers. This API layer can be custom-built, or MongoDB Stitch can be used to expose access methods with a built-in rules engine for fine-grained security policies, giving precise control over what data each consumer can access, down to the level of individual fields.
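For a custom-built API layer, field-level control could be sketched roughly as follows. This is an illustration only: the consumer names, the `FIELD_RULES` table, and `read_customer` are hypothetical, the ODL is represented by a plain dictionary, and the sketch does not reproduce the rules engine of any MongoDB product.

```python
from typing import Any, Dict, Set

# Hypothetical policy: which fields each consuming system may read.
FIELD_RULES: Dict[str, Set[str]] = {
    "customer_service": {"customer_id", "name", "email"},
    "marketing": {"customer_id", "segment"},
}


def read_customer(odl: Dict[str, Dict[str, Any]],
                  customer_id: str, consumer: str) -> Dict[str, Any]:
    """Return only the fields the named consumer is allowed to see."""
    if consumer not in FIELD_RULES:
        raise PermissionError(f"unknown consumer: {consumer}")
    allowed = FIELD_RULES[consumer]
    document = odl[customer_id]
    return {field: value for field, value in document.items() if field in allowed}


odl_customers = {
    "c1": {
        "customer_id": "c1",
        "name": "Ada",
        "email": "ada@example.com",
        "segment": "premium",
        "card_number": "4111-XXXX",
    },
}
service_view = read_customer(odl_customers, "c1", "customer_service")
marketing_view = read_customer(odl_customers, "c1", "marketing")
```

Keeping the policy in one table, outside the individual consumers, means a new consumer can be onboarded by adding a rule rather than by changing the ODL or its data model.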

When writes are issued to the ODL but must also be propagated back to source systems, this can be accomplished in one of two ways. First, the API layer described above could simultaneously send writes to the MongoDB ODL and the relevant source system. Second, writes can be issued only to the ODL, while Stitch Triggers or MongoDB Change Streams watch for changes in the ODL and notify source systems. (Alternatively, this could be accomplished with a commercial CDC tool.)

Consuming systems aren’t limited to operational applications. Many data analytics tools can use MongoDB’s connectors to access the ODL. The Connector for Business Intelligence allows analysts to connect to a MongoDB ODL with their BI and visualization tools of choice. Alternatively, MongoDB Charts can connect directly to the ODL for native visualization. The Connector for Apache Spark exposes MongoDB data for use by all of Spark’s libraries, enabling advanced analytics such as machine learning processes.
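To make the flexibility point above concrete, the sketch below merges customer records from two source systems into one document per customer, where only some documents end up with social or location fields. The source names, field names, and `merge_sources` helper are hypothetical, plain dictionaries stand in for documents, and letting later sources overwrite earlier ones on conflict is a deliberate simplification of real matching and reconciliation.

```python
from typing import Any, Dict, List


def merge_sources(*sources: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge records keyed by customer_id; later sources win on field conflicts."""
    merged: Dict[str, Dict[str, Any]] = {}
    for records in sources:
        for record in records:
            merged.setdefault(record["customer_id"], {}).update(record)
    return merged


# Two hypothetical source systems with overlapping but non-identical entities.
crm = [
    {"customer_id": "c1", "name": "Ada", "last_purchase": "2024-04-02"},
    {"customer_id": "c2", "name": "Grace", "last_purchase": "2024-01-15"},
]
mobile_app = [
    {"customer_id": "c1", "social_handle": "@ada",
     "location": {"lat": 51.5, "lon": -0.1}},
]

customers = merge_sources(crm, mobile_app)
```

Here `customers["c1"]` carries fields from both systems, while `customers["c2"]` has only the CRM fields; no shared schema had to be declared up front for the two shapes to coexist.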

MongoDB lets you intelligently distribute an ODL

Consuming systems depend on an ODL. It needs to be reliable, scalable, and offer a high degree of control over data distribution to meet latency and data sovereignty requirements.

  1. Availability: Availability is always crucial for mission-critical applications. An ODL may need to meet even higher standards because it is often implemented precisely in order to provide improved availability in the event of source system outages. MongoDB maintains multiple copies of data using replica sets. Replica sets are self-healing, as failover and recovery are fully automated, so it is not necessary to manually intervene to restore a system in the event of a failure, or to add the clustering frameworks and agents that are needed for many legacy relational databases.

  2. Scalability: Even if an ODL starts at a small scale, you need to be prepared for growth as new source systems are integrated, adding data volume, and new consuming systems are developed, increasing workload. To meet the needs of an ODL with large data sets and high throughput requirements, MongoDB provides horizontal scale-out on low-cost, commodity hardware or cloud infrastructure using sharding. Sharding automatically partitions and distributes data across multiple physical instances, or shards, all in a completely application-transparent way. To respond to fluctuating workload demand, nodes can be added to or removed from the ODL in real time, and MongoDB will automatically rebalance the data accordingly, without manual intervention.

  3. Workload isolation: An ODL must be able to safely serve disparate workloads. It should be able to serve analytical queries on up-to-date data without affecting production applications. This can obviate the need for a new data warehouse or data mart, or for extracting a fresh cut of data whenever analysts call for it.

    MongoDB’s replication provides a foundation for combining different classes of workload on the same MongoDB cluster, each operating against its own copy of the data. With workload isolation, business analysts can run exploratory queries and generate reports, and data scientists can build machine learning models without impacting operational applications. Within a replica set, one set of nodes can be provisioned to serve operational applications, replicating data in real time to other nodes dedicated to serving analytic workloads.

  4. Data locality: An ODL works best if data is distributed according to the needs of its consuming systems. MongoDB allows precise control over where data is physically stored in a single logical cluster. For example, data placement can be controlled by geographic region for latency and governance requirements, or by hardware configuration and application features to meet specific classes of service for different consuming systems. Data placement rules can be continuously refined as required, and MongoDB will automatically migrate the data to its new zone.
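As a rough illustration of workload isolation and data locality, the sketch below builds (but does not send) a connection string that routes analytic reads to tagged secondaries, along with the admin command documents that pin a shard-key range to a zone. The host names, the `workload:analytics` tag, the `odl.customers` namespace, the `country` shard-key field, and the shard and zone names are all hypothetical. `readPreference`, `readPreferenceTags`, `addShardToZone`, and `updateZoneKeyRange` are standard MongoDB connection-string options and admin commands, but the exact shapes should be checked against the documentation for your server version.

```python
from urllib.parse import urlencode


def analytics_uri(hosts, replica_set, tags):
    """Build a connection string that reads from secondaries carrying the given tags.

    Operational applications would keep a plain URI (reads go to the primary);
    analytics tools would use this one so their queries land on dedicated nodes.
    """
    tag_set = ",".join(f"{key}:{value}" for key, value in tags.items())
    options = urlencode(
        {
            "replicaSet": replica_set,
            "readPreference": "secondary",
            "readPreferenceTags": tag_set,
        },
        safe=":,",  # leave the tag separators unescaped for readability
    )
    return f"mongodb://{','.join(hosts)}/?{options}"


def zone_commands(namespace, shard, zone, field, low, high):
    """Return admin command documents that place a shard in a zone and
    assign the shard-key range [low, high) to that zone."""
    return [
        {"addShardToZone": shard, "zone": zone},
        {
            "updateZoneKeyRange": namespace,
            "min": {field: low},   # lower bound is inclusive
            "max": {field: high},  # upper bound is exclusive
            "zone": zone,
        },
    ]


if __name__ == "__main__":
    # Hypothetical hosts and tag; the tag must match the replica set configuration.
    print(analytics_uri(["odl1:27017", "odl2:27017"], "rs0", {"workload": "analytics"}))
    # Hypothetical namespace, shard, and zone names.
    for command in zone_commands("odl.customers", "shard0", "EU", "country", "DE", "DF"):
        print(command)
```

In a real deployment the command documents would be run against the `admin` database through a driver or the shell. Building them as plain data first keeps placement rules easy to review and version.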

MongoDB gives you the freedom to run anywhere

  1. Portability: MongoDB runs the same everywhere: on-premises in your data centers, on developers’ laptops, in the cloud, or as an on-demand, fully managed database as a service (MongoDB Atlas). Wherever you need to deploy MongoDB, you can make full use of its capabilities. This also offers the flexibility to change deployment strategy over time (for example, to move from on-prem to the cloud) or even to deploy clusters in hybrid environments.

  2. Global coverage: MongoDB’s distributed architecture allows a single logical cluster to be distributed around the world, situating data close to users. When you use MongoDB Atlas, global coverage is even easier; Atlas supports 50+ regions across all the major cloud providers. Global cluster support enables the simplified deployment and management of a single geographically distributed Operational Data Layer. Organizations can easily control where data is physically stored to allow low-latency reads and writes and meet the data sovereignty requirements demanded by new data privacy regulations like the GDPR. Data can also be easily replicated and geographically distributed, enabling multi-region fault tolerance and fast, responsive reads of all the data in an ODL, from anywhere.

  3. No lock-in: With MongoDB, you can reap the benefits of a multi-cloud strategy. Since Atlas clusters can be deployed on all major cloud providers, you get the advantage of an elastic, fully-managed service without being locked into a single cloud provider. If business conditions require you to change the underlying infrastructure, it’s easy to do so. Of course, there’s no requirement to use MongoDB Atlas. You can always deploy MongoDB on-prem, manage it yourself in the cloud, or use MongoDB Atlas — and change between deployment methods over time if needed. When you build an Operational Data Layer with MongoDB, you’re fully in charge of how and where you run it.

For more details on MongoDB technology, read the MongoDB Architecture Guide.

People and process

Of course, implementing an Operational Data Layer does not only depend on technology. Success also relies on skilled people and carefully designed processes.

MongoDB’s Data Layer Realization methodology is a tried and tested approach to constructing an Operational Data Layer. It helps you unlock the value of data stored in silos and legacy systems, driving rapid, iterative integration of data sources for new and existing consuming applications. Data Layer Realization offers the expert skills of MongoDB’s consulting engineers, and also helps develop your own in-house capabilities, building deep technical expertise and best practices.

This process for constructing an Operational Data Layer has been successfully implemented with many customers. The first step is to define the project scope clearly and identify the required producing and consuming systems. Based on these findings, we assign data stewards to establish clear chains of responsibility, then begin data modeling and infrastructure design. Data is loaded and merged into the Operational Data Layer, then data access APIs are built and consuming systems are modified to use the ODL instead of legacy systems. We validate both the development capabilities that use the ODL and the deployment architecture, then optimize ODL usage, implement maintenance processes, and evolve the ODL based on the next wave of required functionality. The process is iterative: it repeats to add new access patterns and consuming apps, or to enrich the ODL with new data sources.
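The load-and-merge step can be pictured with a small sketch. It assumes two hypothetical source extracts (a CRM and a billing system) that share a `customer_id` key, and folds them into one document per customer, nesting each system's fields under the system name so that provenance is kept. The field names and the merge rule are illustrative only and are not part of the Data Layer Realization methodology.

```python
from collections import defaultdict


def merge_sources(sources, key="customer_id"):
    """Combine records from several source systems into one document per key.

    `sources` maps a source-system name to a list of record dicts. Each
    record's fields are nested under its source name, so fields from
    different systems cannot overwrite one another.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            doc = merged[record[key]]
            doc["_id"] = record[key]
            doc[source_name] = {k: v for k, v in record.items() if k != key}
    return list(merged.values())


if __name__ == "__main__":
    # Hypothetical extracts from two legacy systems.
    crm = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Lin"}]
    billing = [{"customer_id": 1, "balance": 42.5}]
    for doc in merge_sources({"crm": crm, "billing": billing}):
        print(doc)
```

Keeping each source's fields in its own sub-document is only one possible design. A real ODL schema would be shaped by the access patterns of the consuming systems identified during scoping.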

A successfully implemented ODL is a springboard for agile implementation of new business requirements. MongoDB can help drive continued innovation through a structured program that facilitates prototyping and development of new features and applications.

Operational data layer success stories

HSBC

HSBC’s data assets are growing rapidly — from 56 PB in 2014 to 93 PB in 2017. Customers are demanding more, regulators are asking for more, and the business is generating more. In order to make trading data available to a multitude of new digital services, HSBC implemented an Operational Data Layer to become the single source of truth. The ODL, powered by MongoDB, enables HSBC’s development and architecture teams to meet the board’s strategy of using technology to make the bank “simpler, faster, and better.”

RBS

RBS implemented an Operational Data Layer — which they call an Enterprise Data Fabric — in order to improve data quality, reduce duplication, and simplify architectures to become leaner. The results? Cost reduction, plans to decommission hundreds of legacy servers, an environment of collaboration and data sharing, and the ability to develop new applications in days, rather than weeks or months on the old systems.

“Data Fabric provides data storage, query, and distribution as a service, enabling application developers to concentrate on business functionality.”

Michael Fulke, Development Team Lead, Royal Bank of Scotland

Alight Solutions

Alight Solutions (formerly part of Aon PLC) provides outsourced benefits administration for close to 40 million employees from over 1,400 organizations, but retrieving customer data from multiple frontend and backend source systems meant high mainframe MIPS costs, scaling difficulties, and high query latency. Moving to data as a service delivered from an ODL on MongoDB reduced query latency by 250x for better customer experience, lowered peak mainframe consumption to reduce costs, and unlocked new business innovation.

Barclays

Barclays is solving one of the hardest challenges facing any enterprise: a true 360-degree view of the customer with an ODL that gives all support staff a complete single view of every interaction a customer has had with the bank. This is helping Barclays drive customer interactions to new digital channels and improve the customer experience.

Conclusion

An Operational Data Layer makes your enterprise data available as a service on demand, simplifying the process of building transformational new applications. It can reduce load on source systems, improve availability, unify data from multiple systems into a single real-time platform, serve as a foundation for re-architecting a monolith into microservices, unlock the value of data trapped inside legacy systems, and more. An ODL becomes a system of innovation, allowing an evolutionary approach to legacy modernization.

Implementing an ODL can be a challenging undertaking. It requires technical skills, the right tools, and coordination among many different parts of the business. By working with MongoDB, you get access to the right technology, people, and process for success.

To discuss building an Operational Data Layer with MongoDB, contact us.

We can help

We are the MongoDB experts. Over 6,600 organizations rely on our commercial products. We offer software and services to make your life easier:

MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It's a finely-tuned package of advanced software, support, certifications, and other services designed for the way you do business.

MongoDB Atlas is a database as a service for MongoDB, letting you focus on apps instead of ops. With MongoDB Atlas, you only pay for what you use with a convenient hourly billing model. With the click of a button, you can scale up and down when you need to, with no downtime, full security, and high performance.

MongoDB Stitch is a serverless platform which accelerates application development with simple, secure access to data and services from the client — getting your apps to market faster while reducing operational costs and effort.

MongoDB Mobile lets you store data where you need it, from IoT, iOS, and Android mobile devices to your backend, using a single database and query language.

MongoDB Cloud Manager is a cloud-based tool that helps you manage MongoDB on your own infrastructure. With automated provisioning, fine-grained monitoring, and continuous backups, you get a full management suite that reduces operational overhead, while maintaining full control over your databases.

MongoDB Consulting packages get you to production faster, help you tune performance in production, help you scale, and free you up to focus on your next release.

MongoDB Training helps you become a MongoDB expert, from design to operating mission-critical systems at scale. Whether you're a developer, DBA, or architect, we can make you better at MongoDB.
