The Great Data Divide: Here's What's Hindering Your AI Goals
Organizational data is arguably the lifeblood of most digital-era companies. And yet, despite its significance and importance to the organization as a whole, the creation and subsequent management of data in most organizations are bifurcated - split between whether the data is transactional or analytical (operational vs after-the-fact & historical). Between these two worlds, a great divide exists. Like the Equator line which divides our planet into northern and southern hemispheres, many of our organizations operate with separate transactional and analytics hemispheres. Rooted in hardware and software limitations, transactional and analytics data processing workloads are run against different systems and hardware, which are run by separate teams as well. While this has been an effective strategy for managing organizational data assets for a very long time, advances in hardware and software, and the availability of cloud infrastructure, have changed that. When it comes to an organization's ability to deploy AI at scale now, we need to change this approach towards processing and managing data if we want to deliberately increase our organization's overall data processing proficiency. We’re here to suggest a different operating model for consideration, one that’s based on the collective experiences of working with over 40K data-processing customers, many of whom are leading the way when it comes to reorganizing themselves for high data proficiency, to support their AI ambitions and programs. Treating data as a product Let’s erase the line between transactional and analytics for a moment, and instead, view the overall flow and use of information within an organization. It’s created, it’s updated, and it’s read by employees, customers, data & analytics workers, and executives. Sometimes it finds itself inside an application, sometimes it manifests itself on a month-end report. Sometimes it’s used to train and retrain machine learning models. 
It’s this last scenario that’s starting to reveal significant deficiencies associated with traditional methods of managing data found within many organizations. Thanks to things like mobile, cloud, and IoT, data is moving at a breakneck pace. 40 years ago, we primarily transacted in a business application and then shuttled the deltas overnight into a data warehouse. Why? Because it was simply not possible to execute analytics queries against a running transactional system. At best, the queries would time out. At worst, you would slow or halt business transactional processing, and bring the business to a stand-still. In addition, all analytics were after-the-fact. We didn’t need to try and execute analytics queries against transactional systems. A single enterprise data warehouse repository was good enough to satisfy the reporting demands the organization placed on the data. Today, however, our historical data assets are becoming ever more significant, and sometimes even within real-time transactional business processing. Insights that can be gleaned from historical data, can be fed into decision-making transactional systems, to drive better, or more efficient outcomes. Think of automated decisions and inferences. Machine learning models are now supplementing some of the data analysis and decision-making that humans have traditionally had to perform. As the benefits from these models become more commonplace within transactional business systems, it’s important that they make accurate decisions, especially in heavily regulated industries such as insurance and financial services . A machine-learning model, as such, may need to be retrained often, and many models now demand access to data that is real-time, or as near real-time as possible. It’s this hunger for data that is causing AI models to cross over the great data equator. 
Not being satisfied with historical data, these models are increasingly demanding to be trained and retrained on data that is as fresh from having been created or updated, as possible. When we treat our data as a product, we see it as a thing, a business entity, or a noun. A customer, a policy, a claim, etc. However, it also has characteristics like state, age, and context. Is it in motion, or is it at rest? Has it just been created? Is it in the process of being updated, or is it years old, sitting in a warehouse? For which business context is it being leveraged - a customer browsing products, or a data scientist looking for trends in past sales? Across all of these characteristics and contexts, the data itself isn’t any more or less important. It’s simply important because it’s the data. Worlds apart When we task entirely separate teams, however, to manage it - transactional vs analytics - we lose this holistic data-as-a-product perspective. Instead, we put very different lenses on, whether we’re looking at a software delivery team, vs a data engineering team supporting data scientists. The meaning of data, after it’s transacted, for example, may change once it’s landed in the enterprise data warehouse or data lake. Transformations and manipulations are applied to it as it crosses over the great data equator, sometimes creating very different instances of the data. The journey often alters it from its original ground-truth state, done so while in between being copied from a transactional database, and loaded into an analytics one. After that data lands in analytics databases and platforms, it’s often further transformed and copied into even more subsequent databases and platforms. For the past decade, most AI efforts have been executed within the analytics hemisphere. Historical data assets in our data warehouses and data lakes have been sufficient to serve experiments and even production AI use cases. 
The more AI becomes commonplace, however, the more we can expect that AI models will want both historical and real-time data. As such, we should be realigning our bifurcated transactional and analytics organizations to help them operate as efficiently as possible, and to serve the right state of the data to the right consumer, for the right context.

Uniting with Domain Driven Design

Some of the best things that have come from software delivery organizations embracing Domain Driven Design come from aligning developers, architects, business SMEs, and scrum masters into the same team, or team of teams: a bounded context in which all the folks who care about, interact with, or manipulate the software and the data can work together without having to cross departmental boundaries or fight the bureaucracy that causes friction when trying to deliver working software. If we consider the goal of being highly proficient and effective with data, especially complicated data (data whose state and context change quickly), it stands to reason that an Agile team of teams, or bounded context, should include not only the business SMEs, the software developers, and the architects and site reliability engineers (SREs) who maintain applications, but also the data engineers and data scientists who currently manage after-the-fact data assets and are using them to bring AI models to life. If we truly want to embrace and treat data as a product, however, we need to eradicate the notion that data should be managed in two different hemispheres across the organization - transactional and analytics. The data will change state, often, and will only continue to do so for the foreseeable future. Engineering the organization for success - efficiency and accuracy when it comes to data processing - requires deliberateness. For that, we have to actively seek out and make our goals happen. Those goals should be focused on removing known friction points: the junctions at which the exchange or processing of information is inefficient, struggling to scale, costing too much effort and money, or all of the above.

All hands on deck

When it comes to building sophisticated digital applications, when it comes to managing data (in whatever state or context), when it comes to building and maintaining AI models, when it comes to incorporating those models into actual business workflows and applications, it truly takes a village. As AI begins to accelerate the ability to write and deploy code, for example, the pace of application feature delivery in most organizations will increase. In short, we’re going to be expected to do more, in less time, thanks to the forthcoming generation of AI-enabling assistants. This will place even greater demands and expectations on the organization's technology and data workers, and especially the data infrastructure. Similarly, as AI models consume either real-time or historical data, our ability to accurately, efficiently, and quickly process and manage all of this data will need to increase significantly.

The way forward

Aligning people and resources to common goals is an effective way to transform an organization. Setting goals like treating data as a product, and embracing the principles of domain-driven design when it comes to an organization’s data-engineering practices, can help tremendously in moving towards more accurate, efficient, and performant data processing. In the organizations we work with, large and small, this transformation is beginning, and it’s erasing the hard line that’s existed between two distinct data hemispheres in the organization. As AI becomes more significant, so do your developers, data scientists, and data engineers. We need them working together as efficiently and effectively as possible, to meet our organization’s aspirations. A way to achieve this comes from reducing the friction when it comes to working with data - for developers, data scientists, and AI models alike.
We invite you to learn more about our work in insurance.
Build an ML-Powered Underwriting Engine in 20 Minutes with MongoDB and Databricks
The insurance industry is undergoing a significant shift from traditional to near-real-time data-driven models, driven by both strong consumer demand and the urgent need for companies to process large amounts of data efficiently. Data from sources such as connected vehicles and wearables is used to calculate precise and personalized premium prices, while also creating new opportunities for innovative products and services. As insurance companies strive to provide personalized and real-time products, the move towards sophisticated and real-time data-driven underwriting models is inevitable. To process all of this information efficiently, software delivery teams will need to become experts at building and maintaining data processing pipelines. This blog will focus on how you can revolutionize the underwriting process within your organization, by demonstrating how easy it is to create a usage-based insurance model using MongoDB and Databricks. This blog is a companion to the solution demo in our GitHub repository. In the GitHub repo, you will find detailed step-by-step instructions on how to build the data upload and transformation pipeline leveraging MongoDB Atlas platform features, as well as how to generate, send, and process events to and from Databricks. Let’s get started.

Part 1: The Use Case Data Model
Part 2: The Data Pipeline
Part 3: Automated Decision Support with Databricks

Part 1: The use case data model

Figure 1: Entity relationship diagram - Usage-based insurance example

Imagine being able to offer your customers personalized usage-based premiums that take into account their driving habits and behavior. To do this, you'll need to gather data from connected vehicles, send it to a Machine Learning platform for analysis, and then use the results to create a personalized premium for your customers. You’ll also want to visualize the data to identify trends and gain insights.
This unique, tailored approach will give your customers greater control over their insurance costs while helping you to provide more accurate and fair pricing. A basic example data model to support this use case would include customers, the trips they take, the policies they purchase, and the vehicles insured by those policies. This example builds out three MongoDB collections, as well as two Materialized Views. The full Hackolade data model, which defines all the MongoDB objects within this example, can be found here.

Part 2: The data pipeline

Figure 2: The data pipeline - Usage-based insurance

The data processing pipeline component of this example consists of sample data, a daily materialized view, and a monthly materialized view. A sample dataset of IoT vehicle telemetry data represents the motor vehicle trips taken by customers. It’s loaded into the collection named ‘customerTripRaw’ (1). The dataset can be found here and can be loaded via mongoimport or other methods. To create a materialized view, a scheduled Trigger executes a function that runs an Aggregation Pipeline. This generates a daily summary of the raw IoT data and lands it in a Materialized View collection named ‘customerTripDaily’ (2). Similarly, for the monthly materialized view, a scheduled Trigger executes a function that runs an Aggregation Pipeline that, on a monthly basis, summarizes the information in the ‘customerTripDaily’ collection and lands it in a Materialized View collection named ‘customerTripMonthly’ (3).

For more info on these and other MongoDB platform features, see: MongoDB Materialized Views; Building Materialized View on TimeSeries Data; MongoDB Scheduled Triggers; Cron Expressions.

Part 3: Automated decisions with Databricks

Figure 3: The data pipeline with Databricks - Usage-based insurance

The decision-processing component of this example consists of a scheduled trigger and an Atlas Chart.
The scheduled trigger collects the necessary data and posts the payload to a Databricks MLflow API endpoint (the model was previously trained using the MongoDB Spark Connector on Databricks). It then waits for the model to respond with a calculated premium based on the miles driven by a given customer in a month. The scheduled trigger then updates the ‘customerPolicy’ collection, appending the new monthly premium calculation as a new subdocument within the ‘monthlyPremium’ array. You can then visualize your newly calculated usage-based premiums with an Atlas Chart!

In addition to the MongoDB Platform Features listed above, this section utilizes the following: MongoDB Atlas App Services; MongoDB Functions; MongoDB Charts.

Go hands on

Automated digital underwriting is the future of insurance. In this blog, we introduced how you can build a sample usage-based insurance data model with MongoDB and Databricks. If you want to see how quickly you can build a usage-based insurance model, check out our GitHub repository and dive right in! Learn more about MongoDB and Insurance.
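To make the daily materialized view from Part 2 and the premium update from Part 3 more concrete, here is a minimal sketch in Python. It is not the code from the GitHub repository (the demo runs its logic in Atlas Functions); it only builds the aggregation stages and the update document as plain data. The field names `customerId`, `timestamp`, `miles`, `month`, and `premium` are assumptions made for illustration, while the collection and array names come from the article.

```python
from datetime import datetime, timezone


def daily_summary_pipeline(day_start, day_end):
    """Build aggregation stages that summarize one day of raw telemetry per customer."""
    return [
        # Keep only the readings that fall inside the requested day.
        {"$match": {"timestamp": {"$gte": day_start, "$lt": day_end}}},
        # Produce one summary document per customer per calendar day.
        {"$group": {
            "_id": {
                "customerId": "$customerId",  # assumed field name
                "day": {"$dateTrunc": {"date": "$timestamp", "unit": "day"}},
            },
            "totalMiles": {"$sum": "$miles"},  # assumed field name
            "readingCount": {"$sum": 1},
        }},
        # Upsert the summaries into the materialized view collection.
        {"$merge": {
            "into": "customerTripDaily",
            "on": "_id",
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ]


def monthly_premium_update(customer_id, month, premium):
    """Build the filter and update that append a premium to the 'monthlyPremium' array."""
    filter_doc = {"customerId": customer_id}  # assumed lookup key
    update_doc = {"$push": {"monthlyPremium": {"month": month, "premium": premium}}}
    return filter_doc, update_doc


pipeline = daily_summary_pipeline(
    datetime(2023, 1, 1, tzinfo=timezone.utc),
    datetime(2023, 1, 2, tzinfo=timezone.utc),
)
filter_doc, update_doc = monthly_premium_update("C-100", "2023-01", 42.5)
```

With a PyMongo database handle `db`, these could be run as `db["customerTripRaw"].aggregate(pipeline)` and `db["customerPolicy"].update_one(filter_doc, update_doc)`. Using `$merge` as the last stage is what turns an ordinary aggregation into an on-demand materialized view, since each scheduled run refreshes the target collection in place.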
Digital Underwriting: Riding the Insurance Transformation Wave With MongoDB
In our previous article about digital underwriting, “ A Digital Transformation Wave in Insurance ,” we covered the main challenges insurers face when it comes to streamlining and modernizing their underwriting processes, along with key areas that can be improved by leveraging the power of data and artificial intelligence. We analyzed how modern IT trends require a complete redesign of manual underwriting processes to enable insurers to leverage new market opportunities and stay relevant in an ever-changing risk landscape. We explored how the full underwriting workflow — from the intake of new cases to risk assessment and pricing — can be redesigned to ease the burden on underwriting teams and enable them to focus on what matters most. In this second article, we’ll expand on how new technology paradigms can support transformation initiatives in this space and describe the pivotal role MongoDB plays in disrupting the industry. The importance of data and new technology paradigms For digital underwriting transformation initiatives to succeed, organizations must move away from monolithic applications, where data is siloed and functionality is fragmented across different technologies. However, as many organizations have additionally come to realize, lifting and shifting these monolithic applications to the cloud does not automatically bring them closer to achieving their digital objectives. Organizations that are successful in their transformation efforts are increasingly adopting MACH architecture principles to modernize their application stacks. The acronym stands for Microservices, API-first, Cloud-based, and Headless, and, combined, those principles enable developers to leverage best-of-breed technology and build services that can be used across multiple different business workflows and applications. 
These principles allow software delivery teams to reduce the time it takes to deliver new business features and promote significant reuse and flexibility far beyond the monolithic applications that pre-date them. From an insurance perspective, this approach enables underwriting systems to be decoupled into business and capability domains, each working independently, yet sharing data as part of an event-driven design and microservices architecture. Often overlooked, shared capability domains can provide significant value to an organization's business domains, as seen in the visual below.

Figure 1. Key business and capability domains.

Each function of the application should be owned by the team holding expertise in that particular domain and be loosely coupled with the others. Services can communicate with each other via APIs, as well as listen for and consume one another's events. Building a domain-based data modernization strategy can also enable a phased migration away from legacy systems. This allows for immediate realization of the organization's digital objectives, without first engaging in a costly and time-consuming legacy system replacement effort. An event-driven and API-enabled architecture allows for real-time data processing, a core component of digital enablement.

Figure 2. Microservices and event-driven architecture.

Decision support services

Once monolithic systems are decomposed into finer-grained domains and services and begin interacting via APIs and events, it is possible to focus on the most crucial component that brings all of them together: the decision support domain. Its role is to streamline and, where possible, automate underwriting and other decision-making processes that traditionally require heavy administrative and manual work, in order to reduce operational expenses and enable critical underwriting staff to focus on the highest-priority work.
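The idea of loosely coupled domains that consume one another's events can be illustrated with a small in-process sketch. Everything here is hypothetical: the `EventBus` class stands in for a real broker, and the event names, the 1% pricing rule, and the review threshold are invented for illustration rather than taken from any MongoDB product or insurer workflow.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for a message broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Deliver the event to every handler registered for this type.
        for handler in list(self._subscribers.get(event_type, [])):
            handler(payload)


bus = EventBus()
decisions = []


def pricing_service(event):
    # Hypothetical pricing rule: premium is 1% of the sum insured.
    quote = {
        "applicationId": event["applicationId"],
        "premium": event["sumInsured"] // 100,
    }
    bus.publish("quote.ready", quote)


def decision_service(event):
    # Hypothetical routing rule: large premiums go to a human underwriter.
    route = "manual-review" if event["premium"] > 5000 else "auto-approve"
    decisions.append((event["applicationId"], route))


# Each domain only knows the event names, not the other services.
bus.subscribe("application.submitted", pricing_service)
bus.subscribe("quote.ready", decision_service)

bus.publish("application.submitted", {"applicationId": "A-1", "sumInsured": 200_000})
bus.publish("application.submitted", {"applicationId": "A-2", "sumInsured": 900_000})
print(decisions)
```

The pricing and decision functions never call each other directly, so either one could be replaced or moved to a separate service without changing the other, which is the property the phased migration argument relies on.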
Effective underwriting processes require pulling together multiple teams and capability domains (e.g., claim, customer, pricing, billing, and so forth) to be able to reach a decision on whether to insure a new customer or define an adequate pricing and coverage model, among other factors. A decision support engine has the power to fully automate those steps by automatically triggering workflows based on specific events (e.g., a new claim is submitted in the system) as part of the event-driven design referenced earlier to enable real-time decision making. Why MongoDB With the added burden of integrating and working with various sources of data — from APIs to events to legacy databases — and doing so in real time, software delivery teams need a developer data platform that allows them to tame complexity, not increase it. Refactoring systems that have been around for decades is not an easy feat and typically results in multi-year transformation initiatives. MongoDB provides insurers with the same ACID capabilities of relational databases, while introducing new tools and flexibility to ease transformation by increasing developer productivity and fully supporting the MACH principles. The MongoDB application data model MongoDB provides a developer data platform leveraged by some of the world’s largest insurers. It possesses key capabilities that allow it to: Integrate legacy siloed data into a new single view. The flexibility of the document model enables the integration of separate, legacy data stores into an elegant, single-view data model that reduces rather than increases complexity. Without the complexities of another canonical, relational model, application development and data migration efforts are dramatically simplified, and delivery timelines shortened. Manage the full lifecycle of containerized applications. 
MongoDB’s Enterprise Operator for Kubernetes lets you deploy and manage the full lifecycle of applications and MongoDB clusters from your Kubernetes environment for a consistent experience regardless of an on-premises, hybrid, or public cloud topology. Automate workflows, leveraging events in real-time. MongoDB provides the data persistence at the heart of event-driven architectures with connectors and tools that make it easy to move data between systems (e.g., MongoDB Connector for Apache Kafka ), providing a clear separation between automated underwriting workflows and those requiring manual intervention. Enable business agility using DevOps methodologies. MongoDB Atlas , the global cloud database for MongoDB, provides users with quick access to fully managed and automated databases. This approach allows development teams to add new microservices and make changes to application components much more quickly. It also saves a substantial amount of operations effort, since database administrators are not required in every sprint to make and manage changes. Work quickly with complex data. Developers can analyze many types of data directly within the database, using the MongoDB Aggregation Pipeline framework. And, with the power of Atlas Federation, developers can do this without the need to move data across systems and complex data warehouse platforms, providing real-time analytics capabilities that underwriting algorithms require. MongoDB offers a flexible developer data platform that maps to how developers think and code, while allowing data governance when needed. It is strongly consistent and comes with full support for ACID transactions. Figure 3. The MongoDB developer data platform. The MongoDB developer data platform addresses a range of use cases without added complexity, including full-text search, support for storing data at the edge on mobile, data lake, charts, and the ability to deliver real-time analytics without moving data between systems. 
It also provides developers with a powerful yet simplified query interface suitable for a variety of workloads, enabling polymorphism and idiomatic access. Thank you to Ainhoa Múgica and Karolina Ruiz Rogelj for their contributions to this post. Contact us to find out more about how the MongoDB developer data platform can help you streamline your insurance business.
Connected Data: How IoT Will Save Healthcare and Why MongoDB Matters
Over the next decade, healthcare systems around the world will face a two-fold challenge: Delivering higher quality care while managing rising costs, and doing so for increasingly larger populations of patients. For decades, healthcare systems have operated predominantly with traditional fee-for-service models, in which reimbursements are given to providers based on services rendered. Value-based healthcare, in contrast, attempts to lower the cost of care by keeping patients healthier longer through more effective and efficient use of healthcare systems. This article — Part 2 of our series on connected healthcare data — looks at how IoT, with support from MongoDB, can help meet future healthcare challenges. Read Part 1 of this series on connected healthcare data Increased demand It's expected that by 2050, 22% of the world's population will be over 60 years old . This adds increased pressure to the goals of optimizing both patient outcomes and healthcare spend, because there are more people within healthcare systems than ever before. And, as these patients live longer, they experience more chronic conditions and, therefore, require more care. Constraints on the ability to graduate enough doctors and nurses to meet this surge of healthcare demand suggest that innovation will be needed to provide adequate supply. Additionally, many healthcare services are delivered in an exam or hospital room, where patient vitals and observations are captured, a chart is reviewed, and medications and treatments are ordered. According to a recent study from the Annals of Internal Medicine , providers spend more than 16 minutes per encounter on these tasks alone. Observation and data collection in healthcare is critical to identifying and subsequently adjusting treatment pathways; however, the process is heavily reliant on in-person visits. How IoT will save healthcare Global adoption of the Internet of Things (IoT) is soaring across numerous industries. 
In fact, healthcare is forecasted to be the second largest industry in value for IoT by 2030. IoT offers the ability to remotely monitor patients via wearables and connected devices. It provides the means to collect data beyond the patient exam or hospital room and can help providers deliver care outside of traditional, in-person methods. With this power to collect more information, more often, and do so with fewer patient encounters, IoT plays a role in solving the two-fold challenge of delivering better quality of care for increasingly larger populations of patients. A patient wearing a smartwatch, for example, may be able to stream heart rate and oxygen saturation levels during real-world activities to an electronic healthcare record, where the data can be aggregated and summarized for a physician to review, or even for a machine-learning algorithm to periodically interrogate. IoT devices can help collect more data, more often, to help providers deliver more meaningful, timely, and impactful healthcare recommendations and treatments to patients. Through this added value, IoT can further the benefits of telemedicine and promote the idea of “care anywhere,” in which healthcare is not directly tied to or dependent upon in-person encounters. Challenges of healthcare data on the move What challenges face developers when it comes to capturing and leveraging data from healthcare IoT devices? Four significant capabilities top the list, which we will look at in turn: Scalable and efficient storage Global coverage and data synchronization Interoperability Security and privacy Scalable and efficient storage IoT devices have the capability to produce massive volumes of continuous data. In fact, market intelligence provider International Data Corporation (IDC) predicts that IoT devices alone will produce 74.9 ZB of data by 2025, from a staggering 55.9 billion devices. 
A cloud-based developer data platform will be critical to support these kinds of massive data volumes, which may also exhibit unpredictable peaks in workloads. Additionally, as is the case for many IoT use cases, often only the most recent data is used for analysis. In this scenario, the ability to automatically archive raw and historical data to a more cost-effective storage, and yet still be able to query it when and if needed, would be ideal. MongoDB’s Atlas Online Archive lets developers do just that, with minimal setup and configuration required, as shown in Figure 1. Figure 1. MongoDB automates data tiering while keeping it queryable with Atlas Online Archive. Not all databases are ready to deal with the massive, continuous data generated by IoT devices. Sensor data is typically collected with high frequency, which may mean high concurrency of writes, unpredictable workload peaks, and the need for dynamic scalability. Additionally, IoT data is almost by definition time-series data, meaning it typically comes with a timestamp that allows following the evolution of a parameter through time, at regular or irregular time intervals. Storing time-series data efficiently at scale can be difficult. In fact, specialized time-series databases exist to tackle workloads such as these. Additionally, storing the data is simply one side of the challenge. Another aspect involves running analytics as the data is collected, such as discovering heart anomalies and sending alerts in real time to the patient. Using specialized time-series databases solves these challenges but also introduces new ones: Developers will need to learn the nuances of working with a niche platform, slowing development cycles. Building and maintaining ETL pipelines to move data and merge data across different platforms. Integrating, securing, and maintaining an additional database platform, thereby increasing operational overhead. 
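As a rough illustration of running analytics while time-stamped readings arrive, the sketch below flags heart-rate readings that deviate sharply from the recent average. It is plain Python with no database involved; the reading shape (`ts`, `bpm`), the window size, and the 30 bpm threshold are assumptions chosen for the example and carry no clinical meaning.

```python
from collections import deque
from statistics import mean


def detect_anomalies(readings, window=5, threshold=30):
    """Return readings whose bpm differs from the mean of the previous
    `window` readings by more than `threshold` beats per minute."""
    recent = deque(maxlen=window)
    alerts = []
    for reading in readings:
        # Only compare once a full window of history is available.
        if len(recent) == window and abs(reading["bpm"] - mean(recent)) > threshold:
            alerts.append(reading)
        recent.append(reading["bpm"])
    return alerts


# Hypothetical stream: one reading per second, with a single spike.
stream = [
    {"ts": 0, "bpm": 70},
    {"ts": 1, "bpm": 72},
    {"ts": 2, "bpm": 71},
    {"ts": 3, "bpm": 69},
    {"ts": 4, "bpm": 73},
    {"ts": 5, "bpm": 140},
    {"ts": 6, "bpm": 72},
]
alerts = detect_anomalies(stream)
print(alerts)
```

In a real deployment the same check would more likely be expressed as a query or window function close to the data, so that raw readings do not have to be copied to a separate system first.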
MongoDB's new time series collection feature allows you to automatically optimize your schema and deployment for high storage efficiency and low-latency queries, without the need for an additional, niche database. Additionally, MongoDB integrates time-series data with operational data and analytics capabilities in one unified environment with built-in scalability, delivering the performance your IoT applications need while simplifying your architecture.

Global coverage and data synchronization

For many IoT scenarios, users are effectively on the move: They go to work, they go shopping, and they get on planes to see the new beautiful shiny star on top of Barcelona's Sagrada Família. With all of this mobility, they might lose connectivity for a few minutes or even hours. Tracking their health effectively in real time is not just a nice feature, it may be mandatory. Using MongoDB’s Atlas Device Sync, developers can easily deploy IoT applications that seamlessly handle drops in connectivity, without missing critical write operations of the most important data workloads.

Interoperability

Most IoT devices use proprietary protocols and operating systems, which seriously limit interoperability. The IoT industry advocates the use of standard communication protocols such as MQTT, but, as of this writing, there is no single industry standard. Custom solutions exist that serve one single type of sensor and/or healthcare provider, but these solutions tend to suffer from interoperability challenges when interlinking data across different healthcare networks. As discussed in our first post, sharing healthcare data across different participants of the healthcare ecosystem requires standards such as JSON-based FHIR, which is key to mitigating healthcare fragmentation. Learn how we used MongoDB and MQTT to "listen" and "talk" remotely to an IoT-powered facility. Downloadable code available.
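To show why a JSON-based standard such as FHIR sits comfortably next to a document database, here is a minimal sketch that wraps a wearable heart-rate reading in a FHIR-style Observation. The function name and the patient identifier are hypothetical, and the result is a simplified resource that has not been validated against a FHIR server or profile; the LOINC code 8867-4 and the UCUM unit `/min` are the standard codes for heart rate.

```python
import json
from datetime import datetime, timezone


def to_fhir_heart_rate(patient_id, bpm, measured_at):
    """Wrap a single heart-rate reading in a simplified FHIR Observation."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": "8867-4",
                "display": "Heart rate",
            }]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": measured_at.isoformat(),
        "valueQuantity": {
            "value": bpm,
            "unit": "beats/minute",
            "system": "http://unitsofmeasure.org",
            "code": "/min",
        },
    }


# Hypothetical reading from a smartwatch.
observation = to_fhir_heart_rate(
    "example-123", 72, datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
)
print(json.dumps(observation, indent=2))
```

Because the resource is already a nested JSON document, it could be stored as-is in a document collection, with no mapping layer between the exchange format and the storage format.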
Security and privacy

Given its sensitive and personal nature (and relatively easy monetization through theft), health data is especially appealing to bad actors. The number of security incidents impacting healthcare systems is sobering. According to a report by CrowdStrike, 82% of health systems experienced some form of IoT cyberattack in 2020. With IoT proliferation on the rise, the need for the highest level of security at the application level and at the database level becomes non-negotiable. Unsurprisingly, McKinsey cites interoperability, security, and privacy as major headwinds for IoT adoption, especially for healthcare.

How MongoDB supports IoT challenges

Here's a summary of how MongoDB helps developers bring IoT applications to market faster:

Scalability and efficient storage: High availability and scalability are built in via replication and native sharding. Online Archive automatically archives aged data to a fully managed cloud object storage, so you can optimize cost and performance without sacrificing data accessibility. Time series collections automatically optimize your schema for high storage efficiency, low-latency queries, and real-time analytics.

Global coverage and data synchronization: MongoDB Atlas is a global, multi-cloud platform that lets your apps run anywhere in the world. Atlas Device Sync solves conflict resolution and keeps your data up to date across devices, users, and your backend, regardless of connectivity.

Interoperability: The document model provides a flexible schema and maps exactly to the objects that developers work with in their code. Different industry communication standards are being built over JSON, such as FHIR, which is a natural fit to MongoDB's document model.

Security and privacy: Thanks to MongoDB Client-side Field Level Encryption, data is encrypted in motion, in memory, and at rest. Queryable Encryption allows running expressive queries on fully randomized encrypted data. MongoDB provides the strongest levels of data privacy and security for regulated workloads.

MongoDB Atlas takes care of the backend, removing friction from the development process and simplifying your technology stack, so you can focus on building differentiating features for your applications. Atlas is a developer data platform that supports a broad array of use cases, from operational to transactional and through analytical workloads. Atlas also offers the following features:

Ability to service more loads of the data lifecycle: Enabling development teams to seamlessly analyze, transform, and move data while reducing reliance on batch processes or ETL jobs
Built on a modern data model: Aligning to the way developers think and code
Integrated: Delivering an elegant developer experience

Figure 2. Atlas is a developer data platform built on three pillars: the document model, a unified interface for different data use cases, and a multi-cloud, enterprise-ready foundation.

MongoDB for IoT-powered healthcare apps

IoT and specifically wearables will play a major role in solving the two-fold challenge of delivering better quality care for increasingly larger populations of patients. The soaring adoption of wearables is accelerating the need for a developer data platform that helps software delivery teams build and manage health applications with scalable and efficient storage, global coverage and data synchronization, interoperability, and security and privacy. MongoDB Atlas is a developer data platform designed to manage the heavy lifting for you, by providing an elegant developer experience and unifying a broad range of data workloads with world-class privacy and security features.

Read Part 1 of this series on connected healthcare data, and learn more about MongoDB Atlas and the healthcare industry.
Connected Healthcare Data: Interoperability to Solve Fragmentation and Drive Better Patient Outcomes
Many differences exist across healthcare systems around the globe, but there is one unfortunate similarity: fragmentation. Fragmentation is a consequence of the inability of various healthcare organizations (both public and private) to communicate with each other or to do so in a timely or consistent manner, and it can have a dramatic impact on patient and population well-being. Interoperability and communication A patient can visit a specialist for a specific condition and the family doctor for regular checkups, perhaps even on the same day. But how can both doctors make appropriate decisions if patient data is not shared between them? Fragmented healthcare delivery, as described in this scenario, also leads to data fragmentation. Such data fragmentation can cause misdiagnosis and services duplication. It can also lead to billing issues, fraud, and more, causing preventable harm and representing a massive economic burden for healthcare systems worldwide. To improve healthcare fragmentation, we need truly interoperable health data. The longitudinal patient record A longitudinal patient record (LPR) is a full, life-long view of a patient’s healthcare history and the care they’ve received. It’s an electronic snapshot of every interaction patients have, regardless of provider and service. Ideally, this record can be shared across any or all entities within a country’s healthcare system. The LPR represents a step beyond the electronic health record, extending past a specific healthcare network to a regional or national level. It’s critical that LPRs use the same data format and structure to guarantee the ability of healthcare providers to easily and quickly interact with them. Data standards for LPRs are key to interoperability and can help address healthcare fragmentation, which, in turn, can help save lives by improving care. 
FHIR Fast Healthcare Interoperability Resources (FHIR) is a commonly used schema that comprises a set of API and data standards for exchanging healthcare data. FHIR enables semantic interoperability to allow effective communication between independent healthcare institutions and essentially defines “how healthcare information can be exchanged between different computer systems regardless of how it is stored in those systems” ( ONC Fact Sheet, “What is FHIR?” ). FHIR aims to solve the fragmentation problem of the healthcare system by directly attacking the root of the problem: miscommunication. As is the case for many other modern communication standards (for example, ISO 20022 for finance ), FHIR builds its REST API from a JSON schema. This foundation is convenient, considering most modern applications are built with object-oriented programming languages that have JSON as the standard file and data interchange format. This approach also makes it easier for developers to build applications, which is perhaps the most important point: The future of healthcare delivery may increasingly depend on the creation of applications that will transform how patients and providers interact with healthcare systems for the better. MongoDB: FHIR and healthcare app-ification MongoDB is a document database and is therefore a natural fit for building FHIR applications. With JSON as the foundation of the MongoDB document model developers can easily store and retrieve data from their FHIR APIs to and from the database, with no translation or change of format needed. In fact, organizations can adopt FHIR resources as the basis of a new, canonical data model that existing internal systems can begin to shift and conform to. One example is the Exafluence FHIR API , which is built on top of MongoDB. Exafluence's API allows for real-time data interchange by leveraging Apache Kafka and Spark, in either an on-premise or multi-cloud deployment. 
Software teams leveraging the Exafluence solution have experienced velocity increases of their FHIR interoperability projects by 40% to 60%. MongoDB's tool set can develop value-add business solutions on the FHIR-native dataset, without ETL.

Beyond FHIR, the trend toward healthcare app-ification (i.e., the increasing use of applications in healthcare) clashes with pervasive legacy architectures, which typically are not optimized for the developer experience. Because of this reliance on legacy architectures, modernization or transformation initiatives often fail to take hold or are postponed as companies perceive the risks to be too high and the return on investment is not evident. It doesn’t have to be this way, however. MongoDB’s industry-proven iterative approach to modernization reduces the risk of application and infrastructure migration and unlocks developer productivity and innovation. Interoperable, modern healthcare applications can now be built in a developer-friendly environment, with all the benefits expected from traditional databases (i.e., ACID transactions, expressive query language, and enterprise-grade security). MongoDB provides the freedom for solutions to be deployed anywhere (e.g., on-premises, multi-cloud), providing a major advantage for healthcare organizations, which typically have multi-environment deployments.

Healthcare and the cloud

Digital healthcare will accelerate the adoption of cloud technologies within the industry, enabling innovation at scale and unlocking billions of dollars in value. Healthcare organizations, however, have so far been reluctant to move workloads to the cloud, mostly because of data privacy and security concerns. To support such cloud adoption initiatives, MongoDB Atlas offers a unique multi-cloud data platform, integrating MongoDB in a fully managed environment with enterprise-grade security measures and data encryption capabilities. MongoDB Atlas is HIPAA-ready and a key facilitator for GDPR compliance.
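The earlier point that FHIR resources can move between an API and a document database with no change of format can be illustrated with a small sketch. It builds a minimal FHIR Patient resource as a plain dictionary, serializes it to the JSON a REST API would exchange, and parses it back into the shape a document store would hold. The helper name `make_patient` and the sample values are hypothetical; only the field names (`resourceType`, `name`, `gender`, `birthDate`) come from the FHIR Patient resource.

```python
import json


def make_patient(patient_id, family, given, birth_date, gender):
    """Return a minimal FHIR Patient resource as a plain dict."""
    return {
        "resourceType": "Patient",
        "id": patient_id,
        "name": [{"family": family, "given": list(given)}],
        "gender": gender,         # FHIR codes: male | female | other | unknown
        "birthDate": birth_date,  # FHIR date, e.g. YYYY-MM-DD
    }


# Sample values are made up for illustration.
patient = make_patient("example-1", "Garcia", ["Ana"], "1985-04-12", "female")

# The JSON payload a FHIR REST API would send or receive.
wire_payload = json.dumps(patient)

# The document as it would be stored: the same structure, with no
# relational mapping or intermediate format in between.
stored_document = json.loads(wire_payload)
```

Because the stored document and the API payload share one structure, a query on a nested field such as `name.family` can address the resource exactly as the FHIR specification describes it.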
A holistic view of patient care

Interoperable healthcare records and communication standards will make longitudinal patient records possible by providing a much-sought-after holistic view of the patient, helping to fix healthcare fragmentation. Many challenges still exist, including transforming legacy infrastructures into modern, flexible data platforms that can adapt to the exponential changes happening in the healthcare industry. MongoDB provides a developer data platform designed to unlock developer productivity and ultimately give healthcare organizations the power to focus on what matters most: the patient.

Read Part 2 of this series on connected healthcare data, and learn more about MongoDB Atlas and the healthcare industry.
How the Healthcare Industry Benefits From Platform Thinking
As the healthcare industry uncovers new ways to utilize health data to drive value-based care, digital platforms will bring applications together into a seamless and single-user experience for patients and providers. Digital platforms are greater than the sum of the individual applications that underpin them, which is why competitive advantage can’t simply be purchased with off-the-shelf software solutions. In my last blog, I outlined the three key steps healthcare providers can take to migrate toward a digital health business model:

Define where you want to go
Put technology in its place
Pivot from applications to platforms

Step three is so important that it needed its own blog post. For more details about the first two steps, catch up on my last blog post.

Platform thinking for your healthcare IT strategy

In healthcare, digital platforms provide a single point of entry, where a patient or a provider can access everything they want or need, regardless of the complexity or location of the underlying systems and data. Platform thinking is the embodiment of digital transformation, leveraging an organization's technology and data to deliver new and innovative products and services. How well an organization can integrate its unique applications and services into digital platforms will be what differentiates it from competitors.

Digital platforms also bring new complexities to the fold. Developers must stitch together data and functionality across what may be many siloed sources, and do so in real time. A new provider digital platform may require ingesting a stream of medical sensor data, contrasting it with deep analytics insights from a data lake, and generating automated decisions and alerts, for example. A platform perspective provides clarity as to why simply lifting and shifting existing applications to the cloud will not bring about the kind of transformation that’s required in healthcare.
Lifting existing applications into the cloud is an investment in time and money that likely does more to enhance resumes, than position a healthcare organization for the needs of the future of healthcare delivery. The EHR system that runs in your data center will likely have the exact same features and limitations it does now, even after you lift-and-shift it to the cloud. Platform thinking for healthcare data Digital platforms demand architectures that free developers from the underlying application landscape complexities. Going to market quickly with new features, ones that deliver real-time data, and even swapping out services as your business evolves, are critical capabilities. They are also why digital platforms are almost always underpinned by an ODL (Operational Data Layer). An ODL can serve as a system of record for new features, as well as an integration layer between existing systems and third party services. An ODL supporting a digital platform needs to work like a Swiss Army knife. Twenty years ago, nearly all of an organization's data interactions were executed against a relational database. Today’s software delivery teams need to support features beyond what relational offers, however, such as fuzzy-search, type-ahead, geospatial, graph, IoT, time-series, and even mobile capabilities. Choosing the right technology for your ODL can be overwhelming, but perhaps simplified by first listing out ideal features and capabilities it should support: A digital platform ODL should support: Relational Document Geospatial Graph Search IoT Time-series Mobile On-premise support Cloud-hosted Multi-cloud Underpin digital platforms and ecosystems with an ODL A MongoDB ODL serves as the best technology to underpin digital platforms and ecosystems. With the inherent flexibility of the document model, complex data from a variety of sources can be seamlessly integrated in real time. 
We believe that the digital platform ODL needs to be a general-purpose workhorse that can serve a wide variety of today's most demanding workloads. When it comes to avoiding the pitfalls associated with so many cloud migration efforts, MongoDB makes it easier than ever to bring your data safely and securely to the cloud, or even multiple clouds. Plus, MongoDB is both HIPAA and GDPR compliant, and it offers rich, unparalleled features such as Client-Side Field Level Encryption, making it arguably the safest choice for your most sensitive healthcare data workloads. We invite you to connect with us here, and let us partner with you as you build out the next generation of Healthcare digital platforms. Additional Resources: [ Blog ] The promise of public clouds for digital healthcare [ Case study ] How Humana took HL7 FHIR to the cloud and drove better patient experiences [ Blog ] FHIR Technology is Driving Healthcare’s Digital Revolution [ Blog ] Drowning in data: why it’s time to end the healthcare data lake
The Promise of Public Clouds for Digital Healthcare
Digital healthcare in 2030: Personalized, preventive care designed by platform thinking Cloud-based, digital platforms are integral to the future of healthcare delivery. It stands to reason that how well an organization executes on a cloud migration strategy today, may be an indicator of possible future performance and success. Picture two scenarios in the year 2030. In scenario A, you’re in an exam room with your physician and you’re connected to a device that can perform a host of diagnostic functions on the fly, like imaging, blood testing, and cardiovascular activity. A band-aid sized medical device you’ve been wearing helped alert your physician to a possible condition, which is the reason for your visit. Data from that wearable device, combined with the data collected by the device in the exam room, is analyzed using several machine learning models. Those models have been trained using data from millions of healthcare records across the globe and by clinicians seeking out effective treatment pathways for numerous conditions. Within minutes, in the exam room, thousands of data points come together to assist your physician in providing you with a diagnosis and treatment that helps you live the healthiest life possible. In scenario B, patients’ interactions with healthcare providers are much like they are today. You may have the benefit of some automated appointment reminders, or you may be able to share healthcare data from your wearable device directly with your physician. Other than that, the average patient’s care isn’t improved by the strategic benefits of digital health data in the public cloud. How healthcare and insurance companies get to scenario A, where the healthcare industry has the benefit of personalized patient experiences, will hinge on the decisions made today. 
Pitfalls on the road to the digital healthcare future The combination of medical devices, data and machine learning could revolutionize healthcare delivery to be far more predictive and effective than ever before. We currently have the tools to build a healthcare system predominantly delivered within digital platforms and ecosystems with devices, data, and services integrated and interoperable within and across provider networks. These services could be facilitated by cloud computing platforms and infrastructure. Despite having entered its second decade, however, cloud adoption continues to pose a challenge to a significant number of healthcare organizations globally. As McKinsey pointed out , “an overwhelming majority of large organizations [have experienced] … failure modes” when it comes to cloud migration efforts.* McKinsey coined four “failure modes” to describe the different scenarios that can wreak havoc on an organization's public cloud adoption efforts: pilot stall, cloud gridlock, no value from lift and shift, and cloud chaos.* We’ll focus specifically on how your organization can avoid failure with both pilot stall and no value from “lift and shift.” McKinsey describes them as follows: “Pilot stall: Companies have succeeded in implementing a few greenfield applications on public-cloud platforms, but the value derived from these programs has been limited. This makes further progress impossible because tech leaders cannot make a convincing business case to extend the use of the cloud platform into the heart of IT’s technology environment.” “No value from ‘lift and shift’: The migration of significant portions of the technology environment—largely by replacing on-premises virtual machines with off-premises ones without taking advantage of cloud-optimization levers—has failed to significantly reduce costs or increase flexibility. 
Support for cloud initiatives subsequently collapses.” 3 key steps towards a digital health business model Clearly, a strong partnership between business and IT is essential for success, but is there more to the story? We think so. Organizations that can synthesize both business and IT objectives into a unified vision will be the most successful. This new digital business model is key to truly transforming healthcare. It recognizes that technology is something we put to work to solve healthcare problems and challenges, and create new opportunities. Here are three steps to take: Define where you want to go: As Lewis Carroll wrote, “If you don’t know where you are going, any road will get you there.” When it comes to modernizing healthcare, or any industry, it’s critical for an organization to articulate and communicate a business vision and strategy. What kind of company do you want to be? What kinds of products or services do you want to bring to market? What data management challenges do you need to overcome in order to realize your vision? Put technology in its place: Cloud is not a business strategy. Public cloud platforms and infrastructure facilitate the execution of your business strategy. These tools should inform your business strategy, and even inspire you to articulate more ambitious business outcomes and vision. Pivot from applications, to platforms: We’ve been building IT applications in healthcare for decades. We even use the term application delivery to describe the teams that write our software. Much of this work has helped to digitize previously analog processes or workflow steps. Many EHR (electronic healthcare record) systems are an example of this. While they help physicians and nurses capture and share patient chart information more quickly, they do not fundamentally transform the services rendered to patients. 
On the road to the digital health future, we need to shift to a newer paradigm, platform thinking, in order to progress past simply digitizing and on to digitally transforming healthcare systems. What is platform thinking? It’s the recognition that as our world becomes increasingly software driven, web and mobile platforms are becoming the predominant way that we consume and interact with a company's products and services.

Stay tuned for part two in our series: How the healthcare industry benefits from platform thinking.

[ Case study ] How Humana took HL7 FHIR to the cloud and drove better patient experiences
[ Blog ] FHIR Technology is Driving Healthcare’s Digital Revolution
[ Blog ] Drowning in data: why it’s time to end the healthcare data lake

*Giemzo, Jayne, et al. “How CIOs and CTOs Can Accelerate Digital Transformations through Cloud Platforms.” McKinsey & Company, 11 Aug. 2021, https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/how-cios-and-ctos-can-accelerate-digital-transformations-through-cloud-platforms.
FHIR Technology is Driving Healthcare's Digital Revolution
Technology supporting healthcare’s digital transformation is so pervasive that the question isn’t what technology to choose, but rather, what problems need to be solved. Advancing technology and access to secure and real-time data analytics will vastly improve patients’ health and happiness, and growing interoperability standards are pushing organizations forward in their digital transformations. Together with the Healthcare Information and Management Systems Society (HIMSS) and leading healthcare insurance provider Humana , MongoDB recently released a three-part podcast series chronicling the ways Fast Healthcare Interoperability Resources (FHIR), AI, and the cloud are reshaping healthcare for the better. Here’s a quick roundup of our discussions. Data is the future of healthcare . Whether providers are driving patient engagement through wearable devices, wellness programs or connected care, data will take healthcare to the next digital frontier. We’ll see these advancements through AI, FHIR, and the cloud. FHIR is revolutionizing healthcare technology . Not only is FHIR implementation a requirement, it’s also a crossroads for data architects. Choosing the right approach has deep implications for healthcare IT. The operational data layer (ODL) approach to interoperability makes the impossible possible . Through Humana’s digital transformation journey, it became clear that meaningful progress isn’t possible using core legacy database systems. AI, FHIR, and the cloud: Why data is the future of healthcare In this episode , we dive into what a digital transformation would look like for the healthcare industry, and what are some of the biggest technology challenges facing healthcare today. A digitally transformed healthcare industry will weave real-time data analytics with more personalized care. Patients today want a more modern healthcare experience that includes telemedicine, digital forms and touchless mobile check ins. 
The end goal is simple: maximize the human experience while advancing away from legacy technology systems that slow down both healthcare practitioners and patients. When it comes to today’s biggest healthcare challenges, the cloud stands out as a key driver of promise and peril. The promise is that we can build applications, go to market and reach patients through wellness programs more quickly. The peril lies in the infrastructure, which is unknown to many healthcare organizations. This presents a unique challenge for the architects and certainly the developers at organizations with older legacy systems. The challenge here is avoiding a simple lift and shift, or cloud for the sake of cloud, and moving from simple modernization to actual transformation.

Bring the FHIR inside for digital transformation

In episode 2, HIMSS and MongoDB take a closer look at why FHIR is a change agent in healthcare technology, and how healthcare organizations globally are using the new data standard to jump start legacy modernization and digital transformation.

What is FHIR?

The FHIR standard is a common set of schema definitions and APIs that helps providers and patients manage and exchange healthcare data. Using FHIR, records provided by healthcare organizations are standardized into a common data model over REST-based APIs. It makes the data that healthcare providers and payers use easier to exchange. Growing regulatory pressure has accelerated U.S. FHIR adoption among healthcare organizations and technology vendors. The Centers for Medicare and Medicaid Services (CMS) started a rolling deadline for FHIR compliance in 2020, with fines for institutions that fall behind. As a result, for most U.S.-based healthcare providers, payers, and their technology vendors, the past few years were a headlong race to adopt FHIR.
Here are three reasons why FHIR is hugely significant for healthcare technology leaders:

It’s a federal mandate from the Centers for Medicare & Medicaid Services.
It’s a complex data integration challenge.
Legacy systems built before the mid 2010s are not interoperable with the FHIR mandate.

FHIR implementation approaches

For large organizations with huge data requirements, data architects can experience paralysis from the sheer volume of legacy systems to unwind. These groups have all of their patients’ electronic healthcare record information, payer information and more bound up in legacy systems, none of which is interoperable with FHIR. The second challenge is cloud migration, which can be skirted by organizations using a checkbox compliance approach. In those cases, API layers are used to ingest and serve data to legacy systems, but are not really integrated with the legacy system in real time.

The most successful approach to tackling this challenge is not to rewrite, unwind or replace legacy systems completely, but to keep them contained. We recommend bringing in an operational data layer that exposes the information in the legacy system and keeps it in sync with the legacy system, but then lands it in an ODL in the FHIR standard. With the FHIR API, patients and providers can interact with data in real time and access records milliseconds after a diagnosis. Records stay synced with legacy systems in real time, and patients’ private data is protected.

FHIR and the future of healthcare at Humana

You don't have to take the rip and replace approach when modernizing your legacy systems with an ODL method. This was a key to successful modernization for Humana, as discussed in the third and final episode in our series.
For large enterprises that may have decades’ worth of acquired legacy systems, often pulling similar datasets from disparate databases, the pursuit of modernized interoperability begins to look like an impossible task. Listen to the final episode of our podcast series to hear how Humana’s ODL approach met the company’s data velocity requirements, and next steps for personalized healthcare and interoperability at Humana.

More related FHIR and healthcare resources

[ White paper ] Bring the FHIR Inside: Digital Transformation Without the Rip and Replace
[ On-demand webinar ] Building FHIR Applications with MongoDB
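As a closing illustration of the ODL approach described under "FHIR implementation approaches", the sketch below shows the general shape of keeping a FHIR-formatted layer in sync with a legacy system that stays in place. Everything here is an assumption made for illustration: the legacy column names (`PAT_ID`, `LAST_NM`, `FIRST_NM`, `DOB`), the `InMemoryODL` class, and the use of a Python dict in place of a real database. It is not Humana's implementation or a MongoDB API.

```python
def legacy_row_to_fhir(row):
    """Map one hypothetical legacy row to a minimal FHIR Patient resource."""
    return {
        "resourceType": "Patient",
        "id": str(row["PAT_ID"]),
        "name": [{
            "family": row["LAST_NM"].title(),
            "given": [row["FIRST_NM"].title()],
        }],
        "birthDate": row["DOB"],  # assumed to already be in YYYY-MM-DD form
    }


class InMemoryODL:
    """Stand-in for an operational data layer, keyed by FHIR resource id."""

    def __init__(self):
        self.patients = {}

    def apply_change(self, row):
        """Upsert the mapped resource so the ODL mirrors the legacy record."""
        resource = legacy_row_to_fhir(row)
        self.patients[resource["id"]] = resource
        return resource

    def get_patient(self, patient_id):
        """Serve reads from the ODL without touching the legacy system."""
        return self.patients.get(patient_id)


odl = InMemoryODL()

# Initial load of a legacy record, followed by a later change to the same record.
odl.apply_change({"PAT_ID": 1001, "LAST_NM": "SMITH", "FIRST_NM": "JOHN", "DOB": "1970-02-03"})
odl.apply_change({"PAT_ID": 1001, "LAST_NM": "SMITH-JONES", "FIRST_NM": "JOHN", "DOB": "1970-02-03"})
```

The upsert keyed on the resource id is what keeps the layer in sync: a change in the legacy system replaces the existing FHIR document rather than creating a duplicate, and applications read from the ODL instead of the legacy database.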
Drowning in Data: Why It's Time to End the Healthcare Data Lake
From digital check-ins, to connected devices and telehealth programs, patients expect the benefits of a more digitized healthcare experience. At the same time, they’re also demanding a more personalized approach from healthcare providers. This duality, the need to pair a more convenient experience with one that’s more tailored to the patient, is fueling a wave of technology modernization efforts and the replacement of monolithic legacy IT systems. With limited re-use outside of the context they were built for and a reliance on nightly batch processing, legacy IT systems fail to deliver the services healthcare IT teams need or provide the experiences patients demand. Modernization should come with a move to microservices that can be used by multiple applications, agile teams that embrace domain-driven design principles, and event buses like Kafka to deliver real-time data and functionality to users.

While this transformation is occurring, there’s an 800lb gorilla not being widely addressed: analytics. What the healthcare industry doesn’t want to talk about is how costly analytics has become: the people, the software, the infrastructure, and particularly how difficult it is to move data in and out of data lakes and warehouses. It's hindering the industry’s ability to deliver insights to patients and providers in a timely and efficient manner. And yet, so many organizations are modernizing their analytics data warehouses and data lakes with an approach that simply updates the underlying technology. It’s a lift-and-shift effort of tremendous scale and cost, but one that is not addressing the underlying issues preventing the speedy delivery of meaningful insights.

Drowning in data: A 1980s model in the 2020s

While the business application landscape has changed, healthcare is still clinging to the same 1980s paradigm when it comes to analytics data.
It started by physically moving all the data from transactional systems into a single data warehouse or data lake (or worse, both), so as not to disrupt the performance of business applications by executing analytics queries against the transactional database. Eventually, as data warehouses had enough relational tables and data in them, queries began to slow down, and even time-out before delivering results to end users. This gave rise to data marts, yet another database to copy the warehouse data into, using a star schema model to return query results more efficiently than in the relational warehouse. In the last and current iteration of analytics data platforms, warehouses and data marts became augmented, and were even replaced in some cases, with data lakes. Technologies like Hadoop promised a panacea where all sorts of structured and unstructured data could be stored, and where queries against massive datasets could be executed. In reality it turned out to be a costly distraction, and one that did not make an organization's data easier to work with, or provide real-time data insights. Hence why it earned the nickname “data jail”. It was hard to load data into, and even harder to get data out of. New technology, same challenges While Hadoop and other technologies did not last long, they hung around just long enough to negatively alter the trajectory of many analytics shops, which are now investing heavily in migrating away from Hadoop, to cloud-based platforms. But, are these cloud alternatives solving the challenges of the Hadoop era? Can your organization rapidly experiment, innovate and serve up data insights from your data lake? Can you go from an idea to delivery in days? Or, is it weeks, months even? Despite the significant amounts of time, money and people required to load data into these behemoth cloud data stores, they still exhibit the same challenges as their Hadoop-era predecessors. 
They are difficult to load and even more difficult to make changes to. They can never realistically offer real-time or even near-real-time processing, the response time that patients and providers expect. Worse, they contain so much data, that making sense of it is a task often left to either a sophisticated add-on like AWS HealthLake, or specialized data engineering and data science teams. To add to this, the cloud based analytics systems are typically managed by a single team that’s responsible for collecting, understanding and storing data from all of the different domains within an organization. This is what we like to call a modernized monolith, the pairing of updated technology with a failure to fundamentally address or improve the overall limitations or constraints of a system or process. It’s an outdated and inefficient approach that’s simply been “lifted and shifted” from one technology to another. Many data lake implementations take a modernized monolithic approach which, like their predecessors, results in a bottleneck and difficulty in getting information out, once it goes in. In a world where data is at the center of every innovative business, and real-time analytics is top-of-mind for executives, product owners and architects alike, most data lakes don’t deliver. Transforming your organization into a data-driven enterprise requires a more agile approach to managing and working with ever-growing sums of data. The rise of the operational data layer — an ODS renaissance To provide meaningful insights to patients in a timely and efficient manner, two very important things need to happen. Healthcare organizations need to overcome the limitations of legacy systems, and they need to make sense of a lot of very complex data. A lift-and-shift approach migrating data into a data lake will not solve these problems. 
In addition, it’s not feasible or advisable to spend tens, or even hundreds, of millions of dollars to replace legacy systems as a precursor to a digital engagement strategy. The competition will leapfrog you before your efforts are even half complete.

So, what can be done? Can your organization make better sense of its data, and at the same time mitigate the issues legacy systems impose? Can this be done without a herculean effort? The answer is yes.

The solution is an operational data layer (ODL), formerly known as the operational data store (ODS). It’s a method that’s been tried and tested by major corporations, and it is the underlying technology that powers many of the apps you interact with on your phone. An ODL lets you build new features without being held back by existing system limitations. It lets you summarize, analyze, and respond to data events in real time. It helps you migrate away from legacy systems without incurring the cost and complexity of replacing them outright. It can give your teams the speed and agility that working against a data lake will simply never have.

Data lakes and warehouses have their place, and the kinds of long-term data insights and data science benefits that can be gleaned from them are significant. The challenge, however, is reacting in real time and serving those insights to patients quickly. An ODL strategy offers the most cost- and time-efficient approach to mitigating legacy system issues, without the pain of replacing legacy systems. Investing in an ODL strategy will both solve your legacy modernization dilemma and help you deliver real-time data and analytics at the speed of an agile software delivery team.

MongoDB is an ideal ODL provider. Not only does it have the underlying, flexible document-based database, but it is also a data platform, empowering your developers to focus on building features, not managing databases and data.
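As a rough illustration of the ODL pattern described above (legacy systems feed changes into one consolidated layer that applications read from and react to), here is a minimal in-memory sketch in Python. Everything in it is an assumption made for illustration: the class and field names (`ChangeEvent`, `OperationalDataLayer`, `_sources`), the "billing" and "lab" source systems, and the glucose threshold are hypothetical, and a plain dictionary stands in for a real document database. It is not MongoDB's API or any particular product's implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ChangeEvent:
    """A change captured from a (hypothetical) legacy source system."""
    source: str              # e.g. "billing" or "lab"
    entity_id: str           # e.g. a patient identifier
    fields: Dict[str, Any]   # the fields that changed


class OperationalDataLayer:
    """Toy ODL: merges changes from many systems into one document per entity."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        self._listeners: List[Callable[[str, Dict[str, Any]], None]] = []

    def subscribe(self, listener: Callable[[str, Dict[str, Any]], None]) -> None:
        # Listeners are called on every change, which is how an application
        # would react to data events as they arrive.
        self._listeners.append(listener)

    def ingest(self, event: ChangeEvent) -> None:
        # Merge the changed fields into the consolidated document.
        doc = self._docs.setdefault(event.entity_id, {"_sources": []})
        doc.update(event.fields)
        if event.source not in doc["_sources"]:
            doc["_sources"].append(event.source)
        for listener in self._listeners:
            listener(event.entity_id, doc)

    def get(self, entity_id: str) -> Dict[str, Any]:
        # Applications read the consolidated view, not the legacy systems.
        return dict(self._docs.get(entity_id, {}))


alerts: List[str] = []


def flag_high_glucose(entity_id: str, doc: Dict[str, Any]) -> None:
    # Hypothetical rule: flag the patient when glucose exceeds 180.
    if doc.get("glucose", 0) > 180:
        alerts.append(entity_id)


odl = OperationalDataLayer()
odl.subscribe(flag_high_glucose)
odl.ingest(ChangeEvent("billing", "p1", {"balance": 120.0}))
odl.ingest(ChangeEvent("lab", "p1", {"glucose": 200}))
print(odl.get("p1"))
print(alerts)
```

In this sketch the legacy systems are never queried directly: they only emit changes, and new features read the merged document or subscribe to updates. A production ODL would replace the dictionary with a database and the listener list with that database's change notification mechanism.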
If you’re interested in learning about how MongoDB has enabled organizations large and small to successfully implement ODL strategies and tackle other burning healthcare issues, click here.