MongoDB Applied

Customer stories, use cases and experience

How Much is Your Data Model Costing Your Business?

Economic volatility is creating an unpredictable business climate, forcing organizations to stretch their dollars further and do more with less. Investments are under the microscope, and managers are looking to wring every ounce of productivity out of existing resources. IT spend is a concern, and many IT decision-makers aren't sure what's driving costs. Is it overprovisioning? Cloud sprawl? Shadow IT? One area that doesn't get a lot of attention is how the data is modeled in the database. That's unfortunate, because data modeling can have a major impact on the cost of database operations, the instance size necessary to handle workloads, and the work required to develop and maintain applications.

Pareto patterns

Data access patterns are often an illustration of the Pareto Principle at work, where the majority of effects are driven by a minority of causes. Modern OLTP applications tend to work with data in small chunks. The vast majority of data access patterns (the way applications access and use data) work with either a single row of data or a range of rows from a single table. At least that's what we found at Amazon, looking at 10,000 services across the various RDBMS-based systems we deployed. Normalized data models are quite efficient for these simple single-table queries, but the less frequent complex patterns require the database to join tables to produce a result, exposing RDBMS inefficiencies. The high time complexity associated with these queries meant significantly more infrastructure was required to support them. The relational database hides much of this overhead behind the scenes. When you send a query to a relational database, you don't actually see all the connections opening up on all the tables, or all the objects merging. Even though 90% of the access patterns at Amazon were for simple things, the 10% that were doing more complex things were burning through CPU to the point that my team estimated they were driving ~50% of infrastructure cost. This is where NoSQL data modeling can be a game-changer. NoSQL data models are designed to eliminate expensive joins, reduce CPU utilization, and save on compute costs.

Modeling for efficiency in NoSQL

There are two fundamental approaches to modeling relational data in NoSQL databases:

Embedded Document - All related data is stored in a single rich document which can be efficiently retrieved when needed.

Single Collection - Related data is split out into multiple documents to efficiently support access patterns that require subsets of a larger relational structure. Related documents are stored in a common collection and contain attributes that can be indexed to support queries for various groupings of related documents.

The key to building an efficient NoSQL data model and reducing compute costs is using the workload to influence the choice of data model. For example, a read-heavy workload like a product catalog that runs queries like "get all the data for a product" or "get all the products in a category" will benefit from an embedded document model because it avoids the overhead of reading multiple documents. On the other hand, a write-heavy workload where writes are updating bits and pieces of a larger relational structure would run more efficiently with smaller documents stored in a single collection, which can be accessed independently and indexed to support efficient retrieval when all the data is needed.
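To make the two approaches concrete, here is a minimal PyMongo sketch of both patterns. It is an illustration only: the collection names, fields, and connection string are assumptions for this example, not taken from the article.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumption: a local test instance
db = client["catalog"]

# Embedded document: the product and everything needed to render it live in
# one document, so a read-heavy catalog query is a single fetch with no joins.
db.products.insert_one({
    "_id": "prod-123",
    "name": "Espresso machine",
    "category": "kitchen",
    "specs": {"watts": 1200, "color": "steel"},
    "reviews": [
        {"user": "ana", "rating": 5},
        {"user": "li", "rating": 4},
    ],
})
product = db.products.find_one({"_id": "prod-123"})  # one round trip

# Single collection: related data is kept in small documents that share an
# indexed key, so a write-heavy workload can update one piece at a time
# without rewriting the whole structure.
db.catalog.create_index("productId")
db.catalog.insert_many([
    {"productId": "prod-123", "type": "summary", "name": "Espresso machine"},
    {"productId": "prod-123", "type": "spec", "watts": 1200},
    {"productId": "prod-123", "type": "review", "user": "ana", "rating": 5},
])
db.catalog.update_one(
    {"productId": "prod-123", "type": "spec"}, {"$set": {"watts": 1350}}
)
# ...and the whole group can still be pulled back with one indexed query.
related_docs = list(db.catalog.find({"productId": "prod-123"}))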
The final choice depends on the frequency and nature of the write patterns and whether or not there's a high-velocity read pattern operating concurrently. If your workload is read-intensive, you want to get as much as you can in one read. For a write-intensive workload, you don't want to have to rewrite the full document every time it changes. Joins increase time complexity. In NoSQL databases, depending on the access pattern mix, all the rows from the relational tables are stored either in a single embedded document or as multiple documents in one collection that are linked together by indexes. Storing multiple related documents in a common collection means there is no need for joins. As long as you're indexing on a common dimension across documents, you can query for related documents very efficiently. Now imagine a query that joins three tables in a relational database and your machine needs to do 1,000 of them. You would need to read at least 3,000 objects from multiple tables in order to satisfy the 1,000 queries. With the document model, by embedding all the related data in one document, the query would read only 1,000 objects from a single collection. Machine-wise, having to merge 3,000 objects from three tables versus reading 1,000 from one collection will require a more powerful and expensive instance. With relational databases, you don't have as much control. Some queries may result in a lot of joins, resulting in higher time complexity, which translates directly into more infrastructure required to support the workload.

Mitigate what matters

In a NoSQL database, you want to model data for the highest efficiency where it hurts the most in terms of cost. Analytical queries tend to be low frequency. It doesn't matter as much whether they come back in 100 ms or 10 ms. You just want to get an answer. For things that run once an hour, once a day, or once a week, it's okay if they're not as efficient as they might be in a normalized relational database. Transactional workloads that are running thousands of transactions a second need to process as efficiently as possible, because the potential savings are far greater. Some users try to apply these data modeling techniques to increase efficiency in RDBMS platforms, since most now support document structures similar to MongoDB. This might work for a small subset of workloads. But relational storage engines are designed for relatively small rows of roughly equal size. They work well for small documents, but when you start to increase the size of the row in a relational database, it requires off-row storage. In Postgres this is called TOAST (The Oversized-Attribute Storage Technique). This circumvents the size limit by putting the data in two places, but it also decreases performance in the process. The row-based storage engines used by modern RDBMS platforms were not designed for large documents, and there is no way to configure them to store large documents efficiently.

Drawing out the relationship

The first step we recommend when modeling data is to characterize the workload by asking a few key questions:

What is the nature of the workload?
What is the entity relationship diagram (ERD)?
What are the access patterns?
What is the velocity of each pattern?
Which are the most important queries that we need to optimize?

Identifying the entities and their relationships to each other is going to form the basis of our data model. Once this is done we can begin to distill the access patterns.
If it's a read-heavy workload like the product catalog, you'll most likely be working with large objects, which is fine. There are plenty of use cases for that. However, if you're working with more complex access patterns where you're accessing or updating small pieces of a larger relational structure independently, you will want the data separated into smaller documents so you can efficiently execute those high-velocity updates. We teach many of these techniques in our MongoDB University course, M320: MongoDB Data Modeling.

Working with indexes

Using indexes for high-frequency patterns will give you the best performance. Without an index, you have to read every document in the collection and examine it to determine which documents match the query conditions. An index is a B-tree structure that can be parsed quickly to identify documents that match conditions on the indexed attributes specified by the query. You may choose not to index uncommon patterns for various reasons. All indexes incur cost, as they must be updated whenever a document is changed. You might have a high-velocity write pattern that runs consistently and a low-velocity read that happens at the end of the day, in which case you'll accept the higher cost of the full collection scan for the read query rather than incur the cost of updating the index on every write. If you are writing to a collection 1,000 times a second and reading once a day, the last thing you want to do is add an index update for every single write just to make the read efficient. Again, it depends on the workload. Indexes in general should be created for high-velocity patterns, and your most frequent access patterns should be covered by indexes to some extent, either partially or fully. Remember that an index still incurs cost even if you don't read it very much or at all. Always make sure when you define an index that there is a good reason for it, and that good reason should be that you have a high-frequency access pattern that needs to use it to be able to read the data efficiently (a short sketch of this trade-off appears at the end of this article).

Data modeling and developer productivity

Even after you've optimized your data model, cost savings will continue to accrue downstream as developers find that they can develop, iterate, and maintain systems far more efficiently than in a relational database. Specific document design patterns and characteristics of NoSQL can reduce maintenance overhead and in many cases eliminate maintenance tasks altogether. For example, document databases like MongoDB support flexible schemas, which eliminates the need for maintenance windows related to schema migrations and refactoring of a catalog as with RDBMS. A schema change in a relational database almost always impacts ORM data adapters that would need to be refactored to accommodate the change. That's a significant amount of code maintenance for developers. With a NoSQL database like MongoDB, there's no need for cumbersome and fragile ORM abstraction layers. Developers can store object data in its native form instead of having to normalize it for a tabular model. Updating data objects in MongoDB requires almost zero maintenance. The application just needs to be aware that documents may have new properties, and know how to update them to the current schema version if they don't. MongoDB will lower license fees and infrastructure costs significantly, but possibly the biggest savings organizations experience from moving away from RDBMS will come from reduced development costs.
Not only is there less code overall to maintain, but the application will also be easier to understand for someone who didn't write the code. MongoDB makes migrations far simpler and less prone to failure and downtime. Applications can be updated more frequently, in an easier fashion, and without stressing about whether a schema update will fail and require a rollback. Overall, maintaining applications over their lifetime is far easier with NoSQL databases like MongoDB. These efficiencies add up to significant savings over time. It's also worth mentioning that a lot of up-and-coming developers see relational databases as legacy technology and not technology they prefer to use. With MongoDB it is easier to attract top talent, a critical factor in any organization's ability to develop best-of-breed products and accelerate time-to-value.

Uplevel your NoSQL data modeling skills

If you want to start reining in the hidden costs in your software development lifecycle by learning how to model data, MongoDB University offers a special course, M320: MongoDB Data Modeling. There are also dozens of other free courses, self-paced video lessons, on-demand labs, and certifications with digital badges to help you master all aspects of developing with MongoDB.
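As a closing illustration of the indexing trade-off discussed in the "Working with indexes" section above, here is a hedged PyMongo sketch. The collection, fields, and connection string are hypothetical, and whether to index should of course follow your own workload analysis.

from pymongo import MongoClient, ASCENDING, DESCENDING

db = MongoClient("mongodb://localhost:27017")["shop"]  # assumption: local test instance
orders = db.orders

# High-velocity read pattern: "recent orders for a customer". A compound
# index lets this be served by an index scan instead of a collection scan.
orders.create_index([("customerId", ASCENDING), ("orderDate", DESCENDING)])

recent = orders.find({"customerId": "cust-42"}).sort("orderDate", -1).limit(10)

# Inspect the plan: an IXSCAN stage in the winning plan confirms index use.
plan = orders.find({"customerId": "cust-42"}).explain()
print(plan["queryPlanner"]["winningPlan"])

# A low-velocity, end-of-day report can be left unindexed on purpose: paying
# for one nightly collection scan may cost less than updating an extra index
# on every one of the day's writes.
daily_totals = orders.aggregate([
    {"$match": {"status": "shipped"}},  # no supporting index, by design
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
])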

March 15, 2023
Applied

Digital Payments - Latin America Focus

Pushed by new technologies and global trends, the digital payments market is flourishing all around the world. Valued at over USD 68 billion in 2021 and expected to see double-digit growth over the next decade, emerging markets are leading the way in terms of relative expansion. A landscape once dominated by incumbents - big banks and credit card companies - is now being attacked by disruptors interested in capturing market share. According to a McKinsey study, there are four major factors at the core of this transformation:

Pandemic-induced adoption of cashless payments
E-commerce
Government push for digital payments
Fintechs

Interestingly, the pandemic has been a big catalyst in the rise of financial inclusion by encouraging alternative means of payment and new ways of borrowing and saving. These new digital services are, in fact, easier to access and to consume. In Latin America and the Caribbean (LAC), Covid spurred a dramatic increase in cashless payments: 40% of adults made an online purchase, 14% of whom did so for the first time in their lives. E-commerce has experienced stellar growth, with penetration likely to exceed 70% of the population in 2022, and domestic and global players including Mercado Libre and Falabella are pushing digital payment innovation to provide an ever smoother customer experience on their platforms. Central banks are promoting new infrastructure for near real-time payments, with the goal of providing cheaper and faster technology for money transfer for both citizens and businesses. PIX is probably the biggest success story. An instant payment platform developed by Banco Central do Brasil (Brazil's central bank), it began operating in November 2020, and within 18 months, over 75% of adult Brazilians had used it at least once. The network processes around $250 billion in annualized payments, about 20% of total customer spend. Users (including self-employed workers) can send and receive real-time payments through a simple interface, 24/7 and free of charge. Businesses have to pay a small fee. In the United States, the Federal Reserve has announced it will be launching FedNow in mid-2023, a payment network with characteristics similar to PIX. These initiatives aim to solve issues such as slow settlements and low interoperability between parties. Incumbent banks still own the lion's share of the digital payment market; however, fintechs have been threatening this dominance by leveraging their agility to execute fast and cater to customer needs in innovative and creative ways. Without the burden of legacy systems to weigh them down, or business models tied to old payment rails, fintechs have been enthusiastic testers and adopters of new technologies and payment networks. Their mobile- and digital-first approach is helping them capture and retain the younger segment of the market, which expects integrated real-time experiences it can consume at the touch of a button. An example is Paggo, a Guatemalan fintech that helps businesses streamline payments by enabling them to share a simple QR code that customers can scan to transfer money. The payment landscape is not only affected by external forces; changes coming from within the industry are also reshaping the customer experience and enabling new services. ISO 20022 is a flexible standard for data interchange that is being adopted by most financial industry institutions to standardize the way they communicate with each other, thus streamlining interoperability.
Thanks to the adoption of ISO 20022, it's more straightforward for banks to read and process messages, which translates into smoother internal processes and easier automation. For end users this means faster and potentially cheaper payments, as well as richer and more integrated financial apps. 3DS2 is being embraced by the credit and debit card payments ecosystem. It is essentially a payment authentication solution for online shopping transactions. Similarly to ISO 20022, the end user won't even be aware of the underlying technology, but will only experience a smoother, frictionless checkout. 3DS2 avoids the user being redirected to their banking app for confirmation when buying an item online; everything now happens on the seller's website or app. This is all done while also enhancing fraud detection and prevention; the new solution makes it harder to use someone's credit or debit card without authorization. The benefit of 3DS2 adoption is twofold: on the one hand, users have increased confidence; on the other, merchants are happier because of a lower customer abandonment rate. In fact, fear of fraud at checkout is usually one of the main reasons for abandoning an online purchase. This solution is especially beneficial for the LAC region, where, despite wide adoption of e-commerce, people are still reluctant to transact online. One of the factors contributing to this gap is fear of fraud: Cybersource reported that in 2019, a fifth of e-commerce transactions in the region were flagged as potentially fraudulent and 20% were blocked, over six times the global average. It is evident that platforms' adoption of 3DS2 will strengthen online shoppers' trust. It is also worth mentioning the role played by blockchain and cryptocurrencies. Networks such as Ethereum or Lightning are effectively a decentralized alternative to the more traditional payment rails. Over the last few years, more and more people have started to use this technology because of its unique features: low fees, fast processing times, and global reach. Latin America has seen an explosion in adoption due to several factors, remittances and stablecoin payments being highly prominent. Traditional remittance service providers are, in fact, slower and more expensive than blockchain networks. Especially in Argentina, an increasing number of independent workers are asking to be paid in USDC or USDT, two stablecoins pegged to the value of the dollar, in order to stave off inflation. It is clear that the payment landscape is rapidly evolving. On the one hand, customers expect products and services that integrate seamlessly with every aspect of their digital lives. Whenever an app is perceived as slow, poorly designed, or simply missing some features, the user can easily switch to a competitor's alternative. On the other hand, the number of players contending for their share of the digital payments market is expanding, driving down margins on traditional products. The only way to successfully navigate this complex environment is to invest in innovation and in creating new business models. There is no single approach to facing such challenges, but there is no doubt that every successful business needs to harness the power of data and technology to provide its customers with the personalized, real-time experience they demand.
We at MongoDB believe that a solid foundation to achieve that is represented by a highly flexible and scalable developer data platform, allowing companies to innovate faster and better monetize their payment data. Visit our Financial Services web page to learn more!

March 14, 2023
Applied

Clear: Enabling Seamless Tax Management for Millions of People with MongoDB Atlas

Building India's largest tax and financial services software platform, trusted by more than six million Indians

With India's large population and growing middle class, the country's tax-paying population has been rising steadily. At the end of the financial year 2021-22, about 5.83 crore (58.3 million) individuals filed tax returns with the Indian Income Tax department. India also has about 13.8 million registered Goods and Services Tax (GST) taxpayers. When juxtaposed with growing digitization in India, this opens up massive demand for a convenient and effective platform to manage tax returns. Clear realized this need early on and launched its SaaS income tax return (ITR) filing offering for individuals in 2011; the platform is currently trusted by more than six million Indians. It is second only to the Indian Income Tax Department's portal in terms of registered users. More recently, Clear has been focused on expanding its B2B portfolio, including launching an e-invoicing system. Today, the system supports about 50,000 tax professionals, one million small businesses, and 4,000 enterprises in GST filing.

How to ensure a seamless experience for all users at scale

Clear built the initial version of its B2B e-invoicing system on MySQL. However, as adoption grew, the team started to see the limits of the system. Certain batches of invoices were taking upwards of 25 minutes to process, a problem given the time-sensitive nature of tax filing. If any Clear customer failed to file in time, that customer could be given a penalty and labeled as non-compliant by the Indian government. The team knew they needed to take a step back and reevaluate the core structure of their system. The Clear team started the system rework by outlining a set of required capabilities. The new database system would need to be able to scale up quickly to handle periods of peak demand and scale down when traffic was low to save on costs. Tax professionals need to be able to see multiple cuts of the data at different levels, so the database would need to support quick and complex aggregations. Lastly, the team knew they didn't want to be responsible for managing the system themselves. They needed a fully managed option.

MongoDB Atlas chosen for best-in-class scale and performance

The company ran a proof of concept (POC) study comparing MySQL's performance with other competitive offerings, including MongoDB. It found that, in terms of the time taken to execute different batch sizes of data, MongoDB was considerably faster in all instances. For example, MongoDB's processing time was 122% faster than the closest competitor and 767% faster than the farthest competitor (see the comparison of performance among databases). Given the document-based nature of invoices, the results of the POC made sense. With MongoDB, the Clear team could store invoice data together instead of splitting it across tables. This minimized the number of costly joins required to obtain data, leading to faster reads. MongoDB also allowed the team to easily split reads and writes in use cases where the system experienced high volumes of reads and where reading slightly stale data was permissible. Clear's aggregation needs were also easily met with MongoDB's aggregation pipeline. The combination of aggregation support and MongoDB's full-text search capabilities meant that the Clear team could easily build filterable and searchable dashboards on top of their invoice data.
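One technique mentioned above, splitting reads and writes when slightly stale reads are acceptable, can be expressed with read preferences. The sketch below is a hypothetical PyMongo illustration; the connection string, database, and field names are assumptions rather than Clear's actual schema.

from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb+srv://cluster.example.mongodb.net")  # placeholder URI

# Writes and strongly consistent reads go to the primary (the default).
invoices = client["gst"]["invoices"]
invoices.insert_one({"invoiceId": "INV-001", "amount": 11800, "status": "filed"})

# Dashboard-style reads that can tolerate slightly stale data are routed to
# secondaries, keeping read load off the primary.
invoices_stale_ok = client["gst"].get_collection(
    "invoices", read_preference=ReadPreference.SECONDARY_PREFERRED
)
pending_count = invoices_stale_ok.count_documents({"status": "pending"})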
Lastly, the team also loved the easy-to-use nature of MongoDB Atlas, MongoDB’s fully-managed developer data platform. With Atlas, the team could easily scale up and down their clusters on a schedule to match fluctuations in user traffic. Achieving a 2900% jump in processing speed along with cost savings After Clear replatformed from MySQL to MongoDB Atlas on AWS, their customers were shocked by the improvement. Pranesh Vittal, Director Of Engineering, ClearTax India said, “We have achieved considerable optimization with MongoDB. Our customers are often surprised by the pace of execution. There is a significant improvement in the performance, with as much as a 2900% jump in processing speed in some instances.” Comparing the performance of the new MongoDB powered platform On top of increased speeds, the team is also saving money. “We’ve generated over 20 crore invoices to date running on a single sharded cluster with a 4TB disc,” said Pranesh Vittal. “The ability to store older data in cold storage [with Online Archive] helped us achieve this.” Atlas Triggers also help the team automatically scale down their clusters each night and scale them up in the morning. The triggers are fully-managed and schedule-based, so it’s as easy as setting them up and letting them run. This automatic right-sizing is saving the team upwards of $7000 each month ($700 per cluster for 10 clusters). After seeing such positive results, the team has since decided to replatform multiple other products onto MongoDB. “Here, MongoDB’s live support and consultation have proved very useful,” said Pranesh Vittal. Now, Clear manages 25+ clusters and over 10TB of data on MongoDB Atlas.

March 6, 2023
Applied

Build an ML-Powered Underwriting Engine in 20 Minutes with MongoDB and Databricks

The insurance industry is undergoing a significant shift from traditional to near-real-time data-driven models, driven both by strong consumer demand and by the urgent need for companies to process large amounts of data efficiently. Data from sources such as connected vehicles and wearables is used to calculate precise and personalized premium prices, while also creating new opportunities for innovative products and services. As insurance companies strive to provide personalized and real-time products, the move towards sophisticated, real-time data-driven underwriting models is inevitable. To process all of this information efficiently, software delivery teams will need to become experts at building and maintaining data processing pipelines. This blog will focus on how you can revolutionize the underwriting process within your organization by demonstrating how easy it is to create a usage-based insurance model using MongoDB and Databricks. This blog is a companion to the solution demo in our GitHub repository. In the GitHub repo, you will find detailed step-by-step instructions on how to build the data upload and transformation pipeline leveraging MongoDB Atlas platform features, as well as how to generate, send, and process events to and from Databricks. Let's get started.

Part 1: The use case data model
Part 2: The data pipeline
Part 3: Automated decisions with Databricks

Part 1: The use case data model

Figure 1: Entity relationship diagram - Usage-based insurance example

Imagine being able to offer your customers personalized usage-based premiums that take into account their driving habits and behavior. To do this, you'll need to gather data from connected vehicles, send it to a machine learning platform for analysis, and then use the results to create a personalized premium for your customers. You'll also want to visualize the data to identify trends and gain insights. This unique, tailored approach will give your customers greater control over their insurance costs while helping you to provide more accurate and fair pricing. A basic example data model to support this use case would include customers, the trips they take, the policies they purchase, and the vehicles insured by those policies. This example builds out three MongoDB collections, as well as two Materialized Views. The full Hackloade data model, which defines all the MongoDB objects within this example, can be found here.

Part 2: The data pipeline

Figure 2: The data pipeline - Usage-based insurance

The data processing pipeline component of this example consists of sample data, a daily materialized view, and a monthly materialized view. A sample dataset of IoT vehicle telemetry data represents the motor vehicle trips taken by customers. It's loaded into the collection named ‘customerTripRaw’ (1). The dataset can be found here and can be loaded via mongoimport or other methods. To create a materialized view, a scheduled Trigger executes a function that runs an Aggregation Pipeline. This then generates a daily summary of the raw IoT data and lands it in a Materialized View collection named ‘customerTripDaily’ (2). Similarly, for a monthly materialized view, a scheduled Trigger executes a function that runs an Aggregation Pipeline that, on a monthly basis, summarizes the information in the ‘customerTripDaily’ collection and lands it in a Materialized View collection named ‘customerTripMonthly’ (3).
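The exact pipelines live in the companion GitHub repo; as a rough, hypothetical sketch of the daily roll-up step (2), an aggregation using $group and $merge might look like the following in PyMongo. Field names such as 'miles' and 'speed', and the connection string, are assumptions for illustration; in Atlas this logic would run inside the scheduled Trigger's function.

from pymongo import MongoClient

db = MongoClient("mongodb+srv://cluster.example.mongodb.net")["insurance"]  # placeholder

daily_rollup = [
    # Bucket the raw telemetry by customer and calendar day.
    {"$group": {
        "_id": {
            "customerId": "$customerId",
            "day": {"$dateTrunc": {"date": "$timestamp", "unit": "day"}},
        },
        "totalMiles": {"$sum": "$miles"},
        "avgSpeed": {"$avg": "$speed"},
        "tripCount": {"$sum": 1},
    }},
    # Upsert the summaries into the materialized view collection.
    {"$merge": {
        "into": "customerTripDaily",
        "whenMatched": "replace",
        "whenNotMatched": "insert",
    }},
]
db.customerTripRaw.aggregate(daily_rollup)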
For more info on these and other MongoDB platform features, see:

MongoDB Materialized Views
Building Materialized View on TimeSeries Data
MongoDB Scheduled Triggers
Cron Expressions

Part 3: Automated decisions with Databricks

Figure 3: The data pipeline with Databricks - Usage-based insurance

The decision-processing component of this example consists of a scheduled trigger and an Atlas Chart. The scheduled trigger collects the necessary data and posts the payload to a Databricks MLflow API endpoint (the model was previously trained using the MongoDB Spark Connector on Databricks). It then waits for the model to respond with a calculated premium based on the miles driven by a given customer in a month. The scheduled trigger then updates the ‘customerPolicy’ collection, appending the new monthly premium calculation as a subdocument within the ‘monthlyPremium’ array. You can then visualize your newly calculated usage-based premiums with an Atlas Chart! In addition to the MongoDB platform features listed above, this section utilizes the following:

MongoDB Atlas App Services
MongoDB Functions
MongoDB Charts

Go hands-on

Automated digital underwriting is the future of insurance. In this blog, we introduced how you can build a sample usage-based insurance data model with MongoDB and Databricks. If you want to see how quickly you can build a usage-based insurance model, check out our GitHub repository and dive right in!
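To make the write-back step in Part 3 more tangible, here is a heavily hedged Python sketch of what the trigger logic might do: call a model serving endpoint and append the returned premium to the 'monthlyPremium' array. The endpoint URL, auth token, request shape, and field names are placeholders rather than values from the repo, and the real implementation runs as an Atlas Function instead of standalone Python.

import requests
from pymongo import MongoClient

db = MongoClient("mongodb+srv://cluster.example.mongodb.net")["insurance"]  # placeholder

def score_and_store(customer_id: str, month: str) -> None:
    summary = db.customerTripMonthly.find_one(
        {"_id.customerId": customer_id, "_id.month": month}
    )
    if summary is None:
        return

    # Post the month's mileage features to the model serving endpoint.
    resp = requests.post(
        "https://example.cloud.databricks.com/serving-endpoints/premium/invocations",  # placeholder
        headers={"Authorization": "Bearer <token>"},  # placeholder
        json={"dataframe_records": [{"totalMiles": summary["totalMiles"]}]},  # assumed shape
        timeout=30,
    )
    premium = resp.json()["predictions"][0]

    # Append the calculated premium as a new subdocument in the policy.
    db.customerPolicy.update_one(
        {"customerId": customer_id},
        {"$push": {"monthlyPremium": {"month": month, "premium": premium}}},
    )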

March 6, 2023
Applied

Queenly Builds New Formalwear Shopping Experience With Full Text Search Indexing

Two years ago, we profiled Queenly , a promising startup that's ushering in big changes to the formalwear industry by making it more accessible for everyday people. The San Francisco-based company operates a marketplace and search engine for buying and selling formalwear such as wedding dresses, prom dresses, special occasion attire, and wedding guest dresses. Four years removed from its successful launch, Queenly is now rolling out new social commerce features that co-founders Trisha Bantigue and Kathy Zhou hope will give users a forum to discuss fashion tips, share recommendations, and develop a community of like-minded friends. Ready to wear Zhou, who is also CTO of Queenly, chose MongoDB because she'd previously used it as a student at the University of Pennsylvania doing hackathons. "It was super easy to set up when I was just that starry eyed, 19-year-old kid that, honestly, didn't know anything about databases," she says. That simplicity remains a selling point for Zhou. "It's been really great to train our engineering team on MongoDB," Zhou says. "Even if they're a client-side engineer and don't have a background in databases." That ease of use will continue to pay off as the company scales and grows its technical team. Zhou's domain knowledge from working on search engines and recommendation systems at Pinterest led her to apply the advancements in algorithms and technology to the fashion industry. Full text search is a critical feature for building a truly personalized shopping experience that's tailored to the different life events that require formal wear. MongoDB Atlas Search is a fully integrated solution that makes it easy to add full text search with advanced functionality — fuzzy search, synonyms— to existing datastores. The simplicity of the out-of-the-box solution is huge for startups, Zhou says, because they're constantly growing and trying to structure their data along the way. "We have our own blended algorithms for ranking and delivering the most relevant search results to users, so plugging Atlas Search into our system helped fill in the user experience gaps when needed," Zhou says. "MongoDB was the right choice at the right time," she says. "When it comes to being able to do more complex querying and searching, MongoDB felt pretty easy." She also likes using NoSQL schemas and NoSQL databases because of the flexibility. Startups see so many different curveballs, she says, and so many different things they want to test and try, and having the flexibility to do that has really helped, according to Zhou. Data-driven differentiation Both Zhou and CEO Bantigue have experience in the fashion world and use that experience to customize their service to their audience. As we mentioned in our earlier profile, both grew up in low-income, immigrant households and entered beauty pageants as a way to earn tuition money. So they know the experience of needing to find the dress of your dreams but with limited resources. It's that lived experience that enables them to create a great UI/UX that treats customers the way they want to be treated. The co-founders, both 2022 Forbes 30 Under 30 honorees, combined their knowledge of the fashion industry with the ability to solve problems through data-driven methods to create differentiation in a crowded space. The search and indexing capabilities in MongoDB Atlas enable the Queenly application to curate a highly personalized visitor experience based on what you search for and spend time looking at. 
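As a rough illustration of the kind of query Atlas Search enables, the hypothetical PyMongo sketch below runs a fuzzy full-text search. The index name, collection, fields, and connection string are assumptions, not Queenly's actual schema or ranking logic.

from pymongo import MongoClient

listings = MongoClient("mongodb+srv://cluster.example.mongodb.net")["marketplace"]["listings"]  # placeholder

results = listings.aggregate([
    {"$search": {
        "index": "default",  # assumes an Atlas Search index named "default"
        "text": {
            "query": "burgundy mermaid gown",
            "path": ["title", "description"],
            "fuzzy": {"maxEdits": 1},  # tolerate small typos in user queries
        },
    }},
    {"$limit": 20},
    {"$project": {"title": 1, "price": 1, "score": {"$meta": "searchScore"}}},
])
# Application-side ranking signals can then be blended with the search score.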
Normally, building new shopping categories or recommendation features would entail building a new data pipeline or data science infrastructure. Zhou says the compound filtering and indexing capabilities in MongoDB enable them to get new categories off the ground quickly and iterate as needed. “Communities on Queenly" has recently launched out of beta to all users, allowing them to ask each other questions like, "What kind of hairstyle should I wear for my wedding?" or "What kind of brands do you guys typically like?" Other interactive, social commerce type features that Queenly’s engineering team was able to quickly launch through the help of MongoDB’s indexing features include a Tiktok-style video feed and following feeds for user closets and brands. Support for startups Queenly is part of the MongoDB for Startups program , which helps startups build faster and scale further with free MongoDB Atlas credits, one-on-one technical advice, co-marketing opportunities, and access to a vast partner network. Zhou says the program has given them access to a level of specialized support that they wouldn't have had otherwise. "Clients our size might not get as much help as a really big company. I think it's really great that MongoDB for Startups exists so that us founders and small business owners can feel heard when it comes to just getting support," Zhou says. If you want to learn more about Queenly, check out queenly.com . To apply to become part of a growing team, visit queenly.com/jobs. Are you part of a startup and interested in joining the MongoDB for Startups program? Apply now .

March 1, 2023
Applied

MongoDB Atlas as the Data Hub for Smart Manufacturing with Microsoft Azure

All the source code used in this project, along with a detailed deployment guide, is available on our public Github page . Manufacturing companies are emerging from the pandemic with a renewed focus on digital transformation and smart factories investment. COVID-19 has heightened the need for Industrial IoT technology and innovation as consumers have moved towards online channels, forcing manufacturers to compete in a digitalized business environment. The manufacturing ecosystem can be viewed as a multi-dimensional grouping of systems designed to support the various business units in manufacturing organizations such as operations, engineering, maintenance, and learning & development functions. Process and equipment data is generated on the shop floor from machines and systems such as SCADA and then stored in a process historian or an operational database. The data originating from shop floor devices are generally structured time series data acquired through regular polling and sampling. Historians provide fast insertion rates of time series data, with capacities that reach up to tens of thousands of PLC tags processed per second. They rely on efficient data compression engines which can either be lossy or lossless. Traditional RDBMS storage comes packaged with the manufacturing software applications such as a Manufacturing Execution System (MES). Relational databases are traditionally common in manufacturing systems and thus the choice of database systems for these manufacturing applications are typically driven by historical preferences. Manufacturing companies have long relied on using several databases and data warehouses to accommodate various transactional and analytical workloads. The strategy of separating operational and analytical systems has worked well so far and has caused least interference with the operational process. However this strategy will not fare well in the near future for two reasons: Manufacturers are generating high volume, variety and veracity data using advanced IIoT platforms to create a more connected product ecosystem. The growth of IIoT data has been rapid and in fact, McKinsey and Company estimates that companies will spend over $175B in IIoT and edge computing hardware by 2025. A traditional manufacturing systems setup necessitates the deployment and maintenance of several technologies including graph databases (for asset digital models and relationships) and time series databases (for time series sensor data) and leads to IT sprawl across the organization. A complex infrastructure causes latency and delays in data access which leads to non-realization of real time insights for improving manufacturing operations. To establish an infrastructure that can enable real time analytics, companies need real time access to data and information to make the right decision in time. Analytics can no longer be a separate process, it needs to be brought into the application. The applications have to be supplied with notifications and alerts instantly. This is where application-driven analytics platforms such as MongoDB Atlas come into picture. We understand that to build smarter and faster applications, we can no longer rely on maintaining separate systems for different transactional and analytical workloads. Moving data between disparate systems takes time and energy and results in longer time to market and slower speed of innovation. 
Many of our customers start out using MongoDB as an operational database for both new cloud-native services as well as modernized legacy apps. More and more of these clients are now improving customer experience and speeding business insight by adopting application-driven analytics within the MongoDB Atlas platform. They use MongoDB to support use cases in real-time analytics, customer 360, internet of Things (IoT) and mobile applications across all industry sectors. As mentioned before, Manufacturing ecosystem employs a lot of databases just to run production operations. Once IIoT solutions are added to the mix, each solution (shown in yellow in Figure 1) may come with its own database (Time Series, relational, graph etc.) and the number of databases will increase dramatically. With MongoDB Atlas, this IT sprawl can be reduced as multiple use cases can be enabled using MongoDB Atlas (Figure 2). The versatility of the document model to structure data any way the application needs, coupled with an expressive API and indexing that allows you to query data any way you want is a powerful value proposition. The benefits of MongoDB Atlas are amplified by the platform’s versatility to address almost any workload. Atlas combines transactional processing, application-driven analytics, relevance-based search, and mobile edge computing with cloud sync. These capabilities can be applied to almost every type of modern applications being built for the digital economy by developers. Figure 1: IT sprawl with IIoT and analytics solutions deployment in Manufacturing Figure 2: MongoDB Atlas simplifying road to Smart Manufacturing MongoDB and Hyperscalers leading the way for smart manufacturing Manufacturers who are actively investing in digital transformation and IIoT are experiencing an exponential growth in data. All this data offers opportunities for new business models and digital customer experiences. To drive the right outcomes from all this data, manufacturers are setting up scalable infrastructures using Hyperscalers such as Azure, AWS and GCP. These hyperscalers offer a suite of components for efficient, scalable implementation of IIoT platforms. Companies are leveraging these accelerators to quickly build solutions, which help access, organize, and analyze previously untapped data from sensors, devices, and applications. In this article, we are focused on how MongoDB integrates with Microsoft Azure IoT modules and acts as the digital data hub for smart manufacturing use cases. MongoDB and Microsoft have been partners since 2019, but last year it was expanded, enabling developers to build data intensive applications within the Azure marketplace and Azure portal. This enables an enhanced developer experience and allows burn down of their Microsoft Azure Consumption Commitment. The alliance got further boost when Microsoft included MongoDB as a partner in its newly launched Microsoft Intelligent Data Platform Ecosystem . MongoDB Atlas can be deployed in 35 regions in Azure and has seamless integration with most of the Azure Developer services (Azure functions, App services, ADS), Analytics services (Azure Synapse), Data Governance (Microsoft Purview), ETL (ADF) and cross cutting services (AD, KMS, AKS etc.) powering building of innovative solutions. Example scenario: Equipment failure prediction Imagine a manufacturing facility that has sensors installed in their Computer Numerical Control (CNC) machines measuring parameters such as temperature, torque, rotational speed and tool wear. 
A sensor gateway converts analog sensor data to digital values and pushes it to Azure IoT Edge, which acts as a gateway between the factory and the cloud. This data is transmitted to Azure IoT Hub, where the IoT Edge is registered as an end device. Once we have the data in the IoT Hub, Azure Stream Analytics can be utilized to filter the data so that only relevant information flows into the MongoDB Atlas cluster. The connection between Stream Analytics and MongoDB is made via an Azure Function. This filtered sensor data inside MongoDB is used for the following purposes:

To provide data for the machine learning model that will predict the root cause of machine failure based on sensor data.
To act as a data store for prediction results that can be utilized by business intelligence tools such as Power BI using the Atlas SQL Interface.
To store the trained machine learning model checkpoint in binary-encoded format inside a collection.

The overall architecture is shown in Figure 3.

Figure 3: Overall architecture

Workflow: The sensors in the factory send time series measurements to Azure IoT Hub. For multiple machines, these sensors measure:

Product Type
Air Temperature (°C)
Process Temperature (°C)
Rotational Speed
Torque
Tool Wear (min)

IoT Hub feeds this sensor data to Azure Stream Analytics, where the data is filtered and pushed to MongoDB Atlas time series collections. The functionality of Stream Analytics can be extended by implementing machine learning models to do real-time predictive analytics on streaming input data. The prediction results can also be stored in MongoDB in a separate collection. The sensor data contains the device_id field, which helps us filter data coming from different machines. As MongoDB is a document database, we do not need to create multiple tables to store this data; in fact, we can use one collection for all the sensor data coming from various devices or machines. Once the data is received in MongoDB, sum and mean values of the sensor data are calculated for the predefined production shift duration and the results are pushed to MongoDB Atlas Charts for visualization. MongoDB time series window functions are used in an aggregation pipeline to produce the desired result. When a machine stoppage or breakdown occurs during the course of production, it may lead to downtime, because the operator has to find out the cause of the failure before the machine can be put back into production. The sensor data collected from the machines can be used to train a machine learning model that can automatically predict the root cause when a failure occurs and significantly reduce the time spent on manual root cause finding on the shop floor. This can lead to increased availability of machines and thus more production time per shift. To achieve this goal, our first task is to identify the types of failures we want to predict. We can work with the machine owners and operators to identify the most common failure types and note them down. With this important step completed, we can identify the data sources that have relevant data about each failure type. If need be, we can update the Stream Analytics filter as well. Once the right data is identified, we train a Decision Tree Classifier model in Azure Machine Learning and store it as a binary value in a separate collection inside MongoDB. Atlas Scheduled Triggers are used to invoke the model (via an Azure Function), and the failure prediction results are written back into a separate Failures collection in MongoDB.
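As a hedged sketch of the ingestion and per-shift roll-up described above: the article mentions time series window functions, while this simpler stand-in uses $group with $dateTrunc. Collection names, field names, the eight-hour shift length, and the connection string are all assumptions.

from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient("mongodb+srv://cluster.example.mongodb.net")["factory"]  # placeholder

# A time series collection keeps high-volume sensor readings compressed and
# query-friendly; device_id is the metadata field used to tell machines apart.
if "machine_telemetry" not in db.list_collection_names():
    db.create_collection(
        "machine_telemetry",
        timeseries={"timeField": "ts", "metaField": "device_id", "granularity": "seconds"},
    )

db.machine_telemetry.insert_one({
    "ts": datetime.now(timezone.utc),
    "device_id": "cnc-07",
    "air_temp_c": 26.4,
    "process_temp_c": 36.1,
    "rotational_speed": 1480,
    "torque_nm": 42.7,
    "tool_wear_min": 108,
})

# Roll up mean and sum values per machine per eight-hour shift, ready to be
# visualized in Atlas Charts.
shift_summary = db.machine_telemetry.aggregate([
    {"$group": {
        "_id": {
            "device": "$device_id",
            "shift": {"$dateTrunc": {"date": "$ts", "unit": "hour", "binSize": 8}},
        },
        "avg_torque": {"$avg": "$torque_nm"},
        "avg_process_temp": {"$avg": "$process_temp_c"},
        "total_tool_wear": {"$sum": "$tool_wear_min"},
    }},
])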
The schedule of these Atlas Triggers can be aligned with the production schedule so that they only fire when a changeover occurs, for example. After a failure is detected, the operator and supervisor need to be notified immediately. Using App Services, a mobile application is developed to send notifications and alerts to the floor supervisor and machine operator once a failure root cause is predicted. Figure 4 shows the mobile app user interface, where the user has an option to acknowledge the alert. Thanks to Atlas Device Sync, even when the mobile device is facing unreliable connectivity, the data stays in sync between the Atlas cluster and the Realm database in the app. MongoDB's Realm is an embedded database technology already used on millions of mobile phones as part of mobile apps as well as infotainment-like systems.

Figure 4: Alert app user interface

Business benefits of using MongoDB Atlas as a smart manufacturing data hub

Scalability: MongoDB is a highly scalable document-based database that can handle large amounts of structured, semi-structured, and unstructured data. Native time series collections are available that help store the large amounts of data generated by IIoT-enabled equipment in a highly compressed manner.

Flexibility: MongoDB stores data in a flexible, JSON-like format, which makes it easy to store and query data in a variety of ways. This flexibility makes it well-suited for handling the different data structures needed to store sensor data, ML models, and prediction results, all in one database. This removes the need to maintain separate databases for each type of data, reducing IT sprawl in manufacturing organizations.

Real-time analytics: As sensor data comes in, MongoDB aggregation pipelines can help in generating features to be used for machine learning models. Atlas Charts can be set up in minutes to visualize important features and their trends in near real time.

BI analytics: Analysts can use the Atlas SQL interface to access MongoDB data from SQL-based tools. This allows them to work with rich, multi-structured documents without defining a schema or flattening data. In a connected factory setting, this can be useful for generating reports on failures over a period of time and comparisons between different equipment failure types. Data from MongoDB can be blended with other sources of data to provide a 360-degree view of production operations.

Faster mobile application development: Atlas Device Sync bidirectionally connects and synchronizes Realm databases inside mobile applications with the MongoDB Atlas backend, leading to faster mobile application development and less time needed for maintenance of deployed applications.

Conclusion

The MongoDB Atlas developer data platform is designed and engineered to help speed up your journey towards smart manufacturing. It is not just suitable for high-speed time series workloads but also for workloads that power mobile applications and BI dashboards, leading to smarter applications, increased productivity, and eventually smarter factories.

Learn more

All the source code used in this project, along with a detailed deployment guide, is available on our public GitHub page. To learn more about how MongoDB enables IIoT for our customers, please visit our IIoT use cases page. Get started today with the MongoDB Atlas on Azure Marketplace listing.

February 27, 2023
Applied

How to Seamlessly Use MongoDB Atlas and Databricks Lakehouse Together

In a previous post , we talked briefly about using MongoDB and Databricks together. In this post, we'll cover the different ways to integrate these systems, and why. Modern business demands expedited decision-making, highly-personalized customer experiences, and increased productivity. Analytical solutions need to evolve constantly to meet this demand of these changing needs, but legacy systems struggle to consolidate the data necessary to service these business needs. They silo data across multiple databases and data warehouses. They also slow turnaround speeds due to high maintenance and scaling issues. This performance hit becomes a significant bottleneck as the data grows into terabytes and petabytes. To overcome the above challenges, enterprises need a solution that can easily handle high transaction volume, paired with a scalable data warehouse (increasingly known as a "lakehouse") that performs both traditional Business Intelligence (BI) and advanced analytics like serving Machine Learning (ML) models. In our previous blog post “ Start your journey-operationalize AI enhanced real-time applications: mongodb-databricks ” we discussed how MongoDB Atlas and the Databricks Lakehouse Platform can complement each other in this context. In this blog post, we will deep dive on the various ways to integrate MongoDB Atlas and Databricks for a complete solution to manage and analyze data to meet the needs of modern business. Integration architecture Databricks Delta Lake is a reliable and secure storage layer for storing structured and unstructured data that enables efficient batch and streaming operations in the Databricks Lakehouse. It is the foundation of a scalable lakehouse solution for complex analysis. Data from MongoDB Atlas can be moved to Delta Lake in batch/real-time and can be aggregated with historical data and other data sources to perform long-running analytics and complex machine learning pipelines. This yields valuable insights. These Insights can be moved back to MongoDB Atlas so they can reach the right audience at the right time to be actioned. The data from MongoDB Atlas can be moved to Delta Lake in the following ways: One-time data load Real-time data synchronization One-time data load 1. Using Spark Connector The MongoDB Connector for Apache Spark allows you to use MongoDB as a data source for Apache Spark. You can use the connector to read data from MongoDB and write it to Databricks using the Spark API. To make it even easier, MongoDB and Databricks recently announced Databricks Notebooks integration , which gives you an even easier and more intuitive interface to write complex transformation jobs. Login to Databricks cluster, Click on New > Data . Click on MongoDB which is available under Native Integrations tab. This loads the pyspark notebook which provides a top-level introduction in using Spark with MongoDB. Follow the instructions in the notebook to learn how to load the data from MongoDB to Databricks Delta Lake using Spark. 2. Using $out operator and object storage This approach involves using the $out stage in the MongoDB aggregation pipeline to perform a one-time data load into object storage. Once the data is in object storage, it can be configured as the underlying storage for a Delta Lake. To make this work, you need to set up a Federated Database Instance to copy our MongoDB data and utilize MongoDB Atlas Data Federation's $out to S3 to copy MongoDB Data and land it in an S3 bucket. 
2. Using the $out operator and object storage

This approach uses the $out stage in the MongoDB aggregation pipeline to perform a one-time data load into object storage. Once the data is in object storage, it can be configured as the underlying storage for a Delta Lake. To make this work, you need to set up a Federated Database Instance and use MongoDB Atlas Data Federation's $out to S3 to copy your MongoDB data and land it in an S3 bucket.

First, navigate to "Data Federation" on the left-hand side of your Atlas Dashboard and then click "Create Federated Database Instance" or "Configure a New Federated Database Instance."

Connect your S3 bucket to your Federated Database Instance. This is where the MongoDB data will be written. The setup wizard should guide you through this quickly, but you will need access to your AWS credentials.

Select an AWS IAM role for Atlas. If you created a role that Atlas is already authorized to read and write to your S3 bucket, select that role. If you are authorizing Atlas for an existing role or are creating a new role, refer to the documentation for how to do this.

Enter the S3 bucket information: enter the name of your S3 bucket and choose Read and write so that documents can be written to your S3 bucket.

Assign an access policy to your AWS IAM role. Follow the steps in the Atlas user interface to assign an access policy to your AWS IAM role. Your role policy for read-only or read and write access should look similar to the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        <role arn>
      ]
    }
  ]
}

Define the path structure for your files in the S3 bucket and click Next. You have now successfully configured the S3 bucket with Atlas Data Federation.

Next, connect to your MongoDB instance using the MongoDB shell. This command prompts you to enter the password:

mongosh "mongodb+srv://server.example.mongodb.net" --username username

Specify the database and collection that you want to export data from, and verify the data exists, using the following commands (replace db_name and collection_name with your actual values):

use db_name;
db.collection_name.find()

Use the $out operator to export the data to an S3 bucket:

db.[collection_name].aggregate([{$out: "s3://[bucket_name]/[folder_name]"}])

Make sure to replace [collection_name], [bucket_name], and [folder_name] with the appropriate values for your S3 bucket and desired destination folder.

Note: The $out operator will overwrite any existing data in the specified S3 location, so use a unique destination folder or bucket to avoid unintended data loss.
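To complete the picture, here is a minimal, illustrative PySpark sketch of turning the exported files into a Delta table. It assumes the $out stage landed JSON documents in the bucket and that the Databricks cluster can read from S3; the bucket, folder, and output paths are placeholders.

```python
# Minimal sketch: load the files exported by $out from S3 and store them as a Delta table.
# Assumes the export landed as JSON documents in s3://bucket_name/folder_name/
# and that the cluster has read access to the bucket; all names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-export-to-delta").getOrCreate()

exported_df = spark.read.json("s3://bucket_name/folder_name/")

(
    exported_df.write.format("delta")
    .mode("overwrite")
    .save("/delta/mongodb_export")  # Delta location used for downstream analytics
)
```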
Real-time data synchronization

Real-time data synchronization needs to happen immediately following the one-time load process. This can be achieved in several ways, as shown below.

1. Using Apache Kafka and Delta Live Tables

Streaming data from MongoDB to Databricks using Kafka and a Delta Live Tables pipeline is a powerful way to process large amounts of data in real time. This approach leverages Apache Kafka, a distributed event streaming platform, to receive data from MongoDB and forward it to Databricks in real time. The data can then be processed using Delta Live Tables (DLT), which makes it easy to build and manage reliable batch and streaming data pipelines that deliver high-quality data on the Databricks Lakehouse Platform.

Download and install the MongoDB source connector plugin in your Kafka cluster from here.
Update the following in the mongodb-source-connector.properties connector configuration file:
CONNECTION-STRING - MongoDB cluster connection string
DB-NAME - Database name
COLLECTION-NAME - Collection name
Note: These configurations can be modified based on the use case. Refer to this documentation for more details.
Deploy the connector configuration file in your Kafka cluster. This enables real-time data synchronization from MongoDB to a Kafka topic.

Log in to your Databricks cluster and click New > Notebook. In the Create Notebook dialog, enter a name, select Python as the default language, choose the Databricks cluster, and then click Create.
Obtain the IPython notebook for the DLT pipeline from here. Go to File > Import, navigate to the notebook you downloaded in the previous step, and click Import to add the data streaming notebook to your workspace.
Update the following variables in the notebook and save:
TOPIC - Kafka topic name (i.e., the DB.COLLECTION name)
KAFKA_BROKER - Kafka bootstrap server details
API_KEY - Kafka server API key
SECRET - Kafka server secret
Now, navigate to the sidebar and select the Workflows option. Within Workflows, choose the Delta Live Tables tab and select Create Pipeline. Give your pipeline a name and select Advanced for the product edition. Choose Continuous for the pipeline mode. Set the cluster policy to none and select the notebook you created under Notebook Libraries. Optionally, you can enter a storage location for the output data from the pipeline; if you leave the Storage location field blank, the system will use the default location. You can leave the settings in the Compute section at their default values. Click the Create button to create the pipeline, then run the pipeline to stream the data from Kafka to the Delta Live Table. Refer to this documentation to learn more about Delta Live Tables.

2. Using Spark Structured Streaming

MongoDB has released a version of the MongoDB Connector for Apache Spark that leverages the new Spark Data Sources API V2 with support for Spark Structured Streaming. The connector enables real-time micro-batch processing, allowing you to synchronize data from MongoDB to Databricks using Spark Streaming. This lets you process data as it is generated, with the help of MongoDB's change data capture (CDC) feature to track all changes. By utilizing Spark Streaming, you can make timely and informed decisions based on the most up-to-date information available in Delta Lake. More details about the streaming functionality can be found here.

Log in to your Databricks cluster and click New > Notebook. In the Create Notebook dialog, enter a name, select Python as the default language, choose the Databricks cluster, and then click Create.
Obtain the Spark streaming IPython notebook from here. Go to File > Import, navigate to the notebook you downloaded in the previous step, and click Import to add the data streaming notebook to your workspace.
Follow the instructions in the notebook to learn how to stream the data from MongoDB to Databricks Delta Lake using the Spark Connector for MongoDB.
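As a rough illustration of what such a streaming notebook does, here is a minimal PySpark Structured Streaming sketch. The connection details, database, collection, checkpoint, and output paths are placeholders, and the option names assume the v10.x MongoDB Spark Connector, so verify them against the connector documentation for your version.

```python
# Minimal sketch: continuously stream changes from MongoDB Atlas into a Delta table.
# Assumes the MongoDB Spark Connector v10.x is installed on the cluster;
# connection string, names, and paths below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mongodb-structured-streaming").getOrCreate()

change_stream_df = (
    spark.readStream.format("mongodb")
    .option("connection.uri", "mongodb+srv://user:password@cluster.example.mongodb.net")
    .option("database", "sample_db")
    .option("collection", "sample_collection")
    # Emit only the changed document rather than the full change-event envelope
    # (option name may differ across connector versions).
    .option("change.stream.publish.full.document.only", "true")
    .load()
)

query = (
    change_stream_df.writeStream.format("delta")
    .option("checkpointLocation", "/delta/checkpoints/sample_collection")
    .outputMode("append")
    .start("/delta/sample_collection_stream")
)

query.awaitTermination()
```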
3. Using Apache Kafka and object storage

Apache Kafka can be utilized as a buffer between MongoDB and Databricks. When new data is added to the MongoDB database, it is sent to the message queue using the MongoDB Source Connector for Apache Kafka. This data is then pushed to object storage using sink connectors, such as the Amazon S3 sink connector. The data can then be transferred to Databricks Delta Lake using the Auto Loader option, which allows for incremental data ingestion. This approach is highly scalable and fault-tolerant, as Kafka can process large volumes of data and recover from failures.

Download and install the MongoDB source and AWS S3 sink connector plugins in your Kafka cluster:
https://www.confluent.io/hub/mongodb/kafka-connect-mongodb
https://www.confluent.io/hub/confluentinc/kafka-connect-s3
Update the following in the mongodb-source-connector.properties connector configuration file:
CONNECTION-STRING - MongoDB cluster connection string
DB-NAME - Database name
COLLECTION-NAME - Collection name
Update the following in the s3-sink-connector.properties connector configuration file:
TOPIC-NAME - Kafka topic name (i.e., the DB.COLLECTION name)
S3-REGION - AWS S3 region name
S3-BUCKET-NAME - AWS S3 bucket name where you wish to push the data
Deploy the connector configuration files in your Kafka cluster. This enables real-time data synchronization from MongoDB to AWS S3 buckets.
Note: The above connector pushes data to the S3 bucket at regular intervals. These configurations can be modified based on the use case. Refer to the following documentation for more details: MongoDB Source Configuration, AWS S3 Sink Configuration.
Finally, load the data from the S3 buckets into Databricks Delta Lake using the Databricks Auto Loader feature. Refer to this documentation for more details.

In conclusion, the integration between MongoDB Atlas and the Databricks Lakehouse Platform offers businesses a complete solution for data management and analysis. The integration architecture between these two platforms is flexible and scalable, ensuring data accuracy and consistency, and all the data you need for analytics lives in one place in the Lakehouse. Whether through a one-time data load or real-time data synchronization, the combination of MongoDB Atlas as an operational data store (ODS) and the Databricks Lakehouse as an enterprise data warehouse/lake (EDL) provides an ideal solution for modern enterprises looking to harness the value of their data. So, if you're struggling with siloed data, slow decision-making, and outdated development processes, the integration of MongoDB Atlas and Databricks Lakehouse may be the solution you need to take your business to the next level. Please reach out to partners@mongodb.com for any questions.

February 27, 2023
Applied

Unifying Identity to Drive Customer Experience at a Leading Telco

As telecommunications companies around the world diversify product portfolios, adhere to new regulations, and execute acquisitions and mergers to excel in a mature industry, customers’ expectations for flawless service, speed, and availability are only growing. These combined challenges put pressure on companies' applications and tech stacks. More and more, telecommunications leaders are turning to open digital architectures to modernize the legacy enterprise architectures that can’t keep up with today’s customer demands. Throughout the industry, TM Forum’s Open APIs are becoming an integral part of digital transformations. These open APIs make it easier for telecommunications companies to enable seamless connectivity, interoperability, and portability across a complex ecosystem of services in a consistent way across the industry. MongoDB’s customer, one of the largest telecommunications companies in the world, is part of this trend. Read on to learn how this telecommunications giant collaborated with MongoDB Professional Services to modernize and implement TM Forum Open APIs, unlocking data to provide a great customer experience.

The challenge: Delivering a simple, consistent customer experience in the telecommunications industry

With founding roots dating back more than 150 years, MongoDB’s customer has a long history that led to a large number of subsidiaries covering a variety of services for end consumers, corporate clients, and governments. The company’s surging customer base began to outgrow its data and systems architecture. It became difficult to identify customers who held multiple products across the rapidly expanding portfolio of products and services, especially since many of the customers were gained through business acquisitions. As the customer base grew, it became harder to provide a positive customer experience, and marketing and cross-sell opportunities were missed. To improve customer interactions, the telecommunications enterprise envisioned a hub that unified customer identity across all services, products, and partners, with MongoDB at the heart of it. At a high level, this would be a data layer that accesses customer information in accordance with TM Forum specifications, creating a consistent single view of the customer. The company also aimed to decouple access to customer information from the underlying legacy systems to empower internal teams to drive their own transformation projects.

The end goals:

Deliver good customer experiences for accessing, purchasing, and managing accounts across the company’s existing services portfolio.
Make it easier for future services and acquisitions to be seamlessly integrated.

The solution: A profile hub using TM Forum Open APIs

To build this hub, the telecommunications company turned to MongoDB Professional Services, which provided a Jumpstart team in partnership with gravity9. Think of this combination as a complete application development team in a box, ready to bring this solution to life. This single view of identity, called Profile Hub, would put the company’s customer profile at the core of its data concepts, flipping the previous legacy data model on its head. Going forward, everything would start with the customer profile and move into products and services from there, instead of the other way around.
Profile Hub is an implementation of several Open APIs established by TM Forum, a global industry association for service providers and their suppliers in the telecommunications industry. The APIs we implemented form the basis for representing a customer, their role, and a set of permissions on that role. The API microservice applications were built using Java Spring Boot and are powered by MongoDB Atlas running in AWS. MongoDB change streams and Kafka were used to create an event-driven architecture, and a behavior-driven testing approach was used to run more than 1,000 automated tests that form a “living specification” for the developed code.

Figure 1: Functional structure of each TM Forum API microservice application

Each project contains:

Implementation of REST APIs (CRUD)
Event notification upon each successful API operation
Integration with Amazon SNS and Kafka

Each TM Forum API application is a separate microservice that implements the appropriate TM Forum specification, conforming to the REST API Design Guidelines (TMF630). Each one exposes the following operations:

Retrieval of a single object
Listing a collection of objects (supports limit, offset, sort, projection, and filtering by properties)
Partial update of an existing object
Creation of a new object
Deletion of an existing object

There are two ways an object may be updated via the application:

JSON Patch: performs an update as a series of operations that transforms a source document. We can filter/search documents using JSON Pointer or JSON Path.
JSON Merge Patch: represents the change as a lite version of the source document.

Each TM Forum API application exposes REST interfaces to exchange data with the client. After receiving the payload, the application stores it in a MongoDB collection. After each successful API operation, MongoDB fires a change stream event; the application listens for these events and, after receiving one, sends an event to the customer. Our microservice application supports fan-out messaging scenarios using AWS services: Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS). In this scenario, messages are pushed to multiple subscribers, which eliminates the need to periodically check or poll for updates and enables parallel asynchronous processing of the message by the subscribers. Application configuration parameters decide where a given message should be routed. Each event sent to an external system is also stored in an audit collection; if it does not reach its destination, we can replay the event sequence to restore the desired state.

Another library provides operations logging functionality. It can trace each request and response sent to the application, push it to a Kafka topic, and then, through the MongoDB connector, land it in a MongoDB collection. This operations logging application can easily be integrated with every TM Forum API microservice application.

For security, we encrypt data on the client side before it is sent over the network using MongoDB’s built-in Client-Side Field Level Encryption. Paired with this, we use a couple of AWS services: first, AWS Key Management Service (KMS), which gives centralized control over the cryptographic keys used to protect the data; second, AWS Secrets Manager, which is a secure and convenient storage system for API keys, passwords, certificates, and other sensitive data.
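Although the production microservices are written in Java Spring Boot, the change-stream-to-event pattern described above can be sketched in a few lines of Python. The database, collection, projected fields, and the publish helper below are hypothetical placeholders, not the customer's actual code.

```python
# Illustrative sketch of the change-stream listener pattern (not the customer's code).
# Assumes a MongoDB Atlas cluster reachable via the placeholder connection string.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://user:password@cluster.example.mongodb.net")
collection = client["profilehub"]["customer"]  # hypothetical database/collection names

# Project only the fields downstream consumers need, instead of the whole payload.
pipeline = [
    {"$project": {"operationType": 1, "documentKey": 1,
                  "fullDocument.id": 1, "fullDocument.status": 1}}
]

def publish_event(event: dict) -> None:
    """Hypothetical stand-in for pushing the event to SNS/SQS or Kafka."""
    print(event)

# Block on the change stream and forward each trimmed event to subscribers.
with collection.watch(pipeline, full_document="updateLookup") as stream:
    for change in stream:
        publish_event(change)
```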
Stripped-down change streams also help us limit the information we send, instead of sending the whole payload as the change stream body. Every TM Forum API is different, as each has a different domain model to work on. To test these unique applications, we use data-driven testing with the Spock framework. This lets us test the same behavior multiple times with different parameters and assertions, allowing us to reach upwards of 1,200 unit tests per application with only a few test cases implemented.

The results: A modern, customer-centric architecture

The Profile Hub APIs form the core of the company’s new standards-based customer data architecture. This supports a better customer experience (CX) strategy by allowing upstream applications to easily leverage customer data. The enterprise will be able to reuse these components to accelerate new use case implementations on MongoDB Atlas. In addition, as the company modernizes its architecture, it will realize cost savings by moving off legacy infrastructure with its rising maintenance and license costs. Developers also benefit, spending less time on maintenance and more time building and launching new applications and products. By working with MongoDB Professional Services and gravity9, the company was able to develop this solution in under 12 weeks, shaving several months off its original plans. New, fully compliant TM Forum APIs can now be delivered in a single sprint, allowing the telco to respond quickly to new business requirements. Looking forward, our client’s modern, customer-centric architecture will make it easier for them to navigate customer journeys and unlock revenue opportunities as they provide their customers with better, more connected products and experiences. Are you ready to meet with us to use MongoDB’s blueprint for accelerating TM Forum Open API implementation? Reach out to our Professional Services team to get started!

February 23, 2023
Applied

Ignite Launches Smart Spend Management Solution on MongoDB

Expenditures are under a microscope in today's macroeconomic environment. Organizations everywhere are trying to do more with less and stretch their budgets further. But managing spend can be an uphill battle when data is siloed across the organization. Ignite Procurement's spend management solution helps organizations unify procurement data across the enterprise, making procurement easier, better, and more efficient.

Building a smarter, scalable stack

Ignite CTO Valdemar Rolfsen says the company uses MongoDB Atlas because it gives them the flexibility they need to serve customers. Customer data can be structured or unstructured, and Atlas handles both seamlessly. He also says the rapid growth the company experienced in its early stages necessitated a database with more scalability than PostgreSQL could offer. MongoDB Atlas easily scales up or down, giving Ignite exactly what it needs to handle current workloads and the future growth the company anticipates. Rolfsen says that while the company doesn't run as many writes and reads as other use cases, it does run a lot of advanced operations in its in-house analytics platform, and MongoDB handles those needs easily and efficiently. He also praises the professional support he's received from the MongoDB for Startups program.

Ignite uses a mix of Atlas dedicated clusters and serverless instances, which have been generally available (GA) since June. Rolfsen says the work they're doing at Ignite fits very well within the serverless model. The Ignite platform is a mid-market and enterprise application used by companies in the U.S. and Europe. The platform sees heavy usage at certain times and very low usage at others. With serverless, Rolfsen says he doesn't have to worry about the time it takes for an idle server to ramp up. When resources are in high demand, the serverless database automatically scales up to meet the demand and back down as demand subsides. With serverless pricing, you pay for what you use, not for idle resources. The transparency of the pricing model is something Rolfsen finds especially attractive in the serverless offering. Ignite is also benefiting from credits the company received as part of the MongoDB for Startups program.

Rolfsen's affinity for the serverless offering goes beyond pricing. He recognizes the CO2 emissions associated with large virtual machines running continuously; serverless technology, he says, is a much better alternative for the environment and sustainability. Ignite's cloud usage accounts for 25% of the company's CO2 emissions, according to Rolfsen, and getting that number down is "really cool to see," he says. In addition to MongoDB Atlas serverless instances, Ignite also runs a combination of Google Cloud Run, Cloud Functions, and Kubernetes clusters. As the company expands its services, it is moving more toward microservices and cloud functions to limit the scope that domain teams have to work in.

How to find out more

The MongoDB for Startups program helps startups along the way with free MongoDB Atlas credits, one-on-one technical advice, co-marketing opportunities, and access to a vast partner network. Learn more about the benefits of the MongoDB for Startups program and sign up today. For more startup content, check out our wrap-up of the 2022 year in startups.

February 15, 2023
Applied

Real-Time ESG Data Management

ESG (Environmental, Social, and Governance) data collection and reporting has become a corporate priority, with over 96% of S&P 500 companies publishing sustainability reports in 2021, according to research from the Governance and Accountability Institute. Several factors are driving the adoption and use of ESG data, ranging from consumer preference for companies with positive ESG information to employees, who increasingly see environmental, social, and governance metrics as important indicators when choosing an employer. Many government bodies and regulators either have, or are considering, mandatory ESG data collection and reporting requirements for corporations under their jurisdiction. The European Union is taking the lead here, with several key pieces of legislation either already enacted or coming soon. In the US, the SEC has also announced proposed rule changes for securities reporting, mandating that companies make detailed climate-related disclosures in their filings. In addition to companies that report on their own data, financial firms, including the private equity industry, use ESG data and research to weigh risks and identify opportunities for the companies they invest in.

Faced with growing scrutiny around ESG reporting and scoring, companies are struggling to meet ever more detailed and comprehensive reporting requirements. At the heart of the problem is the sheer volume and variety of data companies are expected to ingest and analyze to produce the scores that investors, consumers, and government entities demand. And with real-time data making its way into reports, ESG data management is becoming even harder.

ESG data collection and analysis

The volume and variety of ESG data make collection and analysis difficult. The data collection problem can be broken down as follows:

Variety: Unlike financial datasets, which are mostly numerical, ESG metrics can include both structured and unstructured datasets, like an email or a media report. If a company wants to analyze satellite data to derive its own climate dataset, it may even need to analyze images and videos. Given these variables, companies need to employ a data model that can support many different types of data.

Velocity: As companies increasingly integrate real-time data sources into their ESG scoring systems, the velocity of data collected and analyzed increases exponentially. One example is loan due diligence in the financial sector. As customers demand faster loan approval turnaround times, financial institutions that currently rely on quarterly ESG data to make those decisions now need the information in real time to instantly approve loans in an ESG-compliant manner.

Volume: The increased variety of data sources, coupled with the growing velocity of data being collected, leads to an increase in the sheer volume of data requiring analysis. Currently, ESG ratings and scores are derived from a blend of human judgment and model-driven quantitative rating. But as the volume of data increases, along with the need for instant analysis of that data, real-time analytics and an increased use of AI/ML tools will become an ever greater part of ESG ratings and reporting.

On top of this, there are no universally applicable ESG standards, leaving companies to deal with multiple different standards, with different data requirements, depending on which jurisdictions they operate in.
Real-time ESG data analytics

Companies are increasingly incorporating real-time data into their ESG analysis, reporting, and scoring. Harnessing technologies such as cloud computing, AI, and machine learning, those that use real-time data can, for instance, instantly parse breaking news stories for ESG-related data on their investments, or incorporate up-to-the-minute satellite data into reports on a firm’s environmental impact. The financial services industry in particular is taking the lead on integrating real-time ESG data into investment decisions. Asset and fund managers use real-time data platforms that allow them to calculate accurate ESG scores to aid investment decisions and risk calculations. For example, a bank looking to invest in an electric vehicle company would be alerted to a breaking news story about a hazardous accident at the manufacturer’s battery plant, with follow-up data from social media or analyst reports quantifying the size of the public reaction and the level of negative market sentiment around the accident.

MongoDB and ESG data management

MongoDB Atlas is an ideal data foundation for ESG platforms. MongoDB Atlas uses the document data model, giving users the ability to ingest data from almost any source, consolidate a number of siloed data sets, easily search that data, and, with a few clicks, create customized views of the data without additional ETL operations to other databases or tools. MongoDB Atlas also future-proofs your ESG data platform with a flexible data schema that can easily adapt to rapidly changing ESG requirements and standards. See why Hydrus chose MongoDB Atlas as the basis for its ESG reporting platform.
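As a small illustration of that flexibility, the following Python sketch stores two very different ESG records, one numeric emissions reading and one unstructured news mention, in the same collection. The cluster URI, collection names, and fields are hypothetical.

```python
# Illustrative sketch of ingesting heterogeneous ESG records into one collection.
# Connection string, names, and fields are hypothetical placeholders.
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb+srv://user:password@cluster.example.mongodb.net")
esg_events = client["esg"]["events"]

# A structured, numeric emissions reading and an unstructured media mention
# can live side by side without schema migrations.
esg_events.insert_many([
    {
        "company": "ACME Corp",
        "type": "emissions_reading",
        "scope": 1,
        "co2_tonnes": 1240.5,
        "reported_at": datetime(2022, 12, 31, tzinfo=timezone.utc),
    },
    {
        "company": "ACME Corp",
        "type": "news_mention",
        "headline": "ACME fined over waste-water discharge",
        "sentiment": -0.7,
        "source": "newswire",
        "observed_at": datetime.now(timezone.utc),
    },
])

# A simple aggregation: count negative news signals per company.
for doc in esg_events.aggregate([
    {"$match": {"type": "news_mention", "sentiment": {"$lt": 0}}},
    {"$group": {"_id": "$company", "mentions": {"$sum": 1}}},
]):
    print(doc)
```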
FAQ

ESG data definition

ESG (Environmental, Social, and Governance) data comes from a growing list of sources, all of which help “score” a corporation based on how well positioned it is to handle the risks and opportunities presented by the environment, societal stakeholders, and corporate governance.

Environment - What are a company's greenhouse gas emissions? How about its stewardship over natural resources? And how well positioned is it to weather physical climate risks like global warming, flooding, drought, and fire?
Social - How does a company measure up against prevailing fair wage and employee engagement metrics? What impact does a company have on the communities where it operates?
Governance - How well is a company managed? How responsive is a company to shareholders? How accountable is leadership? What safeguards are in place to ensure transparency?

The growing interest in ESG data science and analytics has prompted the rise of a new industry of ESG data companies and ESG data management software vendors.

What are the different ESG data sources?

ESG data comes from two primary sources: 'inside-out' and 'outside-in'. Inside-out data is supplied by companies, used for analysis, and usually lags 6-12 months due to annual ESG-related disclosures. Outside-in data is updated more regularly, sometimes even in real time. Most financial institutions, including banks that often have access to a lot of financial and company data from their customers, do not rely solely on their own data. ESG data analysis requires a broad range of inputs and data that a bank does not possess and cannot obtain even from its customers. For example, a bank may want to assess the risk of flooding for a chip manufacturing company that has factories in several provinces in China. The bank would need to collect the flood data from the different operating locations in order to score the risk. As banks don’t typically collect flood data themselves, the bank would purchase data from third-party climate data vendors. At this nascent stage of climate risk assessment within the banking industry, it is likely that the bank would not even attempt to collect the raw climate data and create the risk models to score the risk, relying instead on third-party risk scoring vendors. The bank would then use these scores and combine them with models in which it has strong competencies (e.g., credit risk) to produce flood risk-adjusted credit risk scores for loan approvals.

Why is ESG data essential for investors?

ESG data is used by asset managers and investors for market analysis, supporting asset allocation and risk management, and providing insights into the long-term sustainability of investments in various corporations.

February 14, 2023
Applied

Green Lending, Green Data - The Impact on Banks Explained

On 13 December 2022, the European Banking Authority (EBA) published its roadmap for sustainable finance. The roadmap, a collection of standards and rules aimed at better integrating ESG risk considerations into the banking sector, is set to come into effect in a rolling fashion over the next three years. In our work with leading European banks, clients regularly tell us how they’re starting to build or revamp their ESG data platforms in anticipation of the coming changes around green financing. The conversation typically revolves around how they can flexibly incorporate the many new data sources, types of data, and formats that they will have to ingest and analyze under the EBA’s roadmap of changes. Clients are also increasingly interested in what MongoDB has to offer around real-time ESG information delivery. These inquiries come even though the EBA does not yet require real-time public disclosure and regulatory submission of sustainability information. Check out our blog on real-time ESG data management.

Green lending, green metrics, green data

One interesting area of the EBA roadmap concerns loans with environmental sustainability features, so-called green lending. These loans, sometimes called energy-efficient or green mortgages, are typically given to retail clients and SMEs to make energy-efficient improvements to homes and other buildings, such as adding solar panels or funding other renewable energy work. According to the roadmap, "... the EBA will consider the merits of an EU definition for green loans and mortgages, and will identify potential measures to encourage their uptake or facilitate their access by retail and SME borrowers…. In line with the request, the EBA will deliver its advice to the European Commission by December 2023."

With the EBA pushing to increase the uptake of green loans, affected banks will have to rework their scoring criteria for green loans to fit the EBA’s new classification and incentives guidelines:

Banks will need to change their credit scoring models to grant "green" retail loans and mortgages.
Changes will also likely be required for risk-adjusted performance indicators such as RAROC (Risk-Adjusted Return on Capital), which many banks use to quantify the risk-return ratio, and for other indicators or metrics used in pricing and approval decisions. This may result in a change in the acceptance and performance of these loans and mortgages.
As requested by the European Commission, the changes would not only affect new loans, but also "...already originated loans". Depending on the EBA's final advice, this could mean reassessing existing loans with new data to determine whether they can now be classified as “green”. Additionally, a reassessment of most, if not all, of the related risk management and reporting indicators would also need to happen.

All of these potential changes mean banks will have to collect more data, from more disparate sources, than ever before.

The impact on banks

All of this can mean a significant impact on the loan origination process and the data systems supporting it. Here are some questions for banks to think about:

Managing evolving or unforeseen changes. How would a bank change its loan origination system and related data platform (e.g., a credit data mart) to quickly adapt to the new green loan taxonomy and data elements?
As the standards and classification rules are still evolving, how can one design an application and data schema that assures the development team it can easily adapt without throwing away existing work?

Capturing different data attributes for the same product/loan. How can banks take existing retail loan products or mortgages and integrate different assessment criteria, such as country-specific regulations within the EU and outside of it? How about incorporating specific market or business practices within a country, such as a car loan whose criteria vary by the type of car (battery electric vs. hybrid vs. gas powered)?

Incorporating new data types and formats. How can one capture information that goes beyond traditional financial credit data, including new data for both green classifications and green risk assessments? In its 2022 climate risk stress testing, the ECB already gave a preview that geospatial data will be required to assess loan risks. How can banks add geo-location data and run queries and analytics on it seamlessly alongside the other data on their existing data platform? How about incorporating a whole raft of new unstructured sources, such as text descriptions (emails, collateral documentation) that contain the ESG or sustainability characteristics required to correctly classify the loan, for instance carbon emission descriptions of the house under mortgage?

Finding insights amid the data explosion. With the increasing volume and variety of data sources, how can borrowers quickly find the information (such as guidelines, related product information, or ESG-related guidance on obtaining the relevant data) needed to correctly submit all the required loan information? Can a potential borrower type in "green car loan" and have the lending bank’s website or mobile app immediately return the relevant information? How can green loan credit officers quickly search for borrowers pending approval whose textual collateral or risk information matches keywords related to new risk findings that change the risk decision?

Meeting the demands of customers, and the competition. Will the bank’s loan origination systems be able to provide a sustainability risk-adjusted credit score in real time for in-principle approvals? Will that system scale to keep up with the demand from a large volume of retail borrowers?

How MongoDB can help

Loan origination, including post-origination monitoring, requires a large system with multiple modules and corresponding internal user groups, covering loan application and data capture, data enrichment, financial risk analytics, decision and approvals, and loan closure. There are many ways a bank can architect or revise its loan origination and monitoring systems. Below is a simplified architecture with MongoDB for a green loan origination system built to service the EBA’s proposed green loan changes.

Simplified architecture for Green Loan origination.

A few key features of using MongoDB Atlas:

Atlas Device Sync can automatically synchronize MongoDB's mobile database, Realm, deployed on users' mobile devices, back to Atlas. Borrowers (or even loan officers) who need to submit or review a large set of documents can access them faster with the offline-first Realm mobile database. The use of Realm and Device Sync also speeds up mobile development and alleviates the need to maintain complex data synchronization logic.
Atlas Search is an embedded full-text search in MongoDB Atlas that gives you a seamless, scalable experience for building relevance-based app features. Built on Apache Lucene, Atlas Search eliminates the need to run a separate search system alongside your database. Combined with Atlas Search facets, users can quickly narrow down Atlas Search results based on the most frequent values in a specified attribute field.

Atlas Data API and GraphQL - MongoDB Atlas provides a low-code/no-code approach for developers to quickly develop APIs for other internal or even external applications (like Third-Party Providers (TPPs) in an Open Banking ecosystem) to access data in a secure manner. MongoDB supports the use of GraphQL, a query language for API development designed to let developers construct requests that pull data from multiple data sources in a single API call. This helps eliminate over-fetching and circumvents the need for multiple costly round trips to the server. The ease of building data access and the reduction in round trips help banks accelerate business with TPPs in an open banking ecosystem, improving the customer experience either with direct access to the bank's mobile applications or those of a TPP.

The aggregation pipeline is a framework for data aggregation, modeled on the concept of data processing pipelines: documents enter a multi-stage pipeline that transforms them into aggregated results. This allows bank development teams to implement data analytics as a natural sequence of data processing stages, rather than using multiple nested SQL statements (see the sketch after this list). This framework is the cornerstone of the high-performance combined transactional and analytical capabilities MongoDB is known for. Banks can develop both real-time, on-the-fly ESG-adjusted credit scoring and the batch analytics processing required as part of the loan origination process.

Atlas Charts is a data visualization tool built into Atlas. It provides a clear understanding of your data, highlighting correlations between variables and making it easy to discern patterns and trends within your dataset. The Charts API allows banks to build in-app business intelligence with a variety of analytics tools to help both the borrowers and the bank gain more insight into the loan. ESG vendors have used MongoDB to help their Fortune 500 customers improve their ESG performance. For retail loans, where the ESG complexity or green requirements should be far less complicated, the self-service analytics that Charts provides would help accelerate green retail loan processing even more.

Atlas Triggers allow you to execute server-side logic in response to database events or according to a schedule. Triggers can be combined with many other integration features, such as the Data API mentioned above, to perform the necessary actions in the loan workflow. No task would be missed or remain unprocessed!
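The sketch below, in Python with PyMongo, shows what such an aggregation pipeline might look like for a hypothetical loan-application collection. Every collection and field name (for example, energy_rating and base_credit_score) is an illustrative assumption, not part of any real bank's schema.

```python
# Illustrative aggregation pipeline for a hypothetical green-loan scoring step.
# Collection and field names are assumptions for the sake of the example.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://user:password@cluster.example.mongodb.net")
applications = client["lending"]["loan_applications"]

pipeline = [
    # Keep only applications flagged as candidate green loans.
    {"$match": {"loan_type": "retail_mortgage", "green_candidate": True}},
    # Derive a sustainability-adjusted score from existing fields.
    {"$set": {
        "adjusted_score": {
            "$add": [
                "$base_credit_score",
                {"$cond": [{"$in": ["$energy_rating", ["A", "B"]]}, 25, 0]},
            ]
        }
    }},
    # Summarize by energy rating for reporting.
    {"$group": {
        "_id": "$energy_rating",
        "applications": {"$sum": 1},
        "avg_adjusted_score": {"$avg": "$adjusted_score"},
    }},
    {"$sort": {"_id": 1}},
]

for row in applications.aggregate(pipeline):
    print(row)
```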
Offer Green Loans with MongoDB

One of the questions I am asked by peers in risk management is, “Why would ESG be relevant to retail loans?” Such a question makes me suspect there is still a lack of understanding of the relevance of ESG, sustainability, and climate risk among those who may be working in ESG but not in the retail lending business. The EBA's roadmap clearly indicates a need not just to require sustainability to be incorporated into retail loans and green mortgages, but also to develop standards and guidelines to support that. The EBA's sustainable finance roadmap clarifies, consolidates, and expands on earlier plans and should help impacted financial institutions prepare for the wave of changes coming in ESG and sustainability financing. Both business and technology teams should start thinking about how to adapt to these evolving requirements and newly forming standards, whether they operate in a market directly impacted by the EBA’s roadmap or in one that will be influenced by these new standards. After all, EU regulation is often referenced and/or adopted by regulators in other countries and regions.

February 13, 2023