MongoDB Developer

Coding with MongoDB - news for developers, tips and deep dives

Mainframe Data Modernization with MongoDB Powered by Wipro's "ModerniZ" Tool

This post highlights a practical, incremental approach to modernizing mainframe data and workloads into cloud-hosted microservices using MongoDB's modern, general purpose database platform. Enterprises modernize their mainframes for a number of reasons—to increase agility, lower the risk of aging legacy skills, and reduce total cost of ownership (TCO). But the greatest underlying benefit of modernization lies in a company's ability to access and make sense of its own data more quickly. Gleaning valuable business insights through the use of real-time data and AI/ML models is at the heart of today's most successful and innovative companies. Consider the following business processes and their reliance on real-time insights:

- Real-time fraud detection, KYC, and score calculation
- Supporting new API requirements such as PSD2, Open Banking, and Fintech APIs
- Payment pattern analysis
- Moving away from hundreds of canned reports to template-based, self-service configurable reporting
- Real-time management reporting

With the continued emergence of mobile and web apps, enterprises are looking to render content even faster, as well as scale up and down on demand. However, mainframes often serve as the true system of record (SoR) and maintain the golden copy of core data. In a typical enterprise, inquiry transactions on the mainframe contribute over 80% of the overall transaction volume—in some cases up to 95%. The goal for organizations is to increase the throughput of these inquiry transactions with improved response time. However, during real-time transaction processing, middleware must orchestrate multiple services, transform service responses, and aggregate data from core mainframe and non-core applications. This architecture prevents the legacy services from seamlessly scaling on demand with improved response time. At the same time, CIOs have significant concerns around the risks of big bang mainframe exit strategies, especially in the financial services, insurance, and retail sectors, where these complex applications serve the core business capabilities. Wipro and MongoDB's joint offering, "ModerniZ," can help alleviate these risks significantly by introducing a practical, incremental approach to mainframe modernization.

An incremental solution: Offloading inquiry services data with ModerniZ

To keep up with changing trends in an agile manner, and to bridge the gap between legacy monoliths and digital systems of engagement, a tactical modernization approach is required. While complete mainframe modernization is a strategic initiative, offloading inquiry services data and making it available off the mainframe is a popular approach adopted by several enterprises. Below are a few business requirements driving mainframe data offload:

- Seamlessly scale volume handling by up to 5X
- Improve response time by up to 2X
- Reduce mainframe TCO by up to 25%
- Direct API enablement for B2B and partners (change the perception of being antiquated)
- Improve time to market on new enhancements by up to 3X
- Provision a separate security mechanism for inquiry-only services
- Access a single view data store for intra-day reporting, analytics, and events handling

These business requirements, as well as the challenges in current mainframe environments, warrant an offloaded data environment with aggregated and pre-enriched data from different sources in a format which can be directly consumed by channels and front-end systems.
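For illustration only, a pre-enriched, inquiry-ready customer document in such an offloaded MongoDB store might look like the sketch below; the collection name and fields are hypothetical and not part of ModerniZ or any specific customer schema.

db.customer360.insertOne({
  customerId: "C0012345",
  demographics: { name: "A. Kumar", segment: "Retail", city: "Chennai" },
  accounts: [
    { accountNo: "SB-889123", type: "Savings", balance: NumberDecimal("15230.75") },
    { accountNo: "CC-102938", type: "CreditCard", balance: NumberDecimal("-420.10") }
  ],
  recentTransactions: [
    { txnId: "T998", amount: NumberDecimal("-85.00"), channel: "POS", at: ISODate("2021-09-01T10:15:00Z") }
  ],
  lastSyncedAt: ISODate("2021-09-01T10:16:02Z")
})

Because the document is already aggregated across sources, an inquiry service can answer a single-view request with one indexed lookup, for example db.customer360.findOne({ customerId: "C0012345" }).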
This is where our joint offering for mainframe modernization with MongoDB and Wipro's ModerniZ tool comes into play. ModerniZ is Wipro's IP platform specially focused on modernizing the UI, services, and data layer of System Z (mainframe). ModerniZ has multiple in-house tools and utilities to accelerate every phase of the legacy modernization journey. Let's dig deeper into the solution elements.

"CQRS with ModerniZ" for transactional and operational agility

Command Query Responsibility Segregation (CQRS) is an architectural principle that classifies an operation as either a 'command,' which performs an action, or a 'query,' which returns data to the requestor—but not both. CQRS achieves this by separating the read data model from the write data model. Separating these two operations in a business process helps optimize performance, reduce the cost associated with inquiry transactions, and create a new model which can grow vertically and horizontally. MongoDB, with its document model and extensive set of capabilities, is best suited to house this offloaded 'read data model.' MongoDB's JSON/BSON document model helps in pre-enriching the data and storing it in an 'inquiry ready' format, which simplifies the overhead for front-end consumers.

Enterprise use cases for CQRS:

- Customer demographic information, e.g. monetary and non-monetary transaction inquiries in banking and financial services
- Payer view in healthcare
- Single view across policy administration engines and pricing engines
- Consolidated participant view in benefit administration platforms
- Single view of manufacturing production control systems across plants or countries

The process below indicates the step-by-step approach to enabling CQRS by offloading the data and transactional volumes into a modernized landscape, and continuously syncing the data (based on business criticality) across platforms.

Figure 1. Mainframe data modernization process

The visual below indicates the conceptual target architecture, where the service delivery platform (API/middleware layer) identifies and routes each transaction to the respective system. Any update which happens in the legacy system will be cascaded to the target MongoDB based on the business criticality of the fields.

Figure 2. Post data modernization process view

Mainframe services will continue to be exposed as domain APIs via zCEE for the command services (update transactions), and the newly built microservices will serve the inquiry transactions by fetching data from MongoDB. Any data updates in the mainframe will be pushed to MongoDB. The table below indicates how different fields can be synced between the mainframe and MongoDB, as well as their corresponding sync intervals. Java programs or Spark jobs consume the JSON documents from the Kafka cluster, merge them into MongoDB, and create new documents.

Sync Type | Field Classification | Sync Strategy
Type-1 | Near real-time sync for critical fields | Using Kafka queues / CICS event triggers / DB2 / 3rd-party CDC replicators / queue replicators
Type-2 | Scheduled batch polling sync for less critical fields | Using mini intra-day batch / replicators / ELT
Type-3 | EoD batch sync for non-critical fields | Batch CDC sync / update / ELT
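As a rough illustration of the Type-1 path, a consumer that picks up a change event from Kafka could merge the changed critical fields into the offloaded read model with an upsert. This is only a sketch; the collection and field names are hypothetical and not prescribed by ModerniZ.

db.customer360.updateOne(
  { customerId: "C0012345" },
  {
    $set: {
      "demographics.city": "Coimbatore",   // critical field changed on the mainframe
      lastSyncedAt: new Date()
    },
    $setOnInsert: { createdAt: new Date() }
  },
  { upsert: true }
)

Less critical fields would follow the Type-2 or Type-3 batch paths described in the table above.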
How MongoDB and Wipro's ModerniZ helps

Wipro's ModerniZ platform provides multiple tools to accelerate the modernization journey across phases, from impact analysis to design to build to deployment. For data modernization, ModerniZ provides five tool sets, such as PAN (Portfolio Analyzer), SQL Converter, and automated data migration, all of which can be leveraged to yield a minimum committed productivity gain of 20%.

Figure 3. Mainframe to cloud transformation using ModerniZ

Why MongoDB for mainframe modernization?

MongoDB is built for modern application developers and for the cloud era. As a general purpose, document-based, distributed database, it facilitates high productivity and can handle huge volumes of data. The document database stores data in JSON-like documents and is built on a scale-out architecture that is optimal for any developer who builds scalable applications through agile methodologies. Ultimately, MongoDB fosters business agility, scalability, and innovation. Some key benefits include:

- Deploys across clouds in nearly all regions
- Provides a document model that is flexible and maps to how developers think and code
- Costs a fraction of the price of other offloading solutions built using relational or other NoSQL databases, and even a larger, sharded MongoDB environment brings orders-of-magnitude savings compared to the traditional mainframe MIPS-based licensing model
- Allows complex queries to be run against data via an extremely powerful aggregation framework. On top of that, MongoDB provides a BI Connector for dedicated reporting/business intelligence tools, as well as specific connectors for Hadoop and Spark, to run sophisticated analytical workloads
- Offers enterprise-grade management through its Enterprise Advanced (EA) product and includes security features which cover all areas of security: authentication, authorization, encryption, and auditing. Competitors often offer only a subset of those capabilities
- Provides a unified interface to work with any data generated by modern applications
- Includes in-place, real-time analytics with workload isolation and native data visualization
- Maintains distributed multi-document transactions that are fully ACID compliant
- Has successfully implemented mainframe offloading solutions before, and customers have publicly talked about the success (e.g. a top financial services enterprise in the EU)

Roadmap and contact details

Work is in progress on offloading mainframe read workloads to GCP native services and MongoDB. Contact partner-presales@mongodb.com for more information.

September 16, 2021
Developer

Serverless Instances Now Offer Extended Regional and Cloud Provider Support

Today's applications are expected to just work, regardless of time of day, user traffic, or where in the world they are being accessed from. But in order to achieve this level of performance and scale, developers have to meticulously plan for infrastructure needs, sometimes before they even know how successful their application will be. In many cases, this is not feasible and can lead to overprovisioning and overpaying. But what if you could forgo all of this planning and the database would seamlessly scale for you? Well, now you can - with serverless instances on MongoDB Atlas. Since we announced serverless instances in preview at MongoDB.live, we have been actively working toward implementing new functionality to make them more robust and widely available. With our most recent release, serverless instances now offer expanded cloud provider and region support, and work with the MongoDB tools.

Deploy a serverless instance on the cloud provider of your choice

With our dedicated clusters on MongoDB Atlas, you have the flexibility to run anywhere with global reach on the cloud provider of your choice, so you can deliver responsive and reliable applications wherever your users are located. Our goal is to provide this same flexibility for serverless instances. We're happy to announce that you can now deploy a serverless instance in ten regions on AWS, Google Cloud, and Azure. You'll see when deploying a serverless instance that there are now more regions supported on AWS, as well as two available regions on both Google Cloud and Azure - so you can get started with the cloud provider that best suits your needs or the region that's closest to you. We will continue to add new regions over time to ensure coverage where you need it most.

Easily import your data with MongoDB tools

With this release, we have also made it easier to work with your data. You can now easily import data from an existing MongoDB deployment using the MongoDB tools, including mongodump, mongorestore, mongoexport, and mongoimport. In order to use MongoDB tools with serverless instances, you will need to be using the latest version. If you have additional feature requests that would make your developer experience better, share them with us in our feedback forums.

Database deployment made simple

With serverless instances, you can get started with almost no configuration needed - MongoDB Atlas will automatically scale to meet your workload needs, whether you have variable traffic patterns or you're looking for a sandbox database for your weekend hobby project. If you haven't yet given serverless instances a try, now is a great time to see what they can offer. If you have feedback or questions, we'd love to hear them! Join our community forums to meet other MongoDB developers and see what they're building with serverless instances. Create your own serverless instance on MongoDB Atlas. Try the Preview.

September 16, 2021
Developer

A Guide to Freeing Yourself from Legacy RDBMS

Oracle introduced the first commercial relational database (RDBMS) to the market in 1979 — more than a decade before the World Wide Web. Now, digital transformation is reshaping every industry at an accelerating pace. In an increasingly digital economy, this means a company's competitive advantage is defined by how well it builds software around its most critical asset — data. MongoDB and Palisade Compliance have helped some of the largest and most complex Oracle customers transform their architecture and shift to a cloud-first world. Although every client is unique, we have identified three important steps to moving away from Oracle software, reducing costs, and achieving your digital transformation goals:

1. Understand your business and technical requirements for today and tomorrow, and identify the technical solution and company that will be by your side to help future-proof your organization.
2. Decipher your Oracle contracts and compliance positions to maximize cost reduction initiatives and minimize any risks from Oracle audits and non-compliance that may derail your ultimate goals.
3. Mobilize internal momentum and traction to make the move.

MongoDB can help with #1, Palisade Compliance assists with #2, and you have to supply #3. This is a guide to getting started, as outlined by the main pillars of success above.

1. Understand your requirements and find the right partner — MongoDB

The most common requirements we hear from organizations are that they need to move faster, increase developer productivity, and improve application performance and scale -- all while reducing cost and breaking free from vendor lock-in. For example, to keep pace with demands from the business, Travelers Insurance modernized its development processes with a microservices architecture supported by agile and DevOps methodologies. But the rigidity of its existing Oracle and SQL Server databases created blockers to moving at the speed it needed. The solution was MongoDB and its flexible data model. Travelers eliminated the three-day wait to make any database change, creating a software development pipeline supporting continuous delivery of new business functionality. Similarly, Telefonica migrated its customer personalization service from Oracle to MongoDB. Using Oracle, it took 7 developers, multiple iterations, and 14 months to build a system that just didn't perform. Using MongoDB, a team of 3 developers built its new personalization service in 3 months, and it now powers both legacy and new products across the globe. MongoDB helps Telefonica be more agile, save money, and drive new revenue streams.

While some organizations try to innovate by allowing siloed, modern databases to coexist with their legacy relational systems, many organizations are moving to fully replace RDBMS. Otherwise, a level of complexity remains that creates significant additional work for developers, because separate databases are required for search, additional technologies are needed for local data storage on mobile devices, and data often needs to be moved to dedicated analytics systems. As a result, development teams move slowly, create fewer new features, and cost the organization more capital. MongoDB provides the industry's first application data platform that allows you to accelerate and simplify how you build with data for any application. Developers love working with MongoDB's document model because it aligns with how they think and code.
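To make that concrete, here is a small, purely illustrative sketch of the document model: a single customer document that would otherwise be spread across several relational tables. The collection and field names are hypothetical and not taken from the case studies above.

db.customers.insertOne({
  customerId: 1001,
  name: "Avery Smith",
  email: "avery.smith@example.com",
  addresses: [
    { type: "billing", city: "Hartford", country: "US" }
  ],
  orders: [
    { orderId: 5001, orderedAt: ISODate("2021-06-01T00:00:00Z"), total: NumberDecimal("249.99") },
    { orderId: 5002, orderedAt: ISODate("2021-07-15T00:00:00Z"), total: NumberDecimal("89.50") }
  ]
})

Related data lives together in one document, so reading a customer and their recent orders is a single query rather than a multi-table join.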
The summarized functional requirements that we typically hear from leading companies and development teams regarding what they require from a data platform include:

- A data structure that is both natural and flexible for developers to work with
- Auto-scaling and multi-node replication
- Distributed multi-document transactions that are fully ACID compliant
- Fully integrated full-text search that eliminates the need for separate search engines
- A flexible local datastore with seamless edge-to-cloud sync
- In-place, real-time analytics with workload isolation and native data visualization
- The ability to run federated queries across your operational/transactional databases and cloud object storage
- Turnkey global data distribution for data sovereignty and fast access to Data Lake
- Industry-leading data privacy controls with client-side, field level encryption
- Freedom to run anywhere, including the major clouds across many regions

MongoDB delivers everything you need from a modern data platform. But it's not just about being the right data platform; we're also the right modernization partner. Through our Modernization Program, we have built and perfected modernization guides that help you select and prioritize applications, review best practices, and design best-in-class, production-grade migration frameworks. We've built an ecosystem around accelerating and simplifying your journey that includes:

- deployment on the leading cloud providers to enable the latest innovations
- technology companies that help with data modeling, migration, and machine learning
- expert System Integrators to provide you with tools, processes, and support to accelerate your projects

We are proud to be empowering development teams to create faster and develop new features and capabilities, all with a lower total cost of ownership.

2. Manage Oracle as you move away — Palisade Compliance

Oracle's restrictive contracts, unclear licensing rules, and the threat of an audit can severely impact a company's ability to transform and adopt new technologies that are required in a cloud-first world. To move away from Oracle and adopt new solutions, companies must be sure they can actually reduce their costs while staying in compliance and avoiding the risks associated with an audit. There will be a time when you are running your new solution and your legacy Oracle software at the same time. This is a critical phase in your digital transformation, as you do not want to be tripped up by Oracle's tactics and forced to stay with them. It may seem counterintuitive, but as you spend less with Oracle you must be even more careful with your licensing. As long as you keep spending money with Oracle and renewing those expensive contracts, the threat of an audit and non-compliance will remain low. Oracle is unlikely to audit a company that keeps giving it money. However, the moment you begin to move to newer technologies, your risk of an audit significantly increases. As a result, you must be especially vigilant to prevent Oracle from punishing you as you move away from them. Even if you've found a technical partner and managed your Oracle licenses and compliance to ensure no surprises, you still have to find a way to reduce your costs. It's not as simple as terminating Oracle licenses and seeing your support costs go down. As stated above, Oracle contracts are designed to lock in customers and make it nearly impossible to actually reduce costs. Palisade Compliance has identified eleven ways to manage your Oracle licenses and reduce your Oracle support costs.
It is critical that you understand and identify the options that work for your firm, and then build and execute a plan that ensures your success.

3. Mobilize internal momentum and traction to make the move

Legacy technology companies excel at seeding doubt into organizations and preventing moves that threaten their antiquated solutions. Unfortunately, too many companies succumb to these tactics and are paralyzed into a competitive disadvantage in the market. In software, as in life, it's easier to stay the course than to follow through with change. But when it comes to technical and business decisions that impact the overall success and direction of an organization, innovation and change aren't just helpful, they're necessary to survive--especially in a world with high customer demands and easy market entry. Ensuring you have the right technical partner and Oracle advisor is the best way to build the confidence and momentum needed to make your move. Creating that momentum is easier with MongoDB's database platform, consisting of a fully managed service across 80+ regions, and Palisade's expertise in Oracle licensing and contracts.

Technical Alternative (MongoDB) + Independent Oracle Advisors (Palisade) ⇒ Momentum

Parting thoughts

To schedule a preliminary health check review and begin building the right strategy for your needs, fill out your information here. And to learn more about MongoDB's Modernization Program, visit this page.

About Palisade Compliance

With over 400 clients in 30 countries around the world, Palisade is the leading provider of Oracle-independent licensing, contracting, and cost reduction services. Visit the website to learn more. To schedule a complimentary one-hour Oracle consultation, send an email to info@palisadecompliance.com.

September 2, 2021
Developer

Highlight What Matters with the MongoDB Charts SDK

We're proud to announce that with the latest release of the MongoDB Charts SDK you can now apply highlights to your charts. These allow you to emphasize and deemphasize elements of your charts using MongoDB query operators. Build a richer interactive experience for your customers by highlighting with the MongoDB Charts embedding SDK. By default, MongoDB Charts allows for emphasizing parts of your charts by series when you click within a legend. With the new highlight capability in the Charts Embedding SDK, we put you in control of when this highlighting should occur and what it applies to.

Why would you want to apply highlights?

Highlighting opens up the opportunity for new experiences for your users. The two main reasons why you may want to highlight are:

- To show user interactions: We use this in the click handler sandbox to make it obvious what the user has clicked on. You could also use this to show documents affected by a query for a control panel.
- To attract the user's attention: If there's a part of the chart you want your users to focus on, such as the profit for the current quarter or the table rows of unfilled orders.

Getting started

With the release of the Embedding SDK, we've added the setHighlight method to the chart object, which uses MQL queries to decide what gets highlighted. This lets you attract attention to marks in a bar chart, lines in a line chart, or rows in a table. Most of our chart types are already supported, and more will be supported as time goes on. If you want to dive into the deep end, we've added a new highlighting example and updated the click event examples to use the new highlighting API:

- Highlighting sandbox
- Click events sandbox
- Click events with filtering sandbox

The anatomy of a click

In MongoDB Charts, each click produces a wealth of information that you can then use in your applications, as seen below. In particular, we generate an MQL expression that you can use, called selectionFilter, which represents the selected mark. Note that this filter uses the field names in your documents, not the channel names. Before, you could use this to filter your charts with setFilter, but now you can use the same filter to apply emphasis to your charts. All this requires is calling setHighlight on your chart with the selectionFilter query that you get from the click event, as seen in this sandbox.
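As a minimal sketch (assuming a chart object already created and rendered with the Embedding SDK, and that the click payload exposes selectionFilter as described above), wiring a click straight into a highlight might look like this:

chart.addEventListener("click", (payload) => {
  // The SDK generates an MQL filter for the clicked mark; passing it to
  // setHighlight emphasizes that mark on the chart.
  chart.setHighlight(payload.selectionFilter);
});

The same selectionFilter could also be passed to setFilter if you want to filter rather than highlight, as noted above.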

Applying more complex highlights

Since we accept a subset of the MQL language for highlighting, it's possible to specify highlights which target multiple marks, as well as multiple conditions. We can use expressions like $lt and $gte to define ranges which we want to highlight. And since we support the logical operators as well, you can even use $and / $or. All the Comparison, Logical, and Element query operators are supported, so give it a spin!

Conclusion

This ability to highlight data will make your charts more interactive and help you better emphasize the most important information in your charts. Check out the embedding SDK to start highlighting today! New to Charts? You can start now for free by signing up for MongoDB Atlas, deploying a free tier cluster, and activating Charts. Have an idea on how we can make MongoDB Charts better? Feel free to leave an idea at the MongoDB Feedback Engine.

September 2, 2021
Developer

MongoDB Atlas as a Data Source for Amazon Managed Grafana

Amazon Managed Grafana is a fully managed service based on open source Grafana. Amazon Managed Grafana makes it easy to visualize and analyze operational data at scale. With Amazon Managed Grafana, organizations can analyze data stored in MongoDB Atlas without having to provision servers, configure or update software, or do the heavy lifting involved in securing and scaling Grafana in production.

Connecting MongoDB Atlas to AMG

The MongoDB Grafana plug-in makes it easy to query MongoDB with Amazon Managed Grafana. Simply select MongoDB as a data source, then connect to the MongoDB cluster using an Atlas connection string and proper authentication credentials (see Figure 1).

Figure 1. Set up: MongoDB Grafana plug-in

Now MongoDB is configured as a data source. To visualize the data through Amazon Managed Grafana, select the Explore tab in the side panel and ensure that MongoDB is selected as the data source. Users can then write their first query in the query editor (see Figure 2):

sample_mflix.movies.aggregate([
  {"$match": { "year": {"$gt" : 2000} }},
  {"$group": { "_id": "$year", "count": { "$sum": 1 }}},
  {"$project": { "_id": 0, "count": 1, "time": { "$dateFromParts": {"year": "$_id", "month": 2}}}}
]).sort({"time": 1})

Figure 2. AMG query editor

Grafana will graph the query, illustrating how certain fields change over time. For more granular detail, users can review the data view below the visualization (see Figure 3).

Figure 3. AMG data view

Using MongoDB as a data source in Amazon Managed Grafana allows users to easily analyze MongoDB data alongside other data sources, affording a singular point of reference for all of the most important data in an application. There's no hassle; once connected to MongoDB from Amazon Managed Grafana, it simply works. Try out MongoDB Atlas with Amazon Managed Grafana today.

September 1, 2021
Developer

Simplifying Data Migrations From Legacy SQL to MongoDB Atlas with Studio 3T and Hackolade

Migrating data from SQL relational databases to MongoDB Atlas may initially seem to be a straightforward process. Export the data from the relational database, import the tables into MongoDB Atlas, and then start writing queries for it. But the deeper you look, the more it can start to feel like an overwhelming task. Decades of irrelevant indexes, rare relationships, and forgotten fields that need to be migrated all make for a more complicated process. Not to mention, converting your old schemas to work well with the document-based world of MongoDB can take even longer. Making this process easier and more achievable was one of Studio 3T's main goals when they created their SQL to MongoDB migration tools. In fact, Studio 3T's SQL migration doesn't just make this process easier; it also makes it reliably repeatable. This means you can tune your migration to produce the perfect documents in an ideal schema. There's no big bang cut over; you can proceed at your own pace. Delivering the ability to perfect your new schema is also why we've integrated with Hackolade's schema design and data modeling tools. In this article, we look at how the data modernization process works, with a focus on the SQL migration powers of Studio 3T and the data modelling technology of Hackolade.

"So the problem was, how do I migrate the data that's been collected over the last 10 years from MySQL over to MongoDB? And that's when I found Studio 3T. I could not imagine trying to do it myself by hand... If it hadn't been for Studio 3T, I probably would have just left it all in the SQL database."
Rand Nix, IT Director, Wakefield Inspection Services

Why MongoDB Atlas

Building on MongoDB's intuitive data model and query API, MongoDB Atlas gives engineering organizations the versatility they need to build sophisticated applications that can adapt to changing customer demands and market trends. Not only is it the only multi-cloud document database available, it also delivers the most advanced security and data distribution capabilities of any fully managed service. Developers can get started in minutes and leverage intelligent automation to maintain performance at scale as applications evolve over time.

Mapping the move

The crucial component of a SQL migration is mapping the SQL tables and columns to JSON documents and their name/value fields. While the rigid grid of relational tables has a familiar feeling, a document in MongoDB can be any shape and, ideally, it should be the shape that makes sense for your business logic. Doing this by hand can prove extremely time-consuming for developers. Which situations call for reshaping your data into documents, and which should you construct new documents for? These are important questions which require time and expertise to answer. Or, you can simply use Studio 3T's SQL Migration tool, which is a powerful, configurable import process. It allows you to create your MongoDB documents with data retrieved from multiple relational tables. Using foreign keys, it can capture the relationships between records in document-centric form.

"When we started out, we built it on SQL Server. All was good to start, but by the time we got up to fifty customers and more, each with some thousands of their staff logging into the system concurrently, we ran into scaling issues. Now we're using a hosted, paid-for subscription on MongoDB Atlas. They do all the management and the sharding and all that stuff. And when we switch to regional provisioning, they can handle all that too, which is a huge relief. We all use Studio 3T constantly, it's used as extensively as Visual Studio. We use it for all sorts of things because Studio 3T makes it really easy to navigate around the data."
Dan Cummings, Development Mgr. Terryberry Inc.

For example, a relational database may have a table of customers and a table of orders by each customer. With the SQL Migration tool, you can automatically create a collection of customer documents containing an array of all the orders that customer has placed. This exploits the power of the document model by keeping related data local. You can also preview the resulting JSON documents before running a full import, so you are sure you're getting what you want.

Figure 1. Studio 3T's SQL to MongoDB Migration

In Figure 1 (see above), we can see Studio 3T's SQL Migration pulling in multiple SQL tables, embedding rental records into the customer records, and previewing the JSON output. For more radical restructuring, SQL queries can also provide the source of the data for import. This allows for restructuring of the data by, for example, region or category.

Hackolade integrated

As you can see, Studio 3T is fully capable of some complex migrations through its tooling: SQL Migration, Import, and Reschema. However, if you are of the mindset that you should design your migration first, then this is where Hackolade comes in. Hackolade is a "polyglot data" modelling application designed for a world where models are implemented in everything from graphs to APIs to storage formats.

Figure 2. Entity Relationships Mapped in Hackolade

Using Hackolade, you can import the schema with relationships from an SQL database and then visually reassemble the SQL fields to compose your MongoDB document schema. Watch this demo video for more details. Once you've built your new models, you can use Studio 3T to put them into practice by importing a Hackolade schema mapping into a SQL Migration configuration.

Figure 3. Importing Hackolade Models into Studio 3T

This helps eliminate the need to repeatedly query and import from the SQL database as you fine-tune your migration. Instead, you can construct and document the migration within Hackolade, review it with your team, and then implement it with confidence. Once the data is in MongoDB Atlas, support from Studio 3T continues. Studio 3T's Reschema tools allow you to restructure your schema without re-importing, working entirely on the MongoDB Atlas data. This can be particularly useful when blending data from existing document databases with freshly imported, formerly relational data. The idea that migrations have to be team-breaking chores no longer holds. With tools such as Studio 3T's SQL Migration and Hackolade, it is eminently possible not only to perform complex migrations easily, but to design them up front and put them into practice as often as needed. With both tools working in integrated harmony, the future of migration has arrived.

August 24, 2021
Developer

Data Movement from Oracle to MongoDB Made Easy with Apache Kafka

Change Data Capture (CDC) features have existed for many years in the database world. CDC makes it possible to listen to changes in the database, such as inserting, updating, and deleting data, and to have these events sent to other database systems in scenarios like ETL, replication, and database migration. By leveraging Apache Kafka, the Confluent Oracle CDC Connector, and the MongoDB Connector for Apache Kafka, you can easily stream database changes from Oracle to MongoDB. In this post we will pass data from Oracle to MongoDB, providing a step-by-step configuration for you to easily re-use, tweak, and explore the functionality. At a high level, we will configure the architecture referenced in the image above in a self-contained Docker Compose environment that consists of the following:

- Oracle Database
- MongoDB
- Apache Kafka
- Confluent KSQL

These containers all run within a local bridged network, so you can play around with them from your local Mac or PC. Check out the GitHub repository to download the complete example.

Preparing the Oracle Docker image

If you have an existing Oracle database, remove the "database" section from the docker-compose file. If you do not already have an Oracle database, you can pull the Oracle Database Enterprise Edition from Docker Hub. You will need to accept the Oracle terms and conditions, log in to your Docker account via docker login, and then run docker pull store/oracle/database-enterprise:12.2.0.1-slim to download the image locally.

Launching the docker environment

The docker-compose file will launch the following:

- Apache Kafka, including Zookeeper, REST API, Schema Registry, and KSQL
- Apache Kafka Connect
- MongoDB Connector for Apache Kafka
- Confluent Oracle CDC Connector
- Oracle Database Enterprise

The complete sample code is available from a GitHub repository. To launch the environment, make sure you have your Oracle environment ready, then git clone the repo and build the following:

docker-compose up -d --build

Once the compose file finishes, you will need to configure your Oracle environment to be used by the Confluent CDC Connector.

Step 1: Connect to your Oracle instance

If you are running Oracle within the docker environment, you can use docker exec as follows:

docker exec -it oracle bash -c "source /home/oracle/.bashrc; sqlplus /nolog"
connect / as sysdba

Step 2: Configure Oracle for the CDC Connector

First, check if the database is in archive log mode:

select log_mode from v$database;

If the mode is not "ARCHIVELOG", perform the following:

SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

Verify the archive mode:

select log_mode from v$database;

The LOG_MODE should now be "ARCHIVELOG".
Next, enable supplemental logging for all columns:

ALTER SESSION SET CONTAINER=cdb$root;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

The following should be run on the Oracle CDB:

CREATE ROLE C##CDC_PRIVS;
GRANT CREATE SESSION, EXECUTE_CATALOG_ROLE, SELECT ANY TRANSACTION, SELECT ANY DICTIONARY TO C##CDC_PRIVS;
GRANT SELECT ON SYSTEM.LOGMNR_COL$ TO C##CDC_PRIVS;
GRANT SELECT ON SYSTEM.LOGMNR_OBJ$ TO C##CDC_PRIVS;
GRANT SELECT ON SYSTEM.LOGMNR_USER$ TO C##CDC_PRIVS;
GRANT SELECT ON SYSTEM.LOGMNR_UID$ TO C##CDC_PRIVS;
CREATE USER C##myuser IDENTIFIED BY password CONTAINER=ALL;
GRANT C##CDC_PRIVS TO C##myuser CONTAINER=ALL;
ALTER USER C##myuser QUOTA UNLIMITED ON sysaux;
ALTER USER C##myuser SET CONTAINER_DATA = (CDB$ROOT, ORCLPDB1) CONTAINER=CURRENT;
ALTER SESSION SET CONTAINER=CDB$ROOT;
GRANT CREATE SESSION, ALTER SESSION, SET CONTAINER, LOGMINING, EXECUTE_CATALOG_ROLE TO C##myuser CONTAINER=ALL;
GRANT SELECT ON GV_$DATABASE TO C##myuser CONTAINER=ALL;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##myuser CONTAINER=ALL;
GRANT SELECT ON GV_$ARCHIVED_LOG TO C##myuser CONTAINER=ALL;
GRANT CONNECT TO C##myuser CONTAINER=ALL;
GRANT CREATE TABLE TO C##myuser CONTAINER=ALL;
GRANT CREATE SEQUENCE TO C##myuser CONTAINER=ALL;
GRANT CREATE TRIGGER TO C##myuser CONTAINER=ALL;
ALTER SESSION SET CONTAINER=cdb$root;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
GRANT FLASHBACK ANY TABLE TO C##myuser;
GRANT FLASHBACK ANY TABLE TO C##myuser container=all;

Next, create some objects:

CREATE TABLE C##MYUSER.emp
(
  i INTEGER GENERATED BY DEFAULT AS IDENTITY,
  name VARCHAR2(100),
  lastname VARCHAR2(100),
  PRIMARY KEY (i)
) tablespace sysaux;

insert into C##MYUSER.emp (name, lastname) values ('Bob', 'Perez');
insert into C##MYUSER.emp (name, lastname) values ('Jane','Revuelta');
insert into C##MYUSER.emp (name, lastname) values ('Mary','Kristmas');
insert into C##MYUSER.emp (name, lastname) values ('Alice','Cambio');
commit;

Step 3: Create the Kafka topic

Open a new terminal/shell and connect to your Kafka server as follows:

docker exec -it broker /bin/bash

When connected, create the Kafka topic:

kafka-topics --create --topic SimpleOracleCDC-ORCLCDB-redo-log \
  --bootstrap-server broker:9092 --replication-factor 1 \
  --partitions 1 --config cleanup.policy=delete \
  --config retention.ms=120960000

Step 4: Configure the Oracle CDC Connector

The oracle-cdc-source.json file in the repository contains the configuration of the Confluent Oracle CDC connector. To configure it, simply execute:

curl -X POST -H "Content-Type: application/json" -d @oracle-cdc-source.json http://localhost:8083/connectors

Step 5: Set up KSQL data flows within Kafka

As Oracle CRUD events arrive in the Kafka topic, we will use KSQL to stream these events into a new topic for consumption by the MongoDB Connector for Apache Kafka.
docker exec -it ksql-server bin/bash
ksql http://127.0.0.1:8088

Enter the following commands:

CREATE STREAM CDCORACLE (I DECIMAL(20,0), NAME varchar, LASTNAME varchar, op_type VARCHAR) WITH (
  kafka_topic='ORCLCDB-EMP',
  PARTITIONS=1,
  REPLICAS=1,
  value_format='AVRO');

CREATE STREAM WRITEOP AS
  SELECT CAST(I AS BIGINT) as "_id", NAME, LASTNAME, OP_TYPE
  FROM CDCORACLE
  WHERE OP_TYPE!='D'
  EMIT CHANGES;

CREATE STREAM DELETEOP AS
  SELECT CAST(I AS BIGINT) as "_id", NAME, LASTNAME, OP_TYPE
  FROM CDCORACLE
  WHERE OP_TYPE='D'
  EMIT CHANGES;

To verify the streams were created:

SHOW STREAMS;

This command will show the following:

Stream Name | Kafka Topic  | Format
------------------------------------
CDCORACLE   | ORCLCDB-EMP  | AVRO
DELETEOP    | DELETEOP     | AVRO
WRITEOP     | WRITEOP      | AVRO
------------------------------------

Step 6: Configure the MongoDB sink

The following is the configuration for the MongoDB Connector for Apache Kafka:

{
  "name": "Oracle",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "WRITEOP",
    "connection.uri": "mongodb://mongo1",
    "writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.UpdateOneBusinessKeyTimestampStrategy",
    "database": "kafka",
    "collection": "oracle",
    "document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
    "document.id.strategy.overwrite.existing": "true",
    "document.id.strategy.partial.value.projection.type": "allowlist",
    "document.id.strategy.partial.value.projection.list": "_id",
    "errors.log.include.messages": true,
    "errors.deadletterqueue.context.headers.enable": true,
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter"
  }
}

In this example, the sink process consumes records from the WRITEOP topic and saves the data to MongoDB. The write model, UpdateOneBusinessKeyTimestampStrategy, performs an upsert operation using the filter defined on the PartialValueStrategy property, which in this example is the "_id" field. For your convenience, this configuration is provided in the mongodb-sink.json file in the repository. To configure it, execute:

curl -X POST -H "Content-Type: application/json" -d @mongodb-sink.json http://localhost:8083/connectors

Delete events are written to the DELETEOP topic and are sunk to MongoDB with the following sink configuration:

{
  "name": "Oracle-Delete",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "DELETEOP",
    "connection.uri": "mongodb://mongo1",
    "writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.DeleteOneBusinessKeyStrategy",
    "database": "kafka",
    "collection": "oracle",
    "document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
    "document.id.strategy.overwrite.existing": "true",
    "document.id.strategy.partial.value.projection.type": "allowlist",
    "document.id.strategy.partial.value.projection.list": "_id",
    "errors.log.include.messages": true,
    "errors.deadletterqueue.context.headers.enable": true,
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081"
  }
}

curl -X POST -H "Content-Type: application/json" -d @mongodb-sink-delete.json http://localhost:8083/connectors

This sink process uses the DeleteOneBusinessKeyStrategy write model strategy. In this configuration, the sink reads from the DELETEOP topic and deletes documents in MongoDB based upon the filter defined on the PartialValueStrategy property; in this example, that filter is the "_id" field.

Step 7: Write data to Oracle

Now that your environment is set up and configured, return to the Oracle database and insert the following data:

insert into C##MYUSER.emp (name, lastname) values ('Juan','Soto');
insert into C##MYUSER.emp (name, lastname) values ('Robert','Walters');
insert into C##MYUSER.emp (name, lastname) values ('Ruben','Trigo');
commit;

Next, see the data as it arrives in MongoDB by accessing the MongoDB shell:

docker exec -it mongo1 /bin/mongo

The inserted data will now be available in MongoDB.
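For a quick check from that shell, you can query the database and collection named in the sink configuration above (kafka and oracle); this is just one way to inspect the sunk documents:

db.getSiblingDB("kafka").oracle.find().pretty()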
If we update the data in Oracle, e.g.:

UPDATE C##MYUSER.emp SET name='Rob' WHERE name='Robert';
COMMIT;

the document will be updated in MongoDB as:

{
  "_id" : NumberLong(11),
  "LASTNAME" : "Walters",
  "NAME" : "Rob",
  "OP_TYPE" : "U",
  "_insertedTS" : ISODate("2021-07-27T10:25:08.867Z"),
  "_modifiedTS" : ISODate("2021-07-27T10:25:08.867Z")
}

If we delete the data in Oracle, e.g.:

DELETE FROM C##MYUSER.emp WHERE name='Rob';
COMMIT;

the documents with name='Rob' will no longer be in MongoDB. Note that it may take a few seconds for changes to propagate from Oracle to MongoDB.

Many possibilities

In this post we performed a basic setup of moving data from Oracle to MongoDB via Apache Kafka, the Confluent Oracle CDC Connector, and the MongoDB Connector for Apache Kafka. While this example is fairly simple, you can add more complex transformations using KSQL and integrate other data sources within your Kafka environment, making a production-ready ETL or streaming environment with best-of-breed solutions.

Resources

- How to Get Started with MongoDB Atlas and Confluent Cloud
- Announcing the MongoDB Atlas Sink and Source Connectors in Confluent Cloud
- Making your Life Easier with MongoDB and Kafka
- Streaming Time-Series Data Using Apache Kafka and MongoDB

August 17, 2021
Developer

The Top 5 Data Trends Driving Competitive Advantage Today… — and Tomorrow

The latest market research from Cloudflight, a leading analyst firm based in Europe, identified 12 major technology trends for the current year. The research found a radical shift in cloud adoption and an acceleration toward digital as people, society, the economy, and the environment all responded to the coronavirus pandemic. During a recent webinar, Dr. Stefan Ried (Cloudflight) and Mat Keep (MongoDB) shared key industry insights and explored five of the most prevalent trends in detail. The session found that, as the need for technological innovation grows, a company's competitive advantage is increasingly tied to how well it can build software around its most important asset: data. In this post, Dr. Stefan Ried breaks down those five key trends and analyzes how businesses can drive data innovation to stay ahead of the field. Mat Keep then offers practical next steps to get started as data is increasingly managed in the cloud.

Trend 1: Data becomes the differentiator — even beyond software

Initially, many startups disrupted the incumbents in their industries with innovation based on software. All the while, non-digital-native enterprises caught up. Now data has become more important than software algorithms. Here's an example: imagine a traditional automotive company. The business could purchase components and software from a supplier to implement autonomous driving in its cars, but without enough learning data from every region, its cars wouldn't drive reliably. In this case — and many more — the automotive firm cannot just buy a software competitive advantage off the shelf. Instead, it must build that advantage — and build it using data. That is why data is quickly becoming the differentiator in all industries, and why delivering a modern customer experience is increasingly reliant on this underlying infrastructure.

Software Stack Eruption (Source: Cloudflight 2020)

The above image illustrates just how the tech stack is evolving. Data quality is quickly becoming the outstanding differentiator compared to software algorithms. That's why we consider the access, ownership, and quality of data to be the mountain of innovation in this decade and moving forward.

Trend 2: Europe embraces various cloud scenarios

Cloud adoption in Europe has always been behind that of the United States. One obvious reason is data sovereignty and compliance concerns. It would be an intriguing thought experiment to reflect on how U.S. public cloud adoption would have developed over the past 10 years if the only strong and innovative providers were European or even Chinese companies. Europe, however, is now at an important inflection point. Global hyperscalers finally addressed these national privacy issues. Platform service providers, including MongoDB with MongoDB Atlas, have significantly increased support for these privacy requirements with technical features such as client-side encryption and operational SLAs. This achievement enables enterprises and even public government agencies across Europe to embrace all three basic types of cloud scenarios:

- Lift and shift, moving existing legacy workloads without any change to new IaaS landscapes in the cloud.
- Modernization, decomposing existing application stacks into cloud-native services such as a DBaaS. Modernized workloads can leverage public cloud PaaS stacks much better than monolithic legacy stacks.
- New development of cloud-native applications, building modern applications with less code and more orchestration of many PaaS services.
Trend 3: Hybrid cloud is the dominant cloud choice, and multicloud will come next

Nearly 50 percent of participants in our recent webinar said hybrid cloud is their current major deployment model. These organizations use different public and private clouds for different workloads. Just 20 percent of the attendees still restrict activities to a single cloud provider. Another equally sized group claimed the exact opposite approach: multicloud environments, where a single workload may use a mixture of cloud sources or may be developed on different providers to reach multiple regions. See below.

Embracing the Cloud webinar poll results (June 2021)

The increasing adoption of these real multicloud scenarios is yet another major trend we will see for many years. Less experienced customers may be afraid of the complexity of using multiple cloud providers, but independent vendors offer the management of a full-service domain across multiple providers. MongoDB Atlas offers this platform across AWS, Azure, and GCP, and paves the road for real multicloud adoption and innovation.

Trend 4: Cloud-native is taking off with innovative enterprises

In many client engagements, Cloudflight sees a strong correlation between new business models driven by digital products and cloud-native architectures. Real innovation happens when differentiated business logic meets the orchestration of a PaaS offering. That's why car OEMs do not employ packaged asset-life-cycle-management systems but instead develop their own digital twins for the emerging fleet of millions of digitized vehicles. These PaaS architectures follow an API-first and service-oriented paradigm, leveraging a lot of open-source software. Most of this open-source software is commercially managed by hyperscalers and their partner vendors to make it accessible and highly available without deep knowledge of the service itself. The approach enables very fast productive operation of new digital products. If compliance requires it, however, customers may operate the same open-source services on their own again. Once your product becomes extremely successful and you're dealing with data volumes far beyond one petabyte, you may also reconsider self-operation for cost reasons. This is possible because there is no operational lock-in to a specific service provider, and you may become an "operations pro" on your own.

Trend 5: Digital twins become cloud drivers in many industries

Many people still connect the term "cloud computing" to virtualized compute-and-storage services. Yet cloud computing is far more. The PaaS level became increasingly attractive with prepackaged cloud-native services. It has been on the market for many years, but its perception and adoption — especially in Europe — is still behind its potential. Based on today's PaaS services, cloud providers and their partners are already extending their offers to higher levels. The space of digital twins, along with AI, is a clear opportunity here. There are offerings for each of the three major areas of digital twins:

- In modern automated manufacturing (Industry 4.0), production twins are created when a product is ordered, and they make production-relevant information (such as individual configurations) available to all manufacturing steps along the supply chain.
- Once the final product is delivered, the requirements for interactions and data models change significantly for these post-production-life-cycle twins.
Production, post-production, and simulation twins (Source: Cloudflight)

Finally, simulation twins are a smart approach to testing machine learning applications. Take, for example, the autonomous driving challenge: instead of testing the ongoing iterations of driving "knowledge" on a physical vehicle, running virtual simulation twins is much preferred and safer than experiments in real traffic situations. Beyond manufacturing and automotive, there are many verticals in which digital twins make sense. Health care is a clear and obvious example in which real-life experiments may not always be the best approach. Success here depends mostly on the cooperation between technology vendors and the industry-specific digital twin ecosystems.

In Summary

Each of the five trends discussed centers on or closely relates to cloud-native data management. A traditional database may be able to run for specific purposes on cloud infrastructure, but only a modern cloud-native application data platform is able to serve both the migration of legacy applications and the development of multiple new cloud-native applications.

Next Steps

Where and how can companies get started on a path to using data as a driver of competitive advantage? Mat Keep, Senior Director of Products at MongoDB, takes us through how best to embrace this journey. As companies move to embrace the cloud, they face an important choice. Do they:

- Lift and shift: move existing applications to run in the cloud on the same architecture and technologies used on premises.
- Transform (modernize): rearchitect applications to take advantage of new cloud-native capabilities such as elasticity, redundancy, global distribution, and managed services.

Lift and shift is often seen as an easier and more predictable path, since it reuses a lot of the technology you use on premises — albeit now running in the cloud — presenting both the lowest business risk and the least internal cultural and organizational resistance. It can be the right path in some circumstances, but we need to define what those circumstances are. For your most critical applications, lift and shift rarely helps you move the business forward. You will be unable to fully exploit new cloud-native capabilities that enable your business to build, test, and adapt faster. The reality we all face is that every application is different, so there is no simple or single "right" answer to choosing lift and shift versus transformation. In some cases, lift and shift can be the right first step, helping your teams gain familiarity with operating in the cloud before embarking on a fuller transformation as they see everything the cloud has to offer. This can also be a risk, however, if your teams believe they are done with the cloud journey and don't then progress beyond that first step. To help business and technology leaders make the right decisions as they embrace the cloud, we have created an Executive Perspective for Lift and Shift Versus Transformation. The perspective presents best practices that can help prioritize your efforts and mobilize your teams. Drawing on work with more than 25,000 customers, including more than 50 percent of the Fortune 100, the paper shares the evaluation frameworks we have built that can be used to navigate the right path for your business, along with the cultural transformations your teams need to make along the way.

Embracing the Cloud: Assessment Framework

Toyota Material Handling in Northern Europe has recently undergone its own cloud journey.
As the team evolved its offerings for industry 4.0, it worked with MongoDB as part of its transformation. Moving from monolithic applications and aging relational databases running on premises to microservices deployed on a multicloud platform, the company completed its migration in just four months. It reduced costs by more than 60 percent while delivering an agile, resilient platform to power its smart factory business growth. To learn more about cloud trends and the role of data in your cloud journey, tune in to the on-demand webinar replay .

August 17, 2021
Developer

MongoDB & Bosch: Discussion on AIoT

For more than a decade, the digital transformation of industry has been focused on the technologies that make up the Internet of Things (IoT). As AI and machine learning technologies mature, a new field has appeared that combines these trends: AIoT, the Artificial Intelligence of Things, which applies AI to the data collected by IoT devices. Among the firms pioneering this space is the engineering and industrial giant Bosch, which has long been a leader in IoT. The move to AIoT has allowed Bosch to create smart products that either have intelligence built in, or have "swarm intelligence" in their back end that allows for the collection of data that is used to improve the products.

In April 2021, Mark Porter, CTO at MongoDB, and Dirk Slama, VP of Co-innovation and IT/IoT Alliances at Bosch, sat down to discuss AIoT. Their conversation touched on what MongoDB and Bosch are working on around AIoT, and where they see the future of AIoT heading. Bosch's new focus on AIoT has enhanced its need for a flexible, modern data platform such as MongoDB. IoT devices collect enormous amounts of data; as Bosch adds sensors and new types of data to its products, MongoDB has allowed it to adapt quickly without having to go through a schema redesign whenever it needs to implement a change to its products.

As part of their efforts to progress AIoT technology, Bosch and other companies recently created the AIoT User Group, an initiative open to anyone. The group's goal is to bring end users working on AIoT business and use cases together with technology experts to share best practices around AIoT solutions. This co-creation approach allows for the rapid use of best practices to try out and develop new ideas and technologies.

Porter and Slama's conversation covered many AIoT topics — and a glimpse at the technology's next steps. For instance, Slama wants to see agility added to AIoT without losing control. In AIoT, there are many features that must be perfect on day one, but there are also many features where you want to continuously improve system performance, which requires an agile approach. For Mark Porter and Dirk Slama's full conversation, check out the video below!

August 11, 2021
Developer
