Paresh Saraf


Build Analytics-Driven Apps with MongoDB Atlas and the Microsoft Intelligent Data Platform

Customers increasingly expect engaging applications informed by real-time operational analytics, yet meeting these expectations can be difficult. MongoDB Atlas is a popular operational data platform that makes it straightforward to manage critical business data at scale. For some applications, however, enterprises may also want to apply insights gleaned from data warehouse, business intelligence (BI), and related solutions, and many enterprises depend on the Microsoft Intelligent Data Platform to apply analytics and governance solutions to operational data stores. MongoDB and Microsoft have partnered to make it simple to use the Microsoft Intelligent Data Platform to glean and apply comprehensive analytical insights to data stored in MongoDB. This article details how enterprises can successfully use MongoDB with the Microsoft Intelligent Data Platform to build more engaging, analytics-driven applications.

Microsoft Intelligent Data Platform + MongoDB

MongoDB Atlas provides a unified interface for developers to build distributed, serverless, and mobile applications with support for diverse workload types, including operational, real-time analytics, and search. With the ability to model graph, geospatial, tabular, document, time series, and other forms of data, developers don't have to adopt multiple niche databases, which would otherwise result in highly complex, polyglot architectures. The Microsoft Intelligent Data Platform offers a single platform for databases, analytics, and data governance by integrating Microsoft's database, analytics, and data governance products. In addition to all Azure database services, the Microsoft Intelligent Data Platform includes Azure Synapse Analytics for data warehousing and analytics, Power BI for BI reporting, and Microsoft Purview for enterprise data governance requirements.

Although customers have always been able to apply the Microsoft Intelligent Data Platform services to MongoDB data, doing so hasn't always been as simple as it could be. Through this new integration, customers gain a seamless way to run analytics and data warehousing operations on the operational data they store in MongoDB Atlas. Customers can also more easily use Microsoft Purview to manage and run data governance policies against their most critical MongoDB data, thereby ensuring compliance and security. Finally, through Power BI, customers can easily query and extract insights from MongoDB data using powerful built-in and custom visualizations. Let's take a deeper look at each of these integrations.

Operationalize insights with MongoDB Atlas and Azure Synapse Analytics

MongoDB Atlas is an operational data platform that can handle multiple workload types, including transactional, search, and operational analytics, and can serve multiple application types, including distributed, serverless, and mobile. For data warehousing workloads, long-running analytics, and AI/ML, it complements Azure Synapse Analytics well. MongoDB Atlas can be easily integrated as a source or a sink resource in Azure Synapse Analytics. This connector is useful to:

Fetch all the MongoDB Atlas historical data into Synapse
Retrieve incremental data for a period, based on filter criteria, in batch mode to run SQL-based or Spark-based analytics

The sink connector allows you to store the analytics results back in MongoDB, which can then power applications built on top of it.
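As a rough illustration of the Spark-based analytics path described above, the hedged sketch below reads Atlas data in a Synapse Spark pool, aggregates it, and writes the results back to Atlas. It assumes the MongoDB Spark Connector is installed on the Spark pool; the connection string, database, collection, and field names are placeholders rather than anything prescribed by the integration.

```python
# Minimal sketch: read MongoDB Atlas data in a Synapse Spark pool, run a
# batch aggregation, and write the results back to Atlas as a sink.
# Assumes the MongoDB Spark Connector is available on the pool; URI,
# database, collection, and field names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

ATLAS_URI = "mongodb+srv://<user>:<password>@cluster0.example.mongodb.net"

spark = SparkSession.builder.appName("atlas-synapse-batch").getOrCreate()

# Source: fetch operational data, optionally filtered to an incremental window
orders = (spark.read.format("mongodb")
          .option("connection.uri", ATLAS_URI)
          .option("database", "sales")
          .option("collection", "orders")
          .load()
          .filter(F.col("orderDate") >= "2023-01-01"))

# Spark-based analytics on the operational data
daily_revenue = (orders.groupBy(F.to_date("orderDate").alias("day"))
                 .agg(F.sum("amount").alias("revenue")))

# Sink: store the analytics results back in Atlas to power the application
(daily_revenue.write.format("mongodb")
 .option("connection.uri", ATLAS_URI)
 .option("database", "sales")
 .option("collection", "daily_revenue")
 .mode("append")
 .save())
```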
Many enterprises require real-time analytics, for example in fraud detection, anomaly detection for IoT devices, predicting stock depletion, and machinery maintenance, where a delay in getting insights could cause serious repercussions. MongoDB and Microsoft have worked together on a best-practice architecture for real-time analytics, which can be found in this article.

Figure 1: Schematic showing integration of MongoDB with Azure Synapse Analytics.

Business intelligence reporting and visualization with Power BI

Together, MongoDB Atlas and Microsoft Power BI offer a sophisticated real-time data platform, providing customers with the ability to present specialized operational and analytical query engines on the same data sets. Information on connecting from Power BI Desktop to MongoDB is available in the official documentation. MongoDB is also excited to announce the forthcoming MongoDB Atlas Power BI Connector, which will expose the richness of JSON document data in Power BI (see Figure 2). This connector allows users to unlock access to their Atlas cloud data.

Figure 2: Schematic showing integration of MongoDB and Microsoft Power BI.

Beyond providing mere access to MongoDB Atlas data, this connector will provide a SQL interface to let you interact with semi-structured JSON data in a relational way, thereby ensuring you can take full advantage of Power BI's rich business intelligence capabilities. Importantly, the connector is planned to support two connectivity modes: import and direct. The new MongoDB Atlas Power BI Connector will be available in the first half of 2023.

Conclusion

Together with the Microsoft Intelligent Data Platform offerings, MongoDB Atlas can help operationalize the insights derived from customers' data spread across siloed legacy databases and help build modern applications with ease. With MongoDB Atlas on Microsoft Azure, developers get access to the most comprehensive, secure, scalable, cloud-based developer data platform in the market. Now, with the availability of Atlas on the Azure Marketplace, it's never been easier for users to start building with Atlas while streamlining procurement and billing processes. Get started today through the MongoDB Atlas on Azure Marketplace listing.

January 10, 2023

Migrate to MongoDB Atlas on AWS with Relational Migrator

Competitive advantage is directly tied to how well companies are able to build software around their most important asset: data. Rigid relational schemas often require downtime and significant application code updates in order to make even simple modifications, such as adding a new data attribute. In MongoDB, entities are modeled as documents that map naturally to the same objects that developers are used to working with in their programming languages. Additionally, legacy relational databases were not built to scale horizontally. Sharding data to handle large data volumes and ensure lower latency is typically a significant manual process that requires custom application logic to query across multiple shards and aggregate results. MongoDB Atlas is an effective solution for solving such problems, and MongoDB Relational Migrator streamlines the process of moving to MongoDB from a relational database.

MongoDB Atlas on Amazon Web Services (AWS)

MongoDB Atlas removes the need for a complex object-relational mapping layer and allows developers to build and release new features more quickly. MongoDB Atlas is built to be distributed and to handle sharding transparently to the developer. With Atlas, no application code changes are necessary when an application needs to scale out from a 10MB to a 500TB dataset. MongoDB Atlas is well integrated into the AWS environment, and the document-based database works seamlessly with AWS products. To learn more about common integration and project requirements, refer to Managed MongoDB on AWS.

Migrate to Atlas on AWS with Relational Migrator

Some customers have successfully migrated their relational workloads to MongoDB Atlas on AWS. One example is Cox Automotive, whose system was hitting limitations on the relational database. The company migrated to Atlas on AWS and leveraged capabilities like Atlas Data Lake (powered by Amazon Simple Storage Service (Amazon S3)) and Atlas App Services. Read our customer case study to learn more about how Cox Automotive uses MongoDB Atlas.

At the same time, other companies have struggled with how to approach this challenge. When considering such a migration, it's important to think carefully about data modeling. Although it's possible to naively move a relational schema into MongoDB without any changes, this approach won't deliver many of MongoDB's benefits. A better approach is to design a new MongoDB schema that's more denormalized, and potentially to take the opportunity to revise the architecture of the application as well. To make this process easier, we're developing MongoDB Relational Migrator. Relational Migrator streamlines the process of moving to MongoDB from a relational database and is compatible with Oracle, Microsoft SQL Server, MySQL, and PostgreSQL. MongoDB Relational Migrator connects to a relational database to analyze its existing schema, then helps architects design and map to a new MongoDB schema.

Migration support

When you're ready, Relational Migrator will perform the data migration from the source RDBMS to MongoDB. Migration can be a one-shot effort if you're prepared for a hard cutover. And soon, we will also support continuous sync in case you need to leave the source system running and continue pushing changes into MongoDB. With Relational Migrator, you can map your relational schema—or just a piece of it, if needed—to a new MongoDB schema. Relational Migrator helps the design and mapping process with common MongoDB schema design patterns built in.
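To make the schema-mapping idea concrete, here is a small, hypothetical sketch contrasting a naive table-per-collection copy with the denormalized document a tool like Relational Migrator can help you design. The field names are invented for illustration and are not Relational Migrator output.

```python
# Sketch of the schema-mapping decision, with hypothetical fields: a naive
# one-table-per-collection copy versus a denormalized target document.
naive_customer = {"_id": 1001, "name": "Ada Lovelace"}
naive_order = {"_id": 5001, "customer_id": 1001, "total": 129.99}  # separate collection

# Denormalized target schema: related orders embedded in the customer document,
# so the application reads one document instead of joining two collections.
denormalized_customer = {
    "_id": 1001,
    "name": "Ada Lovelace",
    "orders": [
        {"orderId": 5001, "date": "2022-10-01", "total": 129.99},
        {"orderId": 5002, "date": "2022-10-15", "total": 54.50},
    ],
}
```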
Based on this schema mapping, you can move data into a target MongoDB cluster. Relational Migrator will support both snapshot (one-time) and continuous data migration. Get more details on Relational Migrator on the product page or in the deep-dive presentation from MongoDB World. Get started with MongoDB Atlas in the AWS Marketplace today.

November 3, 2022

Migrating Terabytes of IoT Data From a Legacy NoSQL Database to MongoDB Atlas With MongoDB's Custom Migration Tool

In 2020, a large European energy company began an ambitious plan to replace its traditional metering devices — all 7.6 million of them — with smart meters. That would allow the energy company to monitor gas use remotely and allow customers' bills to more accurately reflect their energy consumption. At the same time, the company began installing smart components along its production network to monitor operations in real time, manage alarms, use predictive maintenance tools, and find leaks using advanced technologies. The energy company knew this shift would result in a massive amount of data coming into its systems, and it thought it was ready. The company understood the complexities of managing and leveraging data from the Internet of Things (IoT), such as the high velocity at which data must be ingested and the need for time-based data aggregations. It rolled out an IoT platform with big data and analytics tools to help it make progress toward its objectives of high-quality, efficient, and safe service. This article examines how the company migrated its system to MongoDB Atlas in order to handle the massive influx of data.

Managing data

The energy company was managing 3 TB of data on its NoSQL database, with the remainder housed and managed on a relational database. However, it started facing challenges, including a lack of scalability, increasing costs, and poor performance. The costs to maintain the pre-production and production environments were unsustainable, and the situation wasn't going to get better: by 2023, the energy company planned to increase the number of IoT devices and sensors by a factor of five. It needed a viable solution for the long term.

Migrating to MongoDB Atlas

The energy company decided to migrate to MongoDB Atlas for several reasons. Atlas's online archive, combined with the ability to create time-series sharded collections, makes Atlas an ideal fit for IoT data, as does the flexibility of the document data model. Additionally, an API that was compatible with the existing database would minimize the impact on application code and make it easier to migrate applications. The customer chose PeerIslands to be its technical partner and help with the migration. PeerIslands, a MongoDB partner, is an enterprise-class digital transformation company with an expert, multilingual team with significant experience working across multiple technologies and cloud platforms. PeerIslands has developed solutions for both homogenous and heterogenous workload migrations. Among these solutions is a MongoDB migration tool that helps perform one-time migrations and change data capture while minimizing downtime. The tool is fully GUI-based, and tasks such as infrastructure provisioning, dump and restore, change stream listeners, and processors have all been automated. For change capture, the tool uses the native MongoDB change stream APIs.

Migration challenges

In working with the energy company to perform the migration, the PeerIslands team faced two particular challenges:

The large volume of data: initial snapshotting of the data would take about a day.
The application had significant write loads: on average, it was writing about 12,000 messages per second. However, the load was unevenly distributed, with spikes when devices would "wake up" and report their status.

These two factors quickly generated close to 20 million change events that had to be synced to MongoDB. Meanwhile, new data was constantly being written into the source.
Migration tool

PeerIslands' migration tool uses mongodump and mongorestore for one-time data migration and the MongoDB Kafka Connector for real-time data synchronization. By using Apache Kafka, the migration tool was able to handle the large amount of change stream data and successfully manage the migration. To address the complexity of the migration, PeerIslands also enhanced the migration tool with additional capabilities:

Parallelizing the Kafka change stream processing using partitions, with the Kafka partitioning strategy kept in sync with the target Atlas sharding strategy.
Using ReplaceOneBusinessKeyStrategy as the write model for the Kafka MongoDB sink connector to write into sharded Atlas collections (see the configuration sketch at the end of this article).

By using its in-house tooling, PeerIslands was able to successfully complete the migration with near-zero downtime.

Improved performance

With the migration complete, the customer has already begun to realize the benefits of MongoDB Atlas for its massive amounts of IoT data. The user interface has become extremely responsive, even for more expensive queries. Because of the improved performance of the database, the customer is now able to pursue improvements and efficiencies in other areas. With better performance, the company expects consumption of the data to rise and its schema design to evolve. The company is looking to leverage the time-series benefits of MongoDB both to simplify its schema design and to deliver richer IoT functionality. It is also better equipped to rapidly respond to and fulfill business needs, because the database is no longer a limitation. Importantly, costs have decreased for the production environment, and even more dramatic cost reductions have been seen for the pre-production environment. Learn more about the migration tool and MongoDB's time series capabilities. Interested in your own data migration? Contact us.
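For readers curious what such a sink configuration can look like, below is a hedged sketch that registers a MongoDB Kafka sink connector using ReplaceOneBusinessKeyStrategy via the Kafka Connect REST API. The topic, connection string, and business-key fields are placeholders; the exact settings used by PeerIslands are not described in the article.

```python
# Sketch: register a MongoDB Kafka sink connector that writes change events
# into a sharded Atlas collection using ReplaceOneBusinessKeyStrategy, keyed
# on business fields. Topic, URI, and field names are placeholders.
import json
import requests

connector = {
    "name": "atlas-iot-sink",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
        "topics": "iot.readings",
        "connection.uri": "mongodb+srv://<user>:<password>@cluster0.example.mongodb.net",
        "database": "iot",
        "collection": "readings",
        # Replace documents matched on a business key rather than _id.
        "writemodel.strategy":
            "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy",
        "document.id.strategy":
            "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
        "document.id.strategy.partial.value.projection.type": "AllowList",
        "document.id.strategy.partial.value.projection.list": "deviceId,ts",
    },
}

resp = requests.post("http://localhost:8083/connectors",
                     headers={"Content-Type": "application/json"},
                     data=json.dumps(connector))
resp.raise_for_status()
```

If the business-key fields also form the shard key of the target collection, each replace is routed to a single shard, which keeps the high-volume replay efficient.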

July 11, 2022

Faster Migrations to MongoDB Atlas on Google Cloud with migVisor by EPAM

As the needs of Google Cloud customers evolve and shift toward new user expectations, more and more customers are choosing MongoDB as an ideal alternative to legacy databases. Together, we've partnered with users looking to digitize and grow their businesses (such as Forbes), or to meet increased demand due to COVID (such as our work with Boxed, the online grocer), by scaling up infrastructure and data processing within a condensed time frame. As a fully managed service within the Google Cloud Marketplace, MongoDB Atlas enables our joint customers to quickly deploy applications on Google Cloud with a unified user experience and an integrated billing model.

Migrations to managed cloud database services vary in complexity, but even under the most straightforward circumstances, careful evaluation and planning are required. Customer database environments often leverage database technologies from multiple vendors, across different versions, and can run into thousands of deployments. This makes manual assessment cumbersome and error-prone. This is where EPAM Systems, a provider with strategic specialization in database and application modernization solutions, comes in. EPAM's database migration assessment tool, migVisor, is a first-of-its-kind cloud database migration assessment product that helps companies analyze database workloads, configuration, and structure to generate a visual cloud migration roadmap that identifies potential quick wins as well as challenge areas. migVisor identifies the best migration path for databases using sophisticated scoring logic to rank the complexity of migrating them to a cloud-centric technology stack. Previously applicable only to migrations from RDBMS to cloud-based RDBMS, migVisor is now available for MongoDB to MongoDB Atlas migrations. migVisor helps you:

Analyze migration decisions objectively by providing a secure assessment of source and target databases that's independent of deployed environments
Accelerate time to migration by automating the discovery and assessment process, which reduces development cycles from a few weeks to a few days
Easily understand tech insights by providing a visual overview of your entire journey, enabling better planning and improving stakeholder visibility
Reduce database licensing costs by giving you intelligent insights on the target environment and recommended migration paths

Key features of migVisor for MongoDB

For several years, migVisor by EPAM has delivered automated assessments that have helped hundreds of customers migrate their relational databases to cloud-based or cloud-native databases. Now, migVisor adds support for the world's leading modern data platform: MongoDB. As part of the initial release, migVisor will support self-managed MongoDB to MongoDB Atlas migration assessments. We plan to support TCO analysis for MongoDB migrations, application modernization and migration assessments, and relational-to-MongoDB migration assessments in future releases. MongoDB is also a natural fit for Google Cloud's Open Cloud strategy of providing customers a broad set of fully managed database services, as Google Cloud's own GM and VP of Engineering & Databases, Andi Gutmans, notes: We are always looking for ways to simplify migrations for our customers. Now, with EPAM's database migration assessment tool, migVisor, supporting MongoDB Atlas, our customers can easily complete database assessments—including TCO analyses and migration complexity assessments, and generate comprehensive migration plans.
A simplified migration experience combined with our joint Marketplace success enables customers to consolidate their data workloads into the cloud while making the development and procurement process simple—so users can focus more on innovation.

How the migVisor assessment works

migVisor analyzes source databases (on-premises or in any cloud environment) for migration assessment to a new target. The assessment includes the following steps:

The simple-to-use migVisor Metadata Collector (mMC) collects metadata from the source database, including the featureCompatibilityVersion value, journaling status for data-bearing nodes, MongoDB storage size used, replica set configuration, and more. (For a sense of the kind of metadata involved, see the sketch at the end of this article.)
Figure 1: mMC GUI Edit Connection Screen
On the migVisor Analysis Dashboard, you select the source/target pair (e.g., MongoDB to MongoDB Atlas on Google Cloud).
Figure 2: Source and Target Selection
In the migVisor console, you can then view the automated assessment output created by migVisor's migration complexity scoring engine, including classification of the migration into high/medium/low complexity and identification of potential migration challenges and incompatibilities.
Figure 3: Source Cluster Features
Finally, you can export the assessment output in CSV format for further analysis in your preferred data analysis or reporting tool.

Conclusion

Together, Google Cloud and MongoDB have successfully worked with many organizations to streamline cloud migrations and modernize their legacy landscapes. To build on the foundation of providing our customers with a best-in-class experience, we've worked closely with Google Cloud and EPAM Systems to integrate MongoDB Atlas with migVisor. Because of this, customers will now be able to better plan migrations, reduce risk and avoid missteps, identify quick wins for TCO reduction, review migration complexities, and appropriately plan migration phases for the best outcomes. Learn more about how you can deploy, manage, and grow MongoDB on Google Cloud on our partner page. If you'd like guidance and migration advice, please reach out to mdb-gcp-marketplace@mongodb.com to get in touch with the Google, MongoDB, and EPAM sales teams.
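For a sense of the metadata mMC gathers, here is an illustrative PyMongo sketch that collects similar information directly from a source cluster. It is not part of migVisor; the connection string is a placeholder and the script assumes it runs against a replica set member with sufficient privileges.

```python
# Sketch: gather the kind of source-cluster metadata a migration assessment
# looks at (featureCompatibilityVersion, storage size, replica set config).
# Illustrative only; not migVisor code.
from pymongo import MongoClient

client = MongoClient("mongodb://<user>:<password>@source-host:27017")

fcv = client.admin.command(
    {"getParameter": 1, "featureCompatibilityVersion": 1}
)["featureCompatibilityVersion"]

rs_config = client.admin.command("replSetGetConfig")["config"]

storage_bytes = sum(
    client[db].command("dbStats")["storageSize"]
    for db in client.list_database_names()
)

print("featureCompatibilityVersion:", fcv)
print("replica set members:", [m["host"] for m in rs_config["members"]])
print("approx. storage size (GB):", round(storage_bytes / 1024**3, 2))
```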

January 21, 2022

PeerIslands Cosmos DB Migrator Tool to MongoDB Atlas on Google Cloud

When you're in the midst of innovating, the last thing you want to worry about is infrastructure. Whether you're looking to streamline inventory management or reimagine marketing, you need applications that can scale fast and maintain high availability. That's where MongoDB Atlas on Google Cloud comes in. With MongoDB Atlas' general-purpose, document-based database, users can free themselves from the hassle of database management and give precious time back to developers to focus on innovation. Combine these benefits with Google Cloud's computing power, high availability, and ability to integrate with tools like BigQuery, Dataflow, Dataproc, and more, and it's hard to find a comparable joint solution. In fact, many current Microsoft Azure Cosmos DB users are now considering making the move to MongoDB. Microsoft's Cosmos DB only supports single-partition transactions, has no schema governance, and forces developers to work with five different APIs to deliver full application functionality. Conversely, MongoDB Atlas on Google Cloud supports distributed multi-document ACID transactions, includes schema governance, and offers integrated full-text search, auto-archiving, data lakes, and edge-to-cloud data sync. The following blog illustrates how PeerIslands' Cosmos DB Migrator tool can help users move from Cosmos DB to MongoDB Atlas on Google Cloud.

Why PeerIslands

PeerIslands is an enterprise-class digital transformation company composed of a team of polyglots who are comfortable across multiple technologies and cloud platforms. As a services firm, PeerIslands is focused on helping customers with both cloud-native development and application transformation. With best-in-the-industry talent, PeerIslands has been working with the MongoDB team to build a suite of solutions around two key objectives:

For a customer evaluating MongoDB, how can we rapidly address common questions?
Once a customer has chosen MongoDB, how can we reduce time to value by rapidly migrating workloads to MongoDB?

With this in mind, PeerIslands developed a suite of tools around schema generation, understanding MongoDB query performance, and helping customers understand the code changes required for upgrading MongoDB versions. In terms of workload migrations, PeerIslands developed solutions for both homogenous and heterogenous migrations. The company is also contributing to the open source community with a mobile app that enables MongoDB admins to manage Atlas on the go.

PeerIslands' Cosmos DB migration use case

The current approach for migrating data from Cosmos DB to MongoDB is to use MongoDB dump and restore. But there are several problems with this approach. It's fully manual and CLI-based, which creates a poor user experience and requires technical resources even for simple migrations. The lack of change capture capability requires downtime for the duration of the migration; for large Cosmos DB migrations, this causes significant issues. The team is also under pressure to deliver the entire migration in a short period of time, and migrations often get delayed as customers have difficulty identifying the right migration window. The Cosmos to MongoDB tool is a "Live Migrate"-style tool that helps perform one-time migrations and change data capture from Cosmos DB (MongoDB model) to MongoDB Atlas while minimizing the downtime associated with migrations. The tool is fully GUI-based and nearly everything is automated.
Tasks such as infrastructure provisioning, dump and restore, and change stream listeners and processors have all been automated behind a graphical user interface (GUI). The Cosmos to MongoDB migration tool uses native MongoDB tools, and its performance is similar to that of the native tools. For change capture, it leverages the native MongoDB change stream APIs. A high-level view of the solution is provided in Figure 1 below:

Figure 1: Solution Map

Migration steps:

Migration configuration: Provide the name of the migration task, source Cosmos DB details, and target MongoDB details. The tool supports key vault integration as well.
Migration infrastructure provisioning: Provide the migration infrastructure details required for creating the VM (virtual machine), including location, type of VM instance, etc.
Migration execution: Automate the migration once the configuration is complete. The migration is executed in three steps: backup, restore, and change event processing. As a user, you can initiate the backup process. The change event listener is started in parallel with the backup process and captures all the changes. Once the backup is complete, the user can restore the initial data and then perform change event processing to apply all the changes to MongoDB. (A simplified sketch of this change-capture step appears at the end of this article.)
Migration validation: The tool also provides facilities for validating the migration. Users can view the total number of documents in both the source Cosmos DB collection and the target MongoDB collection. They can also compare random documents picked from Cosmos DB and MongoDB side by side and validate whether the data elements have been loaded correctly.

For a more detailed demo and description, watch the following video:

Migrating to a new database can feel daunting at first, but the PeerIslands Cosmos DB Migrator makes it easy. Major concerns like delays and downtime are eliminated from the process, helping you run your business smoothly and reap the benefits of MongoDB more quickly. And with PeerIslands' suite of tools, you can rapidly address MongoDB-specific questions and accelerate time to value. Reach out today to get started.
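To make the change-capture step described above concrete, here is a simplified, hypothetical sketch of a change stream listener that replays source changes into Atlas. It is not the PeerIslands tool itself; connection strings, database, and collection names are placeholders, and Cosmos DB's MongoDB API may impose additional restrictions on the change stream pipeline.

```python
# Simplified sketch of the change-capture step: listen to the source change
# stream (started in parallel with the backup) and replay each change into
# the target Atlas collection. Illustrative only; names are placeholders.
from pymongo import MongoClient

source = MongoClient("mongodb://<cosmos-account>:<key>@<cosmos-host>:10255/?ssl=true")
target = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")

src_coll = source["appdb"]["orders"]
dst_coll = target["appdb"]["orders"]

# Restrict the stream to the change types we can replay and request the
# full document for each change.
pipeline = [{"$match": {"operationType": {"$in": ["insert", "update", "replace"]}}}]

with src_coll.watch(pipeline, full_document="updateLookup") as stream:
    for change in stream:
        doc = change["fullDocument"]
        # Idempotent replay: upsert the latest version of the document.
        dst_coll.replace_one({"_id": doc["_id"]}, doc, upsert=True)
```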

December 9, 2021

Make Migrating to MongoDB Atlas on AWS Easy with PeerIslands Modernization Tool Set

As cloud computing becomes commonplace across industries, organizations are rapidly adopting MongoDB Atlas because they know that true modernization is about more than just moving data as-is to the cloud—i.e., taking a "lift and shift" approach. It's also about remodeling that same data along the way for faster and more iterative development. With MongoDB's document-based database, developers are empowered to reimagine how they build, with flexible schema design that allows them to easily model and remodel data for a wide range of use cases while still applying governance when needed. MongoDB Atlas maps naturally to modern object-oriented programming languages, making developers' lives much easier. In contrast to the rigidity of SQL databases, MongoDB's flexible data model means that your database schema can evolve with business requirements. This helps users build applications faster, handle diverse data types, and manage applications more efficiently at scale. As a fully managed service, MongoDB Atlas takes care of database maintenance for you and can also be scaled within and across multiple distributed data centers, providing levels of availability and scalability previously unachievable with relational databases. The advantages of moving to MongoDB Atlas are clear, but some companies may still feel reluctant to leave behind the legacy relational databases they're familiar with for unknown territory. This is where PeerIslands comes in. With PeerIslands, you don't have to go it alone. The following blog introduces PeerIslands' modernization capabilities and how you can leverage them to migrate seamlessly to MongoDB Atlas on AWS.

Why PeerIslands?

PeerIslands is an enterprise-class digital transformation company composed of a team of polyglots who are comfortable across multiple technologies and cloud platforms. As a services firm, PeerIslands is focused on helping customers with both cloud-native development and application transformation. With best-in-the-industry talent, the team has helped several Fortune 50 companies bring large-scale transformations to life, and has received recognition from several clients and partners, including MongoDB. With engineers trained and certified in MongoDB, PeerIslands has helped MongoDB's ISV and retail customers modernize, moving software built for on-premises deployment to SaaS environments better suited to the cloud, and was named MongoDB's Boutique System Integrator Partner of the Year. PeerIslands can swiftly transform and migrate core, legacy, and on-premises applications to the cloud. They develop solutions based on cutting-edge microservices and serverless architecture across public cloud platforms and hybrid PaaS platforms to help users quickly get applications to customers and business users.

How PeerIslands can help

PeerIslands has been working with MongoDB and AWS to develop tools that address two key objectives for customers:

Objective 1: Tools that address common customer questions when evaluating MongoDB

MongoDB Test Data Generator: A fully UI-driven tool with an extensive data library for rapidly loading MongoDB with use-case-specific, near real-world data at scale.
MongoDB Performance Testing tool: A performance analyzer where you can create multiple load profiles, run use-case-specific MongoDB queries, and understand the performance of those queries.
With the test data generator and the performance testing tool, customers can get a clear view of MongoDB's performance for their specific situation even before migrating to MongoDB.
MongoDB Schema Generator and Data Modeler: The SchemaGen tool helps rapidly generate a draft JSON schema from your existing SQL schema. On top of this, you can then perform the data modeling exercise and generate the schema that will form your MongoDB schema. The schema generator also provides key information about the SQL database, such as size, indexes, and more.
MongoDB Sizer: The MongoDB sizing tool helps you understand the size implications of your schema and calculate Atlas sizing. With the MongoDB Sizer, customers can upload their own schema and calculate the various factors that influence Atlas sizing.
Codescanner: A tool for scanning your code repositories for deprecated MongoDB APIs. With the code scanner, customers can get a clear view of the application impact of upgrading MongoDB versions.

Objective 2: Tools that accelerate time to value by rapidly moving workloads to MongoDB

COSMOS2Atlas migration: A point-and-click solution that helps Cosmos DB customers migrate data from Cosmos DB to MongoDB. This solution provides change capture capability to ease downtime requirements and makes data migration easy and seamless.
1Data: A tool for addressing the more complex requirements of migrating data from SQL to MongoDB.
Admin mobile app: A mobile app for admins to track key Atlas KPIs and approve common access requests on the go.

PeerIslands brings to the table an entire suite of tools for addressing all your MongoDB needs.

PeerIslands use case featuring the 1Data tool

One of the key requirements of modernization projects is to handle large-scale data migrations from SQL databases. There are a number of tools available that simply replicate data from SQL to MongoDB—but we rarely use the same SQL schema in MongoDB. Schema transformation—however difficult to do at scale—is nonetheless required so that we can make the best use of MongoDB capabilities. Today, the typical approach is to run custom Spark jobs, as they are scalable and flexible when it comes to processing schema transformations and loading the data into MongoDB. But when you go beyond migrating one or two tables in a proof-of-concept (PoC) setting, the problem becomes much more complex. For instance, writing custom Spark programs for every schema transformation is cumbersome and error-prone. Even simple migrations can require tens of Spark programs, and any defects that occur during transformation can cause significant issues. Also consider the following challenges:

How do you extract data out of your SQL database without impacting database performance?
How do you handle infrastructure provisioning and scaling?
How do you orchestrate the migration? A few master tables can be migrated once, but transaction tables may need both a one-time migration and daily incremental migrations. How can you do this orchestration at scale?
How do you verify that no data has been lost during the migration?
Last but not least, once the data is migrated, how do you keep it up to date?

We would probably end up with a suite of tools to address these issues: Sqoop, Kafka, Spark, some kind of job orchestration engine, an observability suite, a notification workflow, and so on. It quickly becomes evident that migrating data from SQL to MongoDB without disrupting business could be the most daunting barrier to adopting MongoDB.
Unfortunately, current tools invariably fail for complex heterogeneous migration scenarios, and developers end up writing a lot of custom code. Realizing this, PeerIslands has been working with MongoDB and AWS to develop 1Data. 1Data is a platform that helps enterprises perform migration and real-time synchronization of data between SQL databases and MongoDB. 1Data is designed to complement existing AWS services like DMS in migrating data out of SQL.

Key features of 1Data:

1Data is fully GUI-based; no coding is required.
1Data provides a single platform for both one-time migration and continuous updates.
1Data is consistent across one-time migration and continuous updates, which provides a good anti-corruption layer for continuous updates.
The 1Data tech stack is based on Spark and Kafka, among others, and is highly scalable.
1Data is highly modular, has a well-defined API layer, and can easily be extended to your needs.
1Data automatically handles all the infrastructure required for migration with AWS Quick Start templates.

High-Level Solution Architecture

1Data's capabilities are realized through a decoupled and highly scalable architecture. The data extraction, transformation, and load stages are independent of each other and can easily be customized based on the specific requirements of the customer. The architecture can orchestrate between batch-based initial loads and streaming-based CDC loads. A Spark, Kafka, and Airflow-based tech stack provides excellent scalability for the 1Data platform to handle large data migrations.

Figure 1: 1Data High Level solution architecture

The 1Data portal structures migrations using endpoints, tasks, and DAGs (Directed Acyclic Graphs):

Endpoints define source, intermediate, and final data locations and can come in the form of files, databases, or queues. Endpoints can also be database extracts in S3 from the AWS DMS service.
Task definition is the second step in the migration. Tasks act on a source endpoint and produce data in either a staging or destination endpoint. A number of predefined tasks are available: extract, transformation, sink, and validation tasks. You can configure both streaming and batch tasks.
Defining the DAGs is the final step before the actual migration. DAGs define the sequence in which a user wants to execute the defined tasks (see the illustrative sketch below).

The technology components used in 1Data allow very large data migrations to be handled easily. Each component has been selected such that it can be deployed across multiple cloud platforms and scaled easily. Technology stack details are below:

Web Portal: Angular
Web API: Node
Configuration Database: MongoDB
Data Transformation & Validation: Spark
Data Extraction: Sqoop, Spark, DMS
Change Data Capture: Kafka, Debezium
Data Sink: Spark
Job/Task Orchestrator: Airflow

PeerIslands has worked with AWS and MongoDB to create a Quick Start for 1Data. With the Quick Start, customers can rapidly instantiate 1Data for their migration requirements. To recap, with the 1Data Quick Start on AWS, we can:

Perform heterogeneous schema transformation from SQL and load data into MongoDB Atlas on AWS
Weave together continuous data updates, incremental data updates, and one-time migration using a combination of batch and streaming jobs
Orchestrate the migration tasks
Validate the migration

...And all without writing a single line of code!
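As a purely illustrative sketch (not actual 1Data code), the hypothetical Airflow DAG below shows the kind of extract, transform, sink, and validate sequencing that the platform automates behind its GUI; the task bodies are stubs.

```python
# Hypothetical Airflow DAG showing the kind of sequencing 1Data automates:
# extract from SQL, transform the schema, sink into MongoDB Atlas, validate.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():      # e.g., DMS/Sqoop extract into S3 staging
    pass

def transform():    # e.g., Spark job reshaping rows into documents
    pass

def sink():         # e.g., Spark job writing documents into Atlas
    pass

def validate():     # e.g., compare source row counts with target doc counts
    pass

with DAG(
    dag_id="sql_to_mongodb_one_time_migration",
    start_date=datetime(2021, 10, 1),
    schedule_interval=None,   # trigger manually for a one-time migration
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_sink = PythonOperator(task_id="sink", python_callable=sink)
    t_validate = PythonOperator(task_id="validate", python_callable=validate)

    t_extract >> t_transform >> t_sink >> t_validate
```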
Demo

Looking forward

A modern data architecture can help you unlock your business' full potential and gain real-time access to the insights you need, when you need them. MongoDB's document-based database and flexible schema design help you make smarter decisions, cut costs, and take full advantage of AI/ML capabilities to empower your employees and raise customer satisfaction. The decision to migrate off your legacy systems and onto MongoDB is easy—and now the process is, too. Let PeerIslands help you get there. Our best-in-class teams leverage next-generation technologies, including Artificial Intelligence (AI), Augmented Reality (AR), Blockchain, Internet of Things (IoT), Machine Learning (ML), Mobile, and Virtual Reality (VR). Our expertise spans the modern programming stack, and we follow best practices in distributed, agile, and lean principles as well as test-driven development and DevOps.

Additional Resources

ISV WMP Program: Contact aws-isv-workload-migration@amazon.com for details
Atlas Quick Start
MongoDB Atlas Starter Package
Atlas Migration Guide
Atlas Migration Pattern

Contact us with any questions around modernization with MongoDB, AWS, and PeerIslands.

October 28, 2021

Build a Single View of Your Customers with MongoDB Atlas and Cogniflare's Customer 360

The key to successful, long-lasting commerce is knowing your customers. If you truly know your customers, then you understand their needs and wants and can identify the right product to deliver to them—at the right time and in the right way. However, for most B2C enterprises, building a single view of the customer poses a major hurdle due to copious amounts of fragmented data. Businesses gather data from their customers in multiple locations, such as ecommerce platforms, CRM, ERP, loyalty programs, payment portals, web apps, mobile apps, and more. Each data set can be structured, semi-structured, or unstructured, delivered as a stream or requiring batch processing, which makes compiling already fragmented customer data even more complex. This has led some organizations to bespoke solutions, which still only provide a partial view of the customer. Siloed data sets make running operations like customer service, targeted marketing, and advanced analytics—such as churn prediction and recommendations—highly challenging. Only with a 360-degree view of the customer can an organization deeply understand their needs, wants, and requirements, as well as how to satisfy them. A single view of that 360-degree data is therefore vital for a lasting relationship. In this blog, we'll walk through how to build a single view of the customer using MongoDB's database and Cogniflare's Calleido Customer 360 tool. We'll also explore a real-world use case focused on sentiment analysis.

Building a single view with Calleido's Customer 360

With a Customer 360 database, organizations can access and analyze various individual interactions and touchpoints to build a holistic view of the customer. This is achieved by acquiring data from a number of disparate sources. However, routing and transforming this data is a complex and time-consuming process, and many existing big data tools aren't compatible with cloud environments. These challenges inspired Cogniflare to create Calleido.

Figure 1: Calleido Customer 360 Use Case Architecture

Calleido is a data processing platform built on top of battle-tested open source tools such as Apache NiFi. Calleido comes with over 300 processors to move structured and unstructured data from and to anywhere. It facilitates batch and real-time updates and handles simple data transformations. Critically, Calleido seamlessly integrates with Google Cloud and offers one-click deployment. It uses Google Kubernetes Engine to scale up and down based on demand, and provides an intuitive and slick low-code development environment.

Figure 2: Calleido Data Pipeline to Copy Customers From PostgreSQL to MongoDB

A real-world use case: Sentiment analysis of customer emails

To demonstrate the power of Cogniflare's Calleido, MongoDB Atlas, and the Customer 360 view, consider the use case of conducting sentiment analysis on customer emails. To streamline the build of a Customer 360 database, the team at Cogniflare created flow templates for implementing data pipelines in seconds. In the upcoming sections, we'll walk through some of the most common data movement patterns for this Customer 360 use case and showcase a sample dashboard.

Figure 3: Sample Customer Dashboard

The flow commences with a processor pulling IMAP messages from an email server (ConsumeIMAP). Each new email that arrives in the chosen inbox (e.g., customer service) triggers an event. Next, the process extracts email headers to determine topline details about the email content (ExtractEmailHeaders).
Using the sender's email address, Calleido identifies the customer (UpdateAttribute) and extracts the full email body by executing a script (ExecuteScript). Now, with all the data collected, a message payload is prepared and published through Google Cloud Platform (GCP) Pub/Sub (Kafka can also be used) for consumption by downstream flows and other services.

Figure 4: Translating Emails to Cloud Pub/Sub Messages

The GCP Pub/Sub messages from the previous flow are then consumed (ConsumeGCPPubSub). This is where the power of the MongoDB Atlas integration comes in, as we verify each sender against the MongoDB database (GetMongo). If a customer exists in our system, we pass the email data to the next flow. Other emails are ignored.

Figure 5: Validating Customer Email with MongoDB and Calleido

Analysis of the email body copy is then conducted. For this flow, we use a processor to prepare a request body, which is then sent to Google Cloud Natural Language AI to assess the tone and sentiment of the message. The results from the language processing API then go straight to MongoDB Atlas so they can be pulled through into the dashboard.

Figure 6: Making the Cloud AutoML Call with Calleido

End result in the dashboard:

The Customer 360 database can be used in internal back-office systems to supplement and inform customer support. With a single view, it's quicker and more effective to troubleshoot issues, handle returns, and resolve complaints. Leveraging information from previous client conversations ensures each customer is given the most appropriate and effective response. These data sets can then be fed into analytics systems to generate learnings and optimizations, such as associating negative sentiment with churn rate.

How MongoDB's document database helps

In the example above, Calleido takes care of copying and routing data from the business source systems into MongoDB Atlas, the operational data store (ODS). Thanks to MongoDB's flexible data structure, we can transfer data in its original format and subsequently implement any necessary schema transformations in an iterative manner. There is no need to run complex schema migrations. This allows for the quick delivery of a single view database.

Figures 7 & 8: Calleido Data Pipelines to Copy Products and Orders From PostgreSQL to MongoDB Atlas

Calleido allows us to make this transition in just a few simple steps. The tool runs a custom SQL query (ExecuteSQL) that joins all the required data from the outer tables and compiles the results in order to parallelize the processing. The data arrives in Avro format; Calleido then converts it into JSON (ConvertAvroToJSON) and transforms it to the schema designed for MongoDB (JoltTransformJSON).

End result in the Customer 360 dashboard:

MongoDB Atlas is the market-leading choice for the Customer 360 database. Here are the core reasons for its world-class standard:

MongoDB can efficiently handle non-standardized schemas coming from legacy systems and efficiently store any custom attributes.
Data models can include all the related data as nested documents. Unlike SQL databases, MongoDB avoids complicated join queries, which are difficult to write and not performant.
MongoDB is fast. The current view of a customer can be served in milliseconds without the need to introduce a caching layer.
The MongoDB flexible schema model enables agility with an iterative approach. In the initial extraction, the data can be copied nearly exactly in its original shape. This drastically reduces latency.
In subsequent phases, the schema can be standardized and the quality of the data improved without complex SQL migrations.
MongoDB can store dozens of terabytes of data across multiple data centers and easily scale horizontally. Data can be distributed across multiple regions to help navigate compliance requirements. Separate analytics nodes can be set up to avoid impacting the performance of production systems.
MongoDB has a proven record of acting as a single view database, with legacy and large organizations up and running with prototypes in two weeks and in production within a business quarter.
MongoDB Atlas can autoscale out of the box, reducing costs and handling traffic peaks.
Data can be encrypted both in transit and at rest, helping to achieve compliance with security and privacy standards, including GDPR, HIPAA, PCI-DSS, and FERPA.

Upselling the customer: Product recommendations

Upselling customers is a key part of modern business, but the secret to doing it successfully is that it's less about selling and more about educating. It's about using data to identify where the customer is in the customer journey, what they may need, and which product or service can meet that need. Using a customer's purchase history, Calleido can help prepare product recommendations by routing data to the appropriate tools, such as BigQuery ML. These recommendations can then be promoted through the call center and marketing teams for both online and mobile app recommendations. There are two flows to achieve this: preparing training data and generating recommendations.

Preparing training data

First, the appropriate data is transferred from PostgreSQL to BigQuery using the ExecuteSQL processor. The data pipeline is scheduled to execute periodically. In the next step, the data is fetched from PostgreSQL and divided into 1K-row chunks with the ExecuteSQLRecord processor. These files are then passed to the next processor, which uses load balancing to utilize all available nodes. All that data then gets inserted into a BigQuery table using the PutBigQueryStreaming processor.

Figure 9: Copying Data from PostgreSQL to BigQuery with Calleido

Generating product recommendations

Next, we move on to generating product recommendations. First, you must purchase BigQuery capacity slots, which offer the most affordable way to take advantage of BigQuery ML features. Here, Calleido invokes an SQL procedure with the ExecuteSQL processor, then ensures that the requested BigQuery capacity is ready to use. The next processor (ExecuteSQL) executes an SQL query responsible for creating and training the matrix factorization ML model using the data copied by the first flow. Next in the queue, Calleido uses the ExecuteSQL processor to query our trained model, acquire all the predictions, and store them in a dedicated BigQuery table. Finally, the Wait processor waits for both capacity slots to be removed, as they are no longer required.

Figures 10 & 11: Generating Product Recommendations with Calleido

Then, we remove old recommendations using two processors. First, the ReplaceText processor updates the content of incoming flow files, setting the query body. This is then used by the DeleteMongo processor to perform the removal.

Figure 12: Remove Old Recommendations

The whole flow ends with copying recommendations to MongoDB. The ExecuteSQL processor fetches and aggregates the top 10 recommendations per user, all in chunks of 1K rows.
Then, the following two processors (ConvertAvroToJSON and ExecuteScript) prepare the data to be inserted into the MongoDB collection by the PutMongoRecord processor.

Figure 13: Copy Recommendations to MongoDB

End result in the Customer 360 dashboard (the data used in this example is autogenerated):

Benefits of Calleido's Customer 360 database on MongoDB Atlas

Once the data is available in a centralized operational data store like MongoDB, Calleido can be used to sync it with an analytics data store such as Google BigQuery. Thanks to the Customer 360 database, internal stakeholders can then use the data to:

Improve customer satisfaction through segmentation and targeted marketing
Accurately and easily access compliance audits
Build demand planning forecasts and analyses of market trends
Reward customer loyalty and reduce churn

Ultimately, a single view of the customer enables organizations to deliver the right message to prospective buyers, funneling those at the brand awareness stage into the conversion stage, and ensures that retention and post-sales mechanics are working effectively. Historically, building a 360-degree view of the customer was a complex and fragmented process, but with Cogniflare's Calleido and MongoDB Atlas, a Customer 360 database has become the most powerful and cost-efficient data management stack an organization can harness.
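As an illustrative sketch of how the single-view pattern above might look in application code (this is not Calleido code), the snippet below folds an email sentiment score and a customer's latest recommendations into a Customer 360 document in Atlas. Collection and field names are hypothetical.

```python
# Illustrative sketch: fold an email sentiment score and the latest product
# recommendations into a single-view customer document in MongoDB Atlas.
# Connection string, collection, and field names are hypothetical.
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
customers = client["customer360"]["customers"]

def record_email_sentiment(email_address, subject, sentiment_score):
    """Append a sentiment-scored interaction if the sender is a known customer."""
    return customers.update_one(
        {"email": email_address},              # equivalent of the GetMongo lookup
        {"$push": {"interactions": {
            "channel": "email",
            "subject": subject,
            "sentiment": sentiment_score,       # e.g., a score from a language API
            "at": datetime.now(timezone.utc),
        }}},
    )

def store_recommendations(customer_id, product_ids):
    """Replace the customer's top-N recommendations (e.g., from BigQuery ML)."""
    return customers.update_one(
        {"_id": customer_id},
        {"$set": {"recommendations": product_ids[:10]}},
    )
```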

October 14, 2021

Simplifying Data Migrations From Legacy SQL to MongoDB Atlas with Studio 3T and Hackolade

Migrating data from SQL relational databases to MongoDB Atlas may initially seem to be a straightforward process. Export the data from the relational database, import the tables into MongoDB Atlas, and then start writing queries for it. But the deeper you look, the more it can start to feel like an overwhelming task. Decades of irrelevant indexes, rare relationships, and forgotten fields that need to be migrated all make for a more complicated process. Not to mention, converting your old schemas to work well with the document-based world of MongoDB can take even longer. Making this process easier and more achievable was one of Studio 3T's main goals when they created their SQL to MongoDB migration tools. In fact, Studio 3T's SQL migration doesn't just make this process easier; it also makes it reliably repeatable. This means you can tune your migration to produce the perfect documents in an ideal schema. There's no big-bang cutover; you can proceed at your own pace. Delivering the ability to perfect your new schema is also why we've integrated with Hackolade's schema design and data modeling tools. In this article, we look at how the data modernization process works, with a focus on the SQL migration powers of Studio 3T and the data modeling technology of Hackolade.

"So the problem was, how do I migrate the data that's been collected over the last 10 years from MySQL over to MongoDB? And that's when I found Studio 3T. I could not imagine trying to do it myself by hand... If it hadn't been for Studio 3T, I probably would have just left it all in the SQL database."
Rand Nix, IT Director, Wakefield Inspection Services

Why MongoDB Atlas

Building on MongoDB's intuitive data model and query API, MongoDB Atlas gives engineering organizations the versatility they need to build sophisticated applications that can adapt to changing customer demands and market trends. Not only is it the only multi-cloud document database available, it also delivers the most advanced security and data distribution capabilities of any fully managed service. Developers can get started in minutes and leverage intelligent automation to maintain performance at scale as applications evolve over time.

Mapping the move

The crucial component of a SQL migration is mapping the SQL tables and columns to JSON documents and their name/value fields. While the rigid grid of relational tables has a familiar feel, a document in MongoDB can be any shape and, ideally, it should be the shape that makes sense for your business logic. Doing this by hand can prove extremely time-consuming for developers. Which situations call for reshaping your data into documents, and which call for constructing new documents? These are important questions that require time and expertise to answer. Or, you can simply use Studio 3T's SQL Migration tool, which is a powerful, configurable import process. It allows you to create your MongoDB documents with data retrieved from multiple relational tables. Using foreign keys, it can capture the relationships between records in document-centric form.

"When we started out, we built it on SQL Server. All was good to start, but by the time we got up to fifty customers and more, each with some thousands of their staff logging into the system concurrently, we ran into scaling issues.

"Now we're using a hosted, paid-for subscription on MongoDB Atlas. They do all the management and the sharding and all that stuff. And when we switch to regional provisioning, they can handle all that too, which is a huge relief.
"We all use Studio 3T constantly, it’s used as extensively as Visual Studio. We use it for all sorts of things because Studio 3T makes it really easy to navigate around the data." Dan Cummings, Development Mgr. Terryberry Inc. For example, a relational database may have a table of customers and a table of orders by each customer. With the SQL Migration tool, you can automatically create a collection of customer documents, containing an array of all the orders that customer has placed. This exploits the power of the document model by keeping related data local. You can also preview the resulting JSON documents before running a full import so you are sure you're getting what you want. Figure 1. Studio 3T’s SQL to MongoDB Migration In Figure 1 (see above), we can see Studio 3T's SQL Migration pulling in multiple SQL tables, embedding rental records into the customer records, and previewing the JSON output. For more radical restructuring, SQL queries can also provide the source of the data for import. This allows for restructuring of the data by, for example, region or category. Hackolade integrated As you can see, Studio 3T is fully capable of some complex migrations through its tooling, SQL Migration, Import and Reschema. However, if you are of the mindset that you should design your migration first, then this is where Hackolade comes in. Hackolade is a "polyglot data" modelling application designed for a world where models are implemented in everything from graphs to APIs to storage formats. Figure 2. Entity Relationships Mapped in Hackolade Using Hackolade, you can import the schema with relationships from an SQL database and then visually reassemble the SQL fields to compose your MongoDB document schema. Watch this demo video for more details. Once you've built your new models, you can use Studio 3T to put them into practice by importing a Hackolade schema mapping into a SQL Migration configuration. Figure 3. Importing Hackolade Models into Studio 3T This helps eliminate the need to repeatedly query and import from the SQL database as you fine-tune your migration. Instead, you can construct and document the migration within Hackolade, review with your team, and then implement with confidence. Once the data is in MongoDB Atlas, support from Studio 3T continues. Studio 3T's Reschema tools allow you to restructure your schema without re-importing, working entirely on the MongoDB Atlas data. This can be particularly useful when blending data from existing document databases with freshly imported, formerly relational data. The idea that migrations have to be team-breaking chores no longer holds. It is eminently possible, with tools such as Studio 3T's SQL Migration and Hackolade, to not only perform more complex migrations easily, but to be able to design them up front and put them into practice as often as is needed. With both tools working in integrated harmony, the future of migration has arrived.

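To make the customer-and-orders mapping described in the article above concrete, here is a minimal, hand-rolled sketch of the same idea in Python, reading from a hypothetical MySQL schema with mysql-connector-python and writing embedded documents with pymongo. The table names, column names, and connection details are illustrative assumptions; Studio 3T's SQL Migration performs this kind of mapping through configuration rather than code.

```python
# Minimal sketch: embed each customer's orders in a single MongoDB document.
# Table, column, and connection details are illustrative assumptions.
import mysql.connector                   # pip install mysql-connector-python
from pymongo import MongoClient

sql = mysql.connector.connect(host="localhost", user="app",
                              password="secret", database="shop")
customers = MongoClient("mongodb://localhost:27017")["shop"]["customers"]

cur = sql.cursor(dictionary=True)
cur.execute("SELECT customer_id, name, email FROM customers")
for row in cur.fetchall():
    order_cur = sql.cursor(dictionary=True)
    order_cur.execute(
        "SELECT order_id, order_date, status FROM orders WHERE customer_id = %s",
        (row["customer_id"],),
    )
    # order_date is assumed to be a DATETIME column (BSON stores datetimes, not plain dates).
    doc = {
        "_id": row["customer_id"],
        "name": row["name"],
        "email": row["email"],
        "orders": order_cur.fetchall(),  # related data stays local to the customer
    }
    customers.replace_one({"_id": doc["_id"]}, doc, upsert=True)
```

The point of the resulting shape, as in the article, is locality: a single read of one customer document returns that customer's orders along with it.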
August 24, 2021

How DataSwitch And MongoDB Atlas Can Help Modernize Your Legacy Workloads

Data modernization is here to stay, and DataSwitch and MongoDB are leading the way forward. Research strongly indicates that the future of the database management system (DBMS) market is in the cloud, and the ideal way to shift from an outdated, legacy DBMS to a modern, cloud-friendly database is through data modernization.

There are a few key factors driving this shift. Increasingly, companies need to store and manage unstructured data in a cloud-enabled system, as opposed to a legacy DBMS designed only for structured data. Moreover, the amount of data generated by a business is increasing at a rate of 55% to 65% every year, and the majority of it is unstructured. A modernized database that can improve data quality and availability provides tremendous benefits in performance, scalability, and cost optimization. It also provides a foundation for improving business value through informed decision-making. Additionally, cloud-enabled databases support greater agility, so you can upgrade current applications and build new ones faster to meet customer demand.

Gartner predicts that by 2022, 75% of all databases will be on the cloud, either by direct deployment or through data migration and modernization. But research shows that over 40% of migration projects fail, due to challenges such as:

- Inadequate knowledge of legacy applications and their data design
- Complexity of code and design across different legacy applications
- Lack of automation tools for transforming legacy data processing into cloud-friendly data and processes

It is essential to take a strategic approach and choose the right partner for your data modernization journey. We're here to help you do just that.

Why MongoDB?

MongoDB is built for modern application developers and for the cloud era. As a general-purpose, document-based, distributed database, it facilitates high productivity and can handle huge volumes of data. The document database stores data in JSON-like documents and is built on a scale-out architecture that is optimal for any developer building scalable applications through agile methodologies. Ultimately, MongoDB fosters business agility, scalability, and innovation. Key MongoDB advantages include:

- Rich JSON documents
- Powerful query language
- Multi-cloud data distribution
- Security for sensitive data
- Quick storage and retrieval of data
- Capacity for huge volumes of data and traffic
- Design that supports greater developer productivity
- Extreme reliability for mission-critical workloads
- Architecture optimized for performance and efficiency

Key advantages of MongoDB Atlas, MongoDB's hosted database as a service, include:

- Multi-cloud data distribution
- Security for sensitive data
- Designed for developer productivity
- Reliable for mission-critical workloads
- Built for optimal performance
- Managed for operational efficiency

To be clear, JSON documents are the most productive way to work with data, as they support nested objects and arrays as values, along with schemas that are flexible and dynamic. MongoDB's powerful query language enables sorting and filtering on any field, no matter how deeply it is nested in a document. It also provides support for aggregations as well as modern use cases including graph search, geo-based search, and text search. Queries are themselves expressed in JSON and are easy to compose, and MongoDB supports joins in queries. MongoDB supports two types of relationships, with the ability to reference and to embed; the sketch below illustrates both. It has all the power of a relational database and much, much more.
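As a rough illustration of those two relationship styles (not DataSwitch-specific code), the following pymongo sketch models the same author-and-books relationship twice: once by embedding the books inside the author document, and once by referencing the author from each book and joining with $lookup. Collection and field names are hypothetical.

```python
# Illustrative sketch of embedding vs. referencing; names are hypothetical.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["library"]

# Option 1: embed -- related data lives inside one document.
db.authors_embedded.insert_one({
    "_id": 1,
    "name": "Ada Writer",
    "books": [
        {"title": "Documents at Scale", "year": 2019},
        {"title": "Schema by Design", "year": 2021},
    ],
})
print(db.authors_embedded.find_one({"books.year": 2021}))  # query any nested field

# Option 2: reference -- keep books separate and join when needed.
db.authors.insert_one({"_id": 1, "name": "Ada Writer"})
db.books.insert_many([
    {"title": "Documents at Scale", "year": 2019, "author_id": 1},
    {"title": "Schema by Design", "year": 2021, "author_id": 1},
])
joined = db.authors.aggregate([
    {"$lookup": {"from": "books", "localField": "_id",
                 "foreignField": "author_id", "as": "books"}}
])
print(list(joined))
```

Embedding keeps reads local and fast for data that is accessed together; referencing suits relationships where the related data is large, unbounded, or shared across many parents.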
Companies of all sizes can use MongoDB, as it operates on a large and mature platform ecosystem. Developers enjoy a great user experience, with the ability to provision MongoDB Atlas clusters and start coding instantly. A global community of developers and consultants makes it easy to get the help you need, if and when you need it. In addition, MongoDB supports all major languages and provides enterprise-grade support.

Why DataSwitch as a partner for MongoDB?

Automated schema re-design, data migration & code conversion

DataSwitch is a trusted partner for cost-effective, accelerated solutions for digital data transformation, migration, and modernization on a modern database platform. Our no-code and low-code solutions, along with cloud data expertise and unique, automated schema generation, accelerate time to market. We provide end-to-end data, schema, and process migration with automated replatforming and refactoring, thereby delivering:

- 50% faster time to market
- 60% reduction in total cost of delivery
- Assured quality with built-in best practices, guidelines, and accuracy

Data modernization: How "DataSwitch Migrate" helps you migrate from RDBMS to MongoDB

DataSwitch Migrate ("DS Migrate") is a no-code and low-code toolkit that leverages advanced automation to provide intuitive, predictive, and self-serviceable schema redesign from a traditional RDBMS model to MongoDB's document model with built-in best practices. Based on data volume, performance, and criticality, DS Migrate automatically recommends the appropriate ETTL (Extract, Transfer, Transform & Load) data migration process. DataSwitch delivers data engineering solutions and transformations in half the timeframe of typical existing data modernization solutions. Consider these key areas:

- Schema redesign – construct a new framework for data management. DS Migrate provides automated data migration and transformation based on your redesigned schema, as well as no-touch code conversion from legacy data scripts to MongoDB Atlas APIs. Users simply drag and drop the schema for redesign, and the platform converts it to a document-based JSON structure by applying MongoDB modeling best practices. The platform then automatically migrates data to the new, redesigned JSON structure. It also converts the legacy database script for MongoDB. This automated, user-friendly data migration is faster than anything you've ever seen. Here's a look at how the schema designer works.
- Refactoring – change the data structure to match the new schema. DS Migrate handles this through automatic code generation for migrating the data. This goes far beyond a mere lift and shift: DataSwitch takes care of refactoring and replatforming (moving from the legacy platform to MongoDB) automatically. Performing all these tasks within a single platform is a game-changing capability.
- Security – mask and tokenize data while moving it from on-premises to the cloud. Because the data is moving to a potentially public cloud, you must keep it secure. DataSwitch's tool can configure and apply security measures automatically while migrating the data.
- Data quality – ensure that data is clean, complete, trustworthy, and consistent. DataSwitch allows you to configure your own quality rules and automatically apply them during data migration.

In summary: first, the DataSwitch tool automatically extracts the data from an existing database, like Oracle. It then exports the data and stores it locally before zipping and transferring it to the cloud.
Next, DataSwitch transforms the data by altering the data structure to match the redesigned schema and applying data security measures during the transform step. Lastly, DS Migrate loads the data and processes it into MongoDB in its entirety.

Process Conversion

Process conversion, where scripts and process logic are migrated from a legacy DBMS to a modern DBMS, is made easier thanks to a high degree of automation. Minimal coding and manual intervention are required, and the journey is accelerated. It involves:

- DML – Data Manipulation Language
- CRUD – typical application functionality (Create, Read, Update & Delete)
- Converting to the equivalent MongoDB Atlas API (a simplified illustration appears at the end of this article)

Degree of automation DataSwitch provides during migration:

Schema migration activities (DS automation capability)
- Application Data Usage Analysis: 70%
- 3NF to NoSQL Schema Recommendation: 60%
- Schema Re-Design Self Service: 50%
- Predictive Data Mapping: 60%

Process migration activities (DS automation capability)
- CRUD-based SQL conversion (Oracle, MySQL, SQL Server, Teradata, DB2) to MongoDB API: 70%

Data migration activities (DS automation capability)
- Migration Script Creation: 90%
- Historical Data Migration: 90%
- Catch-up Load: 90%

DataSwitch Legacy Modernization as a Service (LMaaS)

Our consulting expertise combined with the DS Migrate tool allows us to harness the power of the cloud for the data transformation of legacy RDBMS systems to MongoDB. Our solution delivers legacy transformation in half the time frame through pay-per-usage. Key strengths include:

- Data Architecture Consulting
- Data Modernization Assessment and Migration Strategy
- Specialized Modernization Services

DS Migrate Architecture Diagram

Contact us to learn more.
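To make the CRUD conversion idea mentioned above concrete, here is a small, hand-written sketch (illustrative only, not DS Migrate output) of how common relational statements map onto the equivalent MongoDB operations in Python with pymongo; the orders table and its columns are hypothetical.

```python
# Hypothetical CRUD conversion: relational SQL -> MongoDB query API.
from pymongo import MongoClient

orders = MongoClient("mongodb://localhost:27017")["shop"]["orders"]

# SQL: SELECT * FROM orders WHERE customer_id = 42 AND status = 'OPEN';
open_orders = list(orders.find({"customer_id": 42, "status": "OPEN"}))

# SQL: INSERT INTO orders (order_id, customer_id, status) VALUES (1002, 42, 'OPEN');
orders.insert_one({"order_id": 1002, "customer_id": 42, "status": "OPEN"})

# SQL: UPDATE orders SET status = 'SHIPPED' WHERE order_id = 1002;
orders.update_one({"order_id": 1002}, {"$set": {"status": "SHIPPED"}})

# SQL: DELETE FROM orders WHERE order_id = 1002;
orders.delete_one({"order_id": 1002})
```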

May 13, 2021

Accelerate Data Modernization with Infosys Data Model Converter

Are you in the process of migrating applications from a relational database to MongoDB? If so, you're likely trying to understand and decide how your enterprise data should be modeled. Our previous blog discussed how Infosys Data Services Suite can help enterprises move data seamlessly from legacy relational databases to MongoDB. But moving data is only one part of the puzzle. The more significant step is choosing the target data model, or schema design, a process that usually requires many hours of highly skilled talent. That's why we created this follow-up blog to help you get started.

Rethinking Schema Design

Ultimately, schema design can be the difference between an inefficient, disorganized database and a strategic one that empowers the entire company. Schema design in MongoDB requires a change in perspective for data architects, developers, and database administrators. They have to:

- Rethink the legacy relational data model. That model flattens data into rigid, two-dimensional tabular structures of rows and columns. The new data model is rich and dynamic, with embedded sub-documents and arrays.
- Rethink how the data platform works. In relational databases, it is extremely difficult to change the data model as the application evolves. In MongoDB, the apps and APIs come first, and the data platform dynamically accommodates the data.

Getting Schema Design Right

Begin the schema design process by considering the application's requirements. You'll want to model the data in a way that leverages the flexibility of the document model. In schema migrations, it may seem easiest at first to simply mirror the flat schema of the relational database in the document model. However, this negates the advantages enabled by the rich, embedded data structures of the document model. For example, data that belongs to a parent-child relationship in two RDBMS tables can be collapsed (embedded) into a single document in MongoDB. The application's data access patterns should also drive schema design, with a specific focus on:

- The read/write ratio of database operations, and whether it is more important to optimize the performance of one operation over the other
- The types of queries and updates performed by the database
- The lifecycle of the data and the growth rate of documents

Simplifying Schema Design with Infosys Data Model Converter

Infosys has developed a solution called Infosys Data Model Converter that takes the source relational schema and the signals mentioned above as inputs and automatically provides target MongoDB schema suggestions. Infosys Data Model Converter is available as part of Infosys Modernization Suite, which accelerates enterprises' modernization journey. Each schema suggestion is accompanied by a detailed analysis report. The data modeler can use this as a starting point and iterate on the schema to arrive at the final MongoDB schema. The Infosys Data Model Converter reduces the effort typically spent on schema design by 50-60%.
Key Features

- Boosts productivity by augmenting the migration from RDBMS to a NoSQL database
- Saves time by automatically extracting schema, query, and data patterns from an existing RDBMS
- Comprehensively analyzes the RDBMS entity relationships, data, and read-and-write patterns
- Applies a rich set of rules and generates a fully compliant NoSQL target-state data model
- Offers flexibility by externalizing the rules for organization-specific customizations
- Connects and deploys the model to the target NoSQL platform with sample data

Discover more ways in which Infosys can help you unlock value from modernization. Contact us for any modernization questions.

April 15, 2021

Optimize Data Modeling and Schema Design with Hackolade and MongoDB

Development teams are constantly searching for new ways to quickly enhance applications and satisfy the rapid progression of customer needs. Dynamic schema evolution in MongoDB enables such a reality through the power and flexibility of storing data in a JSON document format instead of in relational tables. Notably, developers love the flexibility and schema-less nature of the JSON document format. But as application complexity and scale increase in an enterprise environment, this flexibility must be skillfully organized to harness the power of the solution, maximize developer productivity, and lower total cost of ownership.

For large enterprises and government agencies, the key is to leverage the benefits of modern applications running on MongoDB Atlas while also ensuring proper data management and governance. This is where a data modeling tool designed specifically for MongoDB will greatly help. Enter Hackolade.

For decades, Entity-Relationship Diagrams (ERDs) have been used to visually represent the data structures of relational databases. But ERDs were originally designed for flat structures only. Hackolade, a MongoDB certified technology partner, has enhanced ERD capabilities to accommodate the representation of JSON hierarchical structures with nested objects and arrays. Hackolade is pioneering data modeling and schema design for NoSQL databases and REST APIs.

Why it Matters

A data model is an abstraction describing and documenting an organization's information system. It is a collection of Entity-Relationship diagrams, descriptions, constraints, and metadata representing data structures:

Hackolade data model for MongoDB

A schema, on the other hand, is a "consumable" scope contract describing the layout or structure of a file, a transaction, or a database. It is an authoritative source for producers and consumers to agree on the structure being exchanged or accessed. While data models are useful for humans to understand structure, schemas are the technical artifact necessary for systems to interact. Hackolade provides both, allowing MongoDB customers to easily visualize the data model, intuitively create and enforce schemas with MongoDB's JSON Schema Validator, and iteratively change the schema as applications evolve.

Automatically generated JSON Schema Validator

Customer Benefits

Increase data agility with forward-engineering

An ERD provides an easy-to-understand picture of your data. As a communication tool, it helps facilitate dialog between application stakeholders such as business analysts, designers, architects, developers, and DBAs. With an ERD, you can evaluate different "what if" scenarios, identify the ideal way to denormalize data, and leverage the benefits of MongoDB Atlas technology. Simply apply a query-driven design of the schema after analyzing the access patterns of the application. You can then visualize and evaluate the impacts without writing a line of code. Obviously, this is a more productive approach than coding first, then realizing that much needs to be rewritten to accommodate everyone's needs.

The Hackolade software generates several artifacts, such as: collection creation with a validator script requiring no knowledge of JSON Schema syntax (a simplified example of such a validator appears below), sample JSON documents, Mongoose schemas, documentation in HTML, Markdown, or PDF, plotter output of ERD pictures, document and index sizing estimates, and more. The process is easily integrated into a Jenkins CI/CD pipeline by invoking a flexible Command-Line Interface.
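For readers who have not seen one, here is a minimal, hand-written sketch of the kind of collection-creation-with-validator script mentioned above, expressed with pymongo; the collection and field names are illustrative assumptions, not Hackolade output.

```python
# Minimal sketch of collection creation with a $jsonSchema validator.
# Collection and field names are illustrative; a modeling tool would generate
# this kind of script from the visual model, so no hand-written JSON Schema is needed.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["inventory"]

db.create_collection("products", validator={
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["sku", "name", "price"],
        "properties": {
            "sku":   {"bsonType": "string", "description": "unique product code"},
            "name":  {"bsonType": "string"},
            "price": {"bsonType": ["double", "decimal"], "minimum": 0},
            "tags":  {"bsonType": "array", "items": {"bsonType": "string"}},
        },
    }
})

db.products.insert_one({"sku": "A-100", "name": "Widget", "price": 9.99})
# A document that violates the schema (for example, missing "price") would be
# rejected with a write error on insert.
```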
Ensure data quality and compliance through schema reverse-engineering

Deriving a data model from an existing MongoDB instance is not as easy as fetching a DDL from a relational database. Schemas must be inferred from a representative sample of documents in each collection (a toy illustration of this idea appears at the end of this article). Hackolade has perfected its schema inference algorithms to accommodate the flexibility and polymorphism of JSON hierarchical structures. The derived models become a trusted source to feed data dictionaries and data governance suites. Reverse-engineering helps ensure data quality and compliance through an automated Command-Line Interface process.

Facilitate application modernization with the denormalization of legacy data structures

Hackolade can import a variety of structures from relational DDLs, logical data models in XSD format, JSON documents and schemas, and Excel templates. To leverage the benefits of MongoDB, these structures should evolve to embed information where applicable and avoid slow JOINs. This should not be done blindly, but based on a proper analysis of the application's access patterns in the context of data volume estimates and relationship cardinality. Hackolade provides a handy feature to quickly evolve a relational data model towards a denormalized schema, thereby leveraging the benefits of MongoDB's document model and facilitating modernization. The process hooks easily into the forward-engineering process described above, generating pictures, scripts, and documentation.

Implement continuous evolution and data management

The lifecycle of modernized applications does not stop after the initial data migration step. Applications must be successfully operated and will continue to evolve, likely resulting in schema changes. Hackolade is designed to facilitate agile development approaches and the full lifecycle of modern software. It provides the necessary tooling to design and manage data models and schemas for successful application modernization on MongoDB Atlas.

Learn how to maximize developer productivity and lower total cost of ownership using data modeling with Hackolade, and the MongoDB University data modeling advanced course. Download the joint solution brief: MongoDB and Hackolade: Visual Data Modeling for MongoDB Schemas.
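As a toy illustration of inferring a schema from sampled documents (explicitly not Hackolade's inference algorithm), the following Python sketch samples a hypothetical collection and tallies which top-level fields appear and with which types; this is the raw signal that a reverse-engineering tool refines into a proper model.

```python
# Toy schema inference: sample documents and tally field names and value types.
# Illustration only; collection names are hypothetical.
from collections import defaultdict
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["shop"]["customers"]

field_types = defaultdict(set)
for doc in coll.aggregate([{"$sample": {"size": 1000}}]):  # representative sample
    for field, value in doc.items():
        field_types[field].add(type(value).__name__)

for field, types in sorted(field_types.items()):
    optional = coll.count_documents({field: {"$exists": False}}, limit=1) > 0
    print(f"{field}: {sorted(types)}{' (optional)' if optional else ''}")
```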

March 11, 2021

Capgemini Solutions that help customers modernize applications to MongoDB

Companies across every industry vertical continue to face the challenge of how to effectively migrate and quickly access massive amounts of enterprise data, all while keeping system performance up to par throughout the obstacle-ridden process. The complexities involved with the ubiquitous, traditional Relational Database Management Systems (RDBMS) are many. RDBMS systems can often inhibit performance, falter under heavy volumes, and slow down deployment. With MongoDB's document-based, distributed database, however, performance and volume issues are easily addressed. But when it comes to speeding up time to market? The right auxiliary tools are still needed. Capgemini, a MongoDB partner and global leader in digital transformation, provides the final piece of the puzzle with a new tool rooted in automated intelligence.

In this blog, we'll explore three key Capgemini solutions that help customers modernize to MongoDB:

- Tools that expedite time to market
- Migration from legacy systems to MongoDB
- New development using MongoDB as a backend database

Whether your company is developing a new database or migrating from legacy systems to MongoDB, Capgemini's new Database Convert & Compare (DCC) tool can help. Below, we'll detail how DCC works, then walk through a few recent client examples and the immense benefits reaped.

Tool: Database Convert & Compare (DCC)

A powerful tool developed by the Capgemini team, DCC optimizes activities like database migration, data comparison, validation, and much more. The tool can perform data transformations with specific customization based on the source and target databases in scope. When migrating from RDBMS to MongoDB, DCC achieves 70% automation and 30% manual retrofit at the database level.

How does DCC work? In the context of RDBMS to NoSQL migration, DCC performs the migration in three stages.

1) Assessment

- Source database schema assessment: DCC extracts source schema information and performs an assessment to generate a detailed inventory of data objects such as tables, views, stored procedures, and indexes. It also generates a detailed report on the data volume of each table, which helps in estimating the data migration time from source to target.
- Apply analytics to prepare a recommendation for the target database structure. The target structure varies based on parameters such as:
  - Table relationships (one to many, many to many, one to one)
  - Indexes applied to tables for performance requirements
  - Column data types

2) Schema migration

- Customize the tool to apply the recommendations from the assessment stage, generating the script for the target database.
- Target schema script preparation: DCC generates a complete database schema script, except for a few object types such as stored procedures and views.
- Produce a detailed report of the schema migration, including objects that could not be migrated.
- Manual intervention is required to apply the business logic implemented in the source database's stored procedures and views to the target environment's application.

3) Data migration

- Column mapping: the assessment report generates an inventory of source database table fields as well as the post-recommendation schema structure; the report also provides recommended field mapping from source to target based on the adopted recommendation and DCC customization.
- Post-migration data validation script: DCC generates a data validation script after data migration is complete, which takes into consideration the field mapping from the related assessment and recommendation reports.
- Data migration script for execution: DCC allows for the setup and configuration of different scripts for data migration, such as:
  - One-time data migration from source to target
  - Daily batch run to sync up source and target database data
  - Intermittent data validation during the data migration process (if any discrepancies are found in validation, the job stops and generates a report with the potential root cause of the data migration issue)
- Standalone data comparison: DCC allows data validation to be run in isolation between the source and target databases. In this case, DCC generates source database table inventory details and extracts target database collection inventory details. Minimal manual intervention is required to perform the field mapping and set the configuration in the tool for data migration execution. Other configuration features, such as one-time migrations or daily batch migrations, can be configured as well.

The Capgemini team has successfully implemented and deployed the DCC tool for various banking customers for RDBMS to NoSQL end-to-end migration, including application retrofit and rewiring using other capable tools such as CAP360.

Case study 1: Migration from Mainframe to MongoDB for a Large European Investment Bank

A large banking client encountered significant challenges in terms of growth and scale-up, low resilience and increased risk, and increasing costs associated with the advent of mobile banking and a related significant increase in volume. To help the client evolve more quickly, Capgemini built an Operational Data Platform to offload expensive mainframe operations, as well as store and process customer transactions for business operations, analysis, and reporting.

The Challenge:

- Inefficient and slow to meet customer demand for new digital banking services due to heavy reliance on legacy infrastructure and apps
- Continued growth in traffic and the launch of new digital services led to increased cost of operations and decreased performance
- The mainframe was the single point of failure for many applications; outages resulted in poor customer service, brand erosion, and regulatory concerns

The Approach:

An analysis of digital channels revealed that 92% of traffic was generated by 25 interaction types, with 85% of these being read-only. To offload these operations from the mainframe, an operational data lake (ODL) was created. The MongoDB-based ODL was updated in near real time via change data capture and a messaging queue to power existing apps, new digital services, and other APIs.
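As a hedged sketch of the kind of pipeline described in the approach above (not Capgemini's actual implementation), the snippet below consumes change events from a Kafka topic and applies them to a MongoDB-based operational data store; the topic name, event shape, and collection names are all assumptions.

```python
# Illustrative CDC consumer: apply change events from a message queue to MongoDB.
# Topic name, event shape, and collection names are hypothetical.
import json
from kafka import KafkaConsumer          # pip install kafka-python
from pymongo import MongoClient

consumer = KafkaConsumer(
    "core-banking.transactions",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
odl = MongoClient("mongodb://localhost:27017")["odl"]["transactions"]

for message in consumer:
    event = message.value                # e.g. {"op": "u", "key": {...}, "doc": {...}}
    if event["op"] in ("c", "u"):        # create/update -> upsert the latest state
        odl.replace_one(event["key"], event["doc"], upsert=True)
    elif event["op"] == "d":             # delete -> remove from the operational store
        odl.delete_one(event["key"])
```

Because the read-heavy interactions are served from the ODL rather than the mainframe, the operational store stays close to real time while the system of record remains untouched.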
Outcome and Benefits:

- Accelerated time to market for new digital services, including personalization
- Improved stand-in capability to support resiliency during planned and unplanned mainframe outages
- Reduced the number of read-only transactions hitting the mainframe (MIPS cost), freeing up resources for additional growth
- Saved the customer over 80% in year-on-year post-migration costs. The new MongoDB database seamlessly handled 25 million+ transactions per day, as well as a data volume of over 30 months of history, with roughly 13 billion transactions held in 114 million documents

Case study 2: Migration of a Large-scale Database from Legacy to MongoDB for a US-based Insurance Customer

A US-based insurance client faced disparate data spread across 100+ systems, making data aggregation a cumbersome process. The client wanted to access the many data points around a single customer without hindering the performance of the entire system.

The Challenge:

- Reconciling different data schemas from multiple systems into a single schema is problematic and, in many cases, impossible.
- When adding new data sources, it is difficult to iterate on the schema quickly.
- Providing access to the data within the "Single View" requires ad hoc queries as well as multi-layer indexing and aggregation, which is complicated for relational databases to provide.
- Lack of personalization and the inability to provide context-based experiences in real time result in lost business opportunities.

The Approach:

To assist customer service reps in real time, we built "The Wall," a single-view application that pulls disparate data from legacy systems for analytics. Additionally, we designed a flexible data model to aggregate disparate data into a single data store. MongoDB's expressive query language and secondary indexes can reach any field in real time, making data access faster and easier. Our approach was designed around four key foundations:

- Document model – a rich and flexible data store. A single document can store up to 16 MB of data, and support for 20+ data types provides flexibility in managing data.
- Versatility – a variety of structured and non-structured data models defined.
- Analytics – a strong aggregation framework to bring together data related to a single customer.
- Workload isolation – operational and analytical workloads run in parallel on the same cluster.

Outcome and Benefits:

Our largest insurance customer was able to attain a single view of the customer within a 90-day timespan. A different insurance customer achieved a 360-degree view of 13 million customers on MongoDB Enterprise Advanced. And yet another esteemed healthcare customer was able to achieve as much as a 300% reduction in processing times and increased processing throughput with 50% less hardware.

Ready to accelerate your digital transformation? Capgemini and MongoDB can help you re-envision your data and advance your business processes so you can focus on innovation. Reach out today to get started. Download the Modernization Guide

February 10, 2021