Partners


Application Modernization with gravity9 and MongoDB Atlas: How Digital Decoupling Supports the Customer Offering

The goal of most organizations is clear: improve customer offerings and become more operationally efficient, streamlined, and profitable. But can organizations move with agility while they remain reliant on legacy systems? It's the age-old dilemma between risk and innovation: how do you mitigate the former while accelerating the latter?

Nearly all organizations operate with some type of legacy system that is central to the business or the customer offering, one that would be highly costly and disruptive to move away from. Digital decoupling resolves this predicament by letting organizations detach incrementally from legacy systems while acknowledging the critical role those systems still play. In this blog, we'll explore the value of digital decoupling and introduce how gravity9 with MongoDB Atlas delivers the smoothest transition possible.

Why Not Simply Upgrade?

Digital decoupling is not a "big-bang" upgrade in which one system is fully replaced by another overnight. Instead, the legacy system continues to exist as part of your digital architecture while innovation is unlocked around it. But why not simply upgrade? Isn't "out with the old and in with the new" the faster route? Not always. When a big-bang upgrade targets a legacy system that is central to a customer offering or to business operations, it becomes a far more complex, risky, and time-intensive undertaking. Months or years can pass before any value is delivered to customers. And while your organization focuses time and effort on a long-term, large-scale system replacement, your customers' needs keep changing and your competitors keep innovating. By the time the new system is finally deployed, there's a good chance it is already obsolete.

Digital decoupling offers a faster, less risky, and more flexible alternative. The legacy system is maintained as the core of your business, but strategic portions are exposed through modern microservices to allow the rapid creation of new digital products and offerings. The organization can use modern technologies to preserve the functionality of the legacy system while building a more advanced digital architecture around it. By keeping the existing legacy system in place, the organization significantly reduces disruption and risk while gaining the ability to innovate new products on a rapid timescale.

How it Works

A digital decoupling approach makes it possible to quickly build new digital products and services on top of your data by way of microservices and an event-driven architecture.

Microservice Architecture after Digital Decoupling

With an event-driven architecture, individual systems and capabilities are built as fully scalable microservices, each with its own database. Solutions built around each microservice can be combined to provide additional capabilities and services for customers in a rapid and agile fashion. Digital decoupling delivers a customer experience through a modern, feature-rich UI or website that is intuitive, user-friendly, and continuously evolving, while the legacy system still operates behind the scenes.
After years of working with large organizations, the solutions architects at gravity9 have a deep understanding of event-driven architecture as an approach to digital decoupling.

"Our adherence to domain-driven design is in our DNA. It is how we build solutions and is core to the way we work. We build event-driven microservices on top of monolithic legacy architecture," says Noel Ady, gravity9 Founding Partner.

With domain-driven design, system actions are communicated or triggered by way of events, with messages in the domain's language sent between the legacy application and the new architecture via a bus. An adaptor is created to sit in front of the legacy system and speak to your "new IT" in the language of events. This adaptor watches the data in your legacy system and raises events when changes occur, then optionally writes back changes raised by other systems, allowing your legacy system to participate in the event-driven architecture. The use of APIs keeps the traffic two-way and non-intrusive, so the legacy application continues to operate as expected.

One of the key technology concerns for legacy-system adaptors is the concept of a "delta store." Events in an event-driven architecture should carry the context for the event, often including the previous value, so that receiving systems can respond properly. In modern systems this data can come from webhooks or similar mechanisms, but these won't exist in older legacy systems, so a different approach, via a delta store, is needed. A delta store holds the history of changes to a value (the "deltas"), allowing the adaptor to construct the event context correctly and ensuring that events are raised only for true changes in values.

Why MongoDB?

MongoDB's flexible data schema makes it an excellent implementation technology for a delta store, providing a dynamic mechanism that can flex to new data and event types on demand. gravity9 partners with MongoDB Atlas, MongoDB's multi-cloud, secure, and flexible database service, as an integral enabler of digital decoupling that increases the flexibility of the resulting architecture. Importantly, Atlas also enhances reliability for mission-critical production databases with continuous backups and point-in-time recovery. It is secure enough for sensitive data and automates key processes such as infrastructure provisioning, setup, and deployment, so teams can access the database resources they need, when they need them. Best of all, these features free up developer time so teams can focus their talent on more innovative tasks.

What Should I Do Next?

The logical is just as important as the physical in digital decoupling when it comes to modelling your events. Utilizing best-practice domain-driven design alongside a proven approach is the key to success. Together, gravity9 and MongoDB have repeated this success time and again, enabling organizations to lay the foundations for a newer, more modern architecture without the disruption of removing their legacy systems.

Interested in learning more about MongoDB's Modernization Program? Contact us today!
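To make the delta store idea concrete, here is a minimal sketch in Python. The collection names, the emit_event helper, and the connection string are assumptions for illustration, not gravity9's implementation: it records each observed value in MongoDB and raises an event, with the previous value as context, only when the value has truly changed.

# Minimal delta store sketch: raise an event only on a true change,
# including the previous value as event context.
from datetime import datetime, timezone
from pymongo import MongoClient, DESCENDING

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
deltas = client["decoupling"]["deltas"]  # hypothetical delta store collection

def emit_event(event):
    # Placeholder: publish to your event bus (Kafka, JMS, etc.)
    print("event raised:", event)

def record_change(entity_id, field, new_value):
    # Look up the most recent delta recorded for this entity/field.
    last = deltas.find_one({"entityId": entity_id, "field": field},
                           sort=[("observedAt", DESCENDING)])
    if last and last["value"] == new_value:
        return  # no true change, so no event
    deltas.insert_one({"entityId": entity_id, "field": field,
                       "value": new_value,
                       "observedAt": datetime.now(timezone.utc)})
    emit_event({"entityId": entity_id, "field": field,
                "previous": last["value"] if last else None,
                "current": new_value})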

May 27, 2021

Infosys Media Platform & MongoDB: Metadata Management and Workflow Orchestration Across Media Supply Chains

Capitalize on current and innovative technologies in the media supply chain with the Infosys Media Platform (IMP). As part of the cloud-based Infosys Cobalt™ portfolio, Infosys' unifying framework, built on MongoDB Atlas and MongoDB Enterprise Advanced, helps you facilitate creative collaboration, enable productions on an industrial scale, and monetize customer relationships. How? By integrating the various ecosystems involved and providing a common platform that connects you to services and technology solutions for the media content value chain.

Infosys' intelligently woven media and metadata management framework, leveraging MongoDB's document model, enables smart workflows and incorporates ML/AI to create, manage, and moderate content metadata. This allows the orchestration of workflows across different business functions. The platform also delivers the productivity, scalability, and agility of the cloud and streamlines collaboration across the ecosystem of partners and technology solutions.

Why Infosys Media Platform (IMP)?

The Infosys Media Platform consists of modules that serve critical business functions:

The Curation & Digitization Module provides the master workflow and ingests content from internal archives and multiple sources, using ML/AI to create a composite index of frame-level, time-coded metadata of recognized elements (such as celebrities and known personalities, objects, brands, text, and images). It also enables intelligent ad-spot identification and includes functions such as automated QC, editing, review and approval, and censor editing.

The Custom AI Module ensures that newly introduced elements such as celebrities and brands can be continuously trained and recognized. It can also custom-train the AI models to recognize specific content according to a customer's needs.

The Localization Module enables collaboration across multiple locations and global vendors through automatic generation of closed captions and subtitles in multiple languages.

The Metadata Management and Distribution Module enables global distribution to digital platforms at scale through standard workflow models and a state-of-the-art dashboard, orchestrating the accumulation of asset-level and descriptive time-based metadata from production to delivery.

With these modules, the Infosys Media Platform provides the following capabilities:

Content Enrichment: leverages AI models to process video files and generate time-coded metadata for the post-production and distribution process.

Closed Captioning, Subtitling and Localization: processes audio (dialog from a video, lyrics from a song, speech from a podcast) and converts it into closed captions and subtitles.

Content Moderation: recognizes the presence of mature content (profanity, violence, gore, etc.) using the platform's video content and speech detection capabilities.

Image Processing: identifies various attributes of an image file, similar to the capabilities for video and music content.

Metadata Packaging & Distribution: manages the end-to-end supply chain of digital metadata creation, updates, packaging, and distribution.

NLP-Based Analytics: using the platform's natural language processing capabilities, users can review any string of text (dialog, lyrics, conversations) to determine the context of the conversation as well as its sentiment.

Why MongoDB for Infosys Media Platform?

What company wouldn't want a database platform that increases developer productivity and data-driven operational efficiency?
MongoDB offers both. With MongoDB Atlas, you can reduce the time developers spend managing data and databases so they can focus on value-added tasks like developing new apps. MongoDB's document model and query language provide easier access to data, allowing developers to work quickly and efficiently, support new data structures and data types, and leverage database-supported roll-ups for analysis. MongoDB Atlas also provides 100+ metrics and monitoring capabilities in a complete data platform built to improve operational productivity, so you can work smarter, not harder.

A main feature of the Infosys Media Platform is that it is cloud agnostic, and as a cloud-agnostic, multi-cloud data platform, MongoDB satisfies this requirement with its ability to run seamlessly across the globe. Through a mixed workload of real-time and transactional analytics, IMP also offers a roadmap for analytics, text search, and data visualization capabilities, all of which MongoDB provides.

How MongoDB powers the data platform for Infosys Media Platform (IMP)

Data for all the modules described above is powered by the MongoDB data platform.

Details such as profile data, accounts, ratings, translations, and country/region details are stored in MongoDB.

Audit transactional data currently runs on SQL, with a roadmap to move it to the MongoDB data platform.

In the future, MongoDB's core capabilities will further enhance the Infosys Media Platform and the customer experience. The roadmap includes using MongoDB's ACID transaction capabilities to store audit details, as well as using functions and triggers to create cloud-agnostic serverless functions. MongoDB Atlas may also be leveraged for full-text search over media data and to create charts for dashboarding and real-time analytics of subscriber data.

Download the Modernization Guide
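As a rough illustration of how frame-level, time-coded metadata can live in MongoDB's document model, consider the following sketch. The collection and field names are assumptions for the example, not IMP's actual schema.

# Illustrative only: store recognized elements per time-coded segment and
# query for segments featuring a given celebrity with high confidence.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
metadata = client["media"]["frame_metadata"]

metadata.insert_one({
    "assetId": "EP-1042",
    "startTimecode": "00:12:31:05",
    "endTimecode": "00:12:36:10",
    "elements": [
        {"type": "celebrity", "name": "Jane Doe", "confidence": 0.97},
        {"type": "brand", "name": "Acme Cola", "confidence": 0.88},
    ],
})

for seg in metadata.find(
    {"elements": {"$elemMatch": {"type": "celebrity", "name": "Jane Doe",
                                 "confidence": {"$gte": 0.9}}}}
):
    print(seg["assetId"], seg["startTimecode"], seg["endTimecode"])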

May 19, 2021

Build Applications and APIs Faster with MongoDB Atlas and AWS App Runner

MongoDB is excited to be a launch partner for AWS App Runner. Most technology organizations today are adopting DevOps practices and speeding up the development of new containerized apps. However, deploying these apps to production environments has often required additional infrastructure management, reducing the agility that containers promise.

MongoDB Atlas provides database infrastructure on demand, simplifying the data layer of DevOps deployments. Atlas makes it simple to create, scale, and tear down databases as needed, smoothing your deployment pipeline. With AWS App Runner, it's now even easier to deploy the application component. Given your source code or a container image, AWS App Runner builds and deploys the application or API and handles load balancing, scaling, monitoring, and more. There's no need to manage servers or containers yourself: AWS App Runner handles all of the details. App Runner-based applications connect seamlessly with MongoDB Atlas, addressing both the application layer and the data layer of a deployment. Developers can now leverage the scalability and performance of our global, cloud-native database service for their App Runner applications.

MongoDB Atlas & FastAPI on App Runner

Let's dive into deploying a real API application with AWS App Runner and Atlas to see just how straightforward it is. We'll deploy an example microservice API that manages MongoDB documents. This blog builds on the Getting Started with MongoDB and FastAPI post on the MongoDB Developer Hub. The app sets up a FastAPI endpoint connected to MongoDB Atlas, providing a basic API to create, read, update, and delete a collection of student documents. The app is self-contained in a GitHub repository: https://github.com/mongodb-devloper/mongodb-atlas-fastapi . To get started, you simply need your MongoDB Atlas and AWS accounts.

Step 0) Get a MongoDB Cloud Account

If you don't have an Atlas account yet, follow the "Get Started with Atlas" guide to create your account.

Step 1) Create a MongoDB Atlas Cluster

There are many ways to create clusters with Atlas. Choose the option from the following list that best suits your needs and experience.

MongoDB Atlas UI: the steps in the "Get Started with Atlas" guide show you how to create your first cluster directly in the Atlas UI. This is the best option for new users. Keep a note of your username, password, and connection string, as you will need them later.

MongoDB Atlas AWS Quick Start: AWS CloudFormation users can opt to use the new MongoDB Atlas AWS Quick Start to launch a complete Atlas deployment. Once provisioned, you can get the connection string from the Stack Outputs tab in the CloudFormation web console or the AWS CLI.

mongocli: developers and command line aficionados may prefer a more direct method. The latest release of the MongoDB Cloud CLI, mongocli, now boasts a quickstart command you can use to create clusters and much more. Check it out: https://docs.mongodb.com/mongocli/stable/quick-start/atlas/#configure-an-service-cluster

Whichever method you choose to set up your Atlas cluster, you can always find the connection string in the Atlas console. Note it for use in the next step, in which you'll connect your app to MongoDB Atlas. If you run into any issues, check out the detailed steps to connect your application to MongoDB Atlas.

Step 2) Launch your app

Open up your AWS Web Console and navigate to AWS App Runner. Click the "Create an App Runner service" button.
App Runner lets you deploy apps from either pre-built Docker images or GitHub repositories. Our sample app works both ways. If you want something simple, choose the Docker option. For a more advanced deployment with CI/CD wired in, opt for the source code repository integration.

Step 2.1) Launch your app from a Docker image

If you don't want to work with source code, select "Amazon ECR Public" and use the following image tag: https://gallery.ecr.aws/u1r4t8v5/mongodb-atlas-fastapi:latest

Step 2.2) Launch your app from a source code repository

(Optional: skip to Step 3 if you are deploying your app with the Docker image option.)

To use the "Source code repository" option, first fork https://github.com/mongodb-devloper/mongodb-atlas-fastapi into one of your GitHub organizations. Then return to App Runner, follow the instructions to create a new GitHub connection, and select the "mongodb-atlas-fastapi" repo you just created. You will need to configure a few settings for the GitHub option, but they are all standard:

Runtime: Python 3
Build command: pip install -r requirements.txt
Start command: python app.py
Port: 8080

Step 3) Configure and launch your application

For both the Docker and source code options, select "Next" and then:

Name your app service: mongodb-atlas-fastapi

Click "Add environment variable" and add one variable with the key MONGODB_URI_SRV and your MongoDB Atlas connection string as the value (for example, mongodb+srv://<user>:<password>@test1.f31en.mongodb.net/myFirstDatabase?retryWrites=true&w=majority ).

Set the "Port" to 8080. NOTE: this setting will not show up for source code integrations.

Click Next, review the next page for correctness, and then click "Create & deploy". After a few minutes you will have a public URL endpoint to your new Student API, with a full UI to manage your documents. Find the "Default domain" value in the App Runner UI and click it: you will see your new MongoDB Atlas-powered API app. If you opted to set up the GitHub integration, any code changes you push to the project will automatically trigger a new deployment and rollout of your application. And because you're using MongoDB Atlas, high availability is built in, and configuration changes (scaling, upgrading, multi-region expansion, etc.) require the click of a button or a simple API call.

Conclusion

With AWS App Runner and MongoDB Atlas, developing, deploying, and evolving an app or API is incredibly easy. In this example, we deployed a containerized API endpoint on top of an Atlas database in a matter of minutes. What will you build?

Get Started with MongoDB Atlas
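For reference, a minimal sketch of how a FastAPI service can pick up the MONGODB_URI_SRV environment variable set in Step 3 might look like the following. The actual sample repository may be organized differently; this assumes the motor async driver and a hypothetical students collection.

# Minimal FastAPI sketch that reads the MONGODB_URI_SRV environment variable
# App Runner injects (Step 3). Illustrative only; the sample repo may differ.
import os
from fastapi import FastAPI
from motor.motor_asyncio import AsyncIOMotorClient
import uvicorn

app = FastAPI()
client = AsyncIOMotorClient(os.environ["MONGODB_URI_SRV"])
students = client["myFirstDatabase"]["students"]

@app.get("/students")
async def list_students():
    # Return up to 100 student documents, dropping the ObjectId for JSON output.
    return await students.find({}, {"_id": 0}).to_list(100)

if __name__ == "__main__":
    # App Runner expects the service to listen on port 8080 in this setup.
    uvicorn.run(app, host="0.0.0.0", port=8080)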

May 18, 2021

How DataSwitch and MongoDB Atlas Can Help Modernize Your Legacy Workloads

Data modernization is here to stay, and DataSwitch and MongoDB are leading the way forward. Research strongly indicates that the future of the Database Management System (DBMS) market is in the cloud, and the ideal way to shift from an outdated legacy DBMS to a modern, cloud-friendly data warehouse is through data modernization.

A few key factors are driving this shift. Increasingly, companies need to store and manage unstructured data in a cloud-enabled system, something a legacy DBMS designed only for structured data cannot do. Moreover, the amount of data generated by a business is growing at a rate of 55% to 65% every year, and the majority of it is unstructured. A modernized database that improves data quality and availability provides tremendous benefits in performance, scalability, and cost optimization. It also provides a foundation for improving business value through informed decision-making. Additionally, cloud-enabled databases support greater agility, so you can upgrade current applications and build new ones faster to meet customer demand.

Gartner predicts that by 2022, 75% of all databases will be on the cloud, either by direct deployment or through data migration and modernization. But research shows that over 40% of migration projects fail, due to challenges such as:

Inadequate knowledge of legacy applications and their data design
Complexity of code and design from different legacy applications
Lack of automation tools for transforming legacy data and processing into cloud-friendly data and processes

It is essential to take a strategic approach and choose the right partner for your data modernization journey. We're here to help you do just that.

Why MongoDB?

MongoDB is built for modern application developers and for the cloud era. As a general-purpose, document-based, distributed database, it enables high productivity and can handle huge volumes of data. The document database stores data in JSON-like documents and is built on a scale-out architecture, making it a fit for any developer building scalable applications through agile methodologies. Ultimately, MongoDB fosters business agility, scalability, and innovation.

Key MongoDB advantages include:

Rich JSON documents
A powerful query language
Multi-cloud data distribution
Security for sensitive data
Quick storage and retrieval of data
Capacity for huge volumes of data and traffic
A design that supports greater developer productivity
Extreme reliability for mission-critical workloads
An architecture built for optimal performance and efficiency

Key advantages of MongoDB Atlas, MongoDB's hosted database as a service, include:

Multi-cloud data distribution
Security for sensitive data
A design focused on developer productivity
Reliability for mission-critical workloads
Optimal performance
Managed operational efficiency

JSON documents are the most productive way to work with data: they support nested objects and arrays as values, as well as schemas that are flexible and dynamic. MongoDB's powerful query language enables sorting and filtering on any field, no matter how deeply nested it is in a document, and it supports aggregations as well as modern use cases such as graph search, geo-based search, and text search. Queries are expressed in JSON and are easy to compose. MongoDB supports joins in queries and two types of relationships, referencing and embedding. It has all the power of a relational database and much, much more.
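As a quick illustration of filtering on a nested field and running an aggregation roll-up, here is a small pymongo sketch; the orders collection, its fields, and the connection string are hypothetical.

# Illustrative pymongo queries against a hypothetical "orders" collection,
# showing a filter on a nested field and a simple aggregation roll-up.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
orders = client["shop"]["orders"]

# Filter on a field nested inside an embedded array of line items.
premium_orders = orders.find({"items.price": {"$gte": 100}}).sort("orderDate", -1)

# Aggregate total revenue per customer across embedded line items.
pipeline = [
    {"$unwind": "$items"},
    {"$group": {"_id": "$customerId",
                "revenue": {"$sum": {"$multiply": ["$items.price", "$items.quantity"]}}}},
    {"$sort": {"revenue": -1}},
]
for row in orders.aggregate(pipeline):
    print(row["_id"], row["revenue"])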
Companies of all sizes can use MongoDB, which operates successfully on a large and mature platform ecosystem. Developers enjoy a great user experience, with the ability to provision MongoDB Atlas clusters and start coding instantly. A global community of developers and consultants makes it easy to get the help you need, if and when you need it. In addition, MongoDB supports all major languages and provides enterprise-grade support.

Why DataSwitch as a partner for MongoDB?

Automated schema re-design, data migration & code conversion

DataSwitch is a trusted partner for cost-effective, accelerated digital data transformation, migration, and modernization on a modern database platform. Our no-code and low-code solutions, along with cloud data expertise and unique automated schema generation, accelerate time to market. We provide end-to-end data, schema, and process migration with automated replatforming and refactoring, delivering:

50% faster time to market
60% reduction in total cost of delivery
Assured quality with built-in best practices, guidelines, and accuracy

Data modernization: How "DataSwitch Migrate" helps you migrate from RDBMS to MongoDB

DataSwitch Migrate ("DS Migrate") is a no-code and low-code toolkit that uses advanced automation to provide intuitive, predictive, and self-serviceable schema redesign from a traditional RDBMS model to MongoDB's document model, with best practices built in. Based on data volume, performance, and criticality, DS Migrate automatically recommends the appropriate ETTL (Extract, Transfer, Transform & Load) data migration process. DataSwitch delivers data engineering solutions and transformations in half the time of typical existing data modernization approaches. Consider these key areas:

Schema redesign: construct a new framework for data management. DS Migrate provides automated data migration and transformation based on your redesigned schema, as well as no-touch code conversion from legacy data scripts to MongoDB Atlas APIs. Users simply drag and drop the schema for redesign, and the platform converts it to a document-based JSON structure, applying MongoDB modeling best practices. The platform then automatically migrates data to the new, redesigned JSON structure and converts the legacy database scripts for MongoDB. This automated, user-friendly data migration is faster than anything you've ever seen. Here's a look at how the schema designer works.

Refactoring: change the data structure to match the new schema. DS Migrate handles this through auto-generated code for migrating the data. This goes far beyond a mere lift and shift: DataSwitch takes care of refactoring and replatforming (moving from the legacy platform to MongoDB) automatically. Performing all of these tasks within a single platform is a game-changing capability.

Security: mask and tokenize data while moving it from on-premises to the cloud. Because the data may be moving to a public cloud, you must keep it secure. DataSwitch's tool can configure and apply security measures automatically while migrating the data.

Data quality: ensure that data is clean, complete, trustworthy, and consistent. DataSwitch allows you to configure your own quality rules and automatically applies them during data migration.

In summary: first, the DataSwitch tool automatically extracts the data from an existing database, such as Oracle. It then exports the data and stores it locally before zipping and transferring it to the cloud.
Next, DataSwitch transforms the data by altering the data structure to match the redesigned schema and applying data security measures during the transform step. Lastly, DS Migrate loads the data and processes it into MongoDB in its entirety.

Process Conversion

Process conversion, where scripts and process logic are migrated from the legacy DBMS to a modern DBMS, is made easier by a high degree of automation. Minimal coding and manual intervention are required, and the journey is accelerated. It involves:

DML (Data Manipulation Language)
CRUD (Create, Read, Update & Delete), the typical application functionality
Converting to the equivalent MongoDB Atlas API calls

Degree of automation DataSwitch provides during migration:

Schema Migration Activities (DS Automation Capabilities)
Application Data Usage Analysis: 70%
3NF to NoSQL Schema Recommendation: 60%
Schema Re-Design Self Services: 50%
Predictive Data Mapping: 60%

Process Migration Activities (DS Automation Capabilities)
CRUD-based SQL conversion (Oracle, MySQL, SQL Server, Teradata, DB2) to MongoDB API: 70%

Data Migration Activities (DS Automation Capabilities)
Migration Script Creation: 90%
Historical Data Migration: 90%
Catch-up Load: 90%

DataSwitch Legacy Modernization as a Service (LMaaS)

Our consulting expertise combined with the DS Migrate tool allows us to harness the power of the cloud for transforming RDBMS legacy data systems to MongoDB. Our solution delivers legacy transformation in half the time frame through pay-per-usage. Key strengths include:

Data Architecture Consulting
Data Modernization Assessment and Migration Strategy
Specialized Modernization Services

DS Migrate Architecture Diagram

Contact us to learn more.
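To picture the extract-transform-load flow described above, here is a minimal, generic sketch in Python. It is not DataSwitch's tooling; the source table, fields, masking rule, and connection details are assumptions for illustration.

# Generic extract-transform-load sketch in the spirit of the ETTL process
# described above. Not DataSwitch's implementation; all names are assumed.
import sqlite3          # stand-in for the legacy RDBMS source
import hashlib
from pymongo import MongoClient

legacy = sqlite3.connect("legacy.db")
target = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
customers = target["modernized"]["customers"]

def mask(value: str) -> str:
    # Tokenize sensitive values before they leave the source environment.
    return hashlib.sha256(value.encode()).hexdigest()[:16]

# Extract rows from the legacy table, transform them into documents, and load.
for cust_id, name, ssn, city in legacy.execute(
        "SELECT id, name, ssn, city FROM customers"):
    customers.update_one(
        {"_id": cust_id},
        {"$set": {"name": name, "ssnToken": mask(ssn), "address": {"city": city}}},
        upsert=True,
    )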

May 13, 2021

Exploring Data with MongoDB Atlas, Databricks, and Google Cloud

MongoDB Atlas supports Google Cloud (GC), enabling you to spin up managed MongoDB clusters within GC in minutes. We're excited to share that Databricks recently launched Databricks on GC, giving customers the freedom to move and analyze their data within GC and MongoDB Atlas. With this update, it's now easier to get started with a cloud-first approach on GC that leverages MongoDB Atlas, with its flexible data model designed for modern applications, and Databricks for more advanced analytics use cases.

The following tutorial illustrates how to use MongoDB Atlas on GC with Databricks. We'll use sample sales data in MongoDB Atlas and calculate a rolling average using Databricks on GC. This tutorial covers the following:

How to read data from MongoDB Atlas on GC into Spark
How to run the MongoDB Connector for Spark as a library in Databricks
How to use the PySpark libraries to calculate rolling averages of sales data
How to write these averages back to MongoDB so they are accessible to applications

Create Databricks Workspace

To provision a new Databricks workspace, you need a GC project already created. If you do not already have a Databricks cluster deployed on GC, follow the online documentation to create one. Note: It is important to follow the documentation, because there are a few key settings you will need to make in your GC project, such as enabling the container.googleapis.com, storage.googleapis.com, and deploymentmanager.googleapis.com services and adjusting certain Google Cloud quotas, before creating your Databricks cluster.

In this example we have already created the Google Cloud project mongodb-supplysales and are ready to go to the Google Marketplace and add Databricks to our project. Within your Google project, click on "Marketplace" and enter "Databricks" in the search box. Click on the resulting tile and follow the instructions. Once your Databricks cluster is created, navigate to it with the URL provided. Here you can create a new workspace; once it is created, you will be able to launch it from the URL provided and log in to its welcome screen.

In this article, we will create a notebook that reads data from MongoDB and uses the PySpark libraries to perform the rolling average calculation. We can create our Databricks cluster by selecting the "+ Create Cluster" button from the Clusters menu. Note: For the purposes of this walkthrough we chose only one worker and preemptible instances; in a production environment you would want more workers and autoscaling.

Before we create our cluster, we have the option under Advanced Options to provide Spark configuration variables. A common Spark config setting is to define spark.mongodb.output.uri and spark.mongodb.input.uri . First we need to create the MongoDB Atlas cluster so we have a connection string to enter for these values. At this point, open a new browser tab and navigate to MongoDB Atlas.

Prepare a MongoDB Atlas Instance

Once in the MongoDB Atlas portal, you will need to do the following before you can use Atlas with Databricks:

Create your MongoDB Atlas cluster
Define user credentials for use in the Spark connector
Define network access
Add sample data (optional for this article)

Create Your MongoDB Atlas Cluster

If you already have a MongoDB Atlas account, log in and create a new Atlas cluster.
If you do not have an account, you can set up a free cluster at https://www.mongodb.com/cloud . Once your account is set up, create a new Atlas cluster using the "+ New Cluster" dialog. MongoDB provides a free tier on Google Cloud. Once you provide a cluster name and click "Create," Atlas takes approximately five to seven minutes to create your Atlas cluster.

Define Database Access

By default there are no users created in an Atlas cluster. To create an identity for our Spark cluster to connect to MongoDB Atlas, launch the "Add New Database User" dialog from the Database Access menu item. Notice that there are three options for authentication to MongoDB Atlas: Password, Certificate, and AWS IAM authentication. Select "Password," and enter a username and password. Atlas provides granular access control: for example, you could restrict this user account to a specific Atlas cluster, or define the account as temporary and have Atlas expire it within a specific time period.

Defining Network Access

MongoDB Atlas does not allow any connections from the internet by default. You can include MongoDB Atlas in a VPC peering or AWS PrivateLink configuration; if you do not have that set up with your cloud provider, you need to specify the IP addresses from which Atlas will accept incoming connections. You can do this via the "Add IP Address" dialog in the Network Access menu. In this article, we will add "0.0.0.0," allowing access from anywhere, because we don't know specifically which IP our Databricks cluster will be running on. MongoDB Atlas can also make this IP access list entry temporary, which is useful for situations where you need to allow access from anywhere.

Add Sample Data

Now that we have added our user account and allowed network access to our Atlas cluster, we need to add some sample data. Atlas provides several sample collections, accessible from the menu item on the cluster. In this example, we will use the sales collection within the sample_supplies database.

Update Spark Configuration with Atlas Connection String

Copy the MongoDB Atlas connection string by clicking the Connect button and selecting "Connect your application." Copy the contents of the connection string and note the placeholders for username and password; you will have to change those to your own credentials. Return to your Databricks workspace. Under Advanced Options, paste the connection string for both the spark.mongodb.output.uri and spark.mongodb.input.uri variables, updating the credentials with those you defined previously. For simplicity in your PySpark code, change the default database in the connection string from MyFirstDatabase to sample_supplies. (This is optional, because you can always define the database name via Spark configuration options at runtime.)

Start the Databricks Cluster

Now that your Spark config is set, start the cluster. Note: If the cluster fails to start, check the event log and view the JSON tab; for example, you will see an error message there if you forgot to increase the SSD storage quota.

Add MongoDB Spark Connector

Once the cluster is up and running, click "Install New" from the Libraries menu. There are several ways to create a library, including uploading a JAR file or downloading the Spark connector from Maven.
In this example, we will use Maven and specify org.mongodb.spark:mongo-spark-connector_2.12:3.0.1 as the coordinates. Click "Install" to add the MongoDB Spark Connector library to the cluster. Note: If you get the error message "Maven libraries are only supported on Databricks Runtime version 7.3 LTS, and versions >= 8.1," you can download the MongoDB Spark Connector JAR file from https://repo1.maven.org/maven2/org/mongodb/spark/mongo-spark-connector_2.12/3.0.1/ and upload it to Databricks using the Upload menu option.

Create a New Notebook

Click the Databricks home icon in the menu and select "Create a blank notebook." Attach this new notebook to the cluster you created in the previous step. Because we defined our MongoDB connection string as part of the Spark cluster configuration, your notebook already has the MongoDB Atlas connection context. In the first cell, paste the following:

from pyspark.sql import SparkSession

pipeline = "[{'$match': { 'items.name': 'printer paper' }}, {'$unwind': { path: '$items' }}, {'$addFields': { totalSale: { '$multiply': [ '$items.price', '$items.quantity' ] } }}, {'$project': { saleDate: 1, totalSale: 1, _id: 0 }}]"

salesDF = spark.read.format("mongo").option("collection", "sales").option("pipeline", pipeline).option("partitioner", "MongoSinglePartitioner").load()

Run the cell to make sure you can connect to the Atlas cluster. Note: If you get an error such as "MongoTimeoutException," make sure your MongoDB Atlas cluster has the appropriate network access configured. The notebook gives us a schema view of what the data looks like. Although we could have continued to transform the data in the MongoDB aggregation pipeline before it reached Spark, let's use PySpark to transform it. Create a new cell and enter the following:

from pyspark.sql.window import Window
from pyspark.sql import functions as F

salesAgg = salesDF.withColumn('saleDate', F.col('saleDate').cast('date')).groupBy("saleDate").sum("totalSale").orderBy("saleDate")
w = Window.orderBy('saleDate').rowsBetween(-7, 0)
df = salesAgg.withColumn('rolling_average', F.avg('sum(totalSale)').over(w))
df.show(truncate=False)

Once the code is executed, the notebook displays our new dataframe with the rolling averages column. In this cell we perform some additional transformation of the data, grouping it by saleDate and summing totalSale per day. Once the data is in the desired format, we define a window over the past seven entries and add a column to the data frame containing the rolling average of total sales.

Once we have performed our analytics, we can write the data back to MongoDB for additional reporting, analytics, or archiving. In this scenario, we write the data back to a new collection called sales-averages:

df.write.format("mongo").option("collection", "sales-averages").save()

You can see the data by using the Collections tab within the MongoDB Atlas cluster UI. With the data in MongoDB Atlas, you can now leverage many of the services available, including Atlas Online Archive, Atlas Search, and Atlas Data Lake.

Summary

The integration between MongoDB Atlas, Google Cloud, and Databricks enables you to gain deep insights into your data and gives you the freedom to move and analyze data as your needs evolve. Check out the resources below for more information:

Getting started with MongoDB Atlas
MongoDB Spark Connector
MongoDB Atlas on Google Cloud
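If you want to confirm the write-back outside of Spark, a quick check with pymongo might look like the following; the connection string and database name are assumptions for this sketch.

# Quick sanity check of the written-back rolling averages using pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
averages = client["sample_supplies"]["sales-averages"]

for doc in averages.find().sort("saleDate", 1).limit(5):
    print(doc["saleDate"], doc["rolling_average"])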

May 11, 2021

Accelerate Data Modernization with Infosys Data Model Converter

Are you in the process of migrating applications from a relational database to MongoDB? If so, you're likely trying to understand and decide how your enterprise data should be modeled. Our previous blog discussed how Infosys Data Services Suite can help enterprises move data seamlessly from legacy relational databases to MongoDB. But moving data is only one part of the puzzle. The more significant step is choosing the target data model, or schema design, a process that usually requires many hours of highly skilled talent. That's why we created this follow-up blog to help you get started.

Rethinking Schema Design

Ultimately, schema design can be the difference between an inefficient, disorganized database and a strategic one that empowers the entire company. Schema design in MongoDB requires a change in perspective for data architects, developers, and database administrators. They have to:

Rethink the legacy relational data model, which flattens data into rigid, two-dimensional tabular structures of rows and columns. The new data model is rich and dynamic, with embedded sub-documents and arrays.

Rethink how the data platform works. In relational databases, it is extremely difficult to change the data platform as the application evolves. In MongoDB, the apps and APIs come first and the data platform dynamically accommodates the data.

Getting Schema Design Right

Begin the schema design process by considering the application's requirements. You'll want to model the data in a way that leverages the flexibility of the document model. In schema migrations, it may seem easy at first to simply mirror the flat schema of the relational database in the document model. However, this negates the advantages enabled by the rich, embedded data structures of the document model. For example, data that belongs to a parent-child relationship in two RDBMS tables can be collapsed (embedded) into a single document in MongoDB.

The application's data access patterns should also drive schema design, with a specific focus on:

The read/write ratio of database operations, and whether it is more important to optimize the performance of one operation over another
The types of queries and updates performed by the database
The lifecycle of the data and the growth rate of documents

Simplifying Schema Design with Infosys Data Model Converter

Infosys has developed a solution called Infosys Data Model Converter that takes the source relational schema and the signals above as inputs and automatically suggests a target MongoDB schema. Infosys Data Model Converter is available as part of the Infosys Modernization Suite, which accelerates enterprises' modernization journeys. Each schema suggestion is accompanied by a detailed analysis report. The data modeler can use this as a starting point and iterate over the schema to arrive at the final MongoDB schema. The Infosys Data Model Converter reduces the effort typically spent in schema design by 50-60%.
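To make the parent-child embedding described above concrete, here is a minimal sketch; the tables, fields, and connection string are hypothetical, and this is not output of the Infosys tool.

# Hypothetical illustration: a parent "orders" row and its child "order_items"
# rows from an RDBMS collapsed into one embedded MongoDB document.
from pymongo import MongoClient

order_row = {"order_id": 5001, "customer_id": 42, "order_date": "2021-04-01"}
item_rows = [
    {"order_id": 5001, "sku": "A-100", "qty": 2, "price": 9.99},
    {"order_id": 5001, "sku": "B-220", "qty": 1, "price": 24.50},
]

# Target document model: the child rows become an embedded array, so the whole
# order can be read or written in a single operation with no join.
order_doc = {
    "_id": order_row["order_id"],
    "customerId": order_row["customer_id"],
    "orderDate": order_row["order_date"],
    "items": [{"sku": r["sku"], "qty": r["qty"], "price": r["price"]}
              for r in item_rows],
}

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
client["sales"]["orders"].insert_one(order_doc)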
Key Features

Boosts productivity by augmenting the migration from RDBMS to a NoSQL database
Saves time by automatically extracting schema, query, and data patterns from an existing RDBMS
Comprehensively analyzes the RDBMS entity relations, data, and read-and-write patterns
Applies a rich set of rules and generates a fully compliant NoSQL target-state data model
Offers flexibility by externalizing the rules for organization-specific customizations
Connects and deploys the model to the target NoSQL platform with sample data

Discover more ways in which Infosys can help you unlock value from modernization. Contact us for any modernization questions.

April 15, 2021

Announcing the MongoDB SI Architect Certification Program for Modernization to the Cloud

You know the value of modernization as a strategic initiative. It's not just about refreshing your portfolio of legacy applications with the latest innovations for the sake of moving to the cloud; this is much more than "lift and shift." True modernization is about realizing your company's full potential and gaining a competitive edge through development methodologies, architectural patterns, and technologies. And by modernizing with MongoDB, you can build new business functionality 3-5x faster, scale to millions of users wherever they are on the planet, and cut costs by 70% or more.

If you're familiar with our technology and our Modernization Program, you already understand the benefits. But do your customers? And if not, how do you tell them? To help you get started, the MongoDB Partner team has created the MongoDB SI Architect Certification, a full-scale kit of assets related to modernization. This free, self-paced certification helps you improve the modernization experience for a variety of customer types and drive conversations with customers around data center exit plans and application qualification when assessing cloud data platforms. Consider this certification the next step in deepening your expertise so you can expand your business opportunities and help customers modernize to the cloud.

Customized for System Integrator partners, the certification teaches you how to discuss the benefits of modernization with various customers on a cloud journey. It enables architects to have deep discussions on vertical-based stories, migration tools, best practices, and architecture guidelines. System Integrator partners will also learn the fundamental value of offerings, messaging, objection handling, and more. Most importantly, the program equips SI architects to communicate key takeaways to the customer in a language they understand.

Program Structure

The free SI Architect certification program is self-paced, takes approximately 20 hours, and is divided into six key sections, complete with a final certification exam:

Introduction gives partners access to the modernization webinars and Modernization Program offerings.
Top use cases focuses on how MongoDB is used in business-wide strategic initiatives, such as legacy modernization, cloud data strategy, microservices, and more vertical-based stories.
Customer case studies highlights how MongoDB is deployed and leveraged through real-life customer case studies and proof points.
University classes lets participants leverage MongoDB University online and on-demand courses relevant to architects.
Competitive edge helps architects understand the true value of MongoDB in comparison to the competition.
Final certification culminates the program with a "Talk to the experts" session and a final certification exam in which participants take a real-world industry use case or customer project and assess how to migrate it to the cloud.

The "Talk to the experts" session gives participants the opportunity to ask experts questions about the final certification exam. It also introduces the messaging around "MongoDB: The Intelligent Operational Data Platform" and walks through an Atlas TCO and sizing exercise. In addition to these assets, partners also have access to self-paced developer training and database administrator training here.
Note: Download the enhanced Modernization Guide to refresh your knowledge of MongoDB modernization.

Dive Deeper into MongoDB Cloud Technology

What's one key lesson we know for certain? The data management platform you choose is a key factor in successfully migrating legacy applications to the cloud. The MongoDB Cloud section of our Architecture Guide discusses the unique value MongoDB can bring to organizations making the transition to the cloud.

Note: Download the Architecture Guide to refresh your knowledge of MongoDB Cloud.

The key components of the MongoDB Cloud platform are:

MongoDB, at the core: the general-purpose operational database for modern applications. Nearly every application needs a fast database that can deliver single-digit-millisecond response times, and when it comes to speed, MongoDB delivers. With our flexible document data model, transactional guarantees, rich and expressive query language, and native support for both vertical and horizontal scaling, MongoDB can be used for practically any use case, reducing the need for specialized databases even as your requirements change.

Multi-cloud clusters on MongoDB Atlas: customers can realize the benefits of a multi-cloud strategy with true data portability and a simplified management experience. Multi-cloud clusters provide best-in-class technology across multiple clouds in parallel, let you migrate workloads across cloud providers seamlessly, and improve high availability with cross-cloud redundancy.

Realm Mobile Database: extends this data foundation to the edge. Realm is a lightweight database embedded on the client side. It helps solve the unique challenges of building for mobile, making it simple to store data on-device while also enabling data access when offline. Realm Sync is seamlessly integrated and keeps data up to date across devices and users by automatically syncing data between the client and a backend Atlas cluster.

Ready to boost your knowledge and expertise? The Modernization Guide, Architecture Guide, and SI Architect certification program are waiting for you. Get started today.

Start the free MongoDB SI Architect certification program today!

March 24, 2021

Optimize Data Modeling and Schema Design with Hackolade and MongoDB

Development teams are constantly searching for new ways to quickly enhance applications and keep pace with the rapid progression of customer needs. MongoDB's dynamic schema evolution enables exactly that, through the power and flexibility of storing data in a JSON document format instead of in relational tables. Developers love the flexibility and schema-less nature of the JSON document format. But as application complexity and scale increase in an enterprise environment, this flexibility must be skillfully organized to harness the power of the solution, maximize developer productivity, and lower total cost of ownership.

For large enterprises and government agencies, the key is to leverage the benefits of modern applications running on MongoDB Atlas while also ensuring proper data management and governance. This is where a data modeling tool designed specifically for MongoDB helps greatly. Enter Hackolade.

For decades, Entity-Relationship Diagrams (ERDs) have been used to visually represent the data structures of relational databases. But ERDs were originally designed for flat structures only. Hackolade, a MongoDB certified technology partner, has enhanced ERD capabilities to accommodate the representation of JSON hierarchical structures with nested objects and arrays. Hackolade is pioneering data modeling and schema design for NoSQL databases and REST APIs.

Why it Matters

A data model is an abstraction describing and documenting an organization's information system. It is a collection of Entity-Relationship diagrams, descriptions, constraints, and metadata representing data structures.

Hackolade data model for MongoDB

A schema, on the other hand, is a "consumable" contract describing the layout or structure of a file, a transaction, or a database. It is an authoritative source that producers and consumers use to agree on the structure being exchanged or accessed. While data models help humans understand structure, schemas are the technical artifact systems need in order to interact. Hackolade provides both, allowing MongoDB customers to easily visualize the data model, intuitively create and enforce schema with MongoDB's JSON Schema Validator, and iteratively change the schema as applications evolve.

Automatically generated JSON Schema Validator

Customer Benefits

Increase data agility with forward-engineering

An ERD provides an easy-to-understand picture of your data. As a communication tool, it helps facilitate dialog between application stakeholders: business analysts, designers, architects, developers, and DBAs. With an ERD, you can evaluate different "what if" scenarios, identify the ideal way to denormalize data, and leverage the benefits of MongoDB Atlas. Simply apply a query-driven design of the schema after analyzing the access patterns of the application. You can then visualize and evaluate the impact without writing a line of code; obviously, this is more productive than coding first and then realizing that much needs to be rewritten to accommodate everyone's needs.

Hackolade generates several artifacts, including collection creation scripts with validators that require no knowledge of JSON Schema syntax, sample JSON documents, Mongoose schemas, documentation in HTML, Markdown, or PDF, plotter output of ERD pictures, document and index sizing estimates, and more. The process is easily integrated into a Jenkins CI/CD pipeline by invoking a flexible Command-Line Interface.
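For reference, here is what a collection created with a $jsonSchema validator looks like when written by hand in pymongo; the collection, its fields, and the connection string are illustrative, and Hackolade generates equivalent scripts for you.

# A hand-written example of the kind of validator script Hackolade generates;
# the "customers" collection and its fields are illustrative assumptions.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
db = client["modeling_demo"]

db.create_collection("customers", validator={
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string"},
            "email": {"bsonType": "string", "pattern": "^.+@.+$"},
            "addresses": {  # embedded array of sub-documents
                "bsonType": "array",
                "items": {
                    "bsonType": "object",
                    "properties": {
                        "city": {"bsonType": "string"},
                        "zip": {"bsonType": "string"},
                    },
                },
            },
        },
    }
})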
Ensure data quality and compliance through schema reverse-engineering

Deriving a data model from an existing MongoDB instance is not as easy as fetching a DDL from a relational database. Schemas must be inferred from a representative sample of documents in each collection. Hackolade has perfected its schema inference algorithms to accommodate the flexibility and polymorphism of JSON hierarchical structures. The derived models become a trusted source to feed data dictionaries and data governance suites. Reverse-engineering, run as an automated Command-Line Interface process, helps ensure data quality and compliance.

Facilitate application modernization with the denormalization of legacy data structures

Hackolade can import a variety of structures: relational DDLs, logical data models in XSD format, JSON documents and schemas, and Excel templates. To leverage the benefits of MongoDB, these structures should evolve to embed information where applicable and avoid slow JOINs. This should not be done blindly, but based on a proper analysis of the application's access patterns in the context of data volume estimates and relationship cardinality. Hackolade provides a handy feature to quickly evolve a relational data model towards a denormalized schema, leveraging the benefits of MongoDB's document model and facilitating modernization. The process hooks into the forward-engineering process described above, generating pictures, scripts, and documentation.

Implement continuous evolution and data management

The lifecycle of modernized applications does not stop after the initial data migration. Applications must be operated successfully and will continue to evolve, which likely means schema changes. Hackolade is designed to facilitate agile development approaches and the full lifecycle of modern software. It provides the tooling needed to design and manage data models and schemas for successful application modernization on MongoDB Atlas.

Learn how to maximize developer productivity and lower total cost of ownership using data modeling with Hackolade, and the MongoDB University data modeling advanced course. Download the joint solution brief: MongoDB and Hackolade: Visual Data Modeling for MongoDB Schemas.

March 11, 2021

Capgemini Solutions That Help Customers Modernize Applications to MongoDB

Companies across every industry vertical continue to face the challenge of how to effectively migrate and quickly access massive amounts of enterprise data, all while keeping system performance up to par throughout the obstacle-ridden process. The complexities of the ubiquitous, traditional Relational Database Management System (RDBMS) are many: RDBMS systems can inhibit performance, falter under heavy volume, and slow down deployment. With MongoDB's document-based, distributed database, performance and volume issues are easily addressed. But when it comes to speeding up time to market, the right auxiliary tools are still needed. Capgemini, a MongoDB partner and global leader in digital transformation, provides the final piece of the puzzle with a new tool rooted in automated intelligence. In this blog, we'll explore three key areas where Capgemini solutions help customers modernize to MongoDB:

Tools that expedite time to market
Migration from legacy systems to MongoDB
New development using MongoDB as a backend database

Whether your company is developing a new database or migrating from legacy systems to MongoDB, Capgemini's new Database Convert & Compare (DCC) tool can help. Below, we detail how DCC works, then walk through a few recent client examples and the benefits they achieved.

Tool: Database Convert & Compare (DCC)

DCC is a powerful tool developed by the Capgemini team that optimizes activities such as database migration, data comparison, validation, and much more. The tool can perform data transformations with specific customization based on the source and target databases in scope. When migrating from an RDBMS to MongoDB, DCC achieves 70% automation and 30% manual retrofit at the database level.

How does DCC work?

For an RDBMS-to-NoSQL migration, DCC performs the migration in three stages.

1) Assessment

Source database schema assessment: DCC extracts source schema information and performs an assessment to generate a detailed inventory of data objects such as tables, views, stored procedures, and indexes. It also generates a detailed report on the data volume of each table, which helps estimate the data migration time from source to target.

Apply analytics to prepare a recommendation for the target database structure. The target structure varies based on parameters such as:

Table relationships (one to many, many to many, one to one)
Indexes applied to tables for performance requirements
Column data types

2) Schema Migration

Customize the tool to apply the recommendations from the assessment, generating the script for the target database.

Target schema script preparation: DCC generates a complete database schema script, except for a few object types such as stored procedures and views.
Produce a detailed report of the schema migration, including objects that couldn't be migrated. Manual intervention is required to reimplement the business logic of the source database's stored procedures and views in the target environment's application layer.

3) Data Migration

Column mapping: the assessment report generates an inventory of source database table fields as well as the recommended schema structure; it also provides recommended field mappings from source to target based on the adopted recommendations and DCC customization.

Post-migration data validation script: DCC generates a data validation script after data migration is complete, which takes the field mapping from the related assessment and recommendation reports into consideration.

Data migration scripts for execution: DCC allows for the setup and configuration of different scripts for data migration, such as:

One-time data migration from source to target
Daily batch runs to sync up source and target database data
Intermittent data validation during the data migration process (if any discrepancies are found in validation, the job stops and generates a report with the potential root cause of the issue)

Standalone data comparison: DCC can also run data validation between the source and target databases on its own. In this case, DCC generates source database table inventory details and extracts target database collection inventory details. Minimal manual intervention is required to perform the field mapping and set the tool's configuration for data migration execution. Other configuration options, such as one-time migrations or daily batch migrations, can be set as well.

The Capgemini team has successfully implemented and deployed the DCC tool for various banking customers for end-to-end RDBMS-to-NoSQL migration, including application retrofit and rewiring using other capable tools such as CAP360.

Case study 1: Migration from Mainframe to MongoDB for a Large European Investment Bank

A large banking client encountered significant challenges in terms of growth and scale-up, low resilience and increased risk, and rising costs driven by the advent of mobile banking and a significant increase in volume. To help the client evolve more quickly, Capgemini built an Operational Data Platform to offload expensive mainframe operations and to store and process customer transactions for business operations, analysis, and reporting.

The Challenge:

Inefficient and slow to meet customer demand for new digital banking services due to heavy reliance on legacy infrastructure and apps
Continued growth in traffic and the launch of new digital services led to increased cost of operations and decreased performance
The mainframe was the single point of failure for many applications; outages resulted in poor customer service, brand erosion, and regulatory concerns

The Approach:

An analysis of digital channels revealed that 92% of traffic was generated by 25 interaction types, with 85% of these being read-only. To offload these operations from the mainframe, an operational data lake (ODL) was created. The MongoDB-based ODL was updated in near real time via change data capture and a messaging queue to power existing apps, new digital services, and other APIs.
Outcome and Benefits:

Accelerated time to market for new digital services, including personalization
Improved stand-in capability to support resiliency during planned and unplanned mainframe outages
Reduced read-only transactions on the mainframe (MIPS cost), freeing up resources for additional growth
Saved the customer over 80% in year-on-year post-migration costs; the new MongoDB database seamlessly handled 25M+ transactions per day and over 30 months of history, with roughly 13 billion transactions held in 114 million documents

Case study 2: Migration of a Large-scale Database from Legacy to MongoDB for a US-based Insurance Customer

A US-based insurance client faced disparate data spread across 100+ systems, making data aggregation a cumbersome process. The client wanted to access the many data points around a single customer without hindering the performance of the entire system.

The Challenge:

Reconciling different data schemas from multiple systems into a single schema is problematic and, in many cases, impossible
When adding new data sources, it is difficult to iterate on the schema quickly
Providing access to the data within the "single view" requires ad hoc queries as well as multi-layer indexing and aggregation, which are complicated for relational databases to provide
Lack of personalization and the inability to provide context-based experiences in real time result in lost business opportunities

The Approach:

To assist customer service reps in real time, we built "The Wall," a single-view application that pulls disparate data from legacy systems for analytics. We designed a flexible data model to aggregate the disparate data into a single data store, where MongoDB's expressive query language and secondary indexes can reach any field in real time, making data access faster and easier. Our approach was based on four key foundations:

Document model: a rich and flexible data store; a single document can store up to 16 MB of data, and 20+ data types provide flexibility in managing data
Versatility: a variety of structured and unstructured data models can be defined
Analytics: a strong aggregation framework to aggregate data related to a single customer
Workload isolation: operational and analytical workloads run in parallel on the same cluster

Outcome and Benefits:

Our largest insurance customer attained a single view of the customer within a 90-day timespan. A different insurance customer achieved a 360-degree view of 13 million customers on MongoDB Enterprise Advanced. And yet another healthcare customer achieved as much as a 300% reduction in processing times and increased processing throughput with 50% less hardware.

Ready to accelerate your digital transformation? Capgemini and MongoDB can help you re-envision your data and advance your business processes so you can focus on innovation. Reach out today to get started.

Download the Modernization Guide
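As a rough sketch of the single-view pattern described above (not Capgemini's implementation; collections, fields, and the connection string are assumed), an aggregation can join policy and claim data to each customer and persist one combined document per customer:

# Rough sketch of the single-view pattern: join policy and claim records to
# each customer and persist the combined document into "customer_360".
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
db = client["insurance"]

db["customers"].aggregate([
    {"$lookup": {"from": "policies", "localField": "_id",
                 "foreignField": "customerId", "as": "policies"}},
    {"$lookup": {"from": "claims", "localField": "_id",
                 "foreignField": "customerId", "as": "claims"}},
    {"$merge": {"into": "customer_360", "whenMatched": "replace"}},
])

# Customer service apps then read one document per customer:
profile = db["customer_360"].find_one({"_id": "CUST-0042"})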

February 10, 2021