MongoDB Applied

Customer stories, use cases and experience

Tackling the 5G Complexity Beast with MongoDB’s Developer Data Platform Simplicity

The advent and commercialization of 5G is driving a sea change in the mobile user experience, a success evidenced by the booming adoption of 5G-enabled devices. Supporting real-time business, streaming, and gaming applications on a 5G network is essential for telecommunications companies' enterprise growth but demanding on the systems that support them. As the "cloudification" of network functions continues to evolve, it grows more challenging for older business support systems (BSS) and operations support systems (OSS) to keep up. To address the needs of increasingly complex networks, operators are reevaluating their data strategy, recognizing that a developer-focused data platform built for mission-critical systems can enable a greater level of agility across the enterprise.

This is the thesis of a new IDC white paper, sponsored by MongoDB, Effective Data Management is Essential for Taming the 5G Network Complexity Beast (doc #US49660722, September 2022). In the analysis, led by Karl Whitelock, Research Vice President, Communication Service Provider - Operations & Monetization, IDC examines the new generation of services that will drive innovation in multiple industries and reviews solutions for the challenges telecommunications providers will face amid new operations and monetization strategies derived from 5G and mobile edge computing services.

Take me straight to the IDC White Paper: Effective Data Management is Essential for Taming the 5G Network Complexity Beast

Building business solutions at the network edge

As software-driven 5G services evolve through a cloud-native network architecture, complexity grows. Within the multi-technology network, an advancing web of systems connects data from the mobile network to an edge cloud, HCP cloud, the core network, the internet, and back again. To manage this complexity, network automation and extensive data analytics capabilities are key components in delivering a first-class customer experience.

The new generation of digital services is 5G enabled. IDC is witnessing demand from social media, streaming video, search, gaming, transport, and industrial IoT applications building network traffic, and associated data, at soaring rates. Businesses across diverse industries are jumping on the 5G bandwagon. The business solutions being dreamed up by developers are redefining services and business outcomes, particularly when utilizing delivery at the network edge. For example:

Manufacturing: Private 5G networks help high-speed production facilities identify defects and remove incorrectly assembled equipment.

Architecture/Construction: Robots measure architectural layouts, and site dimensions are collected during construction. Records are stored in the cloud for later access by inspectors, builders, and customers.

Sporting events: Edge computing can be faster and more reliable for processing data at large-scale sporting events, allowing organizers to collect and process data to build interactive digital experiences at the edge.

December 7, 2022
Applied

Simplifying IoT Connectivity with myDevices and MongoDB

In the highly competitive era of Industry 4.0, companies that are able to adopt emerging Internet of Things (IoT) technologies and shift from traditional offerings to digitally differentiated ones are moving to the forefront of their respective industries. McKinsey & Company estimates that by 2030, IoT could enable $5.5 trillion to $12.6 trillion in value globally, including the value captured by consumers and customers of IoT products and services. From smart thermostats to smart factories, IoT already connects billions of devices worldwide. Figure 1 shows potential areas where IoT solutions make a difference.

Figure 1: IoT applications by industry (non-exhaustive).

All of these IoT applications and solutions require technologies that offer low-power operation, low cost, and low complexity in setting up and maintaining end devices. End devices that can communicate wirelessly over large distances with low power consumption are key. The data generated by IoT devices is time series and high frequency, placing a unique strain on the underlying data infrastructure. Because of the polymorphic nature of IoT sensor data, the database must support flexible data schemas, making it easy for developers to work with the data. It must also ensure that IoT applications are resilient to future changes. MongoDB embraces the variety and volume of IoT data without compromising on performance. Through its document model, MongoDB eliminates data movement and blends time series data with the rest of the enterprise data in a single developer data platform. In this article, we'll describe how myDevices leverages the MongoDB developer data platform for IoT.

Overview of myDevices

myDevices is a U.S.-based IoT solutions company that empowers system integrators, MSPs, ISVs, VARs, and enterprise customers to quickly deploy IoT solutions to their customers.
The company has more than 1,000 plug-and-play sensors and multiple Long Range Wide Area Network (LoRaWAN) gateway options to create IoT solutions for a variety of use cases. Over time, myDevices has created the world's most extensive IoT device catalog, drawing from more than 150 hardware manufacturers around the globe. LoRaWAN offers unique IoT benefits, such as long range and coverage, which may reach up to 15 kilometers in line of sight (LOS). It offers ultra-low power consumption for end devices, low-cost infrastructure, and high capacity, which makes it possible to link thousands of devices to a single gateway.

myDevices understands that connecting devices from disparate manufacturers can be very challenging; thus, they have created a no-code solution that includes plug-and-play templates to connect sensors to the gateway just by scanning a QR code. After the sensor is connected to the gateway, users can perform remote monitoring and device management from a single-view interface. They can also get alerts through text and email and set up charts for visualization of sensor data. The alert rules can be configured as time based or threshold based in the myDevices platform. The myDevices IoT platform is secure from the edge through the cloud to the application layer. The security is composed of LoRaWAN network security at the edge, TLS to the cloud, and SAML at the application layer. Figure 2 shows the architecture of the myDevices platform and how it connects to the sensors.

Figure 2: myDevices architecture.

myDevices also has multiple ready-to-go solutions for a variety of IoT use cases and applications. From machine health predictive maintenance to soil moisture detection, there are sensors that just work with the IoT in a box application.
It takes only minutes to set up connectivity between the sensor and the myDevices cloud, and myDevices enhances productivity because you don't have to worry about writing code to extract data from the sensors and establish secure connectivity with the gateway. Because LoRaWAN enables hundreds, if not thousands, of sensors sending data to a single gateway, it requires a database that can easily and automatically scale. When it comes to publishing data out of the myDevices cloud to MongoDB Atlas, myDevices provides a webhook integration that can be set up in minutes to establish connectivity between the two systems.

Database requirements for IoT and MongoDB Atlas

MongoDB and MongoDB Atlas are ideal partners for any IoT deployment, offering:

Deployment flexibility (on-premises, in-field, cloud)
Multi-cloud flexibility (AWS, Azure, GCP)
Schema flexibility (frequent changes and additions)
The ability to blend different data (time series, operational)
Real-time analytics readiness
Automated data tiering

As a result, IoT data platforms and service providers, such as Bosch and Software AG, as well as some of the world's most intensive IoT users, including Toyota, Mercedes-Benz, and Vodafone, choose MongoDB for their IoT platforms and services. MongoDB's developer data platform supports the entire IoT data life cycle, from ingestion, storage, querying, real-time analytics, and visualization to online archiving (Figure 3). MongoDB Atlas brings the core components of real-time analytics into one developer data platform.

Figure 3: MongoDB developer data platform for IoT.

Let's talk about a few features that directly support IoT applications:

Native time series platform: MongoDB supports native time series collections with hands-free schema optimization, enabling high-efficiency storage and low-latency queries. This is an extremely important feature for IoT applications.
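As a rough sketch of what a time series collection for LoRa sensor readings might look like, the options object below mirrors the `timeseries` parameters accepted by MongoDB's `createCollection` command. The collection and field names here are illustrative assumptions, not taken from the myDevices integration:

```javascript
// Options for a MongoDB time series collection (names are illustrative).
// With a live connection, you would pass these to db.createCollection().
const timeseriesOptions = {
  timeseries: {
    timeField: "timestamp",  // when the reading was taken
    metaField: "sensor",     // identifies the device; drives bucketing
    granularity: "seconds",  // expected interval between readings
  },
  expireAfterSeconds: 60 * 60 * 24 * 30, // optionally age out raw data after 30 days
};

// Example call (requires a connected `db` handle from the MongoDB Node.js driver):
// await db.createCollection("current_sensor", timeseriesOptions);

console.log(timeseriesOptions.timeseries.granularity);
```

The `metaField` is worth choosing carefully: grouping readings by sensor lets the storage engine bucket each device's data together, which is what delivers the storage and query efficiency mentioned above.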
Change streams: MongoDB change streams allow applications to access real-time data changes in the database without added complexity or risk. IoT applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment and immediately react to them. This approach enables quick response times and fast decision making.

Aggregation framework: With MongoDB's built-in aggregation framework, users can run real-time analytics without having to move the data to another platform. The work is done inside MongoDB, and only the final results are sent to the application, typically resulting in a smaller amount of data being moved around. For IoT applications, this can be a powerful tool: only the filtered data is transmitted to the cloud or central storage, improving security and reducing cost.

Data Lake: As data is ingested, Atlas Data Lake automatically optimizes and partitions the data in a format and structure best suited for analytical queries. This capability significantly reduces the complexity of transforming data for the data scientist tasked with building machine learning models for analytical use cases and applications.

Data Federation: Atlas Data Federation provides the ability to federate queries across data stored in various supported storage formats, including Atlas clusters, Data Lake datasets, AWS S3 buckets, and HTTP stores. This feature reduces the complexity of bringing data together for analytical model testing.

Data API: Companies can use the Atlas Data API to integrate Atlas into any apps and services that support HTTPS requests. Leveraging this feature, data from the myDevices cloud can be sent to Atlas and then used for storage and for analytics via the aggregation framework or via the Atlas ecosystem connectors with third-party analytical software.
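As a concrete illustration of the aggregation framework in an IoT context, the pipeline below keeps only out-of-range readings and averages them per sensor per hour before anything leaves the database. The pipeline stages are standard MongoDB operators, but the collection and field names ("sensor", "timestamp", "amps") are hypothetical:

```javascript
// An aggregation pipeline is plain data in the shell or driver.
// This one filters high current readings and averages them per sensor per hour.
const pipeline = [
  // Keep only readings above a threshold.
  { $match: { amps: { $gt: 30 } } },
  // Group by sensor and hour, averaging the reading.
  {
    $group: {
      _id: {
        sensor: "$sensor.id",
        hour: { $dateTrunc: { date: "$timestamp", unit: "hour" } },
      },
      avgAmps: { $avg: "$amps" },
      count: { $sum: 1 },
    },
  },
  { $sort: { "_id.hour": 1 } },
];

// With a live collection handle:
// const results = await db.collection("current_sensor").aggregate(pipeline).toArray();

console.log(pipeline.length); // 3 stages
```

Because the `$match` and `$group` run server-side, only the small summary documents cross the network, which is exactly the "transmit only the filtered data" benefit described above.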
Ecosystem integration: The MongoDB Spark Connector opens up access to all Spark libraries for use with MongoDB datasets: analysis with SQL (benefiting from automatic schema inference), streaming, machine learning, and graph APIs.

Charts: MongoDB Charts is the best way to visualize IoT data stored in MongoDB. Charts is built specifically for the document model: no ETL and no time lost to data manipulation or duplication is required to visualize rich JSON data. Using Charts, powerful, engaging data experiences can be created for use case stakeholders in no time.

Integrating Atlas and myDevices using webhooks and the Data API

myDevices offers a variety of no-code integrations for its clients to quickly get started sending data to the platform of their choice. For MongoDB Atlas clients, this is great news: by using the myDevices webhook integrator and payload transformation feature, they can receive and store LoRa sensor data in a specified collection. Let's run through the steps to perform this integration.

Step 1: Log into your Atlas cluster and set up the Data API and an API key.

The MongoDB Atlas Data API lets you read and write data in Atlas with standard HTTPS requests. To use the Data API, all you need is an HTTPS client and a valid API key. It is important to understand that the Data API is not a direct connection to the MongoDB database. Instead, it routes requests through a fully managed middleware layer, called Atlas App Services, that sits between your cluster and client apps. This layer handles user authentication and enforces data access rules to ensure that the data is secure. The Data API supports two types of endpoints:

Data API endpoints are automatically generated endpoints that each represent a MongoDB operation. You can use these endpoints to create, read, update, delete, and aggregate documents in a MongoDB data source.

Custom endpoints are app-specific API routes handled by functions that you write.
You can use custom endpoints to run your app's backend logic or as webhooks that integrate with external services. In this example, we are using a Data API endpoint. You can follow these easy steps to enable the Data API and create a Data API key.

Step 2: Log in to your myDevices console and set up integrations.

After you log in, start a new webhook creation through the INTEGRATIONS option on the right-hand panel (Figure 4). For the purpose of this article, we assume that you have already created an organization in myDevices and added sensors and gateways to it. If you have not, please refer to the myDevices API docs to get started.

Figure 4: Set up integrations in myDevices.

Step 3: Click on Webhook integration to open the new webhook creation panel.

In this step, choose Webhook as the desired integration option, as shown in Figure 5.

Figure 5: Choose Webhook as the integration option.

Step 4: Add key information.

In this step, you'll want to include key information, such as the Url, which is your Data API endpoint; the Webhook Header, which will include the api-key at the very minimum; and the payload transform script, where you can specify the cluster, database, and collection where this sensor data needs to be stored (Figure 6).

Figure 6: Paste the endpoint generated by the Data API in Atlas.

An example payload transformation script looks like the following. Per the Data API requirements, you have to specify the cluster, database, and collection name in the raw body data:

```javascript
function Transform(event, metadata) {
  return {
    dataSource: "my_cluster",
    database: "my_database",
    collection: "current_sensor",
    document: event,
  };
}
```

Step 5: Save your webhook.

Once you save your webhook, you can observe sensor data flowing into your MongoDB Atlas collection from the actual device using MongoDB Compass or Atlas Charts (Figure 7). For more details on how to create charts, please visit the Atlas Charts documentation.

Figure 7: Visualize sensor data using Atlas Charts.
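For reference, the HTTPS request that such a webhook ultimately issues resembles a standard Data API insertOne call. The sketch below builds one; the app ID and API key are placeholders, and the endpoint shape follows the Data API's generated insertOne action:

```javascript
// Build an Atlas Data API insertOne request for a sensor reading.
// APP_ID and the api-key value are placeholders, not real credentials.
const APP_ID = "data-abcde"; // hypothetical App Services app ID

function buildInsertRequest(reading) {
  return {
    url: `https://data.mongodb-api.com/app/${APP_ID}/endpoint/data/v1/action/insertOne`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "api-key": "<YOUR_DATA_API_KEY>",
    },
    body: JSON.stringify({
      dataSource: "my_cluster",
      database: "my_database",
      collection: "current_sensor",
      document: reading,
    }),
  };
}

// Example usage (sending it is just: fetch(req.url, req)):
const req = buildInsertRequest({ sensorId: "A1", amps: 12.4 });
console.log(JSON.parse(req.body).collection); // current_sensor
```

Note how the request body carries the same dataSource/database/collection fields that the webhook's Transform script produces; the transform is effectively assembling this body for you.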
Conclusion

We have shown how easy it is to connect the myDevices IoT platform with MongoDB using the Data API. The overall architecture is shown in Figure 8.

Figure 8: End-to-end architecture of the myDevices and MongoDB Atlas integration.

Simplifying IoT connectivity is of paramount importance for any organization looking to embark on a digital transformation journey. Fortunately, both myDevices and MongoDB Atlas provide platforms that simplify management of the full life cycle of an IoT device, from provisioning to connectivity to data storage and archival. To learn more about how MongoDB enables IoT for our customers, please visit our IoT use cases page.

December 6, 2022
Applied

4 Ways MongoDB Enhances Your Google BigQuery Experience

MongoDB and Google Cloud continue to build on their partnership, with MongoDB enhancing Google Cloud with pay-as-you-go abilities, unified billing, and integrations with multiple Google Cloud features, including BigQuery. And, when it comes to data architecture, BigQuery and MongoDB are two products that are better together.

Google BigQuery and MongoDB are better together

Google's serverless data warehouse, BigQuery, was launched in 2011 with an aim to enhance business agility as Google's cloud-native data warehouse. BigQuery allows for fast queries that can uncover insights using familiar SQL. When MongoDB is added to the database technology stack as a complementary technology, it enhances the breadth of capabilities for the developer across a variety of use cases, including the following four examples.

Combined impact of the Enterprise Data Warehouse and the Operational Data Store

BigQuery is best suited as an Enterprise Data Warehouse (EDW), meaning it is designed to optimize long-running analytics. MongoDB Atlas, on the other hand, is best suited as an Operational Data Store (ODS), designed to optimally support high-throughput and highly concurrent real-time operational applications that demand random access to an entity's data in native JSON. This combination means that BigQuery and MongoDB are complementary technologies that can jointly deliver more value, each delivering on its strongest qualities. BigQuery excels at long-running queries, while Atlas handles real-time operational application needs with thousands of concurrent sessions and millisecond response times.

Enriched end-customer experiences

BigQuery equips data scientists and analysts with machine learning (ML) models and BI tools for structured and semi-structured data at scale. For roles that need results with a turnaround time of a day or more, BigQuery is a strong tool for big data queries.
With MongoDB Atlas, engineers and development teams can build applications faster and handle highly diverse schema, query, and update patterns, adapting to demanding user needs and competition. Atlas can also deliver the real-time or less-than-24-hour queries that are necessary to keep your business operational. Additionally, data can easily move back and forth between the two platforms, creating a prime combination for running analytics on operational data. Being able to unlock the full potential of your data across your organization means that everyone has insight into the business metrics they need, when they need it. This allows quicker decision making, as well as stronger and more accurate reporting.

Extensibility to MongoDB Atlas features

On top of the value and synergy that can be realized by a BigQuery + Atlas combination, other Atlas features can help enhance the usefulness and sophistication of a data architecture, such as:

Atlas Charts can be leveraged to create rich visualizations of any data stored within Atlas.

Atlas Triggers and Alerts can apply database logic in response to events or on a predefined schedule.

Atlas Search brings full-text search at scale to all data across MongoDB and BigQuery alike.

Atlas Data Federation enables aggregating data across multiple data sources, such as Atlas clusters and HTTPS endpoints, and transforming it into analytical formats (e.g., Parquet).

This means you can not only access data in real time but also analyze it in a visual, user-friendly way. This functionality makes your data more actionable, allowing you not only to answer questions about your business data but also to make better predictions and future adjustments based on it. Furthermore, being alerted to certain data-based events and triggering new actions based on that information means you can have your data working more efficiently for you, freeing up time to innovate and focus on core business competencies.
Lastly, this approach simplifies your data lifecycle, so JSON data from various applications and endpoints can easily be transformed and consumed for rich analytics.

Deeper understanding of your customer

Businesses can use fully managed MongoDB Atlas to store customer 360 profiles. A 360-degree view of a customer allows businesses to track an individual customer's journey across multiple channels, devices, purchases, and interactions, and improves customer satisfaction. With the combination of Atlas and BigQuery, businesses can also use compiled data, such as transactional data, behavioral data, user profiles and segmentations, and business analytics, to match user profiles with products and services using artificial intelligence (AI). Vertex AI, a managed machine learning platform, provides all the Google Cloud services in one place to deploy and maintain AI models. Being able to easily access a 360 view of each customer and automate their customer journey helps with engagement and loyalty by improving customer satisfaction and retention through personalization and targeted marketing communications. It also enables retailers to aggregate customer interactions across all channels and identify valuable new customers.

Google BigQuery and MongoDB Atlas in the real world

Current, a leading U.S. challenger bank, uses innovative approaches, services, and technologies to serve people overlooked by traditional banks, regardless of age or income level, to help improve their financial outcomes. To help create customer experiences that cannot exist in traditional systems, Current chose to leverage Google Cloud, including BigQuery, with MongoDB layered on the platform to achieve their goals.

Read Full Current Story

Are you a Google BigQuery customer curious about how MongoDB Atlas can amplify your existing data warehouse or data lake architecture? Try MongoDB Atlas for free today and spin up your first workload in minutes.
Try pay-as-you-go Atlas on GC Marketplace

December 1, 2022
Applied

Achieving Industrial Connectivity at Scale with Wimera and MongoDB

Industry 4.0 (I4.0) represents the beginning of the Fourth Industrial Revolution. It includes the current trend of automation technologies in the manufacturing industry as well as disruptive technologies and concepts, such as cyber-physical systems (CPS), Industrial Internet of Things (IIoT), cloud computing, and immersive visualization. Through Industry 4.0, embedded systems, semantic machine-to-machine communication, IIoT, and CPS technologies are integrating the virtual space with the physical world. These technologies are enabling a new generation of industrial systems, such as smart factories, to deal with the complexity of fast-paced and hyper-personalized production. In this article, we'll explore Wimera's unique solutions to the challenges of I4.0 and IIoT, built with MongoDB.

Information and insights

With IIoT, existing industrial systems will be modernized to drive digital transformation and unlock tomorrow's smart enterprise. IIoT has been finding its way into products and sensors while revolutionizing existing manufacturing systems; thus, it is considered a key enabler for the next generation of advanced manufacturing. Industry 4.0 generally comprises many complex components and has broad applications in all manufacturing sectors. The first challenge faced by manufacturing companies when embarking on the I4.0 journey is to sensorize and connect their manufacturing equipment in order to collect, store, and analyze data for information and insights. Wimera Systems is solving this challenge as an I4.0 enablement company offering IIoT solutions built on its unique hardware, software application, and AI/ML-based analytics engine. Wimera's Smart Factory Suite has seen tremendous growth, with 2,500+ global installations across 50+ customers. MongoDB has been pivotal to that growth, acting as the core component of the IIoT suite and enabling the company to offer its services at scale without having to worry about managing the complexity of an IIoT database.
Bringing AI-powered IIoT to the manufacturing shop floor

Manufacturing companies are emerging from the pandemic with a renewed focus on digital transformation and smart factory investment. COVID-19 has heightened the need for IIoT technology and innovation, forcing manufacturers to compete in a digitalized business environment. Many manufacturers still operate using legacy technologies and systems; on most shop floors, equipment and operator efficiency are manually calculated and tracked using spreadsheets. Machines are maintained using time-based rather than condition-based maintenance strategies, and no real-time visibility exists into consumables and tool usage. All these practices result in increased maintenance costs, suboptimal production, and, ultimately, customer dissatisfaction.

Wimera understands these challenges all too well, which is why it created the Smart Factory Suite, supporting both on-premises and cloud deployments. The Smart Factory Suite provides insights for managing the entire production landscape through interconnected devices and machines, operations, and facilities. It can predict and make real-time adjustments for increased production efficiency and less downtime. The suite is primarily utilized for empowering manufacturing operations, equipment maintenance, warehouse operations, and inventory management. With the Smart Factory Suite, Wimera serves a wide range of manufacturing industry sectors, including, but not limited to, automotive, electronics, chemical, and food processing companies.

Deploy and run anywhere with MongoDB

MongoDB, with its freedom to run anywhere, lets Wimera offer both on-premises and cloud deployment options for its customers. In both cases, the suite is directly connected with machine controllers using Wimera libraries for all popular Programmable Logic Controller (PLC) brands. The suite is also connected to legacy machines through external sensors installed by the Wimera team.
Data is extracted via the Wimera ReMON Data Acquisition (DAQ) device (Figure 1), which uses the MongoDB database as its persistent data storage. MongoDB's flexible data model makes it easy to combine and enrich this data and enables live dashboards and instant alerts for factory personnel. The data collected and optimized by ReMON DAQ is further fed to ReMON AI, an advanced analytics engine. ReMON AI provides advanced analytics through AI/ML models and leverages MongoDB to deliver application-driven analytics in real time.

Figure 1: ReMON DAQ and ReMON AI (source: Wimera ReMON).

Whether through on-premises or cloud deployment (Figures 2 and 3), Wimera's customers have benefited from MongoDB capabilities that are critical for IIoT applications, such as time series collections and the flexible, intuitive document data model.

Figure 2: Wimera IoT architecture on premises.

Figure 3: Wimera IoT architecture in the cloud (using MongoDB on AWS).

In one customer example, while deploying IIoT at a multinational CNC machine shop, the customer preferred to use their existing production monitoring application enriched with IoT data coming from Wimera's Smart Factory Suite. In this case, MongoDB enabled easy and seamless integration of the IoT application with the customer's application via a simple API. Additionally, high-speed data coming from a vibration sensor was handled effectively by MongoDB time series collections, resulting in real-time alerts sent to maintenance teams for instant corrective actions on the shop floor.

In another example, a multinational automotive manufacturer wanted a single platform that could collect and combine data coming from vendors in different formats and contexts. MongoDB's flexible document model helped manage the varied data types easily, allowing the customer to benefit from a single application capable of managing multiple vendors in parallel.
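The multi-vendor scenario described above is easy to picture with the document model: payloads with entirely different shapes can live side by side in one collection. The documents below are illustrative assumptions only, not Wimera's actual schema:

```javascript
// Two vendor payloads with different shapes (illustrative, not Wimera's schema).
// The document model lets both live in the same collection without migrations.
const vendorAReading = {
  vendor: "A",
  machineId: "cnc-14",
  metrics: { spindleRpm: 8200, vibrationMm: 0.12 },
  recordedAt: new Date("2022-11-01T08:00:00Z"),
};

const vendorBReading = {
  vendor: "B",
  machine: { line: 3, id: "press-02" }, // nested identifier instead of a flat one
  vibration: [0.08, 0.09, 0.11],        // sampled series instead of a scalar
  ts: new Date("2022-11-01T08:00:00Z"),
};

// With a live collection handle, both inserts are identical calls:
// await db.collection("machine_data").insertMany([vendorAReading, vendorBReading]);

console.log([vendorAReading.vendor, vendorBReading.vendor].join(","));
```

Onboarding a third vendor with yet another payload shape would mean writing a small mapping in the application, with no schema migration on the database side.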
This flexibility offered by MongoDB enables the customer to keep adding new vendors instantly without changing the underlying cloud infrastructure or tweaking schemas. Interested readers can check out additional case studies on Wimera's website.

Building better together

Wimera and MongoDB's partnership gives customers confidence, with validated architectures to ensure successful, optimized, and scalable deployments at their facilities. Wimera's continued partnership with MongoDB also helps guide the company's product roadmap as we expand in the IIoT and smart factory market together.

"MongoDB is the only enterprise-grade database chosen by the Wimera development team due to easy handling of the large volume of data generated from machines and sensors while maintaining a high performance… If we want to insert thousands of records in a second, then MongoDB is the best choice for that, given our solutions are for Industrial IoT. Also, horizontal scaling (adding new columns) is not an easy process in any RDBMS system. But in the case of MongoDB, it is very easy."

Nagarajan Narayanasamy, CEO, Wimera Systems Private Limited

A bright future ahead

Since 2019, Wimera has been an early adopter of MongoDB for its Industrial IoT application for discrete manufacturing and process industries across multiple domains. "Currently," Narayanasamy says, "Wimera's Industrial IoT solutions are matured, and we are focused on scaling globally." Wimera now targets expansion in India, APAC, the EU, and the USA for the discrete manufacturing and process industries, as well as for select OEMs and machine builders. "As MongoDB continues to scale itself globally through its multi-cloud data distribution strategy, we see a good synergy partnering with MongoDB for the mutual benefit of both companies and the community as a whole. We also would like to work with MongoDB on the technology roadmap and solve some of the real-life challenges faced by manufacturing industries," Narayanasamy says.
Wimera has recently started its MongoDB Atlas journey, and adoption will grow as its customers increasingly demand cloud solutions alongside current on-premises deployments. MongoDB will continue to help IoT companies like Wimera take their product offerings to the next level and enable their customers to digitally transform their manufacturing operations. To learn more about MongoDB's role in industrial connectivity and IIoT, please visit our Manufacturing and Industrial IoT page.

December 1, 2022
Applied

Choosing the Right Tool for the Job: Understanding the Analytics Spectrum

Data-driven organizations share a common desire to get more value out of the data they're generating. To maximize that value, many of them are asking the same or similar questions:

How long does it take to get analytics and insights from our application data?

What would be the business impact if we could make that process faster?

What new experiences could we create by having analytics integrated directly within our customer-facing apps?

How do our developers access the tools and APIs they need to build sophisticated analytics queries directly into their application code?

How do we make sense of voluminous streams of time-series data?

We believe the answer to these questions in today's digital economy is application-driven analytics.

What is Application-Driven Analytics?

Traditionally, there's been a separation at organizations between applications that run the business and analytics that manage the business. They're built by different teams, they serve different audiences, and the data itself is replicated and stored in different systems. There are benefits to the traditional way of doing things, and it's not going away. However, in today's digital economy, where the need to create competitive advantage and reduce costs and risk is paramount, organizations will continue to innovate on the traditional model.

Today, those needs manifest themselves in the demand for smarter applications that drive better customer experiences and surface insights to initiate intelligent actions automatically. This all happens within the flow of the application on live, operational data in real time. Alongside those applications, the business also wants faster insights so it can see what's happening when it's happening. This is known as business visibility, and its goal is to increase efficiency by enabling faster decisions on fresher data. In-app analytics and real-time visibility are enabled by what we call application-driven analytics.
Find out why the MongoDB Atlas developer data platform was recently named a Leader in The Forrester Wave: Translytical Data Platforms, Q4 2022.

You can find examples of application-driven analytics in multiple real-world industry use cases, including:

Hyper-personalization in retail
Fraud prevention in financial services
Preventative maintenance in manufacturing
Single subscriber view in telecommunications
Fitness tracking in healthcare
A/B testing in gaming

Where Application-Driven Analytics fits in the Analytics Ecosystem

Application-driven analytics complements existing analytics processes where data is moved out of operational systems into centralized data warehouses and data lakes. In no way does it replace them. However, a broader spectrum of capabilities is now required to meet more demanding business requirements.

Contrasting the two approaches: application-driven analytics is designed to continuously query data in your operational systems. The freshest data comes in from the application, serving many concurrent users at very low latency. It involves working on much smaller subsets of data compared to centralized analytics systems, typically hundreds to possibly a few thousand records at a time, and it runs less complex queries against that data. At the other end of the spectrum is centralized analytics. These systems run much more complex queries across massive data sets, hundreds of thousands or even millions of records, possibly at petabyte scale, that have been ingested from many different operational data sources across the organization.

Table 1 below identifies the required capabilities across the spectrum of different classes of analytics. These are designed to help MongoDB's customers match appropriate technologies and skill sets to each business use case they are building for.
By mapping required capabilities to use cases, you can see how these different classes of analytics serve different purposes. If, for example, we're dealing with recommendations in an e-commerce platform, the centralized data warehouse or data lake will regularly analyze vast troves of first- and third-party customer data. This analysis is then blended with available inventory to create a set of potential customer offers. These offers are then loaded back into operational systems where application-driven analytics is used to decide which offers are most relevant to the customer based on a set of real-time criteria, such as actual stock availability and which items a shopper might already have in their basket. This real-time decision-making is important because you wouldn't want to serve an offer on a product that can no longer be fulfilled or on an item a customer has already decided to buy. This example demonstrates why it is essential to choose the right tool for the job. Specifically, in order to build a portfolio of potential offers, the centralized data warehouse or data lake is an ideal fit. Such technologies can process hundreds of TBs of customer records and order data in a single query. The same technologies, however, are completely inappropriate when it comes to serving those offers to customers in real time. Centralized analytics systems are not designed to serve thousands of concurrent user sessions. Nor can they access real-time inventory or basket data in order to make low latency decisions in milliseconds. Instead, for these scenarios, application-driven analytics served from an operational system is the right technology fit. As we can see, application-driven analytics is complementary to traditional centralized analytics, and in no way competitive to it. 
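The real-time decision step in the e-commerce scenario above can be sketched in a few lines. This is a hypothetical illustration, not MongoDB-specific code: centralized analytics has already produced a set of candidate offers, and the application filters them against live stock and basket data at request time. All names and the offer shape are assumptions.

```python
# Sketch of application-driven offer selection against live data.
def select_offers(candidate_offers, stock_levels, basket_skus):
    """Drop offers that can't be fulfilled or duplicate a basket item."""
    return [
        offer for offer in candidate_offers
        if stock_levels.get(offer["sku"], 0) > 0   # actual availability
        and offer["sku"] not in basket_skus        # not already chosen
    ]

offers = [
    {"sku": "A100", "discount": 0.10},
    {"sku": "B200", "discount": 0.15},  # out of stock below
    {"sku": "C300", "discount": 0.05},  # already in the basket below
]
print(select_offers(offers, {"A100": 12, "B200": 0, "C300": 4}, {"C300"}))
# → [{'sku': 'A100', 'discount': 0.1}]
```

The expensive work (building the candidate offers) happened in the centralized system; the operational system only performs this cheap, per-request filtering, which is why it can serve thousands of concurrent sessions.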
The benefits to organizations of using these complementary classes of analytics include:

Maximizing competitive advantage through smarter and more intelligent applications
Out-innovating and differentiating in the market
Improving customer experience and loyalty
Reducing cost by improving business visibility and efficiency

Through its design, MongoDB Atlas unifies the essential data services needed to deliver on application-driven analytics. It gives developers the tools, tech, and skills they need to infuse analytics into their apps. At the same time, Atlas provides business analysts, data scientists, and data engineers direct access to live data using their regular tools without impacting the app. For more information about how to implement app-driven analytics and how the MongoDB developer data platform gives you the tools needed to succeed, download our white paper, Application-Driven Analytics: Defining the Next Wave of Modern Apps.

November 30, 2022
Applied

MACH Aligned for Retail: Headless

The MACH Alliance is a non-profit organization fostering the adoption of composable architecture principles, namely Microservices, API-First, Cloud-Native SaaS, and Headless. MongoDB, among many other technology companies, is a member of this Alliance, enabling developers to adopt these principles in their applications. In this article, we’ll focus on the fourth principle championed by the MACH Alliance: Headless. Let’s dive in.

What is headless?

A headless architecture is one where the layers or components of the architecture are decoupled. The “heads” (i.e., frontends) operate independently from the backend logic or “core body” microservices and share data via API. This concept is key to a successful shift toward microservices — without decoupling the architectural layers, you’re running a modern monolith. Looser coupling also increases frontend flexibility and the pace of frontend change, promotes reusability of core features, and reduces downtime because there’s no single point of failure.

Headless applied to retail

Retail was one of the first industries to embrace headless architectures, with the term coined in 2012 by Dirk Hoerig, founder of commercetools. These concepts were originally applied to building modern ecommerce solutions and are now being expanded to any application in the IT stack. In this model, the head can be an ecommerce web frontend, a mobile app, or an internal frontend system for stock management. The core body components support the heads (Figure 1). They can be a payment system, a checkout solution, a product catalog, or a warehouse management application.

Figure 1: The “head” and “core body” components, sharing data via APIs.

Customers and their experiences are at the heart of retail. Adopting headless principles can greatly help companies meet rapidly changing customer requirements and stand out from the competition.
Customers require a seamless journey between mobile, web applications, and in-store, with data and logic consistent across channels. New channels might also need to be added, such as integration with social media to reach a younger customer base. Retailers might need to sell in multiple regions or across product lines, requiring them to adopt multiple frontends to serve different customer groups without having to rewrite or duplicate the whole IT stack. New features might need to be added quickly to reflect competitors’ moves without tracing changes back through every component of the stack or experiencing downtime. Internal workforce systems can follow similar principles.

The common denominators of these example use cases include speed of change and frontend flexibility, avoiding downtime, and reusability of the backend components. Headless solutions enable developers to avoid duplicating efforts by reusing the core capabilities of applications and adapting them to various target systems and use cases. Those principles save developers’ time and can be leveraged to provide a seamless experience to customers, as the underlying data layer and workflows are shared across multiple services offering similar functionalities. Headless architectures also come with the following advantages.

Bring new features to market faster

New features and MVPs can be introduced with minimal impact on other application components. Release cycles can be managed efficiently via a microservice architecture relying on different squads, and new releases can be pushed to production when ready, independently of the work of other squads. For example, a retailer can expand into a new country quickly by developing a country-specific frontend that reuses existing core components and requires no backend downtime.

Scale to meet seasonal demand

Companies can independently scale application components where and when required.
For example, increased user traffic might require more resources to support frontend components, leaving the backend untouched, and vice versa. In an ecommerce scenario, this can take the form of expected deviations from a seasonality standpoint (e.g., end-of-month transactions following salary distribution, holiday shopping) or unplanned variations (e.g., influencer marketing). Thus, this model can result in:

Cost savings: A headless architecture running on the cloud takes fuller advantage of the pay-as-you-go model, as companies pay only for the infrastructure each frontend or backend component requires.
Improved customer experience: Develop highly available and responsive applications so that customer experience is not constrained by computing resources.

Leverage best-of-breed technologies

Headless architectures can help companies gain greater flexibility in deploying and managing the IT stack, allowing them to:

Focus on value-add development: A composable headless architecture enables companies to choose to build or buy individual components in the stack. Because the components are decoupled, they are easier to unpick than in a fully integrated stack, as the APIs can simply be redirected to the new solution. This approach lets companies put their development activity into value-added functionality should a best-of-breed vendor solution arrive on the market delivering core functionality.
Avoid vendor lock-in: Decoupling also allows for more seamless technology switches should companies decide to bring development back in-house or switch vendors.
Improve talent acquisition and retention: Deploying in a flexible and composable manner lets development teams choose the programming languages and tools they feel best match the requirements, allowing companies to attract and retain top talent.
Less downtime with faster troubleshooting

A headless architecture also makes it easier to pinpoint which single layer or component is the root cause of an issue, as opposed to troubleshooting in monolithic applications where dependencies can be difficult to map. Fewer dependencies mean less downtime; when a change or failure occurs in one component, it doesn't affect the whole stack. For ecommerce retailers, any downtime can have a direct impact on revenue, so an architecture that supports a move towards 24/7 uptime is ideal. Removing data silos and sharing data across multiple journeys also enables companies to implement truly omnichannel experiences and leverage the datasets for other downstream processes, such as user personalization and analytics. Learn how Boots is using MongoDB Atlas to standardize their infrastructure via an API and microservice-driven approach.

How can MongoDB help?

Headless architectures require a strong data layer to reap all the above-mentioned benefits. MongoDB includes several key features that enable developers to speed up the pace of delivery of new features and bug fixes, scale with minimal effort, and leverage APIs to share data with the different components of the stack.

Deliver faster with no downtime

MongoDB provides a flexible document model that easily adapts to the needs of different microservices and supports adding new features and data fields without having to rethink the underlying data schema or experience downtime. Let's consider a product catalog microservice that uses a particular API to read data from certain fields. A second microservice can be developed requiring the same set of fields as the first along with a few new ones, connecting via a new API. MongoDB allows the change to be made with no downtime for the product catalog microservice and its related API.

Scale effortlessly

Adding new features and services will likely require scaling the data layer to cater to higher storage and workload demands.
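The product catalog scenario above can be sketched in a few lines. This is an illustrative sketch with hypothetical field names, not MongoDB API code: two generations of product documents coexist in the same collection because the document model does not force a migration, and each microservice reads only the fields it knows about (with PyMongo, these dicts would simply come back from `collection.find()`).

```python
# Two generations of product documents in the same collection; the
# newer one carries fields added later, with no schema migration.
original_doc = {"_id": 1, "name": "Trainers", "price": 59.90}
extended_doc = {"_id": 2, "name": "Boots", "price": 89.90,
                "sustainability_rating": "A", "reviews": []}

def catalog_view(doc):
    """The original catalog microservice reads only the fields it knows;
    newer fields are simply ignored."""
    return {"name": doc["name"], "price": doc["price"]}

def enriched_view(doc):
    """The second microservice reads the extra fields, defaulting when
    absent, so it also works on documents written before the change."""
    view = catalog_view(doc)
    view["sustainability_rating"] = doc.get("sustainability_rating",
                                            "unrated")
    return view

print(catalog_view(extended_doc))
print(enriched_view(original_doc))
```

Because neither service requires the other's fields, either document shape satisfies both readers, which is what lets the new API ship with no downtime for the existing one.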
MongoDB, through its sharding capabilities, enables a distributed architecture by horizontally scaling the data layer and distributing data across multiple servers. This approach can provide better efficiency than a single high-speed, high-capacity server (vertical scaling) for building highly responsive retail solutions.

Support composable architectures

MongoDB also possesses strong API capabilities to support a microservice-based backend architecture and make data accessible and shareable across components (Figure 2). These capabilities include APIs and drivers supporting a dozen programming languages, such as C, Python, Node.js, and Scala. The MongoDB Unified Query API allows working with data of any type, including time series, arrays, and geospatial data. MongoDB Atlas, MongoDB's developer data platform, comes with the Atlas Data API, which allows developers to programmatically create, read, update, and delete data stored on Atlas clusters via standard HTTPS requests. The Atlas GraphQL API allows fine-tuning of API requests by returning only the required data (e.g., information about a particular customer or product).

Figure 2: MongoDB supports a headless architecture via APIs.

Data availability and resiliency should also be considered when adopting headless architectures. MongoDB Atlas clusters are highly available and backed by an industry-leading uptime SLA of 99.995% across all cloud providers. If a primary node becomes unavailable, MongoDB Atlas will automatically fail over in seconds. Clusters can also be deployed across multiple cloud regions to weather the unlikely event of a total region outage, or even across multiple cloud providers.

Summary

Adopting a headless architecture is paramount for retailers wanting to enhance customer experience and build more resilient applications. MongoDB, with its leading database offering, API layer, and high availability, is strongly suited to meet the requirements of modern applications.
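As a concrete illustration of the Atlas Data API's HTTPS interface mentioned above, the sketch below builds a `findOne` request. The URL shape and body follow the Data API's documented pattern, but the App ID, API key, cluster, database, and collection names are all placeholders; the request is constructed rather than sent, so the sketch runs without an Atlas cluster.

```python
import json

def build_find_one(app_id, api_key, sku):
    """Assemble (url, headers, body) for a Data API findOne call.
    All identifiers below are illustrative placeholders."""
    url = (f"https://data.mongodb-api.com/app/{app_id}"
           "/endpoint/data/v1/action/findOne")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    body = json.dumps({
        "dataSource": "Cluster0",   # assumed cluster name
        "database": "retail",       # illustrative database
        "collection": "products",   # illustrative collection
        "filter": {"sku": sku},
    })
    # To actually send it:
    #   urllib.request.urlopen(
    #       urllib.request.Request(url, body.encode(), headers))
    return url, headers, body

url, headers, body = build_find_one("myapp-abcde", "secret-key", "A100")
print(url)
print(json.loads(body)["filter"])
```

Any head (web, mobile, or an internal stock-management frontend) that can issue HTTPS requests can reuse this same backend endpoint, which is the headless pattern in miniature.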
Read our previous blog posts in the MACH series covering Microservices , API-First , and Cloud-Native SaaS .

November 30, 2022
Applied

3 Key Characteristics of Modernization

Analyst and research firm TDWI released its latest report on IT modernization: Maximizing the Business Value of Data: Platforms, Integration, and Management. The report reveals the modernization strategies, objectives, and experiences of more than 300 IT executives, data analysts, data scientists, developers, and enterprise architects. Within the survey itself lies the deeper, fundamental question of what IT modernization means in today's digital economy. It's an important question because it gets at the heart of why organizations want and need to modernize in the first place. Considering the effort, expense, and risks of modernizing, there needs to be a compelling purpose guiding the process in order to keep it on track and ensure its success. By dissecting the TDWI survey questions and responses, we can deduce the three key characteristics of modernization.

#1: Modernization capabilities

If we were to examine the elements and components that comprise modernized architecture, we would get a sense of what modernization looks like but not the purpose behind its deployment. So instead, let's start by looking at the capabilities modern architecture enables so we can get a clearer view of its characteristics and why they matter. Seventy-three percent of survey respondents reported that data democratization and self-service functionality are either extremely or very important. We've heard from numerous organizations that the task of managing data access at companies is slowing down innovation. Ben Herzberg, chief data scientist for data access company Satori, recently told us, "The majority of organizations are still managing access to data in a manual way. Everyone is feeling the bottleneck. The data analyst who wants to do their job in a meaningful way just wants to understand what data sets they can use and get access to it fast." Getting access to data can be challenging without some sort of self-service data access capability.
"Sometimes you have to go through three or four different teams to get access to data," Herzberg says. "It can take a week or two." The TDWI report also indicated a long-standing trend toward easier, more intuitive experiences extending to data integration, data pipelines, data catalog interaction, and monitoring. Survey respondents' top priorities over the next 12 months support this trend. In addition to migrating and consolidating data in the cloud, they intend to prioritize the following key capabilities:

Enabling better data management for data science, AI, and ML
Supporting development and deployment of data-driven applications
Supporting expansion of self-service business intelligence (BI) and analytics users
Unifying management of data across distributed systems

BI and analytics platforms remain one of the fastest-growing software markets. The capabilities necessary to power these systems are in high demand: self-service analytics, faster discovery, predictions based on real-time operational data, and integration of rich and streaming data sets. The survey responses also showed that handling an increase in data volume and the number of concurrent users are modernization priorities. And there's pressure to reduce data latency and increase the frequency of updates. The survey showed that one of the most challenging capabilities organizations are dealing with is enabling low-latency querying, search, and analytics. Giving users the right data at the right time to answer business questions, solve problems, and innovate with data is critical today, and it depends on these capabilities.

#2: Modernization outcomes

The capabilities organizations seek only serve their modernization goals as far as they enable specific outcomes. And it's outcomes that are ultimately driving modernization initiatives. According to the survey, the number one outcome organizations seek to bring about is gaining fuller value from the data they store and capture.
Forty-six percent of respondents cited it as their top challenge. Automating decision-making is another outcome organizations are seeking. Thirty-two percent of respondents rated automating decisions in operations and processes as very important. But it relies on the timely flow of insights into apps, one of the key capabilities identified earlier. Other key modernization outcomes cited in the survey include:

Increase efficiency and effectiveness
Generate new business strategies and models using analytics
Make faster decisions
Strengthen relationships via data sharing
Improve trust and data quality
Increase reuse and flexibility
Reduce costs
Provide authorized access to live data sets
Consolidate data silos

Developers in the survey said they were seeking to embed richer, personalized application experiences, with 52% saying they wanted seamless access to diverse data sets and sources. But first, they'll have to overcome several challenges that have so far proved difficult to solve. Sixty-eight percent of respondents said they face challenges processing streaming data and change data capture updates, while 64% struggle to integrate streaming with fast, high-volume queries, and the same percentage struggle to combine historical and real-time analytics.

#3: Modernization platform

Modernized problems require modernized solutions. And the one most commonly cited by respondents was a data platform, which they believe is the key to maximizing value from data. A data platform solves the issue of consolidating unnecessary data silos and ensuring access to data without the hassle of manual intervention or the risk of unauthorized access. Flexibility in the data platform is critical since data environments will continue to evolve, even after modernization milestones have been met. A data platform is one of the key elements that comprise modernized architecture.
The TDWI survey cited several other advantages of unifying distributed data within a data platform:

Simplifying and accelerating access
Discovering data relationships easier and faster
Creating a logical layer for a single point of access
Unifying data governance
Reducing unnecessary data movement

Modernized architecture

Fifty-four percent of respondents said they were in the process of modernizing, and 29% were planning on doing so. The most frequently cited architectural feature among those modernizing or planning to was migration from on-premises systems to the cloud, with the goal of changing the dimensions of what is possible. But it wasn't just shifting to the cloud that respondents mentioned. The survey also indicated the prevalence of hybrid, multi-cloud architectures, with data integration and management that span distributed data environments. Distributed architectures can lead to higher performance by putting data closest to where it's being used. They also solve data sovereignty issues by putting data where it is required to reside due to regulatory jurisdiction. The report also mentions serverless architecture due to its pay-as-you-go computing model and improved business alignment. With serverless architecture, developers can build applications without thinking about infrastructure or traditional server management. Read the full TDWI report, Maximizing the Business Value of Data: Platforms, Integration, and Management.

November 28, 2022
Applied

Modernize your GraphQL APIs with MongoDB Atlas and AWS AppSync

Modern applications typically need data from a variety of data sources, which are frequently backed by different databases and fronted by a multitude of REST APIs. Consolidating the data into a single coherent API presents a significant challenge for application developers. GraphQL emerged as a leading data query and manipulation language to simplify consolidating various APIs. GraphQL provides a complete and understandable description of the data in your API, giving clients the power to ask for exactly what they need — while making it easier to evolve APIs over time. It complements popular development stacks like MEAN and MERN, aggregating data from multiple origins into a single source that applications can then easily interact with.

MongoDB Atlas: A modern developer data platform

MongoDB Atlas is a modern developer data platform with a fully managed cloud database at its core. It provides rich features like native time series collections, geospatial data, multi-level indexing, search, isolated workloads, and many more — all built on top of the flexible MongoDB document data model. MongoDB Atlas App Services help developers build apps, integrate services, and connect to their data by reducing operational overhead through features such as the hosted Data API and GraphQL API. The Atlas Data API allows developers to easily integrate Atlas data into their cloud apps and services over HTTPS with a flexible, REST-like API layer. The Atlas GraphQL API lets developers access Atlas data from any standard GraphQL client with an API that is generated based on your data's schema.

AWS AppSync: Serverless GraphQL and pub/sub APIs

AWS AppSync is an AWS managed service that allows developers to build GraphQL and Pub/Sub APIs. With AWS AppSync, developers can create APIs that access data from one or many sources and enable real-time interactions in their applications.
The resulting APIs are serverless, automatically scale to meet the throughput and latency requirements of the most demanding applications, and charge only for requests to the API and for real-time messages delivered.

Exposing your MongoDB Data over a scalable GraphQL API with AWS AppSync

Together, AWS AppSync and MongoDB Atlas help developers create GraphQL APIs by integrating multiple REST APIs and data sources on AWS. This gives frontend developers a single GraphQL API data source to drive their applications. Compared to REST APIs, developers get flexibility in defining the structure of the data while reducing the payload size by returning only the attributes that are required. Additionally, developers are able to take advantage of other AWS services such as Amazon Cognito, AWS Amplify, Amazon API Gateway, and AWS Lambda when building modern applications. This allows for a serverless end-to-end architecture, which is backed by MongoDB Atlas serverless instances and available in pay-as-you-go mode from the AWS Marketplace.

Paths to integration

AWS AppSync uses data sources and resolvers to translate GraphQL requests and to retrieve data; for example, users can fetch MongoDB Atlas data using AppSync Direct Lambda Resolvers. Below, we explore two approaches to implementing Lambda Resolvers: using the Atlas Data API or connecting directly via MongoDB drivers.

Using the Atlas Data API in a Direct Lambda Resolver

With this approach, developers leverage the pre-created Atlas Data API when building a Direct Lambda Resolver. This ready-made API acts as a data source in the resolver, and supports popular authentication mechanisms based on API keys, JWT, or email-password. This enables seamless integration with Amazon Cognito to manage customer identity and access. The Atlas Data API lets you read and write data in Atlas using standard HTTPS requests and comes with managed networking and connections, replacing your typical app server.
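A hedged sketch of what such a Direct Lambda Resolver might look like is shown below. The event shape follows AppSync's direct resolver payload (`info.fieldName` plus `arguments`), but the GraphQL field, Data API endpoint, and data names are illustrative, and the HTTP call is injectable so the sketch runs locally without AWS or Atlas. In a real deployment, the API key would come from AWS Secrets Manager and `post` would issue an HTTPS request.

```python
# Hypothetical endpoint; the App ID segment is a placeholder.
DATA_API_URL = "https://data.mongodb-api.com/app/myapp-abcde/endpoint/data/v1"

def lambda_handler(event, context, post=None):
    """Route an AppSync GraphQL field to an Atlas Data API action."""
    field = event["info"]["fieldName"]
    args = event.get("arguments", {})
    if field == "getProduct":  # hypothetical query field
        payload = {
            "dataSource": "Cluster0",   # assumed cluster name
            "database": "retail",       # illustrative names
            "collection": "products",
            "filter": {"sku": args["sku"]},
        }
        # `post` performs the HTTPS call; injected here so the sketch
        # can be exercised without network access.
        response = post(f"{DATA_API_URL}/action/findOne", payload)
        return response.get("document")
    raise ValueError(f"Unhandled field: {field}")

# Stub standing in for the network call, for local testing only.
def fake_post(url, payload):
    return {"document": {"sku": payload["filter"]["sku"],
                         "name": "Trainers"}}

event = {"info": {"fieldName": "getProduct"},
         "arguments": {"sku": "A100"}}
print(lambda_handler(event, None, post=fake_post))
# → {'sku': 'A100', 'name': 'Trainers'}
```

The returned document becomes the resolved value of the GraphQL field, which is all AppSync requires from a direct resolver.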
Any runtime capable of making HTTPS calls is compatible with the API.

Figure 1: Architecture details of the Direct Lambda Resolver with the Data API

Figure 1 shows how AWS AppSync leverages the AWS Lambda Direct Resolver to connect to the MongoDB Atlas Data API. The Atlas Data API then interacts with your Atlas cluster to retrieve and store the data.

MongoDB driver-based Direct Lambda Resolver

With this option, the Lambda Resolver connects to MongoDB Atlas directly via drivers, which are available in multiple programming languages and provide idiomatic access to MongoDB. MongoDB drivers support a rich set of functionality and options, including the MongoDB Query Language, write and read concerns, and more.

Figure 2: Architecture of Direct Lambda Resolvers using native MongoDB drivers

Figure 2 shows how the AWS AppSync endpoint leverages Lambda Resolvers to connect to MongoDB Atlas. The Lambda function uses a MongoDB driver to make a direct connection to the Atlas cluster, and to retrieve and store data. The table below summarizes the different resolver implementation approaches.

Table 1: Feature comparison of resolver implementations

Setup

Atlas cluster: Set up a free cluster in MongoDB Atlas. Configure the database for network security and access. Set up the Data API.
Secrets Manager: Create an AWS Secrets Manager secret to securely store database credentials.
Lambda function: Create Lambda functions with the MongoDB Data API or MongoDB drivers as shown in this GitHub tutorial.
AWS AppSync setup: Set up AWS AppSync to configure the data source and query.
Test API: Test the AWS AppSync APIs using the AWS Console or Postman.

Figure 3: Test results for the AWS AppSync query

Conclusion

To learn more, refer to the AppSync Atlas Integration GitHub repository for step-by-step instructions and sample code. This solution can be extended to AWS Amplify for building mobile applications. For further information, please contact partners@mongodb.com.

November 23, 2022
Applied

MongoDB Joins Auth0 to Help Startups Combat Security Risks

We are excited to announce that MongoDB for Startups is collaborating with Auth0 for Startups to provide top-tier security for applications built by the most innovative startups.

Why should a startup be part of the MongoDB and Auth0 startup programs?

Customers, investors, and stakeholders expect many different things from a company, but one common requirement is responsibly managing their data. Companies choose MongoDB because it accelerates application development and makes it easier for developers to work with data. Developers mindful of security, compliance, and privacy when it comes to data use the robust Auth0 platform to create great customer experiences with features like single sign-on and multi-factor authentication.

“Auth0 and MongoDB are very complementary in nature. While MongoDB provides a strong, secure data platform to store sensitive workloads, Auth0 provides secure access for anyone with the proper authorization," says Soumyarka Mondal, Co-founder of Sybill.ai. "We are safely using Auth0 as one of the data stores for the encryption piece, as well as using those keys to encrypt all of our users’ confidential information inside MongoDB.”

What is the Auth0 for Startups Program?

Auth0, powered by Okta, takes a modern approach to identity and enables startups to provide secure access to any application, for any user. Through Auth0 for Startups, we are bringing the convenience, privacy, and security of Auth0 to early-stage ventures, allowing them to focus on growing their business quickly. The Auth0 for Startups program is free for one year and supports:

100,000 monthly active users
Five enterprise connections
Passwordless authentication
Breached password detection
50+ integrations, 60+ SDKs, and 50+ social & IdP connections

What is the MongoDB for Startups Program?

MongoDB for Startups is focused on enabling the success of high-growth startups from ideation to IPO.
The program is designed to give startups access to the best technical database for their rapidly scaling ventures. Program participants receive:

$500 in credits for all MongoDB cloud products (valid for 12 months)
A dedicated technical advisor for a two-hour, one-to-one consultation to help you with your data migration and optimization
Co-marketing opportunities
Access to the MongoDB developer ecosystem and to our VC partners

Apply to Auth0 for Startups and the MongoDB for Startups Program today.

November 23, 2022
Applied

MongoDB and AWS: Simplifying OSDU Metadata Management

In this decade of the 2020s, the energy sector is experiencing two major changes at the same time: the transition from fossil fuels to renewables, and the digital transformation that changes the way businesses operate through better applications and tools that help streamline and automate processes. To support both of these challenges, the Open Group OSDU Forum has created a new data platform standard for the energy industry that seeks to reduce data silos and enable transformational workflows via an open, standards-based API set and supporting ecosystem. OSDU (Open Subsurface Data Universe) is an industry-defining initiative that provides a unified approach to storing and retrieving data in a standardized way in order to reduce infrastructure cost, simplify the integration of separate business areas, and adopt new energy verticals within the same architectural principles. Amazon Web Services (AWS) — an early supporter of OSDU — provides a premier, cloud-first offering available across more than 87 availability zones and 27 regions. MongoDB — an OSDU member since 2019 — and AWS are collaborating to leverage MongoDB as part of the AWS OSDU platform for added flexibility and to provide a robust multi-region OSDU offering to major customers.

Why MongoDB for OSDU?

OSDU presents a unique challenge, as its architecture is set to support a varied data set originating from the oil and gas industry, while also being extensible enough to support the expanding requirements of new energy and renewables. It must be able to support single use on a laptop for beginning practitioners, yet scale to the needs of experts with varying deployment scenarios — from on-premises, in-field, and cloud — and from single-tenant on one region to multi-region and multi-tenant applications.
Furthermore, OSDU architectural principles separate raw object data from the metadata that describes it, which puts an additional burden on the flexibility needed to manage OSDU metadata while supporting all the above requirements.

Enter MongoDB

Since 2008, MongoDB has championed the use of the document model as a data store that supports a flexible JSON-type structure, which can be considered a superset of different existing data types — from tabular, key-value, and text to geospatial, graph, and time series. Thus, MongoDB has the flexibility not only to support the main metadata services in OSDU but also to adapt to the needs of domain-specific services as OSDU evolves. The flexibility of MongoDB allows users to model and query the data in a variety of ways within the same architecture, without the need to proliferate disparate databases for each specific data type, an approach that incurs overhead in deployment, cost, and scale, and complicates querying. The schema flexibility inherent in the document model allows developers to adapt and make changes quickly, without the operational burden that comes with schema changes in traditional tabular databases. MongoDB can also scale from the smallest environment to massive, multi-region deployments, with cross-regional data replication support that is available today across more than 90 regions with MongoDB Atlas. With the addition of MongoDB's cluster-to-cluster sync, MongoDB can easily support hybrid deployments bridging on-premises or edge environments to the cloud, a requirement that is increasingly important for energy supermajors and for regions where data sovereignty is paramount.

Example: LegalTag

An example of the benefit of MongoDB's document model is OSDU's LegalTag Compliance Service, which governs the legal status of data in the OSDU data ecosystem. A LegalTag is a collection of JSON properties that governs how the data can be consumed and ingested.
With MongoDB, the properties are directly stored, indexed, and made available to be queried — even via full-text search for more advanced use cases. The schema flexibility simplifies integrating additional derived data from ingested data sources, which is used to further enrich the LegalTag metadata. Here, the JSON document can simply accommodate more nodes to integrate this data, without new tables and data structures that need to be created and managed. AWS OSDU with MongoDB MongoDB and AWS collaborated to provide a MongoDB-based metadata implementation (Figure 1), which is available for all main OSDU services: Partition, Entitlements, Legal, Schema, and Storage. The AWS default OSDU Partition service leverages MongoDB due to its simple replication capabilities (auto-deployable via CloudFormation, Terraform, and Kubernetes), which simplify identifying the correct connection information for each OSDU partition at runtime in a multi-region, multi-cluster deployment. The OSDU Entitlements service manages authorization and permissions for access to OSDU services and their data, using groups. The most recent OSDU reference implementation for Entitlements leverages a graph model to manage the relationships between groups, members, and owners. Thus, AWS again chose MongoDB, with its inherent graph capabilities through the document model, to simplify the implementation without the need to integrate a further dedicated database technology into the architecture. Figure 1: MongoDB metadata service options with AWS OSDU. Other potential benefits for OSDU MongoDB also offers workload isolation, which provides the ability to dedicate instances solely to reporting workloads against the operational dataset. This makes it possible to build real-time observability of the system based on activity on the metadata.
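To sketch why the document model covers this graph case without a dedicated graph database: membership can be stored as one document per group, with members that are either users or other groups, and effective (transitive) membership resolved by walking those references. In MongoDB itself this traversal can be expressed with the $graphLookup aggregation stage; the in-memory layout and group names below are illustrative assumptions, not the OSDU reference schema:

```python
# Each group is one document; members may be users or other groups.
# The layout and group names are illustrative, not the OSDU schema.
groups = {
    "data.default.viewers": {"members": ["alice@example.com", "ops.team"]},
    "ops.team": {"members": ["bob@example.com"]},
    "service.storage.admin": {"members": ["ops.team"]},
}

def effective_groups(user: str) -> set[str]:
    """Return every group the user belongs to, directly or via nested
    group membership (a breadth-first traversal of the graph)."""
    direct = {g for g, doc in groups.items() if user in doc["members"]}
    frontier, result = set(direct), set(direct)
    while frontier:
        # Groups that contain any group found so far are also effective.
        frontier = {
            g for g, doc in groups.items()
            if frontier & set(doc["members"]) and g not in result
        }
        result |= frontier
    return result
```

Here `bob@example.com` is a direct member only of `ops.team`, but because `ops.team` is itself a member of two other groups, his effective membership includes all three, which is exactly the relationship Entitlements needs to resolve at authorization time.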
Triggers and aggregation pipelines allow the creation of an alternate view of activity in real time, which can easily be visualized via MongoDB Charts (part of Atlas) without the need for a dedicated visualization system. Flexibility and consistency A major use case for both the energy industry and the direction of OSDU is the ability to capture and preprocess data closest to where it originated. For remote locations where direct connections to the cloud are prohibitive, this approach is often the only option — think Arctic or offshore locations. Additionally, certain countries have data sovereignty laws that require an alternative deployment option outside of the public cloud. A MongoDB-based OSDU implementation can provide a distinct advantage here, as MongoDB itself supports deployment in the field (e.g., offshore), on-premises, in a private cloud (e.g., Kubernetes, Terraform), in the public cloud (e.g., AWS), and as a SaaS implementation (e.g., Atlas). Adopting MongoDB for OSDU provides consistency across these deployment and cloud scenarios, thereby reducing the overhead of managing and operating a disparate set of technologies where multiple scenarios are required. Conclusion OSDU was created to change the way data is collected and shared across the oil and gas industry and the wider energy sector, with the intent of accelerating digital transformation within the industry. The range of use cases and deployment scenarios requires a solution that provides flexibility in the supported datasets, flexibility for developers to innovate without additional schema and operational burden, and flexibility to be deployed in various environments. Through the collaboration of AWS and MongoDB, there is now an additional metadata storage option available for OSDU that provides a modern technology stack with the performance and scalability for the most demanding scenarios in the energy industry. 1. MongoDB Atlas 2. MongoDB Edge Computing 3. OSDU Data Platform on AWS

November 22, 2022
Applied

Manage and Store Data Where You Want with MongoDB

Increasingly, data is stored in a public cloud as companies realize the agility and cost benefits of running on cloud infrastructure. At any given time, however, organizations must know where their data is located, replicated, and stored — as well as how it is collected and processed to constantly ensure personal data privacy. Creating a proper structure for storing your data just where you want it can be complex, especially with the shift towards geographically dispersed data and the need to comply with local and regional privacy and data security requirements. Organizations without a strong handle on where their data is stored potentially risk millions of dollars in regulatory fines for mishandling data, loss of brand credibility, and distrust from customers. Geographically dispersed data and various compliance regulations also impact how organizations design their applications, and many see these challenges as an opportunity to transform how they engage with data. For example, organizations get the benefits of a multi-cloud strategy and avoid vendor lock-in, knowing that they can still run on-premises or on a different cloud provider. However, a flexible data model is needed to keep data within the confines of the country or region where the data originates. MongoDB runs where you want your data to be — on-premises, in the cloud, or as an on-demand, fully managed global cloud database. In this article, we’ll look at ways MongoDB can help you keep your data exactly where you need it. Major considerations for managing data When managing data, organizations must answer questions in several key areas, including: Process: How is your company going to scale security practices and automate compliance for the most prevalent data security and privacy regulatory frameworks? Penalties: Are your business leaders fully aware of the costs associated with not adhering to regulations when storing and managing your data? 
Scalability: Do you have an application that you anticipate will grow in the future and can scale automatically as demand requires? Infrastructure: Is legacy infrastructure keeping you from being able to easily comply with data regulations? Flexibility: Is your data architecture agile enough to meet regulations quickly as they grow in breadth and complexity? Cost: Are you wasting time and money on manual processes when adhering to regulations and risking hefty fines related to noncompliance? How companies use MongoDB to store data where they want and need it When storing and managing data in different regions and countries, organizations must also understand the rules and regulations that apply. MongoDB is uniquely positioned to help organizations meet their data goals, with intuitive security features and privacy controls, as well as the ability to geographically deploy data clusters and backups in one or several regions. Zones in sharded clusters MongoDB uses sharding to support deployments with very large data sets and high-throughput operations. In sharded clusters, you can create zones of sharded data based on the shard key, which helps improve the locality of data. Network isolation and access Each MongoDB Atlas project is provisioned into its own virtual private cloud (VPC), thereby isolating your data and underlying systems from other MongoDB Atlas users. This approach allows businesses to meet data requirements while staying highly available within each region. Each shard of data will have multiple nodes that automatically and transparently fail over for zero downtime, all within the same region. Multi-cloud clusters MongoDB Atlas is the only globally distributed, multi-cloud database. It lets you deploy a single cluster across AWS, Microsoft Azure, and Google Cloud without the operational complexity of managing data replication and migration across clouds.
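To make the zone idea concrete, here is a minimal sketch in plain Python with made-up zone, shard, and country values; in a real cluster this mapping is configured with the mongosh helpers sh.addShardToZone and sh.updateZoneKeyRange rather than in application code:

```python
# Toy model of zone sharding: each zone owns a set of shards and a set of
# countries covered by the location prefix of the shard key.
# Zone, shard, and country values here are made up for illustration.
ZONES = {
    "EU": {"shards": ["shard-eu-1"], "countries": {"DE", "FR", "NL"}},
    "NA": {"shards": ["shard-na-1"], "countries": {"US", "CA"}},
}

def zone_for(document: dict) -> str:
    """Return the zone whose shard-key range covers this document,
    i.e., the only zone whose shards may ever hold it."""
    for zone, config in ZONES.items():
        if document["location"] in config["countries"]:
            return zone
    raise ValueError(f"no zone covers location {document['location']!r}")

# A German record can only land on EU-zone shards, which is how data
# residency is enforced at the database layer.
eu_zone = zone_for({"location": "DE", "customerId": 4711})
```

In MongoDB Atlas global clusters, the same effect is achieved declaratively: you choose the zone layout and Atlas maintains the corresponding shard-key ranges for you.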
With the ability to define a geographic location for each document, your teams can also keep relevant data close to end users for regulatory compliance. IP whitelists IP whitelists allow you to specify the ranges of IP addresses from which access will be granted, delivering granular control over data. Queryable encryption Queryable encryption enables encryption of sensitive data from the client side, stored as fully randomized, encrypted data on the database server side. This feature delivers strong security without sacrificing performance and is available on both MongoDB Atlas and Enterprise Advanced. MongoDB Atlas global clusters Atlas global clusters allow organizations with distributed applications to geographically partition a fully managed deployment in a few clicks and to control the distribution and placement of their data with sophisticated policies that can be easily generated and changed. Thus, your organization can not only achieve compliance with local data protection regulations more easily but also reduce overhead. Client-Side Field Level Encryption MongoDB’s Client-Side Field Level Encryption (FLE) dramatically reduces the risk of unauthorized access to or disclosure of sensitive data. Fields are encrypted before they leave your application, protecting them everywhere — in motion over the network, in database memory, at rest in storage and backups, and in system logs. Segmenting data by location with sharded clusters As your application grows more popular, you may reach a point where your servers hit their maximum load. Before that happens, you must plan for scaling your database to adjust resources to meet demand. Scaling can be temporary, prompted by a sudden burst of traffic, or permanent, driven by a constant increase in the popularity of your services.
Increased usage of your application brings three main challenges to your database server: The CPU and/or memory becomes overloaded, and the database server either cannot respond to all of the request throughput or cannot do so in a reasonable amount of time. Your database server runs out of storage and thus cannot store all the data. Your network interface is overloaded, so it cannot support all the network traffic received. When your system resource limits are reached, you will want to consider scaling your database. Horizontal scaling refers to bringing on additional nodes to share the load. This is difficult with relational databases because of the difficulty of spreading related data across nodes. It is simpler with non-relational databases, because collections are self-contained and not coupled relationally, which allows them to be distributed across nodes more simply, as queries do not have to “join” them together across nodes. Horizontal scaling with MongoDB Atlas is achieved through sharding. With sharded clusters, you can create zones of sharded data based on the shard key. You can associate each zone with one or more shards in the cluster, and a shard can be associated with any number of zones. In a balanced cluster, MongoDB migrates chunks covered by a zone only to those shards associated with the zone. Distributing each shard’s replica set members across multiple data centers also determines availability: If one of the data centers goes down, the data is still available for reads, unlike with a single data center distribution. If the data center with a minority of the members goes down, the replica set can still serve write operations as well as read operations. However, if the data center with the majority of the members goes down, the replica set becomes read-only. Figure 1 illustrates a sharded cluster that uses geographic zones to manage and satisfy data segmentation requirements.
Figure 1:   Sharded cluster Other benefits of MongoDB Atlas MongoDB Atlas also provides organizations with an intuitive UI or administration API to efficiently perform tasks that would otherwise be very difficult. Upgrading your servers or setting up sharding without having to shut down your servers can be a challenge, but MongoDB Atlas removes this layer of difficulty through the features described here. With MongoDB, scaling your databases can be done with a couple of clicks. Meeting your data goals with MongoDB Organizations are uniquely positioned to store and manage data where they want it with MongoDB’s range of features discussed above. With the shift towards geographically dispersed data, organizations must make sure they are aware of – and fully understand – the local and regional rules and requirements that apply for storing and managing data. To learn more about how MongoDB can help you meet your data goals, check out the following resources: MongoDB Atlas security, with built-in security controls for all your data Entrust MongoDB Cloud Services with sensitive application and user data Scalability with MongoDB Atlas

November 22, 2022
Applied

Optimizing Your MongoDB Deployment with Performance Advisor

We are happy to announce additional enhancements to MongoDB’s Performance Advisor, now available in MongoDB Atlas, MongoDB Cloud Manager, and MongoDB Ops Manager. MongoDB’s Performance Advisor automatically analyzes logs for slow-running queries and provides index suggestions to improve query performance. In this latest update, we’ve made some key changes, including: A new ranking algorithm and additional performance statistics (e.g., average documents scanned, average documents returned, and average object size) that make it easier to understand the relative importance of each index recommendation. Support for additional query types, including regexes, negation operators (e.g., $ne, $nin, $not), $count, $distinct, and $match, to ensure broader coverage of queries with optimized index suggestions. Index recommendations that are now more deterministic, so they are less affected by timing and provide more consistent query performance benefits. Before diving further into MongoDB’s Performance Advisor, let’s look at the tools MongoDB provides out of the box to simplify database monitoring. Background Deploying your MongoDB cluster and getting your database running is a critical first step, but another important aspect of managing your database is ensuring that it is performant and running efficiently. To make this easier, MongoDB offers several out-of-the-box monitoring tools, such as the Query Profiler, Performance Advisor, Real-Time Performance Panel, and Metrics Charts, to name a few. Suppose you notice that your database queries are running slower. The first place you might go is the metrics charts, to look at the “Opcounters” metrics to see whether you have more operations running. You might also look at “Operation Execution Time” to see if your queries are taking longer to run. The “Query Targeting” metric shows the ratio of the number of documents scanned to the number of documents returned.
This data point is a great measure of the overall efficiency of a query — the higher the ratio, the less efficient the query. These and other metrics can help you identify performance issues with your overall cluster, which you can then use as context to dive a level deeper and perform more targeted diagnostics of individual slow-running queries. MongoDB’s Performance Advisor takes this functionality a step further by automatically scanning your slowest queries and recommending indexes where appropriate to improve query performance. Getting started with Performance Advisor The Performance Advisor is a unique tool that automatically monitors MongoDB logs for slow-running queries and suggests indexes to improve query performance. Performance Advisor helps improve both your read and write performance by intelligently recommending indexes to create and/or drop (Figure 1). These suggestions are ranked by their determined impact on your cluster. Performance Advisor is available on M10 and above clusters in MongoDB Atlas, as well as in Cloud Manager and Ops Manager. Figure 1: Performance Advisor can recommend indexes to create or drop. Performance Advisor will suggest which indexes to create, which queries will be affected by the index, and the expected improvements to query performance. All of this functionality is available directly within the Performance Advisor user interface, and indexes can be created with just a few clicks. Figure 2 shows additional Performance Advisor statistics about the performance improvements this index would provide.
The performance statistics highlighted for each index recommendation include: Execution Count: The number of queries per hour that would be covered by the recommended index Avg Execution Time: The average execution time of queries that would be covered by the recommended index Avg Query Targeting: The inefficiency of queries that would be covered by the recommended index, measured by the number of documents or index keys scanned in order to return one document In Memory Sort: The number of in-memory sorts performed per hour for queries that would be covered by the recommended index Avg Docs Scanned: The average number of documents that were scanned by slow queries with this query shape Avg Docs Returned: The average number of documents that were returned by slow queries with this query shape Avg Object Size: The average object size of all objects in the impacted collection If you have multiple index recommendations, they are ranked by their relative impact on query performance, so that the most beneficial index suggestion is displayed at the top. Figure 2: Detailed performance statistics. Creating optimal indexes ensures that queries are not scanning more documents than they return. However, creating too many indexes can slow down write performance, as each write operation needs to update the relevant indexes. Performance Advisor provides suggestions on which indexes to drop based on whether they are unused or redundant (Figure 3). Users also have the option to “hide” indexes as a way to evaluate the impact of dropping an index without actually dropping it. Figure 3: Performance Advisor shows which indexes are unused or redundant. The Performance Advisor in MongoDB provides a simple and cost-efficient way to ensure you’re getting the best performance out of your MongoDB database.
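The “Avg Query Targeting” statistic and the impact ranking described above can be sketched as follows (plain Python over invented per-query-shape numbers; the real Performance Advisor derives its statistics from your slow query logs and uses a richer ranking than this single ratio):

```python
# Hypothetical per-query-shape statistics, shaped like the fields
# Performance Advisor reports; the numbers below are invented.
recommendations = [
    {"index": {"userId": 1}, "avg_docs_scanned": 52000, "avg_docs_returned": 40},
    {"index": {"createdAt": -1}, "avg_docs_scanned": 3000, "avg_docs_returned": 100},
]

def query_targeting(rec: dict) -> float:
    """Documents scanned per document returned: the higher the ratio,
    the less efficient the query shape, so the more impactful an index
    that covers it would be."""
    return rec["avg_docs_scanned"] / max(rec["avg_docs_returned"], 1)

# Rank so the recommendation fixing the least efficient shape comes first.
ranked = sorted(recommendations, key=query_targeting, reverse=True)
```

In this toy data, the first shape scans 1,300 documents for every document it returns, so the index on `userId` is ranked above the one on `createdAt`.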
If you’d like to see the Performance Advisor in action, the easiest way to get started is to sign up for MongoDB Atlas, our cloud database service. Performance Advisor is available on MongoDB Atlas on M10 cluster tiers and higher. Learn more from the following resources: Monitor and Improve Slow Queries Monitor Your Database Deployments

November 22, 2022
Applied
