You Asked, We Listened. It's Here - Dark Mode for Atlas is Now Available in Public Preview
We are thrilled to announce a much-anticipated feature for MongoDB Atlas: dark mode is now available in Public Preview for users worldwide. Dark mode has been the number one requested feature in MongoDB's feedback forum, and we've taken note. Users have tried browser plugins and other makeshift fixes, but now the wait is over. Our development team worked diligently to introduce a dark mode option, improving the user experience with a fresh take on the familiar Atlas interface. This update, which includes 300 converted pages, is not just for our community. It also benefits us as developers, promoting a seamless dark mode experience across the different tools in the developer workflow. Dark mode is sleek and sophisticated, aligning with the preferred working styles of many of our developers. Remember that this is an ongoing project, and there may be areas within Atlas that need refining. Rest assured, we will be monitoring our feedback channels closely.

Not just a sleek interface

We took a thoughtful approach to the overall dark mode user experience, particularly with respect to accessibility. We ensured that our dark mode theme met accessibility standards by checking and adjusting all text, illustrations, and UI elements for color and contrast, helping reduce eye strain and accommodate light sensitivity while keeping everything easy to read. We also tuned the overall light-to-dark background contrast, staying mindful of how backgrounds layer and interact with other elements.

Beyond aesthetics, dark mode is a proven way to extend battery life. For users with OLED or AMOLED screens, dark mode stretches the device's battery life further by illuminating fewer pixels and encouraging lower brightness levels.

Health benefits

A typical engineer spends no fewer than eight hours a day in front of a computer, exposing their eyes to multiple digital screens, according to data from Medium.
This screen time can lead to dry eyes, insomnia, and headaches. While dark text on a light background is best for legibility, light text on a dark background helps reduce eye strain in low-light conditions.

Enable the dark mode preview today

To update the theme at any time, navigate to the User Menu in the top right corner, then select User Preferences. Under Appearance, there are three options:

Light Mode: This is the default color scheme.
Dark Mode: Our new dark theme.
Auto (Sync with OS): This setting matches the operating system's theme.

A few things to keep in mind

This is a user setting and does not affect other users within a project or organization. Dark mode is not currently available for Charts, Documentation, University, or Cloud Manager. Since we are releasing this in Public Preview, there might be some minor visual bugs. The goal of a Public Preview release is to generate interest and gather feedback from early adopters; it is not necessarily feature-complete and does not typically include consulting, SLAs, or technical support obligations. We have conducted comprehensive internal testing and did not find anything that prevents users from using Atlas. While we are still putting on a few finishing touches, feel free to share any feedback using this form.

Thank you to all our users who provided valuable feedback and waited patiently for this feature! Keep the feedback coming. We hope you enjoy dark mode, designed to improve accessibility, reduce eye strain and fatigue, and enhance readability. We invite you to experience the difference. Try dark mode today through your MongoDB Atlas portal.
Data Resilience with MongoDB Atlas
Data is the central currency in today's digital economy. Studies have shown that 43% of companies that experience major data loss incidents are unable to resume business operations. A range of scenarios can lead to data loss, but within the realm of database technology they typically fall into three main categories: catastrophic technical malfunctions, human error, and cyber attacks. A data loss event due to a catastrophic breakdown, human error, or cyber attack is not a matter of if, but of when. Businesses therefore need to focus on how to avoid these events and minimize their effects as much as possible. Failure to effectively address these risks can lead to extended downtime, ranging from a few hours to several weeks, following an incident. The average cost of a cyber attack is a staggering $4.45 million, with some attacks costing hundreds of millions. Reputational harm is harder to quantify but no doubt real and substantial. Your industry may also be subject to regulatory frameworks designed to counter cyber attacks, and businesses subject to such regimes must maintain compliance with their requirements. This can determine the configuration of your disaster recovery approach.

In this blog post, we'll explain the key disaster recovery (DR) capabilities available with MongoDB Atlas. We'll also cover the core responsibilities and strategies for data resilience, including remediation and recovery objectives (RTO/RPO).

Planning for data resilience in Atlas

Data resilience is not a one-size-fits-all proposition, which is why Atlas offers a range of choices for a comprehensive strategy. Sensible defaults ensure you're automatically safeguarded, while a variety of options let you precisely align protection with the needs of each individual application. When formulating a disaster recovery plan, organizations commonly begin by assessing their recovery point objective (RPO) and recovery time objective (RTO).
The RPO specifies how much data the business can tolerate losing during an incident, while the RTO indicates how quickly it must recover. Since not all data carries the same urgency, it is important to analyze the RPO and RTO on a per-application basis. For instance, critical customer data might have stricter demands than clickstream analytics. The RTO and RPO criteria, along with how long you need to retain backups, will influence the financial and performance implications of maintaining backups. MongoDB Atlas provides standard protective measures by default, with customizable options for tailoring protection to the service level agreements specified by the RPO and RTO in your DR plan. These are enhanced by additional features that can be leveraged to achieve greater availability and durability for your most vital workloads. These features can be grouped into two main categories: prevention and recovery.

Backup, granular recovery, and resilience

Many built-in features are designed to prevent disasters from ever happening in the first place. Key capabilities that enable a comprehensive prevention strategy include multi-region and multi-cloud clusters, encryption at rest, Queryable Encryption, cluster termination safeguards, backup compliance protocols, and the capability to test resilience. (We will discuss these features in depth in part two of this series.) While prevention might satisfy the resilience needs of certain applications, others may demand greater resilience against failures based on the business requirements for data protection and disaster recovery. MongoDB provides comprehensive management of data backups, including the geographic distribution of backups across multiple regions and the ability to prevent backups from being deleted, all through an automated retention schedule.
Recovery capabilities are aimed at supporting RTO and minimizing data loss, and include continuous cloud backups with point-in-time recovery. Atlas cloud backups utilize the native snapshot feature of your cluster's cloud service provider, ensuring backup storage is kept separate from your MongoDB Atlas instances. Backups are essentially snapshots that capture the state of your database cluster at a specific moment. They serve as a safeguard in case data is lost or becomes corrupted. For M10+ clusters, you have the option of using Atlas Cloud Backups, which leverage the cluster's cloud service provider to store backups locally. Atlas comes with a strong default backup retention of 12 months. You can also customize snapshot and retention schedules, including the time of day snapshots are taken, their frequency over time, and their retention duration. Another important feature is continuous cloud backup with point-in-time recovery, which enables you to restore data to the moment just before an incident or disruption, such as a cyber attack. To ensure your backups are regionally redundant, and that you can still restore even if the primary region holding your backups is down, MongoDB Atlas can copy these critical backups, including the point-in-time data, to any secondary region available from your cloud provider. For the most stringent regulations, or for businesses that want to ensure backups remain available even after a bad actor or cyber attack, the Backup Compliance Policy ensures that no user, regardless of role, can delete a backup before a predefined protected retention period has elapsed. Whatever your regulatory obligations or business needs, MongoDB Atlas provides the flexibility to tailor your backup settings to your requirements.
Crucially, this ensures you can recover quickly, minimizing data loss and meeting your RPO in a disaster recovery scenario. When properly configured, testing has shown that Atlas can quickly recover to the exact timestamp before a disaster or failure event, giving you a zero-minute RPO and an RTO of less than 15 minutes when utilizing optimized restores. Recovery times can vary due to cloud provider disk warming and the point in time you are restoring to, so it is important to test this regularly. Regardless of your regulatory or business requirements, MongoDB Atlas lets you configure your backups so that you can recover with precision and speed, keeping data loss minimal and meeting your recovery point objectives should you experience a recovery event.

Conclusion

As regulations and business needs continue to evolve, and cyber attacks become more sophisticated and varied, creating and implementing a data resilience strategy can still be simple and manageable. MongoDB Atlas comes equipped with built-in measures that deliver robust data resilience at the database layer, ensuring your ability to both avoid incidents and promptly restore operations with minimal data loss if an incident does occur. Furthermore, setting up and overseeing additional advanced data resilience features is straightforward, with automation driven by a pre-configured policy that operates seamlessly at any scale. This streamlined approach supports compliance without the need for manual interventions, all within the MongoDB Atlas platform. For more information on the data resilience and disaster recovery features in MongoDB Atlas, download the Data Resilience Strategy with MongoDB Atlas whitepaper. To get started on Atlas today, we invite you to launch a free tier cluster.
Introducing Atlas for the Edge
This post is also available in: Deutsch, Português.

We are thrilled to introduce MongoDB Atlas for the Edge at MongoDB.local London. This new solution is designed to streamline the management of data generated across various sources at the edge, including devices, on-premises data centers, and the cloud. Edge computing, which brings data processing closer to end users, offers significant advantages. At the same time, it often proves challenging due to complex networking, data volume management, and security concerns, which can deter many organizations. Edge systems are also costly to build, maintain, and scale. Challenges organizations face include:

Significant technical expertise to manage the complexity of networking and the high volumes of distributed data required to deliver reliable applications that run anywhere

Stitching together hardware and software solutions from multiple vendors, resulting in complex and fragile systems that are often built on legacy technology, limited by one-way data movement, and dependent on specialized skills to manage and operate

Constant optimization of edge devices due to their constraints, like limited data storage and intermittent network access, which makes keeping operational data in sync between edge locations and the cloud difficult

Security vulnerabilities and frequent firmware patches and updates to ensure data privacy and compliance

MongoDB Atlas for the Edge simplifies all of these manual tasks. It allows MongoDB to run on diverse edge infrastructure, from self-managed, on-premises servers to cloud deployments offered by major cloud providers. Data flows seamlessly between, and is kept synchronized across, all sources, ensuring real-time data delivery with minimal latency. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
With MongoDB Atlas for the Edge, organizations can now use a single, unified interface to deliver a consistent and frictionless development experience from the edge to the cloud, and everything in between. Together, the capabilities included with MongoDB Atlas for the Edge allow organizations to significantly reduce the complexity of building edge applications and architectures:

Run MongoDB on a variety of edge infrastructure for high reliability with ultra-low latency: With MongoDB Atlas for the Edge, organizations can run applications on MongoDB using a wide variety of infrastructure, including self-managed, on-premises servers, such as those in remote warehouses or hospitals, in addition to edge infrastructure managed by major cloud providers including AWS, Google Cloud, and Microsoft Azure. For example, data stored in MongoDB Enterprise Advanced on self-managed servers can be automatically synced with MongoDB Atlas Edge Server on AWS Local Zones and MongoDB Atlas in the cloud to deliver real-time application experiences to edge devices with high reliability and single-digit-millisecond latency. MongoDB Atlas for the Edge allows organizations to deploy applications anywhere, even in remote, traditionally disconnected locations, and keep data synchronized between edge devices, edge infrastructure, and the cloud, enabling data-rich, fault-tolerant, real-time application experiences. Atlas Edge Server is now in private preview; learn more on our product page.

Run applications in locations with intermittent network connectivity: With Atlas Edge Server and Atlas Device Sync, organizations can use a pre-built, local-first data synchronization layer for applications running on kiosks or on mobile and IoT devices to prevent data loss and improve offline application experiences.
MongoDB Edge Servers can be deployed in remote locations to allow devices to sync directly with each other, without the need for connectivity to the cloud, using built-in network management capabilities. Once network connectivity is available, data is automatically synchronized between devices and the cloud to keep applications up to date for use cases like inventory and package tracking across supply chains, optimizing delivery routes in remote locations, and accessing electronic health records with intermittent network connectivity.

Build and deploy AI-powered edge computing applications: Generative AI and machine learning technologies need data to function, and Atlas for the Edge provides the data transport necessary to deliver low-latency, intelligent functionality at the edge directly on devices, even when network connectivity is unavailable. For example, data stored in MongoDB Atlas can be enriched with embeddings using Atlas Vector Search. These documents can be synchronized down to mobile or edge devices using Atlas Device Sync, and the embeddings can then be used with platform-specific libraries like Core ML to perform ML classification. In the reverse direction, data is the fuel for training AI models, yet edge computing developers spend a great deal of time writing undifferentiated code to synchronize data to the cloud, particularly in locations with poor connectivity. By gathering data at the edge and using Atlas Device Sync to synchronize it to the cloud, the data can then be used to train models, or with Atlas Vector Search to generate embeddings and power relevance search.

Store and process real-time and batch data from IoT devices to make it actionable: With MongoDB Atlas Stream Processing, organizations can ingest and process high-velocity, high-volume data from millions of IoT devices (e.g., equipment sensors, factory machinery, medical devices) in real-time streams, or in batches when network connectivity is available.
Data can then be easily aggregated, stored, and analyzed using MongoDB Time Series collections for use cases like predictive maintenance and anomaly detection with real-time reporting and alerting capabilities. MongoDB Atlas for the Edge provides all of the tools necessary to process and synchronize virtually any type of data across edge locations and the cloud to ensure consistency and availability.

Easily secure edge applications for data privacy and compliance: MongoDB Atlas for the Edge helps organizations ensure their edge deployments are secure with built-in security capabilities. The Atlas Device SDK provides out-of-the-box data encryption at rest, on devices, and in transit over networks to ensure data is protected and secure. Additionally, Atlas Device Sync provides fine-grained role-based access, with built-in identity and access management (IAM) capabilities that can also be combined with third-party IAM services to easily integrate edge deployments with existing security and compliance solutions.

Some of the leading organizations are leveraging Atlas for the Edge today. For example:

Cathay Pacific, Hong Kong's home airline providing passenger and cargo services to destinations around the world, understood the need for digital transformation in its critical pilot briefing process and in-flight operations. With MongoDB Atlas, it was the very first to digitize its flight operations process with an iPad app, Flight Folder, enabling one of the first zero-paper flights in the world in September 2019. MongoDB's developer data platform met its requirements for this and many other projects, successfully improving costs, operational efficiency, and accuracy, while also reducing environmental impact. Read the case study to learn more.

Cloneable provides low/no-code tools to enable instant deployment of AI applications to a spectrum of devices: mobile, IoT devices, robots, and beyond.
“We collaborated with MongoDB because Atlas for the Edge provided capabilities that allowed us to move faster while providing enterprise-grade experiences,” said Tyler Collins, CTO at Cloneable. “For example, the local data persistence and built-in cloud synchronization provided by Atlas Device Sync enable real-time updates and high reliability, which is key for Cloneable clients bringing complex, deep tech capabilities to the edge. Machine learning models distributed down to devices can provide low-latency inference, computer vision, and augmented reality. Atlas Vector Search enables vector embeddings from images and data collected from various devices to allow for improved search and analyses. MongoDB supports our ability to streamline and simplify heavy data processes for the enterprise.”

To learn more about the solution announced today, and to find out how retailers and healthcare organizations are leveraging it, please visit the web page for Atlas for the Edge.
Boosting Performance and Insights with MongoDB Atlas and New Relic
In order to keep up with the demands of the modern business landscape, organizations must prioritize monitoring and optimizing application performance. Today, we're excited to announce a powerful collaboration between MongoDB Atlas, the leading database-as-a-service platform, and New Relic, the renowned monitoring and observability solution. This integration enables users to seamlessly monitor, analyze, and optimize their MongoDB deployments with efficiency and ease.

Observability to ensure availability and uptime

Monitoring a MongoDB Atlas deployment is now simpler than ever with the New Relic integration. By connecting the two platforms, users can effortlessly monitor key MongoDB metrics, such as latency, throughput, and error rates, directly within the New Relic user interface. With real-time visibility into the health and performance of their databases, developers and operations teams can quickly identify potential bottlenecks and proactively address issues before they impact end users. Similarly, administrators can get an immediate high-level view of the health and availability of their MongoDB databases.

Intelligent alerting and notifications

The combination of MongoDB Atlas and New Relic empowers users to set up intelligent alerts and notifications tailored to their specific business requirements. Leveraging New Relic's alerting capabilities, users can create custom alert policies based on performance metrics and query patterns. Whether it's a sudden increase in response times or an unexpected spike in query access patterns, teams can receive timely notifications via email, Slack, or other preferred channels, enabling them to take immediate action.

Powerful dashboarding and reporting

Users can also take advantage of comprehensive dashboarding and reporting capabilities. With customizable dashboards and rich visualizations, users can gain real-time insights into the performance of their MongoDB clusters.
Additionally, New Relic's reporting tools enable teams to generate detailed reports on database performance, query analytics, and overall system health, empowering them to make data-driven decisions and track improvements over time. By combining the strengths of these two powerful platforms, users can unlock a new level of control and efficiency in managing their MongoDB databases. From streamlined monitoring and analysis to improved troubleshooting and enhanced collaboration, this integration enables organizations to proactively optimize their applications, ensure scalability, and deliver exceptional user experiences. With MongoDB Atlas and New Relic working together, businesses can stay ahead in today's rapidly evolving digital landscape, where performance and efficiency are key differentiators. If you'd like to see the MongoDB Atlas and New Relic integration in action, sign up for MongoDB Atlas, our cloud database service, and learn more about New Relic's monitoring and observability capabilities.
Real-Time Inventory Tracking with Computer Vision & MongoDB Atlas
In today's rapidly evolving manufacturing landscape, digital twins of factory processes have emerged as a game-changing technology. But why are they so important? Digital twins serve as virtual replicas of physical manufacturing processes, allowing organizations to simulate and analyze their operations in a virtual environment. By incorporating artificial intelligence and machine learning, organizations can interpret and classify objects, leading to cost reductions, faster throughput, and improved quality levels. Real-time data, especially inventory information, plays a crucial role in these virtual factories, providing up-to-the-minute insights for accurate simulations and dynamic adjustments.

In the first blog, we covered a five-step, high-level plan to create a virtual factory. In this blog, we delve into the technical aspects of implementing a real-time computer vision inventory inference solution, as seen in Figure 1 below. Our focus will be on connecting a physical factory with its digital twin using MongoDB Atlas, which facilitates real-time interaction between the physical and digital realms. Let's get started!

Figure 1: High-level overview

Part 1: The physical factory sends data to MongoDB Atlas

Let's start with the first task: transmitting data from the physical factory to MongoDB Atlas. Here, we focus on sending captured images of raw material inventory from the factory to MongoDB for storage and further processing, as seen in Figure 2. Using the MQTT protocol, we send images as base64-encoded strings. AWS IoT Core serves as our MQTT broker, ensuring secure and reliable image transfer from the factory to MongoDB Atlas.

Figure 2: Sending images to MongoDB Atlas via AWS IoT Core

For simplicity, in this demo we directly store the base64-encoded image strings in MongoDB documents. This works because each image received from the physical factory is small enough to fit into one document.
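To make this first leg of the pipeline concrete, here is a minimal, stdlib-only sketch of how a captured frame can be wrapped as a base64 JSON payload ready to publish over MQTT. The helper and field names are illustrative, not the demo's actual code, and the MQTT publish call itself is omitted:

```python
import base64
import json

def build_image_payload(image_bytes: bytes, station_id: str) -> str:
    """Wrap a captured camera frame as a JSON string ready to publish over MQTT.

    The base64 text passes through the broker unchanged, so the document
    stored in MongoDB can later be decoded back to the original bytes.
    """
    return json.dumps({
        "station": station_id,
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
    })

# Round trip with a stand-in frame:
payload = build_image_payload(b"\x89PNG\r\n...", "warehouse-cam-1")
doc = json.loads(payload)
restored = base64.b64decode(doc["image_b64"])
```

Because base64 is lossless, the consumer side recovers exactly the bytes the camera produced, which is what makes storing the string directly in a document viable for small images.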
However, this is not the only way to work with images (or large files generally) in MongoDB. Within our developer data platform, there are various storage methods, including GridFS for larger files or binary data for smaller ones (less than 16MB). Moreover, object storage services like AWS S3 or Google Cloud Storage, coupled with MongoDB Data Federation, are commonly used in production scenarios. In such real-world scenarios, integrating object storage services with MongoDB provides a scalable and cost-efficient architecture. MongoDB is excellent for fast and scalable reads and writes of operational data, but when retrieving images with very low latency is not a priority, storing these large files in 'buckets' helps reduce costs while retaining all the benefits of working with MongoDB Atlas. Robert Bosch GmbH, for instance, uses this architecture for Bosch's IoT Data Storage, which helps service millions of devices worldwide efficiently.

Coming back to our use case: to facilitate communication between AWS IoT Core and MongoDB, we employ Rules defined in AWS IoT Core, which let us send data to an HTTPS endpoint. This endpoint is configured directly in MongoDB Atlas and allows us to receive and process incoming data. If you want to learn more about MongoDB Data APIs, check out this blog from our Developer Center colleagues.

Part 2: MongoDB Atlas to AWS SageMaker for CV prediction

Now it's time for the inference part! We've trained a built-in multi-label classification model provided by SageMaker, using images like the one in Figure 3.
The images were annotated using an .lst file, where each line holds an image index, one binary flag per class in the order blue, red, white, and the image path. So for an image where only the red and white pieces are present in the warehouse, but no blue, the flags would read 0 1 1.

Figure 3: Sample image used for the computer vision model

The model was built using 24 training images and 8 validation images, a decision made for simplicity to demonstrate the capabilities of the implementation rather than to build a powerful model. Despite the extremely small training/validation sample, we managed to achieve a 0.97 validation accuracy. If you want to learn more about how the model was built, check out the GitHub repo.

With a model trained and ready to predict, we created a model endpoint in SageMaker to which we send new images through a POST request, and it answers back with the predicted values. We use an Atlas Function to drive this functionality. Every minute, it grabs the latest image stored in MongoDB, sends it to the SageMaker endpoint, and waits for the response. The response is an array with three decimal values between 0 and 1 representing the likelihood of each piece (blue, red, white) being in stock. We interpret the numeric values with a simple rule: if the value is above 0.85, we consider the piece to be in stock. Finally, the same Atlas Function writes the results to a collection (Figure 4) that keeps the current state of the inventory of the physical factory. More details about the function here.

Figure 4: Collection storing the real-time stock status of the factory

The beauty comes when MongoDB Realm is incorporated into the virtual factory, as seen in Figure 5. It is automatically and seamlessly synced with MongoDB Atlas through Device Sync. The moment we update the collection with the inventory status of the physical factory in MongoDB Atlas, the virtual factory, with Realm, is automatically updated.
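As an aside, the 0.85 thresholding rule applied in Part 2 is simple enough to sketch in a few lines. Python is used here for illustration only; the actual Atlas Function runs JavaScript, and the names below are hypothetical:

```python
PIECES = ("blue", "red", "white")
IN_STOCK_THRESHOLD = 0.85  # the cutoff described in the text

def interpret_prediction(scores):
    """Map the model's per-class probabilities to an in-stock document."""
    if len(scores) != len(PIECES):
        raise ValueError("expected exactly one score per piece")
    return {piece: score > IN_STOCK_THRESHOLD
            for piece, score in zip(PIECES, scores)}

# A response like [0.12, 0.91, 0.97] marks red and white in stock, blue out:
stock = interpret_prediction([0.12, 0.91, 0.97])
```

A dictionary of booleans like this maps directly onto the fields of the stock-status collection shown in Figure 4.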
The advantage of syncing through Realm, besides not needing any additional lines of code for the data transfer, is that conflict resolution is handled out of the box, and when the connection is lost, data is not lost but is instead synced as soon as the connection is re-established. This essentially enables a real-time synchronized digital twin without the hassle of managing data pipelines, hardening your code for edge cases, and losing time on undifferentiated work.

Figure 5: Connecting Atlas and Realm via Device Sync

As an example of how companies are implementing Realm and Device Sync for mission-critical applications, the airline Cathay Pacific revolutionized how pilots log critical flight data such as wind speed, elevation, and oil pressure. Historically, this was done manually with pen and paper, until they switched to a fully digital, tablet-based app built with MongoDB, Realm, and Device Sync. With this, they eliminated all paper from flights and performed one of the first zero-paper flights in the world in 2019. Check out the full article here. As you can see, the combination of these technologies is what enables the development of truly connected, highly performant digital twins within just one platform.

Part 3: CV results are sent to the digital twin via Device Sync

Sending data to the digital twin through Device Sync is a straightforward procedure. First, we navigate to Atlas and access the Realm SDK section. Here, we can choose our preferred programming language, and the data models will be automatically pre-built based on the schemas defined in the MongoDB collections. MongoDB Atlas simplifies this task by offering copy-paste functionality, as seen in Figure 6, eliminating the need to construct data models from scratch. For this specific project, the C# SDK was used.
However, developers have the flexibility to select from various SDK options, including Kotlin, C++, Flutter, and more, depending on their preferences and project requirements. Once the data models are in place, simply activating Device Sync completes the setup. This enables seamless bidirectional communication, and developers can send data to their digital twin effortlessly.

Figure 6: Realm C# SDK object model example

One of the key advantages of using Device Sync is its built-in conflict resolution capability. Whether facing offline interruptions or conflicting changes, MongoDB Atlas automatically manages conflict resolution. This "always on" behavior is particularly crucial for digital twins, ensuring constant synchronization between the device and MongoDB Atlas. It saves developers significant time that would otherwise be spent building custom conflict resolution mechanisms, error-handling functions, and connection-handling methods. With Device Sync handling conflict resolution out of the box, developers can focus on building and improving their applications, confident in the seamless synchronization of data between the digital twin and MongoDB Atlas.

Part 4: The virtual factory sends inventory status to the user

For this demonstration, we built the digital twin of our physical factory using Unity so that it can be interacted with through a VR headset. With this, the user can order a piece in the physical world by interacting with the virtual twin, even from thousands of miles away from the real factory. To control the physical factory through the headset, it is crucial that the app informs the user whether or not a piece is present in stock, and this is where Realm and Device Sync come into play.

Figure 7: User is informed in real time of which pieces are not in stock
In Figure 7, the user attempted to order a blue piece on the digital twin, and the app informs them that the piece is not in stock, so the order is not activated on either the physical factory or its digital twin. Behind the scenes, the app reads the Realm object that stores the stock status of the physical factory and decides whether the piece is orderable. Remember that this Realm object is in real-time sync with MongoDB Atlas, which in turn constantly updates the stock status in the collection from Figure 4 based on SageMaker inferences. Conclusion In this blog, we presented a four-part process demonstrating the integration of a virtual factory and computer vision with MongoDB Atlas. This solution enables transformative real-time inventory management for manufacturing companies. If you're interested in learning more and getting hands-on experience, explore our accompanying GitHub repository for further details and a practical implementation.
Ambee's AI Environmental Data Revolution: Powered by MongoDB Atlas
Ambee, a fast-growing climate tech start-up based in India, is making waves in the world of environmental data with its mission to create a sustainable future. With over 1 million daily active users, Ambee provides proprietary climate and environmental data-as-a-service to empower governments, healthcare organizations, and private companies to make informed decisions about their policies and business strategies. Their comprehensive data encompasses emissions, pollen levels, air quality, soil conditions, and more, all crucial for driving climate action while positively impacting businesses’ bottom lines. Ambee's pollen and air quality map From the outset, MongoDB Atlas has been at the core of Ambee's database architecture, supporting their AI and ML models. Ambee needed something that could manage a vast and diverse data set. MongoDB's flexible document model proved to be a perfect fit, enabling them to store all their data in one centralized location and operationalize it for various use cases. On average, Ambee adds around 10 to 15GB of data every hour. A significant advantage of MongoDB for Ambee lies in its ability to handle geospatial data, a critical element for their work. With data sourced from satellites, paid data providers, soil readings, airplanes, proprietary IoT devices, and much more, Ambee relies on MongoDB's geospatial capabilities to provide accurate and granular geographical insights. This precision is one of Ambee's key differentiators, setting them apart in the industry. Ambee's use of artificial intelligence adds another layer of value to their data services. By running AI models on MongoDB Atlas, they not only deliver data-as-a-service to their clients but also provide intelligent recommendations. Ambee's AI-driven platform, Ambee AutoML, serves as a central repository, enabling developers with limited machine learning expertise to train high-quality models.
This democratization of machine learning empowers a broader audience to harness its potential, crucial to Ambee’s aim of fighting climate change with data. The practical applications of Ambee's AI and data services are wide-ranging. Ambee's data powers many companies across the Fortune 500, including Boots, Kimberly-Clark, and many more, supporting a variety of use cases. Be it personalized marketing or digital healthcare, Ambee's datasets have helped businesses worldwide achieve remarkable results. For instance, Boots, a leading British health and beauty retailer, uses Ambee's data to identify regions where pollen and environmental factors trigger allergies. AI recommendations help allocate resources efficiently, enabling Boots to mitigate the impact of allergies and enhance its bottom line while aiding more individuals in need. Ambee has also made a US pollen and air quality map publicly available for anyone to check. In another use case, Ambee employs AI models to forecast forest fires and their potential outcomes in the U.S. and Canada, providing organizations with critical warnings to protect lives and property in wildfire-prone areas. Ambee's forest fire dashboard Ambee's future looks promising as it continues to grow, covering more regions and incorporating more data, all of which makes its AI-powered services more powerful. The company's APIs are designed for ease of use, to be as simple as possible for developers to get started with. This ease of use, along with extensive documentation, is helping drive the popularity of the service. The MongoDB-powered APIs receive more than 10 million calls every day. Madhusudhan Anand, CTO of Ambee, said: "Our work with MongoDB Atlas showcases how we can create a sustainable future by providing easily accessible environmental data. MongoDB's unique capabilities in handling diverse data and geospatial information have been instrumental in our success. Together, we are shaping a greener world."
Extensive API Documentation As Ambee's popularity and impact continue to grow, its suite of data-driven products is expanding substantially. The company will soon launch a number of sophisticated tools and platforms to help businesses take their operations to the next level. The next big piece will be C6—a carbon management and accounting platform through which Ambee aims to help companies measure, report, and reduce their digital emissions. This will be followed by a programmatic advertising tool that can run campaigns based on environmental triggers. All of this will be powered by MongoDB Atlas. To unlock these innovative AI solutions, Ambee's team is looking to take advantage of the full developer data platform, for example, using MongoDB Atlas Federated Queries and Atlas Search to make 70TB of exclusive environmental data operational.
Introducing the Certified MongoDB Atlas Connector for Power BI
This is a collaborative post from MongoDB and Microsoft. We thank Alexi Antonino, Natacha Bagnard, and Jad Jarouche from MongoDB, and Bob Zhang, Mahesh Prakriya, and Rajeev Jain from Microsoft for their contributions. Introducing the MongoDB Atlas Connector for Power BI, the certified solution that facilitates real-time insights on your Atlas data directly in the Power BI interfaces that analysts know and love! Supporting Microsoft’s Intelligent Data Platform, this integration bridges the gap between development and analytics teams, allowing analysts who rely on Power BI for insights to natively transform, analyze, and share dashboards that incorporate live MongoDB Atlas data. Available in June, the Atlas Power BI Connector empowers companies to harness the full power of their data like never before. Let’s take a deeper look into how the Atlas Power BI Connector can unlock comprehensive, real-time insights on live application data that will help take your business to the next level. Effortlessly model document data with Power Query The Atlas Power BI Connector makes it easy to model document data with native Power BI features and data modeling capabilities. With its SQL-92-compatible dialect, mongosql, you can tailor your data to fit any requirements, transforming heavily nested document data to fit your exact needs, all from your Power Query dashboard. Gain real-time insights on live application data By using the Power BI Connector to connect directly to MongoDB Atlas, you can build up-to-date dashboards in Power BI Desktop and scale insights to your organization through Power BI Service with ease. With no delays caused by data duplication, you can stay ahead of the curve by unlocking real-time insights on Atlas data that are relevant to your business. Empower cross-source data analysis The Power BI Connector's integration with MongoDB Atlas enables you to seamlessly model, analyze, and share insightful dashboards that are built from multiple data sources.
By combining Atlas's powerful Data Federation capabilities with Power BI's advanced analytics and visualization tools, you can easily create comprehensive dashboards that offer valuable insights into your data, regardless of where it is stored. See it in action Log in and activate the Atlas SQL Interface to try out the Atlas Power BI Connector! If you are new to Atlas or Power BI, get started for free today on Azure Marketplace or Power BI Desktop.
MongoDB Atlas Integrations for AWS CloudFormation and CDK are now Generally Available
COSMOS SQL Migration to MongoDB Atlas
Azure Cosmos DB is Microsoft's proprietary globally distributed, multi-model database service. Cosmos DB supports a SQL interface as one of its models, in addition to the Cosmos DB API for MongoDB. Even customers using the SQL interface choose Cosmos DB for the document model and the convenience of working with SQL. We have seen customers struggle with scalability issues and costs on Cosmos DB and want to move to MongoDB Atlas. Migrating an application from Cosmos DB SQL to MongoDB Atlas involves both application refactoring and data migration from Cosmos DB to MongoDB. The current tool set for migrating data from Cosmos DB SQL to MongoDB Atlas is fairly limited. While the Azure Data Migration Tool can be used for a one-time export, it cannot satisfy the near-zero-downtime migrations customers frequently need: all writes to the source Cosmos DB SQL instance must be stopped before the data migration can be performed. This puts a lot of pressure on the customer in terms of downtime requirements and planning out the migration. PeerIslands has built a Cosmos DB SQL migrator tool that addresses these concerns. The tool provides a way to perform Cosmos DB SQL migration with near-zero downtime. The architecture of the tool is explained below. Initial Snapshot The tool uses the native Data Migration Tool to export data as JSON files from the Azure Cosmos DB SQL API. The Data Migration Tool is an open-source solution that imports/exports data to/from Azure Cosmos DB. The exported JSON data is then imported into MongoDB Atlas using mongoimport. Figure 1: Initial Snapshot processing stages. Change data capture The combination of the above tools completes the initial snapshot. But what happens to documents that are updated or newly inserted during the migration? Just prior to the start of the initial snapshot, the migration tool starts the change capture process.
The migration tool listens to ongoing changes in Cosmos DB using the Kafka source connector provided by Azure and pushes the changes to a Kafka topic. Optionally, KSQL can be used to perform any required transformations. Once the changes are in Kafka, the migration tool uses the MongoDB Atlas sink connector to push the ongoing changes to the Atlas cluster. Below is a diagram depicting the flow of change stream messages from Cosmos DB SQL to MongoDB. Figure 2: The flow of change stream messages from Cosmos SQL to MongoDB The Cosmos DB SQL migration tool provides a GUI-based, point-and-click interface that brings together the above capabilities for handling the entire migration process. Because the tool is capable of change data capture, it provides a lot of flexibility for migrating your data without downtime. Figure 3: Cosmos SQL migration tool dashboard In addition to data migration, PeerIslands can help with the complete application refactoring required to migrate off the Cosmos DB SQL interface. Reach out to firstname.lastname@example.org if you need to migrate from Cosmos DB SQL to MongoDB Atlas.
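As a rough sketch of how the two connectors in this change-capture pipeline might be configured: the connector class names and property keys below are taken from the public Azure Cosmos DB and MongoDB Kafka connector documentation as best we recall, and every endpoint, credential, database, container, and topic value is a placeholder. Treat the exact keys as assumptions and verify them against the current connector docs before use.

```json
{
  "source": {
    "name": "cosmos-source",
    "connector.class": "com.azure.cosmos.kafka.connect.source.CosmosDBSourceConnector",
    "connect.cosmos.connection.endpoint": "https://<account>.documents.azure.com:443/",
    "connect.cosmos.master.key": "<key>",
    "connect.cosmos.databasename": "<database>",
    "connect.cosmos.containers.topicmap": "<container>#<topic>"
  },
  "sink": {
    "name": "atlas-sink",
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "<topic>",
    "connection.uri": "mongodb+srv://<user>:<password>@<cluster>/",
    "database": "<database>",
    "collection": "<collection>"
  }
}
```

The source connector streams Cosmos DB change feed events into the Kafka topic, and the sink connector drains that topic into the target Atlas collection, which is what lets the snapshot and change capture run side by side.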
Improving Building Sustainability with MongoDB Atlas and Bosch
Demystifying Sharding with MongoDB
Sharding is a critical part of modern databases, yet it is also one of the most complex and least understood. At MongoDB World 2022, sharding software engineer Sanika Phanse presented Demystifying Sharding in MongoDB, a brief but comprehensive overview of the mechanics behind sharding. Read on to learn why sharding is necessary, how it is executed, and how you can optimize the sharding process for faster queries. Watch this deep-dive presentation on the ins and outs of sharding, featuring MongoDB sharding software engineer Sanika Phanse. What is sharding, and how does it work? In MongoDB Atlas, sharding is a way to horizontally scale storage and workloads in the face of increased demand, splitting them across multiple machines. In contrast, vertical scaling requires the addition of more physical hardware, for example, in the form of servers or components like CPUs or RAM. Once you've hit the capacity of what your servers can support, sharding becomes your solution. Past a certain point, vertical scaling requires teams to spend significantly more time and money to keep pace with demand. Sharding, however, spreads data and traffic across your servers, so it's not subject to the same physical limitations. Theoretically, sharding could enable you to scale infinitely, but, in practice, you scale proportionally to the number of servers you add. Each additional shard increases both storage and throughput, so your servers can simultaneously store more data and process more queries. How do you distribute data and workloads across shards? At a high level, sharding data storage is straightforward. First, a user must specify a shard key, a subset of fields to partition the data by. Then, data is migrated across shards by a background process called the balancer, which ensures that each shard contains roughly the same amount of data. Once you specify your shard key, the balancer will do the rest.
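As a minimal sketch of the idea, range-based placement driven by a shard key can be modeled as a lookup from key ranges to shards. The shard names and range boundaries below are made up for illustration; in a real cluster the balancer manages the actual chunk ranges itself.

```python
# Sketch of range-based shard placement (illustrative only).
# Each shard owns a contiguous, half-open range of shard key values,
# and a document is placed on the shard whose range contains its key.

SHARD_RANGES = [
    ("shard0", 0, 100),    # keys 0-99
    ("shard1", 100, 200),  # keys 100-199
    ("shard2", 200, 300),  # keys 200-299
]

def shard_for_key(shard_key):
    """Return the shard whose range contains shard_key."""
    for shard, lo, hi in SHARD_RANGES:
        if lo <= shard_key < hi:
            return shard
    raise ValueError(f"no shard owns key {shard_key}")

print(shard_for_key(42))   # shard0
print(shard_for_key(150))  # shard1
```

Because placement depends only on the shard key value, adding a shard means splitting or moving ranges, which is exactly the bookkeeping the balancer automates for you.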
A common form of distribution is ranged sharding, which assigns data to shards by ranges of shard key values. Using this approach, one shard will contain all the data with shard keys ranging from 0-99, the next will contain 100-199, and so forth. In theory, sharding workloads is also simple. For example, if you receive 1,000 queries per second on a single server, sharding your workload across two servers would divide the queries equally, with each server receiving 500 queries per second. However, these ideal conditions aren't always attainable, because workloads aren't always evenly distributed across shards. Imagine a group of 50,000 students whose grades are split between two shards. If half of them decide to check their grades, and all of their records happen to fall in the same shard key range, then all of their data will be stored on the same shard, and all of the traffic will be routed to one shard server. Note that both of these examples are highly simplified; real-world situations are not as neat. Shards won't always contain a balanced range of shard keys, because data might not be evenly divided across shards. Additionally, 50,000 students, while a large group, is still too small a data set to require a sharded cluster. How do you map and query sharded data? Without an elegant solution, users may encounter latency or failed queries when they try to retrieve sharded data. The challenge is to tie together all your shards so that it feels like you're communicating with one database rather than several. The solution starts with the config server, which holds metadata describing the sharded cluster, as well as the most up-to-date routing table, which maps shard keys to shard connection strings. To increase efficiency, routers regularly contact the config server to cache a copy of this routing table.
Nonetheless, at any given point in time, the config server's version of the routing table can be considered the single source of truth. To query sharded data, your application sends the command to the team of routers. After a router picks up the command, it uses the shard key from the command's query, together with its cached copy of the routing table, to direct the query to the correct location. Rather than using the entire document, the user selects just one field (or combination of fields) to serve as the shard key. The query then makes its way to the correct shard, executes the command, applies any update, and returns a successful result to the router. Operations aren't always so simple, especially when queries do not specify shard keys. In that case, the router does not know where the data lives, so it sends the query to all the shards and waits to gather all the responses before returning to the application. Although this kind of query is slow if you have many shards, it might not pose a problem if it is infrequent or uncommon. How do you optimize shards for faster queries? Shard keys are critical for seamless operations. When selecting a shard key, use a field that is present on all (or most) of your documents and has high cardinality. This ensures granularity among shard key values, which allows data to be distributed evenly across shards. Additionally, your data can be resharded as needed, to fit changing requirements or to improve efficiency. Users can also accelerate queries with thoughtful planning and preparation, such as optimizing their data structures for the most common, business-critical query patterns. For example, if your workload makes many age-based queries and few _id-based queries, then it might make sense to shard data by age to ensure more targeted queries. Hospitals are good examples, as they pose unique challenges.
Assuming that the hospital's patient documents contain fields such as insurance provider, _id value, and first and last names, which of these values would make sense as a shard key? Patient name is one possibility, but it is not unique, as many people might share the same name. Similarly, insurance provider can be eliminated, because there are only a handful of insurance providers, and some patients might not have insurance at all. This key would violate both the high-cardinality principle and the requirement that every document have the field populated. The best candidate for the shard key is the patient ID number, the _id value. After all, one patient's visit does not indicate whether another patient will (or will not) visit, and the uniqueness of the _id value enables targeted queries to the one document that is relevant to the patient. Faced with repeating values, users can also create compound shard keys instead. By combining multiple fields, such as the _id value, patient name, and provider, a compound shard key can help reduce query bottlenecks and latency. Ultimately, sharding is a valuable tool for any developer, as well as a cost-effective way to scale out your database capacity. Although it may seem complicated in practice, sharding (and working effectively with sharded data) can be very intuitive with MongoDB. To learn more about sharding, and to see how you can set it up in your own environment, contact the MongoDB Professional Services team today.
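The targeted-versus-broadcast distinction above can be sketched in a few lines. The router, shard names, ranges, and the `patient_id` shard key field below are all hypothetical, not MongoDB internals: a query that includes the shard key is routed to a single shard, while a query without it must be scattered to every shard.

```python
# Sketch of query routing in a sharded cluster (illustrative only).
# The routing table maps shard key ranges to shards, standing in for
# the cached copy a router keeps of the config server's metadata.

ROUTING_TABLE = [
    ("shardA", 0, 500),
    ("shardB", 500, 1000),
]

def route_query(query):
    """Return the list of shards a query must be sent to."""
    if "patient_id" in query:
        # Shard key present: a targeted query hits exactly one shard.
        key = query["patient_id"]
        return [s for s, lo, hi in ROUTING_TABLE if lo <= key < hi]
    # No shard key: scatter-gather across every shard.
    return [s for s, _, _ in ROUTING_TABLE]

print(route_query({"patient_id": 42}))      # targeted: one shard
print(route_query({"last_name": "Smith"}))  # broadcast: all shards
```

The broadcast branch is why queries that omit the shard key get slower as the shard count grows: the router must wait for every shard to respond before it can answer.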
MongoDB Named as a Leader in The Forrester Wave™: Translytical Data Platforms, Q4 2022
In The Forrester Wave™: Translytical Data Platforms, Q4 2022, translytical data platforms are described by Forrester as being “designed to support transactional, operational, and analytical workloads without sacrificing data integrity, performance, and analytics scale.” Characterized as next-generation data platforms, the Forrester report further notes that “Adoption of these platforms continues to grow strongly to support new and emerging business cases, including real-time integrated insights, scalable microservices, machine learning (ML), streaming analytics, and extreme transaction processing.” To help users understand this emerging technology landscape, Forrester published its previous Translytical Data Platforms Wave back in 2019. Three years on, Forrester has named MongoDB as a Leader in its latest Translytical Data Platforms Wave. We believe MongoDB was named a Leader in this report due to the R&D investments made in further building out capabilities in MongoDB Atlas, our multi-cloud developer data platform. These investments were driven by the demands of the developer communities we work with day-in, day-out. You told us how you struggle to bring together all of the data infrastructure needed to power modern digital experiences – from transactional databases to analytics processing, full-text search, and streaming. This is exactly what our developer data platform offers. It provides an elegant, integrated, and fully managed data architecture accessed via a unified set of APIs. With MongoDB Atlas, developers are more productive: they ship code faster and improve it more frequently. Translytics and the Rise of Application-Driven Analytics Translytics is part of an important shift that we at MongoDB call application-driven analytics. By building smarter apps and increasing the speed of business insights, application-driven analytics gives you the opportunity to out-innovate your competitors and improve efficiency.
To do this you can no longer rely only on copying data out of operational systems into separate analytics stores. Moving data takes time and creates too much separation between application events and actions. Instead, analytics processing has to be “shifted left” to the source of your data – to the applications themselves. This is the shift MongoDB calls application-driven analytics. It's a shift that impacts both the skills and the technologies developers and analytics teams use every day. This is why understanding the technology landscape is so important. Overall, MongoDB is a good fit for customers building their strategy around developers who are tasked with building analytics into their applications. The Forrester Wave™: Translytical Data Platforms, Q4 2022 Evaluating the top vendors in the Translytical Data Platforms Wave Forrester evaluated 15 of the most significant translytical data platform vendors against 26 criteria, spanning current offering and strategy through to market presence. Forrester gave MongoDB the highest possible scores across eleven criteria, including:

Number of customers
Performance
Scalability
Dev Tools/API
Multi-model
Streaming
Cloud / On-prem / distributed architecture
Commercial model

The report states that “MongoDB ramps up its translytical offering aggressively”, and that “Organizations use MongoDB to support real-time analytics, systems of insight, customer 360, internet of things (IoT), and mobile applications.” Access your complimentary copy of the report here. Customer Momentum Many development teams start out using MongoDB as an operational database for both new cloud-native services and modernized legacy apps. More and more of these teams are now improving customer experience and speeding business insight by adopting application-driven analytics. Examples include: Bosch for predictive maintenance using IoT sensor data. Keller Williams for relevance-based property search and sales dashboarding.
Iron Mountain for AI-based information discovery and intelligence. Volvo Connect for fleet management. Getting started on your Translytics Journey The MongoDB Atlas developer data platform is engineered to help you make the shift to Translytics and application-driven analytics – leading to smarter apps and increased business visibility. The best way to get started is to sign up for an account on MongoDB Atlas. Then create a free database cluster, load your own data or our sample data sets, and explore what's possible within the platform. The MongoDB Developer Center hosts an array of resources including tutorials, sample code, videos, and documentation organized by programming language and product. Whether you are a developer or a member of an analytics team, it's never been easier to get started enriching your transactional workloads with analytics!