
Unleashing Developer Potential–and Managing Costs–with MongoDB Atlas

In today's business landscape, where unpredictability has become the norm, engineering leaders have to balance the dual challenges of managing uncertainty while optimizing IT costs. Indeed, the 2024 MarginPLUS Deloitte survey—which draws on insights from over 300 business leaders—emphasizes a collective pivot towards growth initiatives and cost transformations amidst the fluctuating global economic climate.

MongoDB Atlas: A developer's ally for cost-effective productivity

Executives across industries want to cut costs without impacting innovation; based on the Deloitte survey, 83% of companies are looking to change how they run their margin improvement efforts. This is where MongoDB Atlas, the most advanced cloud database service on the market, comes in. An integrated suite of data services that simplifies how developers build with data, MongoDB Atlas helps teams enhance their productivity without compromising on cost efficiency by offering visibility and control over spending—balancing developer freedom with data governance and cost management. This helps organizations escape the modernization hamster wheel: the vicious cycle of continuously updating technology without making real progress, draining resources while failing to deliver meaningful improvements. Put another way, MongoDB gives teams more time to innovate instead of just maintaining the status quo. Outlined below are the built-in features of MongoDB Atlas that enable customers to get the most out of their data while also focusing on budget optimization.

Strategic features for cost optimization with MongoDB Atlas

Right-sizing your cluster

Use MongoDB Atlas's Cluster Sizing Guide or auto-scaling to match your cluster with your workload, optimizing resource use with options for every requirement, including Low CPU options for lighter workloads.

Pausing clusters and global distribution

Save costs by pausing your cluster, securely storing data for up to 30 days with auto-resume.
Furthermore, Global Clusters improve performance across regions while maintaining cost efficiency and compliance.

Index and storage management

Enhance performance and reduce costs with MongoDB Atlas's Performance Advisor, which provides tailored index and schema optimizations for better query execution and potential reductions in cluster size.

Strategic data management

Reduce storage expenses by using Online Archive for infrequently accessed data and TTL indexes for efficient Time Series data management, ensuring only essential data is stored. Securely back up data before deletion with mongodump.

Enhanced spend management

Use spend analysis, billing alerts, and usage insights via the Billing Cost Explorer for detailed financial management and optimization. Resource tagging and customizable dashboards provide in-depth financial reporting and visual expense tracking, supporting effective budgeting and cost optimization. Additionally, opt for serverless instances to adjust to your workload's scale, offering a pay-for-what-you-use model that eliminates overprovisioning concerns.

Transforming uncertainty into advancement

MongoDB Atlas equips IT decision-makers and developers with the features and tools to balance developer productivity with strategic cost management, transforming economic uncertainty into a platform for strategic advancement. MongoDB Atlas is more than a database management solution; it's a strategic partner in optimizing your IT spending, ensuring that your organization remains agile, efficient, and cost-effective in the face of change. Need expert assistance in taking control of your MongoDB Atlas costs? MongoDB's Professional Services team can provide a deep-dive assessment of your environment to build a tailored optimization plan—and help you execute it. Reach out to learn how we can support your cost optimization goals! If you haven't yet set up your free cluster on MongoDB Atlas, now is a great time to do so.
You'll find all the instructions in this DevCenter article.
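As a concrete illustration of the TTL-index approach mentioned under strategic data management above, the sketch below builds the `createIndexes` command document that expires documents after a retention window. The collection name, field name, and retention period are placeholders for illustration; with a driver such as PyMongo you would typically call `create_index` with `expireAfterSeconds` instead of issuing the raw command.

```python
# Sketch: a "createIndexes" command document for a TTL index that
# automatically deletes documents once their timestamp field is older
# than the retention window. "events" and "createdAt" are placeholders.

def ttl_index_command(collection: str, field: str, retention_days: int) -> dict:
    """Return a MongoDB createIndexes command with a TTL option."""
    return {
        "createIndexes": collection,
        "indexes": [{
            "key": {field: 1},                            # ascending index on the date field
            "name": f"{field}_ttl",
            "expireAfterSeconds": retention_days * 86400,  # days -> seconds
        }],
    }

cmd = ttl_index_command("events", "createdAt", 30)
print(cmd["indexes"][0]["expireAfterSeconds"])  # 2592000
```

You would run this command against the database holding the collection; from then on the server prunes expired documents in the background, keeping storage costs bounded.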

April 8, 2024

Enhanced Atlas Functionality: Introducing Resource Tagging for Projects

We are thrilled to announce that Atlas has now extended its tagging functionality to include projects in addition to deployments. This enhancement enables users to apply resource tags to projects, further enriching the way you can associate metadata with your cloud resources. With this new capability, categorizing, organizing, and tracking your projects within Atlas becomes more intuitive and effective, offering a streamlined approach to managing your resources.

Enhancing project management with resource tagging

Incorporating resource tagging into projects significantly enhances visibility and streamlines project management. By applying tags, teams can categorize resources, making it easier to understand the purpose or specific metadata associated with a project. This practice is especially beneficial in large-scale projects, where organizing resources systematically can vastly improve productivity. Tags serve as versatile markers, representing various attributes of a project such as environment, criticality, cost center, or application, thereby simplifying project organization. Furthermore, tags lay the groundwork for supporting automation and policy enforcement within organizations. By utilizing tags, tasks related to access controls, compliance, and other policies can be automated, enhancing operational efficiency. Auditing processes also benefit from tagging, which facilitates tracking and ensures resources meet specific business requirements. In environments where teamwork is essential, adding tags to projects aids streamlined collaboration. Tags allow team members to quickly grasp the purpose or function of different resources, surfacing critical information about the project that can help reduce miscommunication and conflicts. Overall, adopting resource tagging in cloud resource management unlocks significant improvements in performance and efficiency, making it an invaluable tool for modern organizational needs.
How to add tags to projects

You can view and manage tagging on projects in multiple areas:

Atlas UI: When creating a new project, on the Organization Project List, or within Project Settings.
Admin API: Various operations on projects were enhanced to allow you to view, create, and manage tags applied to projects, such as CreateOneProject and ReturnAllProjects.
Atlas CLI: Various commands on projects were enhanced to allow you to view, create, and manage tags applied to projects.

Resource tagging best practices

We recognize that the complexity of tagging use cases varies, tailored to an organization's unique structure and specific business requirements. With this in mind, we've designed resource tagging in Atlas to support a variety of use cases. To get started, we suggest defining tags that should be applied across all projects. This will ensure your tagging approach is reliable and consistent across all resources. If you have multiple deployments within a project, apply more granular metadata on each deployment. In the simplified example below, an organization has three projects containing one or more deployments. Each project contains a deployment for each development environment. We've added common tags to the projects and more granular tags to identify the environment at the deployment level. Given the uniqueness of each organization, we've designed a flexible system with simplicity at its heart, using key-value pairs. If you have a flatter organization structure in Atlas (e.g., with one deployment per project), consider adding all tags at the level that makes the most sense for your organization. This may vary depending on how you manage your deployments, existing tag workflows, or where you want to view tags in the Atlas UI. Finally, here are a few points to consider when tagging:

Do not include any sensitive information such as Personally Identifiable Information (PII) or Protected Health Information (PHI) in your resource tag keys or values.
Use a standard naming convention for all tags, including spelling, case, and punctuation.
Define and communicate a strategy for enforcing mandatory tags. We recommend starting by identifying the environment and the application, service, or workload.
Use namespaces or prefixes to easily identify tags owned by different business units.
Use programmatic tools like Terraform or the Admin API to manage your tags at scale.

In summary

The introduction of resource tagging for projects marks an improvement in how users can intuitively categorize, organize, and track projects within Atlas, streamlining cloud resource management. We're eager to hear your thoughts and ideas on further applications of resource tagging in Atlas. Please share your feedback and suggestions, as your input is invaluable in shaping the future of our platform.
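To make the Admin API route above concrete, here is a rough sketch of the JSON body you might assemble when creating a project with tags. The project name, organization ID, and tag keys are placeholders, and the exact request shape should be checked against the Atlas Admin API reference; the key/value pair structure shown is how Atlas represents tags.

```python
import json

# Illustrative helper that assembles a project-creation payload with
# resource tags. All names and IDs below are placeholders.
def project_payload(name: str, org_id: str, tags: dict) -> str:
    body = {
        "name": name,
        "orgId": org_id,
        # Atlas represents tags as a list of {"key": ..., "value": ...} pairs.
        "tags": [{"key": k, "value": v} for k, v in sorted(tags.items())],
    }
    return json.dumps(body)

payload = project_payload(
    "payments-service",              # example project name
    "32b6e34b3d91647abb20e7b8",      # placeholder organization ID
    {"environment": "production", "costcenter": "cc-1234"},
)
print(payload)
```

You would POST a body like this to the Admin API's projects endpoint, or let the Atlas CLI or Terraform assemble it for you.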
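The naming-convention and namespace advice above can be enforced programmatically before tags ever reach Atlas. Below is a minimal, illustrative validator; the `namespace:name` prefix rule, the allowed character set, and the mandatory keys are assumptions for the example, not Atlas requirements.

```python
import re

# Illustrative convention: tag keys look like "<namespace>:<name>",
# all lowercase, with hyphens allowed inside each segment.
TAG_KEY_PATTERN = re.compile(r"^[a-z][a-z0-9-]*:[a-z][a-z0-9-]*$")
MANDATORY_KEYS = {"core:environment", "core:application"}  # example mandatory tags

def validate_tags(tags: dict) -> list:
    """Return a list of human-readable problems; empty means the tags pass."""
    problems = [f"bad key format: {k!r}" for k in tags if not TAG_KEY_PATTERN.match(k)]
    missing = MANDATORY_KEYS - tags.keys()
    problems += [f"missing mandatory tag: {k!r}" for k in sorted(missing)]
    return problems

print(validate_tags({"core:environment": "prod", "core:application": "billing"}))  # []
print(validate_tags({"Core:Env": "prod"}))  # flags bad format and missing mandatory tags
```

A check like this fits naturally in a CI step or a pre-apply hook in your Terraform pipeline, so that inconsistent tags never reach your resources.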

February 15, 2024

Leveraging MongoDB Atlas in your Internal Developer Platform (IDP)

DevOps, a portmanteau of “Developer” and “Operations”, rose to prominence around the early 2010s and established a culture of incorporating automated processes and tools designed to deliver applications and services to users faster than the traditional software development process. A significant part of that was the movement to "shift left" by empowering developers to self-serve their infrastructure needs, in theory offering them more control over the application development lifecycle in a way that reduced the dependency on central operational teams. While these shifts towards greater developer autonomy were occurring, public clouds, specific technologies (like GitHub, Docker, Kubernetes, and Terraform), and microservices architectures proliferated and became standard practice in the industry. As beneficial as these infrastructure advancements were, these technical shifts added complexity to the setups that developers were using as part of their application development processes. As a result, developers needed to have a more in-depth, end-to-end understanding of their toolchain and, more dauntingly, take ownership of a growing breadth of infrastructure considerations. This meant that the "shift left" drastically increased the cognitive load on developers, leading to inefficiencies, because self-managing infrastructure is time-consuming and difficult without a high level of expertise. In turn, this increased time to market and hindered innovation. Concurrently, the increasing levels of permissions that developers needed within the organization led to a swath of compliance issues, such as inconsistent security controls, improper auditing, unhygienic data practices, and incorrect reporting, all of which increased overhead and ate away at department budgets.
Unsurprisingly, the desire to enable developers to self-serve to build and ship applications hadn't diminished, but it became clear that empowering them without adding friction or a high level of required expertise needed to become a priority. With this goal in mind, organizations began investing in quickly and efficiently abstracting away the operational complexities for developers. From this investment comes the rise of platform engineering and Internal Developer Platforms (whether companies label them as such or not).

Platform engineering and the rise of internal developer platforms

Within a developer organization, platform engineering (or even a central platform team) is tasked with creating golden paths for developers to build and ship applications at scale while keeping infrastructure spend and cognitive load on developers low. At the core of the platform engineering ethos is the goal of optimizing the developer experience to accelerate the delivery of applications to customers. Like teaching someone to fish, platform teams pave the way for greater developer efficiency by providing pipelines that developers can take and run with, reducing time to build and enabling greater developer autonomy without burdening developers with complexity. To do this, platform teams strive to design toolchains and workflows based on the end goals of the developers in their organization. Therefore, it's critical for the folks tasked with platform engineering to understand the needs of their developers, and then build a platform that is useful to the target audience. The end result is what is often (but not exclusively) known as an Internal Developer Platform.

What is an IDP?

An IDP is a collection of tools and services, sourced and stitched together by central teams to create golden paths for developers, who then use the IDP to simplify and streamline application building.
IDPs reduce complexity and lower cognitive load on developers, often by dramatically simplifying the experience of configuring infrastructure and services that are not a direct part of the developer's application. They encourage developers to move away from spending excess time managing the tools they use and allow them to focus on delivering applications at speed and scale. IDPs give developers the freedom to quickly and easily build, deploy, and manage applications while reducing risk and overhead costs for the organization by centralizing oversight and iteration of development practices. An IDP is tailored with developers in mind and will often consist of the following tools:

Infrastructure platform that enables running a wide variety of workloads with the highest degree of security, resilience, and scalability, and a high degree of automation (e.g., Kubernetes)
Source code repository system that allows teams to establish a single source of truth for configurations, ensuring version control, data governance, and compliance (e.g., GitHub, GitLab, Bitbucket)
Control interface that enables everyone working on the application to interact with and manage its resources (e.g., Port or Backstage)
Continuous integration and continuous deployment (CI/CD) pipeline that applies code and infrastructure configuration to an infrastructure platform (e.g., ArgoCD, Flux, CircleCI, Terraform, CloudFormation)
Data layer that can handle changes to schemas and data structures (e.g., MongoDB Atlas)
Security layer to manage permissions and maintain compliance, such as role-based access control tools or secrets management tools (e.g., Vault)
While some tools overlap and not all of them will be part of a specific IDP, the goal of platform engineering efforts is to build an IDP that is tightly integrated with infrastructure resources and services to maximize automation, standardization, self-service, and scale for developers, as well as to maximize security while minimizing overhead for the enterprise. While different organizations and teams use many different terms to refer to their IDP story, at its core, an IDP is a tailored set of tech, tools, and processes, built and managed by a central team, and used to provide developers with golden paths that enable greater developer self-service, lower cognitive load, and reduced risk.

How does MongoDB Atlas fit into this story?

Developers often cite working with data as one of the most difficult aspects of building applications. Rigid and unintuitive data technologies impede building applications and can lead to project failure if they don't deliver the data model flexibility and query functionality that your applications demand. A data layer that isn't integrated into your workflows slows deployments, and manual operations are a never-ending drag on productivity. Failures and downtime lead to on-call emergencies—not to mention the enormous potential risk of a data breach. Therefore, making it easy to work with data is critical to improving the developer experience. IDPs are in part about giving developers the autonomy to build applications. For this reason, MongoDB Atlas is a natural fit for an IDP: it serves as a developer data platform that can easily fit into any team's existing tool stack and abstracts away the complexities associated with self-managing a data layer.
MongoDB's developer data platform is a step beyond a traditional database in that it helps organizations drive innovation at scale by providing a unified way to work with data that addresses transactional workloads, app-driven analytics, full-text search, vector search, stream data processing, and more, prioritizing an intuitive developer experience and automating security, resilience, and performance at scale. This simplification and broad coverage of different use cases make a monumental difference to the developer experience. By incorporating MongoDB Atlas within an IDP, developer teams have a fully managed developer data platform at their disposal that enables them to build and underpin best-in-class applications. This way, teams won't have to worry about the overhead and manual work involved in self-hosting a database and then building all the supporting functionality that comes out of the box with MongoDB Atlas. Lastly, MongoDB Atlas can be hosted on more cloud regions than any other cloud database on the market today, with support for AWS, Azure, and Google Cloud.

How can I incorporate MongoDB Atlas into my IDP?

MongoDB Atlas offers many ways to integrate with your IDP through tools that leverage the MongoDB Atlas Admin API. The Atlas Admin API can be used independently or via one of these tools and integrations, and it provides a programmatic interface to directly manage and automate various aspects of MongoDB Atlas, without needing to switch between UIs or incorporate manual scripts. These tools include:

Atlas Kubernetes Operator
HashiCorp Terraform Atlas Provider
AWS CloudFormation Atlas Resources
Atlas CDKs
Atlas CLI
Atlas Go SDK
Atlas Admin API

With the Atlas Kubernetes Operator, platform teams can seamlessly integrate MongoDB Atlas into the existing Kubernetes deployment pipeline within their IDP, allowing their developers to manage Atlas in the same way they manage their applications running in Kubernetes.
First, configurations are stored and managed in a git repository and applied to Kubernetes via CD tools like ArgoCD or Flux. Then, the Atlas Operator's custom resources are applied to Atlas using the Atlas Admin API and support all the building blocks you need, including projects, clusters, database users, IP access lists, private endpoints, backup, and more. For teams that want to take the IaC route in connecting Atlas to their IDP, Atlas offers integrations with HashiCorp Terraform and AWS CloudFormation, both built on the Atlas Admin API, which can be used to programmatically spin up Atlas services in the cloud environment of their choice. Through provisioning with Terraform, teams can deploy, update, and manage Atlas configurations as code with either the Terraform Provider or the CDKTF. MongoDB also makes it easy for Atlas customers who prefer AWS CloudFormation to manage, provision, and deploy MongoDB Atlas services in three ways: through resources from the CloudFormation Public Registry, AWS Quick Starts, and the AWS CDK. Other programmatic ways that Atlas can be incorporated into an IDP include:

Atlas CLI, which interacts with Atlas from a terminal using short and intuitive commands and accomplishes complex operational tasks, such as creating a cluster or setting up an access list, interactively
Atlas Go SDK, which provides platform-specific and Go language-specific tools, libraries, and documentation to help build applications quickly and easily
Atlas Admin API, which provides a RESTful API, accessed over HTTPS, to interact directly with MongoDB Atlas control plane resources

Get started with MongoDB Atlas today

The fastest way to get started is to create a MongoDB Atlas account from the AWS Marketplace, Azure Marketplace, or Google Cloud Marketplace. Go build with MongoDB Atlas today!
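To give a feel for the GitOps flow described above, here is a minimal sketch of the kind of custom resource the Atlas Kubernetes Operator consumes. The resource, project, and cluster names are placeholders, and the exact spec fields vary by operator version, so treat this as illustrative rather than a copy-paste manifest.

```yaml
# Illustrative AtlasDeployment custom resource (field names should be
# verified against the operator version you run).
apiVersion: atlas.mongodb.com/v1
kind: AtlasDeployment
metadata:
  name: my-atlas-cluster
spec:
  projectRef:
    name: my-atlas-project          # an AtlasProject resource in the same namespace
  deploymentSpec:
    name: idp-demo-cluster
    replicationSpecs:
      - regionConfigs:
          - providerName: AWS
            regionName: US_EAST_1
            electableSpecs:
              instanceSize: M10
              nodeCount: 3
```

Committed to git and applied by ArgoCD or Flux, a manifest like this lets the operator reconcile the desired cluster state against Atlas via the Admin API.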

January 4, 2024

You Asked, We Listened. It's Here - Dark Mode for Atlas is Now Available in Public Preview

We are thrilled to announce a much-anticipated feature for MongoDB Atlas: dark mode is now available in Public Preview for users worldwide. Dark mode has been the number one requested feature in MongoDB's feedback forum, and we've taken note. Users have tried browser plugins and other makeshift fixes, but now the wait is over. Our development team worked diligently to introduce a dark mode option, improving the user experience with a new and refreshing perspective on the familiar interface of Atlas. This update—which includes 300 converted pages—is not just for our community. It also benefits us as developers, promoting a seamless dark mode experience across different tools in the developer workflow. Dark mode is sleek and sophisticated, aligning with the preferred working styles of many of our developers. Remember that this is an ongoing project, and there may be areas within Atlas that need refining. Rest assured, we will be monitoring our feedback channels closely.

Not just a sleek interface

We took a thoughtful approach to the overall dark mode user experience, particularly with respect to accessibility. We ensured that our dark mode theme meets accessibility standards by checking and adjusting all text, illustrations, and UI elements for color and contrast, helping reduce eye strain and accommodating those with light sensitivities while keeping everything easy to read. We also focused on the overall light-to-dark background contrast while staying mindful of how elements may layer or interact with one another. Beyond aesthetics, dark mode is a proven method for extending battery life. For our users with OLED or AMOLED screens, dark mode stretches the device's battery life even further by illuminating fewer pixels and encouraging lower brightness levels.

Health benefits

A typical engineer spends no fewer than eight hours a day in front of a computer, exposing their eyes to multiple digital screens, according to data from Medium.
This screen usage can lead to dry eyes, insomnia, and headaches. While dark text on a light background is best for legibility purposes, light text on a dark background helps reduce eye strain in low-light conditions.

Enable dark mode preview today

To update the theme at any time, navigate to the User Menu in the top right corner, then select User Preferences. Under Appearance, there will be three options:

Light Mode: This is the default color scheme.
Dark Mode: Our new dark theme.
Auto (Sync with OS): This setting will match the operating system's setting.

A few things to keep in mind

This is a user setting and does not impact other users within a project or organization. Dark mode is not currently available for Charts, Documentation, University, or Cloud Manager. Since we are releasing this in Public Preview, there might be some minor visual bugs. The goal of Public Preview releases is to generate interest and gather feedback from early adopters; a Public Preview is not necessarily feature-complete and does not typically include consulting, SLAs, or technical support obligations. We have conducted comprehensive internal testing, and we did not find anything that prevents users from using Atlas. While we are still making a few finishing touches, feel free to share any feedback using this form. Thank you to all our users who provided valuable feedback and waited patiently for this feature! Keep the feedback coming. We hope you enjoy dark mode, designed to improve accessibility, reduce eye strain and fatigue, and enhance readability. We invite you to experience the difference. Try dark mode today through your MongoDB Atlas portal.

November 15, 2023

Data Resilience with MongoDB Atlas

Data is the central currency in today's digital economy. Studies have shown that 43% of companies that experience major data loss incidents are unable to resume business operations. A range of scenarios can lead to data loss, yet within the realm of database technology, they typically fall under three main categories: catastrophic technical malfunctions, human error, and cyber attacks. A data loss event due to a catastrophic breakdown, human error, or cyber attack is not a matter of if, but a matter of when it will occur. Hence, businesses need to focus on how to avoid such events and minimize their effects as much as possible. Failure to effectively address these risks can lead to extended periods of downtime, from a few hours to a few weeks, following an incident. The average cost of a cyber attack is a staggering $4.45 million, with some attacks costing in the hundreds of millions. Reputational harm is harder to quantify but no doubt real and substantial. Your specific industry might also be subject to regulatory frameworks designed to counter cyber attacks, and businesses subject to such regimes must maintain compliance with these requirements, which can shape the configuration of your disaster recovery approach. In this blog post, we'll explain the key disaster recovery (DR) capabilities available with MongoDB Atlas. We'll also cover the core responsibilities and strategies for data resilience, including remediation and recovery objectives (RTO/RPO).

Planning for data resilience in Atlas

Data resilience is not a one-size-fits-all proposition, which is why we offer a range of choices in Atlas for a comprehensive strategy. Our sensible defaults ensure you're automatically safeguarded, while also offering a variety of choices to precisely align with the needs of each individual application. When formulating a disaster recovery plan, organizations commonly begin by assessing their recovery point objective (RPO) and recovery time objective (RTO).
The RPO specifies the amount of data the business can tolerate losing during an incident, while the RTO indicates the speed of recovery. Since not all data carries the same urgency, analyzing the RPO and RTO on a per-application basis is important. For instance, critical customer data might have specific demands compared to clickstream analytics. The criteria for RTO, RPO, and the length of time you need to retain backups will influence the financial and performance implications of maintaining backups. With MongoDB Atlas, we provide standard protective measures by default, with customizable options for tailoring protection to the service level agreements specified by the RPO and RTO in your DR plan. These are enhanced by additional features that can be leveraged to achieve greater levels of availability and durability for your most vital workloads. These features can be grouped into two main categories: prevention and recovery.

Backup, granular recovery, and resilience

There are many built-in features designed to prevent disasters from ever happening in the first place. Some key features and capabilities that enable a comprehensive prevention strategy include multi-region and multi-cloud clusters, encryption at rest, Queryable Encryption, cluster termination safeguards, backup compliance protocols, and the capability to test resilience. (We will discuss these features in depth in part two of this series.) While prevention might satisfy the resilience needs of certain applications, other applications may demand greater resilience against failures based on the business requirements of data protection and disaster recovery. MongoDB provides comprehensive management of data backups, including the geographic distribution of backups across multiple regions and the ability to prevent backups from being deleted, all through an automated retention schedule.
Recovery capabilities are aimed at supporting RTO and minimizing data loss and include continuous cloud backups with point-in-time recovery. Atlas cloud backups utilize the native snapshot feature of your cluster's cloud service provider, ensuring backup storage is kept separate from your MongoDB Atlas instances. Backups are essentially snapshots that capture the condition of your database cluster at a specific moment. They serve as a safeguard in case data is lost or becomes corrupted. For M10+ clusters, you have the option of utilizing Atlas Cloud Backups, which leverage the cluster's cloud service provider for storing backups in a localized manner. Atlas comes with strong default backup retention of 12 months out of the box. You also have the option to customize snapshot and retention schedules, including the time of day for snapshots, the frequency at which snapshots are taken over time, and retention duration. Another important feature is continuous cloud backup with point-in-time recovery, which enables you to restore data to the moment just before any incident or disruption, such as a cyber attack. To ensure your backups are regionally redundant and you can still restore even if the primary region that your backups are in is down, MongoDB Atlas offers the ability to copy these critical backups, with the point-in-time data, to any secondary region available from your cloud provider in Atlas. For the most stringent regulations, or for businesses that want to ensure backups are available even after a bad actor or cyber attack, MongoDB Atlas can ensure that no user, regardless of role, can ever delete a backup before a predefined protected retention period with the Backup Compliance Policy. Whatever your regulatory obligations or business needs are, MongoDB Atlas provides the flexibility to tailor your backup settings for requirements. 
Crucially, this ensures you can recover quickly, minimizing data loss and meeting your RPO in the event of a disaster recovery scenario. When properly configured, testing has shown that Atlas can quickly recover to the exact timestamp before a disaster or failure event, giving you a one-minute RPO and an RTO of less than 15 minutes when utilizing optimized restores. Recovery times can vary due to cloud provider disk warming and the point in time you are restoring to, so it is important to test this regularly. In short, whatever your regulatory or business requirements, MongoDB Atlas allows you to configure your backups so that, should you experience a recovery event, you can recover with precision and speed, keep data loss minimal, and meet your recovery point objectives.

Conclusion

As regulations and business needs continue to evolve, and cyber attacks become more sophisticated and varied, creating and implementing a data resilience strategy can be simple and manageable. MongoDB Atlas comes equipped with built-in measures that deliver robust data resilience at the database layer, ensuring your ability to both avoid incidents and promptly restore operations with minimal data loss if an incident does occur. Furthermore, setting up and overseeing additional advanced data resilience features is straightforward, with automation driven by a pre-configured policy that operates seamlessly at any scale. This streamlined approach supports compliance without the need for manual interventions, all within the MongoDB Atlas platform. For more information on the data resilience and disaster recovery features in MongoDB Atlas, download the Data Resilience Strategy with MongoDB Atlas whitepaper. To get started on Atlas, we invite you to launch a free tier today.
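To make the per-application RPO/RTO analysis above concrete, here is a small, hypothetical planning helper (not part of any Atlas SDK): given how often restorable state is captured and how long a restore takes, it checks whether an application's objectives are met. The one-minute figure for continuous backup mirrors the point-in-time recovery granularity discussed above and is an assumption of this sketch.

```python
# Hypothetical DR-planning helper: check a backup configuration against
# an application's RPO/RTO targets. All durations are in minutes.

def meets_objectives(snapshot_interval_min: int, estimated_restore_min: int,
                     rpo_min: int, rto_min: int,
                     continuous_backup: bool = False) -> dict:
    # With continuous cloud backup you can restore to roughly any point in
    # time, so worst-case data loss is about a minute; otherwise it is the
    # full interval between snapshots.
    worst_case_data_loss = 1 if continuous_backup else snapshot_interval_min
    return {
        "rpo_met": worst_case_data_loss <= rpo_min,
        "rto_met": estimated_restore_min <= rto_min,
    }

# Critical app: 1-minute RPO and 15-minute RTO, continuous backup enabled.
print(meets_objectives(360, 12, rpo_min=1, rto_min=15, continuous_backup=True))
# The same app relying on 6-hourly snapshots alone cannot meet a 1-minute RPO.
print(meets_objectives(360, 12, rpo_min=1, rto_min=15, continuous_backup=False))
```

Running this kind of check per application, as the post recommends, makes it obvious which workloads justify continuous backup and which can live with scheduled snapshots.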

October 3, 2023

Introducing Atlas for the Edge

Update May 2, 2024: Atlas Edge Server is now in public preview. Check out our blog to learn more.

We are thrilled to introduce MongoDB Atlas for the Edge at MongoDB.local London. This new solution is designed to streamline the management of data generated across various sources at the edge, including devices, on-premises data centers, and the cloud. Edge computing, which brings data processing closer to end users, offers significant advantages. At the same time, it often proves challenging due to complex networking, data volume management, and security concerns, which can deter many organizations. Edge deployments are also costly to build, maintain, and scale. Some challenges organizations face include:

Significant technical expertise required to manage the complexity of networking and the high volumes of distributed data needed to deliver reliable applications that run anywhere
Stitching together hardware and software solutions from multiple vendors, resulting in complex and fragile systems that are often built using legacy technology limited by one-way data movement and requiring specialized skills to manage and operate
Constant optimization of edge devices due to their constraints — like limited data storage and intermittent network access — which makes keeping operational data in sync between edge locations and the cloud difficult
Security vulnerabilities and frequent firmware patches and updates to ensure data privacy and compliance

MongoDB Atlas for the Edge simplifies all of these manual tasks. It allows MongoDB to run on diverse edge infrastructure, from self-managed, on-premises servers to cloud deployments offered by major cloud providers. Data seamlessly flows between and is kept synchronized across all sources, ensuring real-time data delivery with minimal latency. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
With MongoDB Atlas for the Edge, organizations can now use a single, unified interface to deliver a consistent and frictionless development experience from the edge to the cloud — and everything in between. Together, the capabilities included with MongoDB Atlas for the Edge allow organizations to significantly reduce the complexity of building edge applications and architectures: Run MongoDB on a variety of edge infrastructure for high reliability with ultra-low latency: With MongoDB Atlas for the Edge, organizations can run applications on MongoDB using a wide variety of infrastructure, including self-managed, on-premises servers, such as those in remote warehouses or hospitals, in addition to edge infrastructure managed by major cloud providers including AWS, Google Cloud, and Microsoft Azure. For example, data stored in MongoDB Enterprise Advanced on self-managed servers can be automatically synced with MongoDB Atlas Edge Server on AWS Local Zones and MongoDB Atlas in the cloud to deliver real-time application experiences to edge devices with high reliability and single-digit millisecond latency. MongoDB Atlas for the Edge allows organizations to deploy applications anywhere, even in remote, traditionally disconnected locations — and keep data synchronized between edge devices, edge infrastructure, and the cloud — to enable data-rich, fault-tolerant, real-time application experiences. Atlas Edge Server is now in private preview; learn more on our product page. Run applications in locations with intermittent network connectivity: With Atlas Edge Server and Atlas Device Sync, organizations can use a pre-built, local-first data synchronization layer for applications running on kiosks or on mobile and IoT devices to prevent data loss and improve offline application experiences.
MongoDB Edge Servers can be deployed in remote locations to allow devices to sync directly with each other—without the need for connectivity to the cloud—using built-in network management capabilities. Once network connectivity is available, data is automatically synchronized between devices and the cloud to ensure applications are up to date for use cases like inventory and package tracking across supply chains, optimizing delivery routes in remote locations, and accessing electronic health records with intermittent network connectivity. Build and deploy AI-powered edge computing applications: Generative AI and machine learning technologies need data to function, and Atlas for the Edge provides the data transport necessary to deliver low-latency, intelligent functionality directly on devices at the edge—even when network connectivity is unavailable. For example, data stored in MongoDB Atlas can be enriched with embeddings using Atlas Vector Search. These documents can be synchronized down to mobile or edge devices using Atlas Device Sync, and the embeddings can then be used with platform-specific libraries like Core ML to perform ML classification. In the other direction, data is the fuel for training AI models, and edge computing developers spend a great deal of time writing undifferentiated code to synchronize data to the cloud, particularly in locations with poor connectivity. By gathering data at the edge and using Atlas Device Sync to synchronize it to the cloud, that data can then be used to train models, or fed to Atlas Vector Search to generate embeddings and power relevance search. Store and process real-time and batch data from IoT devices to make it actionable: With MongoDB Atlas Stream Processing, organizations can ingest and process high-velocity, high-volume data from millions of IoT devices (e.g., equipment sensors, factory machinery, medical devices) in real-time streams or in batches when network connectivity is available.
Data can then be easily aggregated, stored, and analyzed using MongoDB Time Series collections for use cases like predictive maintenance and anomaly detection with real-time reporting and alerting capabilities. MongoDB Atlas for the Edge provides all of the tools necessary to process and synchronize virtually any type of data across edge locations and the cloud to ensure consistency and availability. Easily secure edge applications for data privacy and compliance: MongoDB Atlas for the Edge helps organizations ensure their edge deployments are secure with built-in security capabilities. The Atlas Device SDK provides out-of-the-box data encryption at rest, on devices, and in transit over networks to ensure data is protected and secure. Additionally, Atlas Device Sync provides fine-grained role-based access, with built-in identity and access management (IAM) capabilities that can also be combined with third-party IAM services to easily integrate edge deployments with existing security and compliance solutions. Some of the leading organizations are leveraging Atlas for the Edge today. For example: Cathay Pacific , Hong Kong’s home airline providing passenger and cargo services to destinations around the world, understood the need for digital transformation in their critical pilot briefing process and in-flight operations. With MongoDB Atlas, they were the very first to digitize their flight operations process with an iPad app, Flight Folder, enabling one of the first zero paper flights in the world in September of 2019. MongoDB’s developer data platform met their requirements for this and many other projects, successfully improving costs, operational efficiency, and accuracy, while also reducing environmental impact. Read the case study to learn more. Cloneable provides low/no-code tools to enable instant deployment of AI applications to a spectrum of devices—mobile, IoT devices, robots, and beyond. 
“We collaborated with MongoDB because Atlas for the Edge provided capabilities that allowed us to move faster while providing enterprise-grade experiences,” said Tyler Collins, CTO at Cloneable. “For example, the local data persistence and built-in cloud synchronization provided by Atlas Device Sync enables real-time updates and high reliability, which is key for Cloneable clients bringing complex, deep tech capabilities to the edge. Machine learning models distributed down to devices can provide low-latency inference, computer vision, and augmented reality. Atlas Vector Search enables vector embeddings from images and data collected from various devices to allow for improved search and analyses. MongoDB supports our ability to streamline and simplify heavy data processes for the enterprise.” To learn more about the solution announced today, and find out how retailers and healthcare organizations are leveraging the solution, please visit the web page for Atlas for the Edge .

September 26, 2023

Boosting Performance and Insights with MongoDB Atlas and New Relic

In order to keep up with the demands of the modern business landscape, organizations must prioritize monitoring and optimizing application performance. Today, we're excited to announce a powerful collaboration between MongoDB Atlas, the leading database-as-a-service platform, and New Relic, the renowned monitoring and observability solution. This integration enables users to seamlessly monitor, analyze, and optimize their MongoDB deployments with efficiency and ease. Observability to ensure availability and uptime Monitoring a MongoDB Atlas deployment is now simpler than ever with the integration of New Relic. By connecting the two platforms, users can effortlessly monitor MongoDB performance metrics, such as latency, throughput, and error rates, directly within the New Relic user interface. With real-time visibility into the health and performance of their databases, developers and operations teams can quickly identify potential bottlenecks and proactively address issues before they impact end-users. Similarly, administrators can get an immediate high-level view of the health and availability of their MongoDB databases. Intelligent alerting and notifications The combination of MongoDB Atlas and New Relic empowers users to set up intelligent alerts and notifications tailored to their specific business requirements. Leveraging New Relic's alerting capabilities, users can create custom alert policies based on performance metrics and query patterns. Whether it's a sudden increase in response times or an unexpected spike in query access patterns, teams can receive timely notifications via email, Slack, or other preferred channels, enabling them to take immediate action. Powerful dashboarding and reporting Users can also take advantage of comprehensive dashboarding and reporting capabilities. With customizable dashboards and rich visualizations, users can gain real-time insights into the performance of their MongoDB clusters.
Additionally, New Relic's reporting tools enable teams to generate detailed reports on database performance, query analytics, and overall system health, empowering them to make data-driven decisions and track improvements over time. By combining the strengths of these two powerful platforms, users can now unlock a new level of control and efficiency in managing their MongoDB databases. From streamlined monitoring and analysis to improved troubleshooting and enhanced collaboration, this integration leads organizations to proactively optimize their applications, ensure scalability, and deliver exceptional user experiences. With MongoDB Atlas and New Relic working cohesively, businesses can stay ahead in today's rapidly evolving digital landscape, where performance and efficiency are key differentiators. If you’d like to see the MongoDB Atlas and New Relic integration in action, sign up for MongoDB Atlas , our cloud database service, and learn more about New Relic’s monitoring and observability capabilities .

August 24, 2023

Real-Time Inventory Tracking with Computer Vision & MongoDB Atlas

In today’s rapidly evolving manufacturing landscape, digital twins of factory processes have emerged as a game-changing technology. But why are they so important? Digital twins serve as virtual replicas of physical manufacturing processes, allowing organizations to simulate and analyze their operations in a virtual environment. By incorporating artificial intelligence and machine learning, organizations can interpret and classify objects, leading to cost reductions, faster throughput speeds, and improved quality levels. Real-time data, especially inventory information, plays a crucial role in these virtual factories, providing up-to-the-minute insights for accurate simulations and dynamic adjustments. In the first blog , we covered a 5-step high level plan to create a virtual factory. In this blog, we delve into the technical aspects of implementing a real-time computer vision inventory inference solution as seen in Figure 1 below. Our focus will be on connecting a physical factory with its digital twin using MongoDB Atlas, which facilitates real-time interaction between the physical and digital realms. Let's get started! Figure 1: High Level Overview Part 1: The physical factory sends data to MongoDB Atlas Let’s start with the first task of transmitting data from the physical factory to MongoDB Atlas. Here, we focus on sending captured images of raw material inventory from the factory to MongoDB for storage and further processing as seen in Figure 2. Using the MQTT protocol, we send images as base64 encoded strings. AWS IoT Core serves as our MQTT broker, ensuring secure and reliable image transfer from the factory to MongoDB Atlas. Figure 2: Sending images to MongoDB Atlas via AWS IoT Core For simplicity purposes, in this demo, we directly store the base64 encoded image strings in MongoDB documents. This is because each image received from the physical factory is small enough to fit into one document. 
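As a sketch of the device-side publishing step, the captured image can be base64-encoded and wrapped in a JSON message before being handed to an MQTT client. The topic name and payload field names below are illustrative assumptions, not the demo's actual schema:

```python
import base64
import json


def build_inventory_message(image_bytes: bytes, camera_id: str) -> str:
    """Wrap a captured image in the JSON payload published over MQTT.

    Field names ("cameraId", "image") are assumptions for illustration.
    """
    payload = {
        "cameraId": camera_id,
        # base64 turns raw bytes into an ASCII-safe string for JSON transport
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }
    return json.dumps(payload)


# A real device would hand this string to an MQTT client, e.g.:
#   client.publish("factory/inventory/images", build_inventory_message(img, "cam-01"))
message = build_inventory_message(b"\x89PNG...", "cam-01")
decoded = json.loads(message)
assert base64.b64decode(decoded["image"]) == b"\x89PNG..."
```

Because the encoded string travels inside an ordinary JSON document, the same payload can pass through an AWS IoT Core rule and land in a MongoDB document unchanged.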
However, this is not the only method to work with images (or generally large files) in MongoDB. Within our developer data platform , we have various storage methods, including GridFS for larger files or binary data for smaller ones (less than 16MB). Moreover, object storage services like AWS S3 or Google Cloud Storage, coupled with MongoDB data federation are commonly used in production scenarios. In this real-world scenario, integrating object storage services with MongoDB provides a scalable and cost-efficient architecture. MongoDB is excellent for fast and scalable reads and writes of operational data, but when retrieving images with very low latency is not a priority, the storage of these large files in ‘buckets’ helps reduce costs while getting all the benefits of working with MongoDB Atlas. Robert Bosch GmbH , for instance, uses this architecture for Bosch's IoT Data Storage , which helps service millions of devices worldwide efficiently. Coming back to our use case, to facilitate communication between AWS IoT Core and MongoDB, we employ Rules defined in AWS IoT Core, which helps us send data to an HTTPS endpoint. This endpoint is configured directly in MongoDB Atlas and allows us to receive and process incoming data. If you want to learn more about MongoDB Data APIs, check this blog from our Developer Center colleagues. Part 2: MongoDB Atlas to AWS SageMaker for CV prediction Now it’s time for the inference part! We’ve trained a built-in multi-label classification model provided by Sagemaker, using images like in Figure 3. 
The images were annotated using an .lst file, following this schema: So in an image where the red and white pieces are present but no blue piece is in the warehouse, we would have an annotation such as: Figure 3: Sample image used for the Computer Vision model The model was built using 24 training images and 8 validation images, a deliberate simplification intended to demonstrate the capabilities of the implementation rather than to build a powerful model. Despite the extremely small training/validation sample, we managed to achieve a 0.97 validation accuracy. If you want to learn more about how the model was built, check out the GitHub repo. With a model trained and ready to predict, we created a model endpoint in SageMaker to which we send new images through a POST request and receive the predicted values in response. We use an Atlas Function to drive this functionality. Every minute, it grabs the latest image stored in MongoDB, sends it to the SageMaker endpoint, and waits for the response. The response is an array of three decimal values between 0 and 1, representing the likelihood of each piece (blue, red, white) being in stock. We interpret these values with a simple rule: if a value is above 0.85, we consider the corresponding piece to be in stock. Finally, the same Atlas Function writes the results to a collection (Figure 4) that keeps the current state of the physical factory's inventory. More details about the function here. The beauty comes when MongoDB Realm is incorporated into the virtual factory, as seen in Figure 5. It is automatically and seamlessly synced with MongoDB Atlas through Device Sync: the moment we update the collection holding the physical factory's inventory status in MongoDB Atlas, the virtual factory, with Realm, is automatically updated.
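The thresholding rule performed by the Atlas Function can be sketched in Python. The 0.85 cut-off and the blue/red/white class order come from the description above; the function name and output shape are assumptions for illustration:

```python
IN_STOCK_THRESHOLD = 0.85
CLASS_ORDER = ["blue", "red", "white"]  # assumed order of the model's output array


def interpret_prediction(scores):
    """Map SageMaker's per-class probabilities to a stock-status mapping.

    A piece is considered in stock when its score exceeds 0.85.
    """
    return {
        color: score > IN_STOCK_THRESHOLD
        for color, score in zip(CLASS_ORDER, scores)
    }


# e.g. the model is confident about blue and white, but not red:
status = interpret_prediction([0.97, 0.12, 0.91])
# → {"blue": True, "red": False, "white": True}
```

In the demo, the resulting mapping would then be written by the Atlas Function into the inventory-status collection shown in Figure 4.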
The advantage here, besides not needing any additional lines of code for the data transfer, is that conflict resolution is handled out of the box: when the connection is lost, data is not lost but is instead updated as soon as the connection is re-established. This enables a real-time synchronized digital twin without the hassle of managing data pipelines, handling edge cases in your code, and losing time on undifferentiated work. Figure 5: Connecting Atlas and Realm via Device Sync As an example of how companies are implementing Realm and Device Sync for mission-critical applications: the airline Cathay Pacific revolutionized how pilots log critical flight data such as wind speed, elevation, and oil pressure. Historically, this was done manually with pen and paper until they switched to a fully digital, tablet-based app built with MongoDB, Realm, and Device Sync. With this, they eliminated all paper from flights and performed one of the first zero-paper flights in the world in 2019. Check out the full article here. As you can see, the combination of these technologies enables the development of truly connected, highly performant digital twins within a single platform. Part 3: CV results are sent to Digital Twin via Device Sync Sending data to the digital twin through Device Sync is a straightforward procedure. First, developers navigate to Atlas and access the Realm SDK section. There, they choose their preferred programming language, and the data models are automatically pre-built based on the schemas defined in the MongoDB collections. MongoDB Atlas simplifies this task by offering copy-paste functionality, as seen in Figure 6, eliminating the need to construct data models from scratch. For this specific project, the C# SDK was utilized.
However, developers have the flexibility to select from various SDK options, including Kotlin, C++, Flutter, and more, depending on their preferences and project requirements. Once the data models are in place, simply activating Device Sync completes the setup. This enables seamless bidirectional communication, and developers can now send data to their digital twin effortlessly. Figure 6: Realm C# SDK Object Model example One of the key advantages of using Device Sync is its built-in conflict resolution capability. Whether facing offline interruptions or any conflicting changes, MongoDB Atlas automatically manages conflict resolution. This "always-on" behavior is particularly crucial for digital twins, ensuring constant synchronization between the device and MongoDB Atlas. This powerful feature saves developers significant time that would otherwise be spent on building custom conflict resolution mechanisms, error-handling functions, and connection-handling methods. With Device Sync handling conflict resolution out of the box, developers can focus on building and improving their applications, confident in the seamless synchronization of data between the digital twin and MongoDB Atlas. Part 4: Virtual factory sends inventory status to the user For this demonstration, we built the digital twin of our physical factory using Unity so that it can be interactive through a VR headset. With this, the user can order a piece in the physical world by interacting with the virtual twin, even if the user is thousands of miles away from the real factory. In order to control the physical factory through the headset, it's crucial that the app informs the user whether or not a piece is in stock, and this is where Realm and Device Sync come into play. Figure 7: User is informed of which pieces are not in stock in real time.
In Figure 7, the user tried to order a blue piece on the digital twin, and the app informs them that the piece is not in stock, therefore activating the order on neither the physical factory nor its digital twin. Behind the scenes, the app reads the Realm object that stores the physical factory's stock status and decides whether the piece is orderable. Remember that this Realm object is in real-time sync with MongoDB Atlas, which in turn constantly updates the stock status in the collection shown in Figure 4 based on SageMaker inferences. Conclusion In this blog, we presented a four-part process demonstrating the integration of a virtual factory and computer vision with MongoDB Atlas. This solution enables transformative real-time inventory management for manufacturing companies. If you're interested in learning more and getting hands-on experience, feel free to explore our accompanying GitHub repository for further details and a practical implementation.

August 1, 2023

Ambee's AI Environmental Data Revolution: Powered by MongoDB Atlas

Ambee , a fast-growing climate tech start-up based in India, is making waves in the world of environmental data with its mission to create a sustainable future. With over 1 million daily active users, Ambee provides proprietary climate and environmental data-as-a-service to empower governments, healthcare organizations, and private companies to make informed decisions about their policies and business strategies. Their comprehensive data encompasses emissions, pollen levels, air quality, soil conditions, and more, all crucial for driving climate action while positively impacting businesses’ bottom lines. Ambee's pollen and air quality map From the outset, MongoDB Atlas has been at the core of Ambee's database architecture, supporting their AI and ML models. Ambee needed something that could manage a vast and diverse data set. MongoDB's flexible document model proved to be a perfect fit, enabling them to store all their data in one centralized location and operationalize it for various use cases. On average, Ambee adds around 10 to 15GB of data every hour. A significant advantage of MongoDB for Ambee lies in its ability to handle geospatial data, a critical element for their work. With data sourced from satellites, paid data providers, soil readings, airplanes, proprietary IoT devices, and much more, Ambee relies on MongoDB's geospatial capabilities to provide accurate and granular geographical insights. This precision is one of Ambee's key differentiators, setting them apart in the industry. Ambee's use of artificial intelligence adds another layer of value to their data services. By running AI models on MongoDB Atlas, they not only deliver data-as-a-service to their clients but also provide intelligent recommendations. Ambee's AI-driven platform, Ambee AutoML, serves as a central repository, enabling developers with limited machine learning expertise to train high-quality models. 
This democratization of machine learning empowers a broader audience to harness its potential, crucial in Ambee’s aim to fight climate change with data. The practical application of Ambee's AI and data services is amazing. Ambee's data powers many companies across the Fortune 500, including Boots, Kimberly Clark, and many more, to support a variety of use cases. Be it personalized marketing or digital healthcare, Ambee's datasets have helped businesses worldwide achieve remarkable results. For instance, Boots, a leading British health and beauty retailer, uses Ambee's data to identify regions where pollen and environmental factors trigger allergies. AI recommendations help allocate resources efficiently, enabling Boots to mitigate the impact of allergies and enhance their bottom line while aiding more individuals in need. Ambee has also made a US pollen and air quality map publicly available for anyone to check. In another use case, Ambee employs AI models to forecast forest fires and their potential outcomes in the U.S. and Canada, providing organizations with critical warnings to protect lives and property in wildfire-prone areas. Ambee's forest fire dashboard Ambee's future looks promising as they continue to grow covering more regions and incorporating more data, all of which makes its AI-powered services more powerful. The company's APIs are designed for ease of use, and to be as simple as possible for developers to get started with. This ease of use and extensive documentation is helping drive the popularity of the service. The MongoDB-powered APIs are getting more than 10 million calls every day. Madhusudhan Anand, CTO Ambee said: "Our work with MongoDB Atlas showcases how we can create a sustainable future by providing easily accessible environmental data. MongoDB's unique capabilities in handling diverse data and geospatial information have been instrumental in our success. Together, we are shaping a greener world." 
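The geospatial lookups described above map to MongoDB's documented geospatial operators, such as `$near` and `$geoWithin`, which run against a `2dsphere` index. A minimal sketch of building such a query document in Python follows; the field names ("location") and collection are assumptions for illustration, while the operator names are MongoDB's documented syntax:

```python
def near_query(lon: float, lat: float, max_meters: int) -> dict:
    """Build a $near query for environmental readings around a coordinate.

    Requires a 2dsphere index on the (assumed) "location" field.
    GeoJSON uses [longitude, latitude] order.
    """
    return {
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": max_meters,  # metres from the point
            }
        }
    }


# Passed to a driver, e.g.: db.pollen.find(near_query(-0.1276, 51.5072, 5000))
query = near_query(-0.1276, 51.5072, 5000)
```

This kind of query is what lets a service answer "what are the pollen levels within 5 km of this user?" in a single indexed lookup.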
Extensive API Documentation As Ambee's popularity and impact continue to grow, its suite of data-driven products is expanding substantially. The company will soon be launching a number of sophisticated tools and platforms to help businesses take their operations to the next level. The next big piece will be C6—a carbon management and accounting platform. Ambee aims to help companies measure, report, and reduce their digital emissions. This will be followed by a programmatic advertising tool that can run campaigns based on environment triggers. All of which will be powered by MongoDB Atlas. And to unlock these innovative AI solutions, Ambee's team is looking to take advantage of the full developer data platform. For example, using MongoDB Atlas Federated Queries and Atlas search to make 70TB of exclusive environmental data operational.

July 26, 2023

Introducing the Certified MongoDB Atlas Connector for Power BI

This is a collaborative post from MongoDB and Microsoft. We thank Alexi Antonino, Natacha Bagnard, Jad Jarouche from MongoDB, and Bob Zhang, Mahesh Prakriya, and Rajeev Jain from Microsoft for their contributions. Introducing the MongoDB Atlas Connector for Power BI, the certified solution that facilitates real-time insights on your Atlas data directly in the Power BI interfaces that analysts know and love! Supporting Microsoft's Intelligent Data Platform, this integration bridges the gap between development and analytics teams, allowing analysts who rely on Power BI for insights to natively transform, analyze, and share dashboards that incorporate live MongoDB Atlas data. Available in June, the Atlas Power BI Connector empowers companies to harness the full power of their data like never before. Let's take a deeper look into how the Atlas Power BI Connector can unlock comprehensive, real-time insights on live application data that will help take your business to the next level. Effortlessly model document data with Power Query The Atlas Power BI Connector makes it easy to model document data with native Power BI features and data modeling capabilities. With its SQL-92 compatible dialect, mongosql, you can transform heavily nested document data to fit your exact needs, all from your Power Query dashboard. Gain real-time insights on live application data By using the Power BI Connector to connect directly to MongoDB Atlas, you can build up-to-date dashboards in Power BI Desktop and scale insights to your organization through Power BI Service with ease. With no delays caused by data duplication, you can stay ahead of the curve by unlocking real-time insights on Atlas data that are relevant to your business. Empower cross-source data analysis The Power BI Connector's integration with MongoDB Atlas enables you to seamlessly model, analyze, and share insightful dashboards that are built from multiple data sources.
By combining Atlas's powerful Data Federation capabilities with Power BI's advanced analytics and visualization tools, you can easily create comprehensive dashboards that offer valuable insights into your data, regardless of where it is stored. See it in action Log in and activate the Atlas SQL Interface to try out the Atlas Power BI Connector ! If you are new to Atlas or Power BI, get started for free today on Azure Marketplace or Power BI Desktop .

May 23, 2023

MongoDB Atlas Integrations for AWS CloudFormation and CDK are now Generally Available

Infrastructure as Code (IaC) tools allow developers to manage and provision infrastructure resources through code, rather than through manual configuration. IaC has empowered developers to apply best practices from software development to application infrastructure deployments. This includes: Automation - helping to ensure repeatable, consistent, and reliable infrastructure deployments Version Control - checking IaC code into GitHub, BitBucket, AWS CodeCommit, or GitLab for improved team collaboration and higher code quality Security - creating clear audit trails of each infrastructure modification Disaster Recovery - IaC scripts can be used to quickly recreate infrastructure in the event of availability zone or region outages Cost Savings - preventing overprovisioning and waste of cloud resources Improved Compliance - making it easier to enforce organizational policies and standards Today we are doubling down on our commitment to IaC and announcing MongoDB Atlas integrations with AWS CloudFormation and the AWS Cloud Development Kit (CDK). AWS CloudFormation allows customers to define and provision infrastructure resources using JSON or YAML templates. CloudFormation provides a simple way to manage infrastructure as code and automate the deployment of resources. AWS Cloud Development Kit (CDK) is an open-source software development framework that allows customers to define cloud infrastructure in code and provision it through AWS CloudFormation. It supports multiple programming languages and allows customers to use high-level abstractions to define infrastructure resources. These new integrations are built on top of the Atlas Admin API and allow users to automate infrastructure deployments by making it easy to provision, manage, and control Atlas infrastructure as code in the cloud. MongoDB Atlas & AWS CloudFormation: To meet developers where they are, we now have multiple ways to get started with MongoDB Atlas using AWS Infrastructure as Code.
Each of these allows users to provision, manage, and control Atlas infrastructure as code on AWS: Option 1: AWS CloudFormation Customers can begin their journey using Atlas resources directly from the AWS CloudFormation Public Registry. We currently have 33 Atlas resources and will continue adding more. Examples of available Atlas resources today include: Dedicated Clusters, Serverless Instances, AWS PrivateLink, Cloud Backups, and Encryption at Rest using Customer Key Management. In addition, we have published these resources to 22 (and counting) AWS Regions where MongoDB Atlas is supported today. Learn how to get started via this quick demo. Option 2: AWS CDK After its launch in 2019 as an open source project, AWS CDK has gained immense popularity among the developer community, with over a thousand external contributors and more than 1.3 million weekly downloads. AWS CDK abstracts away the low-level details of cloud infrastructure, making it easier for developers to define and manage their infrastructure natively in their programming language of choice. This helps to simplify the deployment process and eliminates context switching. Under the hood, AWS CDK synthesizes CloudFormation templates on your behalf, which are then deployed to AWS accounts. In AWS CDK, L1 (Level 1) and L2 (Level 2) constructs refer to two different levels of abstraction for defining infrastructure resources: L1 constructs are lower-level abstractions that provide a one-to-one mapping to AWS CloudFormation resources. They are essentially AWS CloudFormation resources wrapped in code, making them easier to use in a programming context. L2 constructs are higher-level abstractions that provide a more user-friendly and intuitive way to define AWS infrastructure. They are built on top of L1 constructs and provide a simpler and more declarative API for defining resources.
Today we announce MongoDB Atlas availability for AWS CDK in JavaScript and TypeScript, with plans for Python, Java, Go, and .NET support coming later in 2023. Now customers can easily deploy and manage all available Atlas resources by vending AWS CDK applications with prebuilt L1 constructs. We also have a growing number of L2 and L3 CDK constructs available. These include constructs that help users quickly deploy the core resources they need to get started with MongoDB Atlas on AWS in just a few lines of JavaScript or TypeScript (see awscdk-resources-mongodbatlas to learn more). Users can also optionally add more advanced networking configurations such as VPC peering and AWS PrivateLink. Option 3: AWS Partner Solutions (previously AWS Quick Starts) Instead of manually pulling together multiple Atlas CloudFormation resources, AWS Partner Solutions gives customers access to pre-built CloudFormation templates for both general and specific use cases with MongoDB Atlas. By using AWS Partner Solutions templates, customers can save time and effort compared to architecting their deployments from scratch. These were jointly created and incorporate best practices from MongoDB Atlas and AWS. Go to the AWS Partner Solutions Portal to get started. Start building today! These MongoDB Atlas integrations with AWS CloudFormation are free and open source, licensed under Apache License 2.0. Users only pay for the underlying Atlas resources created and can get started with the Atlas always-free tier (M0 clusters). Getting started today is faster than ever with MongoDB Atlas and AWS CloudFormation. We can't wait to see what you will build next with this powerful combination! Learn more about MongoDB Atlas integrations with AWS CloudFormation
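To illustrate Option 1 above, a minimal CloudFormation template using a public-registry resource type might look like the following sketch. The `MongoDB::Atlas::Cluster` type name comes from the public registry, but the property names and values here are illustrative assumptions; consult the resource documentation before using them:

```yaml
# Illustrative sketch only — property names and values are assumptions,
# not a verified template.
Parameters:
  AtlasProjectId:
    Type: String
Resources:
  AtlasCluster:
    Type: MongoDB::Atlas::Cluster
    Properties:
      ProjectId: !Ref AtlasProjectId
      Name: my-first-cluster
      ClusterType: REPLICASET
      ReplicationSpecs:
        - AdvancedRegionConfigs:
            - RegionName: US_EAST_1
              Priority: 7
              ElectableSpecs:
                InstanceSize: M10
                NodeCount: 3
```

An equivalent CDK application would synthesize a template of this shape on your behalf from a few lines of JavaScript or TypeScript.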

February 28, 2023

COSMOS SQL Migration to MongoDB Atlas

Azure Cosmos DB is Microsoft's proprietary globally distributed, multi-model database service. Cosmos DB supports a SQL interface as one of its models, in addition to the Cosmos DB API for MongoDB. Even customers who use the SQL interface choose Cosmos DB for its document model and the convenience of working with SQL. We have seen customers struggle with scalability issues and costs on Cosmos DB and want to move to MongoDB Atlas.

Migrating an application from Cosmos DB SQL to MongoDB Atlas involves both application refactoring and data migration from Cosmos DB to MongoDB. The current tool set for migrating data from Cosmos DB SQL to MongoDB Atlas is fairly limited. While the Azure Data Migration tool can be used for a one-time export, it cannot satisfy the zero-downtime migrations customers frequently need: all writes to the source Cosmos DB SQL database must be stopped before the data migration can be performed. This puts a lot of pressure on the customer in terms of downtime requirements and planning out the migration.

PeerIslands has built a Cosmos DB SQL migrator tool that addresses these concerns. The tool provides a way to perform Cosmos DB SQL migration with near-zero downtime. The architecture of the tool is explained below.

Initial Snapshot

The tool uses the native Data Migration tool to export data as JSON files from the Azure Cosmos DB SQL API. The Data Migration tool is an open-source solution that imports/exports data to/from Azure Cosmos DB. The exported JSON data is then imported into MongoDB Atlas using mongoimport.

Figure 1: Initial Snapshot processing stages.

Change data capture

Using the combination of the above tools, we complete the initial snapshot. But what happens to documents that are updated or newly inserted during the migration? Just prior to the initial snapshot process being started, the migration tool starts the change capture process.
The migration tool listens to the ongoing changes in Cosmos DB using the Kafka source connector provided by Azure and pushes the changes to a Kafka topic. Optionally, KSQL can be used to perform any transformation required. Once the changes are in Kafka, the migration tool uses the MongoDB Atlas sink connector to push the ongoing messages to the Atlas cluster. Below is a diagram depicting the flow of change stream messages from Cosmos DB SQL to MongoDB.

Figure 2: The flow of change stream messages from Cosmos SQL to MongoDB

The Cosmos DB SQL migration tool provides a GUI-based point-and-click interface that brings together the above capabilities for handling the entire migration process. Because the tool is capable of change data capture, it provides a lot of flexibility for migrating your data without any downtime.

Figure 3: Cosmos SQL migration tool dashboard

In addition to data migration, PeerIslands can help with the complete application refactoring required to migrate off the Cosmos DB SQL interface. Reach out to PeerIslands if you need to migrate from Cosmos DB SQL to MongoDB Atlas.
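To make the transformation step concrete, here is a hedged sketch of the kind of per-document reshaping such a pipeline might apply (whether in KSQL or a custom step): Cosmos DB SQL documents carry system fields such as `_rid`, `_etag`, and `_ts`, and the mapping chosen below (promote `id` to `_id`, drop system fields, convert the `_ts` epoch to a date) is an assumption for illustration, not the PeerIslands tool's actual behavior.

```typescript
// Cosmos DB SQL API document shape, including the server-managed
// system fields that have no meaning once the data lands in MongoDB.
type CosmosDoc = {
  id: string;
  _rid?: string;
  _self?: string;
  _etag?: string;
  _attachments?: string;
  _ts?: number; // last-modified time, seconds since epoch
  [key: string]: unknown;
};

// Reshape one Cosmos document into a MongoDB-style document:
// strip system fields, promote id to _id, keep the timestamp as a Date.
function toMongoDoc(doc: CosmosDoc): Record<string, unknown> {
  const { id, _rid, _self, _etag, _attachments, _ts, ...rest } = doc;
  const out: Record<string, unknown> = { _id: id, ...rest };
  if (_ts !== undefined) {
    out.lastModified = new Date(_ts * 1000); // preserve last-modified time
  }
  return out;
}
```

A transform like this keeps user data intact while dropping fields that would otherwise become dead weight in the Atlas collection.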

January 31, 2023