MongoDB Updates

The newest releases and freshest updates

New Aggregation Pipeline Text Editor Debuts in MongoDB Compass

There’s a reason why Compass is one of MongoDB’s most-loved developer tools: it provides an approachable yet powerful visual interface for interacting with data in MongoDB. As part of this, Compass’s Aggregation Pipeline Builder abstracts away the finer points of MongoDB’s Query API syntax and provides a guided experience for developing complex queries. But what about when you want less abstraction rather than more? That’s where our new Aggregation Pipeline Text Editor comes in. Recently released in Compass, the Aggregation Pipeline Text Editor allows users to write free-form aggregations. While users could previously write and edit pipelines through a guided, structured builder organized by aggregation stage, a text-based builder can be preferable for some users. The new pipeline editor makes it easy for users to:

- See the entire pipeline without having to scroll excessively through the UI
- Stay “in the flow” when writing aggregations if they are already familiar with MongoDB’s Query API syntax
- Copy and paste aggregations built elsewhere (like in MongoDB’s VS Code Extension) into Compass
- Use built-in syntax formatting to make pipeline text “pretty” before copying it from Compass to other tools

The Aggregation Pipeline Text Editor in Compass. Notice how toward the top right you can click on “Stages” to move back to the traditional stage-based Aggregation Pipeline Builder.

Ultimately, the addition of the Aggregation Pipeline Text Editor gives users more flexibility in how they build aggregations. For a more guided experience, with result previews after each new stage, the existing Aggregation Pipeline Builder will work best for most users. But when writing free-form aggregations or copying and pasting aggregation text from other tools, the Aggregation Pipeline Text Editor may be preferable. It previews only the final pipeline output, rather than the stage-by-stage previews of the traditional builder.
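To illustrate the kind of free-form pipeline you might write in the text editor, here is a hypothetical example (the "orders" collection and its fields are invented for illustration) that groups orders by status and sorts by count:

```javascript
// Hypothetical aggregation pipeline, as you might write it in the
// text editor or in the VS Code Extension before pasting into Compass.
// The "orders" collection and its fields are invented for illustration.
const pipeline = [
  // Keep only orders from 2022 onward.
  { $match: { orderDate: { $gte: new Date("2022-01-01") } } },
  // Count orders and sum revenue per status.
  { $group: { _id: "$status", count: { $sum: 1 }, revenue: { $sum: "$total" } } },
  // Show the most common statuses first.
  { $sort: { count: -1 } }
];
```

In mongosh, you would run this with `db.orders.aggregate(pipeline)`; the same text can be pasted directly into the Aggregation Pipeline Text Editor.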
Users can access both the traditional Aggregation Pipeline Builder and the new Pipeline Text Editor directly from the Aggregations tab in Compass, and can switch between the two views without losing their work. To get access to the new Aggregation Pipeline Text Editor, download the latest version of Compass here. And as always, we welcome your continued feedback on how to improve Compass. If you have ideas for improving your experience with Compass, you can submit them on our UserVoice platform here. We’ll have even more great features coming to Compass soon. Keep checking back on our blog for the latest news!

January 26, 2023

Introducing MongoDB Connector for Apache Kafka version 1.9

Today, MongoDB released version 1.9 of the MongoDB Connector for Apache Kafka! This article highlights the key features of this new release.

Pre/post document states

In MongoDB 6.0, Change Streams added the ability to retrieve the before and after state of an entire document. To enable this functionality on a new collection, set it as a parameter in the createCollection command:

```javascript
db.createCollection(
  "temperatureSensor",
  { changeStreamPreAndPostImages: { enabled: true } }
)
```

Alternatively, for existing collections, use collMod as shown below:

```javascript
db.runCommand( {
  collMod: <collection>,
  changeStreamPreAndPostImages: { enabled: <boolean> }
} )
```

Once the collection is configured for pre- and post-images, you can set the source connector parameter to include this extra information in the change event. For example, consider this source definition:

```json
{
  "name": "mongo-simple-source",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "<< MONGODB CONNECTION STRING >>",
    "database": "test",
    "collection": "temperatureSensor",
    "": "whenavailable"
  }
}
```

When the following document is inserted:

```javascript
db.temperatureSensor.insertOne({ 'sensor_id': 1, 'value': 100 })
```

Then an update is applied:

```javascript
db.temperatureSensor.updateOne({ 'sensor_id': 1 }, { $set: { 'value': 105 } })
```

The change stream event written to the Kafka topic is as follows:

```json
{
  "_id": {
    "_data": "82636D39C8000000012B022C0100296E5A100444B0F5E386F04767814F28CB4AAE7FEE46645F69640064636D399B732DBB998FA8D67E0004"
  },
  "operationType": "update",
  "clusterTime": { "$timestamp": { "t": 1668102600, "i": 1 } },
  "wallTime": { "$date": 1668102600716 },
  "ns": { "db": "test", "coll": "temperatureSensor" },
  "documentKey": { "_id": { "$oid": "636d399b732dbb998fa8d67e" } },
  "updateDescription": {
    "updatedFields": { "value": 105 },
    "removedFields": [],
    "truncatedArrays": []
  },
  "fullDocumentBeforeChange": {
    "_id": { "$oid": "636d399b732dbb998fa8d67e" },
    "sensor_id": 1,
    "value": 100
  }
}
```

Note the
fullDocumentBeforeChange key includes the original document before the update occurred.

Starting the connector at a specific time

Prior to version 1.9, when the connector started as a source, it opened a MongoDB change stream and processed any new data. To copy all the existing data in the collection before processing new data, you specified the “copy.existing” property. One frequent user request has been to start the connector from a specific timestamp rather than from whenever the connector happens to start. Version 1.9 adds a new parameter, startup.mode, that specifies when to start writing data:

- startup.mode=latest (default): Starts processing data when the connector starts, ignoring any existing data.
- startup.mode=timestamp: Starts processing at a specific point in time, as defined by additional startup.mode.timestamp.* properties. For example, to start the connector from 7 AM on November 21, 2022, you set the value to ‘2022-11-21T07:00:00Z’. Supported values are an ISO-8601 format date string, as shown above, or a BSON extended string format.
- startup.mode=copy.existing: Same behavior as the existing configuration option, “copy.existing=true”. Note that “copy.existing” as a separate parameter is now deprecated. If you defined any granular copy.existing parameters, such as copy.existing.pipeline, just prepend them with the “startup.mode.copy.existing.” property name.

Reporting MongoDB errors to the DLQ

Kafka supports writing errors to a dead letter queue. In version 1.5 of the connector, you could write all exceptions to the DLQ through mongo.error.tolerance=’all’. One thing to note was that these were Kafka-generated errors, not errors that occurred within MongoDB. Thus, if the sink connector failed to write to MongoDB due to a duplicate _id error, for example, this error wouldn’t be written to the DLQ.
In 1.9, errors generated within MongoDB will be reported to the DLQ.

Behavior change on inferring schema

Prior to version 1.9 of the connector, if you were inferring schema and inserted a MongoDB document containing arrays with different value data types, the connector was naive and simply set the type of the whole array to be a string. For example, consider a document that resembles:

```json
{
  "myfoo": [
    { "key1": 1 },
    { "key1": 1, "key2": "dogs" }
  ]
}
```

If we set output.schema.infer.value to true on a source connector, the message in the Kafka topic will resemble the following:

```json
…
"fullDocument": {
  …
  "myfoo": [
    "{\"key1\": 1}",
    "{\"key1\": 1, \"key2\": \"dogs\"}"
  ]
},
…
```

Notice that the array items contain different values. In this example, the first item in the “myfoo” array is a subdocument with a single field, “key1”, whose value is the integer 1; the next item is a subdocument with the same “key1” field plus another field, “key2”, that has a string value. When this scenario occurs, the connector wraps the entire array as a string. The same behavior can also apply when different keys contain different data type values. In version 1.9, the connector, when presented with this configuration, will not wrap the arrays; rather, it will create the appropriate schemas for variable arrays with different data type values. The same document when run through 1.9 will resemble:

```json
"fullDocument": {
  …
  "myfoo": [
    { "key1": 1 },
    { "key1": 1, "key2": "dogs" }
  ]
},
```

Note that this behavior is a breaking change, and that inferring schemas when using arrays can cause performance degradation for very large arrays with different data type values.

Download the latest version of the MongoDB Connector for Apache Kafka from Confluent Hub! To learn more about the connector, read the MongoDB Online Documentation. Questions? Ask on the MongoDB Developer Community Connectors and Integrations forum!
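As a sketch of how the new startup options fit together, the following hypothetical source configuration starts processing from 7 AM UTC on November 21, 2022. This assumes the `startup.mode.timestamp.start.at.operation.time` property described in the connector documentation; the connection details are placeholders:

```json
{
  "name": "mongo-timestamp-source",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "<< MONGODB CONNECTION STRING >>",
    "database": "test",
    "collection": "temperatureSensor",
    "startup.mode": "timestamp",
    "startup.mode.timestamp.start.at.operation.time": "2022-11-21T07:00:00Z"
  }
}
```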

January 12, 2023

What’s New in Atlas Charts: Easy Organization-Wide Sharing

We’re excited to announce improvements to sharing dashboards in MongoDB Atlas Charts. Data visualization is a powerful tool for discovering insights, and sharing visualizations across your team helps amplify those insights to propel businesses forward. With organization-wide sharing in Atlas Charts, we’re making it even easier to share the insights you discover from your application data across your entire organization.

Sharing dashboards

Atlas Charts has always made it possible to share visualizations with either individual members or everyone inside your Atlas project. Assuming a user had access to a given data source in Atlas, adding a user to a Charts project was effectively a one-click process. However, many teams do not broadly share database access unless an individual specifically needs it. And, if you want to share data with many members of your team, provisioning users one by one is tedious. Once users are in a Charts project, however, sharing a dashboard with everyone inside the project becomes relatively easy — you can invite all users in your project to view your dashboard with a single action.

There are likely scenarios in which some members of your organization have Atlas access and others do not. In this case, if your team has enabled Federated Authentication and uses a third-party authentication provider, such as Google or Okta, Charts now makes it simple to turn on dashboard sharing across your entire organization.

Granting access

This approach makes sharing company-wide information quick and easy. For example, you can keep employees aware of product or platform growth or other key business metrics. Any member of your organization can be granted access to view these dashboards with a single click, as shown in Figure 1.

Figure 1: A look at a dashboard shared across an organization.

Note that, with these changes to dashboard sharing, your ability to maintain the security of your data remains unchanged.
New dashboard viewers still need at least viewer access to any data source behind the charts in a shared dashboard, thereby ensuring that your company's sensitive data remains private. Additionally, project owners can now manage data source access at a deployment level, which means they can give access to their clusters or federated database instances . This capability is in addition to the already available granular control of data source access at a collection level, which was introduced as part of recent improvements we made to data sources. You can read more about managing access to data sources in your organization in our documentation . We hope you find these sharing improvements valuable and start leveraging this capability to share additional insights across your organization. New to Atlas Charts? Get started today by logging into or signing up for MongoDB Atlas , deploying or selecting a cluster, and activating Charts for free.

December 5, 2022

Atlas Charts Adds Support for Serverless and Online Archive Data Sources

We recently introduced streamlined data sources in Atlas Charts, which eliminates the manual steps involved with adding data sources into Charts. With MongoDB Atlas project data automatically available in Charts, your visualization workflow can become quicker and simpler than ever. With this feature, Atlas Charts users can now visualize two new sources of data: serverless instances and Atlas cluster data that’s been archived using MongoDB Atlas Online Archive. For those unfamiliar with these data sources, here’s a quick summary:

- A serverless instance is an Atlas deployment model that lets you seamlessly scale usage based on workload demand and ensures you are only charged for the resources you need.
- Online Archive enables automated data tiering of Atlas data, helping you scale your storage and optimize costs while keeping data accessible.

Use cases

These data sources serve two distinct use cases, based on your needs. So, whether you are trying to eliminate upfront resource provisioning using a serverless instance, or creating archives of high-volume workloads such as time series or log data to reduce costs with Online Archive, Charts makes these sources natively available for visualization with zero ETL, just as it always has with your other Atlas clusters. To see how easy it is to visualize these new data sources, let’s create a serverless database called “ServerlessInstance0” and separately activate Online Archive on a database called “Cluster0” that will run daily in Atlas (Figure 1).

Figure 1: Screenshot showing a serverless database deployed in MongoDB Atlas.

When setting up an Online Archive, Atlas creates two instances of your data (Figure 2). One instance includes only your archived data. The second instance contains your archived data and your live cluster data. This setup gives you additional flexibility to query data as your use case demands.

Figure 2: Screenshot showing Online Archive instances in Atlas.
Moving on to the Data Sources page in Charts (Figure 3), all of the data sources are shown, including serverless instances and Atlas cluster data archived in Online Archive, neatly categorized based on the instance type and ready for use in charts and dashboards. (Note that project owners maintain full control of these data sources.) For more details about connecting and disconnecting data sources, review our documentation . Figure 3: Screenshot showing Serverless and Online Archive data sources in Atlas Charts. With these additions, Charts now supports all the cluster configurations you can create in Atlas, and we are excited to see how you achieve your visualization goals using these new data sources. New to Atlas Charts? Get started today by logging into or signing up for MongoDB Atlas , deploying or selecting a cluster, and activating Charts for free.

October 27, 2022

Introducing Pay-As-You-Go MongoDB Atlas on Azure Marketplace

MongoDB was an official sponsor of the recent two-day, jam-packed 2022 Microsoft Ignite event, whose central theme was how to empower customers to do more with less in the Microsoft Cloud. The interactive conference created a space for professionals to connect in person with subject matter experts to discuss current and future points of digital transformation, attend workshops, learn about key announcements, and discover innovative new offerings.

Microsoft officially announced that MongoDB is part of the set of companies that make up the new Microsoft Intelligent Data Platform Partner Ecosystem, and we are pleased to highlight our expanded alliance. Our partnership provides a frictionless process for developers to access MongoDB Atlas, the leading multi-cloud developer data platform, on the Microsoft Azure Marketplace. By procuring Atlas through the Azure Marketplace, customers get a streamlined procurement and billing experience and can use their Azure accounts to pay for their Atlas usage. MongoDB is also offering a free trial of the Atlas database through the Azure Marketplace.

With the new Pay-As-You-Go Atlas listing on the Azure Marketplace, you only pay for the Atlas resources you use, with no upfront commitment required. You will receive just one monthly invoice on your Azure account that includes your Atlas usage, and you can apply existing Azure committed spend to it. Read the Azure Marketplace documentation to learn how to take advantage of the Microsoft Azure consumption commitment (MACC) and Azure commit to consume (CtC). You can even start free with an M0 Atlas cluster and scale up as needed. A free Atlas cluster comes with 512 MB of storage, out-of-the-box security features, and a basic support plan. If you’d like to upgrade your support plan, you can select one in Atlas and the additional cost will also be billed through Azure. MongoDB offers several support subscriptions with varying SLAs and levels of technical support.
Whether you’re a new or existing Atlas customer, you can subscribe to Atlas directly from the Azure Marketplace. After you subscribe, you’ll be prompted to log in or create a new Atlas account. You can then deploy a new Atlas cluster or link your existing cluster(s) to your Azure account. Atlas customers can take advantage of best-in-class database features, including:

- Production-grade security features, such as always-on authentication, network isolation, end-to-end encryption, and role-based access controls to keep your data protected.
- Global, high availability. Clusters are fault-tolerant and self-healing by default. Deploy across multiple regions for even better guarantees and low-latency local reads.
- Support for any class of workload. Build full-text search, run real-time analytics, share visualizations, and sync to the edge with fully integrated and native Atlas data services that require no manual data replication or additional infrastructure.
- New integrations that empower builders, developers, and digital natives to unlock the power of MongoDB Atlas when running on Azure — including PowerApps, Power Automate, Power BI, Synapse, and Purview — to seamlessly add Atlas to existing architectures.

With MongoDB Atlas on Microsoft Azure, developers receive access to the most comprehensive, secure, scalable, cloud-based developer data platform in the market. Now, with the availability of Atlas on the Azure Marketplace, it’s never been easier for users to start building with Atlas while streamlining procurement and billing processes. Get started today through the Atlas on Azure Marketplace listing.

October 19, 2022

Introducing Snapshot Distribution in MongoDB Atlas

Data is at the heart of everything we do and, in today’s digital economy, has become an organization's most valuable asset. But sometimes the lengths that must be taken to protect that data present added challenges and result in manual processes that ultimately slow development, especially when it comes to maintaining a strict backup and recovery strategy. MongoDB Atlas aims to ease this burden by providing the features organizations need not only to retain and protect their data for recovery purposes, but also to meet compliance regulations with ease. Today we’re excited to announce the release of a new backup feature, Snapshot Distribution.

Snapshot Distribution allows you to easily distribute your backup snapshots across multiple geographic regions within your primary cloud provider with the click of a button. You can configure how snapshots are distributed directly within your backup policy, and Atlas will automatically distribute them to the selected regions—no manual process necessary.

How to distribute your snapshots

To enable Snapshot Distribution, navigate to the backup policy for your cluster and select the toggle to copy snapshots to other regions. From there, you can add any number of regions within your primary cloud provider—including regions you are not deployed in—to store snapshot copies. You can even customize your configuration to copy only specific types of snapshots to certain regions.

Copy snapshots to other regions

Restore your cluster faster with optimized, intelligent restores

If you need to restore your cluster, Atlas will intelligently decide whether to use the original snapshot or a copied snapshot for optimal restore speeds. Copied snapshots may be used when you are restoring to a cluster in the same region as a snapshot copy, including multi-region clusters if the snapshots are copied to every cluster region.
Alternatively, if the original snapshot becomes unavailable due to a regional outage within your cloud provider, Atlas will use a copy in the nearest region to enable restores regardless of the outage.

Perform point in time restore

Get started with Snapshot Distribution

Although storing additional snapshot copies in multiple places may not always be required, it can be extremely useful in several situations, such as:

- For organizations that have a compliance requirement to store backups in geographical locations different from their primary place of operation
- For organizations operating multi-region clusters looking for faster direct-attach restores for the entire cluster

If you fall into either of these categories, Snapshot Distribution may be a valuable addition to your current backup policy, allowing you to automate previously manual processes and free up development time to focus on innovation. Check out the documentation to learn more, or navigate to your backup policy to enable this feature.
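Beyond the UI toggle, snapshot copying can also be scripted against the Atlas Administration API's backup schedule endpoint. The payload below is only a sketch: the `copySettings` field names follow the Atlas Administration API documentation, but the region and frequency values are invented placeholders for illustration.

```javascript
// Hypothetical payload for PATCH
// /api/atlas/v1.0/groups/{groupId}/clusters/{clusterName}/backup/schedule
// Field names follow the Atlas Administration API docs; values are placeholders.
const schedulePatch = {
  copySettings: [
    {
      cloudProvider: "AWS",    // must match the cluster's primary provider
      regionName: "US_WEST_2", // region to copy snapshots to
      shouldCopyOplogs: false, // whether to also copy oplogs for PIT restores
      frequencies: ["DAILY"]   // copy only daily snapshots, for example
    }
  ]
};
```

Sending this body with your API keys would add a daily-snapshot copy region to the cluster's backup policy; consult the API reference for the authoritative field list before relying on it.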

September 29, 2022

What’s New in Atlas Charts: Streamlined Data Sources

We’re excited to announce a major improvement to managing data sources in MongoDB Atlas Charts: Atlas data is now available for visualization automatically, with zero setup required. Every visualization relies on an underlying data source. In the past, Charts made adding Atlas data as a source fairly straightforward, but teams still needed to manually choose the clusters and collections that would power their dashboards. Streamlined data sources eliminates those manual steps. This feature further optimizes your data visualization workflow by automatically making clusters, serverless instances, and federated database instances in your project available as data sources within Charts. For example, if you start up a new cluster or collection and want to create a visual quickly, you can simply go into one of your dashboards and start building a chart immediately. Check out streamlined data sources in action: See how the new data sources experience streamlines your data visualization workflow in Charts.

Maintain full control of your data

Although all project data will be available automatically to project members by default, we know how important it is to be able to control what data can be used by your team. For example, you may have sensitive customer data or company financials in a cluster. Project owners maintain full control over limiting access to data like this when needed. As shown in the following image, with a few clicks, you can select any cluster or collection, confirm whether or not any charts are using a data source, and disconnect when ready. If you have collections that you want some of your team to access but not others, this can be easily achieved under Data Access in the collection settings, as seen in the following image. With every release, our goal is to make visualizing Atlas data more frictionless and powerful. The streamlined data sources feature is a big step in this direction.
Building data visualizations just got even easier with Atlas Charts. Give it a try today ! New to Atlas Charts? Get started today by logging into or signing up for MongoDB Atlas , deploying or selecting a cluster, and activating Charts for free.

September 21, 2022

MongoDB Connector for Apache Kafka 1.8 Available Now

MongoDB has released version 1.8 of the MongoDB Connector for Apache Kafka, with new monitoring and debugging capabilities. In this article, we’ll highlight the key features of this release.

JMX monitoring

The MongoDB Connector works with Apache Kafka Connect to provide a way for users to easily move data between MongoDB and Apache Kafka. The connector is written in Java and now implements Java Management Extensions (JMX) interfaces that give you access to metrics reporting. These metrics will make troubleshooting and performance tuning easier. JMX technology, which is part of the Java platform, provides a simple, standard way for applications to report metrics, with many third-party tools available to consume and present the data.

For those who might not be familiar with JMX monitoring, let’s look at a few key concepts. An MBean is a managed Java object that represents a particular component being measured or controlled. Each component can have one or more MBean attributes. The MongoDB Connector for Apache Kafka publishes MBeans under the “com.mongodb.kafka.connector” domain. Many open source tools are available to monitor JMX metrics, such as the console-based JmxTerm or more feature-complete monitoring and alerting tools like Prometheus. JConsole is also available as part of the Java Development Kit (JDK). Note: Regardless of your client tool, MBeans for the connector are only available when there are active source or sink configurations defined on the connector.

Visualizing metrics

Figure 1: Source task JMX metrics from JConsole.

Figure 1 shows some of the metrics exposed by the source connector using JConsole. In this example, a source task was created and by default is called “source-task-0”. The applicable metrics are shown in the JConsole MBeans panel. A complete list of both source and sink metrics will be available in the MongoDB Kafka Connector online documentation shortly after the release of 1.8.
MongoDB Atlas is a great platform to store, analyze, and visualize monitoring metrics produced by JMX. If you’d like to try visualizing JMX metrics generated by the connector in MongoDB Atlas, check out jmx2mongo. This tool continuously writes JMX metrics to a MongoDB time series collection. Once the data is in MongoDB Atlas, you can easily create charts from it, like the following:

Figure 2: MongoDB Atlas chart showing successful batch writes vs. writes greater than 100 ms.

Figure 2 shows the number of successful batch writes performed by a MongoDB sink task and the number of those batch writes that took longer than 100 ms to execute. There are many other monitoring use cases available; check out the latest MongoDB Kafka Connector documentation for more information.

Extended debugging

Over the years, the connector team has collected requests from users to enhance error messages or provide additional debug information for troubleshooting. In 1.8, you will notice additional log messages and more descriptive errors. For example, before 1.8, if you set the copy.existing parameter, you might get the log message “Shutting down executors.” This message is not clear. To address this lack of clarity, the message now reads, “Finished copying existing data from the collection(s).” These debugging improvements, in combination with the new JMX metrics, will make it easier for you to gain insight into the connector and help troubleshoot issues you may encounter. If you have ideas for additional metrics or scenarios where additional debugging messages would be helpful, please let us know by filing a JIRA ticket. For more information on the latest release, check out the MongoDB Kafka Connector documentation. To download the connector, go to the MongoDB Connector repository on GitHub or download it from the Confluent Hub.
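As an illustration of the kind of query that could sit behind a chart like Figure 2, the following aggregation sketch assumes jmx2mongo has written metric samples into a time series collection, with a `ts` time field, a task identifier in the metadata, and counter fields for batch writes. The collection and field names here are assumptions for illustration, not the tool's actual output schema.

```javascript
// Sketch of an aggregation over a time series collection of JMX samples.
// Collection and field names are hypothetical; adapt them to what
// jmx2mongo actually writes in your deployment.
const pipeline = [
  // Look at the last hour of samples for one sink task.
  { $match: { "meta.task": "sink-task-0", ts: { $gte: new Date(Date.now() - 3600e3) } } },
  // Bucket samples per minute and keep the latest counter values.
  { $group: {
      _id: { $dateTrunc: { date: "$ts", unit: "minute" } },
      batchWrites: { $max: "$batchWritesSuccessful" },
      slowWrites:  { $max: "$batchWritesOver100ms" }
  } },
  { $sort: { _id: 1 } }
];
```

A chart can then plot `batchWrites` against `slowWrites` per minute, much like Figure 2.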

September 19, 2022

New in Atlas Search: Improve Content Recommendations With “More Like This”

We’re proud to announce the release of More Like This, a key MongoDB Atlas Search feature that allows developers to easily build more relevant and engaging experiences for their end users. With the moreLikeThis operator, you can display documents that are similar to a result document. In this article, we’ll explain how it works and how you can get started using this new feature.

Content recommendation done easily

People who use travel booking apps, streaming services, and e-commerce websites are likely familiar with “Frequently Bought With,” “Similar Products,” or “You Might Also Enjoy” sections in their search experiences — in other words, content recommendation that guides them toward new or related products to buy, movies to stream, recipes to make, or news articles to read (among other things). Instead of building and tuning a recommendation engine to provide this functionality, developers can create engaging, browsable search experiences by defining a similarity threshold between documents to surface relevant documents.

How it works

Under the hood, the moreLikeThis search operator extracts the most representative terms from one or more reference documents and returns a set of similar documents. The representative terms are selected based on term frequency-inverse document frequency (TF-IDF): a term’s frequency within a given document multiplied by the inverse of its frequency across the corpus, so terms that are common in the document but rare in the corpus score highest. Atlas Search indexes term frequency by default, which means there is less up-front configuration required compared with other search solutions. Additionally, developers can define what constitutes sufficient similarity for their use cases, with control over variables such as the number of query terms selected and the minimum and maximum document frequency thresholds.
Use cases

An example use case might look like this: An online bookstore wants to upsell users who have reached the checkout stage with similar books. On the checkout page, the user is served a More Like This query result in the form of an “Other Books You Might Like” section that contains an array of book titles based on multiple fields in the document (e.g., title, publisher, genre, author). More Like This can be applied to use cases like e-commerce, content management systems, and application search — anywhere you want to surface more relevant content to your users to drive deeper engagement. For more examples of how to configure More Like This, refer to our examples in the docs. To learn how to get started with More Like This, refer to our documentation. For real-world Atlas Search implementation examples, go to our Developer Center.
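The bookstore scenario above might be sketched as the following aggregation. The index name, collection, and field values are invented for illustration; `moreLikeThis` takes one or more reference documents in its `like` field, typically the document the user is already viewing:

```javascript
// Hypothetical "Other Books You Might Like" query for a books collection.
// The "like" document would typically be the book in the user's cart.
const pipeline = [
  { $search: {
      index: "default",
      moreLikeThis: {
        like: {
          title: "The Great Gatsby",
          genres: ["Fiction", "Classic"],
          author: "F. Scott Fitzgerald"
        }
      }
  } },
  // Don't recommend the reference book itself.
  { $match: { title: { $ne: "The Great Gatsby" } } },
  { $limit: 5 },
  { $project: { title: 1, author: 1, _id: 0 } }
];
```

In mongosh you would run this with `db.books.aggregate(pipeline)` against a collection that has an Atlas Search index defined.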

August 10, 2022

Introducing the Ability to Independently Scale Analytics Node Tiers for MongoDB Atlas

We’re excited to announce analytics node tiers for MongoDB Atlas. Analytics node tiers provide greater control and flexibility by allowing you to customize the exact infrastructure you need for your analytics workloads.

Analytics node tiers provide control and flexibility

Until now, analytics nodes in MongoDB Atlas clusters have used the same cluster tier as all other nodes. However, operational and analytical workloads can vary greatly in their resource requirements. Analytics node tiers allow you to enhance the performance of your analytics workloads by choosing the best tier size for your needs. This means you can choose an analytics node tier that is larger or smaller than the operational nodes in your cluster. This added level of customization ensures you achieve the performance required for both transactional and analytical queries — without over- or under-provisioning your entire cluster for the sake of the analytical workload. Analytics node tiers are available in both Atlas and Atlas for Government.

A standard replica set contains a primary node for reads and writes and two secondary nodes that are read-only. Analytics nodes provide an additional read-only node that is dedicated to analytical reads.

Choose a higher or lower analytics node tier based on your analytics needs

Teams with large user bases using their BI dashboards may want to increase their analytics node tiers above that of their operational nodes. Choosing a higher tier can be useful when you have many users or require more memory to serve analytics needs. Scaling up the entire cluster tier would be costly, but scaling up just your analytics node tiers helps optimize cost. Teams with inconsistent needs may want to decrease their analytics node tier below that of their operational nodes. The ability to set a lower tier gives you flexibility and cost savings when you have fewer users or analytics are not your top priority.
With analytics node tiers, you get more discretion and control over how you manage your analytics workloads by choosing the appropriately sized tier for your analytics needs. Get started today by setting up a new cluster or adding an analytics node tier to any existing cluster. Check out our documentation to learn more.
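For developers, targeting analytics nodes from an application is a matter of read preference tags. As a sketch (the cluster hostname is hypothetical), a connection string that routes every read on a client to analytics nodes looks like this:

```python
# Hypothetical Atlas cluster URI. Atlas tags analytics nodes with
# nodeType:ANALYTICS, so a secondary read preference scoped to that tag
# restricts all reads on this client to analytics nodes only.
ANALYTICS_URI = (
    "mongodb+srv://cluster0.example.mongodb.net/"
    "?readPreference=secondary"
    "&readPreferenceTags=nodeType:ANALYTICS"
)

# With a driver such as PyMongo, you would pass this URI to MongoClient:
#   client = MongoClient(ANALYTICS_URI)
print(ANALYTICS_URI)
```

Operational traffic would continue to use a separate client with the default (primary) read preference, keeping the two workloads isolated.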

August 3, 2022

7 Big Reasons to Upgrade to MongoDB 6.0

First announced at MongoDB World 2022, MongoDB 6.0 is now generally available and ready for download. MongoDB 6.0 includes the capabilities introduced with the previous 5.1–5.3 Rapid Releases and debuts new abilities to help you address more use cases, improve operational resilience at scale, and secure and protect your data. The common theme in MongoDB 6.0 is simplification: Rather than forcing you to turn to external software or third-party tools, these new capabilities let you develop, iterate, test, and release applications more rapidly. The latest release helps developers avoid data silos, confusing architectures, wasted time integrating external technologies, missed SLAs and other opportunities, and the need for custom work (such as pipelines for exporting data). Here’s what to expect in MongoDB 6.0.

1. Even more support for working with time series data

Used in everything from financial services to e-commerce, time series data is critical for modern applications. Properly collected, processed, and analyzed, time series data provides a gold mine of insights, from user growth to promising areas of revenue, helping you grow your business and improve your application. First introduced in MongoDB 5.0, time series collections provide a way to handle these workloads without resorting to a niche technology and the complexity it brings. It was also critical to overcome obstacles unique to time series data, such as high volume, storage and cost considerations, and gaps in data continuity (caused by sensor outages). Since their introduction, time series collections have been continuously updated and improved through a string of rapid releases.
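A time series collection is declared with a small options document. A minimal sketch, with hypothetical collection and field names (a live PyMongo client is assumed for the commented call):

```python
# Options document for a time series collection.
timeseries_options = {
    "timeField": "timestamp",   # required: the field holding each measurement's time
    "metaField": "sensor",      # optional: identifies the source of the measurement
    "granularity": "minutes",   # hint about the expected time between measurements
}

# With a live client:
#   client.fleet.create_collection("fuel_readings", timeseries=timeseries_options)
print(timeseries_options)
```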
We began by introducing sharding for time series collections (5.1) to better distribute data, then rolled out columnar compression (5.2) to shrink storage footprints, and added densification and gap-filling (5.3) so teams can run time series analytics even when data points are missing. As of 6.0, time series collections support secondary and compound indexes on measurements, improving read performance and opening up new use cases like geo-indexing. By attaching geographic information to time series data, developers can broaden their analysis to scenarios involving distance and location, such as tracking temperature fluctuations in refrigerated delivery vehicles on a hot summer day or monitoring the fuel consumption of cargo vessels on specific routes. We’ve also improved query performance and sort operations. For example, MongoDB can now return the last data point in a series without scanning the whole collection, for faster reads. You can also use clustered and secondary indexes to efficiently perform sort operations on time and metadata fields.

2. A better way to build event-driven architectures

With the advent of applications like Seamless or Uber, users have come to expect real-time, event-driven experiences, such as activity feeds, notifications, or recommendation engines. But moving at the speed of the real world is not easy: your application must quickly identify and act on changes in your data. Introduced in MongoDB 3.6, change streams provide an API to stream any changes to a MongoDB database, cluster, or collection, without the high overhead of polling your entire system. Your application can then react automatically, for example by generating an in-app message notifying you that your delivery has left the warehouse, or by indexing new logs as they are generated.
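In code, subscribing to such events is a short loop. A minimal sketch, assuming a live client against a replica set (the collection name and the notify_user helper are hypothetical):

```python
# An aggregation pipeline narrows the stream to only the events you care
# about; here, only newly inserted documents.
insert_only = [{"$match": {"operationType": "insert"}}]

# With a live client:
#   with client.logistics.deliveries.watch(insert_only) as stream:
#       for change in stream:
#           notify_user(change["fullDocument"])  # hypothetical helper
print(insert_only)
```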
The MongoDB 6.0 release enriches change streams further. You can now retrieve the before and after state of a changed document, enabling you to send updated versions of entire documents downstream, reference deleted documents, and more. Change streams also now support data definition language (DDL) operations, such as creating or dropping collections and indexes. To learn more, check out our blog post on change streams updates.

3. Deeper insights from enriched queries

MongoDB’s aggregation capabilities allow users to process multiple documents and return computed results. By combining individual operators into aggregation pipelines, you can build complex data processing pipelines to extract the insights you need. MongoDB 6.0 adds capabilities to two key operators, $lookup and $graphLookup, improving joins and graph traversals, respectively. Both $lookup and $graphLookup now provide full support for sharded deployments. The performance of $lookup has also been upgraded. For instance, if there is an index on the foreign key and a small number of documents have been matched, $lookup can return results 5 to 10 times faster than before. If a larger number of documents are matched, $lookup will be twice as fast as previous iterations. If no indexes are available (as with exploratory or ad hoc queries), $lookup yields up to a hundredfold performance improvement. The introduction of read concern snapshot and the optional atClusterTime parameter enables your applications to execute complex analytical queries against a globally and transactionally consistent snapshot of your live, operational data. Even as data changes beneath you, MongoDB preserves point-in-time consistency of the query results returned to your users. These point-in-time analytical queries can span multiple shards with large distributed datasets.
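At the wire level, such a point-in-time query is an ordinary aggregate command carrying a snapshot read concern. A sketch with hypothetical database, collection, and field names (atClusterTime may be added to pin an explicit cluster time; if omitted, the server selects one):

```python
# Raw aggregate command document for a point-in-time analytical query.
snapshot_query = {
    "aggregate": "orders",
    "pipeline": [
        {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}}
    ],
    "readConcern": {"level": "snapshot"},
    "cursor": {},
}

# With a live client:
#   client.sales.command(snapshot_query)
print(snapshot_query["readConcern"])
```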
By routing these queries to secondaries, you can isolate analytical workloads from transactional queries, with both served by the same cluster, avoiding slow, brittle, and expensive ETL to data warehouses. To learn more, visit our documentation.

4. More operators, less work

Boost your productivity with a slate of new operators that let you push more work to the database while spending less time writing code or manipulating data manually. These new operators automate key commands and long sequences of code, freeing up developer time for other tasks. For instance, you can easily discover important values in your data set with operators like $maxN, $minN, or $lastN, and you can use an operator like $sortArray to sort the elements of an array directly in your aggregation pipelines.

5. More resilient operations

From the beginning, MongoDB’s replica set design has allowed users to withstand and overcome outages. Initial sync is how a replica set member loads a full copy of data from an existing member, which is critical for catching up nodes that have fallen behind or for adding new nodes to improve resilience, read scalability, or query latency. MongoDB 6.0 introduces initial sync via file copy, which is up to four times faster than existing methods. This feature is available with MongoDB Enterprise Server. MongoDB 6.0 also brings major improvements to sharding, the mechanism that enables horizontal scalability. The default chunk size for sharded collections is now 128 MB, meaning fewer chunk migrations and greater efficiency, both from a networking perspective and in internal overhead at the query routing layer. A new configureCollectionBalancing command also allows a collection to be defragmented in order to reduce the impact of the sharding balancer.
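As a sketch of that command (the collection namespace is hypothetical, and field names are assumptions based on the 6.0 configureCollectionBalancing interface):

```python
# Command document asking the balancer to defragment a sharded collection
# and to use a larger per-collection chunk size.
defrag_command = {
    "configureCollectionBalancing": "logistics.deliveries",  # hypothetical namespace
    "chunkSize": 256,               # per-collection chunk size, in MB
    "defragmentCollection": True,   # merge small chunks left by earlier migrations
}

# With a live client, this runs against the admin database:
#   client.admin.command(defrag_command)
print(defrag_command)
```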
6. Additional data security and operational efficiency

MongoDB 6.0 includes new features that eliminate the need to choose between secure data and efficient operations. Since its general availability in 2019, client-side field-level encryption (CSFLE) has helped many organizations manage sensitive information with confidence, especially as they migrate more of their application estate into the public cloud. With MongoDB 6.0, CSFLE includes support for any KMIP-compliant key management provider. A leading industry standard, KMIP streamlines the storage, manipulation, and handling of cryptographic objects like encryption keys and certificates. MongoDB’s support for auditing allows administrators to track system activity for deployments with multiple users, ensuring accountability for actions taken across the database. While it is important that auditors can inspect audit logs to assess activities, the contents of an audit log must be protected from unauthorized parties, as they may contain sensitive information. MongoDB 6.0 allows administrators to compress and encrypt audit events before they are written to disk, using their own KMIP-compliant key management system. Encrypting the logs protects their confidentiality and integrity; if the logs propagate through any central log management system or SIEM, they stay encrypted. Additionally, Queryable Encryption is now available in preview. Announced at MongoDB World 2022, this pioneering technology enables you to run expressive queries against encrypted data, decoding the data only when it is made available to the user. This ensures that data remains encrypted throughout its lifecycle and that rich queries can be run efficiently without decrypting the data first. For a deep dive into the inner workings of Queryable Encryption, check out this feature story in Wired.
7. A smoother search experience and seamless data sync

Alongside the 6.0 major release, MongoDB is also making ancillary features generally available or available in preview. The first is Atlas Search facets, which enable fast filtering and counting of results so that users can easily narrow their searches and navigate to the data they need. Released in preview at MongoDB World 2022, facets now include support for sharded collections. Another important addition is Cluster-to-Cluster Sync, which enables you to effortlessly migrate data to the cloud; spin up dev, test, or analytics environments; and support compliance requirements and audits. Cluster-to-Cluster Sync provides continuous, unidirectional data synchronization between two MongoDB clusters across any environment: hybrid, Atlas, on-premises, or edge. You can also control and monitor the synchronization process in real time, starting, stopping, resuming, or even reversing it as needed.

Ultimately, MongoDB 6.0’s new abilities are intended to facilitate development and operations, remove data silos, and eliminate the complexity that comes with the unnecessary use of separate niche technologies. That means less custom work, troubleshooting, and architectural confusion, and more time brainstorming and building. MongoDB 6.0 is not an automatic upgrade unless you are using Atlas serverless instances. If you are not an Atlas user, download MongoDB 6.0 directly from the download center. If you are already an Atlas user with a dedicated cluster, take advantage of the latest, most advanced version of MongoDB. Here’s how to upgrade your clusters to MongoDB 6.0.

July 19, 2022
