MongoDB Atlas Online Archive for Data Tiering is now GA
We’re thrilled to announce that MongoDB Atlas Online Archive is now Generally Available. With Online Archive, you can seamlessly tier your data across Atlas clusters and fully managed cloud object stores, gaining the flexibility to set the right price-to-performance ratio across your data. Eliminate the need to manually migrate or delete valuable data. Simply set a rule on your Atlas cluster to automate data archival while retaining easy access to query all your data using a single connection string. With this capability, you can bring new and previously cost-prohibitive use cases onto MongoDB Atlas, our first-class managed offering, and manage your entire data lifecycle without replicating or migrating it across multiple systems.

What is Atlas Online Archive?

Online Archive is a fully managed data tiering solution that allows you to tier data across your "hot" database storage layer and "colder" cloud object storage, so that all of your data stays queryable while you optimize for cost and performance. Online Archive is a good fit for many different use cases, including:

- Insert-heavy workloads, where data is immutable and has lower performance requirements as it ages
- Historical log keeping and time-series datasets
- Storing valuable data that would otherwise have been deleted using TTL indexes

We’ve received amazing feedback from the community over the past few months while the feature was in beta, and we’re now confident in supporting your production workloads. Our users have put the feature through a variety of use cases in production and development workloads, which has enabled us to make a wide range of improvements.

“Online Archive gives me the flexibility to store all of my data without incurring high costs, and feel safe that I won't lose it. It's the perfect solution.”
Ran Landau, CTO, Splitit

Autonomous Archival Management

It's easy to get started with Online Archive, and it requires no ongoing maintenance once it’s been set up. To activate the feature, follow these steps:

1. Navigate to the “Online Archive” tab on your cluster card and begin the setup flow.
2. Set an archiving rule by selecting a date field (using dot notation if it’s nested) or by creating a custom filter.
3. Choose the commonly queried fields that you want your archival queries to be optimized for, with a few things in mind:
   - Your data will always be “partitioned” by the date field in your archive, but can be partitioned by up to two additional fields as well.
   - The fields that you most commonly query should be towards the top of the list (date can be moved to the top or bottom).
   - Query fields should be chosen carefully, as they cannot be changed after the fact and will have a large impact on query performance.
   - Avoid choosing a field that has unique values, as it will hurt the performance of queries that need to scan lots of data.

And you’re done! MongoDB Atlas will automatically move data off of your cluster and into a more cost-effective storage layer that can still be queried with a single connection string that combines cluster and archive data, powered by Atlas Data Lake.

What's Next?

Along with announcing Online Archive as Generally Available, we’re excited to share a few additional product enhancements which should be available in the coming months:

- Custom filters for your archival rules using a non-date based field
- Support for BYO Key Encryption on your archival data
- A dedicated connection string for archive-only queries
- Support for additional time formats
- Improved performance and stability

Try Atlas Online Archive

Online Archive allows you to right-size your Atlas clusters by storing hot data that is regularly accessed in live storage and moving colder data to a cheaper storage tier. Billing for this feature will include the cost to store data in our fully managed cloud object storage and usage-based pricing for querying archive data.
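To make the date-based archiving rule from the setup steps more concrete, here is a minimal Python sketch of how such a rule could be evaluated against a document. It is an illustration only, not how Atlas implements archival: the helper names (`get_nested`, `is_archivable`), the sample field `meta.createdAt`, and the idea of an age threshold expressed in days are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def get_nested(doc, dotted_path):
    """Resolve a dot-notation path such as 'meta.createdAt'; return None if absent."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def is_archivable(doc, date_field, archive_after_days, now):
    """True when the document's date field is older than the (assumed) age threshold."""
    value = get_nested(doc, date_field)
    if not isinstance(value, datetime):
        return False  # in this sketch, documents without a usable date are never archived
    return value <= now - timedelta(days=archive_after_days)


# Hypothetical sample data: one old document, one recent, one with no date at all.
now = datetime(2020, 11, 1, tzinfo=timezone.utc)
docs = [
    {"_id": 1, "meta": {"createdAt": datetime(2020, 1, 15, tzinfo=timezone.utc)}},
    {"_id": 2, "meta": {"createdAt": datetime(2020, 10, 20, tzinfo=timezone.utc)}},
    {"_id": 3, "meta": {}},
]
to_archive = [d["_id"] for d in docs if is_archivable(d, "meta.createdAt", 90, now)]
print(to_archive)  # [1]
```

With a 90-day threshold, only the document from January qualifies. The recent document and the one without a date field would remain on the cluster in this sketch.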
We can’t wait to see what new workloads you’ll bring onto MongoDB Atlas with the new flexibility provided by Online Archive! To get started, sign up for an Atlas account and deploy any dedicated cluster (M10 or higher).

Have questions? Check out the documentation or head over to our community forums to get answers from fellow developers. And if we’re missing a feature you’d like to see, please let us know!

Safe Harbor Statement

The development, release, and timing of any features or functionality described for MongoDB products remains at MongoDB's sole discretion. This information is merely intended to outline our general product direction and it should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver any material, code, or functionality. Except as required by law, we undertake no obligation to update any forward-looking statements to reflect events or circumstances after the date of such statements.
New Ways to Customize Your Charts
When it comes to building charts, we know that details matter. Small differences in layout, styling or composition can make a big difference in how well your chart communicates the story behind your data. That’s why we’ve just released a whole bunch of new capabilities in MongoDB Charts, giving you more control than ever. Here’s what’s new:

- Secondary Y Axis: Charts can be a great way to show correlation between two different datasets, but when their scales differ greatly it can be hard to see the correlation. By choosing to plot one or more series on a secondary Y Axis, you can allow them to make the most of the available space and highlight any interesting relationships. Secondary Y Axis can be enabled on Grouped Column, Discrete Line, Continuous Line and Continuous Area charts.
- Legend Position: Chart legends can now be moved to the top, right or bottom of your chart, or hidden altogether.
- “All Others” Group: Charts has long allowed you to limit a chart to show, say, just the top 10 values. The new “All Others” option allows you to add an additional bar or donut segment that shows the combined value of all categories not included in the limit.
- “Count by Value” aggregation: Building multi-series charts is now easier than ever with the new “Count by Value” aggregation option, which automatically creates a series for each distinct value found in a field.
- String binning with Regular Expressions: Last month we introduced binning of string values, allowing you to choose the exact values to go into each bin. This month we’ve extended this further by allowing you to use Regular Expressions to assign values to a bin based on powerful patterns.
- Scatter Mark formatting: We’ve ramped up the customization options available on Scatter charts, allowing you to control the size, border thickness and opacity of each plotted mark.
- Line Dash Styles: A new option on Discrete and Continuous Line charts gives each series a different dash style, making it easier to differentiate the series and improving the accessibility of your charts.

[Figure: example chart showing the secondary Y axis, a custom legend position and line dash styles]

[Figure: example scatter chart with customized mark style]

We hope you enjoy these new charting capabilities, but we’re not done yet! Over the next couple of months, we’ll be moving our focus to Table charts, adding options like conditional formatting, text wrapping and column pinning. If you have any other ideas for new customization features, please let us know using the MongoDB Feedback Engine.

If you haven’t tried Charts yet, you can get started for free by signing up for MongoDB Atlas and deploying a free tier cluster.
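For readers who want to see what the “All Others” group and regular-expression binning compute, here is a small Python sketch of the underlying arithmetic. It is not Charts code. The function names are hypothetical, and the matching rule used for binning (a value goes to the first bin whose pattern matches, otherwise to a default bin) is an assumption made for the example.

```python
import re
from collections import Counter


def limit_with_all_others(totals, limit, other_label="All Others"):
    """Keep the `limit` largest categories and sum the rest into one extra group."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:limit], ranked[limit:]
    result = dict(top)
    if rest:
        result[other_label] = sum(value for _, value in rest)
    return result


def bin_by_regex(values, bins, default="Other"):
    """Count values per bin; each value goes to the first bin whose pattern matches."""
    compiled = [(name, re.compile(pattern)) for name, pattern in bins]
    counts = Counter()
    for value in values:
        for name, pattern in compiled:
            if pattern.search(value):
                counts[name] += 1
                break
        else:
            counts[default] += 1  # no pattern matched this value
    return dict(counts)


# Hypothetical sample data.
sales = {"A": 50, "B": 30, "C": 10, "D": 5, "E": 5}
top_two = limit_with_all_others(sales, 2)
print(top_two)  # {'A': 50, 'B': 30, 'All Others': 20}

browsers = ["Chrome 86", "Firefox 82", "Chrome 85", "Safari 14", "Edge 86"]
binned = bin_by_regex(browsers, [("Chromium", r"^(Chrome|Edge)"), ("Firefox", r"^Firefox")])
print(binned)  # {'Chromium': 3, 'Firefox': 1, 'Other': 1}
```

The first result corresponds to a chart limited to the top two categories plus one extra segment for everything else. The second shows how one pattern can gather several distinct strings into a single bin.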
Client-Side Field Level Encryption is now on Azure and Google Cloud
We’re excited to announce expanded key management support for Client-Side Field Level Encryption (FLE). Initially released last year with Amazon’s Key Management Service (KMS), native support for Azure Key Vault and Google Cloud KMS is now available in beta with support for our C#/.NET, Java, and Python drivers. More drivers will be added in the coming months.

Client-Side FLE provides among the strongest levels of data privacy available today. By expanding our native KMS support, it is even easier for organizations to further enhance the privacy and security of sensitive and regulated workloads with multi-cloud support across ~80 geographic regions.

My databases are already encrypted. What can I do with Client-Side Field Level Encryption?

What makes Client-Side FLE different from other database encryption approaches is that the process is totally separated from the database server. Encryption and decryption are instead handled exclusively within the MongoDB drivers in the client, before sensitive data leaves the application and hits the network. As a result, all encrypted fields sent to the MongoDB server, whether they are resident in memory, in system logs, at rest in storage, or in backups, are rendered as ciphertext. Neither the server, nor any administrators managing the database, nor cloud infrastructure staff have access to the encryption keys. Unless the attacker has a compromised DBA password, privileged network access, AND a stolen client encryption key, the data remains protected, securing it against sophisticated exploits.

MongoDB’s Client-Side FLE complements existing network and storage encryption to protect the most highly classified, sensitive fields of your records without:

- Developers needing to write additional, highly complex encryption logic application-side
- Compromising your ability to query encrypted data
- Significantly impacting database performance

By securing data with Client-Side FLE you can move to managed services in the cloud with greater confidence. This is because the database only works with encrypted fields, and you control the encryption keys, rather than having the database provider manage the keys for you. This additional layer of security enforces an even finer-grained separation of duties between those who use the database and those who administer and manage the database.

You can also more easily comply with “right to erasure” mandates in modern privacy legislation such as the GDPR and the CCPA. When a user invokes their right to erasure, you simply destroy the associated field encryption key and the user’s Personally Identifiable Information (PII) is rendered unreadable and irrecoverable to anyone.

Client-Side FLE Implementation

Client-Side FLE is highly flexible. You can selectively encrypt individual fields within a document, multiple fields within the document, or the entire document. Each field can be optionally secured with its own key and decrypted seamlessly on the client. To see how Client-Side FLE works, take a look at this handy animation.

Client-Side FLE uses standard NIST FIPS-certified encryption primitives including AES at the 256-bit security level, in authenticated CBC mode: AEAD AES-256-CBC encryption algorithm with HMAC-SHA-512 MAC. Data encryption keys are protected by strong symmetric encryption with standard wrapping Key Encryption Keys, which can be natively integrated with external key management services backed by FIPS 140-2 validated Hardware Security Modules (HSMs). Initially this was with Amazon’s KMS, and now with Azure Key Vault and Google Cloud KMS in beta. Alternatively, you can use remote secure web services to consume an external key or a secrets manager such as HashiCorp Vault.

Getting Started

To learn more, download our Guide to Client-Side FLE. The Guide will provide you with an overview of how Client-Side FLE is implemented, use cases for it, and how it complements existing encryption mechanisms to protect your most sensitive data.

Review the Client-Side FLE key management documentation for more details on how to configure your chosen KMS.

Safe Harbor

The development, release, and timing of any features or functionality described for our products remains at our sole discretion. This information is merely intended to outline our general product direction and it should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver any material, code, or functionality.
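As a rough illustration of what configuring the new KMS providers might involve, the Python sketch below assembles the provider credentials and master key descriptions as plain dictionaries. The field names (`tenantId`, `clientId`, `clientSecret`, `email`, `privateKey`, `keyVaultEndpoint`, `keyName`, `projectId`, `location`, `keyRing`) reflect our understanding of the driver options and should be checked against the key management documentation for your driver version, especially while the feature is in beta. The helper functions, environment variable names, and sample values are hypothetical.

```python
def build_kms_providers(env):
    """Collect Azure and GCP KMS credentials from an environment-like mapping."""
    return {
        "azure": {
            "tenantId": env["AZURE_TENANT_ID"],
            "clientId": env["AZURE_CLIENT_ID"],
            "clientSecret": env["AZURE_CLIENT_SECRET"],
        },
        "gcp": {
            "email": env["GCP_EMAIL"],
            "privateKey": env["GCP_PRIVATE_KEY"],
        },
    }


def azure_master_key(vault_endpoint, key_name):
    """Describe the Azure Key Vault key that wraps the data encryption keys."""
    return {"keyVaultEndpoint": vault_endpoint, "keyName": key_name}


def gcp_master_key(project_id, location, key_ring, key_name):
    """Describe the Google Cloud KMS key that wraps the data encryption keys."""
    return {
        "projectId": project_id,
        "location": location,
        "keyRing": key_ring,
        "keyName": key_name,
    }


# Placeholder values only; real credentials should come from a secure source.
fake_env = {
    "AZURE_TENANT_ID": "tenant-placeholder",
    "AZURE_CLIENT_ID": "client-placeholder",
    "AZURE_CLIENT_SECRET": "secret-placeholder",
    "GCP_EMAIL": "service-account@example.com",
    "GCP_PRIVATE_KEY": "base64-key-placeholder",
}
kms_providers = build_kms_providers(fake_env)
master_key = azure_master_key("example-vault.vault.azure.net", "fle-master-key")
print(sorted(kms_providers))  # ['azure', 'gcp']
```

In a real application, `os.environ` could be passed in place of `fake_env`, and the resulting dictionaries handed to the driver's client-side encryption options. Keeping credentials out of source code is the main design point here.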
1Data - PeerIslands Data Sync Accelerator
Today’s enterprises are in the midst of digital transformation, but they’re hampered by monolithic, on-prem legacy applications that don’t have the speed, agility, and responsiveness required for digital applications. To make the transition, enterprises are migrating to the cloud. MongoDB has partnered with PeerIslands to develop 1Data, a reference architecture and solution accelerator that helps users with their cloud modernization. This post details the challenges enterprises face with legacy systems and walks through how working with 1Data helps organizations expedite cloud adoption.

Modernization Trends

As legacy systems become unwieldy, enterprises are breaking them down into microservices and adopting cloud native application development. Monolith-to-microservices migration is complex, but provides value across multiple dimensions. These include:

- Development velocity
- Scalability
- Cost-of-change reduction
- Ability to build multiple microservice databases concurrently

One common approach for teams adopting and building out microservices is to use domain driven design to break down the overall business domain into bounded contexts first. They also often use the Strangler Fig pattern to reduce the overall risk, migrate incrementally, and then decommission the monolith once all required functionality is migrated.

While most teams find this approach works well for the application code, it’s particularly challenging to break down monolithic databases into databases that meet the specific needs of each microservice. There are several factors to consider during transition:

- Duration. How long will the transition to microservices take?
- Data synchronization. How much and what types of data need to be synchronized between monolith and microservice databases?
- Data translation in a heterogeneous schema environment. How are the same data elements processed and stored differently?
- Synchronization cadence. How much data needs syncing, and how often (real-time, nightly, etc.)?
- Data anti-corruption layer. How do you ensure the integrity of transaction data, and prevent the new data from corrupting the old?

Simplifying Migration to the Cloud

Created by PeerIslands and MongoDB, 1Data helps enterprises address the challenges detailed above.

Migrate and synchronize your data with confidence with 1Data:

- Schema migration tool. Convert legacy DB schema and related components automatically to your target MongoDB instance. Use the GUI-based data mapper to track errors.
- Real-time data sync pipeline. Sync data between monolith and microservice databases nearly in real time with enterprise grade components.
- Conditional data sync. Define how to slice the data you’re planning to sync.
- Data cleansing. Translate data as it’s moved.
- DSLs for data transformation. Apply domain-specific business rules for the MongoDB documents you want to create from your various aggregated source system tables. This layer also acts as an anti-corruption layer.
- Data auditing. Independently verify data sync between your source and target systems.
- Go beyond the database. Synchronize data from APIs, Webhooks & Events.
- Bidirectional data sync. Replicate key microservice database updates back to the monolithic database as needed.

Get Started with Real-Time Data Synchronization

With the initial version of 1Data, PeerIslands addresses the core functionality of real-time data sync between source and target systems. The logical architecture consists of the following components:

- Source System. The source system can be a relational database like Oracle, where we’ll rely on CDC, or other sources like Events, API, or Webhooks.
- Data Capture & Streaming. Captures the required data from the source system and converts it into data streams using either off-the-shelf DB connectors or custom connectors, depending on the source type. In this phase, 1Data implements data sharding and throttling, which enable data synchronization at scale.
- Data Transformation. The core of the accelerator, where we convert the source data streams into target MongoDB document schemas. We use a LISP-based Domain Specific Language to enable simple, rule-based data transformation, including user-defined rules.
- Data Sink & Streaming. Captures the data streams that need to be updated into the MongoDB database through stream consumers. The actual update into the target DB is done through sink connectors.
- Target system. The MongoDB database used by the microservices.
- Auditing. Most data that gets migrated is enterprise-critical; 1Data audits the entire data synchronization process for missed data and incorrect updates.
- Two-way sync. The logical architecture enables data synchronization from the MongoDB database back to the source database.

We used MongoDB, Confluent Kafka and Debezium to implement this initial version of 1Data. The technical architecture is cloud agnostic, and can be deployed on-prem as well. We’ll be customizing it for key cloud platforms as well as fleshing out specific architectures to adopt for common data sync scenarios.

Conclusion

The 1Data solution accelerator lends itself to multiple use cases, from single view to legacy modernization. Please reach out to us for technical details and implementation assistance, and watch this space as we develop the 1Data accelerator further.
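To give a feel for the rule-based transformation and auditing stages described above, here is a small Python sketch. It is not 1Data code and does not use its LISP-based DSL; plain Python tuples stand in for transformation rules, and every name in it (`transform`, `audit`, `RULES`, the column names) is hypothetical.

```python
def transform(row, rules):
    """Apply (target_path, source_column, converter) rules to one source row."""
    doc = {}
    for target_path, source_column, convert in rules:
        *parents, leaf = target_path.split(".")
        node = doc
        for key in parents:
            node = node.setdefault(key, {})  # create nested sub-documents on demand
        node[leaf] = convert(row[source_column])
    return doc


def audit(source_rows, target_docs, rules, key):
    """Return the keys whose target document is missing or differs from the expected one."""
    targets = {doc[key]: doc for doc in target_docs}
    problems = []
    for row in source_rows:
        expected = transform(row, rules)
        if targets.get(expected[key]) != expected:
            problems.append(expected[key])
    return problems


# Hypothetical rules mapping flat relational columns to a nested document.
RULES = [
    ("_id", "CUST_ID", int),
    ("name.first", "FIRST_NAME", str.title),
    ("name.last", "LAST_NAME", str.title),
    ("active", "STATUS", lambda status: status == "A"),
]

rows = [
    {"CUST_ID": "7", "FIRST_NAME": "ADA", "LAST_NAME": "LOVELACE", "STATUS": "A"},
    {"CUST_ID": "8", "FIRST_NAME": "ALAN", "LAST_NAME": "TURING", "STATUS": "I"},
]
docs = [transform(row, RULES) for row in rows]
print(docs[0])  # {'_id': 7, 'name': {'first': 'Ada', 'last': 'Lovelace'}, 'active': True}

# Simulate a target with one incorrect update (customer 7) and one missed document (customer 8).
stale_target = [dict(docs[0], active=False)]
problems = audit(rows, stale_target, RULES, "_id")
print(problems)  # [7, 8]
```

The audit recomputes the expected document from each source row and compares it with what the target holds, which catches both missed data and incorrect updates in this toy setting.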
Announcing Azure Private Link Integration for MongoDB Atlas
We’re excited to announce the general availability of Azure Private Link as a new network access management option in MongoDB Atlas.

MongoDB Atlas is built to be secure by default. All dedicated Azure clusters on Atlas are deployed in their own VNET. For network security controls, you already have the options of an IP Access List and VNET Peering.

The IP Access List in Atlas offers a straightforward and secure connection mechanism, and all traffic is encrypted with end-to-end TLS. But it requires that you provide static public IPs for your application servers to connect to Atlas, and that you list all such IPs in the Access List. If your applications don’t have static public IPs, or if you have strict requirements on outbound database access via public IPs, this won’t work for you.

The existing solution to this is VNET Peering, with which you configure a secure peering connection between your Atlas cluster’s VNET and your own VNET(s). This is easy, but the connections are two-way. While Atlas never has to initiate connections to your environment, some customers still perceive VNET peering as extending the network trust boundary. Although Access Control Lists (ACLs) and security groups can control this access, they require additional configuration.

MongoDB Atlas and Azure Private Link

Now, you can use Azure Private Link to connect a VNET to MongoDB Atlas. This brings two major advantages:

- Unidirectional: connections via Private Link use a private IP within the customer’s VNET, and are unidirectional such that the Atlas VNET cannot initiate connections back to the customer's VNET. Hence, there is no extension of the network trust boundary.
- Transitive: connections to the Private Link private IPs within the customer’s VNET can come transitively from another VNET peered to the Private Link-enabled VNET, or from an on-prem data center connected with ExpressRoute to the Private Link-enabled VNET. This means that customers can connect directly from their on-prem data centers to Atlas without using public IP Access Lists.

[Figure: Azure Private Link offers a one-way network peering service between an Azure VNET and a MongoDB Atlas VNET]

Meeting Security Requirements with Atlas on Azure

Azure Private Link adds to the security capabilities that are already available in MongoDB Atlas, like Client-Side Field Level Encryption, database auditing, BYO key encryption with Azure Key Vault integration, federated identity, and more. MongoDB Atlas undergoes independent verification of security and compliance controls, so you can be confident in using Atlas on Azure for your most critical workloads.

Ready to try it out? Get started with MongoDB Atlas today!
Getting started with MongoDB, PySpark, and Jupyter Notebook
Learn how to leverage MongoDB data in your Jupyter notebooks via the MongoDB Spark Connector and PySpark. We will load financial security data from MongoDB, calculate a moving average, and then update the data in MongoDB with the new data.
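As a quick refresher on the calculation itself, the following pure-Python sketch computes a trailing simple moving average over a list of prices. It deliberately leaves out Spark and MongoDB so that only the arithmetic is shown. The function name, the sample prices, and the choice to average over the values seen so far for the first few rows are assumptions made for the example. In PySpark, the same idea is usually expressed with a window function over an ordered column.

```python
from collections import deque


def moving_average(prices, window):
    """Trailing simple moving average; early rows average the values seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_total = 0.0
    averages = []
    for price in prices:
        if len(recent) == window:
            running_total -= recent[0]  # value about to fall out of the window
        recent.append(price)
        running_total += price
        averages.append(running_total / len(recent))
    return averages


# Hypothetical closing prices for one security.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
averages = moving_average(closes, 3)
print(averages)  # [10.0, 10.5, 11.0, 12.0, 13.0]
```

Each output value is the mean of the current price and up to two preceding prices, which is the kind of derived column that would then be written back to the database.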
Finer Grained Database User Authorization in MongoDB Atlas
We’re happy to announce that it is now possible to create database users with privileges scoped to a specific set of clusters or Atlas Data Lakes in a MongoDB Atlas project.

Traditionally, database users have always been created at the project level in Atlas. This provided a centralized interface for database user management, as user privileges were scoped to all clusters in your Atlas project. Customers could manage and revoke user permissions confidently, without fear that the updated permissions would be applied to one cluster but not another. However, this abstraction created limitations for use cases that required creating database users scoped to specific clusters or Data Lakes. After much anticipation, we now offer the flexibility to refine database access on a more granular level.

Restrict privileges for different environments of the same application

One reason for scoping database users to the resource level is to restrict privileges by environment. For example, you may have a dev cluster, a test/QA cluster, and a prod cluster in the same project, but you don’t want all users to have equal access to all three.

Isolate teams working on different microservices in the same project

Another scenario is splitting up users so that certain developers only have access to specific clusters within a project. This is especially useful for customers who have networking restrictions that require them to only use one or two Atlas projects, and who therefore co-locate multiple microservices or applications within the same project. With database users per cluster, those customers can now scope team A to only accessing cluster A, and team B to cluster B.

Grant other users access to Atlas Data Lakes for analytics

Finally, this capability also allows admins to restrict a user’s privileges to only Atlas Data Lakes within the project. This is helpful, for example, for analysts, data engineers, and other employees who need access to those Data Lakes but not to live operational data in a production cluster.

Finer-grained authorization for database users

Today, Atlas customers can already use built-in roles or custom roles to grant privileges to database users. With this update, admins get additional flexibility in the database user authorization model while the authentication model stays exactly the same; we made sure to maintain all of the great features that database users already have. Customers can continue to pick from any of the four authentication mechanisms (SCRAM, X.509, LDAP, AWS IAM) supported by Atlas, and choose to create temporary database users that automatically expire within a user-configurable 7-day period, which can be used in conjunction with database auditing to audit any activity performed by a user with elevated privileges.

Database user settings can be modified at any point, so existing users can now be scoped to the cluster or Data Lake level. Database users with the default authorization settings will continue to have the same access to all resources within an Atlas project.

If you have feedback, please add or upvote requests to our feedback portal.

Already managing users programmatically? You can create and edit database users with the Atlas API or the Terraform MongoDB Atlas Provider (use this example configuration for a head start!).

Ready to try it out? Get started with MongoDB Atlas today!
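For teams managing users programmatically, the sketch below shows how a request body for a scoped, temporary database user might be assembled in Python. The field names (`databaseName`, `roles`, `scopes`, `deleteAfterDate`) and the scope types `CLUSTER` and `DATA_LAKE` reflect our understanding of the Atlas API and should be verified against the API reference. The helper function, user name, and cluster name are hypothetical, and the sketch only builds the payload without sending any request.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_SCOPE_TYPES = {"CLUSTER", "DATA_LAKE"}
MAX_TEMPORARY_HOURS = 7 * 24  # temporary users expire within a 7-day period


def scoped_user_payload(username, password, roles, scopes, expires_in_hours=None, now=None):
    """Build a request body for a database user limited to specific resources."""
    for _, scope_type in scopes:
        if scope_type not in ALLOWED_SCOPE_TYPES:
            raise ValueError(f"unknown scope type: {scope_type}")
    payload = {
        "databaseName": "admin",
        "username": username,
        "password": password,
        "roles": [{"roleName": role, "databaseName": db} for role, db in roles],
        "scopes": [{"name": name, "type": scope_type} for name, scope_type in scopes],
    }
    if expires_in_hours is not None:
        if not 0 < expires_in_hours <= MAX_TEMPORARY_HOURS:
            raise ValueError("temporary users must expire within 7 days")
        now = now or datetime.now(timezone.utc)
        payload["deleteAfterDate"] = (now + timedelta(hours=expires_in_hours)).isoformat()
    return payload


# Hypothetical example: team A gets read/write access on cluster A only, for six hours.
payload = scoped_user_payload(
    "team-a-dev",
    "replace-with-a-generated-password",
    roles=[("readWrite", "orders")],
    scopes=[("clusterA", "CLUSTER")],
    expires_in_hours=6,
    now=datetime(2020, 11, 1, tzinfo=timezone.utc),
)
print(payload["scopes"])  # [{'name': 'clusterA', 'type': 'CLUSTER'}]
print(payload["deleteAfterDate"])  # 2020-11-01T06:00:00+00:00
```

Leaving `scopes` empty would correspond to the default behavior described above, where the user keeps access to all resources in the project.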