Steve Jurczak


DocumentDB, MongoDB and the Real-World Effects of Compatibility

If there’s confusion in the market for document databases, it probably has to do with how the products are marketed. AWS claims that DocumentDB, its document model database, comes “with MongoDB compatibility.” But the question of how compatible DocumentDB actually is with MongoDB is worth considering.

DocumentDB merely emulates the MongoDB API while running on top of AWS’s cloud-based relational database, Amazon Aurora. And it’s an inconsistent imitator at best, because it fails 62% of MongoDB API correctness tests. Even though AWS claims compatibility with MongoDB 4.0, our tests have concluded that its emulator is a mishmash of features going back to MongoDB 3.2, which we released in 2015.

The result is that DocumentDB lacks many of the features that come standard in MongoDB. We’ve already published a side-by-side comparison of the feature sets for each solution. Instead of covering the same ground here, we'll explain how some of those differences play out in real-world scenarios.

DocumentDB vs. MongoDB head-to-head comparison

Scaling writes, partitioning data, and sharding

Native sharding enables you to scale out databases horizontally, across multiple nodes and regions. Atlas offers elastic vertical and horizontal scaling to smooth consumption. DocumentDB does not scale writes or partition data beyond a single node. In order to ensure consistency, MongoDB uses concurrency control measures to prevent multiple clients from modifying the same piece of data simultaneously.

Replicate and scale beyond a single region

A number of factors are driving the need to distribute workloads to different geographic regions. In some cases, it’s to reduce latency by putting data closer to where it’s being used. In other cases, it’s to store data in a specific geographic zone to help meet data localization requirements. Finally, there’s the need to ensure the availability of data when there’s an outage of an entire AWS region.
The flexibility to replicate and move workloads as needed is increasingly seen as a business requirement. But by default DocumentDB limits you to just 15 replicas and constrains you to a single region. Newly introduced Global Clusters may look like an answer, but much like “MongoDB compatibility,” it’s potentially misleading. The Global Clusters feature more closely resembles multi-region replication, since it allows writes only to a single primary rather than to multiple regions. It also requires manual reconfiguration to recover from failures, making it a partial solution, at best.

MongoDB Atlas allows true global cluster configurations so you can deliver capabilities to all your users around the world. At the click of a button, you can place the most relevant data near local application servers across more than 80 global regions to ensure low-latency reads and writes. By being able to define a geographic location for each document, your teams are able to more easily meet local privacy and compliance measures. It’s also an insurance policy against being locked into a single public cloud provider.

High resilience, rapid failover, retryable writes

For critical applications, every second of downtime represents a loss of revenue, trust, and reputation. Rapid failover to a different geographic area is necessary when recovery time objectives (RTO) are measured in seconds. DocumentDB failover SLAs can be as high as two minutes, and multi-region failover is not available. With MongoDB, failover time is typically five seconds, and failover to a different region or cloud provider is also an option.

Write errors can be as costly as downtime. If a write to increment a field is duplicated because a dropped connection failed to notify the client that the write was executed, that extra increment can be very costly depending on what it represents. With retryable writes, a write can be sent multiple times but applied exactly once. MongoDB has retryable writes.
DocumentDB doesn’t.

Integrated text search, geospatial processing, graph traversals

Integrated text search saves time and improves performance because you can run queries across multiple sources. With DocumentDB, data must be replicated to adjacent AWS services, which increases cost and complexity. MongoDB Atlas combines integrated text search, graph traversals, and geospatial processing features into a single API and platform. Integrated search with MongoDB Atlas helps drive end user behavior by serving up relevant results based on what users are looking for or what businesses want to direct them toward.

Hedged reads

Geographically distributed replica sets can also be used to scale read operations and intelligently route queries to the replica set that’s closest to the user. Hedged reads is a function that automatically routes queries to the two closest nodes (measured by ping distance), returning results from the fastest replica. This helps minimize situations where queries are waiting on a node that’s already busy. DocumentDB doesn’t offer hedged reads, and it’s more restricted in terms of the number of replica sets it allows and the ability to place workloads in different regions. MongoDB gives you more flexibility when distributing data geographically for hedged reads since it leverages all of the major public cloud providers.

Online Archive

Putting data in cold storage can be a death knell if accessing it again is too cumbersome or slow. With online archiving, you can tier data across fully managed databases and cloud object storage and query it through a single endpoint. Online archiving automatically archives historical data while reducing operational and transactional data storage costs without compromising on query performance. MongoDB has it. DocumentDB doesn’t.

Integrated querying in the cloud

Running separate queries for separate data stores can drain resources and slow queries.
The best solution is being able to query and analyze data across all the different databases and storage containers at once. You can do this with integrated querying, where you run a single query to analyze live cloud data and historical data together and in-place for faster insights. With DocumentDB, you have to replicate data to adjacent AWS services. With MongoDB, you can query and analyze data across cloud datastores and MongoDB Atlas in its native format. You can also run powerful, easy-to-understand aggregations through a unified API for a consistent experience across data types.

On-demand materialized views

When you create aggregations, the results are usually put into a new collection every time you create it. The entire collection is regenerated each time you create the aggregation. This process consumes CPU and I/O. With the $merge stage, you can just update the generated results collection rather than rebuild it completely. $merge lets you incrementally update the collection every time you run it. To update it, all you need to do is run the aggregation again and it will update all the values in place.

$merge gives you the ability to create collections based on an aggregation and update those collections efficiently. This functionality allows users to create on-demand materialized views, where the content of the output collection is incrementally updated when the pipeline is run. MongoDB has this capability. DocumentDB does not.

Rich data types

The decimal data type is critical for storing very large or small numbers, like financial and tax computations, where it’s necessary to emulate decimal rounding exactly. DocumentDB does not support decimal data types or, in turn, lossless processing of complex numeric data, which is a problem for financial and scientific applications. MongoDB does support rich data types like Decimal128, giving you 128 bits of high precision decimal representation.
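To see why exact decimal arithmetic matters for financial data, here is a minimal sketch using Python's standard decimal module. It does not use MongoDB's Decimal128 type itself. It only mimics the 34 significant digits of the IEEE 754 decimal128 format, and the price and tax rate are made-up illustrative values.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is not exactly 0.3.
float_sum = 0.1 + 0.2

# Decimal values built from strings are stored exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")

# The IEEE 754 decimal128 format carries 34 significant decimal digits;
# this context mimics that precision (an approximation for illustration).
decimal128_like = Context(prec=34, rounding=ROUND_HALF_EVEN)

# Hypothetical figures: a price and a tax rate.
price = Decimal("19.99")
tax_rate = Decimal("0.0825")

# The product is computed exactly, then rounded to cents in one explicit step.
tax = decimal128_like.multiply(price, tax_rate)
tax_rounded = tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

print(float_sum == 0.3)               # float comparison fails
print(decimal_sum == Decimal("0.3"))  # decimal comparison holds
print(tax, tax_rounded)
```

The point of the sketch is that rounding happens once, at a step you choose, rather than creeping in through binary approximation at every operation.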
Client-side field-level encryption

Client-side field-level encryption (FLE) reduces the risk of unauthorized access or disclosure of sensitive data, like personally identifiable information (PII) and protected health information (PHI). Fields are encrypted before they leave the application, which protects data while in transit over the network, in database memory, at-rest in storage, in backup repositories, and in system logs. DocumentDB does not offer client-side FLE. MongoDB’s client-side FLE provides among the strongest levels of data privacy and security for regulated workloads.

Platform agility

In addition to the feature sets described here, one of the biggest differences between DocumentDB and MongoDB is the degree of freedom you have to move between different platforms. AWS offers seamless movement and minimal friction between services within its own ecosystem. MongoDB makes it easy to replicate data or move workloads to any cloud provider, giving you complete flexibility within the AWS platform as well as outside of it, whether it’s a self-managed MongoDB instance on cloud infrastructure, a full on-premises deployment, or just a local development instance on an engineer’s laptop.

Try MongoDB Atlas for free today!

July 16, 2021

2021 Payment Trends Guide: What Corporate Clients Want From Their Bank

Payments data monetization is becoming the new battleground for financial services organizations and their corporate clients. Processing of payments data has evolved from a back-end operation to a critical space for innovation. Banks are determined to get more value out of payments data while corporate clients are willing to work with any partner that can help them do it. As one senior executive at a global bank commented: “(Data monetization) would have been very costly to do on a mainframe. The newer technologies like cloud allow you to do this in a much more efficient and effective way.”

A recent survey of bank executives conducted by Celent, sponsored by MongoDB and Icon Solutions, revealed that payments data monetization is a high priority for key technology decision makers. Download the complete report here.

What is payments data monetization?

Payment data includes transaction records as well as data contained within messages. The monetization of payment data is relevant for a number of use cases, including:

- Using payments data to improve internal operations, identify clerical errors, and optimize procurement processes
- Improving straight-through-processing rate, which is the percentage of transactions that are passed through the system without manual intervention
- Incentivizing customers to make payments at different times to optimize the liquidity position of the bank
- Using payments data to enhance existing corporate-facing services
- Tracking payments and forecasting cash balances more accurately

What's new in the payments industry?

The time to invest in payments data monetization is now, according to a survey of hundreds of senior bank executives, corporate treasurers, and CFOs. Banks are eager to monetize their payments data, particularly as real-time payments accelerate and the push by the various regulators around the globe to adopt ISO 20022 intensifies. FedNow in the U.S. is the first true attempt to modernize the American payment landscape.
At the same time, high demand for data-led services is prompting corporate clients to look beyond their existing banking partners to access new and alternative payment services. More than half of corporate clients surveyed relied on a partner other than their lead bank for cash forecasting (62%) and cash visibility (56%).

The hope among banking professionals is that corporate clients would be willing to pay for service improvements and significant value-add services. But where banks see value-add features, clients see obligatory services their banking partners should offer by default. There are, however, opportunities for banks to justify additional fees, including:

- Using payment data to support new propositions and business models
- Driving value-added services through enriched third-party data sets
- Partnering with other organizations to launch new offerings

Which services will corporate clients pay for?

Corporate clients are seeking a wide range of payment services and are clear about the services they’re willing to pay for, including real-time cash balances and forecasting, better security and fraud protection, and data consolidated into a single dashboard. Services like virtual accounts, automated tracking, reconciliation of receivables, and better integration with corporate workflows are also considered high value but are seen as features that shouldn’t carry added fees, according to corporate clients.
Their most sought-after services include:

- Consolidated real-time data from multiple banks in a single dashboard (38%)
- Real-time cash forecasting (37%)
- Better security and fraud protection (36%)
- Real-time cash balances (35%)

Find out more about which services corporate clients are willing to pay for

The push for ISO 20022 standardization

The payments industry and corporate clients are in agreement about the need to adopt ISO 20022 message formats in its native JSON representation, which will allow richer data to be sent across the network and increase rates of automation without further adjustments to legacy data models in relational databases. According to the survey, banks plan to make significant investments in supporting ISO 20022 so they can offer improved services to corporate clients; 74% of banks see ISO 20022 migration as an investment opportunity in new data-led services. For their part, 32% of corporate clients want help managing ISO 20022 changes from their banking partners, and 31% said they would switch providers if it meant receiving help supporting ISO 20022 compliance.

Cloud and agility

Perhaps no sector of the market is more attached to monolithic legacy applications than the banking sector. But if banks wish to differentiate themselves through innovation, they’ll need to leverage modern data architectures to address customer needs with greater agility. Cloud technologies will be essential in the push to adopt modern database design because they offer the ability to create more flexible and responsive applications and microservices with on-demand scalability. According to the survey, banks have the opportunity to unlock long-term gains by combining data assets and integrating payments data into an enterprise-wide data strategy.
As the survey points out, “Data monetization is a strategy, not a product.”

Consolidated payment data and single-view dashboard

The most sought-after payments service by finance executives (cited by 38% of corporate clients, 53% of which boast revenue of $500 million to $1 billion) is a single dashboard that provides a consolidated view of real-time data across all corporate bank partners. Although technology solutions for this service exist, implementation lags. Banks should embrace this critical need as an opportunity to differentiate themselves from competitors.

Find out how MongoDB brings data agility to payment workflows

June 15, 2021