Storage-viz is a suite of web-based visualizers and new experimental database commands that may help you understand how MongoDB utilizes storage and organizes btrees. Storage-viz is now available in the MongoDB Nightly builds.
When a MongoDB collection is created, an on-disk extent is allocated to store the documents. Each time a newly created or updated document cannot fit into the collection’s existing extents, a new extent is created. Each document occupies a contiguous storage area (a record) in one of the collection’s extents. Storage-viz’ experimental storageDetails command extracts information about how the disk storage is used, and the web-based visualizer generates an easy-to-read graphical representation. Storage-viz also shows which parts of the collection’s extents are currently in RAM [NOTE: the visualizer doesn’t display how much memory is available].
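The allocation behavior described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: the names `Extent` and `ToyCollection`, the fixed extent size, and the first-fit placement are all hypothetical, and none of it reflects MongoDB's actual allocator or the output of the storageDetails command.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    """Hypothetical fixed-size storage region holding contiguous records."""
    size: int
    used: int = 0
    records: list = field(default_factory=list)  # (offset, length) pairs

    def try_insert(self, length: int) -> bool:
        """Place a record contiguously after the last one, if it fits."""
        if self.used + length > self.size:
            return False
        self.records.append((self.used, length))
        self.used += length
        return True


class ToyCollection:
    """Toy collection: one extent at creation, more added on demand."""

    def __init__(self, extent_size: int = 100):
        self.extent_size = extent_size
        self.extents = [Extent(extent_size)]  # allocated when created

    def insert(self, length: int) -> int:
        """Store a record and return the index of the extent holding it."""
        for index, extent in enumerate(self.extents):
            if extent.try_insert(length):
                return index
        # The record fits in no existing extent, so a new extent is created.
        new_extent = Extent(max(self.extent_size, length))
        new_extent.try_insert(length)
        self.extents.append(new_extent)
        return len(self.extents) - 1

    def used_per_extent(self) -> list:
        """Bytes used in each extent, the kind of figure a visualizer plots."""
        return [extent.used for extent in self.extents]


collection = ToyCollection(extent_size=100)
placements = [collection.insert(n) for n in (60, 60, 30)]
print(placements, collection.used_per_extent())
```

In this sketch the second 60-byte record does not fit next to the first, so a second extent is created, while the later 30-byte record still fits in the first extent. Per-extent usage of this kind is roughly what a storage visualizer would draw, although the real command reports far more detail.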
MongoDB indexing is accomplished with Btrees. Storage-viz’ indexStats command and its web-based visualizer collect and display statistics related to the tree layout.
Want to try it?
Download the MongoDB Nightly Build (or 2.3.1 as soon as it’s available) from here and head to the GitHub repository to learn how to use Storage-viz or to submit feature requests and bug reports.
Keep in mind that the new commands are resource intensive and should be considered highly experimental for the time being. We suggest running them on a non-production server on a snapshot of your datafiles.
Storage-viz was designed, coded and tested by Andrea Lattuada, one of 10gen’s Interns. It has been an invaluable and greatly rewarding experience to work closely with 10gen’s Server Engineers and write code that is now available for everyone in the MongoDB community to use.
Living in the post-transactional database future
Given that we’ve spent decades building applications around relational databases, it’s not surprising that the first response to the introduction of NoSQL databases like MongoDB is sometimes “Why?” Developers aren’t usually the ones asking this question, because they love the approachability and flexibility MongoDB gives them. But DBAs who have built their careers on managing heavy RDBMS infrastructure? They’re harder to please.
10gen president Max Schireson estimates that 60 percent of the world’s databases are operational in nature, which is MongoDB’s market. Most of those use cases are ripe for a non-relational approach. The database market is rapidly changing, and very much up for grabs. Or as RedMonk analyst James Governor puts it, “The idea that everything is relational? Those days are gone.”
As useful as relational databases are (and they’re very useful for a certain class of application), they are losing relevance in a world where complex transactions are more the exception, less the rule. In fact, I’d argue that over time, the majority of application software that developers write will serve use cases that are better fits for MongoDB and other NoSQL technology, not legacy RDBMS. That’s the future.
What about now? Arguably, many of the applications being built today are already post-transaction, ripe for MongoDB and poor fits for RDBMS. Consider:
Amazon: its systems that process order transactions (RDBMS) are largely “done” and “stable”. Amazon’s current development is largely focusing on how to provide better search and recommendations or how to adapt prices on the fly (NoSQL).
Netflix: the vast majority of its engineering is focusing on recommending better movies to you (NoSQL), not processing your monthly bill (RDBMS).
Square: the easy part is processing the credit card (RDBMS). The hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL).
It’s easy, but erroneous, to pigeon-hole these examples as representative of an anomalous minority of enterprises. Yes, these companies represent the cutting edge of both business and technology. But no, they are not alone in building these sorts of applications. For every early-adopter Netflix there’s a sizable, growing population of mainstream companies in media (e.g., The Guardian), finance (e.g., Intuit), or other verticals that are looking to turn technology into a revenue-driving asset, and not simply something that helps keep the lights on and payrolls running.
When what we built were websites, RDBMS worked great. But today, we’re building applications that are mobile, social, involve high volume data feeds, incorporate predictive analytics, etc. These modern applications? They don’t fit RDBMS. Andy Oliver lists 10 things never to do with a relational database, but the list is much longer, and growing.
MongoDB is empowering the next generation of applications: post-transactional applications that rely on bigger data sets that move much faster than an RDBMS can handle. Yes, there will remain a relatively small sphere of applications unsuitable for MongoDB (including applications with a heavy emphasis on complex transactions), but the big needs going forward like search, log analysis, media repositories, recommendation engines, high-frequency trading, etc.? Those functions that really help a company innovate and grow revenue? They’re best done with MongoDB.
Of course, given RDBMS’ multi-decade legacy, it’s natural for developers to try to force RDBMS to work for a given business problem. Take log analysis, for example. Oliver writes:
“Log analysis: …[T]urn on the log analysis features of Hadoop or RHQ/JBossON for a small cluster of servers. Set the log level and log capture to anything other than ERROR. Do something more complex and life will be very bad. See, this kind of somewhat unstructured data analysis is exactly what MapReduce à la Hadoop and languages like PIG are for. It’s unfortunate that the major monitoring tools are RDBMS-specific; they really don’t need transactions, and low latency is job No. 1.”
Forward-looking organizations already realize that MongoDB is an excellent fit for log management, which is why we see more and more enterprises turning to MongoDB for this purpose. I expect this to continue. As MongoDB continues to enrich its functionality, the universe of applications for which it is not merely applicable, but also better, will continue to expand, even as the universe of applications for which RDBMS is optimal will decline.
Indeed, we’re already living in a post-transactional world. Some people just don’t know it yet. (Or, as William Gibson would say, “The future is already here – it’s just not very evenly distributed.”)
Posted by Matt Asay, vice president of Corporate Strategy, with significant help from my inestimable colleague, Jared Rosoff.
Tagged with: NoSQL, MongoDB, RDBMS, relational, James Governor, Redmonk, log analysis, Andy Oliver, transactions, Netflix, Amazon, Square, operational database, DBA
MACH Aligned for Retail: Cloud-Native SaaS
MongoDB is an active member of the MACH Alliance, a non-profit cooperation of technology companies fostering the adoption of composable architecture principles promoting agility and innovation. Each letter in the MACH acronym corresponds to a different concept that should be leveraged when modernizing heritage solutions and creating brand-new experiences. MACH stands for Microservices, API-first, Cloud-native SaaS, and Headless.
In previous articles in this series, we explored the importance of Microservices and the API-first approach. Here, we will focus on the third principle championed by the alliance: Cloud-native SaaS. Let’s dive in.
What is cloud-native SaaS?
Cloud-native SaaS solutions are vendor-managed applications developed in and for the cloud, and leveraging all the capabilities the cloud has to offer, such as fully managed hosting, built-in security, auto-scaling, cross-regional deployment, automatic updates, built-in analytics, and more.
Why is cloud-native SaaS important for retail?
Retailers are pressed to transform their digital offerings to meet rapidly shifting consumer needs and remain competitive. Traditionally, this means establishing areas of improvement for your systems and instructing your development teams to refactor components to introduce new capabilities (e.g., analytics engines for personalization or mobile app support) or to streamline architectures to make them easier to maintain (e.g., moving from monolith to microservices). These approaches can yield good results but require a substantial investment in time, budget, and internal technical knowledge to implement.
Now, retailers have an alternative tool at their disposal: Cloud-native SaaS applications. These solutions are readily available off-the-shelf and require minimal configuration and development effort.
Adopting them as part of your technology stack can accelerate the transformation and time to market of new features, while not requiring specific in-house technical expertise. Many cloud-native SaaS solutions focused on retail use cases are available (see Figure 1), including Vue Storefront, which provides a front-end presentation layer for ecommerce, and Amplience, which enables retailers to customize their digital experiences.
Figure 1: Some MACH Alliance members providing retail solutions.
At the same time, in-house development should not be totally discarded, and you should aim to strike the right balance between the two options based on your objectives. Figure 2 shows pros and cons of the two approaches:
Figure 2: Pros and cons of cloud-native SaaS and in-house approaches.
MongoDB is a great fit for cloud-native SaaS applications
MongoDB’s product suite is cloud-native by design and is a great fit if your organization is adopting this principle, whether you prefer to run your database on-premises, leveraging MongoDB Community and Enterprise Advanced, or as SaaS with MongoDB Atlas.
MongoDB Atlas, our developer data platform, is particularly suitable in this context. It supports the three major cloud providers (AWS, GCP, Azure) and leverages the cloud platforms’ features to achieve cloud-native principles and design:
Auto-deployment & auto-healing: DB clusters are provisioned, set up, and healed automatically, reducing operational and DBA efforts.
Automatically scalable: Built-in auto-scaling capabilities enable the database RAM, CPU, and storage to scale up or down depending on traffic and data volume. A MongoDB Serverless instance allows abstracting the infrastructure even further, by paying only for the resources you need.
Globally distributed: The global nature of the retail industry requires data to be efficiently distributed to ensure high availability and compliance with data privacy regulations, such as GDPR, while implementing strict privacy controls. MongoDB Atlas leverages the flexibility of the cloud with its replica set architecture and multi-cloud support, meaning that data can be easily distributed to meet complex requirements.
Secure from the start: Network isolation, encryption, and granular auditing capabilities ensure data is only accessible to authorized individuals, thereby maintaining confidentiality.
Always up to date: Security patches and minor upgrades are performed automatically with no intervention required from your team. Major releases can be integrated effortlessly, without modifying the underlying OS or working with package files.
Monitorable and reliable: MongoDB Atlas distributes a set of utilities that provides real-time reporting of database activities to monitor and improve slow queries, visualize data traffic, and more. Backups are also fully managed, ensuring data integrity.
Independent Software Vendors (ISVs) increasingly rely on capabilities like these to build cloud-native SaaS applications addressing retail use cases. For example, Commercetools offers a fully managed ecommerce platform underpinned by MongoDB Atlas (see Figure 3). Their end-to-end solution provides retailers with the tools to transform their ecommerce capabilities in a matter of days, instead of building a solution in-house. Commercetools is also a MACH Alliance member, fully embracing composable architecture paradigms explored in this series. Adopting Commercetools as your ecommerce platform of choice lets you automatically scale your ecommerce as traffic increases, and it integrates with many third-party systems, ranging from payment platforms to front-end solutions.
Additionally, its headless nature and strong API layer allow your front-end to be adapted based on your brands, currencies, and geographies. Commercetools runs on and natively ingests data from MongoDB. Leveraging MongoDB for your other home-grown applications means that you can standardize your data estate, while taking advantage of the many capabilities that the MongoDB data platform has to offer. The same principles can be applied to other SaaS solutions running on MongoDB.
Figure 3: MongoDB Atlas and Commercetools capabilities.
Find out more about the MongoDB partnership with Commercetools. Learn how Commercetools enabled Audi to integrate its in-car commerce solution and adapt it to 26 countries.
MongoDB supports your home-grown applications
MongoDB offers a powerful developer data platform, providing the tools to leverage composable architecture patterns and build differentiating experiences in-house. The same benefits of MongoDB’s cloud-native architecture explored earlier are also applicable in this context and are leveraged by many retailers globally, such as Conrad Electronics, running their B2B ecommerce platform on MongoDB Atlas.
Summary
Cloud-native principles are an essential component of modern systems and applications. They support ISVs in developing powerful SaaS applications and can be leveraged to build proprietary systems in-house. In both scenarios, MongoDB is strongly positioned to deliver on the cloud-native capabilities that should be expected from a modern data platform.
Stay tuned for our final blog of this series on Headless and check out our previous blogs on Microservices and API-first.