Storage-viz is a suite of web-based visualizers and new experimental database commands that may help you understand how MongoDB utilizes storage and organizes btrees. Storage-viz is now available in the MongoDB Nightly builds.
When a MongoDB collection is created, an on-disk extent is allocated to store the documents. Each time a newly created or updated document cannot fit into the existing collection’s extents, a new extent is created. Each document occupies a contiguous storage area - a record - in one of the collection’s extents. Storage-viz’ experimental storageDetails command extracts information about how the disk storage is used and the web-based visualizer generates an easy-to-read graphical representation. Storage-viz also showcases which parts of the collection’s extents are currently in RAM [NOTE: the visualizer doesn’t display how much memory is available].
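The extent and record layout described above can be sketched as a small model. This is only an illustration under simplifying assumptions: the `Extent` and `Collection` classes, the fixed extent size, and the first-fit placement rule are all hypothetical, and MongoDB's real allocator (free lists, padding, growing extent sizes) is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    """Hypothetical model of an on-disk extent holding contiguous records."""
    size: int
    used: int = 0
    records: list = field(default_factory=list)  # (offset, length) pairs

    def try_allocate(self, length):
        """Place a record after the used area; return None if it does not fit."""
        if self.used + length > self.size:
            return None
        offset = self.used
        self.records.append((offset, length))
        self.used += length
        return offset


class Collection:
    """Hypothetical collection: starts with one extent and adds more on demand."""

    def __init__(self, extent_size):
        self.extent_size = extent_size
        self.extents = [Extent(extent_size)]  # allocated when the collection is created

    def insert(self, doc_length):
        """Return (extent_index, offset) of the record that stores the document."""
        for index, extent in enumerate(self.extents):
            offset = extent.try_allocate(doc_length)
            if offset is not None:
                return index, offset
        # The document fits in none of the existing extents: create a new one.
        extent = Extent(max(self.extent_size, doc_length))
        self.extents.append(extent)
        return len(self.extents) - 1, extent.try_allocate(doc_length)


coll = Collection(extent_size=100)
placements = [coll.insert(length) for length in (40, 50, 30, 10)]
print(placements)
```

In this toy run the third document does not fit in the remaining space of the first extent, so a second extent is created for it, while the fourth, smaller document still fits in the first one. A per-extent view of which records sit where is the kind of information the visualizer renders graphically.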
MongoDB Indexing is accomplished with Btrees. Storage-viz’ _indexStats_ command and its web-based visualizer collect and display statistics related to the tree layout.
Want to try it?
Download the MongoDB Nightly Build (or 2.3.1 as soon as it’s available) from here and head to the Github repository for more information on how to use Storage-viz, submit feature requests or bug reports.
Keep in mind that the new commands are resource intensive and should be considered highly experimental for the time being. We suggest running them on a non-production server on a snapshot of your datafiles.
Storage-viz was designed, coded and tested by Andrea Lattuada, one of 10gen’s Interns. It has been an invaluable and greatly rewarding experience to work closely with 10gen’s Server Engineers and write code that is now available for everyone in the MongoDB community to use.
Living in the post-transactional database future
Given that we’ve spent decades building applications around relational databases, it’s not surprising that the first response to the introduction of NoSQL databases like MongoDB is sometimes “Why?” Developers aren’t usually the ones asking this question, because they love the approachability and flexibility MongoDB gives them. But DBAs who have built their careers on managing heavy RDBMS infrastructure? They’re harder to please.
10gen president Max Schireson estimates that 60 percent of the world’s databases are operational in nature, which is MongoDB’s market. Most of those use cases are ripe for a non-relational approach. The database market is rapidly changing, and very much up for grabs. Or as Redmonk analyst James Governor puts it, “The idea that everything is relational? Those days are gone.”
As useful as relational databases are (and they’re very useful for a certain class of application), they are losing relevance in a world where complex transactions are more the exception, less the rule. In fact, I’d argue that over time, the majority of application software that developers write will be in use cases that are better fits for MongoDB and other NoSQL technology, not legacy RDBMS. That’s the future.
What about now? Arguably, many of the applications being built today are already post-transaction, ripe for MongoDB and poor fits for RDBMS. Consider:
Amazon: its systems that process order transactions (RDBMS) are largely “done” and “stable”. Amazon’s current development largely focuses on how to provide better search and recommendations or how to adapt prices on the fly (NoSQL).
Netflix: the vast majority of its engineering focuses on recommending better movies to you (NoSQL), not processing your monthly bill (RDBMS).
Square: the easy part is processing the credit card (RDBMS). The hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL).
It’s easy, but erroneous, to pigeon-hole these examples as representative of an anomalous minority of enterprises. Yes, these companies represent the cutting edge of both business and technology. But no, they are not alone in building these sorts of applications. For every early-adopter Netflix there’s a sizable, growing population of mainstream companies in media (e.g., The Guardian), finance (e.g., Intuit), or other verticals that are looking to turn technology into a revenue-driving asset, and not simply something that helps keep the lights on and payrolls running.
When what we built were websites, RDBMS worked great. But today, we’re building applications that are mobile, social, involve high volume data feeds, incorporate predictive analytics, etc. These modern applications? They don’t fit RDBMS. Andy Oliver lists 10 things never to do with a relational database, but the list is much longer, and growing.
MongoDB is empowering the next generation of applications: post-transactional applications that rely on bigger data sets that move much faster than an RDBMS can handle. Yes, there will remain a relatively small sphere of applications unsuitable for MongoDB (including applications with a heavy emphasis on complex transactions), but the big needs going forward like search, log analysis, media repositories, recommendation engines, high-frequency trading, etc.? Those functions that really help a company innovate and grow revenue? They’re best done with MongoDB.
Of course, given RDBMS’ multi-decade legacy, it’s natural for developers to try to force RDBMS to work for a given business problem. Take log analysis, for example. Oliver writes:
Log analysis: …[T]urn on the log analysis features of Hadoop or RHQ/JBossON for a small cluster of servers. Set the log level and log capture to anything other than ERROR. Do something more complex and life will be very bad.
See, this kind of somewhat unstructured data analysis is exactly what MapReduce à la Hadoop and languages like PIG are for. It’s unfortunate that the major monitoring tools are RDBMS-specific; they really don’t need transactions, and low latency is job No. 1.
Forward-looking organizations already realize that MongoDB is an excellent fit for log management, which is why we see more and more enterprises turning to MongoDB for this purpose. I expect this to continue. As MongoDB continues to enrich its functionality, the universe of applications for which it is not merely applicable, but also better, will continue to expand, even as the universe of applications for which RDBMS is optimal will decline.
Indeed, we’re already living in a post-transactional world. Some people just don’t know it yet. (Or, as William Gibson would say, “The future is already here – it’s just not very evenly distributed.”)
Posted by Matt Asay, vice president of Corporate Strategy, with significant help from my inestimable colleague, Jared Rosoff.
Tagged with: NoSQL, MongoDB, RDBMS, relational, James Governor, Redmonk, log analysis, Andy Oliver, transactions, Netflix, Amazon, Square, operational database, DBA
MACH Aligned for Retail (Microservices, API-First, Cloud Native SaaS, Headless)
Across the Retail industry, MACH principles and the MACH Alliance are becoming increasingly common. What is MACH and why is it being embraced for Retail?
The MACH Alliance is a non-profit organization fostering the adoption of composable architecture principles. MACH stands for Microservices, API-First, Cloud-Native SaaS and Headless. The MACH Alliance’s Manifesto is to: “Future proof enterprise technology and propel current and future digital experiences.”
The MACH Alliance and the creation of this set of principles originated in the Retail Industry. Several of the five co-founders of the MACH Alliance are technology companies building for retail use cases: for example, commercetools is a composable commerce platform for retail (built completely on MongoDB).
MongoDB has been a member of the MACH Alliance since 2020, as an “enabler” member, meaning use of our technology can enable the implementation of the MACH principles in application architectures. This is because a data layer built on MongoDB is ideal as the basis for a MACH architecture. Members of our Industry Solutions team sit on the MACH technology, growth and marketing councils, and are actively involved in furthering the adoption of MACH across the Retail Industry.
What is MACH, why is it important for retail?
The retail industry has long been a fast adopter of technology and a forerunner in technology trends. This is because the competitive nature of the business drives innovation: it's vital that retailers are able to react quickly to new technologies (e.g. NFTs, VR, AI) to capture market share and stay ahead of their competitors. Retailers have realized that to deliver new and value-add experiences to their customers, they have to cut back on the operational overhead that increases cost, and stop building standard functionality that can either be bought or re-used.
This is where the benefits of MACH come in: it's all about increasing the ability to deliver innovation quickly while lowering operational costs and risk.
Microservices: An approach to building applications in which business functions are broken down into smaller, self-contained components called services. These services function autonomously and are usually developed and deployed independently. This means the failure or outage of one microservice will not affect another, and teams can develop in parallel, increasing efficiency.
API-First: A style of development where the sharing and use of data via API (application programming interface) is considered first and foremost in the development process. This means that services are designed to aid the easy sharing of information across the organization and simple interconnectivity of systems.
Cloud-Native SaaS: Cloud-native SaaS solutions are vendor-managed applications developed in and for the cloud, leveraging all the capabilities the cloud has to offer, such as fully managed hosting, built-in security, auto-scaling, cross-regional deployment and automatic updates. These are a good fit for a MACH architecture, as adopting them can reduce operational costs and free up developers for value-add work like new, unique customer experiences.
Headless: Decoupling the front end from the back end so that front ends (or “heads”) can be created or iterated on with no dependencies on the back end. The fact that the layers are loosely coupled decreases time to market for new front ends, and encourages the re-use of back-end services for multiple purposes. It also de-risks change in the long term, as services can function independently.
Where does MongoDB come in?
MongoDB is an enabler for MACH, meaning that using MongoDB as your data layer helps retailers and retail software companies achieve MACH compliance.
Our data model, architecture and functionality empower IT organizations to build in line with these architecture principles.
During a digital transformation, where a retailer is modernizing a monolith into a microservices-based architecture, they're looking for a data layer which will enable speed of development and change. MongoDB has been the "most wanted" database 4 years running in Stack Overflow's developer survey. This is because our document model maps to the way developers think and code, and its flexibility allows for iterative change of the data layer.
When looking at API-based communication, the standard format for APIs is JSON, which again maps to MongoDB's document model. The idea with API-first development is to develop with the API in mind, so why not store the data the way you're going to serve it by API? This reduces complexity and increases performance.
Cloud-Native and SaaS products have become the norm as retailers wish to reduce maintenance and management work. MongoDB Atlas provides a database-as-a-service with a 99.995% uptime guarantee, automatic failover and self-healing, and allows DevOps engineers to spin up databases in minutes or by API or script. Many retail software companies are also built on MongoDB Atlas: for example commercetools, which provides an ecommerce solution as a SaaS product.
Headless architectures require a data layer that is able to adapt and change for new workloads. The ability to change the schema at runtime, with no downtime, makes MongoDB's document model ideal for this. Performance and the ability to scale for new "heads" are also important. MongoDB is known as a high-performance database and can scale vertically automatically or scale out horizontally seamlessly.
So MongoDB becomes a great choice for retailers choosing to adopt a MACH architecture (see figure 1 below).
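To make the API-first and flexible-schema points above concrete, here is a minimal sketch using plain Python dictionaries in place of a real collection. The product fields, the `products` list and the `get_product` handler are hypothetical names chosen for illustration; a real service would read documents through a MongoDB driver, and stored documents would also carry an `_id` field.

```python
import json

# Hypothetical product documents, shaped the way an API would return them.
products = [
    {
        "sku": "TSHIRT-001",
        "name": "Plain T-shirt",
        "price": {"amount": 19.99, "currency": "EUR"},
        "sizes": ["S", "M", "L"],
    },
    # A later document adds a new field; the earlier document is left unchanged.
    {
        "sku": "SNEAKER-042",
        "name": "Trail sneaker",
        "price": {"amount": 89.0, "currency": "EUR"},
        "sizes": ["41", "42", "43"],
        "sustainability": {"recycled_content_pct": 30},
    },
]


def get_product(sku):
    """Hypothetical API handler: return the stored document as a JSON body."""
    for doc in products:
        if doc["sku"] == sku:
            # No mapping layer: the stored shape is the served shape.
            return json.dumps(doc)
    return None


print(get_product("SNEAKER-042"))
```

Because the stored document already has the shape of the response, the handler needs no translation between rows and JSON, and a new front end that wants the extra `sustainability` field can use it without a migration of older documents.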
As a general-purpose database with high performance, a rich, expressive query language and secondary indexing, MongoDB is a really good fit as a data layer, as it is capable of handling both the operational and analytical needs of the application.
Figure 1: Example of a MACH architecture
Want to know more?
Are you interested in a transition to MACH? Dive into our four-part blog series exploring each topic in detail and how MongoDB supports each of these principles: Microservices, API-First, Cloud-Native SaaS and Headless.