Here are some great articles about MongoDB to read this weekend:
ScaleGrid: Should You Enable MongoDB Journaling?, 10/18
Business Insider: MongoDB co-founder Dwight Merriman and CEO Max Schireson were chosen for the Silicon Alley Top 100 in New York Tech roundup, October
InfoWorld: Use MongoDB to Make Your App Location-Aware, 10/24
comSysto: Getting Started with MongoSoup, 10/25
Digital Misinformation: Collections and Embedded Documents in MongoDB, 10/22
Performance Tuning MongoDB on Solidfire
This is a guest post by by Chris Merz & Garrett Clark, SolidFire We recently had a large enterprise customer implement a MongoDB sharded cluster on SolidFire as the backend for a global e-commerce system. By leveraging solid-state drive technology with features like storage virtualization, Quality of Service (guaranteed IOPS per volume), and horizontal scaling, the customer was looking to combine the benefits of dedicated storage performance with the simplicity and scalability of a MongoDB environment. During the implementation the customer reached out to us with some performance and tuning questions, requesting our assistance with the configuration. After meeting with the team and reviewing the available statistics, we discovered response times that were deemed out of range for the application’s performance requirements. Response times were ~13-20ms (with an average of 15-17 ms). While this is considered acceptable latency in many implementations, the team was targeting < 5ms average query response times. When troubleshooting any storage latency performance issue it is important to focus on two critical aspects of the end-to-end chain: potential i/o queue depth bottlenecks and the main contributors to the overall latency in the chain. A typical end-to-end sequence with attached storage can be described by: MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB First off, we looked for any i/o queue depth bottlenecks and found the first one on the operating system layer. MongoDB was periodically sending an i/o queue depth of >100 to the operating system and, by default, iSCSI could only release a 32 queue depth per iSCSI session. This drop from an i/o queue depth of >100 to 32 caused frames to be stalled on the operating system layer while they were waiting to continue down the chain. We alleviated the issue by increasing the number of iSCSI sessions to the volume from 1 to 4, which proportionally increased the queue depth exiting the operating system to 128 (32*4). This enabled all frames coming off the application layer to immediately pass through the operating system and NIC, decreased the overall latency from ~15ms to ~4ms. Despite the latency average being 4ms, performance was still rather variable. We then turned our focus to pinpointing the sources of the remaining end-to-end latency. We were able to determine the latency factors in the stack through the study of three latency loops: First, the complete chain of: MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB . This loop took an average of 3.9ms to complete. Secondly, the subset loop of: OS > NIC > Network > Storage > Network > NIC > OS . This loop took ~1.1ms to complete. We determined the latency of this loop by the output of “iostat –xk 1” then greping for the corresponding volume. The last loop segment, latency on the storage subsystem, was 0.7ms and was obtained through a polling API command issued to the SolidFire unit. Our analysis pointed to the first layers of the stack contributing the most significant percent (>70%) of the end-to-end latency, so we decided to start there and continue downstream. We reviewed the OS configuration and tuning, with an eye towards both SolidFire/iSCSI best practices and MongoDB performance. Several OS-level tunables were found that could be tweaked to ensure optimal throughput for this type of deployment. Unfortunately, none of these resulted in any major reduction in the end-to-end latency for mongo. Having eliminated the obvious, we were left with what remained: MongoDB itself. A phrase oft-quoted by the famous fictional detective, Sherlock Holmes came to mind: “when you have eliminated the impossible, whatever remains, however improbable , must be the truth.” Upon going over the collected statistics runs with a fine-toothed comb, we noticed that the latency spikes had intervals of almost exactly 60 seconds. That’s when the light bulb went off… The MongoDB flush interval. The architecture of MongoDB was developed in the context of spinning disk, a vastly slower storage technology requiring batched file syncs to minimize query latency. The syncdelay setting defaults to 60 seconds for this very reason. In the documentation , it is clearly stated “In almost every situation you should not set this value and use the default setting”. ‘Almost’ was the key to our solution, in this particular case. It should be noted that changing syncdelay is an advanced tuning, and should be carefully evaluated and tested on a per-deployment basis. Little’s Law (IOPS = Queue Depth / Latency) indicated that lowering the flush interval would reduce the variance in queue depth thereby smoothing the overall latency. In lab testing, we had found that, under maximum load, decreasing the syncdelay to 1 second would force a ‘continuous flush’ behavior usually repeating every 6-7 seconds, reducing i/o spikes in the storage path. We had seen this as a useful technique for controlling IOPS throughput variability, but had not typically viewed it as a latency reduction technique. It worked! After implementing the change, the customer excitedly reported that they were seeing average end-to-end MongoDB response times of 1.2ms, with a throughput of ~4-5k IOPS per mongod (normal for this application), and NO obvious increase in extraneous i/o. By increasing the number of iSCSI sessions, normalizing the flush rate and removing the artificial 60s buffer, we reduced average latency more than an order of magnitude, proving out the architecture at scale in a global production environment. Increasing the iSCSI sessions increased parallelism, and decreased the latency by 3.5-4x. The reduction in syncdelay had the effect of smoothing the average queue depth being sent to the storage system, decreasing latency by slightly more than 3x. This customer’s experience is a good example of how engaging the MongoDB team early on can ensure a smooth product launch. As of today, we’re excited to announce that SolidFire is a MongoDB partner. Learn more about the SolidFire and MongoDB integration on our Database Solutions page . To learn more about performance tuning MongoDB on SolidFire, register for our upcoming webinar on November 6 with MongoDB. For more information on MongoDB performance, check out Performance Considerations for MongoDB , which covers other topics such as indexes and application patterns and offers insight into achieving MongoDB performance at scale.
The Rise of the Strategic Developer
The work of developers is sometimes seen as tactical in nature. In other words, developers are not often asked to produce strategy. Rather, they are expected to execute against strategy, manifesting digital experiences that are defined by the “business.” But that is changing. With the automation of many time-consuming tasks -- from database administration to coding itself -- developers are now able to spend more time on higher value work, like understanding market needs or identifying strategic problems to solve. And just as the value of their work increases, so too does the value of their opinions. As a result, many developers are evolving, from coders with their heads-down in the corporate trenches to highly strategic visionaries of the digital experiences that define brands. “I think the very definition of ‘developer’ is expanding,” says Stephen “Stennie” Steneker, an engineering manager on the Developer Relations team at MongoDB. “It’s not just programmers anymore. It’s anyone who builds something.” Stennie notes that the learning curve needed to build something is flattening. Fast. He points to an emerging category of low code tools like Zapier, which allows people to stitch web apps together without having to write scripts or set up APIs. “People with no formal software engineering experience can build complex automated workflows to solve business problems. That’s a strategic developer.” Many other traditional developer tasks are being automated as well. At MongoDB, for example, we pride ourselves on removing the most time-consuming, low-value work of database administration. And of course, services like GitHub Copilot are automating the act of coding itself. So what does this all mean for developers? A few things: First, move to higher ground. In describing one of the potential outcomes of GitHub Copilot, Microsoft CTO Kevin Scott said, ““It may very well be one of those things that makes programming itself more approachable.” When the barriers to entry for a particular line of work start falling, standing still is not an option. It’s time to up your strategic game by offering insight and suggestions on new digital experiences that advance the objectives of the business. Second, accept more responsibility. A strategic developer is someone who can conceive, articulate, and execute an idea. That also means you are accountable for the success or failure of that idea. And as Stennie reminded me, “There are more ways than ever before to measure the success of a developer’s work.” And third, never stop skilling. Developers with narrow or limited skill sets will never add strategic value, and they will always be vulnerable to replacement. Like software itself, developers need to constantly evolve and improve, expanding both hard and soft skills. How do you see the role of the developer evolving? Any advice for those that aspire to more strategic roles within their organizations? Reach out and let me know what you think at @MarkLovesTech .