What We're Reading
Here are some great articles about MongoDB to read this weekend:
ScaleGrid: Should You Enable MongoDB Journaling?, 10/18
Business Insider: MongoDB co-founder Dwight Merriman and CEO Max Schireson were chosen for the Silicon Alley Top 100 in New York Tech roundup, October
InfoWorld: Use MongoDB to Make Your App Location-Aware, 10/24
comSysto: Getting Started with MongoSoup, 10/25
Digital Misinformation: Collections and Embedded Documents in MongoDB, 10/22
We’re happy to announce a new partnership with SolidFire. You can read more about this on Consumer Electronics and the MongoDB blog, 10/24
Performance Tuning MongoDB on SolidFire
This is a guest post by Chris Merz & Garrett Clark, SolidFire.
We recently had a large enterprise customer implement a MongoDB sharded cluster on SolidFire as the backend for a global e-commerce system. By leveraging solid-state drive technology with features like storage virtualization, Quality of Service (guaranteed IOPS per volume), and horizontal scaling, the customer was looking to combine the benefits of dedicated storage performance with the simplicity and scalability of a MongoDB environment.
During the implementation the customer reached out to us with some performance and tuning questions, requesting our assistance with the configuration. After meeting with the team and reviewing the available statistics, we discovered response times that were deemed out of range for the application’s performance requirements. Response times were ~13-20ms (with an average of 15-17ms). While this is considered acceptable latency in many implementations, the team was targeting < 5ms average query response times.
When troubleshooting any storage latency issue, it is important to focus on two critical aspects of the end-to-end chain: potential i/o queue depth bottlenecks and the main contributors to the overall latency in the chain. A typical end-to-end sequence with attached storage can be described by:
MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB
First off, we looked for any i/o queue depth bottlenecks and found the first one on the operating system layer. MongoDB was periodically sending an i/o queue depth of >100 to the operating system and, by default, iSCSI could only release a queue depth of 32 per iSCSI session. This drop from an i/o queue depth of >100 to 32 caused frames to be stalled on the operating system layer while they were waiting to continue down the chain.
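The queue depth mismatch described above comes down to simple arithmetic. The following is a minimal sketch, not the authors' tooling: the helper names are hypothetical, and the peak of 100 is an assumption standing in for the ">100" quoted in the post.

```python
import math

# Default per-session iSCSI queue depth cited in the post.
QUEUE_DEPTH_PER_SESSION = 32


def effective_queue_depth(sessions: int, per_session: int = QUEUE_DEPTH_PER_SESSION) -> int:
    """Total queue depth the OS can release across all iSCSI sessions."""
    return sessions * per_session


def sessions_needed(peak_queue_depth: int, per_session: int = QUEUE_DEPTH_PER_SESSION) -> int:
    """Smallest session count whose combined queue depth covers the peak."""
    return math.ceil(peak_queue_depth / per_session)


# Assumption: 100 stands in for the ">100" peak mentioned in the post.
assumed_peak = 100
n = sessions_needed(assumed_peak)
print(f"sessions needed: {n}, effective queue depth: {effective_queue_depth(n)}")
# prints: sessions needed: 4, effective queue depth: 128
```

A single session releases only 32 outstanding requests, so anything above that waits in the operating system. A peak above 128 would need a fifth session under the same assumption.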
We alleviated the issue by increasing the number of iSCSI sessions to the volume from 1 to 4, which proportionally increased the queue depth exiting the operating system to 128 (32*4). This enabled all frames coming off the application layer to immediately pass through the operating system and NIC, and decreased the overall latency from ~15ms to ~4ms.
Despite the latency average being 4ms, performance was still rather variable. We then turned our focus to pinpointing the sources of the remaining end-to-end latency. We were able to determine the latency factors in the stack through the study of three latency loops:
First, the complete chain of: MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB. This loop took an average of 3.9ms to complete.
Secondly, the subset loop of: OS > NIC > Network > Storage > Network > NIC > OS. This loop took ~1.1ms to complete. We determined the latency of this loop from the output of “iostat -xk 1”, grepping for the corresponding volume.
The last loop segment, latency on the storage subsystem, was 0.7ms and was obtained through a polling API command issued to the SolidFire unit.
Our analysis pointed to the first layers of the stack contributing the most significant percent (>70%) of the end-to-end latency, so we decided to start there and continue downstream. We reviewed the OS configuration and tuning, with an eye towards both SolidFire/iSCSI best practices and MongoDB performance. Several OS-level tunables were found that could be tweaked to ensure optimal throughput for this type of deployment. Unfortunately, none of these resulted in any major reduction in the end-to-end latency for mongo. Having eliminated the obvious, we were left with what remained: MongoDB itself.
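Because the three loops are nested, subtracting each inner loop from the one around it gives an approximate per-layer share of the latency. This is a minimal sketch using the figures quoted in the post; the function and the layer labels are hypothetical, and the subtraction assumes the three measurements are directly comparable averages.

```python
def layer_latencies(full_ms: float, os_loop_ms: float, storage_ms: float) -> dict:
    """Split nested loop timings into the latency added by each layer."""
    return {
        # Time spent above the OS block layer (MongoDB and the upper stack).
        "upper_stack": full_ms - os_loop_ms,
        # Time spent in the OS, NIC, and network, excluding the storage system.
        "os_nic_network": os_loop_ms - storage_ms,
        # Time spent inside the storage subsystem itself.
        "storage": storage_ms,
    }


# Figures quoted in the post: 3.9ms full loop, ~1.1ms OS loop, 0.7ms storage.
full_ms = 3.9
parts = layer_latencies(full_ms, os_loop_ms=1.1, storage_ms=0.7)
for layer, ms in parts.items():
    print(f"{layer}: {ms:.1f}ms ({ms / full_ms:.0%} of end-to-end)")
```

With these inputs the upper stack accounts for roughly 2.8ms of the 3.9ms, or about 72%, which is consistent with the ">70%" figure in the text.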
A phrase oft-quoted by the famous fictional detective Sherlock Holmes came to mind: “when you have eliminated the impossible, whatever remains, however improbable, must be the truth.”
Upon going over the collected statistics runs with a fine-toothed comb, we noticed that the latency spikes had intervals of almost exactly 60 seconds. That’s when the light bulb went off… The MongoDB flush interval.
The architecture of MongoDB was developed in the context of spinning disk, a vastly slower storage technology requiring batched file syncs to minimize query latency. The syncdelay setting defaults to 60 seconds for this very reason. In the documentation, it is clearly stated “In almost every situation you should not set this value and use the default setting”. ‘Almost’ was the key to our solution, in this particular case. It should be noted that changing syncdelay is an advanced tuning change, and should be carefully evaluated and tested on a per-deployment basis.
Little’s Law (IOPS = Queue Depth / Latency) indicated that lowering the flush interval would reduce the variance in queue depth, thereby smoothing the overall latency. In lab testing, we had found that, under maximum load, decreasing the syncdelay to 1 second would force a ‘continuous flush’ behavior usually repeating every 6-7 seconds, reducing i/o spikes in the storage path. We had seen this as a useful technique for controlling IOPS throughput variability, but had not typically viewed it as a latency reduction technique.
It worked! After implementing the change, the customer excitedly reported that they were seeing average end-to-end MongoDB response times of 1.2ms, with a throughput of ~4-5k IOPS per mongod (normal for this application), and NO obvious increase in extraneous i/o.
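Little’s Law as written above can be rearranged to show why smoothing queue depth smooths latency. This is a minimal sketch under stated assumptions: the function names are hypothetical, 4,500 IOPS is the midpoint of the quoted ~4-5k range, and the tenfold burst is an invented illustration rather than a measured value.

```python
def queue_depth(iops: float, latency_s: float) -> float:
    """Little's Law rearranged: average outstanding requests = IOPS * latency."""
    return iops * latency_s


def latency_ms(depth: float, iops: float) -> float:
    """Little's Law rearranged: latency = queue depth / IOPS, in milliseconds."""
    return depth / iops * 1000.0


# Assumption: 4,500 IOPS is the midpoint of the ~4-5k IOPS quoted in the post.
steady_iops = 4500.0
steady_depth = queue_depth(steady_iops, latency_s=0.0012)  # 1.2ms average latency
print(f"average queue depth: {steady_depth:.1f}")

# Hypothetical burst: if a periodic flush pushed ten times as many requests
# into the queue at the same throughput, latency would grow by the same factor.
burst_depth = steady_depth * 10
print(f"latency during burst: {latency_ms(burst_depth, steady_iops):.1f}ms")
```

At a fixed throughput, latency scales linearly with the number of outstanding requests, so spreading a flush over many small syncs instead of one large sync every 60 seconds keeps the queue, and therefore the latency, closer to its average.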
By increasing the number of iSCSI sessions, normalizing the flush rate and removing the artificial 60s buffer, we reduced average latency more than an order of magnitude, proving out the architecture at scale in a global production environment. Increasing the iSCSI sessions increased parallelism, and decreased the latency by 3.5-4x. The reduction in syncdelay had the effect of smoothing the average queue depth being sent to the storage system, decreasing latency by slightly more than 3x.
This customer’s experience is a good example of how engaging the MongoDB team early on can ensure a smooth product launch. As of today, we’re excited to announce that SolidFire is a MongoDB partner. Learn more about the SolidFire and MongoDB integration on our Database Solutions page.
To learn more about performance tuning MongoDB on SolidFire, register for our upcoming webinar on November 6 with MongoDB. For more information on MongoDB performance, check out Performance Considerations for MongoDB, which covers other topics such as indexes and application patterns and offers insight into achieving MongoDB performance at scale.
Using MongoDB Skill Scanner to Build Better Training Programs
Technology leaders know that transformation is about more than just adopting modern technologies like MongoDB. The entire organization has to rally behind change, which is no easy task. The skills that modern development teams need are evolving faster than ever, and hiring to fill skills gaps can be too time-consuming and expensive a process for many organizations. So it’s imperative that we plan for how we want to bring our people with us on our modernization journey, and proactively upskill them on the technologies we’re betting on. Because what happens if you choose MongoDB, but your developers don’t know how to use it?
CIOs know that training programs are easier said than done. EY reported that 30% of CIOs acknowledge that their training programs are ineffective, and that they’re struggling to retain talent because of it. These leaders come to us to help them build and execute their MongoDB training programs, and seek advice on two extremely common yet critical challenges:
How do we get away from the less effective one-size-fits-all approach?
How do we measure the ROI of our training program and connect it to business impact?
How we use MongoDB Skill Scanner to overcome training challenges
Our Professional Services team uses a tool called MongoDB Skill Scanner to address both of these challenges. This tool helps us provide these three benefits to our customers looking to build a training program:
Improve MongoDB proficiency: Teams can use Skill Scanner to quickly and easily assess the MongoDB skill gaps of their team members and gain a comprehensive understanding of their team’s MongoDB skills baseline.
Increase productivity and accuracy: When team members have a comprehensive understanding of MongoDB, they are able to work more quickly and accurately on projects, leading to increased productivity and a higher quality of work.
Save time and money with targeted training: Using Skill Scanner, customers can avoid wasting time and money on trial-and-error learning. Instead, they can focus on improving their skills in a more targeted and efficient way with right-sized training plans.
By leveraging this data, our customers’ engineers can engage in the right training at the right time, targeted for their job role and specific skill shortages. When a training program is built this way, engineers maximize their knowledge retention and minimize time away from their projects.
Skill Scanner includes three role-based assessments, one each for developers, database administrators, and DevOps engineers. Through a series of multiple choice questions, Skill Scanner provides customers with a clear understanding of their level of expertise across a set of technical skills that are critical for success in their role. After submitting the assessment, engineers get results in each skill area indicating whether they are beginner, intermediate, or advanced.
Why data-driven training programs matter
We’ve learned that it’s not enough to just tell teams to go watch training videos or webinars on their own, or to place everyone in the same one-size-fits-all program. Skills gaps vary from team to team, and individual to individual. The one-size-fits-all approach of some programs may not address individual learners' needs, wasting time and making it difficult for them to acquire new skills.
By using Skill Scanner, we’re able to interpret the assessment data to help determine which training courses your team should take. But we don’t only capture this data before doing training; we use Skill Scanner again after training programs are completed to see where immediate improvements have been made. This helps technology leaders prove the impact and ROI of their training, and gives them the confidence that their teams are ready to be successful with MongoDB.
Developing a Precision Learning Program
To go even further, our team can work with you to build a Precision Learning Program (PLP), where we use Skill Scanner data to build learning schedules that are unique to each individual. These schedules include a variety of short, blended learning events such as classes, technical workshops, self-paced exercises, and project coaching. We’ve seen PLP lead to higher knowledge retention and, of course, measurable project results. A customer who recently concluded their PLP saw a 43% increase in knowledge retention.
Getting started building a personalized training program
Skill gaps aren’t a novel problem for IT leaders. But with new digital courses, training, and technologies, the resources to close these gaps are at your fingertips. Skill Scanner and Precision Learning Program have been specifically designed to empower teams by offering targeted training that enhances their understanding of MongoDB. These short training events are carefully crafted to close skill gaps without compromising developer productivity.
We’ve seen a variety of customers use this tool to address their teams’ individual needs, from upskilling new hires to starting projects with new MongoDB products, migrating to MongoDB Atlas, and more. It also saves your business the hours developers would've wasted searching for answers (and developers don’t want to spend their time that way, either).
“We need help getting from point A to point B and feel MongoDB is uniquely positioned to help” (CTO at a large insurance firm)
If you're interested in trying out MongoDB Skill Scanner or want to explore the MongoDB Precision Learning Program further, you can reach out to your account representative or contact us directly.