MongoDB customers and community members are the people who realize GIANT ideas. We are excited to begin highlighting some of our community members, our MongoDB Giants, who are tackling challenging problems and bringing solutions to life with MongoDB.
This month’s MongoDB Giant is Doug Duncan, a DBA at Alteryx, who is making a meaningful impact on the MongoDB community. Alteryx is a leader in data preparation, blending, and advanced analytics.
Doug has worked with RDBMSs for longer than he cares to admit, focusing on both development and administration. Through his experience, Doug grew to embrace new data storage technologies. He began working with MongoDB several years ago (starting with the MongoDB 1.8 release in early 2011), both professionally and recreationally. Doug acted as an online TA for the first two years following MongoDB University’s founding, working closely with the team by answering students’ questions about the M101J, M101JS, and M202 courses, as well as providing questions used in the courses. For his contributions to the education team, Doug was awarded the first-ever MongoDB DBA certification in November 2013.
In his spare time, if Doug’s not reading up on MongoDB, Hadoop, or other distributed data stores, you can find him walking around the foothills of Colorado with his wife, two boys and two dogs.
Become an important part of the MongoDB community. Join our Advocacy Hub and start getting involved today.
Leaf in the Wild: SilkRoute Chooses MongoDB Over SQL Server for Critical Quality Assurance Platform
Leaf in the Wild posts highlight real world MongoDB deployments. Read other stories about how companies are using MongoDB for their mission-critical projects.
MongoDB chosen for development productivity, operational efficiency with Cloud Manager, and “truly outstanding” professional services
From manufacturing to retail, every part of the supply chain is starting to see the value of data. Whether it’s developing IoT quality assurance applications in manufacturing to ensure your products are defect-free or building data-driven customer loyalty programs so that brands can connect with and reward their fans, the top companies are working to improve their approach to data.
SilkRoute Global is a software-as-a-service company focused on this industry. Its analytics products automate processes and present consumable, useful information to its customers. To understand the benefits they get from MongoDB, I spoke with Devin Duden, CTO of OmniSky (a division of SilkRoute) & Senior Software Engineer, and Amjad Hussain, CEO & Chief Data Scientist.
Tell us a little bit about SilkRoute.
SilkRoute is a passionate team of designers, machine learning scientists, and software engineers with tremendous industry knowledge of manufacturing, distribution, and retail. We live for solving big problems. Our industry-specific predictive and prescriptive analytics platform creates immense operational and strategic value for our customers. Our customer footprint is global and growing. Applied machine learning, business process automation, and mobility are woven into the fabric of everything we build. We offer a unique risk-free rapid implementation and integration approach for our customers to enjoy our solutions.
Please describe how you’re using MongoDB.
The application SilkRoute is building is a mobile application performing RFID inspections on industrial manufactured products.
The application provides a centralized data store of customers’ products and the inspections associated with a product, and allows those customers to easily share the inspection records with others. MongoDB was chosen for this application based on: simplified schema design; increased flexibility for modeling complex relationships (e.g., using MongoDB eliminated recursive relationships necessary in a SQL-based solution); easier capture of user-generated data; reduced development timeline; and durability, scalability, and disaster recovery.
Figure: SilkRoute Enterprise mobile RFID inspection architecture
What were you using before MongoDB? Was this a new project or did you migrate from a different database?
The current version of the application is a client-server implementation using SQL Server as a cloud sharing data store and Windows CE on the mobile device. The application is a rewrite.
How did you hear about MongoDB? Did you consider other alternatives, like relational or NoSQL databases?
I was introduced to MongoDB three years ago when I started working at SilkRoute. We were working on a social network at the time, which was using MongoDB as its primary data store. The RFID mobile application’s technical requirements were originally to use MS SQL Server. This technical requirement was provided by the client. During our working Joint Application Design session with the client, we suggested using MongoDB, but didn’t make headway. When we attended MongoDB World 2015, we gathered enough details about MongoDB’s capabilities, along with real-world examples of high-volume, transaction-based applications being developed on MongoDB, that we were able to persuade the client to switch from SQL Server to MongoDB.
Please describe your MongoDB deployment, technology stack, and the version of MongoDB that you are running.
The MongoDB deployment is a 5 node replica set using Cloud Manager for operational management and deployment.
The replica set is deployed in the US East AWS region across all availability zones. At this point, we have not implemented sharding. The MongoDB replica set has been deployed in AWS following MongoDB’s best practices using Amazon Linux AMIs. Each production node will be running on EC2 instances with 16 GB memory and 4 core CPUs, with three 100 GB provisioned IOPS EBS volumes. Each volume is formatted with XFS. One volume is mapped for “data”, one volume is mapped for “log”, and one volume is mapped for “journal”. The API stack is written in .NET 5 using C# MVC/Web API framework. We are using the MongoDB .NET driver version 2.0.
Are you using any tools to monitor, manage, and back up your MongoDB deployment? If so, what? Do you use Ops Manager / Cloud Manager?
The replica set has been deployed and managed using Cloud Manager. Cloud Manager simplified and streamlined replica set deployment and operations. This solution is the first time the majority of team members used MongoDB. To reduce time spent with MongoDB replica set deployment and configuration, Cloud Manager was a great fit. Following Cloud Manager’s directions to create AWS EC2 instances made it very easy for us to create images, and build/tear down replica sets quickly. Streamlining manual tasks allowed the team to focus more time on development than deploying a fully managed MongoDB replica set. In addition to Cloud Manager, the team just started using MongoDB Compass to analyze collections and document sizes.
Are you integrating MongoDB with other data analytics, BI or visualization tools? If so, can you share any details?
At this point we have not integrated any BI. One of our objectives is to connect with the client’s BI system using the MongoDB Connector for BI and/or extract data from a tagged node to hydrate a SQL-based BI system. We’re planning to perform a POC on the Connector for BI, now that it has been released.
How are you measuring the impact of MongoDB on your business?
SilkRoute measures MongoDB’s impact by many factors, including ease of use with deployments, a code first approach, increased agile development model, reduced total cost of ownership, and reduced time to market. The ease of deployments reduces or eliminates maintenance windows when spinning up a replica set or upgrading database versions, which means higher uptime for customers and less productive time eaten up for developers. A code first approach adds to increased savings by eliminating daunting DDL script management and aids with better agile development. These factors result in reduced total cost of ownership and faster time to market.
Do you use any commercial subscriptions or services to support your MongoDB deployment?
SilkRoute is a MongoDB OEM partner. For the RFID application we will be embedding MongoDB Enterprise Server 3.2 and managing the deployment with Cloud Manager. We allocated a budget for MongoDB’s professional services in the early stages of the project. The professional services were tailored to the team’s skill set and agenda. With two separate onsite sessions, we covered topics from deployment, management, and recovery using Cloud Manager, to schema modeling and scaling. The value gained working hands-on directly with a MongoDB consulting engineer was twice the investment. During one session, we encountered a disaster recovery situation in a non-production environment. Unexpected though the situation was, I personally gained the most from the experience of working through the issue with a MongoDB expert in a very collaborative fashion. The professionalism and knowledge of our MongoDB consulting engineer was truly outstanding.
Do you have plans to use MongoDB for other applications? If so, which ones?
Yes, both internal initiatives and client initiatives. These include BI, a Warehouse Manager SaaS solution, a customer loyalty/couponing app, and client SaaS solutions, which we are not at liberty to disclose at this point.
We would prefer to use MongoDB for all application and system development projects. Our preference to use MongoDB for development is based on ease of use, an emphasis on a code first approach for projects going forward, and built-in scalability and durability.
Have you upgraded to MongoDB 3.2? What most excites you about this release?
We’ve been developing the solution using MongoDB 3.0.x. We are actively migrating the database to version 3.2.1, and the production deployment will use 3.2.1. The most exciting features of MongoDB 3.2 for us are the BI connector, document validation, $lookup, and WiredTiger's in-memory option. We feel the biggest value adds for our clients are the BI connector and the in-memory storage engine. The BI connector will allow our clients’ BI environments to integrate directly with the solution we are building, eliminating the need to write ETL processes from MongoDB to a BI environment. The in-memory storage engine will increase the performance of read operations, which will reduce the latency of API requests. Anything to increase overall performance is a plus.
What advice would you give someone who is considering using MongoDB for their next project?
I would highly recommend allocating a budget for MongoDB’s professional services to help with operations, deployment, and schema modeling. The value gained with their best practices approach really reduces learning curves and POC time. Coming from a SQL world, prepare ERDs and break the ERDs into schema designs. This approach will help bridge team members from a relational to a non-relational data store. Take a top-down development approach as it will uncover access patterns that may help with schema modeling.
Thank you for sharing your MongoDB experiences with us!
If you’re comparing MongoDB with relational databases, read our RDBMS to MongoDB Migration Guide to learn more.
About the Author - Eric Holzhauer
Eric is a Product Marketing Manager at MongoDB.
4 Steps to Success: From Surviving with Legacy Systems to Thriving with MongoDB
Legacy data migrations imply a change in the status quo. More often than not, when an organization finally undertakes a thorough analysis of its technology landscape, it arrives at the same decision: to do nothing. It is an understandably daunting task to upgrade or replace 20+ year-old applications and their database counterparts. But there are good reasons, beyond the triennial hardware upgrade, to propel those legacy monoliths of the 1990s into the 21st century.
Companies that prevailed, and even triumphed, in the volatile spring of 2020 were those that transitioned to a more flexible usage model and were therefore able to adjust their business models more rapidly and reliably. MongoDB’s client, Sanoma, was one of the winners. Sanoma was able to scale from 3,000 to 150,000 users within 24 hours, without any service interruption. Innovation and modernization go hand in hand. However, while modernization can sadly occur without innovation, the opposite is simply not possible.
A bit of history
The concept of bringing data together through online data layers (ODL) or operational data stores (ODS) isn't new or specific to MongoDB. Accessing legacy systems, bringing data together, and making it all more easily accessible was a common goal even 20 years ago, and led to the search for the golden source of truth (i.e., the definitive master source for any given entity). This search proved elusive early on due to the hurdles involved in bringing data from diverse, over-structured relational constructs to a sole target called an Operational Data Store (ODS) or Online Data Layer (ODL). The industry’s first attempts began with object-oriented databases, then with the dead end of XML data stores. (In my personal opinion, XQuery and XPath were never meant for real developers.) After both endeavors failed, there came the wave of Apache efforts I like to call “Hadoop Solves the Planet,” in which companies dumped all their structured data onto a big-data treasure trove.
Unfortunately, this resulted in a data desert rather than the data lake everybody was hoping for, since organizations then had to scramble to build a concept for secondary indexing, data dictionaries, and more, on top of having to rebuild the sensible structures they lost. In the 2010s, the document model, in conjunction with JSON notation, emerged as the new de facto standard. MongoDB release 3.x introduced the combination of ACID (atomicity, consistency, isolation, durability) and compliance with a broad range of data types (in BSON, for those in the know). Soon, the MongoDB team started implementing additional features of relational heritage: secondary indexing, ACID transactions, aggregations and manipulations of data in situ, materialized views, joins, unions... the list goes on.
Where we are now
MongoDB documents can be enriched through different means and channels without touching the content; the consistency of all data and data lineage is implicitly guaranteed. A typical example is the extraction of a delivery address through a supply chain application and a billing address through an enterprise resource planning system. In many cases, those two systems have different requirements. MongoDB documents simply keep both instantiations intact and can even hold multiples of each attached to one single client profile, without the need for complete loads and transformations, foreign keys, and all the other ingredients of the relational past. MongoDB simply adds and leverages other sources without destroying their context. MongoDB delivers an ODS and ODL experience while streamlining the time-consuming journey of replacing legacy application code. The data platform of true modernization and innovation has arrived!
How your company can get here
The entire journey can be summarized in four simple steps:
Analysis: Where do I start my data journey to drive the fastest value?
Scaffolding: How do I get my data out of the existing platform and bridge it to the new platform?
Coding: How do I enter the world of adjusting and adapting my applications landscape?
Innovation: Which are the easiest targets for my company to start achieving true innovation?
The following sections answer these four questions and provide you with a starting point for your journey toward a new and improved solution landscape.
Step 1: Analysis of your existing solution landscape
Data provisioning
Data provisioning (the act of bringing data from source systems to a target system) is actually the easy part of this step. Opinions may vary as to the very best approach, but most existing models for streaming data in real time make the process elegant and allow for a business-driven choice anywhere between real-time replication at one end and batches of CSV files at the other.
Application onboarding
More exciting is the application onboarding phase, including the selection and design of the initial data domains. Here, simple mechanisms derived from the classic priority concepts can assist (and yes, they existed long before computers). Data domains already exist in the business logic, represented as objects in the various programming languages. But even the most talented application developer deals with constant changes, which lead to compromises in those objects and can obscure the original clarity of their design, so the objects may hide in plain sight. Unearthing those gems and aligning them to the ODS is the most important step toward true legacy modernization. The simplest solution is actually the most practical one: load an object with the existing software and persist it into a MongoDB collection. Persisting the object takes two lines of code that can easily be added.
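As a rough illustration of those two lines, here is a minimal Python sketch using the PyMongo driver. The `ClientProfile` and `Address` classes, the `legacy_ods` database, the `client_profiles` collection, and the localhost URI are all hypothetical names chosen for this example. It also assumes the object is a dataclass, which in Python has to be turned into a dictionary before it is stored; drivers for other languages may handle that step differently.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Address:
    kind: str    # e.g. "delivery" or "billing"
    street: str
    city: str


@dataclass
class ClientProfile:
    client_id: str
    name: str
    addresses: list = field(default_factory=list)


def persist(profile, uri="mongodb://localhost:27017"):
    """Persist an already-built object; names and URI are placeholders."""
    # Imported here so the rest of the sketch runs without the driver installed.
    from pymongo import MongoClient

    # Line 1: open a connection and pick the (hypothetical) collection.
    collection = MongoClient(uri)["legacy_ods"]["client_profiles"]
    # Line 2: persist the object as a single document.
    return collection.insert_one(asdict(profile)).inserted_id


# The object as the existing application would have built it.
profile = ClientProfile(
    client_id="c-1001",
    name="Example Client",
    addresses=[
        Address("delivery", "1 Main St", "Denver"),
        Address("billing", "PO Box 7", "Boulder"),
    ],
)

# The nested structure survives as-is: both addresses stay inside the profile.
document = asdict(profile)
print(document)
```

Calling `persist(profile)` against a running MongoDB instance would store the profile, with both addresses embedded, as one document.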
The location of the two lines of code (the first opens a connection to the database; the second persists the object) does not matter, as long as they come after the object has been built. This is the first time you will see the beauty of MongoDB and MQL at work. You really have to do nothing for the object itself (for example, no decomposition or abstraction layer). MongoDB takes care of it for you. When looking at the object in the MongoDB database, e.g. using MongoDB Compass, you will realize that it already looks a lot like the domain object you wanted. The actual task of mapping objects to domains, or subsets of domains, is now mostly driven by the application use case.
Tip: How to leverage application mapping to accelerate onboarding
In the model below, which was taken from the financial industry but can easily be adopted across industries, we identify the data domains in various applications and map their behavior to the effort it takes to locate them as well as their importance to the app. First, each domain gets a rating for its object complexity, where “complexity” is defined by the implementation team. This is similar to the concept of “planning poker” in a development sprint. Second, each data domain must be located in the application content. Then, it’s tally time.
As we can see in the example above, the concept of schedules looks quite easy but is superseded by the client profiles, which have a touch more application context (spoiler: those always come out on top). Based on the combination of complexity and the number of data domains affecting an application, we can now easily achieve the model below. Agile is your friend and, assuming a certain “point capacity,” the applications fall into place for their conversion schedule in a quite neutral fashion. The development team will then start with the low-hanging fruit. As soon as applications 1, 6, and 7 are ported, we’re in business in a new modern landscape.
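The tally described above can be sketched in a few lines of Python. The domain ratings, application names, and point capacity here are invented purely for illustration and are not taken from the model in this post. The greedy sprint-filling rule is likewise just one assumed way to turn the scores into a schedule.

```python
# Hypothetical complexity ratings per data domain, as agreed by the team.
DOMAIN_COMPLEXITY = {
    "schedules": 2,
    "client_profiles": 5,
    "accounts": 3,
}

# Hypothetical mapping of applications to the data domains they touch.
APP_DOMAINS = {
    "app1": ["schedules"],
    "app2": ["client_profiles", "accounts"],
    "app3": ["schedules", "accounts"],
}


def score_applications(app_domains, domain_complexity):
    """Tally each application's points as the sum of its domains' ratings."""
    return {
        app: sum(domain_complexity[domain] for domain in domains)
        for app, domains in app_domains.items()
    }


def plan_sprints(scores, point_capacity):
    """Fill sprints greedily, lowest-scoring (easiest) applications first."""
    sprints, current, used = [], [], 0
    for app, points in sorted(scores.items(), key=lambda item: (item[1], item[0])):
        # Start a new sprint when the next application would not fit.
        if current and used + points > point_capacity:
            sprints.append(current)
            current, used = [], 0
        current.append(app)
        used += points
    if current:
        sprints.append(current)
    return sprints


scores = score_applications(APP_DOMAINS, DOMAIN_COMPLEXITY)
sprints = plan_sprints(scores, point_capacity=8)
print(scores)
print(sprints)
```

With these made-up numbers the two simplest applications land in the first sprint and the one built on client profiles follows in the second. An application whose score exceeds the capacity simply gets a sprint to itself in this sketch.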
Along the journey, the domains will get cleaned up naturally, as we do not have the static corset of the RDBMS table designs.
Step 2: Scaffolding
Scaffolding is the art of building a bridge that can hold people as they cross it, then immediately dissipate once they step off. But for that critical time, it needs to hold. The same is true for the connectivity between a legacy system and a new data platform. Starting with the first sprint, we have data residing in the MongoDB data platform. If the data is limited to new applications and resides exclusively in MongoDB, nothing needs to be done. However, as shown in the client profiles example above, there may be dependencies to consider. The synchronization between the legacy database and the new MongoDB platform can be easily arranged using microservices and the same concepts used for the initial loading of data. Synchronization can also be achieved through “the gate” if only READ data is needed during the first sprint, or if you’re already dealing with WRITE and the requirement to synchronize those writes back to a legacy system.
Streaming: A streaming-based solution is a great option for unidirectional operations that need read-only access in the simplest way.
Service: Selecting a simple, tiny microservice is a good option for the use case where data needs to be selectively written. It works using the document model on the MongoDB side, but can still push necessary updates back to the legacy system, and vice versa. The great news is that this service potentially exists already, as it requires nothing more than using the old database interface from the legacy application on one side and the new, easy-to-digest JSON document format on the MongoDB side. If both databases are ACID-compliant, any transaction is automatically treated as a normal application interaction on both sides.
“Y-Loader”: Another option is a true “Y-loader,” where all transactions are written in sync to both databases in parallel, and the actual transaction is only considered committed when both systems report their commit and completion. Simple two-phase protocols (write to both, wait five seconds, read both to validate and, if in sync, commit to application) are available as ready-made services through various distributed transaction coordinators, but often it’s easier to use the existing data access in the application. In that case, the new data path to MongoDB runs in parallel, and a simple redundant checkpoint (which the application logic would have had for the legacy path anyway) is expanded for this purpose.
Step 3: Coding
Working with the new domain data model, with MongoDB’s flexible document model as the underlying base, will immediately impact the coding of business logic and application development. The operative word is immediately. As the data gets unlocked with the initial persistence of the code object to the MongoDB collection, the developer is simultaneously able to code based on business requirements. Developers will no longer be hindered by the references and requirements of object mappers. As the objects are represented through the MongoDB idiomatic drivers, each programming object resides directly in the data collection; in reverse, any changes to the business logic object will be naturally represented, code-free, in the MongoDB collection.
A single blog post can't resolve all open questions and edge cases. Each application, client, and data interface is unique. Databases possess historic technical debt and implicit assumptions that become lost in generations of developers over time. “Do not touch this section; not sure what it does, but last time we tried all hell broke loose…” is often-heard advice around the organizational water cooler. But the key lesson?
There are many different templates available and very simple methods for quickly getting to significant success. For example, a German client, who was stuck in a combination of IBM DB2 (mainframe and distributed) with a significant Hadoop footprint, was amazed when they realized they could “lift” their data one microservice at a time. This resulted in business requirements shifting from “impossible to do” for some requested queries to “completed in under one second” within a single week of a proof-of-concept. This is no exception. Cases and changes like these are made daily, reinforcing Mark Twain’s sage advice that “The secret of getting ahead is getting started.”
Step 4: Innovation
As the migration from the legacy environment continues, innovation will be the new focus. The unlocking of previously siloed data allows immediate coupling of real-time data with machine learning platforms for various purposes: e.g. scoring for financial decision-making, personalization for retail, or optimization of production processes in the IoT context. New applications and solutions can easily be created on top of the unleashed data, even with various programming languages, direct real-time dashboards created with MongoDB Charts, and different paradigms (again, MongoDB’s idiomatic drivers do magic!). At this time, the discussion with the product owners in your squads and tribes (trying to be real modern here) begins with the questions “What is the highest priority component to change?” and “What function is required to enable this change?”
Is it worth waiting much longer? The real question is: why did we all not start sooner? It’s time to begin integrating the list of features you always dreamed of having, but never dared to pursue. The MongoDB team is here to help you get started. Reach out today and let’s discuss the best path forward. To learn more about modernizing to MongoDB, click here.