MongoDB Blog

Articles, announcements, news, updates and more

Turning Data Points Into Actionable Insights: Meet May Hoque

Imagine the interesting insights you could glean from combining multiple data sources with one tool that helps you easily analyze data over time. May Hoque is a senior software engineer on MongoDB’s Atlas Data Federation team, where he helps create a distributed, federated query engine that can query across data stored in multiple sources. Keep reading to find out more about his experience joining MongoDB as an intern and new grad, then continuing to grow his career here over the last four and a half years.

Jackie Denner: Thanks for sharing more about your experience today, May! To start, will you give an overview of your software engineering background and how you started working with MongoDB?

May Hoque: I began exploring computer science in a high school class. The class was rudimentary, but I had fun learning how to build programs. I chose computer science as my university major because it felt like a career I could grow with that both piqued my interest and offered long-term stability. I am currently a senior software engineer on MongoDB’s Atlas Data Federation team. I first joined MongoDB in 2017 as an intern, then returned after graduation to participate in the New Grad Program in 2018, which gave me an opportunity to rotate between three different teams at MongoDB over my first six months. I originally joined the BI Connector team, but then switched to the Atlas Data Federation team.

JD: Tell me more about the Atlas Data Federation product.

MH: Atlas Data Federation is a distributed, federated query engine at its core. This core enables users to query multiple data sources with a single query, from a single interface. Other MongoDB products, including Atlas Online Archive and Atlas Data Lake, use this core as a building block for their own functionality. The Atlas Data Lake product, for example, orders and organizes data to optimize for super fast queries even as the user's data sources grow in volume. The ability to perform complex queries, even across multiple data sources, unlocks valuable benefits for a variety of use cases, for example maintaining the ability to easily query less frequently used data even after archiving it from pre-existing database clusters to less expensive locations.

JD: What makes Atlas Data Federation unique?

MH: We’re more than just a search function — we can also store your data and organize it in a way that makes it really fast to actually answer those questions. Its integration with Atlas and the larger MongoDB ecosystem widens the scope of the value users can get from their databases. It’s convenient and operationally simple to have all of your solutions to different challenges in the same place. MongoDB Atlas Data Lake allows developers to easily store and analyze large amounts of data in a cost-effective and scalable manner without having to worry about the underlying infrastructure.

JD: Talk me through some example use cases your team supports.

MH: The real value in large data sets lies in understanding the trends and relationships between the data points. There are endless possibilities for how organizations can use Atlas Data Federation to draw insights that motivate strategic business decisions, from answering questions about specific events to aggregating insights across a group of data points. Atlas Data Lake stores and organizes your data in a way that makes it really fast to answer questions related to your collection of data. Teams across an organization can benefit from more insight into data learnings.
A marketing team may want to know what percentage of their users have spent more than a specific amount on a single item, including supporting data like what the item was and when they purchased it. An investor may want to know how much profit an organization made over a specific time period. A product team may want to look at historical sales data from past product launches. Users can answer all of these questions and more with a query on Atlas Data Federation.

JD: What projects are you currently working on?

MH: I am contributing to a bigger MongoDB initiative to add more sources of data. Adding this support to Atlas Data Federation and Data Lake will make our service available to new clients who want to use the product but currently can’t. I’m also working on a high-level systems design challenge to rearchitect our systems to scale and improve our service for our customers.

JD: Let’s talk about what it’s like to work at MongoDB. What makes the team and product exciting to work on?

MH: The Atlas Data Federation team is primarily focused on problems relating to complex distributed systems and database engineering. These challenges aren’t often easy to work on, but the careful and rigorous thinking needed to solve them is exciting and rewarding. Plus, the solution to the data lake problem is in demand, and the projects we work on are relevant to the industry.

JD: What is the overall engineering culture like at MongoDB? What opportunities have inspired you to grow here?

MH: My experience on the team has contributed to my growth as an engineer. I’ve noticed a strong culture of learning, mentorship, and diversity, both on the Atlas Data Federation team and at the company at large. I appreciate that our team has a wide spectrum of experience levels, from new grads to engineers with decades of experience. The team is collaborative and takes pride in supporting each other. Whether I work on a project independently or with a group of engineers, I’m never working solo. I always have the support of the team and people to bounce ideas off of throughout a project, which creates opportunity for growth.

JD: Why should someone join the Atlas Data Federation team?

MH: If you're someone who really likes technical challenges or you just want to solve really cool problems, we have no shortage of them to work on. If you’re focused on growth, we have opportunities for all levels of experience. It is possible to grow from an intern to a manager on our team because of the mentorship and breadth of projects available to work on, which I’ve seen happen for some of my colleagues. Our team environment is built on empathy and collaboration.

JD: What stands out to you about your overall experience working at MongoDB compared to your past experiences?

MH: After a few years on the team, I'm still consistently growing my skill set and working on interesting, fun projects – two primary reasons I continue to work at MongoDB. The problems the Atlas Data Federation team works on provide me useful experience that I can apply to future projects and challenges. If you’re looking to collaborate with forward-thinking teams on interesting use cases, MongoDB is one of the best tech companies to work for.

Interested in transforming your career at MongoDB? View open roles on our teams across the globe.

February 7, 2023
Culture

MACH Aligned for Retail (Microservices, API-First, Cloud Native SaaS, Headless)

Across the retail industry, MACH principles and the MACH Alliance are becoming increasingly common. What is MACH, and why is it being embraced for retail?

The MACH Alliance is a non-profit organization fostering the adoption of composable architecture principles. MACH stands for Microservices, API-First, Cloud-Native SaaS, and Headless. The MACH Alliance’s manifesto is to “future proof enterprise technology and propel current and future digital experiences.” The MACH Alliance and the creation of this set of principles originated in the retail industry. Several of the five co-founders of the MACH Alliance are technology companies building for retail use cases: for example, commercetools is a composable commerce platform for retail (built completely on MongoDB).

MongoDB has been a member of the MACH Alliance since 2020 as an “enabler” member, meaning use of our technology can enable the implementation of the MACH principles in application architectures. This is because a data layer built on MongoDB is ideal as the basis for a MACH architecture. Members of our Industry Solutions team sit on the MACH technology, growth, and marketing councils, and are actively involved with furthering the adoption of MACH across the retail industry.

What is MACH, and why is it important for retail?

The retail industry has long been a fast adopter of technology and a forerunner in technology trends. The competitive nature of the business drives innovation: it's vital that retailers are able to react quickly to new technologies (e.g., NFTs, VR, AI) to capture market share and stay ahead of competitors. Retailers have realized that to deliver new, value-add experiences to their customers, they have to cut back on the operational overhead that leads to increased cost and on building standard functionality that can either be bought or re-used. This is where the benefits of MACH come in: it's all about increasing the ability to deliver innovation quickly while lowering operational costs and risk.

Microservices: An approach to building applications in which business functions are broken down into smaller, self-contained components called services. These services function autonomously and are usually developed and deployed independently. This means the failure or outage of one microservice will not affect another, and teams can develop in parallel, increasing efficiency.

API-First: A style of development where the sharing and use of data via APIs (application programming interfaces) is considered first and foremost in the development process. This means that services are designed to aid the easy sharing of information across the organization and simple interconnectivity of systems.

Cloud-Native SaaS: Cloud-native SaaS solutions are vendor-managed applications developed in and for the cloud, leveraging all the capabilities the cloud has to offer, such as fully managed hosting, built-in security, auto-scaling, cross-regional deployment, and automatic updates. These are a good fit for a MACH architecture, as adopting them reduces operational costs and frees up developers for value-add work like new, unique customer experiences.

Headless: Decoupling the front end from the back end so that front ends (or “heads”) can be created or iterated on with no dependencies on the back end. The fact that the layers are loosely coupled decreases time to market for new front ends and encourages the re-use of back-end services for multiple purposes.
It also de-risks change in the long term, as services can function independently.

Where does MongoDB come in?

MongoDB is an enabler for MACH, meaning that using MongoDB as your data layer helps retailers and retail software companies achieve MACH compliance. Our data model, architecture, and functionality empower IT organizations to build in line with these architecture principles.

During a digital transformation, where a retailer is modernizing a monolith into a microservices-based architecture, they're looking for a data layer that will enable speed of development and change. MongoDB is the "most wanted" database four years running in Stack Overflow's developer survey. This is because our document model maps to the way developers are thinking and coding, and its flexibility allows for iterative change of the data layer.

When looking at API-based communication, the standard format for APIs is JSON, which again maps to MongoDB's document model. The idea with API-first development is to develop with the API in mind, so why not store the data the way you're going to serve it by API? This reduces complexity and increases performance.

Cloud-native and SaaS products have become the norm as retailers wish to reduce maintenance and management work. MongoDB Atlas provides a database-as-a-service, guaranteeing 99.995% uptime, providing automatic failover and self-healing, and allowing DevOps engineers to spin up databases in minutes, either manually or by API/script. Many retail software companies are also built on MongoDB Atlas: for example, commercetools, which provides an ecommerce solution as a SaaS product.

Headless architectures require a data layer that is able to adapt and change for new workloads. The ability to change the schema at runtime, with no downtime, makes MongoDB's document model ideal for this. Performance and the ability to scale for new "heads" are also important. MongoDB is known as a high-performance database and can scale vertically automatically or scale out horizontally seamlessly.

So MongoDB is a great choice for retailers choosing to adopt a MACH architecture (see Figure 1 below). As a general-purpose database with high performance, a rich, expressive query language, and secondary indexing, MongoDB is a really good fit as a data layer, capable of handling both the operational and analytical needs of the application.

Figure 1: Example of a MACH architecture

Want to know more? Are you interested in a transition to MACH? Dive into our four-part blog series exploring each topic in detail and how MongoDB supports each of these principles: Microservices, API-First, Cloud-Native SaaS, and Headless.
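As a small, concrete illustration of the API-first point above (storing data in the same JSON shape the API serves), here is a brief mongosh sketch; the collection and field names are hypothetical, not from any specific retailer:

// A hypothetical product document, stored in the shape the API returns it
db.products.insertOne({
  sku: "SKU-1001",
  name: "Trail Running Shoe",
  price: { amount: NumberDecimal("89.99"), currency: "EUR" },
  inventory: [ { store: "berlin-01", qty: 12 }, { store: "online", qty: 240 } ]
})

// The API handler can return the document as-is, or project only what it needs
db.products.findOne({ sku: "SKU-1001" }, { _id: 0, name: 1, price: 1 })

Because the stored shape already matches the API payload, there is no mapping layer to maintain when a new front end or service needs the same data.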

February 1, 2023
Applied

Cosmos SQL Migration to MongoDB Atlas

Azure Cosmos DB is Microsoft's proprietary globally distributed, multi-model database service. Cosmos DB supports a SQL interface as one of its models, in addition to the Cosmos DB API for MongoDB. Even customers using the SQL interface choose Cosmos DB for the document model and the convenience of working with a SQL interface. We have seen customers struggle with scalability issues and costs on Cosmos DB and want to move to MongoDB Atlas.

Migrating an application from Cosmos DB SQL to MongoDB Atlas involves both application refactoring and data migration from Cosmos DB to MongoDB. The current tool set for migrating data from Cosmos DB SQL to MongoDB Atlas is fairly limited. While the Azure Data Migration tool can be used for a one-time export, customers frequently need zero downtime for migrations, which the Data Migration tool cannot satisfy: all writes into the source Cosmos DB SQL database must be discontinued before the data migration can be performed. This puts a lot of pressure on the customer in terms of downtime requirements and planning out the migration.

PeerIslands has built a Cosmos SQL migrator tool that addresses these concerns. The tool provides a way to perform Cosmos SQL migration with near-zero downtime. The architecture of the tool is explained below.

Initial Snapshot

The tool uses the native Data Migration tool to export data as JSON files from the Azure Cosmos DB SQL API. The Data Migration tool is an open-source solution that imports and exports data to and from Azure Cosmos DB. The exported JSON data is then imported into MongoDB Atlas using mongoimport.

Figure 1: Initial Snapshot processing stages.

Change data capture

Using the combination of the above tools, we complete the initial snapshot. But what happens to documents that are updated or newly inserted during the migration? Just prior to the initial snapshot process being started, the migration tool starts the change capture process. The migration tool listens to the ongoing changes in Cosmos DB using the Kafka Source Connector provided by Azure and pushes the changes to a Kafka topic. Optionally, KSQL can be used to perform any transformation required. Once the changes are in Kafka, the migration tool uses the Atlas Sink Connector to push the ongoing messages to the Atlas cluster. Below is a diagram depicting the flow of change stream messages from Cosmos SQL to MongoDB.

Figure 2: The flow of change stream messages from Cosmos SQL to MongoDB

The Cosmos SQL migration tool provides a GUI-based, point-and-click interface that brings together the above capabilities for handling the entire migration process. Because the tool is capable of change data capture, it provides a lot of flexibility for migrating your data without any downtime.

Figure 3: Cosmos SQL migration tool dashboard

In addition to data migration, PeerIslands can help with the complete application refactoring required to migrate off the Cosmos SQL interface. Reach out to partners@mongodb.com if you need to migrate from Cosmos SQL to MongoDB Atlas.
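For readers who want a sense of the final hop into Atlas, a Kafka Connect sink definition along the lines described above might look like the following sketch; the connection string, topic name, and namespace are placeholders, and the migrator's actual configuration may differ:

{
  "name": "cosmos-migration-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "connection.uri": "<< MONGODB ATLAS CONNECTION STRING >>",
    "database": "test",
    "collection": "orders",
    "topics": "cosmos.changes.orders"
  }
}

The sink connector simply consumes the change messages from the Kafka topic and writes them to the target Atlas collection, which is what allows the migration to keep running while the source remains live.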

January 31, 2023
Applied

Using Change Point Detection to Find Performance Regressions

At MongoDB, we want to (honestly) tell our users that each new version of our software is faster than the previous version. We also want to be able to explain why. We definitely do not want to learn that a release is slower (that we have a performance regression) from our customers telling us after discovering it for themselves. In order to do this, we need to understand the performance of our software, detect performance changes early, and aggressively address the root cause.

We have invested significantly in building a performance testing system to achieve these goals. This includes creating a large number of performance tests, automating the running of those tests, and building tools to diagnose performance regressions when we find them. Those tools and tests are not enough by themselves: they produce an overwhelming amount of data, and that data needs to be analyzed to determine if the performance changed. We could not process it all. So we have developed new tools to process the data, using advanced statistical techniques to detect real performance regressions and identify the causes of those regressions.

Where we started

We built our original performance testing system in 2015. It ran a collection of performance tests directly in our CI system (Evergreen). We automated every step of running a test and collecting the results. That left the hard part: making sense of the results.

Computers are fascinating things, built up from a huge number of simple and deterministic components. However, the interactions between those simple components lead to the emergence of non-deterministic behavior. As computers get more complex, the emergent behavior becomes more pronounced. The net effect is that when you run a program twice, the two executions will differ (i.e., one may take longer), even when run on the same machine. The problem gets even harder when you go from running on a single computer to multiple computers in a distributed system. Network latencies will vary depending on the state of the network switches and other traffic on the network. Each computer's variability combined with the variability of the network leads to even more variability.

MongoDB is a distributed system. When we test the performance of MongoDB, we have to address all of these issues. For performance tests, these differences show up as different measurements of performance. Your program may take more or less time to run. It may execute more or fewer operations within a period of time. You may see more or fewer slow operations. We call this phenomenon run-to-run variation, or measurement noise.

Run-to-run variation makes it harder to determine if changes to the software made the software intrinsically faster or slower. Thus, we did an enormous amount of work to limit the measurement noise in our tests, both in the original project and in subsequent projects. Still, no matter how hard anyone tries, there will always be run-to-run variation. This presents a challenge when we want to interpret our performance results (or when you want to interpret yours). Maybe we are comparing two versions of our software and want to know which one is faster. If we have results that are 5% faster on the new version, is that due to our software being 5% faster? Or is the 5% due to run-to-run variation? Or worse, is the 5% change due to 10% run-to-run variation combined with our software actually being 5% slower?

When we started, we only had a few performance tests.
We manually inspected the results and could understand if and when the performance changed. However, as we added more tests, and more results per test, human inspection became less effective: we missed things, and it was hard and unsatisfying work.

We automated comparing the performance of one version of the software to another very early in the development of our system. We wrote software to compare the new performance results to older performance results. If the results changed more than 10%, we flagged it and had a human look at it. Using a direct comparison was common practice in the industry. It was also awful. The comparisons missed small regressions, they flagged a lot of false positives on noisier tests, and sometimes they flagged real things, but at the wrong time. The automated comparisons were much better than manual inspection, but still awful. We continually built improvements to make the system less awful. We had a system to increase the comparison threshold (from 10%) for noisier tests, and a system to reset the comparison when there was a change in performance (i.e., compare to the new normal). These changes improved the system, but they did not fundamentally overcome the challenges we faced.

Solving the right problem

Along the way, we realized we were trying to solve the wrong problem. Our automated comparison was answering the question: “Has measured performance changed more than 10% between these two versions of software?” What we really wanted to answer was: “Which software changes altered performance (for better or worse)?” Those two questions overlap for large performance changes in low-noise environments, but they differ on noisy tests or for small changes in performance.

The second question (“Which software changes altered performance?”) focuses on detecting changes in a measured value over time. This maps to a known problem called change point detection: the problem of finding when changes in values occurred over time (in a time series) in the presence of noise or other confounding variables. For example, it’s used to detect changes in behavior in such things as electricity consumption, population totals, local weather, and stock prices. There’s a lot of existing work on change point detection, so we just needed to pick the best existing work, implement it, and put it into production. Simple, right?

Well, maybe not. We did not know what the best existing work was, and we did not know if it would fix our problems. So we did some research, identifying likely techniques and collecting papers on them. The papers accumulated and stayed on my desk, because I didn’t have time to dive into a speculative project when there were plenty of things that needed to be done NOW.

Enter an intern

During the summer of 2017, two interns joined us on the performance team. They spent the summer working with us on our performance testing infrastructure. Both of them were great, giving our work an extra push forward. We encourage our interns to learn and grow. One way we do this is by explaining what we are doing and why we are doing it. We explain the larger context of the work. This naturally leads to discussing open challenges. One of our interns asked if they could read that stack of papers sitting on my desk (of course they could). Towards the end of the summer, he had completed his summer project early. Further, he had read the papers, understood them, and asked if he could make a prototype!
In particular, he had gone through the complex math of the papers and figured out how that math could be implemented in software. He built a prototype. It was limited, but it proved that the concept could work. The algorithm clearly found the changes in the sample traces we created, and did not get confused when run on sample data containing random background noise. Based on this initial success, we scheduled a larger proof-of-concept project to integrate the algorithm with our production system. We compared this second proof of concept with the existing comparison code and determined it was MUCH better. We then did the work to get the algorithm into production and update our processes to use it.

Our production system today

When we started in 2015, we ran only a handful of tests, and only a handful of people used the performance infrastructure directly. Today we run hundreds of distinct performance tests, generating over 100k distinct results per software commit. Today, everyone who develops MongoDB interacts with our performance testing infrastructure. When a developer commits a change to MongoDB, tests are run. Upon completion, change point detection is used to detect performance changes (improvements and regressions). A dedicated team triages these changes, isolates them to specific commits, and assigns them to developers to investigate. In the case of improvements, the developers confirm that the change was expected, or investigate the change to understand why the performance got better. Sometimes things get faster because of bugs – we have found bugs this way.

Trend graph for a performance test in MongoDB. The green diamond marks the detected change point that has been triaged and confirmed. This was a recent 15% improvement in bulk insert performance for sharded clusters.

Our system is good at detecting regressions, and our engineers are good at fixing them. Even better than fixing a regression is preventing a performance regression from ever being committed to our development branch. Developers can test their proposed changes before committing them, using something called a patch build. In this way, developers can make sure they are not introducing new performance regressions, verify a fix, or confirm an optimization before committing their code.

Advancing science!

At MongoDB we take pride in developing a database and a database platform that empowers developers to make applications that change the world. We depend on our performance testing infrastructure to ensure we ship a performant database. We are proud of the performance infrastructure we have built and the impact it has had on the software we ship to our users.

We do not do any of this work in a void. At MongoDB we benefit from being part of several communities, and we want to support these communities. It is for this reason that most of our database source code is publicly available and our JIRA project for database development is also public. When we developed a new way of finding performance regressions in our software, we didn’t hide it away. Instead, we shared it with the community, and will continue to do so as we learn and progress. This started with submitting a paper called “The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System” to the International Conference on Performance Engineering (ICPE).
It has continued with more papers (Creating a Virtuous Cycle in Performance Testing at MongoDB, Automated System Performance Testing at MongoDB) and presentations. These talks and presentations have helped the community, but they have also helped us. By sharing and participating in the community, we have more people thinking about our problems. We’ve had the best minds in performance engineering in academia sharing ideas and suggestions with us on how to improve our technology!

Often the ideas build on each other. One such idea led to the creation of the Data Challenge Track at ICPE in 2022. Building on our papers, we were able to open up our performance test results as a shareable artifact. The data challenge itself was simple: do something interesting with our performance test data. Researchers were thrilled to have industry data to evaluate and demonstrate their ideas. We were thrilled to have researchers working on our problems. In the end, it led to four strong papers which have impacted how we test performance at MongoDB.

We continue to work on sharing our data and learnings. We have an ongoing collaboration within the SPEC Research Group to create better datasets and algorithms for detecting performance regressions. The group is combining our data with other industry datasets and curating the data. The results will enable researchers to understand the performance and accuracy of current algorithms, test new algorithms, and clearly show any improvements, all using real industry data from us and other companies. In each of these interactions, the community wins and we win. By sharing our data we enable better research, we get to take advantage of that research, and the research is better aligned with our needs.

Investing in the future

Of the two interns mentioned in this post, one is now a full-time employee of MongoDB, and the other is pursuing a Ph.D. in computer science at Columbia. One is directly improving our software, and the other is improving the theory and tools we use to build our software. We are very proud of both of them.

The MongoDB database is faster today because of their work on our performance testing infrastructure. Thanks to that infrastructure, we better understand why the database performs the way it does, why that performance changes, and when that performance changes. We continue to invest in and improve this critical piece of our infrastructure. We have teams dedicated to extending and improving it. We lean into our academic interactions to improve the state of the art for everyone. And we invest in the people who work on these systems (interns included).

We hope you consider using these techniques yourself and letting us and the community know how it goes for you. If you are an academic, please improve the theoretical underpinnings of this entire space – we are happy to talk to you about it. And if the problems and software described in this post sounded interesting to you, we are hiring! Come join us and help us solve these problems.
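If you want to experiment with the basic idea on your own metrics, here is a deliberately minimal sketch (not the E-Divisive approach our production system uses, and with made-up sample numbers) that finds the single split point that best separates a noisy series into two segments with different means:

function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Return the index whose split produces the largest difference in segment means.
function detectChangePoint(series) {
  let best = { index: -1, score: 0 };
  for (let i = 2; i <= series.length - 2; i++) {
    const left = series.slice(0, i);
    const right = series.slice(i);
    const score = Math.abs(mean(left) - mean(right));
    if (score > best.score) {
      best = { index: i, score };
    }
  }
  return best;
}

// Example: throughput drops from roughly 1000 ops/sec to roughly 900 ops/sec at index 5.
const results = [1002, 998, 1005, 997, 1001, 903, 899, 905, 897, 902];
console.log(detectChangePoint(results)); // { index: 5, score: ... }

A real system also has to handle multiple change points, decide how large a score is significant given the noise, and avoid re-flagging the same change on every new commit, which is exactly what the papers listed below address.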
If you would like to learn more about our performance testing environment, check out some of our papers and presentations:

Papers

ICPE 2020: The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System
DBTest.io 2020: Automated System Performance Testing at MongoDB
ICPE 2021: Creating a Virtuous Cycle in Performance Testing at MongoDB

Presentations

ICPE 2020: The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System [Video] [Slides]
ICPE 2021: Creating a Virtuous Cycle in Performance Testing at MongoDB [Video] [Slides]
CMU Database Seminar Series: How to Waste Time and Money Testing the Performance of a Software Product [Video] [Slides]
Performance Advisory Council 2021: Creating a Virtuous Cycle in Performance Testing [Slides]

January 30, 2023
Engineering Blog

New Aggregation Pipeline Text Editor Debuts in MongoDB Compass

There’s a reason why Compass is one of MongoDB’s most-loved developer tools: it provides an approachable and powerful visual user interface for interacting with data on MongoDB. As part of this, Compass’s Aggregation Pipeline Builder abstracts away the finer points of MongoDB’s Query API syntax and provides a guided experience for developing complex queries. But what about when you want less rather than more abstraction? That’s where our new Aggregation Pipeline Text Editor comes in.

Recently released in Compass, the Aggregation Pipeline Text Editor allows users to write free-form aggregations. While users could previously write and edit pipelines through a guided, structured builder organized by aggregation stage, a text-based builder can be preferable for some users. This new pipeline editor makes it easy for users to:

See the entire pipeline without having to excessively scroll through the UI
Stay “in the flow” when writing aggregations if they are already familiar with MongoDB’s Query API syntax
Copy and paste aggregations built elsewhere (like in MongoDB’s VS Code Extension) into Compass
Use built-in syntax formatting to make pipeline text “pretty” before copying it over from Compass to other tools

The Aggregation Pipeline Text Editor in Compass. Notice how toward the top right you can click on “stages” to move back to the traditional stage-based Aggregation Pipeline Builder.

Ultimately, the addition of the Aggregation Pipeline Text Editor to Compass gives users more flexibility depending on how they want to build aggregations. For a more guided experience, and to get result previews when adding each new stage, the existing Aggregation Pipeline Builder will work best for most users. But when writing free-form aggregations or copying and pasting aggregation text from other tools, the Aggregation Pipeline Text Editor may be preferable. It also previews the final pipeline output, rather than the stage-by-stage preview that exists today. Users will be able to access both the traditional Aggregation Pipeline Builder and the new Pipeline Text Editor from directly within the Aggregations tab in Compass and can switch between the two views without losing their work.

To get access to the new Aggregation Pipeline Text Editor, make sure to download the latest version of Compass here. And as always, we welcome your continued feedback on how to improve Compass. If you have ideas for how to improve your experience with Compass, you can submit them on our UserVoice platform here. We’ll have even more great features coming in Compass soon. Keep checking back on our blog for the latest news!
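As a quick illustration of what "free-form" means here, a pipeline like the one below (the collection fields are made up for the example) is the kind of text you can type or paste straight into the new editor as ordinary Query API syntax:

[
  { $match: { status: "complete", total: { $gt: 100 } } },
  {
    $group: {
      _id: "$customerId",
      orders: { $sum: 1 },
      revenue: { $sum: "$total" }
    }
  },
  { $sort: { revenue: -1 } },
  { $limit: 10 }
]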

January 26, 2023
Updates

5 Ways to Learn MongoDB

MongoDB offers a variety of ways for users to gain product knowledge, get certified, and advance their careers. In this guide, we'll provide an overview of the top five ways to get MongoDB training, resources, and certifications.

#1: MongoDB University

The best place to go to get MongoDB-certified and improve your technical skills is MongoDB University. At our last MongoDB.local London event, we announced the launch of a brand new, enhanced university experience, with new courses and features and a seamless path to MongoDB certification to help you take your skills and career to the next level. MongoDB University offers courses, learning paths, and certifications in a variety of content types and programming languages. Some of the key features that MongoDB University offers are:

Hands-on labs and quizzes
Bite-sized video lectures
Badges for certifications earned
Study guides and materials

Getting certified by MongoDB University is a great way to start your developer journey. Our education offerings also include benefits for students and educators.

#2: MongoDB Developer Center

For continued self-paced learning, the MongoDB Developer Center is the place to go. The Developer Center houses the latest MongoDB tutorials, videos, community forums, and code examples in your preferred languages and tools. The MongoDB Developer Center is a global community of more than seven million developers. Within the Developer Center, you can code in different languages, get access to integrations for technologies you already use, and start building with MongoDB products, including:

MongoDB, the original NoSQL database
MongoDB Atlas, the cloud document database as a service and the easiest way to deploy, operate, and scale MongoDB
MongoDB Atlas App Services, the easy way to get new apps into the hands of your users faster

#3: Instructor-led training

As an IT leader, you can help your team succeed with MongoDB instructor-led training taught live by expert teachers and consultants. With MongoDB’s instructor-led training offering, you can access courses aimed at various roles. Our Developer and Operations learning paths cover the fundamental skills needed to build and manage critical MongoDB deployments. Beyond that, our specialty courses help learners master their skills and explore advanced MongoDB features and products. You can also choose how you want to learn. MongoDB offers public remote courses, which are perfect for individuals or teams who want to send a few learners at a time. If your goal is to upskill your entire team on MongoDB, our courses can be delivered privately, either onsite or remotely. Instructor-led training also provides the opportunity for Q&A, giving you answers to your specific questions.

#4: Resources

Beyond formal training programs, MongoDB is committed to providing thought leadership resources for those looking to dive deeper and learn more about MongoDB and database technologies in general. Our website offers an active blog with ongoing thought leadership and how-to articles, along with additional coding documentation, guides, and drivers. You can also check out the MongoDB Podcast for information about new and emerging technology, MongoDB products, and best practices.

#5: Events

You can also engage with MongoDB experts at our many events, including MongoDB World, our annual conference for developers and other IT leaders. After MongoDB World, we take our show on the road with MongoDB .local events across the globe.
These events give you the opportunity to learn in a hands-on fashion and meet other MongoDB users. MongoDB also hosts MongoDB Days in various global regions, focusing on developer workshops and leveling up skills. Beyond that, you can keep up with our webinars and other learning opportunities through our Events page.

Build your own MongoDB story

Of course, many people like to learn by doing. To get started using MongoDB Atlas in minutes, register for free.

January 20, 2023
News

Hydrus Helps Companies Improve ESG Performance

More organizations are embracing workforce diversity, environmental sustainability, and responsible corporate governance in an effort to improve their Environmental, Social, and Governance (ESG) performance. As investors increasingly favor ESG in their portfolios, organizations are under greater pressure to capture, store, and verify ESG metrics. San Francisco-based startup Hydrus is helping companies make ESG data more usable and actionable.

The platform

Hydrus, a MongoDB for Startups program member, is a software platform that enables enterprises to collect, store, report, and act on their environmental, social, and governance data. ESG data includes things like:

How a company safeguards the environment
Its energy consumption and how it impacts climate change
How it manages relationships with employees, suppliers, and customers
Details about the company’s leadership, executive pay, audits, and internal controls

The Hydrus platform enables organizations to collect, store, and audit diversity and environmental data, and to run analytics and machine learning against that data. Hydrus offers users a first-rate UI/UX so that even non-technical users can leverage the platform. With the auditing capabilities, organizations can ensure the provenance and integrity of ESG data over time. Other solutions don't allow users to go back in time and determine who made changes to the data, why they made them, what earlier versions of the data looked like, and when the changes were made. Hydrus gives users complete visibility into these activities.

The tech stack

MongoDB Atlas was the preferred database for Hydrus because of the flexibility of the data model. George Lee, founder and CEO of Hydrus, says the traditional SQL database model was too limiting for the startup's needs. MongoDB's document model eliminated the need to create tables or enforce restrictions on data fields. With MongoDB, they could simply add fields without undertaking any major schema changes. Hydrus also tapped MongoDB for access to engineers and technical resources. This enabled the company to architect its platform for all of the different types of sustainability data that exist. MongoDB technical experts helped Hydrus model data for future scalability and flexibility so it could add data fields when the need arises.

On top of Atlas and MongoDB technical support, Hydrus leans heavily on MongoDB Charts, a data visualization tool for creating, sharing, and embedding visualizations from MongoDB Atlas. Charts enables Hydrus to derive insights from ESG data, giving its Fortune 200 clients better visibility into their operational efficiency. Charts uses a drag-and-drop interface that makes it easy to build charts and answer questions about ESG data. A Hydrus customer using MongoDB Charts was better able to understand the impact of their footprint from a greenhouse gas perspective and a resource usage perspective. Another customer detected a 30x increase in refrigerant usage in one of its facilities. The visual analytics generated with MongoDB Charts enabled the company to make changes to improve their ESG performance.

MongoDB Charts enabled Hydrus to visualize sustainability data

"MongoDB Charts enables our customers to directly report their sustainability data, customize the charts, and better tell the sustainability story in a visual format," Lee says. "It's way better than the traditional format where you have data, tables, and spreadsheets everywhere."
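For teams curious what embedding a chart looks like in practice, here is a minimal sketch using MongoDB's Charts Embedding SDK; the base URL, chart ID, and page element are placeholders, not anything from Hydrus's actual application:

import ChartsEmbedSDK from "@mongodb-js/charts-embed-dom";

// Point the SDK at a Charts instance (placeholder base URL).
const sdk = new ChartsEmbedSDK({
  baseUrl: "https://charts.mongodb.com/charts-example-abcde"
});

// Reference an existing chart by its ID and render it into a page element.
const chart = sdk.createChart({ chartId: "00000000-0000-0000-0000-000000000000" });
await chart.render(document.getElementById("esg-chart"));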
The roadmap

Hydrus seeks to take the hassle out of managing a sustainable business by streamlining data collection, reporting, and auditing processes. Its platform is designed to eliminate manual tasks for sustainability managers so they can focus on decarbonization, resource usage optimization, and hitting their sustainability goals. Hydrus accelerates these activities by helping companies model their sustainability data around science-based targets so they can better decarbonize and meet other ESG goals.

If you're interested in learning more about how to help your organization become more sustainable, decarbonize, and succeed in your sustainability journey, visit the Hydrus website. Are you part of a startup and interested in joining the MongoDB for Startups program? Apply now. For more startup content, check out our wrap-up of the 2022 year in startups.

January 18, 2023
Applied

Predictions 2023: Modernization Efforts in the Financial Services Industry

As a global recession looms, banks are facing tough economic conditions in 2023. Lowering costs will be vital for many organizations to remain competitive in a data-intensive and highly regulated environment. Thus, it’s important that any IT investments accelerate digital transformation with innovative technologies that break down data silos, increase operational efficiency, and build personalized customer experiences. Read on to learn about areas in which banks are looking to modernize in 2023 to build better customer experiences at a lower cost and at scale.

Shaping a better banking future with composable designs

With banks eager to modernize and innovate, institutions must move away from the legacy systems that are restricting their ability to show progress. Placing consumers at the center of a banking experience made up of interconnected yet independent services offers technology-forward banks the chance to reshape their business models and subsequently grow market share and increase profitability. These opportunities have brought to fruition a composable architecture design that allows faster innovation and improved operational efficiency, and creates new revenue streams by extending the portfolio of services and products. Thus, banks are able to adopt the best-of-breed, perfect-fit-for-purpose software available by orchestrating strategic partnerships with relevant fintechs and software providers. This new breed of suppliers can provide everything from know your customer (KYC) services to integrated booking, loan services, or basic marketing and portfolio management functionalities. This approach is more cost efficient for institutions than having to build and maintain the infrastructure themselves, and it is significantly faster in terms of time to market and time to revenue. Banks adopting such an approach are seeing fintechs less as competitors and more as part of an ecosystem to collaborate with to accelerate innovation and reach customers.

Operational efficiency with intelligent automation

Financial institutions will continue to focus on operational efficiency and cost control by automating previously manual, paper-driven processes. Banks have made some progress digitizing and automating what were once almost exclusively paper-based, manual processes. But the primary driver of this transformation has been compliance with local regulations rather than an overarching strategy for really getting to know the client and achieving true customer delight. The market is eager for better automated and data-driven decisions, and legacy systems can’t keep up. Creating the hyper-personalized experiences that customers demand, which include things like chatbots, self-service portals, and digital forensics, is difficult for institutions using outdated technology. And having data infrastructure in silos prohibits any truly integrated modern experience. Using a combination of robotic process automation (RPA), machine learning (ML), and artificial intelligence (AI), financial institutions are able to streamline processes, freeing the workforce to focus on tasks that drive a bigger impact for the customer and business. Institutions must not digitize without considering the human interaction that will be replaced, as customers prefer a hybrid approach. The ability to act on real-time data is the way forward for driving value and transforming customer experiences, and it must be accompanied by the modernization of the underlying data architecture.
The prerequisite for this goal involves the de-siloing of data and sources into a holistic data landscape. Some people call it a data mesh, others composable data sources or virtualized data.

Solving ESG data challenges

Along with high inflation, the cost-of-living crisis, energy turmoil, and rising interest rates, environmental, social, and governance (ESG) is also in the spotlight. There is growing pressure from regulators to provide ESG data and from investors to make sure portfolios are sustainable. The role of ESG data in conducting market analysis, supporting asset allocation and risk management, and providing insights into the long-term sustainability of investments continues to expand. The nature and variability of many ESG metrics is a major challenge facing companies today. Unlike financial datasets that are mostly numerical, ESG metrics can include both quantitative and qualitative data to help investors and other stakeholders understand a company’s actions and intentions. This complexity, coupled with the lack of a universally applicable ESG reporting standard, means institutions must consider different standards with different data requirements. To master ESG reporting, including the integration of relevant KPIs, appropriate, high-quality data is needed that is also at the right level of granularity and covers the required industries and regions. Given the data volume and complexity, financial institutions are building ESG platforms underpinned by modern data platforms that are capable of consolidating different types of data from various providers, creating customized views, modeling data, and performing operations with no barriers.

Digital payments: Unlocking an enriched experience

Pushed by new technologies and global trends, the digital payments market is flourishing globally. With a valuation of more than $68 billion in 2021 and expectations of double-digit growth over the next decade, emerging markets are leading the way in terms of relative expansion. This growth has been driven by pandemic-induced cashless payments, e-commerce, government push, and fintechs. Digital payments are transforming the payments experience. While it was once enough for payment service providers to supply account information and orchestrate simple transactions, consumers now expect an enriched experience where each transaction offers new insights and value-added services. Meeting these expectations is difficult, especially for companies that rely on outdated technologies that were created long before transactions were carried out with a few taps on a mobile device. To meet the needs of customers, financial institutions are modernizing their payments data infrastructure to create personalized, secure, and real-time payment experiences — all while protecting consumers from fraud. This modernization allows financial institutions to ingest any type of data, launch services more quickly at a lower cost, and have the freedom to run in any environment, from on-premises to multi-cloud.

Security and risk management

Data is critical to every financial institution; it is recognized as a core asset to drive customer growth and innovation. As the need to leverage data efficiently increases, however, the legacy technology that still underpins many organizations is, according to 57% of decision makers, too expensive and doesn’t fulfill the requirements of modern applications. Not only is this legacy infrastructure complex, it is unable to meet current security requirements.
Given the huge amount of confidential client and customer data that the financial services industry deals with on a daily basis — and the strict regulations surrounding that data — security must be of the highest priority. The perceived value of this data also makes financial services organizations a primary target for data breaches. Fraud protection, risk management, and anti-money laundering are high priorities for any new data platform, according to Forrester’s What’s Driving Next-Generation Data Platform Adoption in Financial Services study. To meet these challenges, adoption of next-generation data platforms will continue to grow as financial institutions realize their full potential to manage costs, maximize security, and foster innovation.

Download Forrester’s full study — What’s Driving Next-Generation Data Platform Adoption in Financial Services — to learn more.

January 17, 2023
Applied

How Startups Stepped Up in 2022

After muddling through the global pandemic in 2021, entrepreneurs emerged in 2022 ready to transform the way people live, learn, and work. Through the MongoDB for Startups program, we got a close-up view of their progress. What we observed was a good indication of how critical data is to delivering the transformative experiences users expect.

Data access vs. data governance

The increasing importance of data in the digital marketplace has created a conflict that a handful of startups are working to solve: granting access to data to extract value from it while simultaneously protecting it from unauthorized use. In 2022, we were excited to work with promising startups seeking to strike a balance between these competing interests. Data access service provider Satori enables organizations to accelerate their data use by simplifying and automating access policies while helping to ensure compliance with data security and privacy requirements. At most organizations, providing access to data is a manual process often handled by a small team that's already being pulled in multiple directions by different parts of the organization. It's a time-consuming task that takes precious developer resources away from critical initiatives and slows down innovation.

Data governance is a high priority for organizations because of the financial penalties of running afoul of data privacy regulations and the high cost of data breaches. While large enterprises make attractive targets, small businesses and startups in particular need to be vigilant because they can less afford financial and reputational setbacks. San Francisco-based startup Vanta is helping companies scale security practices and automate compliance for the most prevalent data security and privacy regulatory frameworks. Its platform gives organizations the tools they need to automate up to 90% of the work required for security audits.

Futurology

The Internet of Things (IoT), artificial intelligence (AI), virtual reality (VR), and natural language processing (NLP) remain at the forefront of innovation and are only beginning to fulfill their potential as transformative technologies. Through the MongoDB for Startups program, we worked with several promising ventures that are leveraging these technologies to deliver game-changing solutions for both application developers and users.

Delaware-based startup Qubitro helps companies bring IoT solutions to market faster by making the data collected from mobile and IoT devices accessible anywhere it's needed. Qubitro creates APIs and SDKs that let developers activate device data in applications. With billions of devices producing massive amounts of data, the potential payoff in enabling data-driven decision making in modern application development is huge.

London-based startup Concured uses AI technology to help marketers know what to write about and what's working for themselves and their competitors. It also enables organizations to personalize experiences for website visitors. Concured uses NLP to generate semantic metadata for each document or article and to understand the relationship between articles on the same website. Another London-based startup using AI and NLP to deliver transformative experiences is Semeris. Analyzing legal documents is a tedious, time-consuming process, and Semeris enables legal professionals to reduce the time it takes to extract information from documentation.
The company’s solution creates machine learning (ML) models based on publicly available documentation to analyze less-seen or more private documentation that clients have internally.

The language we use in day-to-day communication says a lot about our state of mind. Sydney-based startup Pioneera looks at language and linguistic markers to determine if employees are stressed out at work or at risk of burnout. When early warning signs are detected, the person gets the help they need to reduce stress, promote wellness, and improve productivity, confidentially and in real time.

Technologies like AR and VR are transforming learning for students. Palo Alto-based startup Inspirit combines 3D and VR instruction to create an immersive learning experience for middle and high school students. The platform helps students who love science engage with the subject matter more deeply, and those who dislike it to experience it in a more compelling format.

No code and low code

The startup space is rich with visionary thinkers and ideas. But the truth is that you can't get far with an idea if you don't have access to developer talent, which is scarce and costly in today's job market. We've worked with a couple of companies through the MongoDB for Startups program that are helping entrepreneurs breathe life into their ideas with low- and no-code solutions for building applications and bringing them to market. Low- and no-code platforms enable users with little or no coding background to satisfy their own development needs.

For example, Alloy Automation is a no-code integration solution that integrates with and automates ecommerce services, such as CRM, logistics, subscriptions, and databases. Alloy can automate SMS messages, automatically start a workflow after an online transaction, determine if follow-up action should be taken, and automate actions in coordination with connected apps. Another example is Thunkable, a no-code platform that makes it easy to build custom mobile apps without any advanced software engineering knowledge or certifications. Thunkable's mission is to democratize mobile app development. It uses a simple drag-and-drop design and powerful logic blocks to give innovators the tools they need to breathe life into their app designs.

The startup journey

Although startups themselves are as diverse as the people who launch them, all startup journeys begin with the identification of a need in the marketplace. The MongoDB for Startups program helps startups along the way with free MongoDB Atlas credits, one-on-one technical advice, co-marketing opportunities, and access to a vast partner network. Are you a startup looking to build faster and scale further? Join our community of pioneers by applying to the MongoDB for Startups program. Apply now.

January 16, 2023
Applied

Improving Building Sustainability with MongoDB Atlas and Bosch

Every year, developers from more than 45 countries head to Berlin to participate in the Bosch Connected Experience (BCX) hackathon — one of Europe’s largest AI and Internet of Things (AIoT) hackathons. This year, developers were tasked with creating solutions to tackle a mix of important problems, from improving sustainability in commercial building operations and facility management to accelerating innovation of automotive-grade, in-car software stacks, using a variety of hardware and software solutions made available through Bosch, Eclipse, and their ecosystem partners. MongoDB also took part in this event and even helped one of the winning teams build their solution on top of MongoDB Atlas. I had the pleasure of connecting with a participant from that winning team, Jonas Bruns, to learn about his experience building an application for the first time with MongoDB Atlas.

Ashley George: Tell us a little bit about your background and why you decided to join this year's BCX hackathon?

Jonas Bruns: I am Jonas, an electrical engineering student at Friedrich Alexander University in Erlangen-Nürnberg. Before I started my master’s program, I worked in the automotive industry in the Stuttgart area. I was familiar with the BCX hackathon from my time in Stuttgart and, together with two friends from my studies, decided to set off to Berlin this year to take part in this event. The BCX hackathon is great because there are lots of partners on site to help support the participants and provide knowledge on both the software and hardware solutions available to them — allowing teams to turn their ideas into a working prototype within the short time available. We like being confronted with new problems and felt this was an important challenge to take on, so participation this year was a must for us.

AG: Why did you decide to use MongoDB Atlas for your project?

JB: We started with just the idea of using augmented reality (AR) to improve the user experience (UX) of smart devices. To achieve this goal, we needed not only a smartphone app but also a backend in which all of our important data is stored. Due to both limited time and the fact that no one on our team had worked with databases before, we had to find a solution that would grow with our requirements and allow us to get started as easily as possible. Ideally, the solution would also be fully managed, to eliminate us having to take care of security on our own. After reviewing our options, we quickly decided on using MongoDB Atlas.

AG: What was it like working with MongoDB Atlas, especially having not worked with a database solution before?

JB: The setup was super easy and went pretty fast. Within just a short time, we were able to upload our first set of data to Atlas using MongoDB Compass. As we started to dive in and explore Atlas a bit more, we discovered the trigger functionality (Atlas Triggers), which we were able to use to simplify our infrastructure. Originally, we planned to use a server connected to the database, which would react to changed database entries and then send a request to control the desired periphery. The possibility to configure triggers directly in the database made a server superfluous and saved us a lot of time. We configured the trigger so that it executes a JavaScript function when a change is made to the database. This function evaluates data from the database and executes the corresponding requests, which directly control the periphery.
Initially, we hit a minor roadblock in figuring out how to handle authentication (creating the security tokens) that the peripheral devices expect with each request. To solve this, we stored the security tokens on an AWS server that listens for HTTP requests. From Atlas, we then just have to call a URL, and the AWS instance handles authentication and controls the lights. After we solved this problem, we were thrilled with how little configuration was needed and how intuitive Atlas is. The next steps, like connecting Atlas to the app, were easy. We achieved this by sending data from Flutter to Atlas over HTTPS with the Atlas Data API.

AG: How did Atlas enable you to build your winning application?

JB: By the end of the challenge, we had developed our idea into a fully functional prototype using Google ARCore, Flutter, MongoDB Atlas, and the Bosch Smart Home hardware (Figure 1). We built a smartphone application that uses AR to switch a connected light in a smart building on and off. The position and state of the light (on or off) are stored in the database. If the state of the light should change, the app updates the corresponding value in the database. That change fires a trigger function that sets the light to the desired state. The fact that we achieved this within a short time and without much prior knowledge is mainly due to how easy and intuitive Atlas is. The simple handling allowed us to quickly learn and use the available features to build the functionality our app needed.

Figure 1: Tech stack for the project's prototype.

AG: What additional features within Atlas did you find the most valuable in building your application?

JB: We created different users to easily control the access rights of the app and the smart devices. By eliminating the need for another server to communicate with the smart devices and using Atlas Triggers instead, we saved a lot of time on the prototype. In addition, the preconfigured code examples provided in various languages made it easy to integrate with our frontend and helped us avoid errors. Anyone who is interested can find the results of our work in the GitHub repo.

AG: Do you see yourself using Atlas more in the future?

JB: We will definitely continue to use Atlas. The instance from the hackathon is still online, and we want to get to know the other functionality we haven't used yet. Given how intuitive Atlas was in this project, I am sure we will continue to use it for future projects as well.

Through this project, Jonas and his team built a functional prototype that can help commercial building owners gain more control over their buildings and take steps to reduce CO₂ emissions.
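For readers curious how a trigger like the one Jonas describes might be wired up, here is a minimal sketch of an Atlas database trigger function. It is written under assumed names: the one-document-per-light schema with a "state" field, and the LIGHT_CONTROL_URL value pointing at the team's AWS endpoint, are hypothetical stand-ins rather than details from the project's actual code.

exports = async function (changeEvent) {
  // Document after the update; assumes the trigger is configured to include
  // the full document and that the app stores one document per light with a
  // "state" field of "on" or "off" (hypothetical schema).
  const light = changeEvent.fullDocument;

  // LIGHT_CONTROL_URL is an assumed App Services value pointing at the AWS
  // endpoint that holds the security tokens and talks to the hardware.
  const baseUrl = context.values.get("LIGHT_CONTROL_URL");

  // Forward the desired state; the AWS instance handles authentication and
  // switches the physical light.
  const response = await context.http.get({
    url: `${baseUrl}?light=${light._id}&state=${light.state}`
  });

  if (response.statusCode !== 200) {
    console.log(`Light control request failed with status ${response.statusCode}`);
  }
};

In a setup along these lines, the app only ever writes to the database; the trigger takes care of pushing the change out to the hardware, which mirrors the flow Jonas describes above.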

January 12, 2023
Applied

Introducing MongoDB Connector for Apache Kafka version 1.9

Today, MongoDB released version 1.9 of the MongoDB Connector for Apache Kafka! This article highlights the key features of the new release.

Pre/post document states

In MongoDB 6.0, Change Streams added the ability to retrieve the before and after state of an entire document. To enable this functionality on a collection, set the changeStreamPreAndPostImages parameter in the createCollection command:

db.createCollection(
  "temperatureSensor",
  { changeStreamPreAndPostImages: { enabled: true } }
)

Alternatively, for existing collections, use collMod as shown below:

db.runCommand( {
  collMod: <collection>,
  changeStreamPreAndPostImages: { enabled: <boolean> }
} )

Once the collection is configured for pre and post images, you can set the change.stream.full.document.before.change source connector parameter to include this extra information in the change event. For example, consider this source definition:

{
  "name": "mongo-simple-source",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "<< MONGODB CONNECTION STRING >>",
    "database": "test",
    "collection": "temperatureSensor",
    "change.stream.full.document.before.change": "whenavailable"
  }
}

When the following document is inserted:

db.temperatureSensor.insertOne({ 'sensor_id': 1, 'value': 100 })

and an update is then applied:

db.temperatureSensor.updateOne({ 'sensor_id': 1 }, { $set: { 'value': 105 } })

the change stream event written to the Kafka topic looks as follows:

{
  "_id": { "_data": "82636D39C8000000012B022C0100296E5A100444B0F5E386F04767814F28CB4AAE7FEE46645F69640064636D399B732DBB998FA8D67E0004" },
  "operationType": "update",
  "clusterTime": { "$timestamp": { "t": 1668102600, "i": 1 } },
  "wallTime": { "$date": 1668102600716 },
  "ns": { "db": "test", "coll": "temperatureSensor" },
  "documentKey": { "_id": { "$oid": "636d399b732dbb998fa8d67e" } },
  "updateDescription": {
    "updatedFields": { "value": 105 },
    "removedFields": [],
    "truncatedArrays": []
  },
  "fullDocumentBeforeChange": {
    "_id": { "$oid": "636d399b732dbb998fa8d67e" },
    "sensor_id": 1,
    "value": 100
  }
}

Note that the fullDocumentBeforeChange key includes the original document before the update occurred.

Starting the connector at a specific time

Prior to version 1.9, when the connector started as a source, it opened a MongoDB change stream and processed any new data from that point on. To copy all of the existing data in the collection before processing new data, you specified the "copy.existing" property. One frequent user request has been to start the connector from a specific timestamp rather than from the moment the connector starts. In 1.9, a new parameter called startup.mode was added to specify when to start writing data.

startup.mode=latest (default)
"latest" is the default behavior: the connector starts processing data when it starts and ignores any existing data.

startup.mode=timestamp
"timestamp" allows you to start processing at a specific point in time, as defined by additional startup.mode.timestamp.* properties. For example, to start the connector from 7 AM on November 21, 2022, set the value as follows:

startup.mode.timestamp.start.at.operation.time='2022-11-21T07:00:00Z'

Supported values are an ISO-8601 format date string, as shown above, or a BSON extended string format.

startup.mode=copy.existing
This has the same behavior as the existing configuration option "copy.existing=true". Note that "copy.existing" as a separate parameter is now deprecated. If you defined any granular copy.existing parameters, such as copy.existing.pipeline, just prefix them with the "startup.mode.copy.existing." property name instead.
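As an illustration of the new property names, here is a minimal sketch of a source definition that copies existing data at startup and filters it with a granular pipeline. It reuses the temperatureSensor example from above; the connector name and the $match stage are hypothetical and shown only to demonstrate the startup.mode.copy.existing. prefix.

{
  "name": "mongo-source-copy-existing",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "<< MONGODB CONNECTION STRING >>",
    "database": "test",
    "collection": "temperatureSensor",
    "startup.mode": "copy.existing",
    "startup.mode.copy.existing.pipeline": "[{\"$match\": {\"value\": {\"$gte\": 100}}}]"
  }
}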
Reporting MongoDB errors to the DLQ

Kafka supports writing errors to a dead letter queue (DLQ). In version 1.5 of the connector, you could write all exceptions to the DLQ through the mongo.error.tolerance='all' setting. One thing to note was that these were Kafka-generated errors rather than errors that occurred within MongoDB. Thus, if the sink connector failed to write to MongoDB due to a duplicate _id error, for example, that error wouldn't be written to the DLQ. In 1.9, errors generated within MongoDB are reported to the DLQ as well.

Behavior change when inferring schema

Prior to version 1.9 of the connector, if you were inferring schema and inserted a MongoDB document containing arrays whose values have different data types, the connector was naive and simply set the type of the whole array to a string. For example, consider a document that resembles:

{
  "myfoo": [
    { "key1": 1 },
    { "key1": 1, "key2": "dogs" }
  ]
}

With output.schema.infer.value set to true on a source connector, the message in the Kafka topic would resemble the following:

…
"fullDocument": {
  …
  "myfoo": [
    "{\"key1\": 1}",
    "{\"key1\": 1, \"key2\": \"dogs\"}"
  ]
},
…

Notice that the array items have different shapes. The first item in the "myfoo" array is a subdocument with a single field, "key1", whose value is the integer 1, while the next item is a subdocument with the same "key1" field plus another field, "key2", that has a string value. When this scenario occurred, the connector wrapped the entire array as a string. The same behavior could also apply when different keys contained values of different data types.

In version 1.9, when presented with this scenario, the connector no longer wraps the array; instead, it creates the appropriate schemas for arrays whose values have different data types. The same document processed by 1.9 will resemble:

"fullDocument": {
  …
  "myfoo": [
    { "key1": 1 },
    { "key1": 1, "key2": "dogs" }
  ]
},

Note that this behavior is a breaking change, and that inferring schemas for arrays with mixed data types can cause performance degradation for very large arrays.

Download the latest version of the MongoDB Connector for Apache Kafka from Confluent Hub! To learn more about the connector, read the MongoDB Online Documentation. Questions? Ask on the MongoDB Developer Community Connectors and Integrations forum!
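As a closing illustration, here is a hedged sketch of what a sink definition that takes advantage of the DLQ reporting described above might look like. The topic and dead letter topic names are hypothetical, the errors.* entries are standard Kafka Connect DLQ settings, and mongo.error.tolerance is the connector option referenced earlier; treat this as an illustration rather than a prescribed configuration.

{
  "name": "mongo-simple-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "temperatureSensor",
    "connection.uri": "<< MONGODB CONNECTION STRING >>",
    "database": "test",
    "collection": "temperatureSensor",
    "mongo.error.tolerance": "all",
    "errors.tolerance": "all",
    "errors.deadletterqueue.topic.name": "temperatureSensor.dlq",
    "errors.deadletterqueue.context.headers.enable": "true"
  }
}

With a configuration along these lines, records that fail in Kafka Connect and, as of 1.9, writes that fail inside MongoDB end up on the dead letter topic for later inspection.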

January 12, 2023
Updates

Top 3 Wins and Wants from the Latest TDWI Modernization Report

We recently reported that analyst and research firm TDWI had released its latest report on IT modernization: Maximizing the Business Value of Data: Platforms, Integration, and Management. The report surveyed more than 300 IT executives, data analysts, data scientists, developers, and enterprise architects to find out what their priorities, objectives, and experiences have been in terms of IT modernization.

In many ways, organizations have made great progress. From new data management and data integration capabilities to smarter processes for higher business efficiency and innovation, IT departments have helped organizations get more value from the data they generate. In other cases, organizations are still stuck in data silos and struggling to improve data quality as data distribution increases with the proliferation of multi-cloud environments.

In this article, we'll summarize the top three areas where organizations are winning and the top three ways organizations are left wanting when it comes to digital transformation and IT modernization. Download the complete report, Maximizing the Business Value of Data: Platforms, Integration, and Management, and find out the latest strategies, trends, and challenges for businesses seeking to modernize.

Wins

1. Cloud migration
Moving legacy applications to the cloud is essential for organizations seeking to increase operational efficiency and effectiveness, generate new business models through analytics, and support automated decision-making: the three biggest drivers of modernization efforts. And most organizations are succeeding. Seventy-two percent of respondents in the TDWI survey reported being very or somewhat successful in moving legacy applications to cloud services. Migrating to the cloud is one thing, but getting data to the right people and systems at the right time is another. For organizations to get full value from their data in the cloud, they also need to ensure the flow of data into business intelligence (BI) reports, data warehouses, and embedded analytics in applications.

2. 24/7 operations
The ability to run continuous operations is a widely shared objective when organizations take on a transformation effort. Increasingly global supply chains, smaller and more dispersed office locations, and growing international customer bases are major drivers of 24/7 operations. And, according to the TDWI survey, more than two-thirds of organizations say they've successfully transitioned to continuous operations.

3. User satisfaction
Organizations are also winning the race to match users' needs when provisioning data for BI, analytics, data integration, and the data management stack. Eighty percent of respondents said their users were satisfied with these capabilities. Additionally, 72% trusted the quality of data and how it's governed, and 68% were satisfied that role-based access controls were doing a good job of ensuring that only authorized users had access to sensitive data.

Wants

1. Artificial intelligence, machine learning, and predictive intelligence
Machine learning (ML) and artificial intelligence (AI) are a key area where organizations are left wanting. While 51% of respondents were somewhat or very satisfied with their use of AI and ML data, almost as many (49%) said they were neither satisfied nor dissatisfied, somewhat dissatisfied, or very dissatisfied. Similar results were also reported for data-driven predictive modeling.
The report notes that provisioning data for AI/ML is more complex and varied than for BI reporting and dashboards, but that cloud-based data integration and management platforms for analytics and AI/ML could increase satisfaction for these use cases.

2. More value from data
Perhaps related to the AI/ML point, the desire to get more value out of their data was cited as the biggest challenge organizations face by almost 50% of respondents. Organizations today capture more raw, unstructured, and streaming data than ever, and they're still generating and storing structured enterprise data from a range of sources. One of the big challenges organizations reported is running analytics on so many different data types. According to TDWI, organizations need to overcome this challenge to inform data science and capitalize on modern, analytics-infused applications.

3. Easier search
A big part of extracting more value from data is making it easy to search. Traditional search functionality, however, depends on technically challenging SQL queries. According to the TDWI report, 19% of users were dissatisfied with their ability to search for data, reports, and dashboards using natural language. Unsurprisingly, frustration with legacy technologies was cited as the third biggest challenge facing organizations, according to the survey.

The way forward
"In most cases, data becomes more valuable when data owners share data," the TDWI report concludes. Additionally, the key to making data more shareable is moving toward a cloud data platform, one that makes data more available while simultaneously governing access when there's a need to protect the confidentiality of sensitive data. Not only does a cloud data platform make data more accessible and shareable for users, it also creates a pipeline for delivering data to applications that can use it for analytics, AI, and ML.

Read the full TDWI report: Maximizing the Business Value of Data: Platforms, Integration, and Management.
