MongoDB Blog

Articles, announcements, news, updates and more

Security in Government Solutions: Why Secure By Default is Essential

Data security in government agencies is table stakes at this point. Everyone knows it’s essential, both for compliance and data protection purposes. However, most government agencies are working with solutions that require frequent security patches or bolt-on tools to protect their data. Today, the federal government is pushing its agencies to modernize their solutions and improve their security posture. For example, the Department of Homeland Security (DHS) and the Cybersecurity and Infrastructure Security Agency (CISA) recently issued a technical rule to modernize the Protected Critical Information Infrastructure (PCII) program – a program that provides legal protections for cyber and physical infrastructure information submitted to DHS. “The PCII Program is essential to CISA’s ability to gather information about risks facing critical infrastructure,” said Dr. David Mussington, Executive Assistant Director for Infrastructure Security. “This technical rule modernizes and clarifies important aspects of the Program, making it easier for our partners to share information with DHS. These revisions further demonstrate our commitment to ensuring that sensitive, proprietary information shared with CISA remains secure and protected.” So how can government agencies modernize their data infrastructure and find solutions that not only protect data but also power innovation? Let’s look at a few different strategies.

1. Why secure by default is key

Secure by default means that any piece of software ships with default security settings configured for the highest possible security out of the box. CISA Director Jen Easterly has addressed why using solutions that are secure by default is critical for any organization. “We have to have [multi-factor authentication] by default. We can't charge extra for security logging and [single sign-on],” Easterly said. “We need to ensure that we're coming together to really protect the technology ecosystem instead of putting the burden on those least able to defend themselves.” “The American people have accepted the fact that they’re constantly going to have to update their software,” she said. “The burden is placed on you as the user and that’s what we have to collectively stop.” Easterly is right. Secure-by-design solutions are vital to the success of data protection. The expectation should always be that solutions have built-in, not bolt-on, security features.

One approach that’s gaining traction in both the public and private sectors is the zero trust environment. In a zero trust environment, the perimeter is assumed to have been breached. There are no trusted users, and no user or device gains trust simply because of its physical or network location. Every user, device, and connection must be continually verified and audited. As the creator of zero trust, security expert John Kindervag, summed it up: “Never trust, always verify.” For government agencies, that means the underlying database must be secure by default, and it must limit users’ opportunities to make it less secure.

2. Security isn't just on-prem anymore; cloud is secure, too

Cloud can be a scary word for public sector organizations. Entrusting data to the cloud might feel risky for those who handle some of the country’s most sensitive information. But cloud providers are stepping up to meet the security needs of government agencies. There is no need to fear the cloud anymore.
Government agencies and other public sector organizations nationwide are navigating cloud modernization through the lens of increased cybersecurity requirements outlined in the 2021 Executive Order on Improving the Nation’s Cybersecurity: “The Federal Government must adopt security best practices; advance toward Zero Trust Architecture; accelerate movement to secure cloud services, including Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS); centralize and streamline access to cybersecurity data to drive analytics for identifying and managing cybersecurity risks; and invest in both technology and personnel to match these modernization goals.” The major cloud providers also offer well-established, purpose-built options for government users. AWS GovCloud, for example, is more than a decade old and was “the first cloud provider to build cloud infrastructure specifically designed to meet U.S. government security and compliance needs.” This push by the federal government toward cloud modernization and increased cybersecurity will be a catalyst in upcoming years for rapid cloud adoption and greater dependence on cloud solutions designed specifically for government users.

3. Security features purpose-built for government needs are essential

Government agencies are held to a higher standard than organizations in the private sector. From data used in sometimes life-or-death missions to data for students building their futures in educational institutions (and everything in between), security has real-world consequences. Today, security is non-negotiable, and as we explored above, it’s especially crucial that public sector entities have built-in security measures to keep data protected. So, what built-in features should you look for?

Network isolation and access

It’s critical that your data and underlying systems are fully isolated from other organizations using the same cloud provider. Database resources should be associated with a user group contained in its own Virtual Private Cloud (VPC), and access should be granted by IP access lists, VPC peering, or private endpoints.

Encryption in flight, at rest, and in use

Encryption should be the standard. For example, when using MongoDB Atlas, all network traffic is encrypted using Transport Layer Security (TLS). Encryption for data at rest is automated using encrypted storage volumes. For sensitive workloads, customers can use field-level encryption to encrypt data in the application before it is sent over the network to MongoDB clusters. Users can bring their own encryption keys for an additional level of control.

Granular database auditing

Granular database auditing allows administrators to answer detailed questions about system activity by tracking all commands against the database. This ensures you always know who has access to what data and how they’re using it.

Multi-factor authentication

User credentials should always be stored using industry-standard, audited one-way hashing mechanisms, with multi-factor authentication options including SMS, voice call, a multi-factor app, or a multi-factor device, ensuring only approved users have access to your data.
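To make “encryption in flight” concrete, here is a minimal sketch of connecting to a cluster from Python with TLS explicitly required. The connection string, database user, and password are hypothetical placeholders, and Atlas already enforces TLS by default; the flag simply makes the requirement visible in application code:

```python
from pymongo import MongoClient

# Hypothetical Atlas connection string and credentials; in practice, load
# these from a secrets manager rather than hard-coding them.
uri = "mongodb+srv://cluster0.example.mongodb.net"

client = MongoClient(
    uri,
    tls=True,            # refuse any connection that is not TLS-encrypted
    username="app_user",
    password="app_password",
    authSource="admin",
)

# A lightweight ping verifies the authenticated, encrypted connection.
client.admin.command("ping")
```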
MongoDB Atlas for Government: Purpose-built for public sector

As we’ve discussed, solutions that are purpose-built with built-in security are ideal for government agencies, and choosing the right one is the best way to keep sensitive data protected. MongoDB Atlas for Government on AWS GovCloud recently secured its FedRAMP Moderate authorization thanks to these security measures built into the solution. FedRAMP is a government-wide program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services. To ensure the utmost levels of security, Atlas for Government is an independent, dedicated environment for the U.S. public sector, as well as for ISVs looking to build U.S. public sector offerings. Public sector organizations carry a heavy burden when it comes to keeping data protected. However, with the right data platform underpinning modern applications – a platform with built-in security features – progress doesn’t have to mean compromising on security. Want to learn more about data protection best practices for public sector organizations? Attend our upcoming webinar on April 12 for deeper insight.

March 28, 2023
Applied

Women Leaders at MongoDB: Lena Smart Discusses Clarity and Goal Setting

March is Women’s History Month. Our women leaders series highlights MongoDB women who are leading teams and empowering others to own their career development and build together. Lena Smart, Chief Information Security Officer, explains why words matter, shares her thoughts on leadership, and discusses MongoDB’s internal mentorship program, MentorHER.

Tell me a bit about your team.

My team is responsible for all aspects of security of MongoDB’s global offices and employees. Within my organization I have Governance, Risk and Compliance (GRC) and InfoSec (security engineering, physical security, etc.) under me. We are dedicated to making every effort to protect customer data, including continually improving security processes and controls. On top of that, we are committed to delivering the highest levels of standards conformance and regulatory compliance as part of our ongoing mission to address the most demanding security and privacy requirements of our customers. My goal is to build the best security and GRC team in the world.

What characteristics make a good leader?

In my opinion, good leaders are decisive and leave little room for ambiguity. They are understanding and know that people are depending on them for their careers, dreams, and aspirations. They make their work matter every day, are focused on continuous learning, and do not “rest on their laurels.”

What has your experience been like as a woman growing your career in leadership?

My experience as a woman leader has depended on the environment. It was very difficult as a CIO and CISO in the power industry. Every day I felt like I was being undermined by my peers, who (because they were all “power industry engineers”) felt they were the experts in everything to do with security (they were not). It was exhausting. I finally left and joined a FinTech company. That was better, but I still felt I could find an environment where women were actively encouraged to lead. Hence my move to MongoDB. I could not be happier here and love working with all our teams.

Tell us about the biggest lesson you’ve learned throughout your career.

The biggest lesson I’ve learned is that words matter. As a leader, people can interpret your words in many ways. Be clear in your message. I have a mantra on our team: “one voice, one message”. I encourage my team members to have all the internal discussion they want, but we do not ever air our dirty laundry in public. We stand with “one voice, one message”. People in general like clarity, and we try hard to enforce that idea within my team by encouraging each person to own what they do.

What’s your advice for building and developing a team?

I believe cultivating a supportive and positive team culture stems from the top down. I really embrace and follow MongoDB’s company values. Build Together is the main value I follow because, to me, people are everything. You need the right people in the right places owning what they do. I am also a huge advocate for people taking the initiative and building upon their own careers. I make sure to set aside a budget for training and certification programs for my team. This allows them to enhance their knowledge and helps them grow and develop into even stronger security and GRC professionals. I also started the Security Champions Program at MongoDB almost four years ago, a volunteer-based program that allows anyone who has an interest in security to join monthly meetings to learn more.

Can you tell us a bit about the MentorHER program at MongoDB?
I am honored to be the Executive Sponsor for a very important internal program called MentorHER. MentorHER aims to create diverse teams, develop female leaders, drive organizational changes, and enhance MongoDB’s reputation as an employer of choice. I’ve had a couple of mentors who made a positive impact on my career. I cherished our time together and made sure to have a clear understanding of the mentoring program I signed up for. There were goals, regular meetings, and a lot of positivity generated by mentoring. I hope we can replicate that at MongoDB with our MentorHER program. We have a very strong team leading the program, and I feel very confident that we will meet our goals and embrace the different experiences and perspectives of the women around us.

What is your advice to women looking to grow their careers as leaders?

My advice to other women is this: be clear and honest in what you want from a leadership role. At the C-suite level you will be pulled in many directions. Control that, from the start, where possible. It’s important to be intellectually honest and have clear goals that will help your team grow and mature, and the business flourish.

Join a team that builds together every day. View open career opportunities at MongoDB.

March 27, 2023
Culture

MongoDB Releases “Focus Mode” in Compass GUI

We’re excited to announce an improvement to the aggregation-building experience in MongoDB Compass. Compass already makes it easy to view and manage your MongoDB databases, and with the addition of Focus Mode you now have the option to dial in on specific stages within your aggregation pipeline.

Overview

MongoDB's Query API and Aggregation Pipelines enable easy retrieval and processing of data from collections. They also facilitate complex operations such as filtering, grouping, and transforming, making computation and analysis effortless. MongoDB Compass' intuitive interface simplifies the process of building aggregations by enabling developers to easily create, test, and refine aggregation pipelines, and the introduction of Focus Mode takes this a step further. When constructing pipelines, having to simultaneously view and consider multiple stages can make it challenging to analyze the impact of a specific stage, increasing cognitive load. Now, developers can toggle Focus Mode on a stage, opening a view that focuses exclusively on the contents of the specific stage they are working on. This view also shows sample input documents (before the aggregation stage is applied) and output documents (after the stage is applied), aiding in understanding, troubleshooting, and optimizing the data pipeline. Developers can switch between stages using a drop-down menu at the top of the screen. This makes identifying inefficiencies and optimizing performance easier, as well as providing deeper insights from the output documents for data-driven decision making. Focus Mode offers a streamlined and distraction-free environment for working with stages, improving the efficiency and precision of testing, debugging, and analyzing the impact of each stage on the data, ultimately simplifying the creation and management of pipelines.

Conclusion

The addition of Focus Mode is part of our continued refresh of the query and aggregation experience in Compass. These improvements are made possible thanks to the feedback of our developer community, so we encourage you to try out this new feature and let us know what you think! To learn more about the Aggregation Pipeline Builder in Compass, visit our documentation.
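If you are new to aggregations, here is a minimal sketch of the kind of multi-stage pipeline you might build, and then dissect stage by stage, in Focus Mode. The collection and field names are hypothetical:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://cluster0.example.mongodb.net")  # hypothetical URI
orders = client["shop"]["orders"]  # hypothetical database and collection

# Three stages: filter, group, then sort. In Compass, Focus Mode lets you
# inspect each stage's sample input and output documents in isolation.
pipeline = [
    {"$match": {"status": "complete"}},               # stage 1: filter documents
    {"$group": {"_id": "$customerId",                 # stage 2: aggregate per customer
                "totalSpend": {"$sum": "$amount"}}},
    {"$sort": {"totalSpend": -1}},                    # stage 3: order the results
]

for doc in orders.aggregate(pipeline):
    print(doc)
```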

March 21, 2023
Updates

Women Leaders at MongoDB: Why Kanika Khurana is Leading with Transparency

March is Women’s History Month. Our women leaders series highlights MongoDB women who are leading teams and empowering others to own their career development and build together. Kanika Khurana, Technical Services Manager, shares how she leads with transparency, the importance of taking smart risks, and how she enables team members to have the “courage to fall and rise again”.

Tell me a bit about your team.

I oversee the Cloud Technical Services team in India. Our team provides technical advice and support to MongoDB customers by acting as subject matter experts to clear blockers and recommend best practices, enabling customers to build next-generation applications.

What characteristics make a good leader?

I think that a good leader comes to know and value their employees' unique skills and abilities. They determine how to capitalize on their team’s strengths and tweak the environment to meet their larger goals. By taking the time to understand each employee, a great manager shows that they see their people for who they are.

Have you faced any challenges as a woman growing your career in leadership?

One of the criticisms I’ve faced over the years is that I’m an emotional thinker, which somehow hampers my decision-making. However, while I tend to be a more relationally-oriented decision maker, I’ve used this characteristic to help advance my career. Listening to and involving team members in essential conversations has enabled me to make more logical, reasonable, and healthier decisions.

What is the biggest lesson you’ve learned throughout your career?

The best leaders are transparent. They admit mistakes, ask for forgiveness, and make bad situations right. These “failures” aren’t signs of weakness but rather strengths. Mistakes are inevitable, and what we learn from them is what determines the course of our success. Trying to look perfect isn’t authentic, creates stress, and models unhealthy perfectionism. Through transparency, you build stronger relationships and an environment where a commitment to doing the right thing impacts the culture and the bottom line. The best thing you can do is offer an Eden to your team that allows them to grow and thrive, rather than creating an environment where the fear of making a mistake overtakes the courage to fall and rise again.

What’s your advice to other women looking to grow their careers as leaders?

I advise other women to be brave and take risks. Sticking to the safest option can be tempting, but you are unlikely to achieve growth and innovation if you’re not open to new steps or strategies. Of course, risks should be calculated, but carefully considered risks can advance your career. Be a little risky, take a leap, give it a try, speak up, and be kind but convicted in your effort to take a seat at the table.

Join us to make an impact on your career and the future of technology. Find open roles on our careers site today.

March 21, 2023
Culture

Submit Your Nominations for the 2023 MongoDB Innovation Awards

Nominations are now open for the 2023 MongoDB Innovation Awards. These awards aim to celebrate and recognize organizations that dream big and are pioneering new ways to use data, expanding the limits of technology, and enhancing their businesses with MongoDB. We invite you to nominate an organization that is building something dynamic, interesting, or innovative with MongoDB.

Submit Your Nomination

Past recipients include 7-Eleven, American Airlines, Barclays, Bosch, Comcast, Epic Games, IBM, LinkedIn, Pioneera, and Sogei. Read more about last year’s winners here. This year, we’re excited to offer a robust prize package. Our 2023 winners will receive*:

- An Innovation Award trophy
- 10 passes (per organization) to a MongoDB.local event of your choosing
- Inclusion in the MongoDB Innovation Awards announcement materials and social media
- A digital badge to display all year long
- A customer feature story on MongoDB.com
- MongoDB Atlas credits
- A tailored MongoDB Day, designed to enable your technical team members to deliver solutions better and faster

Submissions will be accepted through April 21, 2023, and winners will be notified by the MongoDB team by the end of May 2023. Read more about each of the award categories below.

Award categories

Optimizing for Impact

This will be awarded to an organization that realized tremendous business benefits by leveraging MongoDB, with an impact on its bottom line (time savings, cost savings, and/or reduction in operational complexity).

Industry Transformation

This will be awarded to a change-maker who moved their business to the next level and disrupted their industry by identifying new technologies, applying new skills, or increasing operational efficiency.

Inspiring Innovation

This will be awarded to an organization that is using MongoDB to make a better world possible. They are creatively expanding the limits of technology to solve societal, community, medical, or educational challenges.

Building the Next Big Thing

This will be awarded to a small- or medium-sized business that has been building its core offering/service on MongoDB Atlas from the beginning. They are leveraging MongoDB's data developer platform to build and scale some of the world's most innovative projects in data.

* View terms and conditions

We look forward to receiving your nominations!

March 20, 2023
Events

Visualizing Your MongoDB Atlas Data with Atlas Charts

MongoDB Atlas is the leading multi-cloud developer data platform. We see some of the world’s largest companies in manufacturing, healthcare, telecommunications, and financial services build their businesses with Atlas at their foundation. Every company comes to MongoDB with a need to safely store operational data. But all companies also need to analyze data to gain insights into their business, and data visualization is core to establishing that real-time business visibility. Data visualization enables the insights required to take action, whether that’s on key sales data, production and operations data, or product usage to improve your applications. The best way to do this as an Atlas user is with Atlas Charts – MongoDB’s first-class data visualization tool, built natively into MongoDB Atlas.

Why choose Charts

First, Charts is natively built for the document model. If you’re familiar with MongoDB, you should be familiar with documents. The document model is a data model made for the way developers think. And with Charts, you can take your data from documents and collections in Atlas and visualize it with no ETL, data movement, or duplication. This speeds up your ability to discover insights. Second, Charts supports all cluster configurations you can create in Atlas, including dedicated clusters, serverless instances, data stored in Online Archive, and federated data in Atlas Data Federation. Typically when you learn about a company’s integrated products and services, you find some “gotchas” or limitations that make any benefits come at a significant cost. For a MongoDB Atlas customer, that could come in the form of discovering that a cluster configuration option isn’t supported by Charts. But that will never be the case. If you create and manage your application data in Atlas, you can visualize it in Charts. That’s it. Third, Charts is a robust data visualization tool with a variety of chart types, extensive customization options, and interactivity. Compared to other options in the business intelligence market, you get the same key benefits without all the complexity. You can learn how to use Charts in a few hours, and you can easily teach your team. It’s the simplest data visualization solution for most teams. Fourth, the value of Charts can extend beyond individual use cases, with sharing and embedding. This lets you flexibly share charts and dashboards with your team, as well as embed them into the contexts that matter most to your data consumers, such as a blog post or your company’s wiki. Finally, Charts is free for Atlas users up to 1GB per project per month, which covers moderate usage for most teams. There are no seat-based licensing fees associated with Charts, so no matter how many team members you have, Charts will remain a low-cost, if not zero-cost, solution for your data visualization needs. Beyond the included free usage, it’s just $1/GB transferred per month. You can check out more pricing details here.

How to use Charts

The best way to learn how to use Charts is to simply give it a try. It’s free to use, and we have a variety of sample dashboards you can use to get started. But let’s walk through some basics to help illustrate the kinds of visualizations that Charts can enable. Charts makes visualizing your data easy by automatically making your Atlas deployments (any cluster configuration) available for visualization. If you’re a project owner, you can manage permissions to data sources in Charts.
We could write an entire blog post on data sources, but if you’re just getting started, know that your data is made easily available in Charts unless your project owner intentionally hides it.

Create a dashboard

Everything in Charts starts with a dashboard, and creating one is easy. Simply select the Add Dashboard button at the top right of the Charts page in Atlas. From there, you’ll fill in some basic information like a title and an optional description, and you’re on your way. Here’s what one of our new sample dashboards looks like. They are a great place to start:

Build a chart

Once you have a dashboard created, you can add your first chart. The chart builder gives you a simple and powerful drag-and-drop interface to help you quickly construct charts. The first step is selecting your data source. Once you have a data source selected, simply add the desired fields to your chart and start customizing. The example below uses our IoT sample dashboard dataset to create a bar chart displaying the total distance traveled by different users. From there you can add filters and further customize your chart with custom colors, data labels, and more. The chart builder even allows you to write, save, and share queries and aggregation pipelines, as shown below (and sketched in code at the end of this post). You can learn more in our documentation. Play around with the chart builder to get familiar with all of its functionality.

Share and embed

A chart can be useful in itself to individual users, but we see users get the most benefit out of Charts when sharing visualizations with others. Once you have created a dashboard with one or more charts, we offer a variety of options for sharing your dashboards with your team, your organization, or via a public link if your data is not sensitive. If you would rather embed a chart or dashboard where your team is already consuming information, check out Charts’ embedding functionality. Charts lets you embed a chart or dashboard via iframe or SDK, depending on your use case. Check out our embedding documentation to learn more. That was just a brief overview of how to build your first charts and dashboards in Atlas Charts, but there’s a lot more functionality to explore. For a full walkthrough, watch our product demo here: Atlas Charts is the only native data visualization tool built for the document model, and it’s the quickest and easiest way to get started visualizing data from Atlas. We hope this introduction helps you get started using Charts to gain greater visibility into your application data, helping you make better decisions on your data. Get started with Atlas Charts today by logging into or signing up for MongoDB Atlas, deploying or selecting a cluster, and navigating to the Charts tab to activate for free.
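As a rough illustration of the pipelines you can save in the chart builder, here is the shape of an aggregation behind a “total distance traveled by user” bar chart like the one described above. The collection and field names are hypothetical:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://cluster0.example.mongodb.net")  # hypothetical URI
trips = client["iot"]["trips"]  # hypothetical collection of trip documents

# One $group stage totals the distance per user; Charts renders the result
# as bars. The same pipeline shape can be pasted into the chart builder.
pipeline = [
    {"$group": {"_id": "$userId", "totalDistance": {"$sum": "$distanceKm"}}},
    {"$sort": {"totalDistance": -1}},
]
print(list(trips.aggregate(pipeline)))
```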

March 16, 2023
Updates

Why MongoDB’s Partner Team is Focused like a Laser, Not a Flashlight

Four years ago, I wrote an article about how our Partner and Sales teams work together to ensure success. Since then, our Partner organization has grown five times in size and become even more of a competitive differentiator for MongoDB. As we continue to build lasting relationships with our partners and become even more strategic in how we leverage our partnerships, I’m reflecting on how far the Partner organization has come and where we’re headed.

The Partner organization is the x-factor for MongoDB

It starts with the customers, but more specifically, developers. Developers are creating some of the most innovative and modern applications with MongoDB, but our developer data platform is only one component of their tech stack. That’s why it’s essential to have an ecosystem of companies who help developers write or modernize their software faster. For MongoDB, this could be system integrators, cloud providers, ISVs who embed MongoDB into their products, technology partners who want to integrate with us, or resellers who enable us to sell MongoDB in new markets and regions. Most companies have a strategy for each and a team that manages these relationships, but there are a few things that make MongoDB’s Partner organization different. First, the people we hire. We look for individuals who have a sales-first mentality, are willing and able to generate pipeline, and can position the value of MongoDB. It’s extremely important for our Partner team to show ROI to our Sales teams, and I’d argue that if your Partner organization can’t do that, you might not need them. As part of the Partner team at MongoDB, you have the opportunity to master your sales skills and be rewarded for your success in finding new partnerships. One of our core MongoDB values is “Own What You Do,” and it’s embodied every day on the Partner team. We demand excellence from ourselves. We take accountability for our actions and our success. We are empowered to make things happen. The second thing that sets MongoDB apart is that we manage partnerships like a laser, not a flashlight. We do not measure success by the number of partners we have. We prefer to deeply invest resources in a handful of alliances while we create an ecosystem funnel to drive the next wave of investments. We look for partnerships with organizations that our customers have told us they’d like us to work better with. Though we have over 1,000 partners, we put most of our horsepower into the top 50 based on this feedback. Lastly, the opportunity at MongoDB is enormous. If you are looking to work with a product that people love, and you believe there is an opportunity to be well-compensated for selling and building full solutions around a product, you’ll find that at MongoDB.

Driving focus via the Partner Specialist teams

At the beginning of this year, we created dedicated specialist teams for Cloud, System Integrator, ISV, VAR, and Tech partners. Customers have told us time and time again that they wanted us to become more intimate with their use cases and the associated ecosystem, and we listened. For example, we now have specialized teams for each cloud partner who know their products inside out and focus on strengthening the relationship by sourcing new opportunities for our sales force. This isn’t something you find in most Partner organizations, as it’s more common for teams to be generalists as opposed to specialists. We began experimenting with specialization in 2021, and a highlight of this specialization is our partnership with Amazon Web Services (AWS).
In the past, MongoDB and AWS were viewed as competitors rather than partners. In 2021, both sides realized that it’s better to work together and decided to dedicate individuals to build a partnership that has since resulted in an incredible number of co-sell wins. AWS has leaned into MongoDB and continues to position MongoDB Atlas as a preferred database for customers. This makes MongoDB one of the top three data partners that AWS has globally, and AWS is now MongoDB’s largest partnership in the world.

Scaling without diluting impact

MongoDB’s Partner organization has quintupled in size since 2019. We have partners in almost every major location around the world and teams who provide regional coverage. With the ROI we’ve seen from specialization, we’ve invested in more specialists and therefore can provide more dedicated resources to each partner. MongoDB’s Partner organization is known as a place with a winning culture where people consistently deliver results. We’ve had many internal transfers from employees who joined MongoDB in Sales, Sales Development, or Marketing and decided to transition into a role on the Partner team. Similarly, our team is focused on providing opportunities for growth. The number of individuals who joined the Partner team as individual contributors and have since been promoted into Director and VP roles is extraordinary. For example, our VP of System Integrator Partner Specialists, Global Lead of Accenture Partner Specialists, RVP of Capgemini Partner Specialists, RVP of Cloud Programs, Global Lead of AWS Partner Specialists, and RVP of Azure Partner Specialists all began their careers as individual contributors here at MongoDB. As we grow our Partner organization, diversity of background, thought, and experiences will continue to be a key differentiator for us. We value different perspectives and view diversity as a way to better serve our customers. Diversity drives a culture of innovation, and investing in inclusion helps us serve customers in all markets, giving us a competitive advantage.

The future of MongoDB's Partner organization

I’m very excited about our coming year. We continue to look for the next partnership to break records with. Whether it's Alibaba, IBM, Databricks, Carahsoft, Microsoft, or Google, working with partners to find new workloads is key to MongoDB’s success. MongoDB plans to continue to invest directly in partners via MongoDB Ventures as part of this strategy. We also take great pride in promoting folks into leadership positions, and we expect even more of that in the year ahead. Our leaders and I live by one of John McMahon’s mottos: “Too many companies think culture is ping-pong, foosball, and beer taps. Helping people win is a culture. Teaching them how to win on their own is a culture. If people aren’t learning, earning, growing, and being promoted, they’re not staying around for the pool table.” This is why we hope you are interested in joining us. We have great products, specialized partnerships, and most importantly, a winning team of fantastic leaders. Want to be part of a team that takes ownership and makes their work matter? View our open roles today.

March 15, 2023
Culture

How Much is Your Data Model Costing Your Business?

Economic volatility is creating an unpredictable business climate, forcing organizations to stretch their dollars further and do more with less. Investments are under the microscope, and managers are looking to wring every ounce of productivity out of existing resources. IT spend is a concern, and many IT decision-makers aren't sure what's driving costs. Is it overprovisioning? Cloud sprawl? Shadow IT? One area that doesn't get a lot of attention is how data is modeled in the database. That's unfortunate, because data modeling can have a major impact on the cost of database operations, the instance size necessary to handle workloads, and the work required to develop and maintain applications.

Pareto patterns

Data access patterns are often an illustration of the Pareto Principle at work, where the majority of effects are driven by a minority of causes. Modern OLTP applications tend to work with data in small chunks. The vast majority of data access patterns (the way applications access and use data) work with either a single row of data or a range of rows from a single table. At least that's what we found at Amazon, looking at 10,000 services across all the various RDBMS-based services we deployed. Normalized data models are quite efficient for these simple single-table queries, but the less frequent complex patterns require the database to join tables to produce a result, exposing RDBMS inefficiencies. The high time complexity associated with these queries meant significantly more infrastructure was required to support them. The relational database hides much of this overhead behind the scenes. When you send a query to a relational database, you don't actually see all the connections opening up on all the tables, or all the objects merging. Even though 90% of the access patterns at Amazon were for simple things, the 10% that were doing more complex things were burning through CPU to the point that my team estimated they were driving ~50% of infrastructure cost. This is where NoSQL data modeling can be a game-changer. NoSQL data models are designed to eliminate expensive joins, reduce CPU utilization, and save on compute costs.

Modeling for efficiency in NoSQL

There are two fundamental approaches to modeling relational data in NoSQL databases:

- Embedded document: All related data is stored in a single rich document, which can be efficiently retrieved when needed.
- Single collection: Related data is split out into multiple documents to efficiently support access patterns that require subsets of a larger relational structure. Related documents are stored in a common collection and contain attributes that can be indexed to support queries for various groupings of related documents.

The key to building an efficient NoSQL data model and reducing compute costs is using the workload to influence the choice of data model. For example, a read-heavy workload like a product catalog that runs queries like "get all the data for a product" or "get all the products in a category" will benefit from an embedded document model because it avoids the overhead of reading multiple documents. On the other hand, a write-heavy workload where writes update bits and pieces of a larger relational structure will run more efficiently with smaller documents stored in a single collection, which can be accessed independently and indexed to support efficient retrieval when all the data is needed.
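To make the two approaches concrete, here is a minimal sketch using a hypothetical product catalog with reviews; the field names and ID scheme are illustrative only:

```python
# Embedded document: one rich document per product, retrieved in a single read.
embedded_product = {
    "_id": "p123",
    "name": "Trail Bike",
    "category": "cycling",
    "reviews": [                      # related data embedded in the parent
        {"user": "u1", "rating": 5},
        {"user": "u2", "rating": 4},
    ],
}

# Single collection: related documents stored side by side, linked by a shared,
# indexable attribute so small pieces can be read and updated independently.
single_collection_docs = [
    {"_id": "p123",    "productId": "p123", "type": "product", "name": "Trail Bike"},
    {"_id": "p123#u1", "productId": "p123", "type": "review",  "user": "u1", "rating": 5},
    {"_id": "p123#u2", "productId": "p123", "type": "review",  "user": "u2", "rating": 4},
]
# An index on "productId" lets one query fetch a product with all its reviews,
# while a single review can be rewritten without touching the product document.
```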
The final choice depends on the frequency and nature of the write patterns and whether or not there's a high-velocity read pattern operating concurrently. If your workload is read-intensive, you want to get as much as you can in one read. For a write-intensive workload, you don't want to have to rewrite the full document every time it changes. Joins increase time complexity. In NoSQL databases, depending on the access pattern mix, all the rows from the relational tables are stored either in a single embedded document or as multiple documents in one collection that are linked together by indexes. Storing multiple related documents in a common collection means there is no need for joins. As long as you're indexing on a common dimension across documents, you can query for related documents very efficiently. Now imagine a query that joins three tables in a relational database, and your machine needs to do 1,000 of them. You would need to read at least 3,000 objects from multiple tables to satisfy the 1,000 queries. With the document model, by embedding all the related data in one document, the queries would read only 1,000 objects from a single collection. Machine-wise, having to merge 3,000 objects from three tables versus reading 1,000 from one collection requires a more powerful and expensive instance. With relational databases, you don't have as much control. Some queries may result in a lot of joins, resulting in higher time complexity, which translates directly into more infrastructure required to support the workload.

Mitigate what matters

In a NoSQL database, you want to model data for the highest efficiency where it hurts the most in terms of cost. Analytical queries tend to be low frequency. It doesn't matter as much if they come back in 100 ms or 10 ms. You just want to get an answer. For things that run once an hour, once a day, or once a week, it's okay if they're not as efficient as they might be in a normalized relational database. Transactional workloads that are running thousands of transactions a second need to process as efficiently as possible because the potential savings are far greater. Some users try to apply these data modeling techniques to increase efficiency in RDBMS platforms, since most now support document structures similar to MongoDB's. This might work for a small subset of workloads. But columnar storage is designed for relatively small rows that are the same size. It works well for small documents, but when you start to increase the size of the row in a relational database, it requires off-row storage. In Postgres this is called TOAST (The Oversized-Attribute Storage Technique). This circumvents the size limit by putting the data in two places, but it also decreases performance in the process. The row-based storage engines used by modern RDBMS platforms were not designed for large documents, and there is no way to configure them to store large documents efficiently.

Drawing out the relationship

The first step we recommend when modeling data is to characterize the workload by asking a few key questions:

- What is the nature of the workload?
- What is the entity relationship diagram (ERD)?
- What are the access patterns?
- What is the velocity of each pattern?
- What are the most important queries we need to optimize?

Identifying the entities and their relationships to each other forms the basis of our data model. Once this is done, we can begin to distill the access patterns.
If it's a read-heavy workload like the product catalog, you'll most likely be working with large objects, which is fine. There are plenty of use cases for that. However, if you're working with more complex access patterns where you're accessing or updating small pieces of a larger relational structure independently, you will want the data separated into smaller documents so you can efficiently execute those high-velocity updates. We teach many of these techniques in our MongoDB University course, M320: MongoDB Data Modeling.

Working with indexes

Using indexes for high-frequency patterns will give you the best performance. Without an index, you have to read every document in the collection and examine it to determine which documents match the query conditions. An index is a B-tree structure that can be parsed quickly to identify documents that match conditions on the indexed attributes specified by the query. You may choose not to index uncommon patterns for various reasons. All indexes incur cost, as they must be updated whenever a document is changed. You might have a high-velocity write pattern that runs consistently and a low-velocity read that happens at the end of the day, in which case you'll accept the higher cost of the full collection scan for the read query rather than incur the cost of updating the index on every write. If you are writing to a collection 1,000 times a second and reading once a day, the last thing you want to do is add an index update for every single write just to make the read efficient. Again, it depends on the workload. Indexes in general should be created for high-velocity patterns, and your most frequent access patterns should be covered by indexes to some extent, either partially or fully. Remember that an index still incurs cost even if you rarely or never read it. Always make sure when you define an index that there is a good reason for it, and that good reason should be that you have a high-frequency access pattern that needs it to read the data efficiently.
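As a minimal sketch of that guidance (the collection and field names are hypothetical), the compound index below covers a high-velocity read pattern so the query walks the B-tree instead of scanning the collection:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://cluster0.example.mongodb.net")  # hypothetical URI
trips = client["telematics"]["customerTrips"]  # hypothetical collection

# Cover the frequent pattern "latest trips for a customer" with a compound
# index; note that every write now also pays the cost of maintaining it.
trips.create_index([("customerId", 1), ("tripDate", -1)])

# This query is satisfied via the index rather than a full collection scan.
recent = trips.find({"customerId": "c42"}).sort("tripDate", -1).limit(10)
```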
Data modeling and developer productivity

Even after you've optimized your data model, cost savings will continue to accrue downstream as developers find that they can develop, iterate, and maintain systems far more efficiently than in a relational database. Specific document design patterns and characteristics of NoSQL can reduce maintenance overhead and, in many cases, eliminate maintenance tasks altogether. For example, document databases like MongoDB support flexible schemas, which eliminates the need for the maintenance windows that schema migrations and catalog refactoring require with an RDBMS. A schema change in a relational database almost always impacts ORM data adapters, which then need to be refactored to accommodate the change. That's a significant amount of code maintenance for developers. With a NoSQL database like MongoDB, there's no need for cumbersome and fragile ORM abstraction layers. Developers can store object data in its native form instead of having to normalize it for a tabular model. Updating data objects in MongoDB requires almost zero maintenance. The application just needs to be aware that documents may have new properties, and know how to update them to the current schema version if they don't. MongoDB will lower license fees and infrastructure costs significantly, but possibly the biggest savings organizations experience from moving away from RDBMS come from reduced development costs. Not only is there less code overall to maintain, but the application will also be easier to understand for someone who didn't write the code. MongoDB makes migrations far simpler and less prone to failure and downtime. Applications can be updated more frequently, in an easier fashion, and without stressing about whether a schema update will fail and require a rollback. Overall, maintaining applications over their lifetime is far easier with NoSQL databases like MongoDB. These efficiencies add up to significant savings over time. It's also worth mentioning that many up-and-coming developers see relational databases as legacy technology they would rather not use. With MongoDB it is easier to attract top talent, a critical factor in any organization's ability to develop best-of-breed products and accelerate time-to-value.

Uplevel your NoSQL data modeling skills

If you want to start reining in the hidden costs in your software development lifecycle by learning how to model data, MongoDB University offers a special course, M320: MongoDB Data Modeling. There are also dozens of other free courses, self-paced video lessons, on-demand labs, and certifications with digital badges to help you master all aspects of developing with MongoDB.

March 15, 2023
Applied

Digital Payments - Latin America Focus

Pushed by new technologies and global trends, the digital payments market is flourishing all around the world. Valued at over USD 68 billion in 2021 and expected to see double-digit growth over the next decade, the market is expanding fastest in emerging economies. A landscape once dominated by incumbents - big banks and credit card companies - is now being attacked by disruptors interested in capturing market share. According to a McKinsey study, there are four major factors at the core of this transformation:

- Pandemic-induced adoption of cashless payments
- E-commerce
- Government push for digital payments
- Fintechs

Interestingly, the pandemic has been a big catalyst in the rise of financial inclusion by encouraging alternative means of payment and new ways of borrowing and saving. These new digital services are in fact easier to access and to consume. In Latin America and the Caribbean (LAC), Covid spurred a dramatic increase in cashless payments: 40% of adults made an online purchase, 14% of them for the first time in their lives. E-commerce has experienced stellar growth, with penetration likely to exceed 70% of the population in 2022. Domestic and global players, including Mercado Libre and Falabella, are pushing digital payment innovation to provide an ever smoother customer experience on their platforms. Central banks are promoting new infrastructure for near real-time payments, with the goal of providing cheaper and faster money transfer for both citizens and businesses. PIX is probably the biggest success story. An instant payment platform developed by Banco Central do Brasil (Brazil's central bank), it began operating in November 2020, and within 18 months over 75% of adult Brazilians had used it at least once. The network processes around $250 billion in annualized payments, about 20% of total customer spend. Users (including self-employed workers) can send and receive real-time payments through a simple interface, 24/7 and free of charge; businesses pay a small fee. In the United States, the Federal Reserve has announced it will launch FedNow, a payment network with characteristics similar to PIX, in mid-2023. These initiatives aim to solve issues such as slow settlements and low interoperability between parties. Incumbent banks still own the lion's share of the digital payment market; however, fintechs have been threatening this dominance by leveraging their agility to execute fast and cater to customer needs in innovative and creative ways. Without the burden of legacy systems to weigh them down, or business models tied to old payment rails, fintechs have been enthusiastic testers and adopters of new technologies and payment networks. Their mobile- and digital-first approach is helping them capture and retain the younger segment of the market, which expects integrated real-time experiences at the touch of a button. An example is Paggo, a Guatemalan fintech that helps businesses streamline payments by letting them share a simple QR code that customers can scan to transfer money. The payment landscape is not only affected by external forces; changes coming from within the industry are also reshaping the customer experience and enabling new services. ISO 20022 is a flexible standard for data interchange that is being adopted by most financial institutions to standardize the way they communicate with each other, thus streamlining interoperability.
Thanks to the adoption of ISO 20022, it's more straightforward for banks to read and process messages, which translates into smoother internal processes and easier automation. For end users this means faster and potentially cheaper payments, as well as richer and more integrated financial apps. 3DS2 is being embraced by the credit and debit card payments ecosystem. It is essentially a payment authentication solution for online shopping transactions. As with ISO 20022, the end user won't be aware of the underlying technology, but will only experience a smoother, frictionless checkout. 3DS2 avoids redirecting users to their banking app for confirmation when buying an item online; now it all happens on the seller's website or app. This is done while also enhancing fraud detection and prevention: the new solution makes it harder to use one's credit or debit card without authorization. The benefit of 3DS2 adoption is twofold: on the one hand, users have increased confidence; on the other, merchants see a lower customer abandonment rate, since fear of fraud at checkout is usually one of the main reasons for ditching an online purchase. This solution is especially beneficial for the LAC region, where, despite wide adoption of e-commerce, people are still reluctant to transact online. One of the factors contributing to this oddity is fear of fraud: Cybersource reported that in 2019, a fifth of the region's e-commerce transactions were flagged as potentially fraudulent and 20% were blocked; that's over six times the global average. Online shoppers' trust should therefore be encouraged by platforms' adoption of 3DS2. It is also worth mentioning the role played by blockchain and cryptocurrencies. Networks such as Ethereum or Lightning are effectively a decentralized alternative to the more traditional payment rails. Over the last few years, more and more people have started to use this technology because of its unique features: low fees, fast processing times, and global reach. Latin America has seen an explosion in adoption due to several factors, with remittances and stablecoin payments being highly prominent. Traditional remittance service providers are in fact slower and more expensive than blockchain networks. Especially in Argentina, an increasing number of autonomous workers are asking to be paid in USDC or USDT, two stablecoins pegged to the value of the dollar, to stave off inflation. It is clear that the payment landscape is rapidly evolving. On the one hand, customers expect products and services that integrate seamlessly with every aspect of their digital lives; whenever an app is perceived as slow, poorly designed, or simply missing features, the user can easily switch to a competitor's alternative. On the other hand, the number of players contending for a share of the digital payments market is expanding, driving down margins on traditional products. The only way to successfully navigate this complex environment is to invest in innovation and in creating new business models. There's no single approach to these challenges, but there's no doubt that every successful business needs to harness the power of data and technology to provide its customers with the personalized, real-time experience they demand.
We at MongoDB believe that a highly flexible and scalable developer data platform provides a solid foundation for achieving that, allowing companies to innovate faster and better monetize their payment data. Visit our Financial Services web page to learn more!

March 14, 2023
Applied

Women Leaders at MongoDB: Raising the Bar with May Petry

March is Women’s History Month. Our women leaders series highlights MongoDB women who are leading teams and empowering others to own their career development and build together. May Petry, Vice President of Digital and Growth Marketing, discusses the importance of defining your values, being authentic, and “getting comfortable with being uncomfortable.”

Tell me a bit about your team.

The Digital and Growth Marketing team is focused on finding the next best customer for MongoDB, helping them be wildly successful on Atlas, and accelerating their future growth on our platform. Our growth goals include driving awareness in net new audiences, generating revenue through our self-serve channel, delivering new digital experiences, and growing sales opportunities.

What characteristics make a good leader?

Good leaders have a clear set of personal values that guide their decisions and define their leadership style. They find joy in not just what their team does but how. A good leader is a ‘bar raiser’ and demonstrates mastery of all the company values. I value authenticity, integrity, empathy, accomplishment, and advocacy in leaders.

What has your experience been like as a woman growing your career in leadership?

There have been many occasions where I am the only woman and person of color in the room. Early in my career, this was intimidating and lonely, but finding allies helped. I also remember being told to “use my voice.” I was. I just wasn’t being heard. Focusing on how to speak so others listen is a skill to develop. The stakes just get higher as you advance your career.

Tell us about some of the biggest lessons you’ve learned throughout your career.

I’ll share two. First, I don’t have to be the best at what my team does. I have to be the best in helping my team do what they do best and excel at arranging their outputs so they’re amplified, highly efficient, and ridiculously impactful. The second is that imposter syndrome doesn’t ever go away. It gets worse - use it to fuel your curiosity and empathy, drive collaboration, and help others grow.

What’s your advice for building and developing a team?

As a leader developing a team, you need to be a role model. Be authentic and vulnerable. Don’t just talk about learning and development - do something about it. Does everyone in your organization have an individual growth plan? Do they know what raising the bar looks like? Do they have regular conversations with their managers for feedback and recognition? That said, everyone is responsible for their own personal and professional growth. Take charge of your destiny by looking for mentors, coaches, and allies.

What’s one piece of advice you have for women looking to grow their careers as leaders?

Get comfortable with being uncomfortable. Find a good circle of people to share, brainstorm, laugh, or cry with. We are our own worst critics, so be kind to yourself, stop apologizing, and go shine! Together, there’s nothing we can’t build. View current openings on our careers site.

March 13, 2023
Culture

Clear: Enabling Seamless Tax Management for Millions of People with MongoDB Atlas

Building India's largest tax and financial services software platform, trusted by more than six million Indians

With India’s large population and growing middle class, the country’s tax-paying population has been rising steadily. At the end of the financial year 2021-22, about 5.83 crore (58.3 million) individuals filed tax returns with the Indian Income Tax Department. In addition, India has about 13.8 million registered Goods and Services Tax (GST) taxpayers. Juxtaposed with growing digitization in India, this opens up massive demand for a convenient and effective platform for managing tax returns. Clear realized this need early on, launching its SaaS income tax return (ITR) filing offering for individuals in 2011; it is now trusted by more than six million Indians and is second only to the Indian IT Department’s portal in terms of registered users. More recently, Clear has been focused on expanding its B2B portfolio, including launching an e-invoicing system. Today, the system supports about 50,000 tax professionals, one million small businesses, and 4,000 enterprises in GST filing.

How to ensure a seamless experience for all users at scale

Clear built the initial version of its B2B e-invoicing system on MySQL. However, as adoption grew, the team started to see the limits of the system. Certain batches of invoices were taking upwards of 25 minutes to process, a serious issue given the time-sensitive nature of tax filing. If any Clear customer failed to file in time, that customer could be given a penalty and labeled as non-compliant by the Indian government. The team knew they needed to take a step back and reevaluate the core structure of their system. The Clear team started the system rework by outlining a set of required capabilities. The new database system would need to be able to scale up quickly to handle periods of peak demand, and down when traffic was low to save on costs. Tax professionals need to be able to see multiple cuts of the data at different levels, so the database would need to support quick and complex aggregations. Lastly, the team knew they didn’t want to be accountable for managing the system themselves. They needed a fully managed option.

MongoDB Atlas chosen for best-in-class scale and performance

The company ran a proof of concept (POC) study comparing MySQL’s performance with other competitive offerings, including MongoDB. It found that, in terms of the time taken to execute different batch sizes of data, MongoDB was considerably faster in all instances. For example, MongoDB’s processing time was 122% faster than the closest competitor and 767% faster than the farthest competitor.

Comparison of performance among databases

Given the document-based nature of invoices, the results of the POC made sense. With MongoDB, the Clear team could store invoice data together instead of splitting it across tables. This minimized the number of costly joins required to obtain data, leading to faster reads. MongoDB also allowed the team to easily split reads and writes in use cases where the system experienced high volumes of reads and where reading slightly stale data was permissible. Clear’s aggregation needs were also easily met with MongoDB’s aggregation pipeline. The combination of aggregation support and MongoDB’s full-text search capabilities meant that the Clear team could easily build filterable and searchable dashboards on top of their invoice data.
Lastly, the team loved the ease of use of MongoDB Atlas, MongoDB’s fully managed developer data platform. With Atlas, the team could scale their clusters up and down on a schedule to match fluctuations in user traffic.

Achieving a 2900% jump in processing speed along with cost savings

After Clear replatformed from MySQL to MongoDB Atlas on AWS, their customers were struck by the improvement. Pranesh Vittal, Director of Engineering, ClearTax India, said, “We have achieved considerable optimization with MongoDB. Our customers are often surprised by the pace of execution. There is a significant improvement in performance, with as much as a 2900% jump in processing speed in some instances.”

Figure: Comparing the performance of the new MongoDB-powered platform

On top of the increased speed, the team is also saving money. “We’ve generated over 20 crore (200 million) invoices to date running on a single sharded cluster with a 4TB disk,” said Pranesh Vittal. “The ability to store older data in cold storage [with Online Archive] helped us achieve this.” Atlas Triggers also help the team automatically scale their clusters down each night and back up in the morning (see the sketch below). The triggers are fully managed and schedule-based, so it’s as easy as setting them up and letting them run. This automatic right-sizing is saving the team upwards of $7,000 each month ($700 per cluster for 10 clusters).

After seeing such positive results, the team has since decided to replatform multiple other products onto MongoDB. “Here, MongoDB’s live support and consultation have proved very useful,” said Pranesh Vittal. Now, Clear manages 25+ clusters and over 10TB of data on MongoDB Atlas.
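To make the schedule-based right-sizing concrete, here is a minimal sketch of one way to resize a cluster tier programmatically via the Atlas Admin API. This is not Clear's implementation (Atlas scheduled Triggers run JavaScript functions), and the project ID, cluster name, keys, and tiers below are hypothetical:

```python
import requests
from requests.auth import HTTPDigestAuth

# Hypothetical Atlas project ID, cluster name, and API keys.
GROUP_ID = "5f1a2b3c4d5e6f7a8b9c0d1e"
CLUSTER_NAME = "invoicing-cluster"
PUBLIC_KEY, PRIVATE_KEY = "examplePublicKey", "examplePrivateKey"

def set_cluster_tier(tier: str) -> None:
    """Resize an Atlas cluster, e.g. M10 overnight and M30 during the day."""
    url = (
        "https://cloud.mongodb.com/api/atlas/v1.0"
        f"/groups/{GROUP_ID}/clusters/{CLUSTER_NAME}"
    )
    resp = requests.patch(
        url,
        auth=HTTPDigestAuth(PUBLIC_KEY, PRIVATE_KEY),
        json={"providerSettings": {
            "providerName": "AWS",
            "instanceSizeName": tier,
        }},
    )
    resp.raise_for_status()

# A nightly schedule (cron, or an Atlas scheduled Trigger) might run:
set_cluster_tier("M10")
```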

March 6, 2023
Applied

Build an ML-Powered Underwriting Engine in 20 Minutes with MongoDB and Databricks

The insurance industry is undergoing a significant shift from traditional to near-real-time, data-driven models, driven both by strong consumer demand and by the urgent need for companies to process large amounts of data efficiently. Data from sources such as connected vehicles and wearables is used to calculate precise, personalized premium prices, while also creating new opportunities for innovative products and services. As insurance companies strive to provide personalized, real-time products, the move toward sophisticated, real-time data-driven underwriting models is inevitable. To process all of this information efficiently, software delivery teams will need to become experts at building and maintaining data processing pipelines. This blog will focus on how you can revolutionize the underwriting process within your organization by demonstrating how easy it is to create a usage-based insurance model using MongoDB and Databricks.

This blog is a companion to the solution demo in our GitHub repository. In the GitHub repo, you will find detailed step-by-step instructions on how to build the data upload and transformation pipeline leveraging MongoDB Atlas platform features, as well as how to generate, send, and process events to and from Databricks. Let’s get started.

Part 1: The use case data model
Part 2: The data pipeline
Part 3: Automated decisions with Databricks

Part 1: The use case data model

Figure 1: Entity relationship diagram - Usage-based insurance example

Imagine being able to offer your customers personalized usage-based premiums that take into account their driving habits and behavior. To do this, you’ll need to gather data from connected vehicles, send it to a machine learning platform for analysis, and then use the results to create a personalized premium for your customers. You’ll also want to visualize the data to identify trends and gain insights. This tailored approach gives your customers greater control over their insurance costs while helping you provide more accurate and fair pricing.

A basic data model to support this use case includes customers, the trips they take, the policies they purchase, and the vehicles insured by those policies. This example builds out three MongoDB collections, as well as two Materialized Views. The full Hackloade data model, which defines all the MongoDB objects within this example, can be found here.

Part 2: The data pipeline

Figure 2: The data pipeline - Usage-based insurance

The data processing pipeline component of this example consists of sample data, a daily materialized view, and a monthly materialized view. A sample dataset of IoT vehicle telemetry data represents the motor vehicle trips taken by customers. It’s loaded into the collection named ‘customerTripRaw’ (1). The dataset can be found here and can be loaded via mongoimport or other methods. To create a materialized view, a scheduled Trigger executes a function that runs an Aggregation Pipeline. This generates a daily summary of the raw IoT data and lands it in a Materialized View collection named ‘customerTripDaily’ (2). Similarly, for the monthly materialized view, a scheduled Trigger executes a function that runs an Aggregation Pipeline that, on a monthly basis, summarizes the information in the ‘customerTripDaily’ collection and lands it in a Materialized View collection named ‘customerTripMonthly’ (3).
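To make the rollup step concrete, here is a minimal sketch of the kind of aggregation the daily scheduled Trigger might run. The collection names match the pipeline described above, but the telemetry field names are hypothetical, and the repo's actual pipeline may differ:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
db = client["insurance"]

# Roll the raw telemetry up into one summary document per customer per day,
# then upsert the results into the materialized view collection.
daily_rollup = [
    {"$group": {
        "_id": {
            "customerId": "$customerId",
            "day": {"$dateTrunc": {"date": "$tripStart", "unit": "day"}},
        },
        "totalMiles": {"$sum": "$miles"},
        "tripCount": {"$sum": 1},
    }},
    # $merge keeps the view incremental: matched summaries are replaced,
    # new ones are inserted.
    {"$merge": {"into": "customerTripDaily", "whenMatched": "replace"}},
]

db["customerTripRaw"].aggregate(daily_rollup)
```

The monthly view follows the same pattern, grouping ‘customerTripDaily’ by month and merging into ‘customerTripMonthly’.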
For more info on these and other MongoDB platform features, see:
MongoDB Materialized Views
Building Materialized Views on Time Series Data
MongoDB Scheduled Triggers
Cron Expressions

Part 3: Automated decisions with Databricks

Figure 3: The data pipeline with Databricks - Usage-based insurance

The decision-processing component of this example consists of a scheduled Trigger and an Atlas Chart. The scheduled Trigger collects the necessary data and posts the payload to a Databricks ML Flow API endpoint (the model was previously trained using the MongoDB Spark Connector on Databricks), then waits for the model to respond with a premium calculated from the miles driven by a given customer in a month. The scheduled Trigger then updates the ‘customerPolicy’ collection, appending the new monthly premium calculation as a new subdocument within the ‘monthlyPremium’ array (a minimal sketch of this round trip appears at the end of this post). You can then visualize your newly calculated usage-based premiums with an Atlas Chart!

In addition to the MongoDB platform features listed above, this section utilizes:
MongoDB Atlas App Services
MongoDB Functions
MongoDB Charts

Go hands on

Automated digital underwriting is the future of insurance. In this blog, we introduced how you can build a sample usage-based insurance data model with MongoDB and Databricks. If you want to see how quickly you can build a usage-based insurance model, check out our GitHub repository and dive right in!
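Finally, as referenced in Part 3, here is a loose Python sketch of that scoring round trip. In the demo itself this logic runs as a JavaScript Atlas Function, and the endpoint URL, token, response shape, and field names below are all hypothetical:

```python
import requests
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
db = client["insurance"]

# Hypothetical Databricks model-serving endpoint and access token.
ENDPOINT = "https://example.cloud.databricks.com/serving-endpoints/premium/invocations"
TOKEN = "dapi-example-token"

def score_and_store(customer_id: str, month: str) -> None:
    # Pull the month's usage summary from the materialized view.
    usage = db["customerTripMonthly"].find_one(
        {"customerId": customer_id, "month": month}
    )

    # Ask the trained model for a premium based on miles driven.
    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"dataframe_records": [{"totalMiles": usage["totalMiles"]}]},
    )
    resp.raise_for_status()
    premium = resp.json()["predictions"][0]

    # Append the calculation as a new subdocument in the monthlyPremium array.
    db["customerPolicy"].update_one(
        {"customerId": customer_id},
        {"$push": {"monthlyPremium": {"month": month, "premium": premium}}},
    )
```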
