MongoDB Blog

Announcements, updates, news, and more

Teach & Learn with MongoDB: Professor Abdussalam Alawini, University of Illinois at Urbana-Champaign

In this series of interviews, we talk to students and educators around the world who are using MongoDB to make their classes more engaging and relevant. By exploring their stories, we uncover how MongoDB’s innovative platform and resources are transforming educational landscapes and empowering the next generation of tech-savvy professionals.

From creative teaching approaches to advanced classroom solutions, the MongoDB for Educators program can help you transform your classroom with cutting-edge technology and free resources. It can help you provide students with an interactive and dynamic learning environment that bridges the gap between theoretical knowledge and practical application. The program includes a variety of free resources for educators crafted by MongoDB experts to prepare learners with in-demand database skills and knowledge. Program participants have access to MongoDB Atlas credits, curriculum materials, certifications, and membership in a global community of educators from over 700 universities.

From theory to practice: Hands-on MongoDB Teaching

Professor Abdussalam Alawini is known for his creative use of MongoDB in his courses. He makes heavy use of MongoDB’s free cluster to demonstrate concepts during classes, and his students use the free cluster for their projects, giving them hands-on experience with real-world applications.

Currently a Teaching Associate Professor at the University of Illinois Urbana-Champaign, Professor Alawini has research interests spanning databases, applied machine learning, and education. He is particularly focused on applying machine learning methods to enhance classroom experiences. His work also includes developing next-generation data management systems, such as systems for data provenance, data citation, and scientific data management. He recently received the U of I’s 2024 Campus Excellence in Undergraduate Education award, which highlights his commitment to teaching and the impact he has had on his students.
Professor Alawini is currently collaborating with colleagues on research mapping how databases, data systems, data management, and related courses are taught in introductory undergraduate computer science programs worldwide.

Professor Alawini’s story offers valuable insights for educators eager to enhance their teaching and prepare students for a tech-driven future. Check out how MongoDB Atlas has revolutionized his teaching by simplifying database deployment, management, and scaling, allowing students to focus more on learning MongoDB concepts.

Tell us about your educational journey and what sparked your interest in databases.

My educational journey began with a bachelor’s degree in Computer Science from the University of Tripoli in 2002. I then spent over six years in industry as a database administrator, lead software developer, and IT manager. In 2011, I returned to academia and earned two master’s degrees, in Computer Science and in Engineering and Technology Management, from Portland State University, followed by a Ph.D. in Computer Science in 2016. I then joined the University of Pennsylvania for two years of postdoctoral training.

My interest in databases was sparked during my time as a database administrator at PepsiCo, where I enjoyed maintaining the company’s databases and building specialized reports to improve business operations. I was particularly fascinated by database systems’ ability to optimize queries and handle millions of concurrent user requests seamlessly. This experience led me to focus my doctoral studies on building data management systems for scientific applications.

What courses are you currently teaching at the University of Illinois Urbana-Champaign?

Currently, I teach Database Systems and Data Management in the Cloud at the University of Illinois Urbana-Champaign. I also teach a course introducing University High School students to data management and database basics.
My intention in teaching databases to high schoolers is to use data management as a gateway that lowers entry barriers into computing for non-computer science students and helps recruit underrepresented minorities to the field.

What inspired you to start teaching MongoDB?

I was inspired to start teaching MongoDB after seeing several surveys indicating that it is the most used database in web development and one of the leading document-oriented databases. MongoDB offers several unique features that set it apart from other databases, including the aggregation pipeline, which simplifies data processing and transformation. Additionally, MongoDB’s flexible schema design allows for easier handling of unstructured data, and its horizontal scalability ensures robust performance as data volumes grow. These features make MongoDB an essential tool for modern web development, and I wanted to equip my students with the skills to leverage this powerful technology.

How do you design your course content to effectively integrate MongoDB and engage students in practical learning?

In all my data management courses, I focus on teaching students the concept of data models, including relational, document, key-value, and graph. In my Database Systems course, I teach MongoDB alongside SQL and Neo4j to highlight the unique features and capabilities of each data model. This comparative approach helps students appreciate the importance and applications of different databases, ultimately making them better data engineers. In my Data Management in the Cloud course, I emphasize the systems side of MongoDB, particularly its scalability. Understanding how MongoDB is built to handle large volumes of data efficiently gives students practical insight into managing data in a cloud environment.

To effectively integrate MongoDB and engage students in practical learning, I use a hybrid flipped-classroom approach.
Students watch recorded lectures before class, allowing us to dedicate class time to working through examples together. Additionally, students form teams to work on various data management scenarios using a collaborative online assessment tool called PrairieLearn. This model fosters peer learning and collaboration, enhancing the overall educational experience.

How has MongoDB supported you in enhancing your teaching methods and upskilling your students?

I would like to sincerely thank MongoDB for Academia for the amazing support and material they provided to enhance my course design. The free courses offered at MongoDB University have significantly improved my course delivery, allowing me to provide more in-depth and practical knowledge to my students. I heavily use MongoDB’s free cluster to demonstrate MongoDB concepts during classes, and my students also use the free cluster for their projects, which gives them hands-on experience with real-world applications.

MongoDB Atlas has been a game-changer in my teaching methods. As a fully managed cloud database, it simplifies the process of deploying, managing, and scaling databases, allowing students to focus on learning and applying MongoDB concepts without getting bogged down by administrative tasks. The flexibility and reliability of MongoDB Atlas make it an invaluable tool for both educators and students in the field of data management.

Could you elaborate on the key findings from your ITiCSE paper on students’ experiences with MongoDB and how these insights can help other educators?

In my ITiCSE paper, we conducted an in-depth analysis of students’ submissions to MongoDB homework assignments to understand their learning experiences and challenges. The study revealed that as students use more advanced MongoDB operators, they tend to make more reference errors, indicating a need for a better conceptual understanding of these operators.
Additionally, when students encounter new functionality, such as the $group operator, they initially struggle but generally do not repeat the same mistakes in subsequent problems. These insights suggest that educators should allocate more time and effort to teaching advanced MongoDB concepts and provide additional support during the initial learning phases. By understanding these common difficulties, instructors can better tailor their teaching strategies to improve student outcomes and enhance the learning experience.

What advice would you give to fellow educators who are considering implementing MongoDB in their own courses to ensure a successful and impactful experience for their students?

Implementing MongoDB in your courses can be highly rewarding. Here’s some advice to ensure success:

- Foundation in Data Models: Teach MongoDB alongside other database types to highlight unique features and applications, making students better data engineers.
- Utilize MongoDB Resources: Leverage support from MongoDB for Academia, free courses from MongoDB University, and free clusters for hands-on projects.
- Practical Learning: Use MongoDB Atlas to simplify database management and focus on practical applications.
- Focus on Challenges: Allocate more time for advanced MongoDB concepts. Address common errors, and use tools like PrairieLearn that capture students’ interactions and learning progress to identify learning patterns and adjust instruction.
- Encourage Real-World Projects: Incorporate practical projects to enhance skills and relevance.
- Continuous Improvement: Gather feedback to iteratively improve course content, and share successful strategies with peers. MongoDB is always evolving, so stay current with its updates and new features.

These steps will help create an engaging learning environment, preparing students for real-world data management.
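The $group stage the study singles out is worth seeing concretely. Below is a hedged sketch of the kind of aggregation students write, expressed as the plain Python dictionaries PyMongo accepts, alongside a pure-Python emulation of what the stage computes. The orders collection and its fields are invented for illustration, not taken from the course materials.

```python
# A MongoDB aggregation pipeline using $group, written as the plain
# Python dictionaries PyMongo expects. Collection and field names
# (orders, status, customerId, amount) are invented for illustration.
pipeline = [
    {"$match": {"status": "shipped"}},           # filter first
    {"$group": {
        "_id": "$customerId",                    # group key
        "total": {"$sum": "$amount"},            # accumulator
        "orders": {"$sum": 1},                   # document count per group
    }},
    {"$sort": {"total": -1}},
]
# Running it requires a live collection, e.g.:
#   results = db.orders.aggregate(pipeline)

def emulate_group(docs):
    """Pure-Python stand-in showing what the $group stage above computes."""
    totals = {}
    for d in docs:
        if d.get("status") != "shipped":
            continue
        total, count = totals.get(d["customerId"], (0, 0))
        totals[d["customerId"]] = (total + d["amount"], count + 1)
    return {k: {"total": t, "orders": n} for k, (t, n) in totals.items()}

sample = [
    {"customerId": "a", "status": "shipped", "amount": 10},
    {"customerId": "a", "status": "shipped", "amount": 5},
    {"customerId": "b", "status": "pending", "amount": 99},
]
print(emulate_group(sample))  # {'a': {'total': 15, 'orders': 2}}
```

The emulation makes explicit what trips students up: "_id" is the group key, and field references inside the stage need the "$" prefix.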
Apply to the MongoDB for Educators program and explore free resources crafted by MongoDB experts to prepare learners with in-demand database skills and knowledge.

July 10, 2024
Applied

Building Gen AI with MongoDB & AI Partners | June 2024

Even for those of us who work in AI, keeping up with the latest news in the AI space can be head-spinning. In just the last few weeks, OpenAI introduced its newest model (GPT-4o), Anthropic continued to develop Claude with the launch of Claude 3.5 Sonnet, and Mistral launched Mixtral 8x22B, its most efficient open model to date. And those are only a handful of recent releases!

In such an ever-changing space, partnerships are critical to combining the strengths of organizations to create solutions that would be challenging to develop independently. It can also be overwhelming for any one business to keep track of so much change, so there’s a lot of value in partnering with industry leaders and new players alike to bring the latest innovations to customers. I’ve been at MongoDB for less than a year, but in that time our team has already built dozens of strategic partnerships that are helping companies and developers build AI applications faster and more safely. I love to see these collaborations take off!

A compelling example is MongoDB’s recent work with Vercel. Our team developed a sample application that lets users deploy a retrieval-augmented generation (RAG) application on Vercel in just a few minutes: with a MongoDB URI and an OpenAI key, users can deploy it in one click. Another recent collaboration was with Netlify, for which our team developed a starter template that implements a RAG chatbot on their platform using LangChain, with MongoDB Atlas Vector Search storing and searching the knowledge base that powers the chatbot’s responses.

These examples demonstrate the power of combining MongoDB’s robust database capabilities with modern deployment platforms. They show how quickly users can set up fully functional RAG applications, and they highlight the significant advantages that partnerships bring to the AI ecosystem. And the best part? We’re just getting started!
Stay tuned for more information about the MongoDB AI Applications Program later this month.

Welcoming new AI partners

Speaking of partnerships, in June we welcomed seven AI partners that offer product integrations with MongoDB. Read on to learn more about each new partner.

AppMap is an open-source personal observability platform that helps developers keep their software secure, clear, and aligned. Elizabeth Lawler, CEO of AppMap, commented on our joint value for developers: “AppMap is thrilled to join forces with MongoDB to help developers improve and optimize their code. MongoDB is the go-to data store for web and mobile applications, and AppMap makes it easier than ever for developers to migrate their code from other data stores to MongoDB and to keep their code optimized as their applications grow and evolve.” Read more about our partnership and how to use AppMap to improve the quality of code running with MongoDB.

Mendable is a platform that automates customer service, providing quick and accurate answers to questions without human intervention. Eric Ciarla, co-founder of Mendable, highlighted the importance of our partnership. “Our partnership with MongoDB is unlocking massive potential in AI applications, from go-to-market copilots to countless other innovative use cases,” he said. “We’re excited to see teams at MongoDB and beyond harnessing our combined technologies to create transformative AI solutions across all kinds of industries and functions.” Learn how Mendable and MongoDB Atlas Vector Search power customer service applications.

OneAI is an API-first platform built for developers to create and manage trusted GPT chatbots. Amit Ben, CEO of OneAI, shared his excitement about the partnership: “We’re thrilled to partner with MongoDB to help customers bring trusted GenAI to production.
OneAI’s platform, with RAG pipelines, LLM-based chatbots, goal-based AI, anti-hallucination guardrails, and language analytics, empowers customers to leverage their language data and engage users even more effectively on top of MongoDB Atlas.” Check out some of OneAI’s GPT agents and advanced RAG pipelines built on MongoDB.

Prequel allows companies to sync data to and from their customers’ data warehouses, databases, or object storage, giving them better data access with less engineering effort. “Sharing MongoDB data just got easier with our partnership,” celebrated Charles Chretien, co-founder of Prequel. “Software companies running on MongoDB can use Prequel to instantly share billions of records with customers on every major data warehouse, database, and object storage service.” Learn how you can share MongoDB data using Prequel.

Qarbine complements summary data visualization tools, allowing for better-informed decision-making across teams. Bill Reynolds, CTO of Qarbine, noted the impact of our integration on distilling better insights from data: “We’re excited to extend the many MongoDB Atlas benefits upward in the modern application stack to deliver actionable insights from publication-quality drill-down analysis. The native integrations enhance in-app real-time decisions, business productivity, and operational data ROI, fueling modern application innovation.” Want to power up your insights with MongoDB Atlas and Qarbine? Read more.

Temporal is a durable execution platform for building and scaling invincible applications faster. “Organizations of all sizes have built AI applications that are ‘durable by design’ using MongoDB and Temporal. The burden of managing data and agent task orchestration is effortlessly abstracted away by Temporal’s development primitives and MongoDB’s Atlas Developer Data Platform,” says Jay Sivachelvan, VP of Partnerships at Temporal. He also highlighted the benefits of this partnership.
“These two solutions, together, provide compounding benefits by increasing product velocity while also seamlessly automating the complexities of scalability and enterprise-grade resilience.” Learn how to build microservices more efficiently with MongoDB and Temporal.

Unstructured is a platform that connects any type of enterprise data for use with vector databases and any LLM framework. Read more about enhancing your gen AI application accuracy using MongoDB and Unstructured.

But wait, there’s more! To learn more about building AI-powered apps with MongoDB, check out our AI Resources Hub, and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem.
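The RAG starter templates mentioned above (Vercel, Netlify) rest on the same storage pattern: split source text into chunks, embed each chunk, and store chunk plus embedding in MongoDB for vector search. A minimal sketch of that preprocessing step follows; the field names and the stand-in embed function are invented, and a real application would call an embedding model instead.

```python
def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word-window chunks, the typical
    preprocessing step before embedding a RAG knowledge base."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

def to_documents(source_id, text, embed):
    """Shape chunks into the documents a vector-search collection stores.
    `embed` stands in for a real embedding model (e.g. an OpenAI call)."""
    return [
        {"sourceId": source_id, "chunk": i, "content": c, "embedding": embed(c)}
        for i, c in enumerate(chunk_text(text))
    ]

def fake_embed(s):
    return [float(len(s) % 7)]  # placeholder vector, not a real model

docs = to_documents("faq-1", "word " * 100, fake_embed)
print(len(docs))  # 3
```

Once documents like these are inserted, an Atlas Vector Search index on the embedding field makes them retrievable by semantic similarity.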

July 9, 2024
Artificial Intelligence

Elevate Your Python AI Projects with MongoDB and Haystack

MongoDB is excited to announce an integration with Haystack, enhancing MongoDB Atlas Vector Search for Python developers. This integration amplifies our commitment to providing developers with cutting-edge tools for building AI applications centered around semantic search and large language models (LLMs).

“We’re excited to partner with MongoDB to help developers build top-tier LLM applications. The new Haystack and MongoDB Atlas integration lets developers seamlessly use MongoDB data in Haystack, a reliable framework for creating quality LLM pipelines for use cases like RAG, QA, and agentic pipelines. Whether you’re an experienced developer or just starting, your gen AI projects can quickly progress from prototype to adoption, accelerating value for your business and end-users.”

Malte Pietsch, co-founder and CTO, deepset

Simplifying AI app development with Haystack

Haystack is an open-source Python framework that simplifies AI application development. It enables developers to start their projects quickly, experiment with different AI models, and scale their applications efficiently. Haystack is particularly effective for building applications requiring semantic understanding and natural language processing (NLP), such as chatbots and question-answering systems. Haystack’s core features include:

- Components: Haystack breaks down complex NLP tasks into manageable components, such as document retrieval or text summarization. With the new MongoDB-Haystack integration, MongoDB becomes the place where all your data lives, ready for Haystack to use.
- Pipelines: Haystack lets you link components together into pipelines for more complex tasks. With this integration, your MongoDB data flows through these pipelines.
- Agents: Haystack Agents use LLMs to resolve complex queries. They can decide which tools (or components) to use for a given question, leveraging MongoDB data to deliver smarter answers.
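Under the hood, a retrieval component backed by MongoDB issues an Atlas aggregation query. As an illustration of what that query looks like, here is a $vectorSearch pipeline built as plain Python dictionaries; the index name, vector field, and query vector are hypothetical, and in practice the vector would come from an embedding model.

```python
# Sketch of the aggregation pipeline behind an Atlas Vector Search
# retrieval. Index and field names are assumed, not from any
# particular application.
def vector_search_pipeline(query_vector, limit=5):
    return [
        {"$vectorSearch": {
            "index": "vector_index",       # Atlas Vector Search index (assumed name)
            "path": "embedding",           # field holding the document vectors
            "queryVector": query_vector,
            "numCandidates": 20 * limit,   # oversample, then keep `limit`
            "limit": limit,
        }},
        {"$project": {
            "content": 1,
            "score": {"$meta": "vectorSearchScore"},
        }},
    ]

# With PyMongo this would be executed against a live collection as:
#   docs = collection.aggregate(vector_search_pipeline(embedding))
pipeline = vector_search_pipeline([0.12, -0.07, 0.33], limit=3)
print(pipeline[0]["$vectorSearch"]["limit"])  # 3
```

Since aggregation stages are just documents, frameworks like Haystack can compose them programmatically, which is what makes the document-store integration natural.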
Atlas Vector Search: Enhance AI development with Haystack

At the heart of the new integration is MongoDB Atlas Vector Search, transforming how applications search and retrieve data. By leveraging vector embeddings, Atlas Vector Search goes beyond mere keyword matching: it interprets the intent behind queries, enabling applications to provide highly relevant, context-aware responses. This is a breakthrough for Python developers who aim to build applications that think and understand like humans.

Building on this foundation, the Atlas Vector Search and Haystack integration gives Python developers a powerful toolkit for navigating the complexities of AI application development. MongoDB becomes a dynamic document store within Haystack’s framework, optimizing data storage, processing, and retrieval. Additionally, the integration eases the use of advanced AI models from leading providers such as OpenAI and Cohere in your applications. Developers can thus create applications that do more than just answer queries: they grasp and act on the underlying intent, ensuring responses are both accurate and contextually relevant.

What this means for Python developers

For Python developers, this integration means:

- Faster development: Developers can focus on building and innovating rather than spending time configuring and managing infrastructure. MongoDB’s integration with Haystack means you can get up and running quickly, leveraging the best of both technologies to accelerate your development cycles.
- Smarter applications: By utilizing Haystack’s powerful natural language processing tooling in combination with MongoDB Atlas Vector Search’s efficient data handling, developers can create applications that understand and process natural language more effectively. This results in applications that can provide more accurate and contextually relevant responses that resonate with user intent.
- Access to pre-trained AI models: With seamless integration of leading generative AI models from providers like OpenAI, Anthropic, Cohere, Hugging Face, and Amazon Bedrock, Python developers can easily incorporate advanced AI functionality into their projects. Developers can quickly adopt state-of-the-art models without extensive training or fine-tuning, saving time and resources.
- Flexible and scalable pipelines: Haystack’s modular approach to building AI applications, through its use of components and pipelines, allows developers to create flexible and scalable solutions. With MongoDB data seamlessly flowing through these pipelines, you can easily adapt and expand your applications to meet growing demands and new challenges.
- Robust search capabilities: Atlas Vector Search transforms the way applications retrieve and interpret data, going beyond simple keyword searches. It enables applications to perform high-precision searches that return more relevant and semantically rich results. This advanced search capability is crucial for developing applications that require high levels of semantic understanding and accuracy.

By integrating MongoDB with Haystack, Python developers are equipped with a powerful toolkit that not only simplifies the AI development process but also significantly enhances the intelligence and functionality of their applications. Whether you are building chatbots, search engines, or other AI-driven applications, this integration provides the tools you need to create innovative and impactful solutions.

Get started now

Start leveraging the MongoDB and Haystack integration for your AI development. Explore our tutorial and documentation, or check out our GitHub repository to begin building smarter, more intuitive Python projects today!

July 8, 2024
Updates

Nokia Corteca Scales Wi-Fi Connectivity to Millions of Devices With MongoDB Atlas

Nokia’s home Wi-Fi connectivity cloud platform launched in 2019 as the Nokia WiFi Cloud Controller (NWCC). In 2023, it was renamed and relaunched as the Corteca Home Controller, becoming part of the Corteca software suite that delivers smarter broadband for a better experience. The Corteca Home Controller can be hosted on Amazon Web Services, Google Cloud, or Microsoft Azure, and is the industry’s first platform to support three management services: device management, Wi-Fi management, and application management. Support for TR-369 (a standardized remote device management protocol) also allows the Home Controller to work in a multi-vendor environment, managing both Nokia and third-party broadband devices. By solving connectivity issues before the end user detects them, and by automatically optimizing Wi-Fi performance, the Home Controller helps deliver excellent customer experiences to millions of users, 24/7.

During the five years that Nokia Corteca has been a MongoDB Atlas customer, the Home Controller has successfully scaled from 500,000 devices to over 4.5 million, and it now serves 75 telecommunications customers across all regions of the globe.

Having the stability, efficiency, and performance to scale

Nokia Corteca’s solution is end-to-end, from applications embedded in the device, through the home, and into the cloud. Algorithms assess data extracted from home networks and automatically adjust performance parameters as needed (changing Wi-Fi channels to avoid network interference, for example), ensuring zero downtime. The Home Controller processes real-time data sent from millions of devices, generating massive volumes of data. With a cloud optimization team tasked with deploying the solution to ever more customers across the globe, the Home Controller needed to store and manage its vast dataset and onboard new telecommunications organizations more easily, without incurring any downtime.
Before Nokia Corteca moved to MongoDB Atlas, its legacy relational database lacked stability and required both admin and application teams to manage operations.

A flexible model with time series capabilities

That’s where MongoDB Atlas came in. Nokia was familiar with the MongoDB Atlas database platform, having already worked with it as part of a previous company acquisition and solution integration. Because Nokia’s development team had direct experience with the scalability, manageability, and ease of use offered by MongoDB Atlas, they knew it had the potential to address the Home Controller’s technical and business requirements.

There was another key element: Nokia wanted to store time series data, a sequence of data points in which insights are gained by analyzing changes over time. MongoDB Atlas can store operational and time series data in parallel and provides robust querying capabilities on that data. Other advantages include MongoDB’s flexible schema, which helps developers store data to match the application’s needs and adapt as data changes over time. MongoDB Atlas also provides features such as Performance Advisor, which monitors database performance and makes intelligent recommendations to improve performance and resource consumption.

Fast real-time data browsing and scalability made easy

Previously, scaling the database had been time-consuming and manual. With MongoDB Atlas, the team can scale up as demand increases with very little effort and no downtime. This also makes it much more straightforward to add new clients, such as large telecommunications companies. Having started with 100GB of data, the team now has more than 1.3 terabytes and can increase disk space in a fraction of a second, positioning the team to scale with the business. As the Home Controller grows and onboards more telcos, the team anticipates a strengthening relationship with MongoDB.
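The time series capability described above maps to MongoDB’s dedicated time series collections. As a hedged sketch (the collection name and fields are invented, not Nokia’s actual schema), the options passed to PyMongo’s create_collection look like this:

```python
# Options for creating a MongoDB time series collection, as passed to
# PyMongo's create_collection. Field names (ts, deviceId) are invented
# to mirror the per-device telemetry described above.
timeseries_options = {
    "timeField": "ts",          # required: the timestamp of each measurement
    "metaField": "deviceId",    # optional: groups measurements by source
    "granularity": "minutes",   # hint for MongoDB's internal bucketing
}

# With a live cluster this would be:
#   db.create_collection("wifi_metrics", timeseries=timeseries_options)
# after which measurements are inserted as ordinary documents:
sample_measurement = {
    "ts": "2024-07-01T12:00:00Z",   # a real app would use a datetime object
    "deviceId": "gw-1234",
    "channelUtilization": 0.42,
    "rssi": -61,
}
print(sorted(timeseries_options))  # ['granularity', 'metaField', 'timeField']
```

Storing time series and operational data side by side in the same database is what lets one set of aggregation queries span both, as the article describes.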
“We have a very good relationship with the MongoDB team,” said Jaisankar Gunasekaran, Head of Cloud Hosting and Operations at Nokia. “One of the main advantages is their local presence—they’re accessible, they’re friendly, and they’re experts. It makes our lives easier and lets us concentrate on our products and solutions.” To learn more about how MongoDB can help drive innovation and capture customer imaginations, check out our MongoDB for Telecommunications page.

July 2, 2024
Applied

Building Gen AI-Powered Predictive Maintenance with MongoDB

In today’s fast-evolving industrial landscape, digital transformation has become a necessity. From manufacturing plants to connected vehicles, the push toward predictive maintenance excellence is driving organizations to embrace smarter, more efficient ways of managing operations. One of the most compelling advancements in this domain is predictive maintenance powered by generative AI, a cutting-edge approach poised to revolutionize how industries maintain and optimize their equipment.

For manufacturers seeking maintenance excellence, a unified data store and a developer data platform are key enablers. These tools provide the foundation for integrating AI applications that can analyze sensor data, predict failures, and optimize maintenance schedules. MongoDB Atlas is the only multi-cloud developer data platform designed to streamline and speed up how developers work with data. With MongoDB Atlas, developers can enhance end-to-end value chain optimization through AI/ML, advanced analytics, and real-time data processing, supporting cutting-edge mobile, edge, and IoT applications.

In this post, we’ll explore the basics of predictive maintenance and how MongoDB can be used for maintenance excellence.

Understanding the need for predictive maintenance

Predictive maintenance is about anticipating and addressing equipment failures before they occur, ensuring minimal disruption to operations. Traditional maintenance strategies, like time-based or usage-based maintenance, are less effective because they don’t account for the varying conditions and complexities of machinery. Unanticipated equipment breakdown can result in line stoppage and substantial throughput losses, potentially leading to millions of dollars in lost revenue.

Since the pandemic, many organizations have begun significant digital transformations to improve efficiency and resilience. However, a concerning gap exists between tech adoption and return on investment.
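To make the contrast with time-based maintenance concrete: the simplest form of condition monitoring, a statistical check on recent sensor readings, fits in a few lines. This is a deliberately naive stand-in for the trained ML models that production systems use; the readings and threshold below are invented.

```python
import statistics

def is_anomalous(readings, new_value, z_threshold=3.0):
    """Flag a sensor reading whose z-score against recent history
    exceeds the threshold. A toy stand-in for the trained models a
    real predictive-maintenance system would use."""
    mean = statistics.fmean(readings)
    stdev = statistics.stdev(readings)
    if stdev == 0:
        return new_value != mean
    return abs(new_value - mean) / stdev > z_threshold

# Hypothetical torque readings (Nm) from a milling machine
history = [40.1, 39.8, 40.3, 40.0, 39.9, 40.2]
print(is_anomalous(history, 40.1))  # False: within normal variation
print(is_anomalous(history, 55.0))  # True: worth a maintenance check
```

Model-based approaches go further than a check like this by predicting failures ahead of time and diagnosing root causes, which is exactly the gap generative AI aims to fill.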
While 89% of organizations have begun digital and AI transformations, only 31% have seen the expected revenue lift, and only 25% have realized the expected cost savings. These numbers highlight the importance of implementing new technologies strategically: manufacturers need to carefully consider how AI can address their specific challenges and then integrate it into existing processes effectively.

Predictive maintenance boosts efficiency and saves money

Predictive maintenance uses data analysis to identify problems in machines before they fail. This allows organizations to schedule maintenance at the optimal time, maximizing machine reliability and efficiency. Indeed, according to Deloitte, predictive maintenance can lead to a variety of benefits, including:

- 3-5% reduction in new equipment costs
- 5-20% increase in labor productivity
- 15-20% reduction in facility downtime
- 10-30% reduction in inventory levels
- 5-20% reduction in carrying costs

Since the concept was introduced, predictive maintenance has constantly evolved. We’ve moved beyond basic threshold-based monitoring to advanced techniques like machine learning (ML) models. These models can not only predict failures but also diagnose the root cause, allowing for targeted repairs. The latest trend in predictive maintenance is automated strategy creation: using AI not only to predict equipment breakdowns but also to generate repair plans, ensuring the right fixes are made at the right time.

Generative AI in predictive maintenance

To better understand how gen AI can be used to build robust predictive maintenance solutions, let’s dig into the characteristics of organizations that have successfully implemented AI. They exhibit common traits across five key areas:

- Identifying high-impact value drivers and AI use cases: Efforts should be concentrated on domains where artificial intelligence yields maximal utility rather than employing it arbitrarily.
- Aligning AI strategy with data strategy: Organizations must establish a strong data foundation with a data strategy that directly supports their AI goals.
- Continuous data enrichment and accessibility: High-quality data, readily available and usable across the organization, is essential for the success of AI initiatives.
- Empowering talent and fostering development: By equipping their workforce with training and resources, organizations can empower them to leverage AI effectively.
- Enabling scalable AI adoption: Building a strong and scalable infrastructure is key to unlocking the full potential of AI by enabling its smooth and ongoing integration across the organization.

Implementing predictive maintenance using MongoDB Atlas

When combined with a robust data management platform like MongoDB Atlas, gen AI can predict failures with remarkable accuracy and suggest optimal maintenance schedules. MongoDB Atlas offers a suite of features well suited to building a predictive maintenance system, as shown in Figure 1 below. Its ability to handle both structured and unstructured data allows for comprehensive condition monitoring and anomaly detection. Here’s how you can build generative AI-powered predictive maintenance software using MongoDB Atlas:

- Machine prioritization: This stage prioritizes machines for the maintenance excellence program using a retrieval-augmented generation (RAG) system that takes in structured and unstructured data related to maintenance costs and past failures. Generative AI revolutionizes this process by reducing manual analysis time and minimizing investment risks.
At the end of this stage, the organization knows exactly which equipment or assets are well-suited for sensorization. Utilizing MongoDB Atlas, which stores both structured and unstructured data, allows for semantic searches that provide accurate context to AI models. This results in precise machine prioritization and criticality analysis.

Failure prediction: MongoDB Atlas provides the necessary tools to implement failure prediction, offering a unified view of operational data, real-time processing, integrated monitoring, and seamless machine learning integration. Sensors on machines, like milling machines, collect data (e.g., air temperature and torque) and process it through Atlas Stream Processing, allowing continuous, real-time data handling. This data is then analyzed by trained models in MongoDB, with results visualized using Atlas Charts and alerts pushed via Atlas Device Sync to mobile devices, establishing an end-to-end failure prediction system.

Repair plan generation: To implement a comprehensive repair strategy, generating a detailed maintenance work order is crucial. This involves integrating structured data, such as repair instructions and spare parts, with unstructured data from machine manuals. MongoDB Atlas serves as the operational data layer, seamlessly combining these data types. By leveraging Atlas Vector Search and aggregation pipelines, the system extracts and vectorizes information from manuals and past work orders. This data feeds into a large language model (LLM), which generates the work order template, including inventory and resource details, resulting in an accurate and efficient repair plan.

Maintenance guidance generation: Generative AI is used to integrate service notes and additional information with the repair plan, providing enhanced guidance for technicians. For example, if service notes in another language are found in the maintenance management system, we extract and translate the text to suit our application.
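To make the repair plan generation stage concrete, here is a minimal Python sketch of the retrieval step: it builds a $vectorSearch aggregation pipeline over chunked manual content and assembles the retrieved excerpts into an LLM prompt. The collection, index, and field names (manual_chunks_index, embedding, machine_id) are illustrative assumptions, not names from the solution described above.

```python
# Sketch of the repair-plan retrieval step: vector-search chunked machine
# manuals in MongoDB Atlas, then assemble the hits into an LLM prompt.
# Index, path, and filter field names are illustrative.

def build_manual_search_pipeline(query_vector, machine_id, limit=5):
    """Return an aggregation pipeline retrieving manual chunks for one machine."""
    return [
        {
            "$vectorSearch": {
                "index": "manual_chunks_index",       # assumed index name
                "path": "embedding",                  # field holding the chunk vector
                "queryVector": query_vector,
                "numCandidates": limit * 20,          # ANN oversampling factor
                "limit": limit,
                "filter": {"machine_id": machine_id}, # pre-filter to one machine
            }
        },
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]

def build_work_order_prompt(chunks, repair_instructions):
    """Combine retrieved manual excerpts with structured repair data for the LLM."""
    context = "\n---\n".join(c["text"] for c in chunks)
    return (
        "Generate a maintenance work order.\n"
        f"Repair instructions: {repair_instructions}\n"
        f"Relevant manual excerpts:\n{context}"
    )
```

With pymongo, the pipeline could be run as db.manual_chunks.aggregate(build_manual_search_pipeline(vec, "M-42")), and the resulting prompt passed to the LLM of your choice.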
This information is then combined with the repair plan using a large language model. The updated plan is pushed to the technician’s mobile app via Atlas Device Sync. The system generates step-by-step instructions by analyzing work orders and machine manuals, ensuring comprehensive guidance without manually sifting through extensive documents.

Figure 1: Achieving end-to-end predictive maintenance with MongoDB Atlas Developer Data Platform

In the quest for operational excellence, predictive maintenance powered by generative AI and MongoDB Atlas stands out as a game-changer. This innovative approach not only enhances the reliability and efficiency of industrial operations but also sets the stage for a future where AI-driven insights and actions become the norm. By leveraging the advanced capabilities of MongoDB Atlas, manufacturers can unlock new levels of performance and productivity, heralding a new era of smart manufacturing and connected systems.

If you would like to learn more about generative AI-powered predictive maintenance, visit the following resources:

[Video] How to Build a Generative AI-Powered Predictive Maintenance Software
[Whitepaper] Generative AI in Predictive Maintenance Applications
[Whitepaper] Critical AI Use Cases in Manufacturing and Motion: Realizing AI-powered innovation with MongoDB Atlas

June 27, 2024
Artificial Intelligence

Unlock PDF Search in Insurance with MongoDB & SuperDuperDB

As industries go, the insurance industry is particularly document-driven. Insurance professionals, including claim adjusters and underwriters, spend considerable time handling documentation, with a significant portion of their workday consumed by paperwork and administrative tasks. This makes solutions that speed up the process of reviewing documents all the more important. Retrieval-augmented generation (RAG) applications are a game-changer for insurance companies, enabling them to harness the power of unstructured data while promoting accessibility and flexibility. This is especially true for PDFs, which despite their prevalence are difficult to search, leading claim adjusters and underwriters to spend hours reviewing contracts, claims, and guidelines in this common format. By combining MongoDB and SuperDuperDB, you can build a RAG-powered system for PDF search, bringing efficiency and accuracy to this cumbersome task. With a PDF search application, users can simply type a question in natural language and the app will sift through company data, provide an answer, summarize the content of the documents, and indicate the source of the information, including the page and paragraph where it was found. In this blog, we will dive into how this PDF search application can be built and what it looks like in practice.

Why should insurance companies care about PDF Search?

Insurance firms rely heavily on data processing. To make investment decisions or handle claims, they leverage vast amounts of data, mostly unstructured. As previously mentioned, underwriters and claim adjusters need to comb through numerous pages of guidelines, contracts, and reports, typically in PDF format. Manually finding and reviewing every piece of information is time-consuming and can easily lead to expensive mistakes, such as incorrect risk estimations. Quickly finding and accessing relevant content is key.
Combining Atlas Vector Search and LLMs to build RAG apps can directly impact the bottom line of an insurance company.

Behind the scenes: System architecture and flow

As mentioned, MongoDB and SuperDuperDB underpin our information retrieval system. Let’s break down the process of building it:

1. The user adds the PDFs that need to be searched.
2. A script scans them, creates the chunks, and vectorizes them (see Figure 1). The chunking step is carried out using a sliding window methodology, which ensures that potentially important transitional data between chunks is not lost, helping to preserve continuity of context.
3. Vectors and chunk metadata are stored in MongoDB, and an Atlas Vector Search index is created. The PDFs are now ready to be queried.
4. The user selects a customer and asks a question; the system returns an answer, indicates where it was found, and highlights the relevant section with a red frame (see Figure 3).

Figure 1: PDF chunking, embedding creation, and storage orchestrated with SuperDuperDB

Each customer has a guidelines PDF associated with their account based on their residency. When the user selects a customer and asks a question, the system runs a Vector Search query on that particular document, seamlessly filtering out the non-relevant ones. This is made possible by the pre-filtering field included in the search query. Atlas Vector Search also takes advantage of MongoDB’s new Search Nodes dedicated architecture, enabling better optimization for the right level of resourcing for specific workload needs. Search Nodes provide dedicated infrastructure for Atlas Search and Vector Search workloads, allowing you to optimize your compute resources and fully scale your search needs independent of the database. Search Nodes provide better performance at scale, delivering workload isolation, higher availability, and the ability to optimize resource usage.
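The sliding-window chunking step described above can be sketched in a few lines of Python. The window and overlap sizes below are illustrative; a production system would tune them and then embed each chunk before storing it in MongoDB.

```python
# Minimal sketch of sliding-window chunking: consecutive chunks overlap so
# that context spanning a chunk boundary is not lost.
# Window and overlap are measured in words here; sizes are illustrative.

def sliding_window_chunks(text, window=200, overlap=50):
    """Split text into overlapping word-based chunks."""
    words = text.split()
    step = window - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # final chunk reached the end of the text
    return chunks
```

Each chunk would then be passed through an embedding model, with the vector and its metadata (document, page, paragraph) stored in MongoDB for Atlas Vector Search.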
Figure 2: PDF querying flow, orchestrated with SuperDuperDB

SuperDuperDB

SuperDuperDB is an open-source Python framework for integrating AI models and workflows directly with and across major databases for more flexible and scalable custom enterprise AI solutions. It enables developers to build, deploy, and manage AI on their existing data infrastructure and data, while using their preferred tools, eliminating data migration and duplication. With SuperDuperDB, developers can:

Bring AI to their databases, eliminating data pipelines and data movement, and minimizing engineering effort, time to production, and computation resources.
Implement AI workflows with any open- or closed-source AI models and APIs, on any type of data, with any AI or Python framework, package, class, or function.
Safeguard their data by switching from APIs to hosting and fine-tuning their own models, on their own existing infrastructure, whether on-premises or in the cloud.
Easily switch between embedding models and LLMs, from API providers to self-hosted models on Hugging Face or elsewhere, just by changing a small configuration.

Build next-generation AI apps on your existing database

SuperDuperDB provides an array of sample use cases and notebooks that developers can use to get started, including vector search with MongoDB, embedding generation, multimodal search, retrieval-augmented generation (RAG), transfer learning, and many more. The demo showcased in this post is adapted from an app previously developed by SuperDuperDB.

Let's put it into practice

To show you how this could work in practice, let’s look at an underwriter handling a specific case. The underwriter is seeking to identify the risk control measures, as shown in Figure 3 below, but needs to look through documentation. Analyzing the guidelines PDF associated with a specific customer helps determine the loss in the event of an accident or the new premium in the case of a policy renewal.
The app assists by answering questions and displaying relevant sections of the document.

Figure 3: Screenshot of the UI of the application, showing the question asked, the LLM’s answer, and the reference document where the information is found

By integrating MongoDB and SuperDuperDB, you can create a RAG-powered system for efficient and accurate PDF search. This application allows users to type questions in natural language, enabling the app to search through company data, provide answers, summarize document content, and pinpoint the exact source of the information, including the specific page and paragraph.

If you would like to learn more about Vector Search powered apps and SuperDuperDB, visit the following resources:

PDF Search in Insurance GitHub repository
Search PDFs at Scale with MongoDB and Nomic
SuperDuperDB GitHub, which includes notebooks and examples

June 24, 2024
Applied

Atlas Vector Search Once Again Voted Most Loved Vector Database

The 2024 Retool State of AI report has just been released, and for the second year in a row, MongoDB Atlas Vector Search was named the most loved vector database. Atlas Vector Search received the highest net promoter score (NPS), a measure of how likely a user is to recommend a solution to their peers. This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文. The Retool State of AI report is a global annual survey of developers, tech leaders, and IT decision-makers that provides insights into the current and future state of AI, including vector databases, retrieval-augmented generation (RAG), AI adoption, and challenges innovating with AI. MongoDB Atlas Vector Search commanded the highest NPS in Retool’s inaugural 2023 report, and it was the second most widely used vector database within just five months of its release. This year, Atlas Vector Search came in a virtual tie for the most popular vector database, with 21.1% of the vote, just a hair behind pgvector (PostgreSQL), which received 21.3%. The survey also points to the increasing adoption of RAG as the preferred approach for generating more accurate answers with up-to-date and relevant context that large language models (LLMs) aren't trained on. Although LLMs are trained on huge corpora of data, not all of that data is up to date, nor does it reflect proprietary data. And in those areas where blind spots exist, LLMs are notorious for confidently providing inaccurate "hallucinations." Fine-tuning is one way to customize the data that LLMs are trained on, and 29.3% of Retool survey respondents leverage this approach. But among enterprises with more than 5,000 employees, one-third now leverage RAG for accessing time-sensitive data (such as stock market prices) and internal business intelligence, like customer and transaction histories. This is where MongoDB Atlas Vector Search truly shines.
Customers can easily utilize their stored data in MongoDB to augment and dramatically improve the performance of their generative AI applications, during both the training and evaluation phases. In the course of one year, vector database utilization among Retool survey respondents rose dramatically, from 20% in 2023 to an eye-popping 63.6% in 2024. Respondents reported that their primary evaluation criteria for choosing a vector database were performance benchmarks (40%), community feedback (39.3%), and proof-of-concept experiments (38%). One of the pain points the report clearly highlights is difficulty with the AI tech stack . More than 50% indicated they were either somewhat satisfied, not very satisfied, or not at all satisfied with their AI stack. Respondents also reported difficulty getting internal buy-in, which is often complicated by procurement efforts when a new solution needs to be onboarded. One way to reduce much of this friction is through an integrated suite of solutions that streamlines the tech stack and eliminates the need to onboard multiple unknown vendors. Vector search is a native feature of MongoDB's developer data platform, Atlas, so there's no need to bolt on a standalone solution. If you're already using MongoDB Atlas , creating AI-powered experiences involves little more than adding vector data into your existing data collections in Atlas. If you're a developer and want to start using Atlas Vector Search to start building generative AI-powered apps, we have several helpful resources: Learn how to build an AI research assistant agent that uses MongoDB as the memory provider, Fireworks AI for function calling, and LangChain for integrating and managing conversational components. Get an introduction to LangChain and MongoDB Vector Search and learn to create your own chatbot that can read lengthy documents and provide insightful answers to complex queries. 
Watch Sachin Smotra of Dataworkz as he delves into the intricacies of scaling RAG (retrieval-augmented generation) applications. Read our tutorial that shows you how to combine Google Gemini's advanced natural language processing with MongoDB, facilitated by Vertex AI Extensions to enhance the accessibility and usability of your database. Browse our Resources Hub for articles, analyst reports, case studies, white papers, and more. Want to find out more about recent AI trends and adoption? Read the full 2024 Retool State of AI report .

June 21, 2024
News

Exact Nearest Neighbor Vector Search for Precise Retrieval

With its ability to efficiently handle high-dimensional, unstructured data, vector search delivers relevant results even when users don’t know what they’re looking for, and it uses machine learning models to find similar results across any data type. Rapidly emerging as a key technology for modern applications, vector search empowers developers to build next-generation search and generative AI applications faster and more easily. MongoDB Atlas Vector Search goes beyond approximate nearest neighbor (ANN) methods with the introduction of exact nearest neighbor (ENN) vector search. This capability guarantees retrieval of the absolute closest vectors to your query, eliminating the accuracy limitations inherent in ANN. In sum, ENN vector search can help you unlock a new level of precision for your search and generative AI applications, improving benchmarking and moving to production faster.

When exact nearest neighbor (ENN) vector search benefits developers

While ANN shines in searching across large datasets, ENN vector search offers advantages in specific scenarios:

Small-scale vector data: For datasets under 10,000 vectors, the linear time complexity of ENN vector search makes it a viable option, especially considering the added development complexity of tuning ANN parameters.

Recall benchmarking of ANN queries: ANN queries are fast, particularly as the scale of your indexed vectors increases, but it may not be easy to know whether the documents retrieved by vector relevance correspond to the guaranteed closest vectors in your index. ENN can provide that exact result set for comparison with your approximate result set, using Jaccard similarity or other rank-aware recall metrics. This allows you to have much greater confidence that your ANN queries are accurate, since you can build quantitative benchmarks as your data evolves.

Multi-tenant architectures: Imagine a scenario with millions of vectors categorized by tenants.
You might search for the closest vectors within a specific tenant (identified by a tenant ID). In cases where the overall vector collection is large (in the millions) but the number of vectors per tenant is small (a few thousand), ANN's accuracy suffers when applying highly selective filters. ENN vector search thrives in this multi-tenant scenario, delivering precise results even with small result sets.

Example use cases

A small dataset allows for exhaustive search within a reasonable timeframe, making the exact nearest neighbor approach a viable option for finding the most similar data points and improving accuracy confidence in a number of use cases, such as:

Multi-tenant data service: You might be building a business providing an agentic service that understands your customers’ data and takes actions on their behalf. When retrieving relevant proprietary data for that agent, it is critical that the right metadata filter be applied and that ENN be executed to retrieve only the sets of documents corresponding to the appropriate data tenant IDs.

Proof of concept development: For instance, a new recommendation engine might have a limited library compared to established ones. Here, ENN vector search can be used to recommend products to a small set of early adopters. Since the data is limited, an exhaustive search becomes practical, ensuring the user gets the most relevant recommendations from the available options.

How ENN vector search works on MongoDB Atlas

The ENN vector search feature in Atlas integrates seamlessly with the existing $vectorSearch stage within your Atlas aggregation pipelines. Its key characteristics include:

Guaranteed accuracy: Unlike ANN, ENN always returns the closest vectors to your query, adhering to the specified limit.

Eventual consistency: Similar to approximate vector search, ENN vector search follows an eventual consistency model.
Simplified configuration: Unlike approximate vector search, where tuning numCandidates is crucial, ENN vector search only requires specifying the desired limit of returned vectors.

Scalable recall evaluation: Atlas allows querying a large number of indexed vectors, facilitating the calculation of comprehensive recall sets for effective evaluation.

Fast query execution: ENN vector search query execution can maintain sub-second latency for unfiltered queries up to 10,000 documents. It can also provide low-latency responses for highly selective filters that restrict a broad set of documents to 10,000 documents or fewer, ordered by vector relevance.

Build more with ENN vector search

ENN vector search can be a powerful tool when building a proof of concept for retrieval-augmented generation (RAG), semantic search, or recommendation systems powered by vector search. It simplifies the developer experience by minimizing overhead complexity and latency while giving you the flexibility to implement and benchmark precise retrieval. To explore more use cases and build applications faster, start experimenting with ENN vector search.
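As a sketch of how this comes together, the snippet below builds an ENN $vectorSearch stage (note exact set to true and no numCandidates) and computes a Jaccard-based recall score between an ANN result set and the exact result set. Index and field names are illustrative assumptions.

```python
# Sketch: an ENN $vectorSearch stage plus a Jaccard recall check against
# an ANN result set. Index and field names are illustrative.

def enn_search_stage(query_vector, limit=10):
    """Exact nearest neighbor: exact=True, no numCandidates tuning needed."""
    return {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "exact": True,   # exhaustive search over the indexed vectors
            "limit": limit,
        }
    }

def jaccard_recall(ann_ids, enn_ids):
    """Jaccard similarity between the ANN and exact (ENN) result sets."""
    a, e = set(ann_ids), set(enn_ids)
    if not a and not e:
        return 1.0
    return len(a & e) / len(a | e)
```

Running the ANN and ENN variants of the same query and tracking jaccard_recall over time gives the quantitative benchmark described above as your data evolves.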

June 20, 2024
Updates

Unified Namespace Implementation with MongoDB and MaestroHub

In the complex world of modern manufacturing, a crucial challenge has long persisted: how to seamlessly connect the physical realm of industrial control systems with the digital landscape of enterprise operations. The International Society of Automation's ISA-95 standard, often visualized as the automation pyramid, has emerged as a guiding light. As shown below, this five-level hierarchical model empowers manufacturers to bridge the gap between these worlds, unlocking a path toward smarter, more integrated operations. Figure 1: In the automation pyramid, data moves up or down one layer at a time, using point-to-point connections. Manufacturing organizations face a number of challenges when implementing smart manufacturing applications due to the sheer volume and variety of data generated. An average factory produces terabytes of data daily, including time series data from machines stored in process historians and accessed by supervisory control and data acquisition (or SCADA) systems. Additionally, manufacturing execution systems (MES), enterprise resource planning (ERP) systems, and other operations software generate vast amounts of structured and unstructured data. Globally, the manufacturing industry generates an estimated 1.9 petabytes of data annually . Manufacturing leaders are eager to leverage their data for AI and generative AI projects, but a Workday Global Survey reveals that only 4% of the survey’s respondents believe their data is fully accessible for such applications. Data silos are a significant hurdle, with data workers spending an average of 48% of their time on data search and preparation. A popular approach to making data accessible is consolidating it in a cloud data warehouse and then adding context. However, this can be costly and inefficient, as dumping data without context makes it difficult for AI developers to understand its meaning and origin, especially for operational technology time series data. 
Figure 2: Pushing uncontextualized data to a data warehouse and then adding context is expensive and inefficient.

All these issues underscore the need for a new approach—one that not only standardizes data across disparate shop floor systems, but also seamlessly weaves context into the fabric of this data. This is where the Unified Namespace (UNS) comes in.

Figure 3: Unified Namespace provides the right data and context to all the applications connected to it.

Unified Namespace is a centralized, real-time repository for all production data. It provides a single, comprehensive view of the business's current state. Using an event-driven architecture, applications publish real-time updates to a central message broker, which subscribers can consume asynchronously. This creates a flexible, decoupled ecosystem where applications can both produce and consume data as needed.

Figure 4: UNS enables all the enterprise systems to have one centralized location to get the data they need for what they want to accomplish.

MaestroHub and MongoDB: Solving the UNS challenge

The core idea behind Industry 4.0, initially introduced in 2011 at the Hannover Fair of Industrial Technologies, is to establish seamless connectivity and interoperability between the disparate systems used in manufacturing. The UNS aims to solve exactly this. Over the past five years, we have seen interest in UNS ramping up steadily, and now manufacturers are looking for practical ways to implement it. In particular, a question we’re frequently asked is where the UNS actually lives. To answer that question, we need to look at popular architecture patterns and the pros and cons of each. The most common pattern is implementing the UNS in an MQTT broker. An MQTT broker acts as an intermediary entity that receives messages published by clients, filters the messages by topic, and distributes them to subscribers. The reason most manufacturers choose MQTT is that it is an open architecture that is easy to implement.
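As a minimal illustration of the MQTT pattern, the sketch below builds an ISA-95-style topic path and a contextualized payload. The hierarchy levels and payload fields are assumptions for illustration; a real deployment would publish them through an MQTT client such as paho-mqtt.

```python
# Sketch of publishing into a UNS: an ISA-95-style topic hierarchy
# (Enterprise/Site/Area/Line/Cell) plus a payload that carries context
# (unit, timestamp) with the value, not the raw number alone.
import json

def uns_topic(enterprise, site, area, line, cell, metric):
    """Build a topic path following the ISA-95 semantic hierarchy."""
    return "/".join([enterprise, site, area, line, cell, metric])

def uns_payload(value, unit, timestamp):
    """Serialize a contextualized measurement as JSON."""
    return json.dumps({"value": value, "unit": unit, "timestamp": timestamp})
```

With a connected MQTT client this could be used as, e.g., client.publish(uns_topic("Acme", "Chicago", "Stamping", "Line1", "Cell3", "temperature"), uns_payload(72.5, "C", "2024-06-18T12:00:00Z")); subscribers then filter on the topic hierarchy to consume only the data they need.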
However, the challenge with using only an MQTT broker is that clients don't get historical data access, which is required to build analytical and AI applications. Another approach is to dump all the data into a data warehouse and then add context to it. This solves the problem of historical data access, but standardizing messages after they have landed in a cloud data warehouse is inefficient. A superior solution for comprehensive, real-time data access is combining a single source of truth (SSoT) Unified Namespace platform like MaestroHub with a flexible multi-cloud data platform like MongoDB Atlas. MaestroHub creates an SSoT for industrial data, resulting in up to an 80% reduction in integration effort for brownfield facilities.

Figure 5: MaestroHub SSoT creates a unified data integration layer, saving up to 50% of time in data contextualization (Source: MaestroHub).

MaestroHub provides the connectivity layer to all data sources on the factory floor, along with contextualization and data orchestration. This makes it easy to connect the data needed for the UNS, enrich it with more context, and then publish it to consumers using the protocol that works best for them. Under the hood, MaestroHub stores metadata of connections, instances, and flows, and uses MongoDB as the database to store all this data. MongoDB’s flexible data modeling patterns reduce the complexity of mapping and transforming data when it's shared across different clients in the UNS. Additionally, scalable data indexing overcomes performance concerns as the UNS grows over time.

Figure 6: MaestroHub and MongoDB together enable a real-time UNS plus long-term storage.

MongoDB: The foundation for intelligent industrial UNS

In the quest to build a Unified Namespace (UNS) for the modern industrial landscape, the choice of database becomes paramount. So why turn to MongoDB?
Scalability and high availability: MongoDB scales effortlessly, both vertically and horizontally (sharding), to handle the torrent of data from sensors, machines, and processes. Operational Technology (OT) systems generate vast amounts of data from these sources, and MongoDB ensures seamless management of that information.

Document data model: Its adaptable document model accommodates diverse data structures, ensuring a harmonious flow of information. A Unified Namespace must handle data from any factory source, accommodating structure variations. MongoDB's flexible schema design allows different data models to coexist in a single database, with schema extensibility at runtime. This flexibility facilitates the seamless integration of new data sources and types into the UNS.

Real-time data processing: MongoDB Change Streams and Atlas Device Sync empower third-party applications to access real-time data updates. This is essential for monitoring, alerting, and real-time analysis within a UNS, enabling prompt responses to critical events.

Gen AI application development with ease: Atlas Vector Search efficiently performs semantic searches on vector embeddings stored in MongoDB Atlas. This capability seamlessly integrates with large language models (LLMs) to provide relevant context in retrieval-augmented generation (RAG) systems. Given that the Unified Namespace functions as a single source of truth for industrial applications, connecting gen AI apps to retrieve context from the UNS database ensures accurate and reliable information retrieval for these applications.

With the foundational database established, let's explore MaestroHub, a platform designed to leverage the power of MongoDB in industrial settings.

The MaestroHub platform

MaestroHub is a provider of an SSoT for industrial data, specifically tailored for manufacturers.
It achieves this through:

Data connectors: MaestroHub connects to diverse data sources using 38 different industrial communication protocols, encompassing OT drivers, files, databases (SQL, NoSQL, time series), message brokers, web services, cloud systems, historians, and data warehouses. The bi-directional nature of 90% of these protocols ensures comprehensive data integration, leaving no data siloed.

Data contextualization based on ISA-95: Leveraging ISA-95 Part 2, MaestroHub employs a semantic hierarchy and a clear naming convention for easy navigation and understanding of data topics. The contextualization of the payload is not limited to the unit of measure and definitions; it also contains Enterprise/Site/Area/Line/Cell details, which are invaluable for analytics studies. Data contextualization is an important feature of a UNS platform.

Logic flows/rule engine: Adhering to the UNS principle "Do not make any assumptions on how the data will be consumed," data should flow flexibly from sources to brokers and from brokers to consumers in terms of rules, frequencies, and multiple endpoints. MaestroHub allows you to set rules such as Always, OnChange, OnTrue, and WhileTrue, where you can dynamically determine the conditions using events and inputs via JavaScript.

Insights created by MaestroHub: MaestroHub provides real-time diagnostics of data health by leveraging Prometheus, Elasticsearch, Fluentd, and Kibana. Network problems, changed endpoints, and changed data types are automatically diagnosed and reported as insights. Additionally, MaestroHub uses NATS for queue management and stream analytics, buffering data in the event of a network outage. This allows IT and OT teams to monitor, debug, and audit logs with full data lineage.

Conclusion

The ISA-95 automation pyramid presents significant challenges for the manufacturing industry, including a lack of flexibility, limited scalability, and difficulty integrating new technologies.
By adopting a Unified Namespace architecture with MaestroHub and MongoDB, manufacturers can overcome these challenges and achieve real-time visibility and control over their operations, leading to increased efficiency and improved business outcomes. Read more on how MongoDB enables Unified Namespace via its multi-cloud developer data platform. We are actively working with our clients on solving Unified Namespace challenges. Take a look at our Manufacturing and Industrial IoT page for more stories or contact us through the web form in the link.

June 18, 2024
Applied

Announcing MongoDB Server 8.0 Platform Support Improvements

Last month at MongoDB.local NYC 2024, we announced the preview of MongoDB 8.0, the next evolution of MongoDB’s modern database. With MongoDB 8.0, we’re focused on delivering the unparalleled performance, scalability, and operational resilience necessary to support the creation of next-generation applications. For that to be possible, users must be able to deploy MongoDB on industry-standard operating systems. As a result, we are updating our Server Platform Policy to ensure that customers have the best possible experience when using MongoDB. Starting in MongoDB 8.0, there will be two new changes:

When a new major version of MongoDB is released, we will only release it on operating system (OS) versions that are fully supported by the vendor for the duration of the MongoDB version’s life. In short, we will support an operating system if the operating system’s Extended Lifecycle Support (ELS) date is after the MongoDB Server’s End of Life (EOL) date.

We will release new MongoDB Server versions (both major and minor) on the minimum supported minor version of the OS (defined by the OS vendor). Once an OS minor version is no longer supported by the vendor, we will update future MongoDB Server versions to the next supported OS minor version.

As always, MongoDB reserves the right to discontinue support for platforms based on lack of user demand and/or technical difficulties (e.g., if a platform doesn’t support required libraries or compiler features).

Ensuring best-in-class security

MongoDB routinely updates our documentation to indicate which platforms a new version of the MongoDB Server will be available on with the general availability release of that new server version. To ensure that MongoDB customers can meet strong regulatory and security requirements, our software is developed, released, and distributed in accordance with industry security best practices.
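The support rule above reduces to a simple date comparison; the sketch below uses illustrative dates, not actual support timelines.

```python
# Sketch of the platform-support rule: an OS version is eligible for a new
# MongoDB major release only if the OS vendor's Extended Lifecycle Support
# (ELS) date falls after the MongoDB version's End of Life (EOL) date.
# All dates used here are illustrative.
from datetime import date

def os_supported(os_els_date: date, mongodb_eol_date: date) -> bool:
    """True when the OS outlives the MongoDB Server version."""
    return os_els_date > mongodb_eol_date
```

For example, an OS with ELS through mid-2032 would qualify for a MongoDB version reaching EOL in 2029, while an OS whose ELS ends in 2027 would not.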
Given the mission-critical nature of MongoDB’s business—providing a highly secure, performant data platform to tens of thousands of customers in over 100 countries—we strive to provide strong and consistent security assurances across all of our products. In addition, MongoDB partners also need guarantees about the security development lifecycle of our products so they can provide the best experience to their customers. By ensuring that our software runs only on platform versions that are receiving security patches, we aim to limit the vulnerabilities that might be introduced by customers running EOL operating systems.

The significance of this change

With every major server release, MongoDB determines the supported builds for that general availability (GA) release according to the planned vendor platform’s end of life date—meaning the MongoDB major release will not support the operating system if the operating system’s extended lifecycle support ends before the MongoDB EOL date. This also applies to server container images delivered to our customers. Furthermore, to guarantee security assurances for operating systems that have a minimum minor version, we will only build new versions of MongoDB Server software on a vendor-supported major/minor version of the operating system. Concretely, we will build new versions of MongoDB on a minimum minor version until it hits a maintenance event (defined on a per-vendor basis), and at that point future MongoDB server builds will be updated to the new supported minor version. Separately, when a vendor publishes a new major version of an operating system after a given version of MongoDB reaches GA, we will evaluate whether the latest MongoDB release will run on this new OS version, or we will wait for the next major MongoDB release before documenting formal platform support on our website.

Walkthrough: How it could work for you

Consider the RHEL 9 Planning Guide below and the hypothetical release cadence of MongoDB version X.0.
As long as version X.0 is released at least three years before the end of RHEL 9 support, which as noted by Red Hat is 2032, we will provide support on RHEL 9. This means that 2029 will be the last year in which MongoDB releases a server version on RHEL 9. Next, suppose version X.0 is released at the end of 2025. Following the Extended Update Support plan, we will build version X.0 on RHEL 9.6 until the start of 2026, when RHEL 9.8 becomes available. Future versions (MongoDB X.Y) will then be built on RHEL 9.8 until the required minimum version becomes 9.10 in 2027.

RHEL 9 planning guide

Building the future

Overall, these coming changes to the MongoDB Server Platform Policy underscore MongoDB’s commitment to helping developers innovate quickly and easily while providing an even more secure and performant data platform. Stay tuned for additional updates about MongoDB 8.0—which will dramatically increase query performance, improve resilience during periods of heavy load, make scalability easier and more cost-effective, and make time series collections faster and more efficient. For more information about the Server Platform Policy updates, please refer to our documentation.
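The eligibility rule behind the walkthrough above can be expressed as a simple check. This is an illustrative sketch, not official MongoDB tooling; the three-year window and the function name are assumptions inferred from the RHEL 9 example.

```python
# Illustrative sketch of the support rule from the RHEL 9 walkthrough:
# a MongoDB major release targets an OS only if the OS's extended
# lifecycle support ends at least three years after the release year.
# The three-year window is inferred from the example above, not a
# published constant.

SUPPORT_WINDOW_YEARS = 3

def os_eligible(mongodb_release_year: int, os_extended_eol_year: int) -> bool:
    """True if the OS still receives vendor security patches for the
    assumed support window after the MongoDB release year."""
    return mongodb_release_year + SUPPORT_WINDOW_YEARS <= os_extended_eol_year

# RHEL 9 extended support ends in 2032, so 2029 is the last eligible
# release year, matching the walkthrough.
print(os_eligible(2029, 2032))  # True
print(os_eligible(2030, 2032))  # False
```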

June 17, 2024
Updates

AI-Powered Media Personalization: MongoDB and Vector Search

In recent years, the media industry has grappled with a range of serious challenges, from adapting to digital platforms and on-demand consumption, to monetizing digital content, to competing with tech giants and new media upstarts. Economic pressures from declining revenue sources like advertising, trust issues due to misinformation, and the difficulty of navigating regulatory environments have added to the complexities facing the industry. Additionally, keeping pace with technological advancements, ensuring cybersecurity, engaging audiences with personalized and interactive content, and addressing globalization all require significant innovation and investment to maintain content quality and relevance.

In particular, a surge in digital content has saturated the media market, making it increasingly difficult to capture and retain audience attention. Furthermore, a decline in referral traffic—primarily from social media platforms and search engines—has put significant pressure on traditional media outlets. An industry survey of more than 300 digital leaders from more than 50 countries and territories shows that traffic to news sites from Facebook fell 48% in 2023, with traffic from X/Twitter declining by 27%. As a result, publishers are seeking ways to stabilize their user bases and to enhance engagement sustainably, with 77% looking to invest more in direct channels to deal with the loss of referrals.

Enter artificial intelligence: generative AI-powered personalization has become a critical tool for driving the future of media channels. The approach we discuss here offers a roadmap for publishers navigating the shifting dynamics of news consumption and user engagement. Indeed, publishers consider using AI for backend news automation (56%) the most important use of the technology. In this post, we’ll walk you through using MongoDB Atlas and Atlas Vector Search to transform how content is delivered to users.
The shift in news consumption

Today's audiences rarely rely on a single news source. Instead, they use multiple platforms to stay informed, a trend driven by the rise of social media, video-based news formats, and skepticism towards traditional media due to the prevalence (or fear) of "fake news." This diversification in news sources presents a dilemma for publishers, who have come to depend on traffic from social media platforms like Facebook and Twitter. However, both platforms have started to deprioritize news content in favor of posts from individual creators and non-news content, leading to a sharp decline in media referrals. The key to retaining audiences lies in making content personalized and engaging, and AI-powered personalization and recommendation systems are essential tools for achieving this.

Content suggestions and personalization

By drawing on user data, behavior analytics, and the multi-dimensional vectorization of media content, MongoDB Atlas and Atlas Vector Search can be applied to multiple AI use cases to revolutionize media channels and improve end-user experiences. Media organizations can suggest content that aligns more closely with individual preferences and past interactions. This not only enhances user engagement but also increases the likelihood of converting free users into paying subscribers. The essence of leveraging Atlas and Vector Search is understanding the user: by analyzing interactions and consumption patterns, the solution not only grasps what content resonates but also predicts what users are likely to engage with in the future. This insight allows for crafting a highly personalized content journey. The image below shows a reference architecture highlighting where MongoDB can be leveraged to achieve AI-powered personalization.
To achieve this, you can integrate several advanced capabilities:

Content suggestions and personalization: The solution can suggest content that aligns with individual preferences and past interactions, which enhances user engagement and increases the likelihood of converting free users into paying subscribers. By integrating MongoDB's vector search to perform k-nearest neighbor (k-NN) searches, you can streamline and optimize how content is matched. Vectors are embedded directly in MongoDB documents, which has several advantages:

- No complexities of a polyglot persistence architecture.
- No need to extract, transform, and load (ETL) data between different database systems, which simplifies the data architecture and reduces overhead.
- MongoDB’s built-in scalability and resilience can support vector search operations more reliably. Organizations can scale their operations vertically or horizontally, even choosing to scale search nodes independently from operational database nodes, flexibly adapting to the specific load scenario.

Content summarization and reformatting: In an age of information overload, this solution provides concise summaries and adapts content formats based on user preferences and device specifications. This tailored approach addresses the diverse consumption habits of users across different platforms.

Keyword extraction: Essential information is drawn from content through advanced keyword extraction, enabling users to grasp key news dimensions quickly and enhancing the searchability of content within the platform. Keywords are fundamental to how content is indexed and found in search engines, and they significantly influence the SEO (search engine optimization) performance of digital content. In traditional publishing workflows, selecting these keywords can be a highly manual and labor-intensive task, requiring content creators to meticulously identify and incorporate relevant keywords.
This process is not only time-consuming but also prone to human error, with significant keywords often overlooked or underutilized, which can diminish the content's visibility and engagement. With the help of the underlying LLM, the solution extracts keywords automatically and with high sophistication.

Automatic creation of insights and dossiers: The solution can automatically generate comprehensive insights and dossiers from multiple articles. This feature is particularly valuable for users interested in deep dives into specific topics or events, providing them with a rich, contextual experience. It leverages one or more large language models (LLMs) to generate natural-language output, enhancing the richness and accessibility of information derived from multiple source articles. The process is agnostic to the specific LLMs used, providing the flexibility to integrate with any leading language model that fits the publisher's requirements. Whether the publisher employs widely recognized models (like OpenAI's GPT series) or other emerging technologies, the solution seamlessly incorporates these tools to synthesize and summarize vast amounts of data. Here’s a deeper look at how this works:

- Integration with multiple sources: The system pulls content from a variety of articles and data sources, retrieved with MongoDB Atlas Vector Search. Found items are then compiled into dossiers, which provide users with a detailed and contextual exploration of topics, curated to offer a narrative or analytical perspective that adds value beyond the original content.
- Customizable output: Publishers can set parameters based on their audience’s preferences or specific project requirements, including the level of detail, the use of technical versus layman terms, and the inclusion of multimedia elements to complement the text.
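The k-NN matching described above uses Atlas Vector Search's $vectorSearch aggregation stage. Below is a minimal sketch of building such a query in Python; the collection, index, and field names are illustrative assumptions, and running it requires a live Atlas cluster with a vector search index on the embedding field.

```python
# Minimal sketch of a k-NN recommendation query with Atlas Vector Search.
# Index and field names ("article_embeddings", "embedding") are
# hypothetical; adapt them to your own collection and vector index.

def build_knn_pipeline(query_vector, limit=5, num_candidates=100):
    """Build an aggregation pipeline returning the articles whose
    embeddings are nearest to the user's interest vector."""
    return [
        {
            "$vectorSearch": {
                "index": "article_embeddings",    # hypothetical index name
                "path": "embedding",              # field holding the vector
                "queryVector": query_vector,      # user's interest embedding
                "numCandidates": num_candidates,  # ANN candidate pool size
                "limit": limit,                   # k nearest results
            }
        },
        {
            "$project": {
                "title": 1,
                "summary": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

# Against a live cluster this would run as, for example:
#   results = db.articles.aggregate(build_knn_pipeline(user_vector, limit=10))
```

Because the vectors live in the same documents as the article metadata, the recommendation is a single aggregation query with no second system to call out to, which is the polyglot-persistence advantage noted earlier.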
This feature significantly enhances user engagement by delivering highly personalized and context-rich content. It caters to users looking for quick summaries as well as those seeking in-depth analyses, thereby broadening the appeal of the platform and encouraging deeper interaction with the content. By using LLMs to automate these processes, publishers can maintain a high level of productivity and innovation in content creation, ensuring they remain at the cutting edge of media delivery.

Future directions

As media consumption habits continue to evolve, AI-powered personalization stands out as a vital tool for publishers. By using AI to deliver tailored content and to automate backend processes, publishers can address the decline in traditional referrals and build stronger, more direct relationships with their audiences. If you would like to learn more about AI-powered media personalization, visit the following resources:

- AI-Powered Personalization to Drive Next-Generation Media Channels
- AI-Powered Innovation in Telecommunications and Media
- GitHub repository: Create a local version of this solution by following the instructions in the repository
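To make the dossier workflow described earlier concrete, here is a runnable sketch of the synthesis step. A trivial concatenating summarizer stands in for the LLM call (which the solution deliberately leaves model-agnostic); the function name and document fields are illustrative.

```python
# Runnable sketch of the dossier-assembly step. A naive concatenating
# "summarizer" stands in for the model-agnostic LLM call; a real
# implementation would prompt the publisher's chosen LLM with the
# retrieved article texts instead.

def synthesize_dossier(articles, topic):
    """Compile articles retrieved via vector search into one dossier.
    Stand-in for the LLM synthesis step."""
    lines = [f"Dossier: {topic}"]
    for article in articles:
        lines.append(f"- {article['title']}: {article['summary']}")
    return "\n".join(lines)

# Articles shaped like results from a $vectorSearch stage (illustrative).
retrieved = [
    {"title": "Election results", "summary": "Turnout hit a record high."},
    {"title": "Market reaction", "summary": "Stocks rallied on the news."},
]

print(synthesize_dossier(retrieved, "Election week"))
```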

June 13, 2024
Artificial Intelligence

Helping MongoDB Customers Unlock Potential with Industry Solutions

Gabriela Preiss is a senior manager within Industry Solutions at MongoDB and was instrumental in building out the team in Barcelona. She’s now relocated to Austin, Texas to build a team in Mexico City and continue expanding the Industry Solutions footprint. In this article, Gabriela shares more about Industry Solutions at MongoDB and how the team is making a difference for both customers and our internal go-to-market teams.

The impact of industry solutions teams is not lost on the tech industry. I see our competitors and Big Tech alike becoming more verticalized, providing industry-specific solutions to meet their customers’ needs. In 2019, MongoDB established an Industry Solutions team to understand and address our customers’ industry-specific needs and challenges. In the past two years, our Industry Solutions team has grown by roughly 380% and touched over 1,100 customer accounts around the globe. Our tailored, industry-specific solutions and messaging give MongoDB a competitive advantage and lead to higher market penetration, higher customer retention, and increased sales. Not to mention, industry-specific insights gleaned from the field drive internal innovation and product development to propel MongoDB forward.

Industry Solutions at MongoDB

The Industry Solutions team positions MongoDB as a solution, or part of a solution, for specific industries. We speak the customer's language and understand their industry needs, roadblocks, market trends, and competitors. Our industry experts have been in the customers' shoes and know how to guide them through modernization. We're an extremely cross-functional team, constantly collaborating with sales, marketing, product, product marketing, and engineering. We work in parallel with sales, and everything we do is ultimately to support them and help drive revenue. This means helping with account prepping, speaking one-on-one with customers, and a lot of sales enablement.
Our team holds frequent sales trainings in the form of internal content, office hours, and weekly sessions to coach on industry knowledge. We also create content to help drive MongoDB’s go-to-market messaging externally. This means highlighting MongoDB as an industry solution through blogs, white papers, video content, and, most frequently, interactive solution demos that let customers really get their hands on our products.

Sharing industry knowledge at MongoDB.local events

Skills for success

While you don’t necessarily need to be a MongoDB expert to join our Industry Solutions team (we’ll train you on that), it’s beneficial to have foundational technical skills and knowledge, like understanding what a database is and how it works. As long as you have the will to learn, we’ll shape you into a MongoDB subject matter expert. I often look for people who have a technical knowledge base but are also interested in the business solution space. Our team is open to anyone who wants to be part of industry solutions and is willing to learn all of its ins and outs.

In terms of specific skills, I personally think soft skills are the most essential for success: a willingness to learn, boundless curiosity, a sense of urgency, and a true passion for your work. That said, strong project management and time management skills come in handy, as our team covers many different industries and regions. Each industry has its own nuances, and each team member is involved in many different projects. The ability to own your projects end-to-end and juggle multiple projects at any given time is crucial.

Working with AI

AI has become a hot topic, and it's not going away. As a team, we work with industries from financial services to manufacturing, automotive to airline, insurance to telecommunications, and everything in between. AI is affecting every industry, and it's impacting every region.
It’s our job to keep our finger on the pulse of industry trends to better enable our sales team and to have modernization discussions with customers. As customers look to implement gen AI applications, it’s important for our team to be able to confidently answer their questions and create relevant content. For the Industry Solutions team, working with gen AI is all about educating ourselves and keeping up with industry trends to create a competitive advantage for MongoDB.

Collaboration never ends!

Opportunities for learning and growth

There are so many, I wouldn’t even know where to begin or end. You’ll certainly become a subject matter expert in MongoDB. You’ll have the chance to speak with different customers, present at MongoDB and third-party events, hold internal sales enablement trainings, hone your content creation skills, and more. You’ll learn some teaching skills if you join the Industry Solutions team, too: a big part of the role is explaining technical concepts to different audiences and sharing information in a way that enables people to learn. At the end of the day, you’ll be building your brand as an industry expert, building your skills, and building your resume.

In my personal experience, I started as a consultant on the team and am now a senior manager. I was able to build out our team in Barcelona, Spain, and I’ve recently relocated to Austin, Texas to build out a team in Mexico City, which I’m so excited about. It’s been amazing to see the team grow. In terms of career advancement, there are tons of opportunities for our team members to grow as individual contributors or into team lead and people manager roles. Plus, because you’re getting exposure to so many different teams, it’s not uncommon for folks from Industry Solutions to transfer into other departments within MongoDB, or vice versa.
It’s something that’s not taboo within the company; it’s just a matter of having a conversation with your manager.

Rooftop views from MongoDB Barcelona

Building a team in Mexico City and beyond

It’s a really exciting time as we aim to replicate in Mexico City the team we’ve created in Barcelona. For prospective candidates, this is a great opportunity to help us build and shape a team from scratch while working on a diverse set of projects in key regions of the business. I would love to see the Mexico City office grow the way our Barcelona office has: a few years ago it was only 12 people, and now it houses hundreds. We’ll continue to hire for our team in Barcelona and other locations around the world, too. I’m looking forward to bringing on people who will add unique perspectives to our team and grow their careers at MongoDB for years to come.

Join our talent community to stay up to date with MongoDB culture content and career opportunities.

June 12, 2024
Culture
