Using Generative AI and MongoDB to Tackle Cybersecurity’s Biggest Challenges

Mat Keep and Lena Smart
March 13, 2024 | Updated: October 6, 2025

In the ever-evolving landscape of cybersecurity, organizations face a multitude of challenges that demand innovative solutions harnessing cutting-edge technologies.

One of the most pressing issues is the increasing sophistication of cyber threats, including malware, ransomware, and phishing attacks, which are becoming more difficult to detect and mitigate. Additionally, the rapid expansion of digital infrastructures has widened the attack surface, making it harder for security teams to monitor and protect every ingress and egress point. Another significant challenge is the shortage of skilled cybersecurity professionals, estimated by independent surveys at around 4 million staff worldwide¹, which leaves many organizations vulnerable to attack.

These challenges underscore the need for advanced technologies that can augment human efforts to secure digital assets and data.

How can generative AI help?

Generative AI (gen AI) has emerged as a powerful tool in addressing these cybersecurity challenges. By leveraging large language models (LLMs) to generate new data or patterns based on existing datasets, generative AI can provide innovative solutions in several key areas:

Enhanced threat detection and response

Generative AI can be used to create simulations of cyber threats, including sophisticated malware and phishing attacks. These simulations can help in training machine learning models to detect new and evolving threats more accurately.

Furthermore, gen AI can aid in the development of automated response systems that react to threats in real time. While this will never eliminate the need for human oversight, it will reduce the need for manual intervention and toil, allowing for quicker mitigation of attacks. For example, with the appropriate oversight it can automatically apply patches to vulnerable systems or adjust firewall rules to block attack vectors. This automated rapid response capability is particularly valuable in mitigating zero-day vulnerabilities, where the window between the discovery of a vulnerability and its exploitation by attackers can be very short.

Actionable learnings from security event postmortems

In the aftermath of a cybersecurity incident, conducting a thorough postmortem analysis is crucial for understanding what happened, why it happened, and how similar events can be prevented in the future.

Generative AI can play a pivotal role in this process by synthesizing and summarizing complex data from a multitude of sources, including logs, network traffic, and security alerts. By analyzing this data, gen AI can identify patterns and anomalies that may have contributed to the security breach, offering insights that might be overlooked by human analysts due to the sheer volume and complexity of the information.

Furthermore, it can generate comprehensive reports that highlight key findings, causative factors, and potential vulnerabilities, streamlining the postmortem process. This capability not only accelerates the recovery and learning process but also enables organizations to implement more effective remediation strategies, ultimately strengthening their cybersecurity posture.

Generating synthetic data for deep model training

The shortage of real-world data for training cybersecurity systems is a significant hurdle. Gen AI can create realistic, synthetic datasets that mirror genuine network traffic and user behavior without exposing sensitive information.

This synthetic data can be used to train detection systems, improving their accuracy and effectiveness without compromising privacy or security.
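As a rough illustration of the idea, the sketch below fabricates labeled network-flow records using only Python's standard library. It is a statistical stand-in, not a generative model: in practice, a gen AI model trained on real traffic would replace the random sampling. The field names, the port list, and the 5% anomaly rate are all assumptions chosen for illustration.

```python
import random

# Hypothetical set of "normal" destination ports; a real generator
# would learn such distributions from observed traffic.
COMMON_PORTS = [22, 53, 80, 443, 3389]


def synthetic_flows(n, anomaly_rate=0.05, seed=0):
    """Return n fake flow records; roughly anomaly_rate of them are labeled anomalous."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    flows = []
    for _ in range(n):
        anomalous = rng.random() < anomaly_rate
        flows.append({
            # 10.0.0.0/8 is private address space, so no real host is exposed
            "src_ip": f"10.0.{rng.randint(0, 255)}.{rng.randint(1, 254)}",
            # anomalous flows use unusual high ports and large transfers
            "dst_port": rng.randint(1024, 65535) if anomalous else rng.choice(COMMON_PORTS),
            "bytes_sent": rng.randint(10**6, 10**8) if anomalous else rng.randint(200, 50_000),
            "label": "anomalous" if anomalous else "benign",
        })
    return flows


flows = synthetic_flows(1000)
print(sum(f["label"] == "anomalous" for f in flows), "anomalous of", len(flows))
```

Because the records carry labels, they could feed a supervised detector directly; because they are invented, no customer data leaves the production environment.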

Automating phishing detection

Phishing remains one of the most common attack vectors. Gen AI can analyze patterns in phishing emails and websites, generating models that predict and detect phishing attempts with high accuracy.

By integrating these models into email systems and web browsers, organizations can automatically filter out phishing content, protecting users from potential threats.

Putting it all together: The opportunities and the risks

Generative AI holds the promise of transforming cybersecurity practices by automating complex processes, enhancing threat detection and response, and providing a deeper understanding of cyber threats. As the industry continues to integrate gen AI into cybersecurity strategies, it's crucial to remain vigilant about the ethical use of this technology and the potential for misuse.

Nevertheless, the benefits it offers in strengthening digital defenses are undeniable, making it an invaluable asset in the ongoing battle against cyber threats.

How does MongoDB help?

With MongoDB, your development teams can build and deploy robust, correct, and differentiated real-time cyber defenses faster, and at any scale.

To understand how MongoDB does this, consider that the AI technology stack comprises three layers:

The underlying compute (GPUs) and LLMs
The tooling to fine-tune models along with the tooling for in-context learning and inference against the trained models
The AI applications and related end-user experiences

MongoDB operates at the second layer of the stack. It enables customers to bring their own proprietary data to any LLM running on any computing infrastructure to build gen AI-powered cybersecurity applications.

MongoDB does this by addressing the hardest problems when adopting gen AI for cybersecurity. MongoDB Atlas securely unifies operational data, unstructured data, and vector data in a single, fully managed multi-cloud platform, avoiding the need to copy and sync data between different systems. MongoDB’s document-based architecture also allows development teams to easily model relationships between application data and vector embeddings. This allows deeper and faster analytics and insights against security-related data.
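To make the idea of keeping application data and vector embeddings together more concrete, here is a minimal sketch of what such a document might look like, expressed as a plain Python dictionary. The field names and the toy three-dimensional vector are assumptions for illustration; real embeddings typically have hundreds or thousands of dimensions and come from an embedding model. The cosine similarity helper shows the kind of comparison a vector search performs.

```python
import math

# Hypothetical security-alert document: operational fields and the
# embedding of its description live side by side in one record.
alert = {
    "_id": "alert-001",
    "severity": "high",
    "host": "web-01",
    "description": "Multiple failed SSH logins followed by a success",
    "embedding": [0.12, 0.87, 0.45],  # toy 3-dimensional vector
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query_vector = [0.10, 0.90, 0.40]  # would come from an embedding model
score = cosine_similarity(alert["embedding"], query_vector)
print(alert["severity"], alert["host"], round(score, 3))
```

Since the severity, host, and embedding sit in the same record, a semantic match can be filtered or enriched with operational context without joining across systems.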

MongoDB’s open architecture is integrated with a rich ecosystem of AI developer frameworks, LLMs, and embedding providers. This, combined with our industry-leading multi-cloud capabilities, allows your development teams the flexibility to move quickly and avoid lock-in to any particular cloud provider or AI technology in this rapidly evolving space.

Check out our AI learning hub to learn more about building AI-powered apps with MongoDB.

Applying gen AI and MongoDB to real-world cybersecurity applications

Threat intelligence

ExTrac utilizes AI-powered analytics and MongoDB Atlas to predict public safety risks by analyzing data from thousands of sources. The platform initially helped Western governments foresee conflicts but is expanding to enterprises for reputational management and more.

MongoDB's document data model allows ExTrac to manage complex data efficiently, enhancing real-time threat identification. Atlas Vector Search aids in augmenting language models and managing vector embeddings for texts, images, and videos, speeding up feature development. This approach enables ExTrac to efficiently model trends, track evolving narratives, and predict risk for its customers, leveraging the flexibility and power of MongoDB to handle data of any shape and structure. Learn more in our ExTrac case study.
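For readers curious what a vector query looks like, the sketch below builds an aggregation pipeline that uses the `$vectorSearch` stage. It is not ExTrac's implementation: the index name, field names, and the candidate multiplier are assumptions for illustration, and the code only constructs the pipeline rather than connecting to a cluster.

```python
def build_vector_search_pipeline(query_vector, limit=5, severity=None):
    """Build an aggregation pipeline with a $vectorSearch stage."""
    stage = {
        "index": "vector_index",      # hypothetical index name
        "path": "embedding",          # hypothetical field holding vectors
        "queryVector": query_vector,
        "numCandidates": limit * 20,  # over-fetch candidates before picking the top results
        "limit": limit,
    }
    if severity is not None:
        # Pre-filter; the field must be declared as a filter field in the index.
        stage["filter"] = {"severity": {"$eq": severity}}
    return [
        {"$vectorSearch": stage},
        # Keep a few fields and expose the similarity score.
        {"$project": {"description": 1, "severity": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search_pipeline([0.1, 0.9, 0.4], limit=3, severity="high")
print(pipeline)
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on a collection that has a matching vector search index.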

CyberSec assessments

VISO TRUST leverages AI to streamline the assessment of third-party cyber risks, making complex vendor security information quickly accessible for informed decision-making.

Utilizing Amazon Bedrock and MongoDB Atlas, VISO TRUST's platform automates the due diligence of vendor security, significantly reducing the workload for security teams. Its AI-powered approach involves artifact intelligence that classifies security documents, detects organizations, and predicts security control locations within artifacts. MongoDB Atlas hosts text embeddings for a dense retrieval system that enhances the accuracy of LLMs through retrieval-augmented generation (RAG), providing instant, actionable security insights. This innovative use of technology enables VISO TRUST to offer rapid, scalable cyber risk assessments, boasting significant reductions in work and time for enterprises like Instacart and Upwork.

MongoDB's flexible document database and Atlas Vector Search play critical roles in managing and querying vast amounts of data, supporting VISO TRUST's mission to deliver comprehensive cyber risk intelligence. Learn more in our VISO TRUST case study.
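The retrieval-augmented generation pattern mentioned above can be sketched in a few lines. This is a generic illustration, not VISO TRUST's system: the passages, their two-dimensional embeddings, and the prompt wording are all made up, and an in-memory list stands in for the vector store. The point is the two steps: retrieve the most similar passages, then place them in the prompt so the LLM answers from them.

```python
import math

# Toy corpus: passages from hypothetical vendor security documents,
# each with a made-up 2-dimensional embedding.
PASSAGES = [
    {"text": "Data at rest is encrypted with AES-256.", "embedding": [0.9, 0.1]},
    {"text": "Employees complete security training annually.", "embedding": [0.1, 0.9]},
    {"text": "Backups are encrypted and stored offsite.", "embedding": [0.8, 0.3]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def retrieve(query_embedding, k=2):
    """Return the k passages most similar to the query embedding."""
    ranked = sorted(PASSAGES,
                    key=lambda p: cosine(p["embedding"], query_embedding),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Ground the LLM by placing retrieved passages ahead of the question."""
    context = "\n".join(f"- {p['text']}" for p in passages)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")


question = "Does the vendor encrypt stored data?"
top = retrieve([1.0, 0.2])  # stand-in for the embedded question
prompt = build_prompt(question, top)
print(prompt)
```

The resulting prompt would then be sent to whichever LLM the application uses; constraining the model to the retrieved context is what helps reduce unsupported answers.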

Steps to get started

Generative AI powered by LLMs augmented with your own operational data encoded as vector embeddings is opening up many new possibilities in cybersecurity. If you want to learn more about the technology and its possibilities, take a look at our Atlas Vector Search learning byte. In just 10 minutes you’ll get an overview of different use cases and how to get started.

Head over to our quick-start guide to get started with Atlas Vector Search today.

¹ Hill, M. (2023, April 10). Cybersecurity workforce shortage reaches 4 million despite significant recruitment drive. CSO.
