GenAI

How the NFSA is Using MongoDB Atlas and AI to Make Aussie Culture Accessible

Where can you find everything from facts about Kylie Minogue, to more than 6,000 Australian home movies, to a 60s pop group playing a song with a drum-playing kangaroo? The NFSA! Founded in 1935, the National Film and Sound Archive of Australia (NFSA) is one of the oldest archives of its kind in the world. It is tasked with collecting, preserving, and sharing Australia’s audiovisual culture. According to its website, the NFSA “represents not only [Australia’s] technical and artistic achievements, but also our stories, obsessions and myths; our triumphs and sorrows; who we were, are, and want to be.” The NFSA’s collection includes petabytes of audiovisual data (including broadcast-quality news footage, TV shows, and movies, high-resolution photographs, radio shows, and video games) plus millions of physical and contextual items like costumes, scripts, props, photographs, and promotional materials, all tucked away in a warehouse. “Today, we have eight petabytes of data, and our data is growing from one to two petabytes each year,” said Shahab Qamar, software engineering manager at NFSA. Making this wealth of data easily accessible to users across Australia (not to mention all over the world) has led to a number of challenges, which is where MongoDB Atlas (which helps developers simplify and accelerate building with data) comes in.

Don’t change (but apply a few updates)

Because of its broad appeal, the NFSA's collection website alone receives an average of 100,000 visitors each month. When Qamar joined the NFSA in 2020, he saw an opportunity to improve the organization’s web platform. His aim was to ensure the best possible experience for the site’s high number of daily visitors, which had begun to plateau. This included a website refresh, as well as addressing technical issues related to handling site traffic, due to the site being hosted on on-premises servers. The site also wasn’t “optimized for Google Analytics,” said Qamar.
In fact, the NFSA website was invisible to Google and other search engines, so he knew it was time for a significant update, which also presented an opportunity to set up strong data foundations to build deeper capabilities down the line. But first, Qamar and team needed to find a setup that could serve the needs of the NFSA and Australia’s 26 million residents more robustly than their previous solution. Specifically, Qamar said, the NFSA was looking for a fully managed database that could also implement search at scale, as well as a system that his small team of five could easily manage. It also needed to ensure high levels of resiliency and the ability to work with more than one cloud provider. The previous NFSA site also didn’t support content delivery networks, he added. MongoDB Atlas supported all of the use cases the NFSA was looking for, Qamar said, including the ability to support multi-cloud hosting. And because Atlas is fully managed, it would readily meet the NFSA's requirements. In July 2023, after months of development, the new and greatly improved NFSA website was launched. The redesign was immediately impactful: Since the NFSA’s redesigned site was launched, the number of users visiting the collection search website has gone up 200%, and content requests, which the NFSA access team responds to on a case-by-case basis, have gone up 16%.

(Getting search) back in black

While the previous version of the NFSA site included search, the prior functionality was prone to crashing, and the quality of the results was often poor, Qamar said. For example, search results were delivered alphabetically rather than based on relevance, and the previous search didn’t support fine-tuning of relevance based on matches in specific fields. So, as part of its site redesign, the NFSA was looking to add full text search, relevance-based search results, faceting, and pagination.
MongoDB Atlas Search, which integrates the database, search engine, and sync mechanism into a single, unified, fully managed platform, ticked all of those boxes.

Figure: A search results page on the NFSA website

Indeed, the NFSA compared search results from its old site to its new MongoDB Atlas site and “found that MongoDB Atlas-based searches were more relevant and targeted,” Qamar said. Previously, configuring site search required manual coding and meant downtime for the site, he noted. “The whole setup wasn’t very developer friendly and, therefore, a barrier to working efficiently with search configuration and fine-tuning,” Qamar said. In comparison, MongoDB Atlas allowed for simple configuration and fine-tuning of the NFSA's search requirements. The NFSA has also been using MongoDB Atlas Charts. Charts help the NFSA easily visualize its collection by custom grouping (like production year or genre), as well as helping the NFSA see which items are most popular with users. “Charts have helped us understand how our collection is growing and evolving over time,” Qamar said.

Figure: NFSA’s use of MongoDB Charts

Can’t get you (AI) out of my head

Now, the NFSA, inspired by Qamar’s own training in machine learning and the broad interest in all things AI, is exploring how it can use Atlas Vector Search and generative AI tools to allow users to explore content buried in the NFSA collection. One example cited is putting transcriptions of audiovisual files in NFSA’s collection into a vector database for retrieval-augmented generation (RAG). The NFSA has approximately 27 years’ worth of material to transcribe (meaning it would take 27 years to play it all back), and is currently developing a model to accurately capture the Australian dialect so the work is transcribed correctly. Ultimately, the NFSA is interested in building a RAG-powered AI bot to provide historically and contextually accurate information about work in the NFSA’s archive.
The NFSA is also exploring how it can use RAG to deliver accurate, conversation-like search results without training large language models itself, and whether it can leverage AI to help restore some of the older videos in its collection. Qamar and team are also interested in vectorizing audio-visual material for semantic analysis and genre-based classification of collection material at scale, he said. “Historically, we’ve been very metadata-driven and keyword-driven, and I think that’s a missed opportunity. Because when we talk about what an archive does, we archive stories,” Qamar said of the possibilities offered by vectors. “An example I use is, what if the world ended tomorrow? And what if aliens came to Earth and only saw our metadata, what image of Australia would they see? Is that a true image of what Australia is really like?” Qamar said. “How content is described is important, but content’s imagery, the people in it, and the audio and words being spoken are really important. Full-text search can take you somewhere along the way, but vector search allows you to look things up in a semantic manner. So it’s more about ideas and concepts than very specific keywords,” he said.

If you’re interested in learning how MongoDB helps accelerate and simplify time-to-mission for federal, state, and local governments, defense agencies, education, and across the public sector, check out MongoDB for Public Sector. Check out MongoDB Atlas Vector Search to learn more about how Vector Search helps organizations like the NFSA build applications powered by semantic search and gen AI.

*Note that this story’s subheads come from Australian song titles!

May 14, 2024

Search PDFs at Scale with MongoDB and Nomic

Data is only valuable if it’s accessible. For example, storing photos, audio files, or PDFs without the ability to extract information from them is like keeping junk in your basement, thinking you might need it someday. The problem is that, when the day comes, you have to dig through your junk to find what you need. Until now, companies have followed a similar approach to unstructured data: store everything in data lakes for future use. But whether it’s junk in a basement or data in a data lake, the result is the same: accessibility is hard or impossible. However, the latest advancements in AI have disrupted this status quo. AI can effectively and efficiently compare similar objects by generating a vector representation, or embedding, of a data object. This capability has revolutionized industries by enabling faster and more precise search, categorization, and recommendation systems than ever before. Whether it's being used to compare text, documents, images, or complex patterns in data, embeddings allow for nuanced interpretations and connections that were impossible with traditional methods. By taking advantage of AI, users can uncover insights and make decisions with unprecedented speed and accuracy. A particularly interesting use case is PDF search, since every company in the world deals with PDFs in one way or another. While PDFs allow portability across platforms and operating systems, most PDF readers only allow for basic exact-match queries.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

PDF search powered by MongoDB and Nomic

Enter MongoDB and Nomic: MongoDB Atlas Vector Search with Nomic Embed equips organizations with a powerful and affordable AI-powered search solution for large PDF collections. Nomic is a machine learning company specializing in explainable and accessible AI, and Nomic Embed is its flagship text embedding model, with out-of-the-box features suitable for scalable PDF search.
Its features include:

Long context: Nomic Embed breaks new ground by supporting a long context length of 8192 tokens, exceeding the standard 2048. This extended context makes the model ideal for real-world applications that involve processing large PDFs and documents.

High throughput: While achieving top performance on the MTEB embedding benchmark, Nomic Embed is smaller than similarly performing models. At only 137 million parameters and 548MB, Nomic Embed enables high-throughput embedding generation for data-heavy workflows or streaming applications.

Flexible storage: Nomic Embed provides adjustable embedding size via Matryoshka representation learning. Users can freely choose to store the first 64, 128, 256, or 512 embedding dimensions out of the full 768, depending on their project requirements. Smaller embedding sizes incur minimal performance loss while lowering storage costs and speeding up computation.

To put Nomic Embed’s abilities in context, consider a company that processes a high volume of PDFs (say, 100,000 documents per month) with an average length of 20 pages each. To improve database retrieval speed, these documents can be partitioned into smaller chunks, such as 2 pages per chunk (see Figure 1 below). Assuming a full page typically contains around 500 words, each document chunk would consist of approximately 1,000 words.

Figure 1: PDF chunking, embedding creation with Nomic, and storage into MongoDB

Embedding models process words as numerical tokens, where a general rule of thumb is that 1 token corresponds to about 3/4 of a word. One embedding is more than sufficient to represent a document chunk in this case, as 4/3 × 1,000 ≈ 1,333 tokens fit comfortably in Nomic Embed’s 8192-token context window. A PDF search application for this company would require 100,000 PDFs × 10 chunks = 1,000,000 embeddings. Benchmarked on Nomic’s AWS SageMaker real-time inference offering on a single GPU ml.g5.xlarge instance, the total runtime is under 4 hours for a total of $15.60 per month.
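The sizing arithmetic above, together with the Matryoshka-style truncation mentioned under "Flexible storage", can be sketched in a few lines of Python. This is a minimal illustration under the article's stated assumptions (500 words per page, 4/3 tokens per word, 8192-token context). The function names are hypothetical, and re-normalizing a truncated vector is an assumption on our part (a common step when cosine or dot-product similarity is used), not something the article specifies.

```python
import math

WORDS_PER_PAGE = 500      # the article's assumption for a full page
TOKENS_PER_WORD = 4 / 3   # rule of thumb: 1 token is about 3/4 of a word
CONTEXT_WINDOW = 8192     # Nomic Embed context length cited in the article


def estimate_workload(num_pdfs, pages_per_pdf, pages_per_chunk):
    """Estimate chunk counts and token sizes for a monthly PDF workload."""
    chunks_per_pdf = math.ceil(pages_per_pdf / pages_per_chunk)
    tokens_per_chunk = math.ceil(pages_per_chunk * WORDS_PER_PAGE * TOKENS_PER_WORD)
    return {
        "chunks_per_pdf": chunks_per_pdf,
        "total_embeddings": num_pdfs * chunks_per_pdf,
        "tokens_per_chunk": tokens_per_chunk,
        "fits_in_context": tokens_per_chunk <= CONTEXT_WINDOW,
    }


def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale to unit length.

    Hypothetical helper: the rescaling step is our assumption, not a
    documented requirement of the model.
    """
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


if __name__ == "__main__":
    # 100,000 PDFs of 20 pages each, split into 2-page chunks
    print(estimate_workload(100_000, 20, 2))
```

With the article's figures, the estimate gives 10 chunks per PDF and 1,000,000 embeddings per month, with each chunk well inside the context window.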
A similarly performing embedding model, such as OpenAI’s text-embedding-3-small, costs $26.66 per month to generate the same number of embeddings. Once the embeddings are stored in MongoDB Atlas, it’s possible to create an Atlas Vector Search index to unlock their potential. Building a PDF search application at this point becomes straightforward. The query text is vectorized, and the embedding is fed to Atlas Vector Search to retrieve similar vectors. The result is a list of the most semantically similar sections of the PDF relevant to the original text. This is a significant leap forward compared to a simple “ctrl-f” search, as it captures meaning rather than just keyword matches. This process can be further improved by implementing a retrieval-augmented generation (RAG) pipeline, combining Atlas Vector Search and a large language model (LLM). As shown in Figure 2, this approach allows users to ask questions in natural language about the content of the PDF. The relevant documents are then fed to the LLM as context, and the AI is able to provide structured answers by leveraging knowledge about the data.

Figure 2: Retrieval Augmented Generation flow with Nomic

In a nutshell, Nomic and MongoDB provide the building blocks for advanced RAG applications, equipping developers with a cost-effective and integrated toolset.

Seamless integration, supercharged search: Nomic Embeddings in MongoDB Atlas

MongoDB Atlas seamlessly ingests Nomic embeddings with its flexible document storage format. Depending on the application, embeddings and additional metadata can be neatly stored together or separately in MongoDB collections. MongoDB Atlas and Nomic Embed are both available as AWS Marketplace offerings for same-VPC deployments. MongoDB Atlas Stream Processing is a perfect fit for Nomic Embed’s high throughput capabilities. Incoming data streams are robustly processed and can be combined with MongoDB Database Triggers to generate embeddings for immediate downstream use.
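The query flow described earlier in this section (vectorize the query text, then hand the embedding to Atlas Vector Search) might look roughly like the following. This is a sketch, not the article's implementation: it only builds the aggregation pipeline, and the index name `pdf_chunks_index` and the field names `embedding`, `text`, `source_pdf`, and `page_range` are hypothetical placeholders for whatever your collection actually uses.

```python
def build_vector_search_pipeline(query_vector,
                                 index_name="pdf_chunks_index",  # hypothetical index name
                                 path="embedding",               # hypothetical vector field
                                 limit=5,
                                 num_candidates=100):
    """Build an aggregation pipeline that retrieves the chunks nearest to a query vector."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,  # candidates considered before ranking
                "limit": limit,                   # results returned
            }
        },
        {
            # Return the chunk text plus its similarity score.
            "$project": {
                "_id": 0,
                "text": 1,
                "source_pdf": 1,
                "page_range": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


if __name__ == "__main__":
    # A toy 3-dimensional vector stands in for a real query embedding.
    for stage in build_vector_search_pipeline([0.1, 0.2, 0.3], limit=3):
        print(stage)
```

With PyMongo, the resulting list would typically be passed to `collection.aggregate(pipeline)`. The query vector must come from the same embedding model, and use the same dimensionality, as the stored chunk embeddings.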
Given Nomic Embed’s lightweight nature and offline capabilities (via private or local deployments from open source), embeddings can be produced and ingested into MongoDB at extremely rapid transfer rates. MongoDB Atlas Vector Search delivers a fast and accessible method to leverage Nomic embeddings for semantic search. MongoDB Atlas Vector Search lets you combine these fast vector search queries with traditional database queries on various metadata, providing a flexible and powerful analytics tool for data insights, user recommendations, and more.

Industry use cases

PDFs are ubiquitous. In one way or another, every company in the world needs to extract and analyze PDF content to make business decisions or comply with regulations. Let’s have a look at some industry use cases:

Financial services

The financial services industry is constantly bombarded with essential updates, including market data, financial statements, and regulatory changes. Some of this information, such as financial statements, annual reports, and regulatory filings, resides in PDF format. Efficient and reliable navigation through these documents is crucial for gaining a competitive edge in investment decision-making. For example, investors scrutinize key financial metrics such as revenue growth, profit margins, and cash flow trends extracted from income statements, balance sheets, and cash flow statements. They use this information to compare companies, gauging their strategic direction, risks, and competitive positioning before investing. However, accessing and extracting data from these PDFs can be a time-consuming challenge, hindering agility in the fast-paced financial landscape. Here, semantic search for financial PDFs offers a dramatic improvement in information discovery. By leveraging semantic search technology, which interprets the intent and contextual meaning behind a search query, FSI professionals can significantly enhance their ability to find relevant information.
This applies equally to the broader financial industry, including areas like market analysis, performance evaluation, and many more.

Retail

In the retail industry, the challenge of processing hundreds of thousands of invoices from numerous suppliers annually is a common scenario. Most invoices are in PDF format, and the challenge arises from the combination of invoice volume and the variability in layouts and languages from one supplier to another. This makes manual processing impractical and error-prone. The question becomes: how can retailers automate this end-to-end process efficiently and accurately? The answer lies in solutions that utilize advanced technologies like AI and PDF search capabilities. By leveraging these solutions, retailers can automatically scan invoices, extract relevant data, and validate it against purchase orders and received goods. Moreover, these solutions offer the flexibility to adapt to different invoice layouts without the need for templates, ensuring scalability and efficiency gains. With increased automation rates and improved accuracy levels, retailers can shift focus from low-value manual tasks to more strategic initiatives, accelerating their digital transformation journey and unlocking significant cost savings along the way.

Manufacturing & motion

There are vast amounts of unstructured data contained in PDFs across the Manufacturing and Automotive industries, from machine instruction booklets to production or maintenance guidelines, Six Sigma best practices, production results, and team lead annotations. All this valuable data must be shared, read, and stored manually, introducing significant friction when it comes to leveraging its full potential. With MongoDB Atlas Vector Search, manufacturing companies have the opportunity to completely revive this data and make real use of it in their day-to-day operations, all while reducing the time spent managing these manuals and having everything ready to be accessed.
It is as simple as vectorizing the documents, uploading them to MongoDB Atlas, and connecting a RAG-enabled application to this data source. With this, operators in a manufacturing plant can describe a problem to a smart interface and ask how to troubleshoot it. The interface will retrieve the specific parts of the manual that show how to address the issue. Moreover, it can also retrieve notes from previous operators, team leaders, or previous troubleshooting efforts, providing a very rich context and accelerating the problem-solving process. PDF RAG-enabled applications in manufacturing open up a wide range of operational improvements that directly benefit the company's bottom line.

PDF search at scale

In today’s data-driven world, extracting insights from unstructured data like PDFs is challenging. Traditional search methods fall short, but advancements in AI, like Nomic Embed, have revolutionized PDF search. By leveraging MongoDB with Nomic Embed, organizations gain a powerful and cost-effective AI-powered solution for large PDF collections. Nomic Embed’s extensive context, high throughput capabilities, and MongoDB’s seamless integration and powerful analytics enable efficient and reliable PDF search applications. This translates to enhanced data accessibility, faster decision-making, and improved operational efficiency. Don't waste time struggling with traditional PDF search! Apply for an innovation workshop to discuss what’s possible with our industry experts.

If you would like to discover more about MongoDB and GenAI:

Building a RAG LLM with Nomic Embed and MongoDB
From Relational Databases to AI: An Insurance Data Modernization Journey

April 30, 2024

Building AI with MongoDB: Conversation Intelligence with Observe.AI

What's really happening in your business? The answer to that question lies in the millions of interactions between your customers and your brand. If you could listen in on every one of them, you'd know exactly what was up, and down. You’d also be able to continuously improve customer service by coaching agents when needed. However, the reality is that most companies have visibility into only 2% of their customer interactions. Observe.AI is here to change that. The company is focused on being the fastest way to boost contact center performance with live conversation intelligence.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Founded in 2017 and headquartered in California, Observe.AI has raised over $200m in funding. Its team of 250+ members serves more than 300 organizations across various industries. Leading companies like Accolade, Pearson, Public Storage, and 2U partner with Observe.AI to accelerate outcomes from the frontline to the rest of the business. The company has pioneered a 40 billion-parameter contact center large language model (LLM) and one of the industry’s most accurate Generative AI engines. Through these innovations, Observe.AI provides analysis and coaching to maximize the performance of its customers’ front-line support and sales teams. We sat down with Jithendra Vepa, Ph.D., Chief Scientist & India General Manager at Observe.AI, to learn more about the AI stack powering the industry-first contact center LLM.

Can you start by describing the AI/ML techniques, algorithms, or models you are using?

“Our products employ a versatile range of AI and ML techniques, covering various domains. Within natural language processing (NLP), we rely on advanced algorithms and models such as transformers, including the likes of transformer-based in-house LLMs, for text classification, intent and entity recognition tasks, summarization, question-answering, and more.
We embrace supervised, semi-supervised, and self-supervised learning approaches to enhance our models' accuracy and adaptability." "Additionally, our application extends its reach into speech processing, where we leverage state-of-the-art methods for tasks like automatic speech recognition and sentiment analysis. To ensure our language capabilities remain at the forefront, we integrate the latest Large Language Models (LLMs), ensuring that our application benefits from cutting-edge natural language understanding and generation capabilities. Our models are trained using contact center data to make them domain-specific and more accurate than generic models out there.”

Can you share more on how you train and tune your models?

“In the realm of model development and training, we leverage prominent frameworks like TensorFlow and PyTorch. These frameworks empower us to craft, fine-tune, and train intricate models, enabling us to continually improve their accuracy and efficiency." "In our natural language processing (NLP) tasks, prompt engineering and meticulous fine-tuning hold pivotal roles. We utilize advanced techniques like transfer learning and gradient-based optimization to craft specialized NLP models tailored to the nuances of our tasks."

How do you operationalize and monitor these models?

"To streamline our machine learning operations (MLOps) and ensure seamless scalability, we have incorporated essential tools such as Docker and Kubernetes. These facilitate efficient containerization and orchestration, enabling us to deploy, manage, and scale our models with ease, regardless of the complexity of our workloads."
"To maintain a vigilant eye on the performance of our models in real time, we have implemented robust monitoring and logging to continuously collect and analyze data on model performance, enabling us to detect anomalies, address issues promptly, and make data-driven decisions to enhance our application's overall efficiency and reliability.”

The role of MongoDB in Observe.AI’s technology stack

The MongoDB developer data platform gives the company’s developers and data scientists a unified solution to build smarter AI applications. Describing how they use MongoDB, Jithendra says, “Observe.AI processes and runs models on millions of support touchpoints daily to generate insights for our customers. Most of this rich, unstructured data is stored in MongoDB. We chose to build on MongoDB because it enables us to quickly innovate, scale to handle large and unpredictable workloads, and meet the security requirements of our largest enterprise customers.”

Getting started

Thanks so much to Jithendra for sharing details on the technology stack powering Observe.AI’s conversation intelligence and MongoDB’s role. To learn more about how MongoDB can help you build AI-enriched applications, take a look at the MongoDB for Artificial Intelligence page. Here, you will find tutorials, documentation, and whitepapers that will accelerate your journey to intelligent apps.

April 29, 2024

Building AI With MongoDB: Integrating Vector Search And Cohere to Build Frontier Enterprise Apps

Cohere is the leading enterprise AI platform, building large language models (LLMs) which help businesses unlock the potential of their data. Operating at the frontier of AI, Cohere’s models provide a more intuitive way for users to retrieve, summarize, and generate complex information. Cohere offers both text generation and embedding models to its customers. Enterprises running mission-critical AI workloads select Cohere because its models offer the best performance-cost tradeoff and can be deployed in production at scale. Cohere’s platform is cloud-agnostic. Their models are accessible through their own API as well as popular cloud managed services, and can be deployed on a virtual private cloud (VPC) or even on-prem to meet companies where their data is, offering the highest levels of flexibility and control. Cohere’s leading Embed 3 and Rerank 3 models can be used with MongoDB Atlas Vector Search to convert MongoDB data to vectors and build a state-of-the-art semantic search system. Search results also can be passed to Cohere’s Command R family of models for retrieval augmented generation (RAG) with citations.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

A new approach to vector embeddings

It is in the realm of embedding where Cohere has made a host of recent advances. Described as “AI for language understanding,” Embed is Cohere’s leading text representation language model. Cohere offers both English and multilingual embedding models, and gives users the ability to specify the type of data they are computing an embedding for (e.g., search document, search query). The result is embeddings that improve the accuracy of search results for traditional enterprise search or retrieval-augmented generation. One challenge developers faced using Embed was that documents had to be passed one by one to the model endpoint, limiting throughput when dealing with larger data sets.
To address that challenge and improve developer experience, Cohere has recently announced its new Embed Jobs endpoint. Now entire data sets can be passed in one operation to the model, and embedded outputs can be more easily ingested back into your storage systems. Additionally, with only a few lines of code, Rerank 3 can be added at the final stage of search systems to improve accuracy. It also works across 100+ languages and offers uniquely high accuracy on complex data such as JSON, code, and tabular structure. This is particularly useful for developers who rely on legacy dense retrieval systems. Demonstrating how developers can exploit this new endpoint, we have published the How to use Cohere embeddings and rerank modules with MongoDB Atlas tutorial. Readers will learn how to store, index, and search the embeddings from Cohere. They will also learn how to use the Cohere Rerank model to provide a powerful semantic boost to the quality of keyword and vector search results.

Figure 1: Illustrating the embedding generation and search workflow shown in the tutorial

Why MongoDB Atlas and Cohere?

MongoDB Atlas provides a proven OLTP database handling high read and write throughput backed by transactional guarantees. Pairing these capabilities with Cohere’s batch embeddings is massively valuable to developers building sophisticated gen AI apps. Developers can be confident that Atlas Vector Search will handle high scale vector ingestion, making embeddings immediately available for accurate and reliable semantic search and RAG. Increasing the speed of experimentation, developers and data scientists can configure separate vector search indexes side by side to compare the performance of different parameters used in the creation of vector embeddings. In addition to batch embeddings, Atlas Triggers can also be used to embed new or updated source content in real time, as illustrated in the Cohere workflow shown in Figure 2.
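As a rough illustration of configuring separate vector search indexes side by side, the sketch below builds two Atlas Vector Search index definitions over two differently generated embedding fields in the same collection. The index names, field paths (`embedding_a`, `embedding_b`, `category`), and dimension counts are hypothetical; use the output size documented for whichever embedding model variant you are comparing.

```python
ALLOWED_SIMILARITIES = {"cosine", "euclidean", "dotProduct"}


def vector_index_definition(path, num_dimensions, similarity="cosine", filter_paths=()):
    """Build an Atlas Vector Search index definition for one embedding field.

    `filter_paths` lists metadata fields that should be usable as query filters.
    """
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity}")
    fields = [
        {
            "type": "vector",
            "path": path,
            "numDimensions": num_dimensions,
            "similarity": similarity,
        }
    ]
    fields.extend({"type": "filter", "path": p} for p in filter_paths)
    return {"fields": fields}


# Two candidate indexes over the same documents, one per embedding variant.
# Names, paths, and dimensions are placeholders for illustration only.
candidate_indexes = {
    "idx_embedding_a": vector_index_definition(
        "embedding_a", 1024, filter_paths=["category"]),
    "idx_embedding_b": vector_index_definition(
        "embedding_b", 384, similarity="dotProduct", filter_paths=["category"]),
}

if __name__ == "__main__":
    for name, definition in candidate_indexes.items():
        print(name, definition)
```

Each definition can then be created as its own vector search index, so the same query can be run against both and the results compared.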
Figure 2: MongoDB Atlas Vector Search supports Cohere’s batch and real time workflows. (Image courtesy of Cohere)

Supporting both batch and real-time embeddings from Cohere makes MongoDB Atlas well suited to highly dynamic gen AI-powered apps that need to be grounded in live, operational data. Developers can use MongoDB’s expressive query API to pre-filter query predicates against metadata, making it much faster to access and retrieve the more relevant vector embeddings. The unification and synchronization of source application data, metadata, and vector embeddings in a single platform, accessed by a single API, makes building gen AI apps faster, with lower cost and complexity. Those apps can be layered on top of the secure, resilient, and mature MongoDB Atlas developer data platform that is used today by over 45,000 customers spanning startups to enterprises and governments handling mission-critical workloads.

What's next?

To start your journey into gen AI and Atlas Vector Search, review our 10-minute Learning Byte. In the video, you’ll learn about use cases, benefits, and how to get started using Atlas Vector Search.

April 25, 2024

Collaborating to Build AI Apps: MongoDB and Partners at Google Cloud Next '24

From April 9 to April 11, Las Vegas became the center of the tech world, as Google Cloud Next '24 took over the Mandalay Bay Convention Center, and the convention’s spotlight shined brightest on gen AI.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Between MongoDB’s big announcements with Google Cloud (which included an expanded collaboration to enhance building, scaling, and deploying GenAI applications using MongoDB Atlas Vector Search and Vertex AI), industry sessions, and customer meetings, we offered in-booth lightning talks with leaders from four MongoDB partners (LangChain, LlamaIndex, Patronus AI, and Unstructured) who shared valuable insights and best practices with developers who want to embed AI into their existing applications or build new-generation apps powered by AI. Developing next-generation AI applications involves several challenges, including handling complex data sources, incorporating structured and unstructured data, and mitigating scalability and performance issues in processing and analyzing them. The lightning talks at Google Cloud Next '24 addressed some of these critical topics and presented practical solutions. One of the most popular sessions was from Harrison Chase, co-founder and CEO at LangChain, an open-source framework for building applications based on large language models (LLMs). Harrison provided tips on fixing your retrieval-augmented generation (RAG) pipeline when it fails, addressing the most common pitfalls of fact retrieval, non-semantic components, conflicting information, and other failure modes. Harrison recommended developers use LangChain templates for MongoDB Atlas to deploy RAG applications quickly.
Meanwhile, LlamaIndex (an orchestration framework that integrates private and public data for building applications using LLMs) was represented by Simon Suo, co-founder and CTO, who discussed the complexities of advanced document RAG and the importance of using good data to perform better retrieval and parsing. He also highlighted MongoDB’s partnership with LlamaIndex, allowing for ingesting data into the MongoDB Atlas Vector database and retrieving the index from MongoDB Atlas via LlamaParse and LlamaCloud.

Guillaume Nozière - Patronus AI
Andrew Zane - Unstructured

Amidst so many booths, activities, and competing programming, a range of developers from across industries showed up to these insightful sessions, where they could engage with experts, ask questions, and network in a casual setting. They also learned how our AI partners and MongoDB work together to offer complementary solutions to create a seamless gen AI development experience. We are grateful for LangChain, LlamaIndex, Patronus AI, and Unstructured's ongoing partnership. We look forward to expanding our collaboration to help our joint customers build the next generation of AI applications. To learn more about building AI-powered apps with MongoDB, check out our AI Resources Hub and stop by our Partner Ecosystem Catalog to read about our integrations with these and other AI partners.

April 23, 2024

A Smarter Factory Floor with MongoDB Atlas and Google Cloud's Manufacturing Data Engine

The manufacturing industry is undergoing a transformative shift from traditional to digital, propelled by data-driven insights, intelligent automation, and artificial intelligence. Traditional methods of data collection and analysis are no longer sufficient to keep pace with the demands of today's competitive landscape. This is precisely where Google Cloud’s Manufacturing Data Engine (MDE) and MongoDB Atlas come into play, offering a powerful combination for optimizing your factory floor.

Unlock the power of your factory data

MDE is positioned to transform the factory floor with the power of cloud-driven insights. MDE simplifies communication with your factory floor, regardless of the diverse protocols your machines might use. It connects legacy equipment with modern systems, ensuring a comprehensive data stream. MDE doesn't just collect data; it transforms it. By intelligently processing and contextualizing the information, it gives you a clearer picture of your production processes in real time, with historical context. It offers pre-built analytics and AI tools that directly address common manufacturing pain points. This means you can start finding solutions faster, whether you are identifying bottlenecks, reducing downtime, or optimizing resource utilization. It also supports integrations that can further enhance the potential of the data (e.g., additional data sinks).

The MongoDB Atlas developer data platform enhances MDE by providing scalability and flexibility through automated scaling to adapt to evolving data requirements. This makes it particularly suitable for dynamic manufacturing environments. MongoDB’s document model handles diverse data types and structures, and its native JSON format keeps it flexible. This allows for enriching MDE data with other relevant data, such as supply chain logistics, for a deeper understanding of the factory business.
You can gain immediate insights into your operations through real-time analytics, enabling informed decision-making based on up-to-date data. While MDE offers a robust solution for collecting, contextualizing, and managing industrial data, leveraging it alongside MongoDB Atlas offers tremendous advantages.

Inside the MDE integration

Google Cloud’s Manufacturing Data Engine (MDE) acts as a central hub for your factory data. It not only processes and enriches the data with context, but also offers flexible storage options like BigQuery and Cloud Storage. Now, customers already using MongoDB Atlas can skip the hassle of application re-integration and make this data readily accessible for applications.

Through this joint solution developed by Google Cloud and MongoDB, you can move the processed streaming data from MDE to MongoDB Atlas using Dataflow jobs. MDE publishes the data via a Pub/Sub subscription, which is then picked up by a custom Dataflow job built by MongoDB. This job transforms the data into the desired format and writes it to your MongoDB Atlas cluster. Google Cloud’s MDE and MongoDB Atlas use compatible data structures, simplifying data integration through a shared semantic configuration. Once the data resides in MongoDB Atlas, your existing applications can access it without any code modifications, saving you time and effort.

The flexibility of MDE, combined with the scalability and ease of use of MongoDB Atlas, makes this a powerful and versatile solution for data-driven use cases such as predictive maintenance and quality control, while the factory retains ownership of the data. Instructions on how to set up the Dataflow job are available in the MongoDB GitHub repository.

Conclusion

If you want to level up your manufacturing data analytics, pairing MDE with MongoDB Atlas provides a proven, easy-to-implement solution. It's easy to get started with MDE and MongoDB Atlas.
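To make the transformation step of the pipeline more concrete, here is a minimal Python sketch of what reshaping one Pub/Sub-style message into a MongoDB-ready document could look like. It is not the Dataflow job from the MongoDB GitHub repository: the payload field names (tagName, value, timestampMs), the attribute keys, and the function name are all hypothetical, chosen only for illustration.

```python
import base64
import json
from datetime import datetime, timezone


def pubsub_message_to_document(message: dict) -> dict:
    """Convert one Pub/Sub-style message into a MongoDB-ready document.

    The envelope (base64 `data` plus string `attributes`) mirrors the
    Pub/Sub JSON message shape; the payload field names are hypothetical.
    """
    payload = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
    return {
        "tagName": payload["tagName"],
        "value": payload["value"],
        # Keep timestamps as datetime objects so a driver can store real dates.
        "eventTime": datetime.fromtimestamp(
            payload["timestampMs"] / 1000, tz=timezone.utc
        ),
        # Context attached upstream, for example site or line identifiers.
        "metadata": dict(message.get("attributes", {})),
    }


# Hypothetical sample message standing in for what a subscription delivers.
sample = {
    "data": base64.b64encode(
        json.dumps(
            {
                "tagName": "press-7/temperature",
                "value": 71.5,
                "timestampMs": 1712620800000,
            }
        ).encode("utf-8")
    ).decode("ascii"),
    "attributes": {"site": "plant-a", "line": "3"},
}

document = pubsub_message_to_document(sample)
print(document)
```

In a real pipeline the returned dictionary would be handed to a database write (for example, pymongo's `insert_one`) rather than printed.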

April 9, 2024

Workload Isolation for More Scalability and Availability: Search Nodes Now on Google Cloud

May 2, 2024: Announcing Search Nodes in preview on Microsoft Azure

Today we’re excited to take the next step in bringing scalable, dedicated architecture to your search experiences with the introduction of Atlas Search Nodes, now in general availability for Google Cloud.

Since our initial announcement of Search Nodes in June of 2023, we’ve been rapidly accelerating access to the most scalable dedicated architecture, starting with general availability on AWS and now expanding to general availability on Google Cloud. We'd like to give you a bit more context on what Search Nodes are and why they're important to any search experience running at scale.

Search Nodes provide dedicated infrastructure for Atlas Search and Vector Search workloads to enable even greater control over search workloads. They also allow you to isolate and optimize compute resources to scale search and database needs independently, delivering better performance at scale and higher availability.

One of the last things developers want to deal with when building and scaling apps is having to worry about infrastructure problems. Any downtime or poor user experiences can result in lost users or revenue, especially when it comes to your database and search experience. This is one of the reasons developers turn to MongoDB, given the ease of use of having one unified system for your database and search solution.

With the introduction of Atlas Search Nodes, we’ve taken the next step in providing our builders with ultimate control, giving them the ability to remain flexible by scaling search workloads without the need to over-provision the database.
By isolating your search and database workloads while at the same time automatically keeping your search cluster data synchronized with operational data, Atlas Search and Atlas Vector Search eliminate the need to run a separate ETL tool, which takes time and effort to set up and is yet another point of failure for your scaling app. This provides superior performance and higher availability while reducing architectural complexity and the engineering time wasted recovering from sync failures. In fact, we’ve seen a 40% to 60% decrease in query time for many complex queries, while eliminating the chances of any resource contention or downtime.

With just a quick button click, Search Nodes on Google Cloud offer our existing Atlas Search and Vector Search users the following benefits: higher availability, increased scalability, workload isolation, better performance at scale, and improved query performance.

We offer both compute-heavy search-specific nodes for relevance-based text search and a memory-optimized option that is optimal for semantic and retrieval-augmented generation (RAG) production use cases with Atlas Vector Search. This makes resource contention or availability issues a thing of the past.

Search Nodes are easy to opt into and set up. To start, jump into the MongoDB UI and do the following:

1. Navigate to your “Database Deployments” section in the MongoDB UI.
2. Click the green “+Create” button.
3. On the “Create New Cluster” page, select Google Cloud and enable the radio button for “Multi-cloud, multi-region & workload isolation”.
4. Enable the radio button for “Search Nodes for workload isolation”.
5. Select the number of nodes in the text box.
6. Check the agreement box.
7. Click “Create cluster”.

For existing Atlas Search users, click “Edit Configuration” in the MongoDB Atlas Search UI and enable the toggle for workload isolation. Then the steps are the same as noted above.

Jump straight into our docs to learn more!

March 28, 2024

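Workload Isolation for More Scalability and Availability: Search Nodes Now on Google Cloud

Today, we're excited to announce that Atlas Search Nodes are now available on Google Cloud in public preview, bringing us one step closer to our goal of providing scalable, dedicated architecture for search experiences. Since first announcing Search Nodes in June 2023, we have been accelerating access to this most scalable dedicated architecture, first with general availability on AWS and now with a public preview on Google Cloud. Here is a brief look at what Search Nodes are and why they matter to any search experience running at scale.

Search Nodes provide dedicated infrastructure for Atlas Search and Vector Search workloads, giving you greater control over search workloads. By isolating and optimizing compute resources, you can scale search and database needs independently, delivering better performance at scale and higher availability.

One of the last things developers want to deal with when building and scaling apps is worrying about infrastructure problems. Any downtime or poor user experience can result in lost users or revenue, and the impact is especially pronounced when it involves your database and search experience. This is one of the reasons developers turn to MongoDB: it lets them use one unified system for their database and search solution.

With the introduction of Atlas Search Nodes, we have taken another important step in giving builders maximum control. Builders can now scale search workloads without over-provisioning the database, so they stay flexible. With Atlas Search and Atlas Vector Search, you can isolate search and database workloads while automatically keeping your search cluster data synchronized with operational data. That removes the need to run a separate ETL tool, along with the time and effort of extra setup, and avoids another source of errors as your app scales. This improves performance and availability while reducing architectural complexity and the engineering time spent recovering from sync failures. In fact, we have seen a 40% to 60% decrease in query time for many complex queries, and resource contention and downtime issues have been resolved as well.

With the flip of a toggle, Search Nodes on Google Cloud offer Atlas Search and Vector Search users the following benefits: higher availability, increased scalability, workload isolation, better performance at scale, and improved query performance.

We offer compute-heavy, search-specific nodes for relevance-based text search, as well as a memory-optimized option that is best suited to semantic and RAG production use cases with Atlas Vector Search. This resolves long-standing resource contention and availability issues.

Enabling and setting up Search Nodes is simple. Go to the MongoDB UI and do the following:

1. Navigate to the “Database Deployments” section in the MongoDB UI.
2. Click the green “+Create” button.
3. On the “Create New Cluster” page, switch the Google Cloud radio button for “Multi-cloud, multi-region & workload isolation” to enabled.
4. Switch the radio button for “Search Nodes for workload isolation” to enabled.
5. Select the number of nodes in the text box.
6. Check the agreement box.
7. Click “Create cluster”.

For existing Atlas Search users, click “Edit Configuration” in the MongoDB Atlas Search UI and enable the toggle for workload isolation. The remaining steps are the same as above.

Jump straight into our docs to learn more!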

March 28, 2024

Building AI With MongoDB: How DevRev is Redefining CRM for Product-Led Growth

OneCRM from DevRev is purpose-built for Software-as-a-Service (SaaS) companies. It brings together previously separate customer relationship management (CRM) suites for product management, support, and software development. Built on a foundation of customizable large language models (LLMs), data engineering, analytics, and MongoDB Atlas , it connects end users, sellers, support, product owners, and developers. OneCRM converges multiple discrete business apps and teams onto a common platform. As the company states on its website “Our mission is to connect makers (Dev) to customers (Rev) . When every employee adopts a “product-thinking” mindset, customer-centricity transcends from a department to become a culture.” DevRev was founded in October 2020 and raised over $85 million in seed funding from investors such as Khosla Ventures and Mayfield. At the time, this made it the largest seed in the history of Silicon Valley. The company is led by its co-founder and CEO, Dheeraj Pandey, who was previously the co-founder and CEO of Nutanix, and by Manoj Agarwal, DevRev's co-founder and former SVP of Engineering at Nutanix. DevRev is headquartered in Palo Alto and has offices in seven global locations. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. CRM + AI: Digging into the stack DevRev’s Support and Product CRM serve over 4,500 customers: Support CRM brings support staff, product managers, and developers onto an AI-native platform to automate Level 1 (L1), assist L2, and elevate L3 to become true collaborators. Product CRM brings product planning, software work management, and product 360 together so product teams can assimilate the voice of the customer in real-time. Figure 1: DevRev’s real-time dashboards empower product teams to detect at-risk customers, monitor product health, track development velocity, and more. AI is central to both the Support and Product CRMs. 
The company’s engineers build and run their own neural networks, fine-tuned with application data managed by MongoDB Atlas. This data is also encoded by open-source embedding models where it is used alongside OpenAI models for customer support chatbots and question-answering tasks orchestrated by autonomous agents. MongoDB partner LangChain is used to call the models, while also providing a layer of abstraction that frees DevRev engineers to effortlessly switch between different generative AI models as needed. Data flows across DevRev’s distributed microservices estate and into its AI models are powered by MongoDB change streams . Downstream services are notified in real-time of any data changes using a fully reactive, event-driven architecture. MongoDB Atlas: AI-powered CRM on an agile and trusted data platform MongoDB is the primary database backing OneCRM, managing users, customer and product data, tickets, and more. DevRev selected MongoDB Atlas from the very outset of the company. The flexibility of its data model, freedom to run anywhere, reliability and compliance, and operational efficiency of the Atlas managed service all impact how quickly DevRev can build and ship high-quality features to its customers. The flexibility of the document data model enables DevRev’s engineers to handle the massive variety of data structures their microservices need to work with. Documents are large, and each can have many custom fields. To efficiently store, index, and query this data, developers use MongoDB’s Attribute pattern and have the flexibility to add, modify, and remove fields at any time. The freedom to run MongoDB anywhere helps the engineering team develop, test, and release faster. Developers can experiment locally, then move to integration testing, and then production — all running in different environments — without changing a single line of code. 
This is core to DevRev’s velocity in handling over 4,000 pull requests per month. Developers can experiment and test with MongoDB on local instances (for example, adding indexes or evaluating new query operators), enabling them to catch issues earlier in the development cycle. Once unit tests are complete, developers can move to temporary instances in Docker containers for end-to-end integration testing. When ready, teams can deploy to production in MongoDB Atlas.

The multi-cloud architecture of Atlas provides flexibility and choice that proprietary offerings from the hyperscalers can’t match. While DevRev today runs on AWS, in the early days of the company, they evaluated multiple cloud vendors. Knowing that MongoDB Atlas could run anywhere gave them the confidence to make a choice on the platform, knowing they would not be locked into that choice in the future.

“With MongoDB Atlas, our development velocity is 3-4x higher than if we used alternative databases. We can get our innovations to market faster, providing our customers with even more modern and useful CRM solutions.”
Anshu Avinash, Founding Engineer, DevRev

The HashiCorp Terraform MongoDB Atlas Provider automates infrastructure deployments by making it easy to provision, manage, and control Atlas configurations as code. “The automation provided by Atlas and Terraform means we’ve avoided having to hire a dedicated infrastructure engineer for our database layer,” says Anshu. “This is a savings we can redirect into adding developers to work on customer-facing features.”

Figure 2: The reactive, event-driven microservices architecture underpinning DevRev’s AI-powered CRM platform

Anshu goes on to say, “We have a microservices architecture where each microservice manages its own database and collections. By using MongoDB Atlas, we have little to no management overhead. We never even look at minor version upgrades, which Atlas does for us in the background with zero downtime.
Even the major version upgrades do not require any downtime, which is pretty unique for database systems.” Discussing scalability, Anshu says, “As the business has grown, we have been able to scale Atlas, again without downtime. We can move between instance and cluster sizes as our workloads expand, and with auto-storage scaling, we don’t need to worry about disks getting full.” DevRev manages critical customer data, and so relies on MongoDB Atlas’ native encryption and backup for data protection and regulatory compliance. The ability to provide multi-region databases in Atlas means global customers get further control over data residency, latency, and high availability requirements. Anshu goes on to say, “We also have the flexibility to use MongoDB’s native sharding to scale-out the workloads of our largest customers with complete tenant isolation.” DevRev is redefining the CRM market through AI, with MongoDB Atlas playing a critical role as the company’s data foundation. You can learn more about how innovators across the world are using MongoDB by reviewing our Building AI case studies . If your team is building AI apps, sign up for the AI Innovators Program . Successful companies get access to free Atlas credits and technical enablement, as well as connections into the broader AI ecosystem.
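The Attribute pattern mentioned above can be illustrated with a small Python sketch. It assumes the commonly documented form of the pattern, where variable fields move into an array of key/value pairs; the ticket fields, the `to_attribute_pattern` helper, and the index layout are hypothetical and are not taken from DevRev's schema.

```python
def to_attribute_pattern(record: dict, fixed_fields: set) -> dict:
    """Move every non-fixed field into a list of {"k": ..., "v": ...} pairs."""
    document = {name: record[name] for name in record if name in fixed_fields}
    document["attributes"] = [
        {"k": name, "v": value}
        for name, value in sorted(record.items())
        if name not in fixed_fields
    ]
    return document


# Hypothetical ticket with two custom fields added by a customer.
ticket = {
    "_id": "TKT-1",
    "title": "Login fails on mobile",
    "severity": "high",
    "customer_tier": "enterprise",
}

document = to_attribute_pattern(ticket, fixed_fields={"_id", "title"})
print(document)

# A single compound index over the pairs can serve queries on any custom
# field, instead of one index per field (key list in pymongo's format).
index_keys = [("attributes.k", 1), ("attributes.v", 1)]

# Query shape that matches one key/value pair inside the array.
query = {"attributes": {"$elemMatch": {"k": "severity", "v": "high"}}}
```

The design trade-off is that adding or removing a custom field becomes a change to array contents rather than a change to the set of indexes.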

March 27, 2024

Fireworks AI and MongoDB: The Fastest AI Apps with the Best Models, Powered By Your Data

We’re happy to announce that Fireworks AI and MongoDB are now partnering to make innovating with generative AI faster, more efficient, and more secure. Fireworks AI was founded in late 2022 by industry veterans from Meta’s PyTorch team, where they focused on performance optimization, improving the developer experience, and running AI apps at scale.

It’s this expertise that Fireworks AI brings to its production AI platform, curating and optimizing the industry's leading open models. Benchmarking by the company shows gen AI models running on Fireworks AI deliver up to 4x faster inference speeds than alternative platforms, with up to 8x higher throughput and scale.

Models are one part of the application stack. But for developers to unlock the power of gen AI, they also need to bring enterprise data to those models. That’s why Fireworks AI has partnered with MongoDB, addressing one of the toughest challenges to adopting AI. With MongoDB Atlas, developers can securely unify operational data, unstructured data, and vector embeddings to safely build consistent, correct, and differentiated AI applications and experiences.

Jointly, Fireworks AI and MongoDB provide a solution for developers who want to leverage highly curated and optimized open-source models, combine these with their organization’s own proprietary data, and do it all with unparalleled speed and security.

Lightning-fast models from Fireworks AI: Enabling speed, efficiency, and value

Developers can choose from many different models to build their gen AI-powered apps. Navigating the AI landscape to identify the most suitable models for specific tasks, and tuning them to achieve the best levels of price and performance, is complex and creates friction in building and running gen AI apps. This is one of the key pain points that Fireworks AI alleviates.
With its lightning-fast inference platform, Fireworks AI curates, optimizes, and deploys 40+ different AI models. These optimizations can simultaneously result in significant cost savings , reduced latency , and improved throughput. Their platform delivers this via: Off-the-shelf models, optimized models, and add-ons: Fireworks AI provides a collection of top-quality text, embedding, and image foundation models . Developers can leverage these models or fine-tune and deploy their own, pairing them with their own proprietary data using MongoDB Atlas. Fine-tuning capabilities : To further improve model accuracy and speed, Fireworks AI also offers a fine-tuning service using its CLI to ingest JSON-formatted objects from databases such as MongoDB Atlas. Simple interfaces and APIs for development and production: The Fireworks AI playground allows developers to interact with models right in a browser. It can also be accessed programmatically via a convenient REST API. This is OpenAI API-compatible and thus interoperates with the broader LLM ecosystem. Cookbook: A simple and easy-to-use cookbook provides a comprehensive set of ready-to-use recipes that can be adapted for various use cases, including fine-tuning, generation, and evaluation. Fireworks AI and MongoDB: Setting the standard for AI with curated, optimized, and fast models With Fireworks AI and MongoDB Atlas, apps run in isolated environments ensuring uptime and privacy, protected by sophisticated security controls that meet the toughest regulatory standards: As one of the top open-source model API providers, Fireworks AI serves 66 billion tokens per day (and growing). With Atlas, you run your apps on a proven platform that serves tens of thousands of customers, from high-growth startups to the largest enterprises and governments. 
Together, the Fireworks AI and MongoDB joint solution enables:

- Retrieval-augmented generation (RAG) or Q&A from a vast pool of documents: Ingest a large number of documents to produce summaries and structured data that can then power conversational AI.
- Classification through semantic/similarity search: Classify and analyze concepts and emotions from sales calls, video conferences, and more to provide better intelligence and strategies. Or, organize and classify a product catalog using product images and text.
- Images to structured data extraction: Extract meaning from images to produce structured data that can be processed and searched in a range of vision apps, from stock photos, to fashion, to object detection, to medical diagnostics.
- Alert intelligence: Process large amounts of data in real time to automatically detect and alert on instances of fraud, cybersecurity threats, and more.

Figure 1: The Fireworks tutorial showcases how to bring your own data to LLMs with retrieval-augmented generation (RAG) and MongoDB Atlas

Getting started with Fireworks AI and MongoDB Atlas

To help you get started, review the Optimizing RAG with MongoDB Atlas and Fireworks AI tutorial, which shows you how to build a movie recommendation app and involves:

- Vector Store: A MongoDB Atlas database that indexes movies using embeddings.
- Vectorisation: A system for document embedding generation. We'll use the Fireworks embedding API to create embeddings from text data.
- Retrieval Engine: MongoDB Atlas Vector Search responds to user queries by converting the query to an embedding and fetching the corresponding movies.
- LLM: The Mixtral model uses the Fireworks inference API to generate the recommendations. You can also use Llama, Gemma, and other great OSS models if you like.
- Dataset: The MongoDB Atlas Sample Mflix Dataset, loaded to generate embeddings.

We can also help you design the best architecture for your organization’s needs.
Feel free to connect with your account team or contact us here to schedule a collaborative session and explore how Fireworks AI and MongoDB can optimize your AI development process.
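As a rough illustration of the retrieval step in a movie recommendation app like the one described above, the following Python sketch builds an aggregation pipeline around the `$vectorSearch` stage. It is not code from the tutorial: the index name, the embedding field name, the helper name, and the candidate multiplier are assumptions, and the query vector here is a tiny placeholder rather than a real embedding.

```python
def build_movie_search_pipeline(query_vector, limit=5):
    """Build an aggregation pipeline for a vector similarity search.

    `plot_vector_index` and `plot_embedding` are hypothetical names; use
    whatever index and field your own collection defines.
    """
    return [
        {
            "$vectorSearch": {
                "index": "plot_vector_index",
                "path": "plot_embedding",
                "queryVector": list(query_vector),
                # Consider more candidates than results to improve recall.
                "numCandidates": limit * 20,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "plot": 1,
                # Expose the similarity score computed by the search stage.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Placeholder vector; a real one would come from an embedding model.
pipeline = build_movie_search_pipeline([0.1, 0.2, 0.3], limit=3)
print(pipeline)
```

With a driver such as pymongo, the pipeline would be passed to `collection.aggregate(pipeline)`, and the returned titles and plots would then be placed in the prompt sent to the LLM.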

March 26, 2024

Transforming Industries with MongoDB and AI: Telecommunications and Media

This is the second in a six-part series focusing on critical AI use cases across the manufacturing and motion, financial services, retail, telecommunications and media, insurance, and healthcare industries. Read part one here. The telecommunications industry operates in a landscape characterized by tight profit margins, particularly in commoditized communication and connectivity services where differentiation is minimal. With offerings such as voice, data, and internet access being largely homogeneous, telecom companies need to differentiate and diversify revenue streams to create value and stand out in the market. As digital natives disrupt traditional business models with agile and innovative approaches, established companies are not only competing among themselves but also with newcomers to deliver enhanced customer experiences and adapt to evolving consumer demands. To thrive in an environment where advanced connectivity is increasingly expected, telecom operators must prioritize cost efficiency in their Operations Support Systems (OSS) and Business Support Systems (BSS), elevate customer service standards, and enhance overall customer experiences to secure market share and gain a competitive edge. They’re not alone — media publishers, too, must streamline operations through automation while strengthening reader relationships to foster a willingness to pay for personalized and relevant content. Service assurance Telecommunications providers need to deliver network services at optimal quality and performance levels to meet customer expectations and service level agreements. Key aspects of service assurance include performance monitoring, quality of service (QoS) management, and predictive analytics to anticipate potential service degradation or network failures before they occur. 
With the increasing complexity of telecommunications networks and the growing expectations of customers for high-quality, always-on services, a new bar has been set for service assurance, requiring companies to invest heavily in solutions that can automate and optimize these processes and maintain a competitive edge.

Service assurance is revolutionized by artificial intelligence (AI) through several key capabilities. Machine learning (ML) can be a powerful foundation for predictive maintenance, analyzing patterns and predicting network failures before they occur, which allows for preemptive maintenance and significantly reduces downtime. AI techniques can also sift through complex network systems to accurately identify the root causes of issues, improving the effectiveness of troubleshooting efforts. And for network optimization, AI can analyze log data to identify opportunities for improvement, raising efficiency, reducing operational costs, and optimizing network performance in real time.

MongoDB Atlas’s JSON-based document model is the ideal data foundation to underpin intelligent applications. It enables developers to store log data from various systems without the need for time-intensive upfront data normalization efforts and with the flexibility to deal with a wide variety of different data structures, even as they change over time. By vectorizing the data with an appropriate ML model, it's possible to reflect the healthy system state and identify log information that shows abnormal system behavior. Atlas Vector Search allows for conducting the required K-Nearest Neighbors (KNN) search in an effective way and as a fully included service of the MongoDB Atlas developer data platform. Finally, using an LLM, information about the error, including the analysis of the root cause, can be expressed in natural language, making the job of understanding and fixing the problem much easier for the staff who are in charge of maintenance.
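The idea of comparing new log vectors against a healthy baseline can be sketched in a few lines of Python. This is a brute-force, in-memory stand-in for the nearest-neighbor lookup that Atlas Vector Search would perform; the three-dimensional vectors, the threshold, and the function names are made up for illustration, and real embeddings would come from an ML model.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_anomaly_score(vector, healthy_vectors, k=3):
    """Mean cosine distance from `vector` to its k nearest healthy neighbors."""
    distances = sorted(cosine_distance(vector, h) for h in healthy_vectors)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Toy embeddings of log lines recorded while the system was healthy.
healthy = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1], [0.8, 0.1, 0.1]]
normal_log = [0.95, 0.1, 0.05]
odd_log = [0.0, 0.1, 1.0]
THRESHOLD = 0.2  # illustrative cut-off; tune on real data

for name, vec in (("normal", normal_log), ("odd", odd_log)):
    score = knn_anomaly_score(vec, healthy)
    print(name, round(score, 3), "ANOMALY" if score > THRESHOLD else "ok")
```

A log line whose nearest healthy neighbors are all far away is a candidate for root-cause analysis, and that is the record one might pass to an LLM for a natural-language explanation.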
Fraud detection and prevention Telecom providers today are utilizing an advanced array of techniques for detecting and preventing fraud, constantly adjusting to the dynamic nature of threat actors. Routine activities for detecting fraud consist of tracking unusual call trends and data usage, along with safeguarding against SIM swap incidents, a method frequently used for identity theft. To prevent fraud, strategies are applied at various levels, starting with stringent verification for new customers during SIM swaps or for transactions with elevated risk, taking into account the unique risk profile of each customer. Machine learning offers telecommunications companies a powerful tool to enhance their fraud detection and prevention capabilities by training ML models on historical data like call detail records (CDR). Moreover, these algorithms can assess the individual risk profile of each customer, tailoring detection and prevention strategies to their specific patterns of use. The models can adapt over time, learning from new data and emerging fraud tactics, thus enabling real-time detection and the automation of fraud prevention measures, reducing manual checks, and speeding up response times. To succeed in fraud detection, many data dimensions need to be considered, making the reaction time a critical factor in preventing the worst things from happening. So, the solution must also support fast, sub-second decisions. By vectorizing the data with an appropriate ML model, normal (healthy) business can be defined, and in turn, deviations from the norm identified, such as suspicious user activities. In addition to Atlas Vector Search, the MongoDB Query API supports stream processing , simplifying data ingestion from various sources and detecting fraud in real-time. Content discovery Today’s media organizations are expected to offer a high degree of content personalization, from streaming services to online publications and more. 
Viewers want intelligently selected and suggested content tailored to their interests. Using AI can significantly enhance the process of suggesting the next best article to read or show to stream. The most powerful implementations of content personalization track the behavior of the user, such as what content was searched for, how long was content displayed before the next click happened, and the categories the search falls under. Based on these parameters, similar content can be presented, or, as an alternative strategy, content from unseen areas of the portal so the user may discover new types of media and decide if they like it. To bring the right content to the right people at the right time, an automated system needs to maintain a multitude of information facets, which will lay the foundation for proper suggestions. With MongoDB and its document model, all required data points can be easily and flexibly stored in a user’s profile, in content, and in media. Ultimately, by vectorizing the content, an even more powerful system of content suggestions can be built with Atlas Vector Search, which allows for a similarity search that goes well beyond comparing just keywords or a list of attributes. Other notable use cases Differential Pricing: Gather insights into what customers are willing to spend on content or a service by conducting A/B tests and analyzing the data with an ML algorithm. This method facilitates the adoption of dynamic pricing models instead of sticking to a standard price list, thereby enhancing revenue and increasing the paying customer base. Content Summarization and Reformatting: Design a smart assistant tailored for writers, capable of providing automatic suggestions for content summaries, identifying suitable SEO keywords, and adapting articles for various specific audiences. 
Search Generative Experiences (SGE): Provide more dynamic, personalized, and contextually relevant search results, thus making information retrieval not only more efficient but also more engaging and useful. This can include personalization and summarization elements, as well. In conclusion, the telecommunications industry faces challenges of differentiation and revenue diversification amidst commoditized services and disruptive market forces. To thrive, telecom operators must prioritize cost efficiency, elevate customer service, and enhance experiences. Leveraging AI, MongoDB Atlas offers solutions like service assurance, fraud detection, and content discovery, empowering companies to navigate the complexities of the digital landscape, innovate, and deliver value-added services. From predictive maintenance to personalized content recommendations, MongoDB Atlas stands as a foundational tool for telecom and media companies, driving efficiency, agility, and competitiveness in a rapidly evolving market. Learn more about AI use cases for top industries in our new white paper, “ How Leading Industries are Transforming with AI and MongoDB Atlas .”

March 22, 2024

Introducing Semantic Caching and a Dedicated MongoDB LangChain Package for Gen AI Apps

We are in an unprecedented time in history where developers can build transformative AI applications quickly, without being AI experts themselves. This ability is enabling new classes of applications that can better serve customers with conversational AI for assistance and automation, advanced reasoning and analysis using AI-powered retrieval, and recommendation systems. Behind this revolution are large language models (LLMs) that can be prompted to solve for a wide range of use cases. However, LLMs have various limitations, like knowledge cutoff and a tendency to hallucinate. To overcome these limitations, they must be integrated with proprietary enterprise data sources to build reliable, relevant, and high-quality generative AI applications. That’s where MongoDB plays a critical role in the modern generative AI stack. Developers use MongoDB Atlas Vector Search as a vital part of the generative AI technique known as retrieval-augmented generation (RAG). RAG is the process of feeding LLMs the supplementary data necessary to ground their responses, ensuring they're dependable and precise. LangChain has been a critical part of this journey since the public launch of Atlas Vector Search, enabling developers to build better retriever systems powered by vector search and store conversation history in the operational database. Today, we are excited to announce support for two enhancements: Semantic cache powered by Atlas vector search, which improves the performance of your apps A dedicated LangChain-MongoDB package for Python and JS/TS developers, enabling them to build advanced applications even more efficiently The MongoDB Atlas integration with LangChain can now power all the database requirements for building modern generative AI applications: vector search, semantic caching (currently only available in Python), and conversation history. 
Earlier, we announced the launch of MongoDB LangChain Templates, which enable developers to quickly deploy RAG applications, and provided a reference implementation of a basic RAG template using MongoDB Atlas Vector Search and OpenAI and a more advanced Parent-document Retrieval RAG template using MongoDB Atlas Vector Search. We are excited about our partnership with LangChain and will continue innovating.

Improve LLM application performance with semantic cache

Semantic cache improves the performance of LLM applications by caching responses based on the semantic meaning or context within the queries themselves. This is different from a traditional cache that works based on exact keyword matching. In the era of LLMs, the value of semantic cache is increasing tremendously, enabling sophisticated user experiences that closely mimic human interactions. For example, if two different users enter two different prompts, “give me suggestions for a comedy movie” and “recommend a comedy movie”, the semantic cache can understand that the intent behind the queries is the same and return a similar response, even though different keywords are used, whereas a traditional cache would miss because the strings do not match exactly.

Figure 1: Semantic cache using MongoDB Atlas Vector Search

Check out this video walkthrough for the semantic cache.

Accelerate development with a dedicated package

With a dedicated LangChain-MongoDB package, MongoDB is even more deeply integrated with LangChain. The Python and JavaScript packages contain the following LangChain integrations: MongoDBAtlasVectorSearch (Vector stores) and MongoDBChatMessageHistory (Chat Messages Memory). In addition, the Python package includes the MongoDBAtlasSemanticCache (LLM Caching).

The new package langchain-mongodb contains all the MongoDB-specific implementations and needs to be installed separately from langchain, which includes all the core abstractions. Earlier, everything was in the same package, making it challenging to version correctly and to communicate which version should be used and whether any breaking changes were made.

Find out more about the langchain-mongodb package:

Python: Source code, LangChain docs, MongoDB docs
JavaScript: Source code, LangChain.js docs, MongoDB docs

Get started today

Check out this accompanying tutorial and notebook on building advanced RAG with MongoDB and LangChain, which contains a walkthrough and use cases for using semantic cache, vector search, and chat message history.

Check out the “PDFtoChat” app to see langchain-mongodb JS in action. It allows you to have a conversation with your proprietary PDFs using AI and is built with MongoDB Atlas, LangChain.js, and TogetherAI. It’s an end-to-end SaaS-in-a-box app and includes user authentication, saving PDFs, and saving chats per PDF.

Read the excellent overview of semantic caching using LangChain and MongoDB.
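
To make the semantic cache idea concrete, here is a minimal, self-contained sketch in plain Python. It is not the MongoDBAtlasSemanticCache implementation: the names toy_embed, cosine, and SemanticCache, the in-memory list, and the 0.5 threshold are all assumptions chosen for illustration, and the word-count "embedding" is a crude stand-in for a real embedding model, which would capture similarity in meaning rather than just shared words. The sketch only shows the lookup pattern: embed the prompt, find the closest stored prompt, and reuse its response when the similarity clears a threshold.

```python
import math
from collections import Counter


def toy_embed(text):
    # Hypothetical stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Illustrative in-memory cache keyed by prompt similarity, not exact text."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed cutoff; tune for a real model
        self.entries = []           # list of (embedding, response) pairs

    def update(self, prompt, response):
        # Store the response alongside the embedding of the prompt.
        self.entries.append((toy_embed(prompt), response))

    def lookup(self, prompt):
        # Return the response of the most similar stored prompt, or None on a miss.
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache()
cache.update("give me suggestions for a comedy movie", "cached LLM answer")
print(cache.lookup("recommend a comedy movie"))    # cached LLM answer
print(cache.lookup("what is the weather today"))   # None
```

An exact-match cache would miss on the second prompt because the strings differ. In a production setup, the linear scan over an in-memory list would be replaced by a vector index query, which is the role Atlas Vector Search plays in the integration described above.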

March 20, 2024