

Introducing: Multi-Kubernetes Cluster Deployment Support

Resilience and scalability are critical for today's production applications. MongoDB and Kubernetes are both well known for their ability to support those needs to the highest level. To better enable developers using MongoDB and Kubernetes, we’ve introduced a series of updates and capabilities that make it easier to manage MongoDB across multiple Kubernetes clusters.

In addition to the previously released support for running MongoDB replica sets and Ops Manager across multiple Kubernetes clusters, we're excited to announce the public preview release of support for sharded clusters spanning multiple Kubernetes clusters (GA to follow in November 2024).

Support for deployment across multiple Kubernetes clusters is facilitated through the Enterprise Kubernetes Operator. As a recap, the Enterprise Operator automates the deployment, scaling, and management of MongoDB clusters in Kubernetes. It simplifies database operations by handling tasks such as backups, upgrades, and failover, ensuring consistent performance and reliability in the Kubernetes environment.

Multi-Kubernetes cluster deployment support enhances availability, resilience, and scalability for critical MongoDB workloads, empowering developers to efficiently manage these workloads within Kubernetes. This approach unlocks the highest level of availability and resilience by allowing shards to be located closer to users and applications, increasing geographical flexibility and reducing latency for globally distributed applications.

Deploying replica sets across multiple Kubernetes clusters

MongoDB replica sets are engineered to ensure high availability, data redundancy, and automated failover in database deployments. A replica set consists of multiple MongoDB instances (one primary and several secondary nodes), all maintaining the same dataset. The primary node handles all write operations, while the secondary nodes replicate the data and are available to take over as primary if the original primary node fails. This architecture is critical for maintaining continuous data availability, especially in production environments where downtime can be costly.

Support for deploying MongoDB replica sets across multiple Kubernetes clusters helps ensure this level of availability for MongoDB-based applications running in Kubernetes. Deploying MongoDB replica sets across multiple Kubernetes clusters enables you to distribute your data not only across nodes in the Kubernetes cluster, but also across different clusters and geographic locations. This ensures that the rest of your deployments remain operational (even if one or more Kubernetes clusters or locations fail) and facilitates faster disaster recovery.

To learn more about how to deploy replica sets across multiple Kubernetes clusters using the Enterprise Kubernetes Operator, visit our documentation.

Sharding MongoDB across multiple Kubernetes clusters

While replica sets duplicate data for resilience (and higher read rates), MongoDB sharded clusters divide the data up between shards, each of which is effectively a replica set, providing resilience for each portion of the data. This helps your database handle large datasets and high-throughput operations, since each shard has a primary member handling write operations to that portion of the data. As a result, MongoDB can scale write throughput horizontally, rather than requiring vertical scaling of every member of a replica set.

In a Kubernetes environment, these shards can now be deployed across multiple Kubernetes clusters, giving higher resilience in the event of a loss of a Kubernetes cluster or an entire geographic location. This also offers the ability to locate shards in the same region as the applications or users accessing that portion of the data, reducing latency and improving user experience. Sharding is particularly useful for applications with large datasets and those requiring high availability and resilience as they grow.

Support for sharding MongoDB across multiple Kubernetes clusters is currently in public preview and will be generally available in November.

Deploying Ops Manager across multiple Kubernetes clusters

Ops Manager is the self-hosted management platform that supports automation, monitoring, and backup of MongoDB on your own infrastructure. Ops Manager's most critical function is backup, and deploying it across multiple Kubernetes clusters greatly improves resilience and disaster recovery for your MongoDB deployments in Kubernetes.

With Ops Manager distributed across several Kubernetes clusters, you can ensure that backups of deployments remain robust and available, even if one Kubernetes cluster or site fails. Furthermore, it allows Ops Manager to efficiently manage and monitor MongoDB deployments that are themselves distributed across multiple clusters, improving resilience and simplifying scaling and disaster recovery.

To learn more about how to deploy Ops Manager across multiple Kubernetes clusters using the Enterprise Kubernetes Operator, visit our documentation.

To leverage multi-Kubernetes-cluster support, you can get started with the Enterprise Kubernetes Operator.
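
The availability argument for spreading a replica set across clusters can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the standard replica set rule that a primary can be elected only while a majority of voting members is reachable, and the cluster names and member counts are hypothetical, not taken from the announcement or from the Operator's configuration.

```python
def survives_cluster_loss(placement: dict[str, int]) -> dict[str, bool]:
    """For each Kubernetes cluster, report whether a voting majority
    of replica set members remains if that one cluster is lost."""
    total = sum(placement.values())
    majority = total // 2 + 1  # smallest count that is more than half
    return {
        cluster: (total - members) >= majority
        for cluster, members in placement.items()
    }


# Hypothetical layouts for a five-member replica set.
two_clusters = {"cluster-a": 3, "cluster-b": 2}
three_clusters = {"cluster-a": 2, "cluster-b": 2, "cluster-c": 1}

print(survives_cluster_loss(two_clusters))
print(survives_cluster_loss(three_clusters))
```

Under these assumptions, the two-cluster layout cannot elect a primary if the cluster holding three members fails, while the three-cluster layout keeps a majority after the loss of any single cluster. This is one reason an odd number of locations is often preferred for this kind of deployment.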

October 10, 2024

Building Gen AI with MongoDB & AI Partners | September 2024

Last week I was in London for MongoDB.local London, the 19th stop of the 2024 MongoDB.local tour, where MongoDB, our customers, and our AI partners came together to share solutions we’ve been building that enable companies to accelerate their AI journey. I love attending these events because they offer an opportunity to celebrate our collective achievements, and because it’s great to meet so many (mainly Zoom) friends in person!

One of the highlights of MongoDB.local London 2024 was the release of our reference architecture with our MAAP partners AWS and Anthropic, which supports memory-enhanced AI agents. This architecture is already helping businesses streamline complex processes and develop smarter, more responsive applications.

We also announced a robust set of vector quantization capabilities in MongoDB Atlas Vector Search that will help developers build powerful semantic search and generative AI applications with more scale and at a lower cost. Now, with support for the ingestion of scalar quantized vectors, you can import and work with quantized vectors from your embedding model providers of choice, including MAAP partners Cohere, Nomic, and others.

A big thank you to all of MongoDB’s AI partners, who continually amaze me with their innovation. MongoDB.local London was another great reminder of the power of collaboration, and I’m excited for what lies ahead as we continue to shape the future of AI together. As the Brits say: Cheers!

Welcoming new AI and tech partners

In September we also welcomed seven new AI and tech partners that offer product integrations with MongoDB. Read on to learn more about each great new partner!

Arize

Arize AI is a platform that helps organizations visualize and debug the flow of data through AI applications by quickly identifying bottlenecks in LLM calls and understanding agentic paths.

“At Arize AI, we are committed to helping AI teams build, evaluate, and troubleshoot cutting-edge agentic systems. Partnering with MongoDB allows us to provide a comprehensive solution for managing the memory and retrieval that these systems rely on,” said Jason Lopatecki, co-founder and CEO of Arize AI. “With MongoDB’s robust vector search and flexible document storage, combined with Arize’s advanced observability and evaluation tools, we’re empowering developers to confidently build and deploy AI applications.”

Baseten

Baseten provides the applied AI research and infrastructure needed to serve custom and open-source machine learning models performantly, scalably, and cost-efficiently.

“We're excited to partner with MongoDB to combine their scalable vector database with Baseten's high-performance inference infrastructure and high-performance models. Together, we're enabling companies to build and deploy generative AI applications, such as RAG apps, that not only scale infinitely but also deliver optimal performance per dollar,” said Tuhin Srivastava, CEO of Baseten. “This partnership empowers developers to bring mission-critical AI solutions to market faster, while maintaining cost-effectiveness at every stage of growth.”

Doppler

Doppler is a cloud-based platform that helps teams manage, organize, and secure secrets across environments and applications throughout the entire development lifecycle.

“Doppler rigorously focuses on making the easy path, the most secure path for developers. This is only possible with deep product partnerships with all the tooling developers have come to love. We are excited to join forces with MongoDB to make zero-downtime secrets rotation for non-relational databases effortlessly simple to set up and maintenance-free,” said Brian Vallelunga, founder and CEO of Doppler. “This will immediately bolster the security posture of a company’s most sensitive data without any additional overhead or distractions.”

Haize Labs

Haize Labs automates language model stress testing at massive scales to discover and eliminate failure modes. This, alongside their inference-time mitigations and observability tools, enables the risk-free adoption of AI.

“We're thrilled to partner with MongoDB in empowering companies to build RAG applications that are both powerful yet secure, safe, and reliable,” said Leonard Tang, co-founder and CEO of Haize Labs. “MongoDB Atlas has streamlined the process of developing production-ready GenAI systems, and we're excited to work together to accelerate customers' journey to trust and confidence in their GenAI initiatives.”

Modal

Modal is a serverless platform for data and AI/ML engineers to run and deploy code in the cloud without having to think about infrastructure. Run generative AI models, large-scale batch jobs, job queues, and more, all faster than ever before.

“The coming wave of intelligent applications will be built on the potent combination of foundation models, large-scale data, and fast search,” explained Charles Frye, AI Engineer at Modal. “MongoDB Atlas provides an excellent platform for storing, querying, and searching data, from hot new techniques like vector indices to old standbys like lexical search. It's the perfect counterpart to Modal's flexible compute, like serverless GPUs. Together, MongoDB and Modal make it easy to get started with this new paradigm, and then they make it easy to scale it out to millions of users querying billions of records & maxing out thousands of GPUs.”

Portkey AI

Portkey AI is an AI gateway and observability suite that helps companies develop, deploy, and manage LLM-based applications.

“Our partnership with MongoDB is a game-changer for organizations looking to operationalize AI at scale. By combining Portkey's LLMOps expertise with MongoDB's comprehensive data solution, we're enabling businesses to deploy, manage, and scale AI applications with unprecedented efficiency and control,” said Ayush Garg, Chief Technology Officer of Portkey AI. “Together, we're not just streamlining the path from POC to production; we're setting a new standard for how businesses can leverage AI to drive innovation and deliver tangible value.”

Reka

Reka offers fully multimodal models including images, videos with audio, text, and documents to empower AI agents that can see, hear, and speak.

“At Reka, we know how challenging it can be to retrieve information buried in unstructured multimodal data. We are excited to join forces with MongoDB to help companies test and optimize multimodal RAG features for faster production deployment,” said Dani Yogatama, CEO of Reka. “Our models understand and reason over multimodal data including text, tables, and images in PDF documents or conversations in videos. Our joint solution streamlines the whole RAG development lifecycle, speeding up time to market and helping companies deliver real values to their customers faster.”

But wait, there's more!

To learn more about building AI-powered apps with MongoDB, check out our AI Resources Hub, and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem.

October 9, 2024

Introducing Dark Mode for MongoDB Documentation

We’re excited to announce a highly requested feature: Dark mode is now available for MongoDB Documentation!

Every day, developers from all backgrounds, from beginners to experts, turn to the MongoDB Documentation. It’s packed with comprehensive resources that help you build modern applications using MongoDB and the Atlas developer data platform. With detailed information and step-by-step guides, it’s an invaluable tool for improving your skills and making your development work smoother. From troubleshooting tricky queries to exploring new features, MongoDB Documentation is there to support your projects and help you succeed.

With dark mode, you can now switch to a darker interface that’s easier on the eyes. Whether you’re working late or prefer a subdued color palette, dark mode enhances your MongoDB Documentation experience.

How to enable dark mode

Enabling dark mode is simple. Just click on the sun and moon icon at the top right of the page to switch between dark mode, light mode, and system settings. It will initially default to your system settings. This is a personal setting and won't affect other users within the project or organization.

We’ve designed dark mode to provide the same user-friendly experience you’re used to and stay consistent across different tools in the developer workflow, including MongoDB Atlas, which is also available in dark mode.

We're all about making your reading experience top-notch! Dark mode is here because you asked for it through our feedback widget on the Docs page. Whether you’re an early adopter of dark mode or just trying it out, we’d love your opinion. Just drop your feedback in the widget next to the color theme selector on the MongoDB Documentation page.

Less strain, more gain

Dark mode offers a sleek, modern look that brings a refreshing change from the traditional light mode. Beyond its stylish appearance, dark mode also provides significant practical benefits. Reducing the amount of bright light emitted from your screen helps minimize eye strain and fatigue, making extended periods of device use more comfortable. For those using OLED screens, dark mode can help conserve battery life, as these screens consume less power by displaying darker pixels.

Whether you’re coding into the late hours or just looking for a more comfortable viewing experience, dark mode is a simple yet powerful tool to enhance your development experience.

Try out dark mode on MongoDB Documentation today and enjoy a more comfortable, stylish, and efficient reading experience!

October 9, 2024

Introducing Two MongoDB Generative AI Learning Badges

Want to boost your resume quickly? MongoDB is introducing two new Learning Badges: Building Gen AI Apps, and Deploying and Evaluating Gen AI Apps. Unlike high-stakes certifications, which cover a large breadth and depth of subjects, these digital credentials are focused on specific topics, making them easier and quicker to earn. Best of all, they’re free!

The Building Gen AI Applications with MongoDB Learning Badge validates users’ knowledge of developing gen AI applications using MongoDB Atlas Vector Search. It recognizes your understanding of semantic search and how to build chatbots with retrieval-augmented generation (RAG), MongoDB, and LangChain.

The Deploying and Evaluating Gen AI Applications with MongoDB Learning Badge validates users’ knowledge of optimizing the performance and evaluating the results of gen AI applications. It recognizes your understanding of chunking strategies, performance evaluation techniques, and deployment options within MongoDB for both prototyping and production stages.

Learn, prepare, and earn

To earn your badge, simply complete the Learning Badge Path and take a short assessment at the end. Once you pass the assessment, you'll receive an email with your official Credly badge and digital certificate. You can share it on social media, in email signatures, or on digital resumes. Additionally, you'll gain inclusion in the Credly Talent Directory, where you will be visible to recruiters from top employers, which can open up new career opportunities.

Learning paths are curated roadmaps that guide you through the essential concepts and skills needed for the assessment. Each badge has its own learning path:

- Building Gen AI Apps Learning Badge Path: This learning path guides you through the foundations of building a gen AI application with MongoDB Atlas Vector Search. You'll learn what semantic search is and how you can leverage it across a variety of use cases. Then you'll learn how to build your own chatbot by creating a RAG application with MongoDB and LangChain.
- Deploying and Evaluating Gen AI Apps Learning Badge Path: This learning path will help you take a gen AI application from creation to full deployment, with a focus on optimizing performance and evaluating results. You'll explore chunking strategies, performance evaluation techniques, and deployment options in MongoDB for both prototyping and production stages. We recommend completing the Building Gen AI Apps Learning Badge Path before beginning this path.

Badge up with MongoDB

MongoDB Learning Badges offer a valuable opportunity to showcase your commitment to continuous learning and expertise in specific topics. These digital credentials not only recognize your educational achievements but also serve as a testament to your knowledge and skills. Whether you're a seasoned developer, an aspiring data scientist, or an enthusiastic student, earning a MongoDB badge can significantly enhance your profile and open up new opportunities in your field.

Start earning your badges today. It’s quick, effective, and free! Visit MongoDB Learning Badges to begin your journey toward becoming a gen AI application expert and boosting your career prospects.

October 8, 2024

THL Simplifies Architecture with MongoDB Atlas Search

Tourism Holdings Limited (THL) originally became a MongoDB customer in 2019, using MongoDB Atlas to help manage a wide variety of telematics data. I was very excited to welcome Charbel Abdo, Solutions Architect for THL, at MongoDB.local Sydney in July 2024 to hear more about how the company has significantly expanded its use of MongoDB.

The largest RV rental company in the world, THL has branches in New Zealand (where it is headquartered), Australia, the US, Canada, the UK, and Europe. Specializing in building, renting, and selling camper vans, THL has a number of well-known brands under its umbrella.

In recent years, THL has made a number of significant digital transformation and technology stack optimization efforts, moving from a ‘bolt-on’ approach that necessitated the use of a distributed search and analytics engine to an integrated search solution with MongoDB Atlas.

THL operates a complex ecosystem managed by their in-house platform, Motek, which handles booking, pricing, fleet management, and more, with MongoDB Atlas as the central database. Its more than 7,000 RVs are fitted with telematics devices that send information to vehicles’ onboard computers, such as location, high-speed events, engine problems, and geofences or restricted areas (for example, during the Australian bushfires of 2020).

THL initially used a bolt-on approach for complex search functionalities by extending their deployment footprint to include a stand-alone instance of Elasticsearch. This setup, while functional, introduced significant data synchronization and performance issues, as well as increased maintenance overhead. Elasticsearch struggled under heavy loads, which led to critical failures and system instability, resulting in THL experiencing frequent outages and data inconsistencies.

After two years of coping with these challenges, THL resolved to migrate away from Elasticsearch. After doing due diligence, they identified the MongoDB developer data platform’s integrated Search capabilities as the optimum solution.

"A couple of months later, we had migrated everything," said Abdo. "Kudos to the MongoDB account team. They were exceptional."

The migration process turned out to be relatively straightforward. By iteratively replacing Elasticsearch with MongoDB Atlas Search, THL was able to simplify its architecture, reduce costs, and eliminate the synchronization issues that had plagued the system. The simplification also led to significant performance and reliability improvements. Because it no longer needed the dedicated sync resources processing millions upon millions of records per day, THL was able to turn off its Elasticsearch cluster and consolidate its resources.

“All data sync related issues were gone, eliminated. But also we got our Friday afternoons back, which is always a good thing!” added Abdo.

Abdo’s team can now also use existing monitoring tools rather than having to set up something completely separate for the standalone search engine they were using.

“Sometimes, changes are easier than you think,” said Abdo. “We spent two-and-a-half years with our faulty solutions just looking for ways to patch up all the problems that we were having. We tried everything except actually looking into how much it would actually take to migrate. We wasted so much time, so much effort, so much money. While if we had thought about this a couple of years ago, it would have been a breeze.”

“Over-engineering is bad, simple is better,” he noted.

To learn more about how MongoDB Atlas Search can help you build or deepen your search capabilities, visit our MongoDB Atlas Search page.
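
To make the "integrated search" idea concrete, here is a minimal sketch of what a full-text query looks like when search runs inside the database as an aggregation stage instead of in a separate, synchronized engine. It only builds the pipeline as plain Python data. The `$search` stage, its `text` operator, and the `searchScore` metadata are Atlas Search features, but the index name, field name, and query string are hypothetical and do not describe THL's actual schema or queries.

```python
def build_search_pipeline(query: str, path: str,
                          index: str = "default", limit: int = 10) -> list[dict]:
    """Build an aggregation pipeline whose first stage is an Atlas Search
    $search stage, followed by a result limit and a relevance-score projection."""
    return [
        # Full-text match on a single field, served by the named search index.
        {"$search": {"index": index, "text": {"query": query, "path": path}}},
        {"$limit": limit},
        # Return the searched field plus the relevance score computed by $search.
        {"$project": {"_id": 0, path: 1, "score": {"$meta": "searchScore"}}},
    ]


# Hypothetical field and query, for illustration only.
pipeline = build_search_pipeline("four berth campervan", "description", limit=5)
print(pipeline)
```

With a driver such as PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on the same collection that holds the operational data, which is why no separate sync process is needed.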

October 7, 2024

Vector Quantization: Scale Search & Generative AI Applications

We are excited to announce a robust set of vector quantization capabilities in MongoDB Atlas Vector Search. These capabilities will reduce vector sizes while preserving performance, enabling developers to build powerful semantic search and generative AI applications with more scale and at a lower cost. In addition, unlike relational or niche vector databases, MongoDB’s flexible document model, coupled with quantized vectors, allows for greater agility in testing and deploying different embedding models quickly and easily.

Support for scalar quantized vector ingestion is now generally available, and will be followed by several new releases in the coming weeks. Read on to learn how vector quantization works and visit our documentation to get started!

The challenges of large-scale vector applications

While the use of vectors has opened up a range of new possibilities, such as content summarization and sentiment analysis, natural language chatbots, and image generation, unlocking insights within unstructured data can require storing and searching through billions of vectors, which can quickly become infeasible.

Vectors are effectively arrays of floating-point numbers representing unstructured information in a way that computers can understand (ranging from a few hundred to billions of arrays), and as the number of vectors increases, so does the index size required to search over them. As a result, large-scale vector-based applications using full-fidelity vectors often have high processing costs and slow query times, hindering their scalability and performance.

Vector quantization for cost-effectiveness, scalability, and performance

Vector quantization, a technique that compresses vectors while preserving their semantic similarity, offers a solution to this challenge. Imagine converting a full-color image into grayscale to reduce storage space on a computer. This involves simplifying each pixel's color information by grouping similar colors into primary color channels or "quantization bins," and then representing each pixel with a single value from its bin. The binned values are then used to create a new grayscale image that is smaller but retains most of the original detail, as shown in Figure 1.

Figure 1: Illustration of quantizing an RGB image into grayscale

Vector quantization works similarly, by shrinking full-fidelity vectors into fewer bits to significantly reduce memory and storage costs without compromising the important details. Maintaining this balance is critical, as search and AI applications need to deliver relevant insights to be useful.

Two effective quantization methods are scalar (converting each floating-point value into an integer) and binary (converting each floating-point value into a single bit of 0 or 1). Current and upcoming quantization capabilities will empower developers to maximize the potential of Atlas Vector Search.

The most impactful benefit of vector quantization is increased scalability and cost savings through reduced computing resources and efficient processing of vectors. And when combined with Search Nodes (MongoDB’s dedicated infrastructure for independent scalability through workload isolation and memory-optimized infrastructure for semantic search and generative AI workloads), vector quantization can further reduce costs and improve performance, even at the highest volume and scale, to unlock more use cases.

"Cohere is excited to be one of the first partners to support quantized vector ingestion in MongoDB Atlas,” said Nils Reimers, VP of AI Search at Cohere. “Embedding models, such as Cohere Embed v3, help enterprises see more accurate search results based on their own data sources. We’re looking forward to providing our joint customers with accurate, cost-effective applications for their needs.”

In our tests, compared to full-fidelity vectors, BSON-type vectors (BSON is MongoDB’s JSON-like binary serialization format for efficient document storage) reduced storage size by 66% (from 41 GB to 14 GB). And as shown in Figures 2 and 3, the tests illustrate significant memory reduction (73% to 96% less) and latency improvements using quantized vectors. Scalar quantization preserves recall performance, and binary quantization’s recall performance is maintained with rescoring, a process of evaluating a small subset of the quantized outputs against full-fidelity vectors to improve the accuracy of the search results.

Figure 2: Significant storage reduction + good recall and latency performance with quantization on different embedding models

Figure 3: Remarkable improvement in recall performance for binary quantization when combining with rescoring

In addition, thanks to the reduced cost advantage, vector quantization facilitates more advanced, multiple-vector use cases that would have been too computationally taxing or cost-prohibitive to implement. For example, vector quantization can help users:

- Easily A/B test different embedding models using multiple vectors produced from the same source field during prototyping. MongoDB’s document model, coupled with quantized vectors, allows for greater agility at lower costs. The flexible document schema lets developers quickly deploy and compare embedding models’ results without the need to rebuild the index or provision an entirely new data model or set of infrastructure.
- Further improve the relevance of search results or context for large language models (LLMs) by incorporating vectors from multiple sources of relevance, such as different source fields (product descriptions, product images, etc.) embedded within the same or different models.

How to get started, and what’s next

Now, with support for the ingestion of scalar quantized vectors, developers can import and work with quantized vectors from their embedding model providers of choice (such as Cohere, Nomic, Jina, Mixedbread, and others) directly in Atlas Vector Search. Read the documentation and tutorial to get started.

And in the coming weeks, additional vector quantization features will equip developers with a comprehensive toolset for building and optimizing applications with quantized vectors:

- Support for ingestion of binary quantized vectors will enable further reduction of storage space, allowing for greater cost savings and giving developers the flexibility to choose the type of quantized vectors that best fits their requirements.
- Automatic quantization and rescoring will provide native capabilities for scalar quantization as well as binary quantization with rescoring in Atlas Vector Search, making it easier for developers to take full advantage of vector quantization within the platform.

With support for quantized vectors in MongoDB Atlas Vector Search, you can build scalable and high-performing semantic search and generative AI applications with flexibility and cost-effectiveness. Check out the documentation and tutorial to get started.

Head over to our quick-start guide to get started with Atlas Vector Search today.
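
The two quantization methods and the rescoring step described in this post can be illustrated with a small, dependency-free sketch. This is not how Atlas Vector Search implements quantization internally. It assumes full-fidelity vectors are 32-bit floats with values in the range [-1, 1], uses Hamming distance to compare binary codes, and uses cosine similarity for rescoring. All function names and sample vectors are hypothetical.

```python
import math


def scalar_quantize(vec, lo=-1.0, hi=1.0):
    """Scalar quantization: map each float in [lo, hi] to an int8 in [-128, 127].
    The value range is an assumption; real systems derive it from the data."""
    scale = 255.0 / (hi - lo)
    return [max(-128, min(127, round((x - lo) * scale) - 128)) for x in vec]


def binary_quantize(vec):
    """Binary quantization: keep one bit per dimension (1 if positive, else 0)."""
    return [1 if x > 0 else 0 for x in vec]


def hamming(a, b):
    """Number of positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity between two full-fidelity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def search_with_rescoring(query, docs, candidates=2, limit=1):
    """Shortlist by Hamming distance on binary codes, then rescore the
    shortlist against the full-fidelity vectors with cosine similarity."""
    q_bits = binary_quantize(query)
    shortlist = sorted(
        docs, key=lambda name: hamming(q_bits, binary_quantize(docs[name]))
    )[:candidates]
    return sorted(shortlist, key=lambda name: cosine(query, docs[name]), reverse=True)[:limit]


def size_reduction(bits_per_dim, baseline_bits=32):
    """Fractional size reduction per dimension versus a float32 baseline (assumed)."""
    return 1 - bits_per_dim / baseline_bits


# Hypothetical four-dimensional embeddings.
docs = {
    "a": [0.9, 0.1, -0.2, 0.4],
    "b": [0.1, 0.9, -0.8, 0.05],
    "c": [-0.5, -0.4, 0.3, -0.9],
}
query = [0.2, 0.8, -0.7, 0.1]

print(scalar_quantize(query))
print(binary_quantize(query))
print(search_with_rescoring(query, docs))
print(size_reduction(8), size_reduction(1))
```

In this toy data, documents "a" and "b" have the same binary code as the query, so the binary stage alone cannot separate them; rescoring against the original floats is what ranks "b" first. Under the float32 assumption, the per-dimension arithmetic gives a 75% reduction for int8 and roughly 97% for one bit, which is in the same neighborhood as the measured memory reductions quoted above, although real indexes carry additional overhead.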

October 7, 2024

MongoDB.local London 2024: Better Applications, Faster

Since we kicked off MongoDB’s series of 2024 events in April, we’ve connected with thousands of customers, partners, and community members in cities around the world, from Mexico City to Mumbai. Yesterday marked the nineteenth stop of the 2024 MongoDB.local tour, and we had a blast welcoming folks across industries to MongoDB.local London, where we discussed the latest technology trends, celebrated customer innovations, and unveiled product updates that make it easier than ever for developers to build next-gen applications.

Over the past year, MongoDB’s more than 50,000 customers have been telling us that their needs are changing. They’re increasingly focused on three areas:

- Helping developers build faster and more efficiently
- Empowering teams to create AI-powered applications
- Moving from legacy systems to modern platforms

Across these areas, there’s a common need for a solid foundation: each requires a resilient, scalable, secure, and highly performant database. The updates we shared at MongoDB.local London reflect these priorities. MongoDB is committed to ensuring that our products are built to exceed our customers’ most stringent requirements, and that they provide the strongest possible foundation for building a wide range of applications, now and in the future.

Indeed, during yesterday’s event, Sahir Azam, MongoDB’s Chief Product Officer, discussed the foundational role data plays in his keynote address. He also shared the latest advancement from our partner ecosystem, an AI solution powered by MongoDB, Amazon Web Services, and Anthropic that makes it easier for customers to deploy gen AI customer care applications.

MongoDB 8.0: The best version of MongoDB ever

The biggest news at .local London was the general availability of MongoDB 8.0, which provides significant performance improvements and reduced scaling costs, and adds scalability, resilience, and data security capabilities to the world’s most popular document database.

Architectural optimizations in MongoDB 8.0 have significantly reduced memory usage and query times, and MongoDB 8.0 has more efficient batch processing capabilities than previous versions. Specifically, MongoDB 8.0 features 36% better read throughput, 56% faster bulk writes, and 20% faster concurrent writes during data replication. In addition, MongoDB 8.0 can handle higher volumes of time series data and can perform complex aggregations more than 200% faster, with lower resource usage and costs. Last (but hardly least!), Queryable Encryption now supports range queries, ensuring data security while enabling powerful analytics.

For more on MongoDB.local London’s product announcements, which are designed to accelerate application development, simplify AI innovation, and speed developer upskilling, please read on!

Accelerating application development

Improved scaling and elasticity on MongoDB Atlas capabilities

New enhancements to MongoDB Atlas’s control plane allow customers to scale clusters faster, respond to resource demands in real time, and optimize performance, all while reducing operational costs.

First, our new granular resource provisioning and scaling features, including independent shard scaling and extended storage and IOPS on Azure, allow customers to optimize resources precisely where needed. Second, Atlas customers will experience faster cluster scaling, with up to 50% quicker scaling times achieved by scaling clusters in parallel by node type. Finally, MongoDB Atlas users will enjoy more responsive auto-scaling, with a 5X improvement in responsiveness thanks to enhancements in our scaling algorithms and infrastructure. These enhancements are being rolled out to all Atlas customers, who should start seeing benefits immediately.

IntelliJ plugin for MongoDB

Announced in private preview, the MongoDB for IntelliJ Plugin is designed to functionally enhance the way developers work with MongoDB in IntelliJ IDEA, one of the most popular IDEs among Java developers. The plugin allows enterprise Java developers to write and test Java queries faster, receive proactive performance insights, and reduce runtime errors right in their IDE. By enhancing the database-to-IDE integration, JetBrains and MongoDB have partnered to deliver a seamless experience for their shared user base and unlock their potential to build modern applications faster. Sign up for the private preview here.

MongoDB Copilot Participant for VS Code (Public Preview)

Now in public preview, the new MongoDB Participant for GitHub Copilot integrates domain-specific AI capabilities directly with a chat-like experience in the MongoDB Extension for VS Code.

October 3, 2024

Top 4 Reasons to Use MongoDB 8.0

We’re excited to announce that MongoDB 8.0, the newest version of the world’s most popular document database, used by millions of developers and more than 50,000 customers around the world, is now generally available. MongoDB 8.0 builds upon MongoDB’s industry-leading capabilities to provide significant performance improvements, reduced costs, and greater ease of use, from local deployments to globally distributed applications at enterprise scale.

Developers have long loved building with MongoDB, so we’ve ensured that 8.0 kept the bar extremely high for developer usability. MongoDB 8.0 was also built to exceed our customers’ most stringent security, resiliency, availability, and performance requirements, and is the most impressive version of MongoDB yet. MongoDB 8.0 gives customers the strongest possible foundation for building a wide range of applications, now and in the future.

Jim Scharf, Chief Technology Officer, MongoDB

For MongoDB 8.0, we focused our engineering efforts around four core goals: optimize performance for the widest variety of applications; deliver innovative encryption to unlock new use cases; reduce costs and increase scale with rapid and intuitive horizontal scaling for high availability; and ensure resilience for unexpected application demand.

So how do these goals actually benefit teams as they build and manage applications? We’ll start by looking at why you should use MongoDB 8.0. Whether you’re a seasoned MongoDB veteran or are new to the database, MongoDB 8.0 is a great foundation for new applications and supercharging existing ones alike. Version 8.0 combines the things developers love most about MongoDB (an intuitive and cohesive developer experience, support for a broad set of use cases, and operational ease of use) with unparalleled performance improvements.

Top reasons to switch to MongoDB 8.0

1. MongoDB 8.0 is over 30% faster than before

As the data applications generate and use grows, minor inefficiencies can lead to disproportionate increases in infrastructure costs. Because many customers primarily interact with businesses through their applications, poor or inconsistent application performance can lead to customer unhappiness, lost opportunities, and declines in revenue. So it’s imperative for organizations to ensure that their applications perform consistently well.

MongoDB 8.0 significantly improves performance by allowing applications to rapidly and efficiently query and transform data, with up to 36% better throughput. Architectural optimizations in MongoDB 8.0 have reduced memory usage and query times, and a combination of more efficient batch processing and optimizations has enabled 59% higher throughput for updates and 20% faster concurrent writes during data replication. Additionally, optimizations in MongoDB 8.0 mean the database can handle higher volumes of time series data and perform operations over 200% faster, with lower resource usage and costs.

2. MongoDB 8.0 is more secure than ever

Data protection and security are essential. With the increasing complexity and volume of data being transmitted, stored, and processed across environments, safeguarding sensitive information with robust encryption is more critical than ever. Organizations must protect their data throughout its lifecycle: in transit over networks, at rest where it is stored, and while it’s in use for querying and processing. However, it can be challenging to encrypt data while it is queried and processed, leaving data vulnerable to exposure or exfiltration by malicious actors.

MongoDB Queryable Encryption is an industry-first innovation developed by the MongoDB Cryptography Research Group.
It allows customers to encrypt sensitive data on the client side, store it securely as fully randomized encrypted data in the MongoDB database, and run expressive queries on the encrypted data for processing. MongoDB 8.0 now includes support for range queries, in addition to equality queries, to expand secure data retrieval with greater flexibility for common searches. With Queryable Encryption, the required data remains encrypted until it reaches an authorized end user using a customer-controlled decryption key, with no cryptography expertise required.

3. MongoDB 8.0 makes it cheaper and easier to scale

As organizations grow, their applications’ requirements tend to evolve. For example, scaling to support millions of users can be challenging for organizations that originally designed their applications for thousands of users. This is because implementing architectural changes in production applications can involve significant effort that can be costly and time-consuming.

With MongoDB 8.0, horizontal scaling is now faster and easier, and at a lower cost. With horizontal scaling, applications can scale beyond the limits of traditional database resources by splitting data across multiple servers known as shards, without having to pre-provision increasing amounts of compute resources for a single server. New sharding capabilities in MongoDB 8.0 distribute data across shards up to 50 times faster and at up to 50% lower cost to get started.

4. MongoDB 8.0 gives you more control to help your applications run smoothly

End users expect consistent application experiences, even during periods of high demand and usage spikes. Organizations without a highly durable operational database risk poor customer experiences, with lagging application behavior (or even downtime) during times of high demand. MongoDB 8.0 provides greater control for teams optimizing database performance for unpredictable spikes in usage and sustained periods of high demand.
MongoDB 8.0 includes new capabilities to set a default maximum time limit for running queries, to reject recurring types of problematic queries, and to set query settings to persist through events like database restarts. These capabilities help deliver consistent application behavior and high performance, irrespective of demand spikes or unexpected events.

Ready to try MongoDB 8.0?

If you are building a new application, the easiest way to get started with MongoDB 8.0 is by going to mongodb.com/try, where you can sign up for a free Atlas account, download the Community edition, and learn more about self-managing MongoDB with an Enterprise Advanced subscription.

If you are running a previous version of MongoDB, there are helpful upgrade tutorials for MongoDB Atlas and self-managed deployments. Additionally, documentation and expert help from the MongoDB professional services team are on hand.

If you have an existing application that is not currently using MongoDB as the database, check out the MongoDB Relational Migrator tool. Relational Migrator can help you map existing relational schemas to a MongoDB schema, perform data migrations, and convert existing relational queries, triggers, and stored procedures to work with MongoDB.

The MongoDB engineering and product teams listened attentively to developer feedback, and MongoDB 8.0 was built with developer usability (as well as security, durability, availability, and performance) top of mind. We’re excited for you to give it a try, and are sure you’ll enjoy the performance gains and other benefits of MongoDB 8.0!
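To make the query controls from reason 4 a little more concrete, here is a minimal sketch of the two admin command documents involved: a cluster-wide default time limit for reads, and a rule that rejects one query shape. It assumes the `defaultMaxTimeMS` cluster parameter and the `setQuerySettings` command as documented for MongoDB 8.0; the database name, collection name, filter, and the 5-second limit are hypothetical. The commands are only built here, not sent. With PyMongo they would typically be passed to `client.admin.command(...)` by a user holding the required privileges.

```python
from typing import Any


def default_read_time_limit(ms: int) -> dict[str, Any]:
    """Admin command that sets a default time limit (in ms) for read operations."""
    if ms < 0:
        raise ValueError("time limit must be non-negative")
    return {"setClusterParameter": {"defaultMaxTimeMS": {"readOperations": ms}}}


def reject_query_shape(db: str, collection: str, query_filter: dict[str, Any]) -> dict[str, Any]:
    """Admin command that rejects future find queries matching this query shape."""
    return {
        # The command name must be the first key; Python dicts keep insertion order.
        "setQuerySettings": {"find": collection, "filter": query_filter, "$db": db},
        "settings": {"reject": True},
    }


# Hypothetical values, chosen only for illustration.
limit_cmd = default_read_time_limit(5000)
reject_cmd = reject_query_shape("shop", "orders", {"status": "pending"})
```

Query settings match on the shape of a query rather than its literal values, so the filter above stands in for every `find` on `orders` that filters on `status` in the same way.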

October 2, 2024

MongoDB 8.0: Raising the Bar

I recently received an automated reminder that I was approaching a work anniversary, which took me somewhat by surprise. It’s hard to believe that it’s already been a year (to the day) since I joined MongoDB! So I thought I’d take a moment to reflect on my MongoDB journey so far, share some exciting product updates, and signal where we’re headed next.

Our customers

I joined MongoDB because it built a product developers love. The innovation of MongoDB’s document model empowered developers to simply build. No longer encumbered by having to formalize and normalize their data schema before their application was even designed, developers could interact with data in an intuitive JSON format and easily evolve data structures as their application evolved.

One of my first steps upon joining the company was to learn more about our customers. I was excited to learn that in addition to delighting developers, MongoDB had launched capabilities that enabled it to win mission-critical workloads from enterprise-class customers, including 70% of the Fortune 100 and highly regulated global financial institutions, health care providers, and government agencies. I found it remarkable that customers could replicate data across AWS, Google Cloud, and Microsoft Azure in MongoDB Atlas (our fully managed cloud database service) with just a few mouse clicks, and that some customers replicate data between the cloud and on premises using MongoDB Enterprise Advanced. This optionality struck me as powerful in the era of rapid advancements in AI, as it enables customers to easily bring their data to the best cloud provider for AI.

Soon after I joined MongoDB, the team was firming up the development roadmap for the next version of MongoDB, and they asked for my input on the plan. The team was debating whether to focus on features developers would love, or governance capabilities required by large enterprises.
I knew that ideally we would please all of our customers, so we had to try to make this an “and” and not an “or.” While I was new to MongoDB, from my 17+ years at AWS I learned that all customers demand security, durability, availability, and performance (in that order) from any modern technology offering. If a product or service doesn’t have those four elements, customers won’t buy whatever you’re selling. So as a team, we agreed that our next release of MongoDB, MongoDB 8.0, had to raise the bar for all of our customers, delivering great security, durability, availability, and performance.

The plan

We had less than a year before our target launch, so we knew we had to get moving, fast. My team and I brought MongoDB’s product and engineering organizations together to align on the plan for our next release. We set goals around delivering significant improvements in security, durability, and availability. And we drew a line in the sand: we weren’t going to release MongoDB 8.0 unless it was the best-performing version of MongoDB yet.

Measuring the performance of a feature-rich database like MongoDB can be tricky, as customers run a wide range of workloads. So we decided to run a suite of benchmarks to simulate customer workloads. We also developed Andon cord-inspired automation that would automatically roll back any code contributions that regressed our performance metrics. Finally, a set of senior engineering leaders met regularly to review our progress and immediately escalated any blockers that could jeopardize our launch, so that we could quickly fix things.

From my experience, I knew that great teams really respond when they’re given clear goals, and when they’re empowered to innovate, so I was excited to see what they would come up with. I’m proud to say that our product and engineering teams rose to the challenge.
Announcing MongoDB 8.0

Today, I’m thrilled to announce the general availability of MongoDB 8.0, the most secure, durable, available, and performant version of MongoDB yet! The team came up with architectural optimizations in MongoDB 8.0 that have significantly reduced memory usage and query times, and have made batch processing more efficient than in previous versions. Specifically, MongoDB 8.0 features 36% better read throughput, 56% faster bulk writes, 20% faster concurrent writes during data replication, and 200% faster complex aggregations of time series data. With these improvements, we’re seeing benchmarks for typical web applications perform 32% better overall. Here’s a breakdown of how MongoDB 8.0 performs against some of our benchmarks:

Improved performance benefits all users of applications built atop MongoDB, and for MongoDB customers, it can mean reduced costs (due to an improved price/performance ratio). In addition to significant performance gains, MongoDB 8.0 delivers a wide range of improvements, including (but not limited to):

Improving availability by delivering sharding enhancements to distribute data across shards up to 50 times faster and at up to 50% lower starting cost, with reduced need for additional configuration or setup.

Improving support for a wide range of search and AI applications at higher scale and lower cost, via the delivery of quantized vectors (compressed representations of full-fidelity vectors) that require up to 96% less memory and are faster to retrieve while preserving accuracy.

Enabling customers to encrypt data at rest, in transit, and in use by expanding MongoDB’s Queryable Encryption to also support range queries.
Queryable Encryption is a groundbreaking, industry-first innovation developed by the MongoDB Cryptography Research Group that allows customers to encrypt sensitive application data, store it securely as fully randomized encrypted data in the MongoDB database, and run expressive queries on the encrypted data, with no cryptography expertise required.

You might wonder why we’re so confident that customers are going to love MongoDB 8.0. Well, we’ve been acting as our own customer, and have moved our own applications over to 8.0. This approach is generally called “dogfooding,” but we think that “eating our own pizza” sounds more appetizing. Our internal build system, which our software developers use daily, is built atop MongoDB, and when we upgraded to MongoDB 8.0 we saw query latencies drop by approximately 75%! This was a double win, as it improved the performance of our own tooling, and it set our performance chat room abuzz with excitement in anticipation of delighting external customers. While results may vary based on your particular workload, the point is that we just couldn’t wait to share MongoDB 8.0’s performance gains with customers.

Indeed, customers are also already seeing great results on MongoDB 8.0. For example, Felix Horvat, Chief Technology Officer at OCELL, a climate technology company in Germany, said: “With MongoDB 8.0, we have seen an incredible boost in performance, with some of our queries running twice as fast as before. This improvement not only enhances our data processing capabilities but also aligns perfectly with our commitment to resource efficiency. By optimizing our backend operations, we can be more effective in our climate initiatives while conserving resources, a true reflection of our dedication to sustainable solutions.”

I encourage you to check out MongoDB 8.0 yourself.
It’s available today via MongoDB Atlas, as part of MongoDB Enterprise Advanced for on-premises and hybrid deployments, and as a free download from mongodb.com/try with MongoDB Community Edition. In addition, customers upgrading from previous versions of MongoDB to 8.0 can find helpful upgrade guides on mongodb.com.

What’s next?

We’re excited for you to try MongoDB 8.0 and to share your feedback, as customer feedback helps us guide our roadmap for future releases. Going forward, please watch this space. Over the next few weeks, we’ll be publishing a series of engineering blog posts that dig into MongoDB’s investments in the technology behind MongoDB 8.0. We’re also planning posts about horizontal scaling in MongoDB 8.0, and one that will look closely at Queryable Encryption (QE), but let me know what you’d like to hear more about.

It’s been an exciting year at MongoDB, and I can’t wait to see what the next one has in store!

–Jim
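For readers curious what the new Queryable Encryption range support looks like from the application side, here is a minimal sketch, not an official example. It assumes the encrypted-fields document format described in the Queryable Encryption documentation (a `fields` list whose entries carry `path`, `bsonType`, and a `queries` section with `queryType: "range"`); the field name `billAmount` and the bounds are hypothetical. Nothing here connects to a database: the code only builds the field description and an ordinary range filter, which a driver configured for automatic encryption would be expected to encrypt before sending.

```python
from typing import Any


def range_encrypted_field(path: str, lo: int, hi: int) -> dict[str, Any]:
    """Describe one integer field that should support encrypted range queries."""
    if lo >= hi:
        raise ValueError("lo must be smaller than hi")
    return {
        "path": path,
        "bsonType": "int",
        "queries": {"queryType": "range", "min": lo, "max": hi},
    }


def between(path: str, lo: int, hi: int) -> dict[str, Any]:
    """Ordinary inclusive range filter over the same field."""
    return {path: {"$gte": lo, "$lte": hi}}


# Hypothetical layout: a billing amount that must stay encrypted in the database.
encrypted_fields = {"fields": [range_encrypted_field("billAmount", 0, 100_000)]}
query = between("billAmount", 500, 2_000)
```

The filter is the same document an application would write against an unencrypted field, which is the point of the “no cryptography expertise required” claim above.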

October 2, 2024

Bringing Gen AI Into The Real World with Ramblr and MongoDB

How do you bring the benefits of gen AI, a technology typically experienced on a keyboard and screen, into the physical world? That’s the problem the team at Ramblr.ai, a San Francisco-based startup, is solving with its powerful and versatile 3D annotation and recognition capabilities. “With Ramblr you can record continuously what you are doing, and then ask the computer, in natural language, ‘Where did I go wrong?’ or ‘What should I do next?’” said Frank Angermann, Lead Pipeline & Infrastructure Engineer at Ramblr.ai.

Gen AI for the real world

One of the best examples of Ramblr’s technology, and its potential, is its work with the international chemical giant BASF. In a video demonstration on Ramblr’s website, a BASF engineer can be seen tightening bolts on a connector (or ‘flange’) joining two parts of a pipeline. Every move the engineer makes is recorded via a helmet-mounted camera. Once the worker is finished for the day, this footage, along with the footage of every other person working on the pipeline, is uploaded to a database. Using Ramblr’s technology, quality assurance engineers from BASF then query the collected footage from every worker, asking the software, ‘Please assess footage from today’s pipeline connection work and see if any of the bolts were not tightened enough.’ Having processed the footage, Ramblr assesses whether those flanges were assembled correctly and identifies any that require further inspection or correction.

The method behind the magic

“We started Ramblr.ai as an annotation platform, a place where customers could easily label images from a video and have machine learning models then identify that annotation throughout the video automatically,” said Frank. “In the past this work would be carried out manually by thousands of low-paid workers tagging videos by hand. We thought we could be better by automating that process,” he added.
The software allows customers to easily customize and add annotations to footage for their particular use case, and with its gen AI-powered active learning approach Ramblr then ‘fills in’ the rest of the video based on those annotations.

Why MongoDB?

MongoDB has been part of the Ramblr technology stack since the beginning. “We use MongoDB Atlas for half of our storage processes. Metadata, annotation data, etc., can all be stored in the same database. This means we don’t have to rely on separate databases to store different types of data,” said Frank. Flexibility of data storage was also a key consideration when choosing a database. “With MongoDB Atlas, we could store information the way we wanted to,” he added. The built-in vector database capabilities of Atlas were also appealing to the Ramblr team: “The ability to store vector embeddings without having to do any more work, for instance not having to move a 3 MB array of data somewhere else to process it, was a big bonus for us.”

The future

Aside from infrastructure and construction QA, robotics is another area in which the Ramblr team is eager to deploy their technology. “Smaller robotics companies don’t typically have the data to train the models that inform their products. There are quite a few use cases where we could support these companies and provide a more efficient and cost-effective way to teach the robots more efficiently. We are extremely efficient in providing information for object detectors,” said Frank.

But while there are plenty of commercial uses for Ramblr’s technology, the growth in spatial computing in the consumer sector, especially following the release of Apple’s Vision Pro and Meta Quest headsets, opens up a whole new category of use cases. “Spatial computing will be a big part of the world.
Being able to understand the particular processes, taxonomy, and what the person is actually seeing in front of them will be a vital part of the next wave of innovation in user interfaces and the evolution of gen AI,” Frank added.

Are you building AI apps?

Join the MongoDB AI Innovators Program today! Successful participants gain access to free Atlas credits, technical enablement, and invaluable connections within the broader AI ecosystem. If your company is interested in being featured, we’d love to hear from you. Connect with us at ai_adopters@mongodb.com. Head over to our quick-start guide to get started with Atlas Vector Search today.
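As a rough idea of what querying stored embeddings with Atlas Vector Search can look like, the sketch below builds an aggregation pipeline with a `$vectorSearch` stage. It is an illustration under stated assumptions, not Ramblr’s implementation: the index name `vector_index`, the `embedding` and `label` fields, the three-dimensional query vector, and the candidates-per-result ratio are all hypothetical. The pipeline is only constructed here; with PyMongo it would be passed to `collection.aggregate(...)` on an Atlas cluster that has a matching vector index.

```python
from typing import Any


def vector_search_pipeline(
    query_vector: list[float],
    index: str = "vector_index",
    path: str = "embedding",
    limit: int = 5,
    candidates_per_result: int = 20,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline whose first stage is $vectorSearch."""
    if not query_vector:
        raise ValueError("query_vector must not be empty")
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": query_vector,
                # Consider more candidates than results to improve recall.
                "numCandidates": limit * candidates_per_result,
                "limit": limit,
            }
        },
        # Keep only a label and the similarity score for each match.
        {"$project": {"_id": 0, "label": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


# Real embeddings have hundreds or thousands of dimensions; three keep this readable.
pipeline = vector_search_pipeline([0.12, -0.03, 0.88], limit=3)
```

Because the vectors live in the same documents as the metadata and annotations, the `$project` stage can return both without a second lookup.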

September 30, 2024

AI-Driven Noise Analysis for Automotive Diagnostics

Aftersales service is a crucial revenue stream for the automotive industry, with leading manufacturers executing repairs through their dealer networks. One global automotive giant recently embarked on an ambitious project to revolutionize their diagnostic process. Traditional diagnostic methods can be time-consuming, expensive, and imprecise, especially for complex engine issues. The project, which aimed to increase efficiency, customer satisfaction, and revenue throughput, involved the development of an AI-powered solution that could quickly analyze engine sounds and compare them to a database of known problems, significantly reducing diagnostic times.

Initial setbacks, then a fresh perspective

Despite the client team’s best efforts, the project faced significant challenges and setbacks during the nine-month prototype phase. Though the team struggled to produce reliable results, they were determined to make the project a success. At this point, MongoDB introduced its client to Pureinsights, a specialized gen AI implementation and MongoDB AI Application Program partner, to rethink the solution and to salvage the project. As new members of the project team, and as Pureinsights’s CTO and Lead Architect, respectively, we brought a fresh perspective to the challenge.

Figure 1: Before and after the AI-powered noise diagnostic solution

A pragmatic approach: Text before sound

Upon review, we discovered that the project had initially started with a text-based approach before being persuaded to switch to sound analysis. The Pureinsights team recommended reverting to text analysis as a foundational step before tackling the more complex audio problem.
This strategy involved collecting text descriptions of car problems from technicians and customers; comparing these descriptions against a vast database of known issues already stored in MongoDB; and using advanced natural language processing, semantic/vector search, and retrieval-augmented generation techniques to identify similar cases and potential solutions. Our team tested six different models for cross-lingual semantic similarity, ultimately settling on Google’s Gecko model for its superior performance across 11 languages.

Pushing boundaries: Integrating audio analysis

With the text-based foundation in place, we turned to audio analysis. Pureinsights developed an innovative approach to the project by combining our AI expertise with insights from advanced sound analysis research. We drew inspiration from groundbreaking models that had gained renown for their ability to identify cities solely from background noise in audio files. This blend of AI knowledge and specialized audio analysis techniques resulted in a robust, scalable system capable of isolating and analyzing engine sounds from various recordings.

We adapted these sophisticated audio analysis models, originally designed for urban sound identification, to the specific challenges of automotive diagnostics. These learnings and adaptations are also applicable to future use cases for AI-driven audio analysis across various industries. This expertise was crucial in developing a sophisticated audio analysis model capable of isolating engine and car noises from customer or technician recordings, converting these isolated sounds into vectors, and using these vectors to search the manufacturer’s existing database of known car problem sounds.

At the heart of this solution is MongoDB’s powerful database technology. The system leverages MongoDB’s vector and document stores to manage over 200,000 case files.
Each "document" is more akin to a folder or case file containing structured data about the vehicle and reported issue, sound samples of the problem, and unstructured text describing the symptoms and context. This unified approach allows for seamless comparison of text and audio descriptions of customer engine problems using MongoDB's native vector search technology.

Encouraging progress and phased implementation

The solution's text component has already been rolled out to several dealers, and the audio similarity feature will be integrated in late 2024. This phased approach allows for real-world testing and refinement before a full-scale deployment across the entire repair network.

The client is taking a pragmatic, step-by-step approach to implementation. If the initial partial rollout with audio diagnostics proves successful, the plan is to expand the solution more broadly across the dealer network. This cautious (yet forward-thinking) strategy aligns with the automotive industry's move towards more data-driven maintenance practices.

As the solution continues to evolve, the team remains focused on enhancing its core capabilities in text and audio analysis for current diagnostic needs. The manufacturer is committed to evaluating the real-world impact of these innovations before considering potential future enhancements. This measured approach ensures that each phase of the rollout delivers tangible benefits in efficiency, accuracy, and customer satisfaction.

By prioritizing current diagnostic capabilities and adopting a phased implementation strategy, the automotive giant is paving the way for a new era of efficiency and customer service in their aftersales operations. The success of this initial rollout will inform future directions and potential expansions of the AI-powered diagnostic system.

A new era in automotive diagnostics

The automotive giant brought industry expertise and a clear vision for improving their aftersales service.
MongoDB provided the robust, flexible data platform essential for managing and analyzing diverse, multi-modal data types at scale. We, at Pureinsights, served as the AI application specialist partner, contributing critical AI and machine learning expertise, and bringing fresh perspectives and innovative approaches. We believe our role was pivotal in rethinking the solution and salvaging the project at a crucial juncture.

This synergy of strengths allowed the entire project team to overcome initial setbacks and develop a groundbreaking solution that combines cutting-edge AI technologies with MongoDB's powerful data management capabilities. The result is a diagnostic tool leveraging text and audio analysis to significantly reduce diagnostic times, increase customer satisfaction, and boost revenue through the dealer network.

The project's success underscores several key lessons: the value of persistence and flexibility in tackling complex challenges; the importance of choosing the right technology partners; the power of combining domain expertise with technological innovation; and the benefits of a phased, iterative approach to implementation.

As industries continue to evolve in the age of AI and big data, this collaborative model, which brings together industry leaders, technology providers, and specialized AI partners, sets a new standard for innovation. It demonstrates how companies can leverage partnerships to turn ambitious visions into reality, creating solutions that drive business value while enhancing customer experiences.

The future of automotive diagnostics, and of AI-driven solutions across industries, looks brighter thanks to the combined efforts of forward-thinking enterprises, cutting-edge database technologies like MongoDB, and specialized AI partners like Pureinsights. As this solution continues to evolve and deploy across the global dealer network, it paves the way for a new era of efficiency, accuracy, and customer satisfaction in the automotive industry.
This solution has the potential to not only revolutionize automotive diagnostics but also set a new standard for AI-driven solutions in other industries, demonstrating the power of collaboration and innovation. To deliver more solutions like this—and to accelerate gen AI application development for organizations at every stage of their AI journey—Pureinsights has joined the MongoDB AI Application Program (MAAP). Check out the MAAP page to learn more about the program and how MAAP ecosystem members like Pureinsights can help your organization accelerate time-to-market, minimize risks, and maximize the value of your AI investments.
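The case-file idea described in this article, one record holding structured data, symptom text, and vectors for both text and audio, can be sketched in a few lines. The example below is a toy illustration and not the production system: the `CaseFile` fields, the two-dimensional vectors, and the use of brute-force cosine similarity are all assumptions made for readability, whereas the real solution relies on MongoDB's vector search over model-generated embeddings.

```python
import math
from dataclasses import dataclass


@dataclass
class CaseFile:
    """Hypothetical case record: one 'folder' per known problem."""
    case_id: str
    symptom_text: str
    text_vector: list[float]
    audio_vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(cases: list[CaseFile], query_vector: list[float],
                 field: str, top_k: int = 2) -> list[str]:
    """Return the ids of the top_k cases whose `field` vector is closest to the query."""
    scored = [(cosine(getattr(c, field), query_vector), c.case_id) for c in cases]
    return [case_id for _, case_id in sorted(scored, reverse=True)[:top_k]]


cases = [
    CaseFile("A-1", "rattle at idle", [1.0, 0.0], [0.9, 0.1]),
    CaseFile("B-2", "whine under load", [0.0, 1.0], [0.1, 0.9]),
    CaseFile("C-3", "rattle when cold", [0.8, 0.2], [0.7, 0.3]),
]

# The same lookup works for either modality by switching the vector field.
best_by_audio = most_similar(cases, [1.0, 0.0], "audio_vector")
best_by_text = most_similar(cases, [1.0, 0.0], "text_vector")
```

Keeping both vectors on the same record is what lets a text description and a sound recording of the same complaint be matched against a single set of known cases.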

September 27, 2024

Away From the Keyboard: Apoorva Joshi, MongoDB Senior AI Developer Advocate

Welcome to our article series focused on developers and what they do when they’re not building incredible things with code and data. “Away From the Keyboard” features interviews with developers at MongoDB, discussing what they do, how they establish a healthy work-life balance, and their advice for others looking to create a more holistic approach to coding. In this article, Apoorva Joshi shares her day-to-day responsibilities as a Senior AI Developer Advocate at MongoDB; what a flexible approach to her job and life looks like; and how her work calendar helps prioritize overall balance.

Q: What do you do at MongoDB?

Apoorva: My job is to help developers successfully build AI applications using MongoDB. I do this through written technical content, hands-on workshops, and design whiteboarding sessions.

Q: What does work-life balance look like for you?

Apoorva: I love remote work. It allows me to have a flexible approach towards work and life where I can accommodate life things, like dental appointments, walks, or lunches in the park during my work day, as long as work gets done.

Q: Was that balance always a priority for you or did you develop it later in your career?

Apoorva: Making work-life balance a priority has been a fairly recent development. During my first few years on the job, I would work long hours, partly because I felt like I needed to prove myself and also because I hadn’t prioritized finding activities I enjoyed outside of school or work up until then. The first lockdown during the pandemic put a lot of things into perspective. With work and life happening in the same place, I felt the need for boundaries. Having nowhere to go encouraged me to try out new hobbies, such as solving jigsaw puzzles, as well as to reconnect with old favorites, like reading and painting.

Q: What benefits has this balance given you?

Apoorva: Doing activities away from the keyboard makes me more productive at work.
A flexible working schedule also creates a stress-free environment and allows me to bring my 100% to work. This balance helps me make time for family and friends, exercise, chores, and hobbies. Overall, having a healthy work-life balance helps me lead a fulfilling life that I am proud of.

Q: What advice would you give to a developer seeking to find a better balance?

Apoorva: The first step to finding a balance between work and life is to recognize that boundaries are healthy. I have found that putting everyday things, such as lunch breaks and walks, on my work calendar is a good way to remind myself to take that break or close my laptop, while also communicating those boundaries to my colleagues. If you are having trouble doing this on your own, ask a family member, partner, or friend to remind you!

Thank you to Apoorva Joshi for sharing her insights! And thanks to all of you for reading. Look for more in our new series.

Interested in learning more about or connecting more with MongoDB? Join our MongoDB Community to meet other community members, hear about inspiring topics, and receive the latest MongoDB news and events. And let us know if you have any questions for our future guests when it comes to building a better work-life balance as developers. Tag us on social media: @/mongodb

September 26, 2024