Resources

Scaling Vector Database Operations with MongoDB and Voyage AI

The performance and scalability of your AI application depend on efficient vector storage and retrieval. In this webinar, we explore how MongoDB Atlas Vector Search and Voyage AI embeddings optimize both through quantization, a technique that reduces the precision of vector embeddings (e.g., float32 to int8) to decrease storage costs and improve query performance while managing accuracy trade-offs.

Vector embeddings are the foundation of AI-driven applications and of capabilities such as retrieval-augmented generation (RAG), semantic search, and agent-based workflows. However, as data volumes grow, the cost and complexity of storing and querying high-dimensional vectors increase. Senior Staff Developer Advocate Anant Srivastava covers practical strategies for converting embeddings to lower-bit representations while balancing performance with accuracy. In a step-by-step tutorial, he shows how to apply these optimizations using Voyage AI embeddings to reduce both query latency and infrastructure costs.

Key Takeaways:

- How quantization works to dramatically reduce the memory footprint of embeddings
- How MongoDB Atlas Vector Search integrates automatic quantization to efficiently manage millions of vector embeddings
- Real-world metrics for retrieval latency, resource utilization, and accuracy across float32, int8, and binary embeddings
- How combining binary quantization with a rescoring step yields near float32-level accuracy with a fraction of the computational overhead
- Best practices and tips for balancing speed, cost, and precision, especially at the 1M+ embedding scale essential for RAG, semantic search, and recommendation systems
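To make the quantization idea concrete, here is a minimal NumPy sketch of the three representations mentioned above (float32, int8, and binary) plus a rescoring step. It is an illustration under stated assumptions, not the webinar's code and not how Atlas Vector Search implements quantization internally: the per-dimension min/max scaling, the sign threshold at zero, and the `oversample` factor are all choices made for this example, and every function name is hypothetical.

```python
import numpy as np

def quantize_int8(vectors):
    """Scalar quantization (assumed scheme): map each dimension's
    [min, max] range onto the 256 int8 levels."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # constant dimensions: avoid dividing by zero
    q = np.round((vectors - lo) / scale) - 128
    return q.astype(np.int8), lo, scale

def quantize_binary(vectors):
    """Binary quantization (assumed scheme): keep one bit per dimension
    (value > 0) and pack eight dimensions into each byte."""
    return np.packbits(vectors > 0, axis=1)

def hamming_search(query_bits, db_bits, k):
    """Return indices of the k stored vectors closest in Hamming distance."""
    xor = np.bitwise_xor(db_bits, query_bits)
    dists = np.unpackbits(xor, axis=1).sum(axis=1)
    return np.argsort(dists)[:k]

def search_with_rescoring(query, db_float, db_bits, k, oversample=10):
    """Coarse search on binary codes, then rescore only the shortlisted
    candidates with full-precision dot products."""
    query_bits = quantize_binary(query[None, :])
    candidates = hamming_search(query_bits, db_bits, k * oversample)
    scores = db_float[candidates] @ query
    return candidates[np.argsort(-scores)[:k]]

# Synthetic demo data: 1,000 unit-length vectors with 64 dimensions.
rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 64)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)

db_int8, _, _ = quantize_int8(db)
db_bits = quantize_binary(db)

# Storage per representation: 4 bytes, 1 byte, and 1/8 byte per dimension.
print(db.nbytes, db_int8.nbytes, db_bits.nbytes)

top = search_with_rescoring(db[7], db, db_bits, k=5)
print(top)
```

The storage arithmetic is the part that holds regardless of the scheme chosen: int8 uses a quarter of the bytes of float32, and packed binary codes use one thirty-second. Accuracy and latency, by contrast, depend on the data and the embedding model, so they should be measured rather than inferred from this sketch.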

Watch presentation →

AI Database Comparison: MongoDB Atlas vs. Elasticsearch

Large language models (LLMs) are revolutionizing AI application development, from retrieval-augmented generation (RAG) to intelligent agentic systems that dynamically reason, learn, and adapt. The AI ecosystem offers a range of technologies to build these solutions, but one of the most critical elements is the database. In generative AI applications, your database directly impacts response latency, application performance, and output accuracy.

This session compares two solutions for vector data storage, indexing, and retrieval: MongoDB Atlas and Elasticsearch. We break down how each handles semantic search, provide performance insights, and walk through best practices for implementing vector search in MongoDB Atlas.

Watch MongoDB Staff Developer Advocate Richmond Alake as he guides you through:

- Common search patterns in AI workloads, with concrete examples and detailed performance guidance for MongoDB Atlas Vector Search.
- Why database architecture matters in generative AI, illustrated through real-world implementation scenarios that showcase developer productivity and application capabilities.
- How to implement MongoDB Atlas Vector Search step by step, with live coding demonstrations that show how to enable semantic search for RAG solutions through a unified developer experience.
- How to build complete RAG pipelines that integrate real-time data, with working code examples you can adapt for your own LLM-driven applications.
- Actionable best practices for optimizing MongoDB Atlas for AI workloads, including live demonstrations of index configuration, efficient query patterns, and practical scaling strategies.

Whether you're building AI chatbots, recommendation engines, or agentic systems, understanding how your database powers LLM-enabled applications is essential. Watch now to gain practical knowledge and working code examples that will help you design and optimize AI-powered solutions with MongoDB Atlas.
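For readers who want a sense of what a semantic search query in MongoDB Atlas looks like, the sketch below builds an aggregation pipeline around the `$vectorSearch` stage. It is not taken from the session. The helper name, the index name `vector_index`, and the field names `embedding` and `text` are assumptions made for illustration, and the query vector is a placeholder for a real embedding.

```python
def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 path="embedding", limit=5, num_candidates=100):
    """Build an aggregation pipeline for Atlas Vector Search.

    index_name, path, and the projected "text" field are hypothetical;
    replace them with the names used in your own collection and index.
    """
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,              # name of the vector search index
                "path": path,                     # document field holding the embedding
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,  # neighbors considered before ranking
                "limit": limit,                   # documents returned
            }
        },
        {
            # Return the text plus the similarity score computed by the search stage.
            "$project": {
                "_id": 0,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

# Placeholder query vector; a real one would come from an embedding model.
pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=3)
print(pipeline[0]["$vectorSearch"]["limit"])
```

With PyMongo, a pipeline like this would typically be passed to `collection.aggregate(pipeline)` against a collection that already has a matching vector search index. In a RAG pipeline, the returned `text` values would then be supplied to the LLM as context.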

Watch presentation →