Top Considerations When Choosing a Hybrid Search Solution
September 30, 2025
Search has evolved. Today, natural language queries have largely replaced simple keyword searches when addressing our information needs. Instead of typing “Peru travel guide” into a search engine, we now ask a large language model (LLM) “Where should I visit in Peru in December during a 10-day trip? Create a travel guide.”
Is keyword search no longer useful? While the rise of LLMs and vector search may suggest that traditional keyword search is becoming less prevalent, the future of search actually relies on effectively combining both methods. This is where hybrid search plays a crucial role, blending the precision of traditional text search with the powerful contextual understanding of vector search. Despite advances in vector technology, keyword search still has a lot to contribute and remains essential to meeting current user expectations.
The rise of hybrid search
By late 2022 and particularly throughout 2023, as vector search saw a surge in popularity (see image 1 below), it quickly became clear that vector embeddings alone were not enough. Even as embedding models continue to improve at retrieval tasks, full-text search will always remain useful for identifying tokens outside the training corpus of an embedding model. That is why users soon began to combine vector search with lexical search, exploring ways to leverage both precision and context-aware retrieval. This shift was driven in large part by the rise of generative AI use cases like retrieval-augmented generation (RAG), where high-quality retrieval is essential.

As hybrid search matured beyond basic score combination, two main fusion techniques emerged: reciprocal rank fusion (RRF) and relative score fusion (RSF). They offer ways to combine results that do not rely on directly comparable score scales. RRF focuses on ranking position, rewarding documents that consistently appear near the top across different retrieval methods. RSF, on the other hand, works directly with raw scores from different sources of relevance, using normalization to minimize outliers and align modalities effectively at a more granular level than rank alone can provide. Both approaches quickly gained traction and have become standard techniques in the market.
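The two fusion techniques can be sketched in a few lines. This is a minimal illustration using made-up document IDs and scores, not output from any real search engine:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """RRF: each list contributes 1/(k + rank) per document (rank is 1-based)."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

def relative_score_fusion(score_lists, weights=None):
    """RSF: min-max normalize each list's raw scores, then take a weighted sum."""
    weights = weights or [1.0] * len(score_lists)
    fused = {}
    for scores, w in zip(score_lists, weights):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        for doc_id, s in scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + w * (s - lo) / span
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical results: lexical (BM25-style scores) vs. vector (cosine similarity).
# Note the incomparable score scales, which is exactly what fusion addresses.
lexical = {"doc1": 12.4, "doc3": 9.1, "doc2": 2.0}
vector = {"doc2": 0.92, "doc1": 0.88, "doc4": 0.71}

rrf_order = reciprocal_rank_fusion([list(lexical), list(vector)])
rsf_order = relative_score_fusion([lexical, vector])
```

Note how RRF only consumes rank positions, while RSF rescales the raw scores of each modality onto [0, 1] before summing, preserving relative score gaps that rank alone discards.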
How did the market react?
The industry realized the need to introduce hybrid search capabilities, which brought different challenges for different types of players.
For lexical-first search platforms, the main challenge was to add vector search features and implement the bridging logic with their existing keyword search infrastructure. These vendors understood that the true value of hybrid search emerges when both modalities are independently strong, customizable, and tightly integrated.
On the other hand, vector-first search platforms faced the challenge of adding lexical search. Implementing lexical search through traditional inverted indexes was often too costly due to storage differences, increased query complexity, and architectural overhead. Many adopted sparse vectors, which represent keyword importance in a way similar to traditional term-frequency methods used in lexical search. Sparse vectors were key for vector-first databases in enabling a fast integration of lexical capabilities without overhauling the core architecture.
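The contrast between the two representations can be sketched as follows. This is an illustrative toy example with hypothetical terms and weights, not any vendor's actual encoding:

```python
# A dense embedding assigns a value to every dimension; a sparse vector stores
# only nonzero (term -> weight) entries, much like term-frequency weighting
# in lexical search.
dense = [0.12, -0.03, 0.51, 0.08]          # every dimension populated
sparse_doc = {"hybrid": 1.8, "search": 1.2}  # only terms in the document carry weight

def sparse_dot(query, doc):
    """Score a query against a document using overlapping terms only."""
    return sum(w * doc.get(term, 0.0) for term, w in query.items())

query = {"hybrid": 1.0, "database": 0.7}
score = sparse_dot(query, sparse_doc)  # only "hybrid" overlaps
```

Because only matching terms contribute to the score, sparse vectors can be served by the same nearest-neighbor machinery a vector database already has, which is what made them an attractive shortcut for vector-first platforms.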
Hybrid search soon became table stakes, and the industry focus shifted toward improving developer efficiency and simplifying integration. This led to a growing trend of vendors building native hybrid search functions directly into their platforms. By offering out-of-the-box support for combining and managing both search types, vendors accelerated the delivery of powerful search experiences.
As hybrid search became the new baseline, more sophisticated re-ranking approaches emerged. Techniques like cross-encoders, learning-to-rank models, and dynamic scoring profiles began to play a larger role, providing systems with additional alternatives to capture nuanced user intent. These methods complement hybrid search by refining the result order based on deeper semantic understanding.
Lexical-first or vector-first? Top considerations when choosing a hybrid search solution
When choosing how to implement hybrid search, your existing infrastructure plays a major role in the decision. For users working within a vector-first database, leveraging its lexical capabilities without rethinking the architecture is often enough. However, if the lexical search requirements are advanced, the optimal solution is often a traditional lexical search platform coupled with vector search, such as MongoDB. Traditional lexical - or lexical-first - search offers greater flexibility and customization for keyword search, and when combined with vectors, provides a more powerful and accurate hybrid search experience.

Indexing strategy is another factor to consider. When setting up hybrid search, users can either keep keyword and vector data in separate indexes or combine them into one. Separate indexes give more freedom to tweak each search type, scale them differently, and experiment with scoring. The compromise is higher complexity, with two pipelines to manage and the need to normalize scores. On the other hand, a combined index is easier to manage, avoids duplicate pipelines, and can be faster since both searches run in a single pass. However, it limits flexibility to what the search engine supports and ties the scaling of keyword and vector search together. The decision is mainly a trade-off between control and simplicity.
Lexical-first solutions were built around inverted indexes for keyword retrieval, with vector search added later as a separate component. This often results in hybrid setups that use separate indexes. Vector-first platforms were designed for dense vector search from the start, with keyword search added as a supporting feature. These tend to use a single index for both approaches, making them simpler to manage but sometimes offering less mature keyword capabilities.
Lastly, a key aspect to take into account is the implementation style. Solutions with hybrid search functions handle the combination of lexical and vector search natively, removing the need for developers to manually implement it. This reduces development complexity, minimizes potential errors, and ensures that result merging and ranking are optimized by default. Built-in function support streamlines the entire implementation, allowing teams to focus on building features rather than managing infrastructure.
In general, lexical-first systems tend to offer stronger keyword capabilities and more flexibility in tuning each search type, while vector-first systems provide a simpler, more unified hybrid experience. The right choice depends on whether you prioritize control and mature lexical features or streamlined management with lower operational overhead.
How does MongoDB do it?
When vector search emerged, MongoDB added vector search indexes alongside its existing lexical search indexes. With that, MongoDB evolved into a competitive vector database by providing developers with a unified architecture for building modern applications. The result is an enterprise-ready platform that integrates traditional lexical search indexes and vector search indexes into the core database.
MongoDB recently released native hybrid search functions to MongoDB Atlas and as part of a public preview for use with MongoDB Community Edition and MongoDB Enterprise Server deployments. This feature is part of MongoDB’s integrated ecosystem, where developers get an out-of-the-box hybrid search experience to enhance the accuracy of application search and RAG use cases.
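A hybrid query using these native functions might look roughly like the following. This is a hedged sketch assuming the `$rankFusion` aggregation stage; the index names, field names, and query values are hypothetical placeholders, so consult the current MongoDB documentation for the exact syntax:

```javascript
// Sketch of a native hybrid search pipeline: one lexical sub-pipeline and one
// vector sub-pipeline, fused by rank with optional per-pipeline weights.
const hybridPipeline = [
  {
    $rankFusion: {
      input: {
        pipelines: {
          // Lexical pipeline: full-text search on a hypothetical "plot" field
          lexical: [
            { $search: { index: "default", text: { query: "space adventure", path: "plot" } } },
          ],
          // Vector pipeline: semantic search over precomputed embeddings
          semantic: [
            {
              $vectorSearch: {
                index: "vector_index",
                path: "plot_embedding",
                queryVector: [], // embedding of the query text goes here
                numCandidates: 100,
                limit: 20,
              },
            },
          ],
        },
      },
      // Bias the fused ranking toward the semantic results in this sketch
      combination: { weights: { lexical: 1, semantic: 2 } },
    },
  },
  { $limit: 10 },
];
```

The key point is that the fusion logic lives inside the database: the application submits one aggregation pipeline instead of running two queries and merging results in client code.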
As a result, instead of managing separate systems for different workloads, MongoDB users benefit from a single platform designed to support both operational and AI-driven use cases. As generative AI and modern applications advance, MongoDB gives organizations a flexible, AI-ready foundation that grows with them.
Read our blog to learn more about MongoDB’s new Hybrid Search function.
Visit the MongoDB AI Learning Hub to learn more about building AI applications with MongoDB.