
Better RAG Results With Reciprocal Rank Fusion (RRF) and Hybrid Search

Reciprocal rank fusion (RRF) is a popular algorithm for implementing hybrid search. It aggregates rankings from multiple searches (for example, a keyword search and a vector search) into a single ranking that is typically more accurate than any of the individual rankings.

Before diving into RRF, let’s discuss the limitations posed by keyword search and vector search, when used individually.

Keyword search relies solely on the keywords given in the user query. The search results are based only on keyword matches and disregard contextual relevance, which is a particular problem for ambiguous queries. For example, if you search for “reciprocal rank fusion” using keyword search, it will return only those results that contain these exact keywords, and you may miss related, contextually appropriate results on topics like reranking, retrieval-augmented generation, and hybrid search.

Vector search looks for semantically similar results, typically using a k-nearest neighbors (KNN) algorithm. There are two types of vector search:

Sparse vector search extends keyword search by using weighted, high-dimensional sparse vectors that encode the presence, absence, and frequency of search terms. However, it does not fully capture the context or meaning of a search query.

Dense vector search represents data (for example, text) as numerical vectors in multiple dimensions, where each dimension captures some detail of the data. These dense vectors are created using a deep learning model. When a query is submitted, it is converted to a vector as well, and the similarity between the query vector and the document vectors is measured to identify which ones are closest. The closer the vectors are, the more related they are contextually. This way, dense vector search gives semantically appropriate search results.
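To make the idea of vector closeness concrete, here is a minimal sketch that ranks documents by cosine similarity to a query vector. The three-dimensional embeddings, the document names, and the `nearest` helper are all made up for illustration; a real system would obtain embeddings from a model and would usually rely on an approximate nearest neighbor index rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; real models emit hundreds or thousands of dimensions.
documents = {
    "hybrid search": [0.9, 0.1, 0.3],
    "reranking": [0.8, 0.2, 0.4],
    "croissant recipe": [0.1, 0.9, 0.2],
}
query = [0.85, 0.15, 0.35]

def nearest(query_vector, docs, k=2):
    """Exact k-nearest-neighbor search: score every document, keep the top k."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query_vector, docs[name]),
        reverse=True,
    )
    return ranked[:k]

top_two = nearest(query, documents, k=2)
```

In this toy data the two search-related documents point in nearly the same direction as the query, so they are returned ahead of the unrelated one.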

Hybrid search combines various types of searches, like keyword and vector search, to produce multiple lists of relevant documents. These combined results can then be passed to a reranking module, which optimizes the final ranking order based on relevance signals or learned scoring. An example is shown below:

Figure: Example of hybrid search

Different types of retrieval methods

From basic to more advanced retrieval methods, each has its pros and cons and suits different use cases. Below is a quick overview of some important retrieval methodologies:

BM25

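BM25 is a traditional keyword search ranking algorithm, where the results are based on the exact input words given by the user. It scores each document by how often the query terms occur in it, how rare those terms are across the collection, and how long the document is. If you want an exact match result, like in a document search, BM25 is an excellent choice.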
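As an illustration of keyword scoring, the sketch below implements one common formulation of BM25 over whitespace-tokenized text. The function name `bm25_scores`, the sample documents, and the parameter defaults (`k1=1.5`, `b=0.75`) are assumptions chosen for the example; production search engines add proper tokenization, stemming, and index structures.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue  # only exact term matches contribute
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Length normalization: longer-than-average documents are penalized.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "reciprocal rank fusion combines ranked lists",
    "vector search uses embeddings",
    "rank fusion for hybrid search",
]
scores = bm25_scores("reciprocal rank fusion", docs)
```

The second document shares no terms with the query, so it scores zero even though it is topically related, which is the limitation of pure keyword search described earlier.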

Vector search

Vector search retrieval captures semantic similarity between the query and the stored content and tries to provide contextually relevant results. It uses dense vector embeddings to find the closest matches in vector space and improve the relevance of responses. It is quite useful for LLM-based question-answer systems and chatbots.

Multi-query retrieval

The multi-query retriever uses a large language model (LLM) to produce multiple versions or interpretations of the same user query. The search results of each query are then aggregated to get a more comprehensive final result set. This can be useful for product recommendation systems, document retrieval, and academic research.
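The aggregation step can be sketched as follows. Everything here is hypothetical: `generate_variants` stands in for an LLM call that rewrites the question, `search` stands in for any retriever, and the fake index exists only so the example runs on its own. Results are merged as a de-duplicated union in first-seen order, which is just one simple way to aggregate.

```python
def multi_query_retrieve(question, generate_variants, search, top_k=3):
    """Run the original question plus its variants and merge the unique results."""
    merged, seen = [], set()
    for query in [question, *generate_variants(question)]:
        for doc_id in search(query)[:top_k]:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged

# Hypothetical stand-ins so the sketch runs without an LLM or a search index.
def fake_variants(question):
    return [f"{question} overview", f"{question} examples"]

FAKE_INDEX = {
    "hybrid search": ["doc1", "doc2"],
    "hybrid search overview": ["doc2", "doc3"],
    "hybrid search examples": ["doc4", "doc1"],
}

def fake_search(query):
    return FAKE_INDEX.get(query, [])

results = multi_query_retrieve("hybrid search", fake_variants, fake_search)
# results == ["doc1", "doc2", "doc3", "doc4"]
```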

Ensemble

An ensemble retriever combines a sparse retriever (keyword search, such as BM25) with a dense retriever (vector search) to produce a list of relevant documents. It uses methods like reciprocal rank fusion to combine the rankings from multiple retrieval methods into a final, unified result. Combining semantic matches with keyword matches tends to produce more effective and accurate results. An ensemble retriever is a good choice for search engines, recommendation systems, and many more use cases.

Retrieval-augmented generation (RAG) using hybrid search

Using hybrid search can greatly enhance RAG, as the results are typically more accurate than those from keyword or semantic search alone. Reranking the search results using reciprocal rank fusion further improves result quality. The following diagram illustrates a hybrid search RAG pipeline that uses reranking.

Figure: Example of RAG using hybrid search

Once the LLM receives a user prompt, it generates a query and sends both the query and the prompt to the data store to retrieve relevant documents. Since hybrid search is used, two separate ranked lists with scores are generated—one from keyword search and another from vector search. These scores are then combined, typically using a fusion technique such as reciprocal rank fusion, followed by reranking. The final set of results is returned to the LLM, which uses the information to generate a response for the end user.

Why RRF?

Implementing reciprocal rank fusion has several advantages. It combines the strengths of various methods, including sparse and dense vector results, which improves the relevance of the results. An RRF algorithm assigns a reciprocal rank score based on the document ranks from multiple retrieval methods. This mitigates errors that any individual method might make and, by giving the LLM better context, helps reduce hallucination, thus improving the performance and reliability of search results. Accuracy and reliability are crucial in RAG applications like content summarization, information retrieval, and question-answering systems.

How RRF works

As a first step, whenever a user submits a query, multiple searches are kicked off, for example a keyword search and a semantic search. Each of these methods generates a ranked list of results. The next step is to calculate the reciprocal rank score of each result in each list. The score is calculated as:

score = 1 / (rank + k)

In the above formula, rank is the position of the document in the list, starting at 1.

k is a constant that helps in balancing the influence of individual rankings. The value of k decides the sensitivity to rank positions: a small k lets top-ranked documents dominate, while a larger k (60 is a common choice) flattens the differences between adjacent ranks.

In some implementations, different search strategies can be assigned different weights before combining their scores. This allows the system to favor one retrieval method—such as semantic search—over another, depending on the use case or domain requirements.

The next step is to sum the scores that each document obtained from the individual strategies into a single combined score. Then, rank the documents again (rerank) based on the combined score. Documents with higher scores are placed at the top of the final ranking.

The final fused ranking is the list obtained from the above step. Because RRF blends rank positions rather than normalizing raw scores, it can combine retrieval methods whose scores are on different scales.
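The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `reciprocal_rank_fusion`, the optional `weights` parameter (one possible way to favor a retrieval method), and the document IDs are assumptions made for the example.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60, weights=None):
    """Fuse ranked lists of document IDs into one list of (doc_id, score) pairs.

    rankings: list of lists, each ordered from most to least relevant.
    weights:  optional per-list multipliers; defaults to 1.0 for every list.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    fused = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):  # ranks start at 1
            fused[doc_id] += weight / (rank + k)
    # Rerank: highest combined score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

keyword_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]

unweighted = reciprocal_rank_fusion([keyword_hits, vector_hits])
favor_vector = reciprocal_rank_fusion([keyword_hits, vector_hits], weights=[1.0, 2.0])
```

With equal weights, document "a" comes first because it ranks high in both lists. Doubling the weight of the vector list moves "c", its top hit, narrowly ahead of "a".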

Figure: Steps for reciprocal rank fusion method for reranking

Example query translation

Let’s consider the example query “Best places to visit in Paris” to illustrate how reciprocal rank fusion is applied to the results of multiple algorithms to produce the most relevant results.

First, we will calculate the reciprocal ranking scores of the results obtained through keyword search, keeping the value of k as 60:

  1. Eiffel Tower = 1/(1+60) = 0.0164
  2. Louvre Museum = 1/(2+60) = 0.0161
  3. Notre-Dame Cathedral = 1/(3+60) = 0.0159

Next, we calculate the reciprocal ranking scores of the results obtained through semantic search:

  1. Montmartre = 1/(1+60) = 0.0164
  2. Eiffel Tower = 1/(2+60) = 0.0161
  3. Le Marais = 1/(3+60) = 0.0159
  4. Seine River Cruise = 1/(4+60) = 0.0156

Now, let’s add the scores and combine the results, with the highest score on top:

  1. Eiffel Tower = 0.0164 + 0.0161 = 0.0325
  2. Montmartre = 0.0164
  3. Louvre Museum = 0.0161
  4. Le Marais = 0.0159
  5. Notre-Dame Cathedral = 0.0159
  6. Seine River Cruise = 0.0156
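The arithmetic in this example can be checked with a short script. It is a sketch that recomputes the fused scores from the two ranked lists given above, with k = 60; the variable names are illustrative.

```python
from collections import defaultdict

K = 60  # same constant as in the worked example

keyword_results = ["Eiffel Tower", "Louvre Museum", "Notre-Dame Cathedral"]
semantic_results = ["Montmartre", "Eiffel Tower", "Le Marais", "Seine River Cruise"]

fused = defaultdict(float)
for results in (keyword_results, semantic_results):
    for rank, place in enumerate(results, start=1):
        fused[place] += 1 / (rank + K)  # reciprocal rank score, summed per place

final_ranking = sorted(fused.items(), key=lambda item: item[1], reverse=True)
for position, (place, score) in enumerate(final_ranking, start=1):
    print(f"{position}. {place} = {score:.4f}")
```

Eiffel Tower is the only place that appears in both lists, so it receives two contributions and ends up on top. Notre-Dame Cathedral and Le Marais both score 1/63, so their relative order depends on how ties are broken.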

While keyword search focuses on exact keywords, like sight-seeing locations, semantic search also considers individuals’ personal experiences, gathered from blogs or reviews. Combining both can give the best of both worlds to a user.

Use cases of RAG with RRF

Some popular use cases of RAG with RRF are:

  • E-commerce and retail: Customers don’t need to search for the exact product names, and can simply type what they want. Reranking helps produce results that are most useful to the user based on their search terms.
  • Healthcare and medical research: Getting search data based on keyword (exact search words) and semantic search (similar studies) gives the most relevant evidence for diagnosis or treatment.
  • Market analysis: Merging data from multiple sources (such as news reports, transcripts, blogs, and filings) with domain-specific data searches produces a more accurate and unified view of the financial data for analysis.
  • Recruitment process: RAG can speed up the recruitment process by combining the resume details, along with the interview transcripts of a person. RRF can then produce accurate ranking of multiple candidates based on the practical interview experience as well as the data from the resume.
