BlogRun AI wherever your compliance framework demands. Read blog >

BlogRetrieval accuracy is now a competitive advantage Read blog >

Better RAG Results With Reciprocal Rank Fusion (RRF) and Hybrid Search

Try Atlas for free

Key takeaways

Retrieval‑augmented generation (RAG) is a technique to fetch relevant data and combine it with a language model to give accurate, context‑aware answers.
A keyword search produces search results based on the given input keywords, whereas vector search uses semantic search to produce context aware results. A hybrid search combines both.
Reranking reorders results from multiple retrieval methods, ensuring the most relevant documents appear at the top.
RRF is an efficient reranking algorithm that combines rankings from multiple search methods into a single ranking, improving accuracy and relevance.
Some important use cases of RRF are healthcare support and research, product search, financial forecasting, and market analysis.

Table of contents

Search results using keyword search
Search results using vector search

Sparse vector search
Dense vector search

Hybrid search
Different types of retrieval methods

BM25
Vector search
Multi-query retrieval
Ensemble

Retrieval-augmented generation (RAG) using hybrid search

Why RRF?
How RRF works

Example query translation
Use cases of RAG with RRF
FAQs
Related content

Reciprocal rank fusion (RRF) is a popular algorithm to implement hybrid search. It aggregates rankings from multiple searches—for example, a keyword search, vector search, and so on—into a single ranking that is more accurate.

Before diving into RRF, let’s discuss the limitations posed by keyword search and vector search, when used individually.

Search results using keyword search

Keyword search solely relies on the keywords given in the user query. The search results are based only on the keyword matches and disregard any contextual relevance, particularly in the case of ambiguous queries. For example, if you search for reciprocal rank fusion using keyword search, it will return only those results that contain these exact keywords, but you may not get related yet contextually appropriate results, like reranking, retrieval-augmented generation, and hybrid search.

Search results using vector search

Vector search looks for semantically appropriate results, based on the KNN algorithm. There are two types of searches:

Sparse vector search

This method extends the keyword search by using weighted, high-dimensional sparse vectors that look for frequencies (less/more/absence/presence) of search terms. However, it does not fully capture the context or meaning of a search query.

Dense vector search

Dense vectors represent data (for example, text) as numerical vectors in multiple dimensions, where each dimension captures some detail of the data. These dense vectors are created using a deep learning model. When a query is fired, the similarity between the vectors is measured to identify which vector representations are closest to each other. The closer the vectors are, the more related they are contextually. This way, dense vector search gives semantically appropriate search results.

Hybrid search

Hybrid search combines various types of searches, like keyword and vector search, to produce multiple lists of relevant documents. These combined results can then be passed to a reranking module, which optimizes the final ranking order based on relevance signals or learned scoring. An example is shown below:

Different types of retrieval methods

From basic to more advanced retrieval methods, each one provides pros and cons and can be suitable for different use cases. Below is a quick overview of some important retrieval methodologies:

BM25

BM25 is the traditional keyword search algorithm, where the results are based on the exact input words given by the user. If you want an exact match result, like in a document search, BM25 is an excellent choice.

Vector search

Vector search retrieval captures semantic similarity between the input terms and tries to provide contextually relevant results. It uses dense vector embeddings to find close proximity words and improve the relevance of responses. It is quite useful for LLM-based question-answer systems and chatbots.

Multi-query retrieval

The multi-query retriever uses a large language model (LLM) to produce multiple versions or interpretations of the same user query. The search outcomes of each query are then aggregated to get a more comprehensive final query response. This can be useful for product recommendation systems, document retrievals, and academic research.

Ensemble

Ensemble retriever combines keyword search (sparse retriever like BM25) and dense retriever (vector search) to produce a list of relevant documents. It uses methods like reciprocal rank fusion to combine scores from multiple retrieval methods and provide the final ranking and unified result. Combining semantic matches with keyword matches produces more effective and accurate results. Ensemble retriever is a good choice for search engines, recommendation systems, and many more use cases.

Retrieval-augmented generation (RAG) using hybrid search

Using hybrid search greatly enhances RAG, as the results are more accurate compared to just keyword or semantic search. Reranking of search results using reciprocal rank fusion further ensures high quality results. The following diagram illustrates a hybrid search RAG pipeline that uses reranking.

Once the LLM receives a user prompt, it generates a query and sends both the query and the prompt to the data store to retrieve relevant documents. Since hybrid search is used, two separate ranked lists with scores are generated—one from keyword search and another from vector search. These scores are then combined, typically using a fusion technique such as reciprocal rank fusion, followed by reranking. The final set of results is returned to the LLM, which uses the information to generate a response for the end user.

Why RRF?

Implementing reciprocal rank fusion has several advantages. It combines the strength of various methods, including sparse and dense vector results. This improves the relevance of the results. An RRF algorithm assigns a reciprocal rank score based on the document ranks from multiple retrieval methods. This reduces hallucination and mitigates errors that might occur due to the use of individual methods, thus improving performance and reliability of search results. The accuracy and reliability is crucial in RAG applications like content summarization, information retrieval, and question answer systems.

How RRF works

As a first step, whenever a user fires a query, multiple searches are kicked off. It can be a keyword search, semantic search, or both. Each of these methods generates a ranking (of results). The next step is to calculate the reciprocal rank score of each of the generated results. The score is calculated as:

Code Snippet

k is a constant that helps in balancing the influence of individual rankings. The value of k decides the sensitivity to rank positions.

In the above formula, rank is the position of the document in the list.

In some implementations, different search strategies can be assigned different weights before combining their scores. This allows the system to favor one retrieval method—such as semantic search—over another, depending on the use case or domain requirements.

The next step is to combine the scores obtained from each strategy and sum them to obtain a single score. Then, rank the documents again (rerank), based on the combined score. Documents having higher scores will be placed on the top in the final ranking.

The final fused ranking is the list obtained from the above step, which is a blended result rather than the normalized result, a more accurate way to rank the results.

Steps for reciprocal rank fusion method for reranking

Example query translation

Let’s consider our previous example of Best places to visit in Paris, to illustrate how reciprocal rank fusion is applied to the results of multiple algorithms to produce the most relevant results.

First, we will calculate the reciprocal ranking scores of the results obtained through keyword search, keeping the value of k as 60:

Eiffel Tower = 1/(1+60) = 0.0164
Louvre Museum = 1/(2+60) = 0.0161
Notre-Dame Cathedral = 1/(3+60) = 0.0159

Next, we calculate the reciprocal ranking scores of the results obtained through semantic search:

Montmartre = 1/(1+60) = 0.0164
Eiffel Tower = 1/(2+60) = 0.0161
Le Marais = 1/(3+60) = 0.0159
Seine River Cruise = 1/(4+60) = 0.0156

Now, let’s add the scores and combine the results, with the highest score on top:

Eiffel Tower = 0.0164 + 0.0161 = 0.0325
Montmartre = 0.0164
Louvre Museum = 0.0161
Le Marais = 0.0158
Notre-Dame Cathedral = 0.0158

While keyword search focuses on exact keywords, like sight-seeing locations, semantic search also considers individuals’ personal experiences, gathered from blogs or reviews. Combining both can give the best of both worlds to a user.

Use cases of RAG with RRF

Some popular use cases of RAG with RRF are:

E-commerce and retail: Customers don’t need to search for the exact product names, and can simply type what they want. Reranking helps produce results that are most useful to the user based on his search terms.
Healthcare and medical research: Getting search data based on keyword (exact search words) and semantic search (similar studies) gives the most relevant evidence for diagnosis or treatment.
Market analysis: Using data from multiple sources—like news reports, transcripts, blogs, and filings—and merging it with domain-specific data searches, produces a much more accurate and unified view of the financial data for analysis.
Recruitment process: RAG can speed up the recruitment process by combining the resume details, along with the interview transcripts of a person. RRF can then produce accurate ranking of multiple candidates based on the practical interview experience as well as the data from the resume.

FAQs

RRF is a compound type of retriever that combines results obtained from different types of retrievers into one unified ranking.

Some practical use cases of RRF are in information retrieval in search engines, document retrieval, recommendation systems, and e-commerce search systems.

Reciprocal rank score is a metric used to evaluate the effectiveness of a search system to return relevant results. It is the inverse of the rank of the particular result of the query and is calculated as 1/(rank+constant k).

Get started with Atlas today

Get started in seconds. Our free clusters come with 512 MB of storage so you can play around with sample data and get oriented with our platform.

Try FreeContact sales

GET STARTED WITH:

125+ regions worldwide
Sample data sets
Always-on authentication
End-to-end encryption

Command line tools

Better RAG Results With Reciprocal Rank Fusion (RRF) and Hybrid Search

Key takeaways

Search results using keyword search