RRF Ranker
Reciprocal Rank Fusion (RRF) Ranker is a reranking strategy for Milvus hybrid search that balances results from multiple vector search paths based on their ranking positions rather than their raw similarity scores. Like a sports tournament that considers players' rankings rather than individual statistics, RRF Ranker combines search results based on how highly each item ranks in different search paths, creating a fair and balanced final ranking.
When to use RRF Ranker
RRF Ranker is specifically designed for hybrid search scenarios where you want to balance results from multiple vector search paths without assigning explicit importance weights. It's particularly effective for:
Use Case | Example | Why RRF Ranker Works Well |
---|---|---|
Multimodal search with equal importance | Image-text search where both modalities matter equally | Balances results without requiring arbitrary weight assignments |
Ensemble vector search | Combining results from different embedding models | Democratically merges rankings without favoring any particular model's scoring distribution |
Cross-lingual search | Finding documents across multiple languages | Ranks results fairly regardless of language-specific embedding characteristics |
Expert recommendations | Combining recommendations from multiple expert systems | Creates consensus rankings when different systems use incomparable scoring methods |
If your hybrid search application requires balancing multiple search paths democratically without assigning explicit weights, RRF Ranker is your ideal choice.
Mechanism of RRF Ranker
The main workflow of the RRFRanker strategy is as follows:
- Collect Search Rankings: Collect the rankings of results from each path of vector search (rank_1, rank_2).
- Merge Rankings: Convert the rankings from each path into scores (rank_rrf_1, rank_rrf_2) according to the following formula:

  $$RRF(d) = \sum_{i=1}^{N} \frac{1}{k + rank_i(d)}$$

  Here, N is the number of retrievers (search paths), rank_i(d) is the ranking position of document d generated by the i-th retriever, and k is a smoothing parameter typically set to 60.
- Aggregate Rankings: Re-rank the search results based on the combined scores to produce the final results.
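The three steps above can be sketched in plain Python. This is only an illustration of the RRF arithmetic, not Milvus's internal implementation; the function name `rrf_fuse` and its parameters are hypothetical, and the tie-break by ID is an assumption added to keep the output deterministic.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60.0, limit=None):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    ranked_lists: list of lists, each holding IDs ordered best-first.
    Returns (id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are 1-based: the best hit of each path has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; equal scores are ordered by ID (an assumption
    # of this sketch, not documented Milvus behavior).
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused if limit is None else fused[:limit]


# Two toy search paths that partly overlap.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "d"]], k=60)
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Items that appear in both lists ("b" and "c") accumulate two reciprocal terms, so they rise above items that appear in only one list.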
Example of RRF Ranker
This example demonstrates a Hybrid Search (topK=5) on sparse-dense vectors and illustrates how the RRFRanker strategy reranks the results from two ANN searches.
- Results of ANN search on sparse vectors of texts (topK=5):
ID | Rank (sparse) |
---|---|
101 | 1 |
203 | 2 |
150 | 3 |
198 | 4 |
175 | 5 |
- Results of ANN search on dense vectors of texts (topK=5):
ID | Rank (dense) |
---|---|
198 | 1 |
101 | 2 |
110 | 3 |
175 | 4 |
250 | 5 |
- Use RRF to rearrange the rankings of the two sets of search results. Assume that the smoothing parameter k is set at 60.
ID | Rank (sparse) | Rank (dense) | Final Score |
---|---|---|---|
101 | 1 | 2 | 1/(60+1)+1/(60+2) = 0.03252 |
198 | 4 | 1 | 1/(60+4)+1/(60+1) = 0.03202 |
175 | 5 | 4 | 1/(60+5)+1/(60+4) = 0.03101 |
203 | 2 | N/A | 1/(60+2) = 0.01613 |
150 | 3 | N/A | 1/(60+3) = 0.01587 |
110 | N/A | 3 | 1/(60+3) = 0.01587 |
250 | N/A | 5 | 1/(60+5) = 0.01538 |
- The final results after reranking (topK=5):
Rank | ID | Final Score |
---|---|---|
1 | 101 | 0.03252 |
2 | 198 | 0.03202 |
3 | 175 | 0.03101 |
4 | 203 | 0.01613 |
5 | 150 | 0.01587 |
IDs 150 and 110 tie at 0.01587, so either could take the fifth position; 150 is shown here.
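The arithmetic of this example can be checked with a few lines of Python. The snippet below is a minimal sketch that uses only the two ranked ID lists from the tables above and k=60; the variable names are illustrative. When two IDs have equal scores, the order here follows Python's stable sort, which is an assumption of this sketch and not a statement about how Milvus breaks ties.

```python
# Ranked IDs from the two ANN searches in the example, best first.
sparse = [101, 203, 150, 198, 175]
dense = [198, 101, 110, 175, 250]
k = 60

# Sum 1 / (k + rank) over every path in which an ID appears.
scores = {}
for ranking in (sparse, dense):
    for rank, doc_id in enumerate(ranking, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)

# Keep the five highest fused scores.
top5 = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:5]
for doc_id, score in top5:
    print(doc_id, round(score, 5))
```

IDs returned by both searches (101, 198, 175) receive two reciprocal terms and therefore outrank IDs returned by only one search.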
Usage of RRF Ranker
When using the RRF reranking strategy, you need to configure the parameter k. It is a smoothing parameter that controls how strongly a result's rank in each search path affects its fused score: a smaller k amplifies the advantage of top-ranked results, while a larger k flattens the differences between ranks. The default value is 60, and it can be adjusted within the range (0, 16384). The value should be a floating-point number. The recommended range is [10, 100]. While k=60 is a common choice, the optimal k value can vary depending on your specific applications and datasets. We recommend testing and adjusting this parameter based on your specific use case to achieve the best performance.
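To get a feel for how k shapes the ranking, the following sketch compares the score contribution of a rank-1 hit with that of a rank-10 hit for a few values of k. The helper name `top_rank_advantage` is hypothetical and exists only for this illustration.

```python
def top_rank_advantage(k, low_rank=10):
    """Ratio between the RRF contribution of rank 1 and of `low_rank`."""
    return (1.0 / (k + 1)) / (1.0 / (k + low_rank))


for k in (10, 60, 100):
    print(f"k={k}: rank 1 contributes {top_rank_advantage(k):.2f}x as much as rank 10")
```

The ratio shrinks as k grows, so a larger k makes the fused score less sensitive to exactly where an item ranks within a single path.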
Create an RRF Ranker
After your collection is set up with multiple vector fields, create an RRF Ranker with an appropriate smoothing parameter:
Python:

```python
from pymilvus import RRFRanker

ranker = RRFRanker(100)
```

Java:

```java
import io.milvus.v2.service.vector.request.ranker.RRFRanker;

RRFRanker ranker = new RRFRanker(100);
```

Go:

```go
ranker := milvusclient.NewRRFReranker().WithK(100)
```

NodeJS:

```javascript
ranker: RRFRanker("100")
```

cURL:

```json
"ranker": {
    "strategy": "rrf",
    "params": {
        "k": 100
    }
}
```

```bash
export ranker='{
    "strategy": "rrf",
    "params": {"k": 100}
}'
```
Apply to hybrid search
RRF Ranker is designed specifically for hybrid search operations that combine multiple vector fields. Here's how to use it in a hybrid search:
Python:

```python
from pymilvus import AnnSearchRequest

# Assumes milvus_client, collection_name, image_embedding, and ranker
# (created in the previous step) are already defined.

# Define text vector search request
text_search = AnnSearchRequest(
    data=["modern dining table"],
    anns_field="text_vector",
    param={},
    limit=10
)

# Define image vector search request
image_search = AnnSearchRequest(
    data=[image_embedding],  # Image embedding vector
    anns_field="image_vector",
    param={},
    limit=10
)

# Apply RRF Ranker to product hybrid search
# The smoothing parameter k controls the balance
hybrid_results = milvus_client.hybrid_search(
    collection_name,
    [text_search, image_search],  # Multiple search requests
    ranker=ranker,  # Apply the RRF ranker
    limit=10,
    output_fields=["product_name", "price", "category"]
)
```
For more information on hybrid search, refer to Multi-Vector Hybrid Search.