
Multi-Vector Hybrid Search

In many applications, data is represented using various fields like title and description, or through different modalities such as text, images, and audio. Hybrid search enhances accuracy by combining searches across these diverse fields. Zilliz Cloud supports this feature with multiple vector fields, conducting several approximate nearest neighbor (ANN) searches simultaneously. Multi-vector hybrid search typically yields better results than a single search. This approach is especially valuable in domains that demand a complex understanding of content, such as e-commerce search, academic paper search, and recommendation systems.


The multi-vector hybrid search integrates different search methods or spans embeddings from various modalities:

  • Sparse-Dense Vector Search: Dense vectors are excellent for capturing semantic relationships, while sparse vectors are highly effective for precise keyword matching. Hybrid search combines these approaches to provide both broad conceptual understanding and exact term relevance, thus improving search results. By leveraging the strengths of each method, hybrid search overcomes the limitations of individual approaches, offering better performance for complex queries.

  • Multimodal Vector Search: Multimodal vector search is a powerful technique that allows you to search across various data types, including text, images, audio, and others. The main advantage of this approach is its ability to unify different modalities into a seamless and cohesive search experience. For instance, in product search, a user might input a text query to find products described with both text and images. By combining these modalities through a hybrid search method, you can enhance search accuracy or enrich the search results.

Example

Let's consider a product search scenario where each product includes a text description and an image. Based on the available data, we can conduct three types of searches:

  • Semantic Text Search: This involves querying the text description of the product using dense vectors. Text embeddings can be generated using Transformer-based models such as BERT, or through services like OpenAI.

  • Full-Text Search: Here, we query the text description of the product using a keyword match with sparse vectors. Algorithms like BM25 or sparse embedding models such as BGE-M3 or SPLADE can be utilized for this purpose.

  • Multimodal Image Search: This method queries over the image using a text query with dense vectors. Image embeddings can be generated with models like CLIP.
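As a rough illustration, the following sketch shows how embeddings for these three searches might be generated with the sentence-transformers library. The model names all-mpnet-base-v2 (768-dimensional text embeddings) and clip-ViT-B-32 (512-dimensional multimodal embeddings) are assumptions for this example; any models whose output dimensions match your schema will work.

from sentence_transformers import SentenceTransformer
from PIL import Image

# Assumed models for this sketch; substitute any encoders whose output
# dimensions match your collection schema.
text_model = SentenceTransformer("all-mpnet-base-v2")  # 768-dim text embeddings
clip_model = SentenceTransformer("clip-ViT-B-32")      # 512-dim multimodal embeddings

description = "Red cotton t-shirt with round neck"
text_dense = text_model.encode(description).tolist()
image_dense = clip_model.encode(Image.open("tshirt.jpg")).tolist()  # hypothetical image file

# At query time, encode the text query with the same CLIP model so it lands
# in the same embedding space as the product images.
query_multimodal = clip_model.encode("white headphones").tolist()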

This guide will walk you through an example of a multimodal hybrid search combining the above search methods, given the raw text description and image embeddings of products. We will demonstrate how to store multi-vector data and perform hybrid searches with a reranking strategy.

Create a collection with multiple vector fields

The process of creating a collection involves three key steps: defining the collection schema, configuring the index parameters, and creating the collection.

Define schema

For multi-vector hybrid search, we should define multiple vector fields within a collection schema. By default, each collection can accommodate up to 4 vector fields. However, if necessary, you can contact us to include up to 10 vector fields in your collections.

This example incorporates the following fields into the schema:

  • id: Serves as the primary key for storing text IDs. This field is of data type INT64.

  • text: Used for storing textual content. This field is of the data type VARCHAR with a maximum length of 1000 bytes. The enable_analyzer option is set to True to facilitate full-text search.

  • text_dense: Used to store dense vectors of the texts. This field is of the data type FLOAT_VECTOR with a vector dimension of 768.

  • text_sparse: Used to store sparse vectors of the texts. This field is of the data type SPARSE_FLOAT_VECTOR.

  • image_dense: Used to store dense vectors of the product images. This field is of the data type FLOAT_VECTOR with a vector dimension of 512.

Since we will use the built-in BM25 algorithm to perform a full-text search on the text field, it is necessary to add the Milvus Function to the schema. For further details, please refer to Full Text Search.

from pymilvus import (
    MilvusClient, DataType, Function, FunctionType
)

client = MilvusClient(
    uri="YOUR_CLUSTER_ENDPOINT",
    token="YOUR_CLUSTER_TOKEN"
)

# Init schema with auto_id disabled
schema = MilvusClient.create_schema(auto_id=False)

# Add fields to schema
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True, description="product id")
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=1000, enable_analyzer=True, description="raw text of product description")
schema.add_field(field_name="text_dense", datatype=DataType.FLOAT_VECTOR, dim=768, description="text dense embedding")
schema.add_field(field_name="text_sparse", datatype=DataType.SPARSE_FLOAT_VECTOR, description="text sparse embedding auto-generated by the built-in BM25 function")
schema.add_field(field_name="image_dense", datatype=DataType.FLOAT_VECTOR, dim=512, description="image dense embedding")

# Add the built-in BM25 function that generates text_sparse from text
bm25_function = Function(
    name="text_bm25_emb",
    input_field_names=["text"],
    output_field_names=["text_sparse"],
    function_type=FunctionType.BM25,
)
schema.add_function(bm25_function)

Create index

from pymilvus import MilvusClient

# Prepare index parameters
index_params = client.prepare_index_params()

# Add indexes
index_params.add_index(
    field_name="text_dense",
    index_name="text_dense_index",
    index_type="AUTOINDEX",
    metric_type="IP"
)

index_params.add_index(
    field_name="text_sparse",
    index_name="text_sparse_index",
    index_type="AUTOINDEX",
    metric_type="BM25"
)

index_params.add_index(
    field_name="image_dense",
    index_name="image_dense_index",
    index_type="AUTOINDEX",
    metric_type="IP"
)
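These indexes all use AUTOINDEX, letting Zilliz Cloud select a suitable index type automatically. Note that the metric must match the field: the dense fields here use IP (inner product), while the sparse field populated by the built-in BM25 function must be indexed with the BM25 metric.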

Create collection

Create a collection named my_collection with the collection schema and indexes configured in the previous two steps.

from pymilvus import MilvusClient

client.create_collection(
    collection_name="my_collection",
    schema=schema,
    index_params=index_params
)

Insert data

This section inserts data into the my_collection collection based on the schema defined earlier. During insert, ensure all fields, except those with auto-generated values, are provided with data in the correct format. In this example:

  • id: an integer representing the product ID

  • text: a string containing the product description

  • text_dense: a list of 768 floating-point values representing the dense embedding of the text description

  • image_dense: a list of 512 floating-point values representing the dense embedding of the product image

You may use the same or different models to generate dense embeddings for each field. In this example, the two dense embeddings have different dimensions, suggesting they were generated by different models. When defining each search later, be sure to use the corresponding model to generate the appropriate query embedding.

Since this example uses the built-in BM25 function to generate sparse embeddings from the text field, you do not need to supply sparse vectors manually. However, if you opt not to use BM25, you must precompute and provide the sparse embeddings yourself.
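For reference, a minimal sketch of what manually supplied sparse vectors would look like, assuming a schema without the BM25 function. pymilvus accepts sparse vectors as dictionaries mapping dimension indices to values; the index-value pairs below are placeholders standing in for the output of a sparse embedding model such as BGE-M3 or SPLADE.

# Only applicable if the schema does NOT define a BM25 function for text_sparse.
data = [
    {
        "id": 0,
        "text": "Red cotton t-shirt with round neck",
        "text_dense": [0.3580376395471989, -0.6023495712049978, ...],
        "text_sparse": {1734: 0.35, 5612: 0.42, 9021: 0.18},  # placeholder index-value pairs
        "image_dense": [0.6366019600530924, -0.09323198122475052, ...]
    }
]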

from pymilvus import MilvusClient

data = [
    {
        "id": 0,
        "text": "Red cotton t-shirt with round neck",
        "text_dense": [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, ...],
        "image_dense": [0.6366019600530924, -0.09323198122475052, ...]
    },
    {
        "id": 1,
        "text": "Wireless noise-cancelling over-ear headphones",
        "text_dense": [0.19886812562848388, 0.06023560599112088, 0.6976963061752597, ...],
        "image_dense": [0.6414180010301553, 0.8976979978567611, ...]
    },
    {
        "id": 2,
        "text": "Stainless steel water bottle, 500ml",
        "text_dense": [0.43742130801983836, -0.5597502546264526, 0.6457887650909682, ...],
        "image_dense": [-0.6901259768402174, 0.6100500332193755, ...]
    }
]

res = client.insert(
    collection_name="my_collection",
    data=data
)

Create multiple AnnSearchRequest instances

Hybrid Search is implemented by creating multiple AnnSearchRequest instances and passing them to the hybrid_search() method, where each AnnSearchRequest represents a basic ANN search request for a specific vector field. Therefore, before conducting a Hybrid Search, you need to create an AnnSearchRequest for each vector field.

In addition, by configuring the expr parameter in an AnnSearchRequest, you can set the filtering conditions for your hybrid search. Please refer to Filtered Search and Filtering.
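For example, a minimal sketch of attaching a filter to one sub-search; the expression id > 0 is purely illustrative, and query_dense_vector refers to the query embedding defined in the snippet further below.

from pymilvus import AnnSearchRequest

# Illustrative filtered sub-search: only entities with id > 0 are considered.
filtered_request = AnnSearchRequest(
    data=[query_dense_vector],   # query embedding, defined in the next snippet
    anns_field="text_dense",
    param={"nprobe": 10},
    limit=2,
    expr="id > 0"
)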

📘Notes

In a Hybrid Search, each AnnSearchRequest supports only a single query entry: one query vector, or one query text for full-text search.

To demonstrate searches over the different vector fields, we will construct three AnnSearchRequest instances from a sample query, using its pre-computed dense embeddings. The search requests will target the following vector fields:

  • text_dense for semantic text search, allowing for contextual understanding and retrieval based on meaning rather than direct keyword matching.

  • text_sparse for full-text search or keyword matching, focusing on exact word or phrase matches within the text.

  • image_dense for multimodal text-to-image search, to retrieve relevant product images based on the semantic content of the query.

from pymilvus import AnnSearchRequest

query_text = "white headphones, quiet and comfortable"
query_dense_vector = [0.3580376395471989, -0.6023495712049978, 0.5142999509918703, ...]
query_multimodal_vector = [0.015829865178701663, 0.5264158340734488, ...]

# text semantic search (dense)
search_param_1 = {
    "data": [query_dense_vector],
    "anns_field": "text_dense",
    "param": {"nprobe": 10},
    "limit": 2
}
request_1 = AnnSearchRequest(**search_param_1)

# full-text search (sparse)
search_param_2 = {
    "data": [query_text],
    "anns_field": "text_sparse",
    "param": {"drop_ratio_search": 0.2},
    "limit": 2
}
request_2 = AnnSearchRequest(**search_param_2)

# text-to-image search (multimodal)
search_param_3 = {
    "data": [query_multimodal_vector],
    "anns_field": "image_dense",
    "param": {"nprobe": 10},
    "limit": 2
}
request_3 = AnnSearchRequest(**search_param_3)

reqs = [request_1, request_2, request_3]

Given that the parameter limit is set to 2, each AnnSearchRequest returns 2 search results. In this example, 3 AnnSearchRequest instances are created, resulting in a total of 6 search results.

Configure a reranking strategy

To merge and rerank the sets of ANN search results, selecting an appropriate reranking strategy is essential. Zilliz Cloud offers two types of reranking strategies:

  • WeightedRanker: Use this strategy if the results need to emphasize a particular vector field. WeightedRanker allows you to assign greater weight to certain vector fields, highlighting them more prominently (see the sketch after this list).

  • RRFRanker (Reciprocal Rank Fusion Ranker): Choose this strategy when no specific emphasis is required. RRFRanker effectively balances the importance of each vector field.
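For comparison, a minimal sketch of the WeightedRanker alternative; the weights below are illustrative and apply to the requests in the order they are later passed to hybrid_search().

from pymilvus import WeightedRanker

# Illustrative weights for [request_1, request_2, request_3]:
# favor semantic text search, then full-text search, then image search.
ranker = WeightedRanker(0.6, 0.25, 0.15)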

For more details on these reranking mechanisms, please refer to Reranking.

In this example, since there is no particular emphasis on specific search queries, we will proceed with the RRFRanker strategy.

from pymilvus import RRFRanker

ranker = RRFRanker(100)
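The argument 100 is the smoothing constant k in the reciprocal rank fusion formula 1/(k + rank): larger values of k reduce the influence of top-ranked positions, producing a more even blend across the three result sets. If omitted, pymilvus defaults to k=60.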

Before initiating a Hybrid Search, ensure that the collection is loaded. If any vector fields within the collection lack an index or are not loaded into memory, an error will occur upon executing the Hybrid Search method.
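If the collection is not already loaded, a minimal sketch of loading it explicitly:

client.load_collection(collection_name="my_collection")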

from pymilvus import MilvusClient

res = client.hybrid_search(
    collection_name="my_collection",
    reqs=reqs,
    ranker=ranker,
    limit=2
)
for hits in res:
    print("TopK results:")
    for hit in hits:
        print(hit)

The following is the output:

["['id: 1, distance: 0.006047376897186041, entity: {}', 'id: 2, distance: 0.006422005593776703, entity: {}']"]

With limit=2 specified for the Hybrid Search, Zilliz Cloud reranks the six results obtained from the three searches and ultimately returns only the top two most similar results.