Version: User Guides (Cloud)

Basic ANN Search

Using an index file that records the sorted order of vector embeddings, an Approximate Nearest Neighbor (ANN) search locates a subset of vector embeddings based on the query vector carried in a search request, compares the query vector with those in the subset, and returns the most similar results. With ANN search, Zilliz Cloud provides an efficient search experience. This page explains how to conduct basic ANN searches.

Overview

ANN and k-Nearest Neighbors (kNN) searches are the two common methods of vector similarity search. In a kNN search, you must compare every vector in the vector space with the query vector carried in the search request before you can determine the most similar ones, which is time-consuming and resource-intensive.
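
To make the cost of an exhaustive search concrete, here is a minimal sketch of a brute-force kNN search in plain Python. The helper names (`inner_product`, `knn_search`) and the sample vectors are hypothetical and exist only for illustration; the point is that every stored vector is scored against the query.

```python
import heapq

def inner_product(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def knn_search(vectors, query, k):
    """Exhaustive search: score every vector, then keep the k best."""
    scored = [(inner_product(v, query), i) for i, v in enumerate(vectors)]
    # A greater inner product means a higher similarity
    return heapq.nlargest(k, scored)

# Hypothetical sample data
vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
top = knn_search(vectors, [1.0, 0.0], k=2)
print(top)
```

The work grows linearly with the number of stored vectors, which is why this approach becomes impractical for large collections.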

Unlike a kNN search, an ANN search requires an index file that records the sorted order of vector embeddings. When a search request comes in, the index file serves as a reference for quickly locating a subgroup that probably contains the vector embeddings most similar to the query vector. The specified metric type is then used to measure the similarity between the query vector and those in the subgroup, sort the group members by their similarity to the query vector, and pick the top-K group members.
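
The idea of locating a subgroup first and comparing only within it can be sketched with a toy bucket index. This is an illustration under stated assumptions, not a description of the index Zilliz Cloud actually builds: the fixed `centroids`, the `nprobe` parameter, and all function names are hypothetical.

```python
import heapq

def inner_product(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def build_index(vectors, centroids):
    """Assign each vector ID to the bucket of its most similar centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        best = max(range(len(centroids)),
                   key=lambda c: inner_product(v, centroids[c]))
        buckets[best].append(i)
    return buckets

def ann_search(vectors, centroids, buckets, query, k, nprobe=1):
    """Score only the vectors in the nprobe buckets most similar to the query."""
    probe = heapq.nlargest(nprobe, range(len(centroids)),
                           key=lambda c: inner_product(query, centroids[c]))
    candidates = [i for c in probe for i in buckets[c]]
    scored = [(inner_product(vectors[i], query), i) for i in candidates]
    return heapq.nlargest(k, scored)

# Hypothetical sample data and hand-picked centroids
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0]]
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
buckets = build_index(vectors, centroids)
top = ann_search(vectors, centroids, buckets, [1.0, 0.0], k=2)
print(buckets)
print(top)
```

Only the vectors in the probed bucket are compared with the query. If a true nearest neighbor sits in a bucket that is not probed, it is missed, which is where the "approximate" in ANN comes from.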

ANN searches depend on pre-built indexes, and search throughput, memory usage, and search correctness vary with the index type you choose. You need to balance search performance against correctness.

To flatten the learning curve, Zilliz Cloud provides AUTOINDEX. With AUTOINDEX, Zilliz Cloud analyzes the data distribution within your collection while building the index and sets optimized index parameters based on that analysis to strike a balance between search performance and correctness.

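For details on AUTOINDEX and applicable metric types, refer to AUTOINDEX Explained and Metric Types. The rest of this page covers the following topics: searches with a single query vector, searches with multiple query vectors, ANN search in a partition, output fields, limit and offset, the level parameter, and the recall rate.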

Single-Vector Search

In ANN searches, a single-vector search refers to a search that involves only one query vector. Based on the pre-built index and the metric type carried in the search request, Zilliz Cloud finds the top-K vectors most similar to the query vector.

In this section, you will learn how to conduct a single-vector search. The code snippet assumes you have created a collection in a quick-setup manner. The search request carries a single query vector and asks Zilliz Cloud to use Inner Product (IP) to calculate the similarity between the query vector and the vectors in the collection and to return the three most similar ones.

from pymilvus import MilvusClient

# Connect to the cluster
client = MilvusClient(
    uri="YOUR_CLUSTER_ENDPOINT",
    token="YOUR_CLUSTER_TOKEN"
)

# Single-vector search
query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]
res = client.search(
    collection_name="my_collection",
    anns_field="vector",
    data=[query_vector],  # A list holding one query vector
    limit=3,  # The number of results to return
    search_params={"metric_type": "IP"}
)

# res holds one list of hits per query vector
for hits in res:
    for hit in hits:
        print(hit)

# [
#     [
#         {
#             "id": 551,
#             "distance": 0.08821295201778412,
#             "entity": {}
#         },
#         {
#             "id": 296,
#             "distance": 0.0800950899720192,
#             "entity": {}
#         },
#         {
#             "id": 43,
#             "distance": 0.07794742286205292,
#             "entity": {}
#         }
#     ]
# ]

Zilliz Cloud ranks the search results by their similarity to the query vector, from the most similar to the least similar. The similarity score is also termed the distance to the query vector, and its value range varies with the metric type in use.

The following table lists the applicable metric types and the corresponding distance ranges.

| Metric Type | Characteristics | Distance Range |
|---|---|---|
| L2 | A smaller value indicates a higher similarity. | [0, ∞) |
| IP | A greater value indicates a higher similarity. | [-1, 1] |
| COSINE | A greater value indicates a higher similarity. | [-1, 1] |
| JACCARD | A smaller value indicates a higher similarity. | [0, 1] |
| HAMMING | A smaller value indicates a higher similarity. | [0, dim(vector)] |
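
To see where these ranges come from, the sketch below computes each metric with its textbook definition in plain Python. The function names and sample vectors are hypothetical, and the formulas are not necessarily the exact ones Zilliz Cloud evaluates internally. Note that the inner product stays within [-1, 1] only when both vectors are normalized to unit length; for unnormalized vectors it is unbounded, as the example shows.

```python
import math

def l2(a, b):
    """Euclidean distance: 0 for identical vectors, grows without bound."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ip(a, b):
    """Inner product: within [-1, 1] only for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: the inner product after normalizing both vectors."""
    return ip(a, b) / (math.sqrt(ip(a, a)) * math.sqrt(ip(b, b)))

def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def jaccard(a, b):
    """Jaccard distance between two binary (0/1) vectors."""
    intersection = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return 1 - intersection / union if union else 0.0

v = [3.0, 4.0]  # Not normalized: its length is 5
print(l2(v, v), ip(v, v), cosine(v, v))

b1 = [1, 0, 1, 1]
b2 = [1, 1, 0, 1]
print(hamming(b1, b2), jaccard(b1, b2))
```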

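Bulk-Vector Search

Similarly, you can include multiple query vectors in a search request. Zilliz Cloud conducts ANN searches for the query vectors in parallel and returns one set of results per query vector. The following request carries two query vectors, so it returns two sets of results.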

# Search with multiple query vectors
# 1. Prepare the query vectors
query_vectors = [
    [0.041732933, 0.013779674, -0.027564144, -0.013061441, 0.009748648],
    [0.0039737443, 0.003020432, -0.0006188639, 0.03913546, -0.00089768134]
]

# 2. Start the search
res = client.search(
    collection_name="my_collection",
    data=query_vectors,
    limit=3,
)

# res holds one list of hits per query vector, in the same order
for hits in res:
    print("TopK results:")
    for hit in hits:
        print(hit)

# Output
#
# [
#     [
#         {
#             "id": 551,
#             "distance": 0.08821295201778412,
#             "entity": {}
#         },
#         {
#             "id": 296,
#             "distance": 0.0800950899720192,
#             "entity": {}
#         },
#         {
#             "id": 43,
#             "distance": 0.07794742286205292,
#             "entity": {}
#         }
#     ],
#     [
#         {
#             "id": 730,
#             "distance": 0.04431751370429993,
#             "entity": {}
#         },
#         {
#             "id": 333,
#             "distance": 0.04231833666563034,
#             "entity": {}
#         },
#         {
#             "id": 232,
#             "distance": 0.04221535101532936,
#             "entity": {}
#         }
#     ]
# ]

ANN Search in Partition

If you have created multiple partitions in a collection and know which of them hold the entities of interest, you can include the target partition names in the search request to restrict the search scope to those partitions. Reducing the number of partitions involved in the search improves search performance.

The following code snippet assumes a partition named partitionA in your collection.

# Single-vector search restricted to one partition
query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]
res = client.search(
    collection_name="my_collection",
    partition_names=["partitionA"],  # Search only in these partitions
    data=[query_vector],
    limit=3,
)

for hits in res:
    print("TopK results:")
    for hit in hits:
        print(hit)

# [
#     [
#         {
#             "id": 551,
#             "distance": 0.08821295201778412,
#             "entity": {}
#         },
#         {
#             "id": 296,
#             "distance": 0.0800950899720192,
#             "entity": {}
#         },
#         {
#             "id": 43,
#             "distance": 0.07794742286205292,
#             "entity": {}
#         }
#     ]
# ]

Use Output Fields

By default, a search result includes the primary field values and the similarity distances/scores of the entities that contain the top-K vector embeddings. To make the search results carry values from other fields in these entities, include the names of the target fields, whether vector or scalar, in the search request as output fields.

# Single-vector search that also returns the "color" field
query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

res = client.search(
    collection_name="quick_setup",
    data=[query_vector],
    limit=3,  # The number of results to return
    search_params={"metric_type": "IP"},
    output_fields=["color"]  # Extra fields to return with each hit
)

print(res)

# [
#     [
#         {
#             "id": 551,
#             "distance": 0.08821295201778412,
#             "entity": {
#                 "color": "orange_6781"
#             }
#         },
#         {
#             "id": 296,
#             "distance": 0.0800950899720192,
#             "entity": {
#                 "color": "red_4794"
#             }
#         },
#         {
#             "id": 43,
#             "distance": 0.07794742286205292,
#             "entity": {
#                 "color": "grey_8510"
#             }
#         }
#     ]
# ]
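
Once the output fields are part of the result, you will often want them as flat rows. The following is a minimal sketch that works on the result shape shown in the output above, reproduced here as plain Python lists and dictionaries so that it runs without a cluster. The helper name `flatten_hits` is hypothetical.

```python
# Sample result copied from the output above, as plain lists and dicts
res = [
    [
        {"id": 551, "distance": 0.08821295201778412, "entity": {"color": "orange_6781"}},
        {"id": 296, "distance": 0.0800950899720192, "entity": {"color": "red_4794"}},
        {"id": 43, "distance": 0.07794742286205292, "entity": {"color": "grey_8510"}},
    ]
]

def flatten_hits(res, fields):
    """Turn nested search results into one flat row per hit."""
    rows = []
    for query_index, hits in enumerate(res):
        for hit in hits:
            row = {
                "query": query_index,  # Which query vector produced this hit
                "id": hit["id"],
                "distance": hit["distance"],
            }
            for field in fields:
                # A field that was not requested as an output field is None
                row[field] = hit["entity"].get(field)
            rows.append(row)
    return rows

rows = flatten_hits(res, ["color"])
for row in rows:
    print(row)
```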

Use Limit and Offset

You may notice that the parameter limit carried in the search requests determines the number of entities to include in the search results. This parameter specifies the maximum number of entities to return in a single search, and it is usually termed top-K.

If you wish to perform paginated searches, you can use a loop to send multiple search requests, with the limit and offset parameters carried in each request. Specifically, set limit to the number of entities you want to include in the current search results, and set offset to the total number of entities that have already been returned.

The table below outlines how to set the limit and offset parameters for paginated searches that return 100 entities at a time.

| Query | Entities to return per query (limit) | Entities already returned in total (offset) |
|---|---|---|
| The 1st query | 100 | 0 |
| The 2nd query | 100 | 100 |
| The 3rd query | 100 | 200 |
| The nth query | 100 | 100 x (n-1) |

Note that the sum of limit and offset in a single ANN search should be less than 16,384.

# Single-vector search that skips the first 10 results
query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

res = client.search(
    collection_name="quick_setup",
    data=[query_vector],
    limit=3,  # The number of results to return
    search_params={
        "metric_type": "IP",
        "offset": 10  # The records to skip
    }
)
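
The pagination scheme described above can be sketched as a small generator that produces the limit and offset for each page and stops before their sum reaches 16,384. The helper name `page_params` is hypothetical, and the commented-out `client.search` call assumes the client, collection, and query vector from the earlier snippets.

```python
MAX_WINDOW = 16384  # limit + offset must stay below this value

def page_params(page_size, num_pages, metric_type="IP"):
    """Yield the search keyword arguments for each page of results."""
    for page in range(num_pages):
        offset = page * page_size  # Entities already returned in total
        if page_size + offset >= MAX_WINDOW:
            break  # The next page would exceed the allowed window
        yield {
            "limit": page_size,
            "search_params": {"metric_type": metric_type, "offset": offset},
        }

pages = list(page_params(page_size=100, num_pages=3))
for kwargs in pages:
    print(kwargs)
    # With a live cluster, each page could be fetched with something like:
    # res = client.search(collection_name="quick_setup", data=[query_vector], **kwargs)
```

For result sets that exceed this window, the search iterator mentioned later on this page is the better fit.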

Use Level

To simplify search optimization, Zilliz Cloud provides a parameter named level that controls the search precision.

This parameter ranges from 1 to 10 and defaults to 1. Increasing the value improves the recall rate at the cost of search performance. In common cases, the default value yields a recall rate of up to 90%. You can increase the value as required.

📘Notes

The level parameter is still in Public Preview. If you cannot set it to a value greater than 5, your cluster may not fully support this feature. As a workaround, you can set it to a value within the range from 1 to 5 instead, or contact us at support@zilliz.com.

# Single-vector search with the highest precision level
query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

res = client.search(
    collection_name="quick_setup",
    data=[query_vector],
    limit=3,  # The number of results to return
    search_params={
        "params": {
            "level": 10  # The precision control
        }
    }
)

Get Recall Rate

You can set enable_recall_calculation to True when you tweak the level parameter so that you can evaluate the precision of your searches with different level values.

📘Notes

The enable_recall_calculation parameter is still in Public Preview, and you might not be able to use it due to compatibility issues. For any assistance, please contact us at support@zilliz.com.

# Single-vector search that also reports the recall rate
query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

res = client.search(
    collection_name="quick_setup",
    data=[query_vector],
    limit=3,  # The number of results to return
    search_params={
        "params": {
            "level": 10,  # The precision control
            "enable_recall_calculation": True  # Ask to return recall rate
        }
    }
)
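
When comparing several level values, it can help to build the search parameters programmatically. The sketch below uses a hypothetical helper, `build_search_params`, that validates the level against the documented range of 1 to 10 and produces a dictionary in the same shape as the snippets above. It does not contact a cluster.

```python
def build_search_params(level=1, with_recall=False):
    """Build a search_params dict for a given precision level."""
    if not 1 <= level <= 10:
        raise ValueError("level must be between 1 and 10")
    params = {"level": level}
    if with_recall:
        # Ask the server to return the recall rate with the results
        params["enable_recall_calculation"] = True
    return {"params": params}

# One search_params dict per level to compare
sweep = [build_search_params(level=lv, with_recall=True) for lv in (1, 5, 10)]
for search_params in sweep:
    print(search_params)
    # With a live cluster, each setting could be tried with something like:
    # res = client.search(collection_name="quick_setup", data=[query_vector],
    #                     limit=3, search_params=search_params)
```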

Enhancing ANN Search

AUTOINDEX considerably flattens the learning curve of ANN searches. However, the search results may not always be correct as the top-K increases. To address this, Zilliz Cloud offers the following search enhancements, which reduce the search scope, improve search result relevancy, or diversify the search results.

  • Filtered Search

    You can include filtering conditions in a search request so that Zilliz Cloud conducts metadata filtering before conducting ANN searches, reducing the search scope from the whole collection to only the entities matching the specified filtering conditions.

    For more about metadata filtering and filtering conditions, refer to Filtered Search and Filtering.

  • Range Search

    You can improve search result relevancy by restricting the distance or score of the returned entities within a specific range. In Zilliz Cloud, a range search involves drawing two concentric circles with the vector embedding most similar to the query vector as the center. The search request specifies the radius of both circles, and Zilliz Cloud returns all vector embeddings that fall within the outer circle but not the inner circle.

    For more about range search, refer to Range Search.

  • Grouping Search

    If the returned entities hold the same value in a specific field, the search results may not represent the distribution of all vector embeddings in the vector space. To diversify the search results, consider using the grouping search.

    For more about grouping search, refer to Grouping Search.

  • Hybrid Search

    A collection can include up to four vector fields to save the vector embeddings generated using different embedding models. By doing so, you can use a hybrid search to rerank the search results from these vector fields, improving the recall rate.

    For more about hybrid search, refer to Hybrid Search.

  • Search Iterator

    A single ANN search returns a maximum of 16,384 entities. Consider using search iterators if you need to retrieve more entities than that.

    For details on search iterators, refer to Search Iterator.

  • Use Partition Key

    Involving multiple scalar fields in metadata filtering and using a rather complicated filtering condition may affect search efficiency. If you set a scalar field as the partition key and use a filtering condition involving the partition key in the search request, Zilliz Cloud can restrict the search scope to the partitions corresponding to the specified partition key values.

    For details on the partition key, refer to Use Partition Key.

  • Use mmap

    For details on mmap settings, refer to Use mmap.
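
To illustrate two of the enhancements above, the following client-side sketch mimics the effect of a range search and a grouping search on a list of hits. It is only an illustration: Zilliz Cloud applies these enhancements on the server, the sample hits are made up, and the helper names as well as the exact inclusive and exclusive bounds are assumptions. Refer to Range Search and Grouping Search for the actual request parameters.

```python
def range_filter_hits(hits, radius, range_filter, smaller_is_closer):
    """Keep hits whose distance falls in the ring between the two bounds."""
    if smaller_is_closer:
        # L2-style metrics: range_filter is the inner bound, radius the outer
        return [h for h in hits if range_filter <= h["distance"] < radius]
    # IP/COSINE-style metrics: radius is the lower score, range_filter the upper
    return [h for h in hits if radius < h["distance"] <= range_filter]

def group_hits(hits, field):
    """Keep only the best-ranked hit for each distinct value of a field."""
    seen = set()
    kept = []
    for hit in hits:  # hits are assumed to be sorted, best first
        key = hit["entity"].get(field)
        if key not in seen:
            seen.add(key)
            kept.append(hit)
    return kept

# Hypothetical hits ranked by inner product, best first
hits = [
    {"id": 1, "distance": 0.95, "entity": {"color": "red"}},
    {"id": 2, "distance": 0.80, "entity": {"color": "red"}},
    {"id": 3, "distance": 0.60, "entity": {"color": "blue"}},
    {"id": 4, "distance": 0.30, "entity": {"color": "green"}},
]

in_ring = range_filter_hits(hits, radius=0.5, range_filter=0.9, smaller_is_closer=False)
print([h["id"] for h in in_ring])

grouped = group_hits(hits, "color")
print([h["id"] for h in grouped])
```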