Index Scalar Fields

Zilliz Cloud supports indexing scalar fields (non-vector fields) to speed up filtering and filtered search, especially on large datasets.

Overview

Indexing a scalar field is optional, but recommended if you frequently access a specific scalar field in filter conditions.
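To see why an index helps a frequently used filter, the sketch below contrasts a full scan with a lookup against a sorted structure. It is a conceptual illustration in plain Python only: the `rows` data and the helper names `scan_filter` and `indexed_filter` are hypothetical, and nothing here describes how Zilliz Cloud implements AUTOINDEX internally.

```python
import bisect

# Hypothetical (product_id, price) rows standing in for a collection.
rows = [(101, 19.99), (102, 5.49), (103, 42.00), (104, 12.50)]

def scan_filter(rows, max_price):
    """Without an index: inspect every row to evaluate `price < max_price`."""
    return sorted(pid for pid, price in rows if price < max_price)

# A toy "index": rows sorted by price, built once and reused by every query.
sorted_index = sorted((price, pid) for pid, price in rows)
prices = [price for price, _ in sorted_index]

def indexed_filter(max_price):
    """With the toy index: binary-search the cut-off, then slice the matches."""
    cut = bisect.bisect_left(prices, max_price)
    return sorted(pid for _, pid in sorted_index[:cut])

print(scan_filter(rows, 15.0))
print(indexed_filter(15.0))
```

Both functions return the same product IDs. The difference is that the scan touches every row on every query, while the sorted structure pays a one-time build cost and then locates the boundary in logarithmic time.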

Zilliz Cloud supports AUTOINDEX for the following field types:

| Field Type | Description |
| --- | --- |
| VARCHAR | String |
| INT8, INT32, INT64 | Integer |
| FLOAT, DOUBLE | Floating point |
| BOOL | Boolean |
| ARRAY | Homogeneous array of scalar values |

Preparations

Before creating indexes, define a collection that includes both vector and scalar fields. Zilliz Cloud requires a vector field in every collection.

In this example, we define a schema for a product catalog, including a required vector field (vector) and a scalar field of the DOUBLE type (price):

from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="YOUR_CLUSTER_ENDPOINT") # Replace with your cluster endpoint

# Define schema with dynamic field support
schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True  # Enable dynamic field
)

# Define fields
schema.add_field(field_name="product_id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5) # Vector field
schema.add_field(field_name="price", datatype=DataType.DOUBLE) # Scalar field

# Create the collection
client.create_collection(
    collection_name="product_catalog",
    schema=schema
)

Index a scalar field

You can create an index on a scalar field using AUTOINDEX. No additional index parameters are needed. The example below creates an index on the price field:

# Prepare an empty IndexParams object; no index parameters need to be specified here
index_params = client.prepare_index_params()

index_params.add_index(
    field_name="price",       # Name of the scalar field to be indexed
    index_type="AUTOINDEX",   # Type of index to be created
    index_name="price_index"  # Name of the index to be created
)

After defining the index parameters, you can apply them to the collection using create_index():

client.create_index(
    collection_name="product_catalog",
    index_params=index_params
)

Check index details

Once you have created an index, you can list the indexes on the collection and check the details of a specific index.

# List the indexes in the collection
res = client.list_indexes(
    collection_name="product_catalog"
)

print(res)

# Describe a specific index
res = client.describe_index(
    collection_name="product_catalog",
    index_name="price_index"
)

print(res)

Drop an index

You can drop an index when it is no longer needed.

# Drop index
client.drop_index(
    collection_name="product_catalog",
    index_name="price_index"
)

Advanced features

There are also several advanced features around scalar indexes that you may be interested in.

NGRAM

The `NGRAM` index in Zilliz Cloud is built to accelerate `LIKE` queries on `VARCHAR` fields or specific JSON paths within `JSON` fields. Before building the index, Zilliz Cloud splits text into short, overlapping substrings of a fixed length n, known as n-grams. For example, with n = 3, the word "Milvus" is split into 3-grams "Mil", "ilv", "lvu", and "vus". These n-grams are then stored in an inverted index that maps each gram to the document IDs in which it appears. At query time, this index allows Zilliz Cloud to quickly narrow the search to a small set of candidates, resulting in much faster query execution.
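The idea described above can be sketched in plain Python. This is a toy model, not the Zilliz Cloud implementation: the helper names `ngrams`, `build_index`, and `like_contains` and the sample `docs` are hypothetical, and the sketch only emulates an infix match of the form `LIKE '%pattern%'`. Each query intersects the posting sets of the pattern's n-grams to get candidates, then verifies the candidates against the original text.

```python
from collections import defaultdict

def ngrams(text, n=3):
    """Return the overlapping substrings of length n, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_index(docs, n=3):
    """Build an inverted index mapping each n-gram to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index

def like_contains(docs, index, pattern, n=3):
    """Emulate LIKE '%pattern%': narrow to candidates via the index, then verify."""
    grams = ngrams(pattern, n)
    if not grams:
        # Pattern is shorter than n, so the index cannot help; scan everything.
        candidates = set(docs)
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Sharing every n-gram does not guarantee a match, so check the actual text.
    return sorted(doc_id for doc_id in candidates if pattern in docs[doc_id])

docs = {1: "Milvus", 2: "Zilliz Cloud", 3: "Milvus Lite"}
index = build_index(docs)

print(ngrams("Milvus"))
print(like_contains(docs, index, "ilvu"))
```

The verification step matters because the index only proves that every n-gram of the pattern occurs somewhere in a document, not that the n-grams occur contiguously and in order.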