
Index Scalar Fields

Zilliz Cloud supports indexing on scalar fields (non-vector fields) to significantly accelerate filtering and search performance, especially on large datasets.

📘Notes

Indexing scalar fields is optional, but highly recommended for fields that are frequently used in filter conditions.

What you can index

Zilliz Cloud supports AUTOINDEX for the following field types:

Field Type            Description
VARCHAR               String
INT8, INT32, INT64    Integer
FLOAT, DOUBLE         Floating point
BOOL                  Boolean
ARRAY                 Homogeneous array of scalar values
JSON                  Schema-defined or dynamic field (with specific path targeting)

📘Notes

Indexing an entire JSON object is not supported. You must specify a path to a scalar value within the JSON field. For more information, refer to JSON Field.

Define a collection schema

Before creating indexes, define a collection that includes both vector and scalar fields. Zilliz Cloud requires a vector field in every collection.

In this example, we define a schema for a product catalog, including scalar fields, a JSON metadata field, and a required vector field, with the dynamic field feature enabled:

from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="YOUR_CLUSTER_ENDPOINT") # Replace with your cluster endpoint

# Define schema with dynamic field support
schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True  # Enable dynamic field
)

# Required fields
schema.add_field(field_name="product_id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)

# Scalar and JSON fields
schema.add_field(field_name="price", datatype=DataType.DOUBLE)
schema.add_field(field_name="metadata", datatype=DataType.JSON, nullable=True)

# Create the collection
client.create_collection(
    collection_name="product_catalog",
    schema=schema
)

📘Notes

For more information on how to use JSON fields and the dynamic field, refer to JSON Field and Dynamic Field.

Index a non-JSON field

You can create an index on any non-JSON scalar field using AUTOINDEX. No additional index parameters are needed.

# Prepare an empty IndexParams object; no index parameters need to be specified up front
index_params = client.prepare_index_params()

index_params.add_index(
    field_name="price",       # Name of the scalar field to be indexed
    index_type="AUTOINDEX",   # Type of index to be created
    index_name="price_index"  # Name of the index to be created
)

Index a JSON field

Zilliz Cloud supports indexing a JSON field using JSON path indexing. This allows you to filter by keys or nested values inside a JSON object without scanning the entire field.

Example JSON field

Consider a schema-defined metadata field:

{
    "metadata": {
        "category": "electronics",
        "brand": "BrandA",
        "in_stock": true,
        "tags": ["clearance", "summer_sale"],
        "string_price": "99.99"
    }
}

You can create indexes on paths like:

  • metadata["category"]

  • metadata["tags"]

  • metadata["string_price"] → using a cast function to convert string numbers into double

JSON path indexing syntax

To create a JSON path index, specify:

  • JSON path (json_path): The path to the key or nested field within your JSON object that you want to index.

    • Example: metadata["category"]

      This defines where the indexing engine should look inside the JSON structure.

  • JSON cast type (json_cast_type): The data type that Zilliz Cloud should use when interpreting and indexing the value at the specified path.

    • This type must match the actual data type of the value at that path or, when a cast function is used, the output type of that function.

    • For a complete list, refer to Supported JSON cast types.

# Index the category field as a string
index_params.add_index(
    field_name="metadata",
    index_type="AUTOINDEX",  # Must be set to AUTOINDEX for JSON path indexing
    index_name="category_index",
    params={
        "json_path": "metadata[\"category\"]",
        "json_cast_type": "varchar"
    }
)

# Index the tags array as a string array
index_params.add_index(
    field_name="metadata",
    index_type="AUTOINDEX",  # Must be set to AUTOINDEX for JSON path indexing
    index_name="tags_array_index",
    params={
        "json_path": "metadata[\"tags\"]",
        "json_cast_type": "array_varchar"
    }
)
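
When several paths need indexes, the path string and parameter dictionary can be assembled programmatically. The following is a minimal sketch, not part of pymilvus: json_path_index_kwargs is a hypothetical helper that only builds the keyword arguments in the shape used by the examples above.

```python
# Hypothetical helper (not part of pymilvus): builds the keyword arguments
# for add_index() when indexing a JSON path.
def json_path_index_kwargs(field_name, keys, cast_type, index_name):
    # Render string keys as ["key"] and integer keys as [0]-style positions.
    suffix = "".join(
        f"[{key}]" if isinstance(key, int) else f'["{key}"]' for key in keys
    )
    return {
        "field_name": field_name,
        "index_type": "AUTOINDEX",  # JSON path indexing uses AUTOINDEX
        "index_name": index_name,
        "params": {
            "json_path": f"{field_name}{suffix}",
            "json_cast_type": cast_type,
        },
    }


kwargs = json_path_index_kwargs("metadata", ["category"], "varchar", "category_index")
print(kwargs["params"]["json_path"])  # metadata["category"]

# With a prepared IndexParams object, this could be applied as:
# index_params.add_index(**kwargs)
```

Building the path in one place avoids hand-escaping quotes for each index definition.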

Use JSON cast functions for type conversion

When your JSON stores values in a type that does not match how you want to filter them (e.g., numbers stored as strings), you can use cast functions to convert the values during indexing.

# Convert string numbers to double for indexing
index_params.add_index(
    field_name="metadata",
    index_type="AUTOINDEX",  # Must be set to AUTOINDEX for JSON path indexing
    index_name="string_to_double_index",
    params={
        "json_path": "metadata[\"string_price\"]",
        "json_cast_type": "double",  # Must be the output type of the cast function
        "json_cast_function": "STRING_TO_DOUBLE"  # Case insensitive
    }
)

📘Notes
  • If type conversion fails (e.g., the value "not_a_number" cannot be converted to a number), the value is skipped and left unindexed.

  • For details on cast function parameters, refer to JSON Field.
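
To see which rows a string-to-double index would cover, the skip behavior described above can be approximated locally before indexing. This is a rough sketch in plain Python, not the service's implementation: string_to_double is a hypothetical stand-in, and Python's float() parsing rules may differ from those of the actual cast function.

```python
def string_to_double(value):
    """Return a float, or None when the value cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Illustrative rows; only the fields relevant to the index are shown.
rows = [
    {"product_id": 1, "metadata": {"string_price": "99.99"}},
    {"product_id": 2, "metadata": {"string_price": "not_a_number"}},
    {"product_id": 3, "metadata": {"category": "electronics"}},  # path missing
    {"product_id": 4, "metadata": None},                         # null JSON field
]

indexed, skipped = {}, []
for row in rows:
    metadata = row["metadata"] or {}
    converted = string_to_double(metadata.get("string_price"))
    if converted is None:
        skipped.append(row["product_id"])  # would be left out of the index
    else:
        indexed[row["product_id"]] = converted

print(indexed)  # {1: 99.99}
print(skipped)  # [2, 3, 4]
```

A check like this makes it easier to spot rows that would be left out of the index because of inconsistent data.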

Index keys in the dynamic field

If the dynamic field is enabled, you can index specific scalar keys that were not explicitly defined in the schema. These keys are stored in a hidden JSON field and are treated like other scalar fields for indexing purposes.

📘Notes

For details on the dynamic field, refer to Dynamic Field.

# Index a dynamic key (e.g., inserted but not defined in schema)
index_params.add_index(
    field_name="overview",  # Key name in the dynamic field
    index_type="AUTOINDEX",
    index_name="overview_index",
    params={
        "json_path": "overview",     # Key name in the dynamic field
        "json_cast_type": "varchar"  # Data type that Zilliz Cloud uses when indexing the values
    }
)
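
For context, a key ends up in the dynamic field when a row carries a key that the schema does not declare. The sketch below uses made-up sample data that fits the product_catalog schema from this guide and shows which keys would be treated as dynamic; the row values are illustrative assumptions.

```python
# Fields declared in the schema from this guide.
schema_fields = {"product_id", "vector", "price", "metadata"}

# Illustrative row; the values are invented for this example.
rows = [
    {
        "product_id": 1,
        "vector": [0.1, 0.2, 0.3, 0.4, 0.5],  # dim=5, as in the schema
        "price": 99.99,
        "metadata": {"category": "electronics", "brand": "BrandA"},
        "overview": "Compact wireless speaker",  # not in the schema: dynamic key
    },
]

# Any key that is not a declared field would be stored in the dynamic field.
dynamic_keys = sorted({key for row in rows for key in row} - schema_fields)
print(dynamic_keys)  # ['overview']

# With a connected client, the rows could then be inserted with:
# client.insert(collection_name="product_catalog", data=rows)
```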

Apply indexes to the collection

After defining the index parameters, you can apply them to the collection using create_index():

client.create_index(
    collection_name="product_catalog",
    index_params=index_params
)
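
After applying the indexes, it can be useful to confirm that they were all created. The following is a sketch under assumptions: missing_indexes is a hypothetical helper, and it relies on the client's list_indexes() method returning a list of index names. FakeClient is only a stand-in so the example runs without a cluster.

```python
def missing_indexes(client, collection_name, expected_names):
    """Return the expected index names that the collection does not report."""
    existing = set(client.list_indexes(collection_name=collection_name))
    return sorted(set(expected_names) - existing)


class FakeClient:
    """Stand-in for MilvusClient so the sketch runs without a cluster."""

    def list_indexes(self, collection_name, field_name=""):
        return ["price_index", "category_index"]


expected = ["price_index", "category_index", "tags_array_index"]
print(missing_indexes(FakeClient(), "product_catalog", expected))
# ['tags_array_index']
```

With a real connection, pass the MilvusClient instance in place of FakeClient().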

FAQ

When should I create indexes on scalar fields?

Creating scalar indexes is optional, but highly recommended when a field is frequently used in filter conditions. Without an index, Zilliz Cloud performs a full collection scan during filtering, which can significantly degrade performance on large datasets. With an index, the filter can be resolved without scanning every row.

Can I create multiple indexes on the same JSON field?

Yes, you can index different paths within the same JSON field, but only one index per unique path is allowed.

When indexing a specific JSON path, what if the JSON path doesn't exist in some rows?

Those rows are silently skipped during indexing. No errors will be raised.

When indexing a JSON field, what happens when cast functions fail?

Zilliz Cloud silently ignores values that cannot be converted, such as a string that can't be parsed to a number.

Can I index specific array elements?

Yes, you can index specific array positions like metadata["tags"][0] for the first element of an array.
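
As an illustration, the parameters for such an index might look like the sketch below. The index name first_tag_index is hypothetical, and the varchar cast type assumes that the array elements are strings, as in the example metadata field.

```python
# Parameters for indexing only the first element of the tags array.
first_tag_index = {
    "field_name": "metadata",
    "index_type": "AUTOINDEX",
    "index_name": "first_tag_index",          # hypothetical index name
    "params": {
        "json_path": 'metadata["tags"][0]',   # first element of the tags array
        "json_cast_type": "varchar",          # assumed: tags are strings
    },
}

print(first_tag_index["params"]["json_path"])  # metadata["tags"][0]

# With a prepared IndexParams object, this could be applied as:
# index_params.add_index(**first_tag_index)
```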

What happens if some values in a JSON field can’t be cast to the index type?

They will be skipped silently during indexing and excluded from index-based query results. This can cause partial results if your data has inconsistent types.

Can I index the same JSON field path multiple times with different types?

No, a single JSON path or dynamic field key supports only one index at a time. You must choose one json_cast_type for indexing.

Where can I find full details about indexing JSON fields or dynamic fields?

Refer to the JSON Field and Dynamic Field sections for more information.