Index Scalar Fields

Zilliz Cloud supports indexing on scalar (non-vector) fields, which can significantly speed up filtering and search, especially on large datasets.

Overview

Indexing scalar fields is optional, but it is recommended for any scalar field that your filter conditions access frequently.
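As a rough illustration of the kind of filter condition that benefits from a scalar index, the sketch below builds a range filter on a scalar field and passes it to a query. The collection name `product_catalog` and the field `price` match the example used later on this page; the helper names `build_range_filter` and `query_by_price` are hypothetical, and `client` is assumed to be a connected pymilvus `MilvusClient`.

```python
def build_range_filter(field: str, low: float, high: float) -> str:
    """Return a boolean filter expression selecting low <= field < high."""
    return f"{field} >= {low} and {field} < {high}"


def query_by_price(client, low: float, high: float):
    """Run a scalar-filtered query.

    `client` is assumed to be a connected pymilvus MilvusClient; this
    helper is illustrative and not part of any SDK.
    """
    return client.query(
        collection_name="product_catalog",
        filter=build_range_filter("price", low, high),
        output_fields=["product_id", "price"],
    )


print(build_range_filter("price", 10, 50))
```

If a field such as `price` appears in filters like this on most requests, it is a reasonable candidate for a scalar index.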

Zilliz Cloud supports AUTOINDEX on the following field types:

Field type	AUTOINDEX resolves to	Description
`VARCHAR`	BITMAP (C < 100) / INVERTED (C ≥ 100)	String data. For details, see String Field.
`INT8`, `INT16`, `INT32`, `INT64`	BITMAP (C < 100) / STL_SORT (C ≥ 100)	Integers. For details, see Boolean & Number.
`FLOAT`, `DOUBLE`	BITMAP (C < 100) / INVERTED (C ≥ 100)	Floating-point numbers. For details, see Boolean & Number.
`BOOL`	BITMAP	Boolean values. For details, see Boolean & Number.
`ARRAY`	BITMAP (C < 100) / INVERTED (C ≥ 100)	Homogeneous array of scalar values. For details, see Array Field.
`GEOMETRY`	RTREE	Geometry data that stores spatial information. For details, see Geometry Field.
`TIMESTAMPTZ`	STL_SORT	Time-zone-aware ISO 8601 input. Values are stored in UTC, which allows consistent filtering and ordering across time zones. For details, see TIMESTAMPTZ Field.
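The table can be read as a small lookup rule. The sketch below restates it in Python purely as a reading aid, where C is the field's cardinality (its number of distinct values). The function `resolve_autoindex` is hypothetical and is not part of any SDK; the actual choice is made by Zilliz Cloud on the server side.

```python
BITMAP_CARDINALITY_LIMIT = 100  # the threshold on C taken from the table

INTEGER_TYPES = {"INT8", "INT16", "INT32", "INT64"}
INVERTED_FALLBACK_TYPES = {"VARCHAR", "FLOAT", "DOUBLE", "ARRAY"}


def resolve_autoindex(field_type: str, cardinality: int = 0) -> str:
    """Return the index type the table lists for a field type and cardinality."""
    field_type = field_type.upper()
    if field_type == "BOOL":
        return "BITMAP"
    if field_type == "GEOMETRY":
        return "RTREE"
    if field_type == "TIMESTAMPTZ":
        return "STL_SORT"
    low_cardinality = cardinality < BITMAP_CARDINALITY_LIMIT
    if field_type in INTEGER_TYPES:
        return "BITMAP" if low_cardinality else "STL_SORT"
    if field_type in INVERTED_FALLBACK_TYPES:
        return "BITMAP" if low_cardinality else "INVERTED"
    raise ValueError(f"field type not listed in the table: {field_type}")


print(resolve_autoindex("DOUBLE", cardinality=5000))
print(resolve_autoindex("INT64", cardinality=12))
```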

📘Notes

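Cardinality (C in the table above) is the number of distinct values in a field across the entire collection. For example, the cardinality of a floating-point field is the number of different floating-point values in that field.

For an array field, cardinality is the number of distinct element values across all arrays in a segment. For example: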

[1, 2, 3]
[2, 3, 4]
[1, 4, 5]

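The distinct element values are {1, 2, 3, 4, 5}, so the cardinality is 5. Flatten all elements from all arrays and count the unique values; the number of distinct arrays and the array lengths do not matter.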
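The same count can be reproduced in a few lines of Python. This is only a worked version of the example above; the function name `array_field_cardinality` is made up for illustration.

```python
def array_field_cardinality(arrays):
    """Flatten every array and count the distinct element values."""
    return len({element for array in arrays for element in array})


arrays = [
    [1, 2, 3],
    [2, 3, 4],
    [1, 4, 5],
]

print(array_field_cardinality(arrays))  # 5
```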

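Before you start

Before creating an index, define a collection that contains both vector and scalar fields. Zilliz Cloud requires every collection to have a vector field.

This example defines a schema for a product catalog that includes the required vector field (vector) and a scalar field of type DOUBLE (price):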

Python
Java
NodeJS
Go
cURL

from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="YOUR_CLUSTER_ENDPOINT") # Replace with your cluster endpoint

# Define schema with dynamic field support
schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True # Enable dynamic field
)

# Define fields
schema.add_field(field_name="product_id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5) # Vector field
schema.add_field(field_name="price", datatype=DataType.DOUBLE) # Scalar field

# Create the collection
client.create_collection(
    collection_name="product_catalog",
    schema=schema
)

import io.milvus.v2.client.*;
import io.milvus.v2.common.DataType;
import io.milvus.v2.service.collection.request.CreateCollectionReq;
import io.milvus.v2.service.collection.request.AddFieldReq;

ConnectConfig config = ConnectConfig.builder()
        .uri("YOUR_CLUSTER_ENDPOINT")
        .build();
MilvusClientV2 client = new MilvusClientV2(config);

CreateCollectionReq.CollectionSchema schema = CreateCollectionReq.CollectionSchema.builder()
        .enableDynamicField(true)
        .build();
schema.addField(AddFieldReq.builder()
        .fieldName("product_id")
        .dataType(DataType.Int64)
        .isPrimaryKey(Boolean.TRUE)
        .build());
schema.addField(AddFieldReq.builder()
        .fieldName("vector")
        .dataType(DataType.FloatVector)
        .dimension(5)
        .build());

schema.addField(AddFieldReq.builder()
        .fieldName("price")
        .dataType(DataType.Double)
        .build());

CreateCollectionReq requestCreate = CreateCollectionReq.builder()
        .collectionName("product_catalog")
        .collectionSchema(schema)
        .build();
client.createCollection(requestCreate);

import { MilvusClient, DataType } from '@zilliz/milvus2-sdk-node';

// initialize client
const client = new MilvusClient({
  address: 'YOUR_CLUSTER_ENDPOINT', // Replace with your cluster endpoint
});

const collectionName = 'product_catalog';

// define schema
const schema = [
  {
    name: 'product_id',
    description: 'Primary key',
    data_type: DataType.Int64,
    is_primary_key: true,
    autoID: false,
  },
  {
    name: 'vector',
    description: 'Embedding vector',
    data_type: DataType.FloatVector,
    type_params: {
      dim: '5',
    },
  },
  {
    name: 'price',
    description: 'Product price',
    data_type: DataType.Double,
  },
];

// create collection
const res = await client.createCollection({
    collection_name: collectionName,
    fields: schema,
    enable_dynamic_field: true,
});

console.log('Create collection result:', res);

import (
    "context"

    "github.com/milvus-io/milvus/client/v2/entity"
    "github.com/milvus-io/milvus/client/v2/milvusclient"
)

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

client, err := milvusclient.New(ctx, &milvusclient.ClientConfig{
    Address: "YOUR_CLUSTER_ENDPOINT",
})
if err != nil {
    return err
}

schema := entity.NewSchema().WithDynamicFieldEnabled(true)
schema.WithField(entity.NewField().
    WithName("product_id").
    WithDataType(entity.FieldTypeInt64).
    WithIsPrimaryKey(true),
).WithField(entity.NewField().
    WithName("vector").
    WithDataType(entity.FieldTypeFloatVector).
    WithDim(5),
).WithField(entity.NewField().
    WithName("price").
    WithDataType(entity.FieldTypeDouble),
)

err = client.CreateCollection(ctx, milvusclient.NewCreateCollectionOption("product_catalog", schema))
if err != nil {
    return err
}

export TOKEN="YOUR_CLUSTER_TOKEN"
export CLUSTER_ENDPOINT="YOUR_CLUSTER_ENDPOINT"

export productIdField='{
  "fieldName": "product_id",
  "dataType": "Int64",
  "isPrimary": true,
  "autoID": false
}'

export vectorField='{
  "fieldName": "vector",
  "dataType": "FloatVector",
  "elementTypeParams": {
    "dim": 5
  }
}'

export priceField='{
  "fieldName": "price",
  "dataType": "Double"
}'

export schema="{
  \"autoID\": false,
  \"enableDynamicField\": true,
  \"fields\": [
    $productIdField,
    $vectorField,
    $priceField
  ]
}"

curl --request POST \
--url "${CLUSTER_ENDPOINT}/v2/vectordb/collections/create" \
--header "Authorization: Bearer ${TOKEN}" \
--header "Content-Type: application/json" \
--data "{
  \"collectionName\": \"product_catalog\",
  \"schema\": $schema
}"

Create an index on a scalar field

You can create an index on a scalar field using AUTOINDEX. No additional index parameters are required. The following example creates an index on the price field:

Python
Java
NodeJS
Go
cURL

index_params = client.prepare_index_params() # Prepare an empty IndexParams object, without having to specify any index parameters

index_params.add_index(
    field_name="price", # Name of the scalar field to be indexed
    index_type="AUTOINDEX", # Type of index to be created
    index_name="price_index" # Name of the index to be created
)

import io.milvus.v2.common.IndexParam;
import java.util.ArrayList;
import java.util.List;

List<IndexParam> indexParams = new ArrayList<>();
indexParams.add(IndexParam.builder()
        .fieldName("price")
        .indexName("price_index")
        .indexType(IndexParam.IndexType.AUTOINDEX)
        .build());

const indexParams = [{
    collection_name: collectionName,
    field_name: 'price',
    index_type: 'AUTOINDEX',
    index_name: 'price_index'
}];

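import (
    "github.com/milvus-io/milvus/client/v2/index"
    "github.com/milvus-io/milvus/client/v2/milvusclient"
)

// Note: this Go snippet requests an INVERTED index explicitly rather than AUTOINDEX
indexOpt := milvusclient.NewCreateIndexOption("product_catalog", "price",
        index.NewInvertedIndex())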

export priceIndex='{
  "fieldName": "price",
  "indexName": "price_index",
  "params": {
    "index_type": "AUTOINDEX"
  }
}'

After defining the index parameters, apply them to the collection with create_index().

Python
Java
NodeJS
Go
cURL

client.create_index(
    collection_name="product_catalog",
    index_params=index_params
)

import io.milvus.v2.service.index.request.CreateIndexReq;

client.createIndex(CreateIndexReq.builder()
        .collectionName("product_catalog")
        .indexParams(indexParams)
        .build());

client.createIndex(indexParams)

indexTask, err := client.CreateIndex(ctx, indexOpt)
if err != nil {
    return err
}
// Wait for the index build to finish
err = indexTask.Await(ctx)
if err != nil {
    return err
}

curl --request POST \
--url "${CLUSTER_ENDPOINT}/v2/vectordb/indexes/create" \
--header "Authorization: Bearer ${TOKEN}" \
--header "Content-Type: application/json" \
--data "{
  \"collectionName\": \"product_catalog\",
  \"indexParams\": [$priceIndex]
}"

Check index details

After creating an index, you can check its details.

Python
Java
NodeJS

# List the names of all indexes in the collection
res = client.list_indexes(
    collection_name="product_catalog"
)

print(res)

# Describe a specific index by name
res = client.describe_index(
    collection_name="product_catalog",
    index_name="price_index"
)

print(res)

import io.milvus.v2.service.index.request.DescribeIndexReq;
import io.milvus.v2.service.index.request.ListIndexesReq;
import io.milvus.v2.service.index.response.DescribeIndexResp;

// List the index names
ListIndexesReq listIndexesReq = ListIndexesReq.builder()
    .collectionName("product_catalog")
    .build();

List<String> indexNames = client.listIndexes(listIndexesReq);

System.out.println(indexNames);

// Describe an index
DescribeIndexReq describeIndexReq = DescribeIndexReq.builder()
    .collectionName("product_catalog")
    .indexName("price_index")
    .build();

DescribeIndexResp describeIndexResp = client.describeIndex(describeIndexReq);

System.out.println(JSONObject.toJSON(describeIndexResp));

// Describe the index
const describeRes = await client.describeIndex({
    collection_name: "product_catalog",
    index_name: "price_index"
});

console.log(JSON.stringify(describeRes.index_descriptions, null, 2));

Drop an index

Use the drop_index() method to remove an existing index from a collection.

📘Notes

On clusters compatible with Milvus v2.6.x, you can drop a scalar index directly once it is no longer needed. You do not need to release the collection first.

Python
Java
NodeJS

# Drop index
client.drop_index(
    collection_name="product_catalog",
    index_name="price_index"
)

import io.milvus.v2.service.index.request.DropIndexReq;

// Drop index
DropIndexReq dropIndexReq = DropIndexReq.builder()
    .collectionName("product_catalog")
    .indexName("price_index")
    .build();

client.dropIndex(dropIndexReq);

// Drop the index
const dropRes = await client.dropIndex({
    collection_name: "product_catalog",
    index_name: "price_index"
});

console.log(dropRes.error_code);

Advanced features

Several more advanced features are available for scalar indexes.

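BITMAP

A bitmap index is an efficient indexing technique for improving query performance on low-cardinality scalar fields. Cardinality refers to the number of distinct values in a field. A field with few distinct values is considered low-cardinality.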
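To see why low cardinality suits this structure, here is a minimal conceptual sketch of a bitmap index in plain Python. It keeps one bitmask per distinct value, with bit i set when row i holds that value. It illustrates the general technique only and makes no claim about how Zilliz Cloud implements BITMAP internally; the function names and sample data are invented.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an integer bitmask of the rows containing it."""
    bitmaps = defaultdict(int)
    for row_id, value in enumerate(values):
        bitmaps[value] |= 1 << row_id
    return dict(bitmaps)


def rows_matching(bitmap):
    """Decode a bitmask back into a sorted list of row IDs."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


categories = ["book", "toy", "book", "food", "toy", "book"]
index = build_bitmap_index(categories)

# category == "book" OR category == "food" becomes a single bitwise OR
combined = index["book"] | index["food"]
print(rows_matching(combined))  # [0, 2, 3, 5]
```

With only a handful of distinct values, the index holds a handful of bitmasks, and combining conditions reduces to bitwise operations.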

INVERTED

If you frequently run filter queries on your data, an `INVERTED` index can significantly improve query performance. Instead of scanning every document, Zilliz Cloud uses the inverted index to quickly locate the exact records that match the filter condition.
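The idea of looking up matches instead of scanning can be sketched in plain Python: an inverted index maps each value to the list of rows that contain it. This is a conceptual toy with invented names and data, not the Zilliz Cloud implementation.

```python
from collections import defaultdict


def build_inverted_index(values):
    """Map each distinct value to the list of row IDs where it occurs."""
    postings = defaultdict(list)
    for row_id, value in enumerate(values):
        postings[value].append(row_id)
    return dict(postings)


prices = [9.99, 19.5, 9.99, 42.0, 19.5]
index = build_inverted_index(prices)

# An equality filter such as price == 9.99 is one dictionary lookup,
# rather than a comparison against every row.
print(index.get(9.99, []))   # [0, 2]
print(index.get(100.0, []))  # []
```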

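NGRAM

The `NGRAM` index in Zilliz Cloud is built to accelerate `LIKE` queries on `VARCHAR` fields or on specific JSON paths within `JSON` fields. Before building the index, Zilliz Cloud splits the text into short, overlapping substrings of a fixed length n, called n-grams. For example, with n = 3, the word "Milvus" is split into the 3-grams "Mil", "ilv", "lvu", and "vus". These n-grams are stored in an inverted index that maps each gram to the IDs of the documents in which it appears. At query time, the index lets Zilliz Cloud quickly narrow the search to a small set of candidates, which greatly speeds up query execution.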
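The splitting and candidate-narrowing steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: a single fixed n, a literal substring to search for, and invented sample documents. The function names are hypothetical and do not correspond to any SDK call.

```python
from collections import defaultdict


def ngrams(text: str, n: int):
    """Split text into overlapping substrings of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_ngram_index(docs: dict, n: int):
    """Map each n-gram to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return dict(index)


def candidates(index: dict, literal: str, n: int):
    """Intersect the posting sets of every n-gram in the searched literal."""
    grams = ngrams(literal, n)
    if not grams:
        raise ValueError("literal is shorter than n; the index cannot narrow it")
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


print(ngrams("Milvus", 3))  # ['Mil', 'ilv', 'lvu', 'vus']

docs = {1: "Milvus", 2: "Zilliz Cloud", 3: "Silver"}
index = build_ngram_index(docs, 3)
print(sorted(candidates(index, "ilv", 3)))   # [1, 3]
print(sorted(candidates(index, "ilvu", 3)))  # [1]
```

In a sketch like this, the candidate set is only a pre-filter: each candidate would still need to be checked against the full `LIKE` pattern.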

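RTREE

The `RTREE` index is a tree-based data structure that accelerates queries on `GEOMETRY` fields in Zilliz Cloud. If your collection stores geometric objects such as points, lines, or polygons in Well-known text (WKT) format and you want faster spatial filtering, `RTREE` is the ideal choice.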
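R-trees in general organize objects by their minimum bounding rectangles, so that a cheap rectangle test can rule out most objects before any exact geometry check. The sketch below shows only that bounding-box test, using coordinates that correspond to the WKT strings in the comments. It is a conceptual illustration with invented names, not a description of Zilliz Cloud internals, and it builds no tree.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def bbox_of(coords):
    """Compute the minimum bounding rectangle of a list of (x, y) pairs."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return BBox(min(xs), min(ys), max(xs), max(ys))


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when two bounding rectangles overlap or touch."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


# POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))
square = bbox_of([(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)])
# LINESTRING (5 5, 8 9)
line = bbox_of([(5, 5), (8, 9)])
# POINT (2 3)
point = bbox_of([(2, 3)])

print(intersects(square, point))  # True
print(intersects(square, line))   # False
```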

STL_SORT

The `STL_SORT` index is an index type designed specifically to improve query performance on numeric fields (INT8, INT16, and so on), VARCHAR fields, or TIMESTAMPTZ fields in Zilliz Cloud by organizing the data in sorted order.
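Keeping values in sorted order is what makes range filters cheap: both ends of the range can be located with binary search. The following sketch shows that idea with Python's `bisect` module. It is a conceptual toy with invented names and data, not the actual `STL_SORT` implementation.

```python
from bisect import bisect_left, bisect_right


def build_sorted_index(values):
    """Return (value, row_id) pairs ordered by value."""
    return sorted((value, row_id) for row_id, value in enumerate(values))


def range_query(index, low, high):
    """Return the row IDs whose value v satisfies low <= v <= high."""
    keys = [value for value, _ in index]
    start = bisect_left(keys, low)    # first position with value >= low
    end = bisect_right(keys, high)    # first position with value > high
    return sorted(row_id for _, row_id in index[start:end])


prices = [30, 10, 50, 20, 40]
index = build_sorted_index(prices)

print(range_query(index, 20, 40))  # [0, 3, 4]
```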
