create_schema()

Added: v2.3.x · Modified: v3.0.x

This operation creates a collection schema.

Request syntax

MilvusClient.create_schema(**kwargs) -> CollectionSchema
📘Notes

This is a class method. You can call it on the class itself, without creating a client instance: MilvusClient.create_schema().

PARAMETERS:

  • kwargs -

    • auto_id (bool)

      Whether the primary field automatically increments.

      Setting this to True makes the primary field automatically increment. In this case, omit the primary field from the data you insert to avoid errors.

    • enable_dynamic_field (bool)

      Whether Zilliz Cloud saves the values of undefined fields in a dynamic field when the data being inserted into the target collection includes fields that are not defined in the collection's schema.

      When you set this to True, Zilliz Cloud creates a field named $meta to store any undefined fields and their values from the inserted data.

      📘What is a dynamic field?

      If the data being inserted into the target collection includes fields that are not defined in the collection's schema, those fields will be saved in a reserved dynamic field named $meta as key-value pairs.

    • primary_field (str)

      The name of the primary field.

    • partition_key_field (str)

      The name of the field that serves as the partition key.

      Setting this makes Zilliz Cloud manage all partitions in the current collection.

      📘What is a partition key?

      Once a field is designated as the partition key, Zilliz Cloud calculates a hash based on the partition key value of each inserted entity and saves entities in the partitions of the target collection accordingly.

      This is particularly useful when implementing data separation based on a specific key, such as partition-oriented multi-tenancy.

  • external_source (str) -

    The external source URI, which should be a volume:// URI that points to an accessible external volume. For example, volume://<volume-name>/path/to/folder/.

  • external_spec (str) -

    The external source specification, passed as a JSON string (as in the external collection example on this page) that holds the following secondary parameters:

    • format (str) -

      The format of the target source data files.

      Possible values are parquet, vortex, lance-table, and iceberg-table.

    • snapshot_id (str) -

      The ID of an Iceberg table. This applies only when format is iceberg-table.
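
The two notes above (dynamic field and partition key) describe what happens to an inserted entity. The following is a toy model of those two behaviors in plain Python, not the Zilliz Cloud implementation. The helper names `split_dynamic_fields` and `assign_partition`, the use of SHA-256, and the partition count of 16 are all assumptions made for illustration; this page does not state which hash function or partition count the service uses.

```python
import hashlib


def split_dynamic_fields(entity, schema_fields):
    """Keep schema-defined fields as-is; collect undefined ones under '$meta'."""
    stored = {k: v for k, v in entity.items() if k in schema_fields}
    extra = {k: v for k, v in entity.items() if k not in schema_fields}
    if extra:
        # Undefined fields are kept as key-value pairs in the reserved field.
        stored["$meta"] = extra
    return stored


def assign_partition(key_value, num_partitions):
    """Map a partition key value to a partition index with a stable hash.

    SHA-256 is an illustrative choice only (assumption).
    """
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical entity: 'color' is not defined in the schema.
entity = {"my_id": 1, "tenant": "acme", "color": "red"}

row = split_dynamic_fields(entity, schema_fields={"my_id", "tenant"})
partition = assign_partition(entity["tenant"], num_partitions=16)
```

In this model, every entity with the same `tenant` value lands in the same partition, which is the property that makes a partition key useful for partition-oriented multi-tenancy.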

RETURN TYPE:

CollectionSchema

RETURNS:

A CollectionSchema object.

EXCEPTIONS:

  • MilvusException

    This exception will be raised when any error occurs during this operation.

Examples

  • Schema for managed collection

    from pymilvus import MilvusClient, DataType

    # 1. Create a schema
    schema = MilvusClient.create_schema(
        auto_id=False,
        enable_dynamic_field=False,
    )

    # 2. Add fields to schema
    schema.add_field(field_name="my_id", datatype=DataType.INT64, is_primary=True)

    # {
    #     'auto_id': False,
    #     'description': '',
    #     'fields': [
    #         {
    #             'name': 'my_id',
    #             'description': '',
    #             'type': <DataType.INT64: 5>,
    #             'is_primary': True,
    #             'auto_id': False
    #         }
    #     ]
    # }

    schema.add_field(field_name="my_vector", datatype=DataType.FLOAT_VECTOR, dim=5)

    # {
    #     'auto_id': False,
    #     'description': '',
    #     'fields': [
    #         {
    #             'name': 'my_id',
    #             'description': '',
    #             'type': <DataType.INT64: 5>,
    #             'is_primary': True,
    #             'auto_id': False
    #         },
    #         {
    #             'name': 'my_vector',
    #             'description': '',
    #             'type': <DataType.FLOAT_VECTOR: 101>,
    #             'params': {
    #                 'dim': 5
    #             }
    #         }
    #     ]
    # }
  • Schema for external collection

    from pymilvus import MilvusClient, DataType

    # 1. Create a schema that reads Parquet files from an external volume
    schema = MilvusClient.create_schema(
        external_source='volume://my_volume/path/to/a/folder/',
        external_spec='{"format": "parquet"}'
    )

    # 2. Map each collection field to a field in the external data file
    schema.add_field(
        field_name="product_id",
        datatype=DataType.INT64,
        external_field="id"  # field name in the external data file
    )
    schema.add_field(
        field_name="product_name",
        datatype=DataType.VARCHAR,
        max_length=512,
        external_field="name"
    )
    schema.add_field(
        field_name="embedding",
        datatype=DataType.FLOAT_VECTOR,
        dim=768,
        external_field="vector"
    )
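
The external collection example passes external_spec as a hand-written JSON string. One way to avoid quoting mistakes is to build that string from a dictionary. The sketch below uses only the Python standard library. The helper name `build_external_spec` is hypothetical and not part of pymilvus, the `"<snapshot-id>"` value is a placeholder, and the validation simply mirrors the format values and the snapshot_id rule listed under PARAMETERS.

```python
import json

# Format values listed for external_spec on this page.
ALLOWED_FORMATS = {"parquet", "vortex", "lance-table", "iceberg-table"}


def build_external_spec(file_format, snapshot_id=None):
    """Return a JSON string suitable for the external_spec argument (sketch)."""
    if file_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {file_format!r}")
    spec = {"format": file_format}
    if snapshot_id is not None:
        # snapshot_id applies only when format is iceberg-table.
        if file_format != "iceberg-table":
            raise ValueError("snapshot_id applies only to 'iceberg-table'")
        spec["snapshot_id"] = snapshot_id
    return json.dumps(spec)


parquet_spec = build_external_spec("parquet")
iceberg_spec = build_external_spec("iceberg-table", snapshot_id="<snapshot-id>")
```

The resulting `parquet_spec` string could then be passed as `external_spec=parquet_spec` in the `MilvusClient.create_schema()` call shown above.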