create_schema()

Added: v2.3.x | Modified: v3.0.x

This operation creates a collection schema.

Request syntax

MilvusClient.create_schema(**kwargs) -> CollectionSchema

📘Notes

This is a class method. Call it on the class itself rather than on a client instance: MilvusClient.create_schema().

PARAMETERS:

kwargs -
- auto_id (bool)
  
  Whether the primary field automatically increments.
  
  Setting this to True makes the primary field automatically increment. In this case, do not include the primary field in the data to insert; including it causes errors.
- enable_dynamic_field (bool)
  
  Whether Zilliz Cloud saves the values of undefined fields in a dynamic field when the data being inserted into the target collection includes fields that are not defined in the collection's schema.
  
  When you set this to True, Zilliz Cloud creates a field called $meta to store any undefined fields and their values from the inserted data.
  
  📘What is a dynamic field?
  If the data being inserted into the target collection includes fields that are not defined in the collection's schema, those fields will be saved in a reserved dynamic field named $meta as key-value pairs.
- primary_field (str)
  
  The name of the primary field.
- partition_key_field (str)
  
  The name of the field that serves as the partition key.
  
  Setting this makes Zilliz Cloud manage all partitions in the current collection.
  
  📘What is a partition key?
  Once a field is designated as the partition key, Zilliz Cloud calculates a hash based on the partition key value of each inserted entity and saves entities in the partitions of the target collection accordingly.
  This is particularly useful when implementing data separation based on a specific key, such as partition-oriented multi-tenancy.
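The hash-based placement described above can be pictured with a short, self-contained sketch. It is only an illustration: the hash function (SHA-256), the partition count of 16, and the helper name assign_partition are assumptions made for this example, not the algorithm or defaults that Zilliz Cloud uses internally.

```python
import hashlib


def assign_partition(key_value, num_partitions=16):
    """Map a partition key value to a partition index (illustrative only).

    SHA-256 and the default of 16 partitions are assumptions for this sketch.
    """
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical entities that use a "tenant" field as the partition key.
entities = [
    {"id": 1, "tenant": "acme"},
    {"id": 2, "tenant": "globex"},
    {"id": 3, "tenant": "acme"},
]

# Entities that share a partition key value always land in the same partition.
placement = {e["id"]: assign_partition(e["tenant"]) for e in entities}
print(placement)
```

Because the mapping depends only on the key value, entities 1 and 3 share a partition, which is the property that makes partition-oriented multi-tenancy possible.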
- external_source (str)
  
  The external source URI, which should be a volume:// URI that points to an accessible external volume. For example, volume://<volume-name>/path/to/folder/.
- external_spec (str)
  
  The external source specifications, passed as a JSON string that contains a set of secondary parameters:
  - format (str)
    
    The format of the target source data files.
    
    Possible values are parquet, vortex, lance-table, and iceberg-table.
  - snapshot_id (str)
    
    The ID of an Iceberg table. This applies only when format is iceberg-table.
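Since external_spec is a string, one way to avoid hand-written JSON is to build it from a dictionary. The sketch below uses only the Python standard library; the helper build_external_spec and its validation rules are hypothetical and simply mirror the constraints listed above.

```python
import json

# Formats listed in the parameter description above.
ALLOWED_FORMATS = {"parquet", "vortex", "lance-table", "iceberg-table"}


def build_external_spec(fmt, snapshot_id=None):
    """Return a JSON string suitable for external_spec (hypothetical helper)."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if snapshot_id is not None and fmt != "iceberg-table":
        raise ValueError("snapshot_id applies only when format is iceberg-table")
    spec = {"format": fmt}
    if snapshot_id is not None:
        spec["snapshot_id"] = snapshot_id
    return json.dumps(spec)


spec = build_external_spec("parquet")
print(spec)
```

The resulting string can then be passed as the external_spec argument of MilvusClient.create_schema().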

RETURN TYPE:

CollectionSchema

RETURNS:

A CollectionSchema object.

EXCEPTIONS:

MilvusException

This exception will be raised when any error occurs during this operation.

Examples

Schema for managed collection

from pymilvus import MilvusClient, DataType

# 1. Create a schema
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=False,
)

# 2. Add fields to schema
schema.add_field(field_name="my_id", datatype=DataType.INT64, is_primary=True)

# {
#     'auto_id': False, 
#     'description': '', 
#     'fields': [
#         {
#             'name': 'my_id', 
#             'description': '', 
#             'type': <DataType.INT64: 5>, 
#             'is_primary': True, 
#             'auto_id': False
#         }
#     ]
# }

schema.add_field(field_name="my_vector", datatype=DataType.FLOAT_VECTOR, dim=5)

# {
#     'auto_id': False, 
#     'description': '', 
#     'fields': [
#         {
#             'name': 'my_id', 
#             'description': '', 
#             'type': <DataType.INT64: 5>, 
#             'is_primary': True, 
#             'auto_id': False
#         }, 
#         {
#             'name': 'my_vector', 
#             'description': '', 
#             'type': <DataType.FLOAT_VECTOR: 101>, 
#             'params': {
#                 'dim': 5
#             }
#         }        
#     ]
# }
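The schema above is created with enable_dynamic_field=False. To picture what changes when the option is True, the following pure-Python sketch mimics how undefined fields end up as key-value pairs under $meta. It is a conceptual model only: the helper split_dynamic_fields is hypothetical, and the real handling happens inside Zilliz Cloud.

```python
def split_dynamic_fields(row, schema_fields):
    """Keep schema-defined fields as-is and move the rest under "$meta"."""
    stored = {k: v for k, v in row.items() if k in schema_fields}
    extra = {k: v for k, v in row.items() if k not in schema_fields}
    if extra:
        stored["$meta"] = extra
    return stored


# Field names taken from the schema in the example above.
schema_fields = {"my_id", "my_vector"}

# "color" is not defined in the schema, so it is treated as a dynamic field.
row = {"my_id": 1, "my_vector": [0.1, 0.2, 0.3, 0.4, 0.5], "color": "red"}

stored = split_dynamic_fields(row, schema_fields)
print(stored)
```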

Schema for external collection

from pymilvus import MilvusClient, DataType

# 1. Create a schema that points to the external data
schema = MilvusClient.create_schema(
    external_source='volume://my_volume/path/to/a/folder/',
    external_spec='{"format": "parquet"}'
)

# 2. Add fields and map each one to a field in the external data file
schema.add_field(
    field_name="product_id",
    datatype=DataType.INT64,
    external_field="id" # field name in the external data file
)
schema.add_field(
    field_name="product_name",
    datatype=DataType.VARCHAR,
    max_length=512,
    external_field="name"
)
schema.add_field(
    field_name="embedding",
    datatype=DataType.FLOAT_VECTOR,
    dim=768,
    external_field="vector"
)
