Version: User Guides (Cloud)

Create Collection

This tutorial walks you through setting up a collection in your cluster. For detailed explanations of collections, schemas, and indexes, refer to the corresponding sections of the user guides.

Procedure

If you want to create a collection without defining every field up front, use the starter API. It requires only the collection's name and the number of dimensions of the vector field.

from pymilvus import MilvusClient

CLUSTER_ENDPOINT = "YOUR_CLUSTER_ENDPOINT"  # Set your cluster endpoint
TOKEN = "YOUR_CLUSTER_TOKEN"  # Set your token
COLLECTION_NAME = "medium_articles_2020"  # Set your collection name

# Initialize a MilvusClient instance
# Replace uri and token with your own
client = MilvusClient(
    uri=CLUSTER_ENDPOINT,  # Cluster endpoint obtained from the console
    token=TOKEN  # API key or a colon-separated cluster username and password
)

# Create a collection
client.create_collection(
    collection_name=COLLECTION_NAME,
    dimension=768
)

res = client.describe_collection(
    collection_name=COLLECTION_NAME
)

print(res)

# Output
#
# {
#     "collection_name": "medium_articles_2020",
#     "auto_id": false,
#     "num_shards": 1,
#     "description": "",
#     "fields": [
#         {
#             "field_id": 100,
#             "name": "id",
#             "description": "",
#             "type": 5,
#             "params": {},
#             "is_primary": true
#         },
#         {
#             "field_id": 101,
#             "name": "vector",
#             "description": "",
#             "type": 101,
#             "params": {
#                 "dim": 768
#             }
#         }
#     ],
#     "aliases": [],
#     "collection_id": 443943328732839733,
#     "consistency_level": 2,
#     "properties": [],
#     "num_partitions": 1,
#     "enable_dynamic_field": true
# }
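
The type values in the output above are numeric codes. The following is a minimal sketch for reading them, assuming the codes follow the values of pymilvus's DataType enum (5 for INT64, 101 for FLOAT_VECTOR). The FieldType enum and the summarize_fields helper are hypothetical names introduced for illustration and are not part of pymilvus.

```python
from enum import IntEnum


class FieldType(IntEnum):
    """Local copy of the type codes for the data types listed on this page.

    Assumption: the values mirror pymilvus's DataType enum.
    """
    BOOL = 1
    INT8 = 2
    INT32 = 4
    INT64 = 5
    FLOAT = 10
    DOUBLE = 11
    VARCHAR = 21
    JSON = 23
    FLOAT_VECTOR = 101


def summarize_fields(description: dict) -> list[str]:
    """Return one readable line per field of a describe_collection-style dict."""
    lines = []
    for field in description["fields"]:
        type_name = FieldType(field["type"]).name  # ValueError on unknown codes
        role = "primary key" if field.get("is_primary") else "field"
        dim = field.get("params", {}).get("dim")
        suffix = f", dim={dim}" if dim is not None else ""
        lines.append(f"{field['name']}: {type_name} ({role}{suffix})")
    return lines


# Trimmed copy of the "fields" section of the output shown above.
description = {
    "fields": [
        {"field_id": 100, "name": "id", "type": 5, "params": {}, "is_primary": True},
        {"field_id": 101, "name": "vector", "type": 101, "params": {"dim": 768}},
    ]
}

for line in summarize_fields(description):
    print(line)
```

With the trimmed description above, this prints one line for the INT64 primary key id and one for the 768-dimensional FLOAT_VECTOR field vector.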

Running the snippet above lets Zilliz Cloud apply the following default settings:

  • Primary key

    Zilliz Cloud automatically creates a primary key field and names it id.

  • Vector field

    A default vector field named vector is initialized.

Collections created this way also have the dynamic schema feature enabled automatically (enable_dynamic_field is true in the output above). With it enabled, Zilliz Cloud saves any field in the inserted data that the schema does not define as a dynamic field.
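
To make the idea concrete, here is a small sketch of which keys in a row would count as dynamic fields for the collection created above. It only illustrates the concept and says nothing about how Zilliz Cloud stores the data internally. The split_row helper and the title and claps keys are hypothetical.

```python
# Fields that the starter API defined for the collection above.
DECLARED_FIELDS = {"id", "vector"}


def split_row(row: dict) -> tuple[dict, dict]:
    """Separate a row into schema-defined fields and undefined (dynamic) ones."""
    declared = {k: v for k, v in row.items() if k in DECLARED_FIELDS}
    dynamic = {k: v for k, v in row.items() if k not in DECLARED_FIELDS}
    return declared, dynamic


row = {
    "id": 1,
    "vector": [0.1] * 768,            # must match the collection's dimension
    "title": "An example article",    # hypothetical, not defined in the schema
    "claps": 120,                     # hypothetical, not defined in the schema
}

declared, dynamic = split_row(row)
print(sorted(declared))  # keys that map to schema-defined fields
print(dynamic)           # keys that would be saved as dynamic fields
```

In this example, id and vector match the schema, while title and claps are the keys that would be saved as dynamic fields on insertion.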

Supported data types

To help with schema design, Zilliz Cloud supports the following data types:

  • Boolean values: BOOLEAN

  • Floating points: DOUBLE (8-byte) and FLOAT (4-byte)

  • Integers: INT8 (8-bit), INT32 (32-bit), and INT64 (64-bit)

  • Float vectors: FLOAT_VECTOR

  • Characters: VARCHAR

  • Structured data types: JSON

Use these types as the building blocks of your collection's schema.

Limits

In a serverless cluster, you can create up to two collections. In a dedicated cluster, the maximum number of collections depends on the number of compute units (CUs) the cluster uses.

Cluster type                           Maximum number of collections
Serverless cluster                     2
Dedicated cluster (8 CUs or fewer)     32
Dedicated cluster (more than 8 CUs)    256
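
The limits above can be expressed as a small lookup, for example to check a plan before creating collections. This is a sketch only: the max_collections function and the "serverless" and "dedicated" labels are hypothetical and do not correspond to any Zilliz Cloud API.

```python
def max_collections(cluster_type: str, cu_count: int = 0) -> int:
    """Return the collection limit from the table above.

    cluster_type is a hypothetical label: "serverless" or "dedicated".
    cu_count is only used for dedicated clusters.
    """
    if cluster_type == "serverless":
        return 2
    if cluster_type == "dedicated":
        if cu_count < 1:
            raise ValueError("A dedicated cluster needs at least one CU")
        return 32 if cu_count <= 8 else 256
    raise ValueError(f"Unknown cluster type: {cluster_type!r}")


print(max_collections("serverless"))     # 2
print(max_collections("dedicated", 8))   # 32
print(max_collections("dedicated", 12))  # 256
```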