Skip to main content
Version: User Guides (BYOC)

Enable Dynamic Field

This page explains how to use the dynamic field in a collection for flexible data insertion and retrieval.

Overview

Zilliz Cloud allows you to define the schema of a collection by setting the name and the data type of each specific field so that you can create indexes in these fields for improved search performance.

Once a field is defined, you need to include this field when you insert data. What if some fields are not always present in all your data entries? This is where the dynamic field comes in.

The dynamic field in a collection is a reserved JSON field named $meta. It can hold non-schema-defined fields and their values as key-value pairs. Using the dynamic field, you can search and query both schema-defined fields and any non-schema-defined fields they may have.

Enable dynamic field

When defining a schema for a collection, you can set enable_dynamic_field to True to enable the reserved dynamic field, indicating that any non-schema-defined fields and their values inserted later on will be saved as key-value pairs in the reserved dynamic field.

The following snippet creates a collection with two schema-defined fields, namely id and vector, and enables the dynamic field.

import random, time
from pymilvus import connections, MilvusClient, DataType

CLUSTER_ENDPOINT = "YOUR_CLUSTER_ENDPOINT"
TOKEN = "YOUR_CLUSTER_TOKEN"

# 1. Set up a Milvus client
client = MilvusClient(
uri=CLUSTER_ENDPOINT,
token=TOKEN
)

# 2. Create a collection
schema = MilvusClient.create_schema(
auto_id=False,
enable_dynamic_field=True,
)

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
# The dim value should be greater than 1
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)

index_params = MilvusClient.prepare_index_params()

index_params.add_index(
field_name="id",
index_type="STL_SORT"
)

index_params.add_index(
field_name="vector",
index_type="AUTOINDEX",
metric_type="L2"
)

client.create_collection(
collection_name="test_collection",
schema=schema,
index_params=index_params
)

res = client.get_load_state(
collection_name="test_collection"
)

print(res)

# Output
#
# {
# "state": "<LoadState: Loaded>"
# }

Insert dynamic data

Once the collection is created with the dynamic field enabled, you can start inserting data, including any non-schema-defined fields and their values.

Prepare data

In this section, you need to prepare some randomly generated data for the insertion later on.

colors = ["green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey"]
data = []

for i in range(1000):
current_color = random.choice(colors)
current_tag = random.randint(1000, 9999)
data.append({
"id": i,
"vector": [ random.uniform(-1, 1) for _ in range(5) ],
"color": current_color,
"tag": current_tag,
"color_tag": f"{current_color}_{str(current_tag)}"
})

print(data[0])

You can view the structure of the generated data by checking its first entry.

{
id: 0,
vector: [
0.1275656405044483,
0.47417858592773277,
0.13858264437643286,
0.2390904907020377,
0.8447862593689635
],
color: 'blue',
tag: 2064,
color_tag: 'blue_2064'
}

Insert data

Then you can safely insert the data into the collection.

res = client.insert(
collection_name="test_collection",
data=data,
)

print(res)

# Output
#
# {
# "insert_count": 1000,
# "ids": [
# 0,
# 1,
# 2,
# 3,
# 4,
# 5,
# 6,
# 7,
# 8,
# 9,
# "(990 more items hidden)"
# ]
# }

time.sleep(5)

Search with dynamic fields

If you have created the collection with the dynamic field enabled and inserted non-schema-defined fields, you can use these fields in the filter expression of a search or a query as follows:

# 4. Search with dynamic fields
query_vectors = [[0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]]

res = client.search(
collection_name="test_collection",
data=query_vectors,
filter="color in [\"red\", \"green\"]",
search_params={"metric_type": "L2", "params": {"nprobe": 10}},
limit=3
)

print(res)

# Output
#
# [
# [
# {
# "id": 863,
# "distance": 0.188413605093956,
# "entity": {
# "id": 863,
# "color_tag": "red_2371"
# }
# },
# {
# "id": 799,
# "distance": 0.29188022017478943,
# "entity": {
# "id": 799,
# "color_tag": "red_2235"
# }
# },
# {
# "id": 564,
# "distance": 0.3492690920829773,
# "entity": {
# "id": 564,
# "color_tag": "red_9186"
# }
# }
# ]
# ]

Recaps

It is worth noting that color, tag, and color_tag are not present when you define the collection schema, but you can use them as schema-defined fields when you conduct searches and queries.

If the name of a non-schema-defined field contains characters other than digits, letters, and underscores, such as plus signs (+), asterisks (*), or dollar signs ($), you have to include the key within $meta[] as shown in the following code snippet when using it in a boolean expression or including it in the output fields.

... 
filter='$meta["$key"] in ["a", "b", "c"]',
output_fields='$meta["$key"]'
...