Version: User Guides (Cloud)

Nullable & Default
Public Preview

Zilliz Cloud allows you to set the nullable attribute and default values for scalar fields, except the primary field. For fields marked as nullable=True, you can skip the field when inserting data, or set it directly to a null value, and the system will treat it as null without causing an error. When a field has a default value, the system will automatically apply this value if no data is specified for the field during insertion.

The nullable attribute and default values simplify data migration from other database systems to Zilliz Cloud: datasets that contain null values can be loaded as they are, and existing default value settings can be preserved. When creating a collection, you can also enable nullable or set default values for fields whose values might be uncertain.

Limits

  • Only scalar fields, excluding the primary field, support default values and the nullable attribute.

  • JSON and Array fields do not support default values.

  • Default values or the nullable attribute can only be configured during collection creation and cannot be modified afterward.

  • Scalar fields with the nullable attribute enabled cannot be used as group_by_field in Grouping Search. For more information about grouping search, refer to Grouping Search.

  • Fields marked as nullable cannot be used as partition keys. For more information about partition keys, refer to Use Partition Key.

  • When creating an index on a scalar field with the nullable attribute enabled, null values will be excluded from the index.
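The schema-time limits above can be checked before a collection is created. The following is a minimal sketch in plain Python, not part of pymilvus: the `check_field` helper, the dictionary layout, and the type names given as strings are all hypothetical and only mirror the rules listed above. It does not cover the Grouping Search limit, which applies at search time.

```python
# Hypothetical pre-flight check mirroring the limits above; not a pymilvus API.
NO_DEFAULT_TYPES = {"JSON", "ARRAY"}


def check_field(field):
    """Return a list of limit violations for one field definition (a plain dict)."""
    name = field["name"]
    dtype = field["datatype"]
    nullable = field.get("nullable", False)
    has_default = "default_value" in field
    problems = []

    if nullable or has_default:
        if dtype.endswith("_VECTOR"):
            problems.append(f"{name}: only scalar fields support nullable or default values")
        if field.get("is_primary", False):
            problems.append(f"{name}: the primary field supports neither setting")
    if has_default and dtype in NO_DEFAULT_TYPES:
        problems.append(f"{name}: {dtype} fields do not support default values")
    if nullable and field.get("is_partition_key", False):
        problems.append(f"{name}: nullable fields cannot be partition keys")
    return problems


# Illustrative field definitions (assumed, not taken from a real collection)
fields = [
    {"name": "id", "datatype": "INT64", "is_primary": True},
    {"name": "age", "datatype": "INT64", "nullable": True},
    {"name": "tags", "datatype": "ARRAY", "default_value": []},
    {"name": "region", "datatype": "VARCHAR", "nullable": True, "is_partition_key": True},
]
violations = [p for f in fields for p in check_field(f)]
for line in violations:
    print(line)
```

In this sketch only `tags` and `region` are reported, because they combine a default with an Array type and a nullable attribute with a partition key, respectively.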

Nullable attribute

The nullable attribute allows you to store null values in a collection, providing flexibility when handling unknown data.

Set the nullable attribute

When creating a collection, use nullable=True to define nullable fields (defaults to False). The following example creates a collection named user_profiles_null and sets the age field as nullable:

from pymilvus import MilvusClient, DataType

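client = MilvusClient(
    uri="YOUR_CLUSTER_ENDPOINT",
    token="YOUR_CLUSTER_TOKEN"  # Assumed placeholder; Zilliz Cloud clusters normally require credentials
)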

# Define collection schema
schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)
schema.add_field(field_name="age", datatype=DataType.INT64, nullable=True) # Nullable field

# Set index params
index_params = client.prepare_index_params()
index_params.add_index(field_name="vector", index_type="IVF_FLAT", metric_type="L2", params={ "nlist": 128 })

# Create collection
client.create_collection(collection_name="user_profiles_null", schema=schema, index_params=index_params)

Insert entities

When you insert data into a nullable field, insert null or directly omit this field:

data = [
    {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4, 0.5], "age": 30},
    {"id": 2, "vector": [0.2, 0.3, 0.4, 0.5, 0.6], "age": None},  # Explicit null
    {"id": 3, "vector": [0.3, 0.4, 0.5, 0.6, 0.7]}  # `age` omitted, stored as null
]

client.insert(collection_name="user_profiles_null", data=data)
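Because omitting a nullable field and setting it to null are described as equivalent, rows exported from another system can be inserted with either form. If you prefer every row to carry the same keys before insertion, a small normalization step is enough. The sketch below is plain Python; `with_explicit_nulls` is a hypothetical helper, not a pymilvus function.

```python
def with_explicit_nulls(rows, nullable_fields):
    """Return copies of the rows in which every missing nullable field is set to None."""
    return [
        {**{name: None for name in nullable_fields}, **row}
        for row in rows
    ]


data = [
    {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4, 0.5], "age": 30},
    {"id": 2, "vector": [0.2, 0.3, 0.4, 0.5, 0.6], "age": None},
    {"id": 3, "vector": [0.3, 0.4, 0.5, 0.6, 0.7]},
]

normalized = with_explicit_nulls(data, ["age"])
print([row["age"] for row in normalized])
```

The original rows are left untouched, so either `data` or `normalized` could be passed to `client.insert`.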

Search and query with null values

When using the search method, if a field contains null values, the search result will return the field as null:

res = client.search(
    collection_name="user_profiles_null",
    data=[[0.1, 0.2, 0.4, 0.3, 0.128]],
    limit=2,
    search_params={"params": {"nprobe": 16}},
    output_fields=["id", "age"]
)

print(res)

# Output
# data: ["[{'id': 1, 'distance': 0.15838398039340973, 'entity': {'age': 30, 'id': 1}}, {'id': 2, 'distance': 0.28278401494026184, 'entity': {'age': None, 'id': 2}}]"]

When you use the query method for scalar filtering, any condition evaluated against a null value is false, so entities whose filtered field is null are not selected.

# Reviewing previously inserted data:
# {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4, 0.5], "age": 30}
# {"id": 2, "vector": [0.2, 0.3, 0.4, 0.5, 0.6], "age": None}
# {"id": 3, "vector": [0.3, 0.4, 0.5, 0.6, 0.7], "age": None} # Omitted age column is treated as None

results = client.query(
    collection_name="user_profiles_null",
    filter="age >= 0",
    output_fields=["id", "age"]
)

# Example output:
# [
# {"id": 1, "age": 30}
# ]
# Note: Entities with `age` as `null` (id 2 and 3) will not appear in the result.
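One way to reason about this behavior is that any comparison against null evaluates to false. The sketch below emulates that rule in plain Python for the three entities inserted earlier. It does not call Zilliz Cloud, and the `compare` helper is hypothetical.

```python
import operator

entities = [
    {"id": 1, "age": 30},
    {"id": 2, "age": None},
    {"id": 3, "age": None},
]


def compare(value, op, operand):
    """Evaluate `value op operand`, treating any comparison with null as false."""
    if value is None:
        return False
    return op(value, operand)


# Emulates filter="age >= 0"
selected = [e for e in entities if compare(e["age"], operator.ge, 0)]
# Emulates the complementary filter "age < 0"
complement = [e for e in entities if compare(e["age"], operator.lt, 0)]

print(selected)
print(complement)
```

Under this rule the null entities are matched neither by `age >= 0` nor by its complement `age < 0`, so a filter and its negation together do not necessarily cover the whole collection.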

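To also return entities with null values, query without any scalar filtering condition. An empty filter must be paired with a limit:

null_results = client.query(
    collection_name="user_profiles_null",
    filter="", # Query without any filtering condition
    limit=10, # Required when the filter is empty
    output_fields=["id", "age"]
)

# Example output (all entities, including those with a null `age`):
# [{"id": 1, "age": 30}, {"id": 2, "age": None}, {"id": 3, "age": None}]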

Default values

Default values are preset values assigned to scalar fields. If you do not provide a value for a field with a default during insertion, the system automatically uses the default value.

Set default values

When creating a collection, use the default_value parameter to define the default value for a field. The following example shows how to set the default value of age to 18 and status to "active":

schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)
schema.add_field(field_name="age", datatype=DataType.INT64, default_value=18)
schema.add_field(field_name="status", datatype=DataType.VARCHAR, default_value="active", max_length=10)

index_params = client.prepare_index_params()
index_params.add_index(field_name="vector", index_type="IVF_FLAT", metric_type="L2", params={ "nlist": 128 })

client.create_collection(collection_name="user_profiles_default", schema=schema, index_params=index_params)

Insert entities

When inserting data, if you omit fields with a default value or set their value to null, the system uses the default value:

# Vectors are abbreviated with `...` here; supply full 5-dimensional vectors when running this code
data = [
    {"id": 1, "vector": [0.1, 0.2, ..., 0.128], "age": 30, "status": "premium"},
    {"id": 2, "vector": [0.2, 0.3, ..., 0.129]}, # `age` and `status` use default values
    {"id": 3, "vector": [0.3, 0.4, ..., 0.130], "age": 25, "status": None}, # `status` uses default value
    {"id": 4, "vector": [0.4, 0.5, ..., 0.131], "age": None, "status": "inactive"} # `age` uses default value
]

client.insert(collection_name="user_profiles_default", data=data)
📘Notes

For more information on how nullable and default value settings take effect, refer to Applicable rules.

Search and query with default values

Entities that contain default values are treated the same as any other entities during vector searches and scalar filtering. You can include default values as part of your search and query operations.

For example, in a search operation, entities with age set to the default value of 18 will be included in the results:

res = client.search(
    collection_name="user_profiles_default",
    data=[[0.1, 0.2, 0.4, 0.3, 0.128]],
    search_params={"params": {"nprobe": 16}},
    filter="age == 18", # 18 is the default value of the `age` field
    limit=10,
    output_fields=["id", "age", "status"]
)

print(res)

# Output
# data: ["[{'id': 2, 'distance': 0.28278401494026184, 'entity': {'id': 2, 'age': 18, 'status': 'active'}}, {'id': 4, 'distance': 0.8315839767456055, 'entity': {'id': 4, 'age': 18, 'status': 'inactive'}}]"]

In a query operation, you can match or filter by default values directly:

# Query all entities where `age` equals the default value (18)
default_age_results = client.query(
    collection_name="user_profiles_default",
    filter="age == 18",
    output_fields=["id", "age", "status"]
)

# Query all entities where `status` equals the default value ("active")
default_status_results = client.query(
    collection_name="user_profiles_default",
    filter='status == "active"',
    output_fields=["id", "age", "status"]
)
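To see which entities the two queries above should match, the default-filling rule can be emulated in plain Python on the four inserted rows (vectors left out for brevity). This is a sketch of the expected behavior under the rules described in this guide, not output captured from a cluster; `apply_defaults` is a hypothetical helper.

```python
DEFAULTS = {"age": 18, "status": "active"}

rows = [
    {"id": 1, "age": 30, "status": "premium"},
    {"id": 2},                                    # both fields omitted
    {"id": 3, "age": 25, "status": None},         # `status` set to null
    {"id": 4, "age": None, "status": "inactive"}, # `age` set to null
]


def apply_defaults(row, defaults):
    """Fill omitted or null fields with their default value."""
    stored = dict(row)
    for name, default in defaults.items():
        if stored.get(name) is None:
            stored[name] = default
    return stored


stored_rows = [apply_defaults(r, DEFAULTS) for r in rows]

age_18_ids = [r["id"] for r in stored_rows if r["age"] == 18]
active_ids = [r["id"] for r in stored_rows if r["status"] == "active"]

print(age_18_ids)   # [2, 4]
print(active_ids)   # [2, 3]
```

Once stored, a filled-in default is an ordinary value: a filter such as `age == 18` cannot tell whether 18 was supplied by the user or applied as the default.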

Applicable rules

The following table summarizes the behavior of nullable columns and default values under different configuration combinations. These rules determine how Zilliz Cloud handles data when attempting to insert null values or if field values are not provided.

| Nullable | Default Value | Default Value Type | User Input | Result | Example |
|---|---|---|---|---|---|
| Yes | Yes | Non-null | None/null | Uses the default value | Field: age; Default value: 18; User input: null; Result: stored as 18 |
| Yes | No | - | None/null | Stored as null | Field: middle_name; Default value: -; User input: null; Result: stored as null |
| No | Yes | Non-null | None/null | Uses the default value | Field: status; Default value: "active"; User input: null; Result: stored as "active" |
| No | No | - | None/null | Throws an error | Field: email; Default value: -; User input: null; Result: operation rejected, system throws an error |
| No | Yes | Null | None/null | Throws an error | Field: username; Default value: null; User input: null; Result: operation rejected, system throws an error |
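The rules in the table above can be condensed into a single decision function. The following plain-Python sketch is one reading of the table, not the service's implementation: `resolve`, `is_rejected`, and `NullNotAllowedError` are hypothetical names, and the case of a nullable field with a null default is not covered by the table, so storing null there is an assumption.

```python
class NullNotAllowedError(ValueError):
    """Raised in this sketch where the table says the operation is rejected."""


_UNSET = object()  # Sentinel meaning "no default value configured"


def resolve(nullable, user_input, default=_UNSET):
    """Return the value that would be stored for one field, following the table."""
    if user_input is not None:
        return user_input  # An explicit non-null value is stored as given
    if default is not _UNSET and default is not None:
        return default     # Null or omitted input falls back to a non-null default
    if nullable:
        return None        # No usable default, but null is allowed (assumed for a null default)
    raise NullNotAllowedError("null input on a non-nullable field without a non-null default")


def is_rejected(nullable, user_input, default=_UNSET):
    """Return True if the combination would be rejected in this sketch."""
    try:
        resolve(nullable, user_input, default)
    except NullNotAllowedError:
        return True
    return False


print(resolve(True, None, 18))         # age: stored as 18
print(resolve(True, None))             # middle_name: stored as null
print(resolve(False, None, "active"))  # status: stored as "active"
print(is_rejected(False, None))        # email: rejected
print(is_rejected(False, None, None))  # username: rejected
```

A sentinel is used for "no default" so that a default explicitly set to null (the last row of the table) can be told apart from a field with no default at all.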