JSON Field Overview

When building applications like product catalogs, content management systems, or user preference engines, you often need to store flexible metadata alongside your vector embeddings. Product attributes vary by category, user preferences evolve over time, and document properties have complex nested structures. JSON fields in Zilliz Cloud solve this challenge by allowing you to store and query flexible structured data without sacrificing performance.

What is a JSON field?

A JSON field is a schema-defined data type (DataType.JSON) in Zilliz Cloud that stores structured key-value data. Unlike traditional rigid database columns, JSON fields accommodate nested objects, arrays, and mixed data types while providing multiple indexing options for fast queries.

Example JSON field structure:

{
  "metadata": { 
    "category": "electronics",
    "brand": "BrandA",
    "in_stock": true,
    "price": 99.99,
    "string_price": "99.99",
    "tags": ["clearance", "summer_sale"],
    "supplier": {
      "name": "SupplierX",
      "country": "USA",
      "contact": {
        "email": "support@supplierx.com",
        "phone": "+1-800-555-0199"
      }
    }
  }
}

In this example, metadata is a single JSON field that contains a mix of flat values (e.g. category, in_stock), arrays (tags), and nested objects (supplier).

📘Notes

Naming convention: Use only letters, numbers, and underscores in JSON keys. Avoid special characters, spaces, or dots as they may cause parsing issues in queries.
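
The naming convention can be checked on the client before data is inserted. The following is a minimal sketch, assuming that "letters" means ASCII letters; the helper name find_invalid_keys is hypothetical and not part of any Zilliz Cloud SDK.

```python
import re

# Assumption: "letters, numbers, and underscores" is read as ASCII-only.
_KEY_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def find_invalid_keys(value, path=""):
    """Return the paths of all keys that break the naming convention."""
    invalid = []
    if isinstance(value, dict):
        for key, child in value.items():
            key_path = f"{path}[{key!r}]"
            if not isinstance(key, str) or not _KEY_PATTERN.fullmatch(key):
                invalid.append(key_path)
            # Keep walking so nested offenders are reported as well.
            invalid.extend(find_invalid_keys(child, key_path))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            invalid.extend(find_invalid_keys(item, f"{path}[{index}]"))
    return invalid


# A key with a space and a key with a dot are both reported.
print(find_invalid_keys({"unit price": 1, "supplier.name": "SupplierX"}))
```

An empty list means every key, including keys in nested objects and in objects inside arrays, follows the convention.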

JSON field vs. dynamic field

A common point of confusion is the difference between a JSON field and the dynamic field. While both are related to JSON, they serve different purposes.

The table below summarizes the key differences between a JSON field and the dynamic field:

Feature	JSON Field	Dynamic Field
Schema definition	A scalar field that must be explicitly declared in the collection schema with the `DataType.JSON` type.	A hidden JSON field (named `$meta`) that automatically stores undeclared fields.
Use case	Stores structured data where the schema is known and consistent.	Stores flexible, evolving, or semi-structured data that doesn't fit a fixed schema.
Control	You control the field name and structure.	System-managed for undefined fields.
Querying	Query using your field name or target key inside the JSON field: `metadata["key"]`.	Query directly using the dynamic field key: `"dynamic_key"`, or via `$meta`: `$meta["dynamic_key"]`.
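
As a conceptual illustration of the table, the sketch below separates an entity's keys into declared fields and undeclared keys, which the dynamic field collects under $meta. It is a mental model only, assuming the dynamic field is enabled; the helper split_entity is hypothetical and does not describe how Zilliz Cloud stores data internally.

```python
def split_entity(entity, declared_fields):
    """Separate declared fields from undeclared keys, which land under $meta."""
    declared = {k: v for k, v in entity.items() if k in declared_fields}
    dynamic = {k: v for k, v in entity.items() if k not in declared_fields}
    return declared, {"$meta": dynamic}


schema_fields = {"product_id", "vector", "metadata"}
entity = {"product_id": 1, "metadata": {"brand": "BrandA"}, "color": "red"}

declared, hidden = split_entity(entity, schema_fields)

# "metadata" is a declared JSON field, so its keys are addressed by field name.
json_field_filter = 'metadata["brand"] == "BrandA"'
# "color" is undeclared, so it is addressed directly or through $meta.
dynamic_filter = '$meta["color"] == "red"'
```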

Basic operations

The fundamental workflow for using a JSON field involves defining it in your schema, inserting data, and then querying the data using specific filter expressions.

Define a JSON field

To use a JSON field, explicitly define it in your collection schema when creating the collection. The following example demonstrates how to create a collection with a metadata field of type DataType.JSON:

Python

from pymilvus import MilvusClient, DataType

CLUSTER_ENDPOINT = "YOUR_CLUSTER_ENDPOINT"
TOKEN = "YOUR_CLUSTER_TOKEN" 

# Set up a Milvus client
client = MilvusClient(
    uri=CLUSTER_ENDPOINT,
    token=TOKEN 
)

# Create schema
schema = client.create_schema(auto_id=False, enable_dynamic_field=True)

schema.add_field(field_name="product_id", datatype=DataType.INT64, is_primary=True) # Primary field
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5) # Vector field
# Define a JSON field that allows null values
schema.add_field(field_name="metadata", datatype=DataType.JSON, nullable=True)

client.create_collection(
    collection_name="product_catalog",
    schema=schema
)

📘Notes

In this example, the JSON field defined in the collection schema allows null values with nullable=True. For details, refer to Nullable & Default.

Insert data

Once the collection is created, insert entities that contain structured JSON objects in your designated JSON field. Your data should be formatted as a list of dictionaries.

Python

entities = [
    {
        "product_id": 1,
        "vector": [0.1, 0.2, 0.3, 0.4, 0.5],
        "metadata": { # JSON field
            "category": "electronics",
            "brand": "BrandA",
            "in_stock": True,
            "price": 99.99,
            "string_price": "99.99",
            "tags": ["clearance", "summer_sale"],
            "supplier": {
                "name": "SupplierX",
                "country": "USA",
                "contact": {
                    "email": "support@supplierx.com",
                    "phone": "+1-800-555-0199"
                }
            }
        }
    }
]

client.insert(collection_name="product_catalog", data=entities)
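
Before calling insert, it can help to confirm that the data really is a list of dictionaries and that each JSON field value is serializable. The following is a minimal client-side sketch using only the Python standard library; check_entities is a hypothetical helper, and the checks the service itself performs may differ.

```python
import json


def check_entities(entities, json_field="metadata"):
    """Raise if entities is not a list of dicts with JSON-serializable values."""
    if not isinstance(entities, list):
        raise TypeError("entities must be a list of dictionaries")
    for position, entity in enumerate(entities):
        if not isinstance(entity, dict):
            raise TypeError(f"entity {position} is not a dictionary")
        try:
            # A missing value serializes as null, matching a nullable field.
            # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
            json.dumps(entity.get(json_field), allow_nan=False)
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"entity {position}: {json_field} is not valid JSON"
            ) from exc
    return len(entities)


sample = [{"product_id": 1, "metadata": {"tags": ["clearance", "summer_sale"]}}]
checked = check_entities(sample)
```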

Filtering operations

Before you can perform filtering operations on JSON fields, make sure:

You have created an index on each vector field.
The collection is loaded into memory.

The following example indexes the vector field and loads the collection:

Python

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="AUTOINDEX",
    index_name="vector_index",
    metric_type="COSINE"
)

client.create_index(collection_name="product_catalog", index_params=index_params)

client.load_collection(collection_name="product_catalog")

Once these requirements are met, you can filter the collection on values inside the JSON field. The filter expressions below use JSON path syntax and dedicated JSON operators.

Filtering with JSON path syntax

To query a specific key, use bracket notation to access JSON keys: json_field_name["key"]. For nested keys, chain them together: json_field_name["key1"]["key2"].
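
When filter strings are assembled in code, the bracket notation can be generated rather than typed by hand. The following is a small sketch; json_path and equals_filter are hypothetical helpers, and json.dumps is used only to add double quotes around keys and string values.

```python
import json


def json_path(field_name, *keys):
    """Build a bracket-notation path such as metadata["supplier"]["country"]."""
    return field_name + "".join(f"[{json.dumps(key)}]" for key in keys)


def equals_filter(field_name, keys, value):
    """Build an equality filter for a string or numeric value."""
    return f"{json_path(field_name, *keys)} == {json.dumps(value)}"


flat = equals_filter("metadata", ["category"], "electronics")
nested = equals_filter("metadata", ["supplier", "country"], "USA")
print(flat)
print(nested)
```

The resulting strings can be passed as the filter argument of a search or query call.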

To filter for entities where the category is "electronics":

Python

# Define filter expression
filter = 'metadata["category"] == "electronics"'

res = client.search(
    collection_name="product_catalog",        # Collection name
    data=[[0.1, 0.2, 0.3, 0.4, 0.5]],         # Query vector (must match the collection's vector dim)
    limit=5,                                  # Max. number of results to return
    filter=filter,                            # Filter expression
    output_fields=["product_id", "metadata"]  # Fields to include in the search results
)

print(res)

To filter for entities where the nested key supplier["country"] is "USA":

Python

# Define filter expression
filter = 'metadata["supplier"]["country"] == "USA"'

res = client.search(
    collection_name="product_catalog",  # Collection name
    data=[[0.1, 0.2, 0.3, 0.4, 0.5]],               # Query vector (must match collection's vector dim)
    limit=5,                           # Max. number of results to return
    filter=filter,                    # Filter expression
    output_fields=["product_id", "metadata"]   # Fields to include in the search results
)

print(res)

Filtering with JSON-specific operators

Zilliz Cloud also provides special operators for querying array values on specific JSON field keys. For example:

json_contains(identifier, expr): Checks if a specific element or sub-array exists within a JSON array
json_contains_all(identifier, expr): Ensures that all elements of the specified JSON expression are present in the field
json_contains_any(identifier, expr): Filters entities where at least one member of the JSON expression exists within the field
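
To make the three operators concrete, the sketch below mimics their described semantics on a local Python list. It is an illustration only: these functions are plain Python stand-ins that share the operator names, and the server-side behavior for edge cases such as type coercion or sub-array matching is not modeled here.

```python
def json_contains(array, element):
    """True if the element is a member of the array."""
    return isinstance(array, list) and element in array


def json_contains_all(array, elements):
    """True if every listed element is a member of the array."""
    return isinstance(array, list) and all(e in array for e in elements)


def json_contains_any(array, elements):
    """True if at least one listed element is a member of the array."""
    return isinstance(array, list) and any(e in array for e in elements)


tags = ["clearance", "summer_sale"]

has_summer_sale = json_contains(tags, "summer_sale")
has_both = json_contains_all(tags, ["clearance", "summer_sale"])
has_any = json_contains_any(tags, ["electronics", "new", "clearance"])
```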

To find a product that has the "summer_sale" value under the tags key:

Python

# Define filter expression
filter = 'json_contains(metadata["tags"], "summer_sale")'

res = client.search(
    collection_name="product_catalog",  # Collection name
    data=[[0.1, 0.2, 0.3, 0.4, 0.5]],               # Query vector (must match collection's vector dim)
    limit=5,                           # Max. number of results to return
    filter=filter,                    # Filter expression
    output_fields=["product_id", "metadata"]   # Fields to include in the search results
)

print(res)

To find a product that has at least one of the "electronics", "new", or "clearance" values under the tags key:

Python

# Define filter expression
filter = 'json_contains_any(metadata["tags"], ["electronics", "new", "clearance"])'

res = client.search(
    collection_name="product_catalog",  # Collection name
    data=[[0.1, 0.2, 0.3, 0.4, 0.5]],               # Query vector (must match collection's vector dim)
    limit=5,                           # Max. number of results to return
    filter=filter,                    # Filter expression
    output_fields=["product_id", "metadata"]   # Fields to include in the search results
)

print(res)

For more information about JSON-specific operators, refer to JSON Operators.

Next: Accelerate JSON queries

Without acceleration, a query on a JSON field scans every row, which can be slow on large datasets. To speed up JSON queries, Zilliz Cloud provides indexing and storage optimization features.

The table below summarizes their differences and best-use scenarios:

Technique	Best For	Arrays Acceleration	Notes
JSON Indexing	Small set of frequently accessed keys, arrays on a specific array key	Yes (on indexed array key)	Must preselect keys, maintenance needed if schema evolves
JSON Shredding	General speed-up across many keys, flexible for varied queries	Yes (slightly accelerates array values compared to brute-force queries)	Extra storage config, arrays still need per-key index
NGRAM Index	Wildcard searches, substring matching in text fields	N/A	Not for numeric/range filters

Tip: You can combine these approaches. For example, use JSON shredding for broad query acceleration, JSON indexing for high-frequency array keys, and NGRAM indexing for flexible text search.

For implementation details, refer to the documentation for each technique listed in the table above.

FAQ

Are there any limitations on the size of a JSON field?

Yes. Each JSON field is limited to 65,536 bytes.
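
A rough client-side size check can catch oversized values before insertion. The sketch below measures the compact UTF-8 encoding of a value; how the service serializes and counts bytes is an assumption here, so treat the result as an estimate rather than the exact figure the service enforces. The helper names are hypothetical.

```python
import json

MAX_JSON_BYTES = 65536  # Limit stated above.


def json_size_bytes(value):
    """Size of the value's compact JSON encoding in UTF-8 bytes (an estimate)."""
    text = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def fits_json_limit(value):
    """True if the estimated size is within the documented limit."""
    return json_size_bytes(value) <= MAX_JSON_BYTES


small = {"category": "electronics", "price": 99.99}
large = {"description": "x" * 70000}

small_ok = fits_json_limit(small)
large_ok = fits_json_limit(large)
```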

Does a JSON field support setting a default value?

No, JSON fields do not support default values. However, you can set nullable=True when defining the field to allow empty entries.

Refer to Nullable & Default for details.

Are there any naming conventions for JSON field keys?

Yes, to ensure compatibility with queries and indexing:

Use only letters, numbers, and underscores in JSON keys.
Avoid special characters such as spaces, dots (.), and slashes (/).
Incompatible keys may cause parsing issues in filter expressions.

How does Zilliz Cloud handle string values in JSON fields?

Zilliz Cloud stores string values exactly as they appear in the JSON input—without semantic transformation. Improperly quoted strings may result in errors during parsing.

Examples of valid strings:

"a\"b", "a'b", "a\\b"

Examples of invalid strings:

'a"b', 'a\'b'
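
The difference between the valid and invalid examples follows standard JSON quoting: strings are delimited by double quotes, and an inner double quote or backslash must be escaped. The sketch below uses Python's json module as a stand-in to illustrate this; it is an assumption that the Zilliz Cloud parser behaves the same way in every case, and is_valid_json_string is a hypothetical helper.

```python
import json


def is_valid_json_string(literal):
    """Return True if the literal parses as a JSON string."""
    try:
        return isinstance(json.loads(literal), str)
    except json.JSONDecodeError:
        return False


# Raw strings keep the backslashes exactly as they appear in the examples.
valid_examples = [r'"a\"b"', '"a\'b"', r'"a\\b"']
invalid_examples = ["'a\"b'", r"'a\'b'"]

valid_results = [is_valid_json_string(s) for s in valid_examples]
invalid_results = [is_valid_json_string(s) for s in invalid_examples]

# json.dumps produces a correctly quoted and escaped literal from a Python string.
escaped = json.dumps('a"b')
```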
