Version: User Guides (BYOC)

Scale Cluster

As your workload grows and more data is written, the cluster may reach its capacity limit. In such cases, read operations will continue to function, but new write operations may fail.

To manage this proactively, monitor CU Capacity on the metrics page to determine when scaling is needed. Depending on your business needs and usage patterns, you can increase the number of query CUs to expand cluster capacity, or reduce it when demand drops to save on costs.

This guide explains how to resize a cluster to suit your changing workload.

📘Notes

For clusters with 1 to 8 CUs, you can scale the query CU directly. For clusters with more than 8 CUs, increase the number of replicas instead.

Considerations

  • Plan Availability: Only supported for Dedicated clusters.

  • Resource Limitations:

    • Scale up

      • Dedicated (Standard) clusters: Up to 32 CUs

      • Dedicated (Enterprise) clusters: Up to 256 CUs

      • The product of the query CU count and the replica count must not exceed 256.

      For a larger query CU size, contact sales.

    • Scale down

      • Clusters with replicas cannot scale down to fewer than 8 CUs

      • A scale-down request only succeeds if:

  • During Scaling: The cluster status changes to “Modifying,” and no operations can be performed on the cluster until scaling completes. If multiple scaling tasks are triggered, they are processed sequentially in order of their trigger timestamps. Completion time depends on data volume.

  • Performance Impact: Scaling may cause slight service jitter.

  • Backup Limitations: Dynamic and scheduled scaling settings are not included in backups. After restoring a cluster, reconfigure these settings manually.

Manual scaling

You can manually scale your cluster up or down via the Zilliz Cloud console or RESTful API. Note that scheduled scaling is only available on the web console.

Via web console

The following demo shows how to manually scale up and down a cluster on the Zilliz Cloud web console.

Via RESTful API

The following example scales an existing cluster to 2 CUs. For details, see Modify Cluster.

curl --request POST \
    --url "${BASE_URL}/v2/clusters/${CLUSTER_ID}/modify" \
    --header "Authorization: Bearer ${TOKEN}" \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    -d '{
        "cuSize": 2
    }'

The following is an example output.

{
    "code": 0,
    "data": {
        "clusterId": "inxx-xxxxxxxxxxxxxxx",
        "prompt": "successfully submitted. Cluster is being upgraded, which is expected to take several minutes. You can access data about the creation progress and status of your cluster by DescribeCluster API. Once the cluster status is RUNNING, you may access your vector database using the SDK."
    }
}
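Because the request is only submitted at this point, a script usually needs to wait until the cluster status returns to RUNNING before continuing. The following is a minimal polling sketch under stated assumptions: the Describe Cluster endpoint is assumed to be GET ${BASE_URL}/v2/clusters/${CLUSTER_ID}, and its response is assumed to carry a "status" string field. The helper names describe_cluster, extract_status, and wait_for_running are hypothetical. Check the Describe Cluster reference before relying on it.

```shell
# Assumption: Describe Cluster is a GET on the cluster resource.
describe_cluster() {
  curl --silent --request GET \
    --url "${BASE_URL}/v2/clusters/${CLUSTER_ID}" \
    --header "Authorization: Bearer ${TOKEN}" \
    --header "Accept: application/json"
}

# Pull the first "status" string value out of a JSON response on stdin.
# A JSON-aware tool such as jq would be more robust; sed keeps this sketch
# free of extra dependencies.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll until the status is RUNNING or the attempts run out.
# $1: maximum attempts (default 30), $2: seconds between attempts (default 10)
wait_for_running() {
  max_attempts="${1:-30}"
  interval="${2:-10}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$(describe_cluster | extract_status)"
    if [ "$status" = "RUNNING" ]; then
      echo "RUNNING"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1
}
```

Calling wait_for_running 60 10 after the modify request would check every 10 seconds for up to 10 minutes. A non-zero return means the cluster did not reach RUNNING within the allotted attempts.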