Global Cluster Explained

Zilliz Cloud global cluster lets you deploy a primary cluster and multiple read-only secondary clusters across multiple regions on the same cloud provider.

This feature is designed for globally distributed, mission-critical applications. It helps you achieve resilience against regional outages and low-latency local reads for users around the world.

📘Notes

This feature is available only to Dedicated clusters in a Business Critical project.

Overview

A Zilliz Cloud global cluster consists of one primary cluster and up to five read-only secondary clusters deployed in different regions on the same cloud provider.

Primary cluster: The authoritative source of your data. It handles all write operations and can serve read requests with the same capacity as any secondary cluster.
Secondary clusters: Geographically distributed, read-only followers. They serve two purposes: acting as standbys for disaster recovery and serving read traffic to users in their own region.

All writes are directed to the primary cluster. Zilliz Cloud then automatically replicates data changes from the primary cluster to all secondary clusters.

This multi-region setup provides:

Resilience against regional outages: If the primary cluster fails or its region experiences an outage, you can promote a secondary cluster to primary.
Low-latency reads: Because a full copy of your data is available in multiple geographic locations, applications can read from the nearest region to minimize latency.

Connectivity and routing

The following diagram shows how a global cluster works in Zilliz Cloud.

[Diagram: how a global cluster works in Zilliz Cloud]

Your application connects to a global cluster through a global endpoint, which provides the following benefits:

One unified URL: Your application uses one global endpoint that does not change, regardless of the underlying infrastructure. During a planned switchover or an emergency failover, the endpoint updates its internal routing automatically, so you do not need to change the connection endpoint in your application code.
Intelligent routing: The global endpoint routes write requests to the primary cluster and read requests to the primary or an appropriate secondary cluster based on latency and workload.
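
The routing rule can be pictured with a small toy model. This is only an illustrative sketch, not the actual routing logic of the global endpoint: the `Cluster` class, the `route` function, and the latency figures are all hypothetical, and the sketch considers latency alone while ignoring workload.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    role: str          # "primary" or "secondary"
    latency_ms: float  # hypothetical client-to-cluster latency


def route(request_type: str, clusters: list[Cluster]) -> Cluster:
    """Pick a target cluster for a request in this toy model."""
    if request_type == "write":
        # Writes always go to the single primary cluster.
        return next(c for c in clusters if c.role == "primary")
    if request_type == "read":
        # Reads may go to any cluster; this sketch looks at latency only.
        return min(clusters, key=lambda c: c.latency_ms)
    raise ValueError(f"unknown request type: {request_type}")


# Hypothetical deployment as seen from one client location.
clusters = [
    Cluster("cluster_01", "primary", 120.0),
    Cluster("cluster_02", "secondary", 15.0),
    Cluster("cluster_03", "secondary", 80.0),
]
print(route("write", clusters).name)  # cluster_01
print(route("read", clusters).name)   # cluster_02
```

In this model the application never names a cluster itself; it only states whether a request is a read or a write, which mirrors the idea of keeping one unchanging endpoint in application code.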

Switchover and failover

Zilliz Cloud global clusters support two operations that change which region hosts the primary cluster:

Switchover: A planned operation that promotes a fully synchronized secondary cluster to become the new primary cluster. During a switchover, Zilliz Cloud first ensures that the selected secondary cluster in another region is fully synchronized with the current primary. The secondary is then promoted to become the new primary cluster. Because promotion occurs only after synchronization is complete, no data loss is expected. To learn how to perform a switchover, see Manage Global Cluster.
Failover: An unplanned emergency recovery operation that promotes a secondary cluster to primary after an outage in the original primary region. During a cross-region failover, Zilliz Cloud first fences the original primary cluster so that it stops accepting new write requests. Then, a secondary cluster in another region is promoted to primary based on its latest replicated state.
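
The difference in data loss between the two operations can be illustrated with a toy replication model. This is a simplified sketch under stated assumptions, not how Zilliz Cloud implements replication: the `GlobalClusterModel` class and its methods are hypothetical, and replication is reduced to copying entries from one list to another.

```python
class GlobalClusterModel:
    """Toy model: the primary holds a write log; one secondary lags behind it."""

    def __init__(self) -> None:
        self.primary_log: list[str] = []
        self.secondary_log: list[str] = []

    def write(self, record: str) -> None:
        # All writes land on the primary first.
        self.primary_log.append(record)

    def replicate(self, max_records: int) -> None:
        """Copy up to max_records pending writes to the secondary."""
        start = len(self.secondary_log)
        self.secondary_log.extend(self.primary_log[start:start + max_records])

    def switchover(self) -> list[str]:
        """Planned: finish synchronization first, then promote."""
        self.replicate(len(self.primary_log))
        return list(self.secondary_log)

    def failover(self) -> list[str]:
        """Unplanned: promote the secondary from its latest replicated state."""
        return list(self.secondary_log)


model = GlobalClusterModel()
for i in range(5):
    model.write(f"w{i}")
model.replicate(3)  # only three of the five writes have been replicated

print(len(model.primary_log) - len(model.failover()))    # 2
print(len(model.primary_log) - len(model.switchover()))  # 0
```

The two unreplicated writes are what a failover at that moment would not carry over, while the switchover path closes the gap before promotion.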

The following table compares these two concepts.

	Switchover	Failover
Typical use case	Planned operations (regional rotation, compliance, data residency, etc.)	Unplanned outage or failure in the primary region
Trigger	Manually initiated for operational reasons	Manually initiated as a recovery action after an incident
RPO & RTO	RPO: 0 (no data loss); RTO: near zero	RPO: equal to the replication lag between the old and new primary clusters, typically a few seconds; RTO: typically a few minutes

Typical use cases

The global cluster feature has two typical use cases:

Disaster recovery & high availability: You need clusters in multiple regions for failover. In this case, connect to the global cluster through a global endpoint, and Zilliz Cloud automatically routes traffic.
Data replication between environments: You run multiple clusters (for example, production and testing) in the same region and need to replicate data between them. In this case, connect to each cluster using its public endpoint.

Billing

In a global cluster, both the primary and secondary clusters are billed as regular Zilliz Cloud Dedicated clusters for their compute and storage usage, plus additional data transfer charges for data replication between clusters.

Suppose your global cluster configuration is as follows:

A primary cluster cluster_01 in Region A
Two secondary clusters:
- cluster_02 in Region B
- cluster_03 in Region C

You’ll be charged for the sum of the following:

Vector database (compute) costs for cluster_01, cluster_02, and cluster_03
Storage costs for cluster_01, cluster_02, and cluster_03
Data transfer costs from cluster_01 to cluster_02 and cluster_03
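
The sum above can be written out as a short calculation. The function name and all amounts below are hypothetical placeholders chosen for illustration; they are not Zilliz Cloud list prices.

```python
def global_cluster_cost(
    compute: dict[str, float],
    storage: dict[str, float],
    transfer: dict[str, float],
) -> float:
    """Total charge = compute + storage for every cluster + replication transfer."""
    return sum(compute.values()) + sum(storage.values()) + sum(transfer.values())


# Hypothetical monthly amounts per cluster (placeholders, not real prices).
compute = {"cluster_01": 100.0, "cluster_02": 100.0, "cluster_03": 100.0}
storage = {"cluster_01": 10.0, "cluster_02": 10.0, "cluster_03": 10.0}
# Data transfer is charged per replication target of cluster_01.
transfer = {"cluster_02": 5.0, "cluster_03": 7.5}

print(global_cluster_cost(compute, storage, transfer))  # 342.5
```

Note that the transfer term has one entry per secondary cluster, because replication flows from the primary to each secondary.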

For detailed list prices, see Zilliz Cloud List Price.

Limitations

Plan availability: The global cluster feature requires a project on the Business Critical plan.
Access control: You need to be a Project Admin to configure a global cluster.
Usage:
- You can add up to five secondary clusters.
- You cannot suspend a global cluster or any of its primary or secondary clusters.
- The cluster type, cloud provider, query CU count, and replica count must be the same across the primary and all secondary clusters.
- You can scale the replica count of a primary cluster but not that of a secondary cluster.
- To drop a global cluster, you need to drop all of its primary and secondary clusters.
- The backup policy is configured on the primary cluster only. After a switchover or failover, the backup policy automatically applies to the new primary cluster.
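
Some of these limits can be checked before you set up a global cluster. The following is a minimal pre-flight sketch, assuming you keep your planned configuration in your own data structures: `ClusterSpec`, `validate`, and the sample values are hypothetical and are not part of any Zilliz Cloud SDK or API.

```python
from dataclasses import dataclass

MAX_SECONDARIES = 5  # limit stated in this guide


@dataclass(frozen=True)
class ClusterSpec:
    name: str
    cluster_type: str
    cloud_provider: str
    query_cu: int
    replicas: int


def validate(primary: ClusterSpec, secondaries: list[ClusterSpec]) -> list[str]:
    """Return a list of problems; an empty list means the plan passes these checks."""
    problems = []
    if len(secondaries) > MAX_SECONDARIES:
        problems.append(f"too many secondary clusters: {len(secondaries)}")
    # These four settings must match the primary on every secondary.
    for s in secondaries:
        for field in ("cluster_type", "cloud_provider", "query_cu", "replicas"):
            if getattr(s, field) != getattr(primary, field):
                problems.append(f"{s.name}: {field} differs from primary")
    return problems


# Hypothetical plan: cluster_03 has a different query CU count.
primary = ClusterSpec("cluster_01", "dedicated", "aws", 8, 2)
good = ClusterSpec("cluster_02", "dedicated", "aws", 8, 2)
bad = ClusterSpec("cluster_03", "dedicated", "aws", 4, 2)

print(validate(primary, [good]))       # []
print(validate(primary, [good, bad]))  # ['cluster_03: query_cu differs from primary']
```

The sketch covers only the count and consistency rules; plan availability and access control are enforced by Zilliz Cloud itself.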
