
Metrics & Alerts Reference

In this reference, you can find descriptions of monitoring metrics for Zilliz Cloud clusters, as well as alert targets that you can set up at organization and project levels.

Cluster metrics

The Metrics tab in the Zilliz Cloud console displays cluster metrics as charts.

The following table describes each metric and the action recommended when your cluster's resource usage exceeds a threshold.

📘Notes

To unlock a range of advanced metrics, upgrade your plan tier.

Metric Name	Unit	Description	Recommended Action

Resources
Read vCUs	Count	A measure of vCU consumption of search and query operations. This metric is available only for Free or Serverless clusters. For more information on cluster plan tiers, refer to Select the Right Cluster Plan.	-
Write vCUs	Count	A measure of vCU consumption of insert, delete, and upsert operations. This metric is available only for Free or Serverless clusters. For more information on cluster plan tiers, refer to Select the Right Cluster Plan.	-
CU Computation	%	A measure of the utilized computational power relative to the total computational capacity of the CU. This metric is available only for Dedicated or BYOC clusters. For more information on cluster plan tiers, refer to Select the Right Cluster Plan.	70%-80%: Check service status and prepare for scaling up. > 90%: Scale up immediately to avoid service interruption.
CU Capacity	%	A measure of the used capacity relative to the total capacity of the CU. This metric is available for Free, Dedicated or BYOC clusters. For more information on cluster plan tiers, refer to Select the Right Cluster Plan.	70%-80%: Check service status and prepare for scaling up. > 90%: Scale up immediately to avoid service interruption. 100%: When CU capacity reaches 100%, you will be unable to write data into the cluster. Please scale up immediately to avoid service interruption.
Storage	GB	The total amount of persistent storage consumed by data and indexes.	Configure alerts for monitoring storage usage.
Performance
QPS/VPS (Read)	QPS/VPS	QPS: The number of read requests (search and query) per second. VPS: The number of read requests (search) on vectors per second. VPS is not available for query requests as query operations do not involve vectors.	Refer to benchmark for system performance monitoring.
QPS/VPS (Write)	QPS/VPS	QPS: The number of write requests (insert, bulk insert, upsert, and delete) per second. VPS: The number of write requests (insert, bulk insert, upsert, and delete) on vectors per second.	Refer to benchmark for system performance monitoring.
Latency (Read)	ms	The time elapsed between a client sending a read request (search and query) to a server and the client receiving a response. Selecting Average or P99 from the expanded dropdown menu on the right displays an average or P99 latency.	-
Latency (Write)	ms	The time elapsed between a client sending a write request (insert, upsert, and delete) to a server and the client receiving a response. Selecting Average or P99 from the expanded dropdown menu on the right displays an average or P99 latency.	-
Request Failure Rate (Read)	%	The percentage of failed read requests (search and query) in all read requests per second.	Configure alerts to monitor read request failure rate.
Request Failure Rate (Write)	%	The percentage of failed write requests (insert, bulk insert, upsert, and delete) in all write requests per second.	Configure alerts to monitor write request failure rate.
Slow Query Count	count/min	The number of slow query operations, including all search and query requests. By default, requests with a latency of 5 seconds or longer are considered slow queries. This metric type is available only for Dedicated clusters of the Enterprise edition or BYOC clusters. For more information on cluster types, refer to Select the Right Cluster Plan.	Identify problematic queries and tune performance by adjusting cluster configuration as necessary.
Cluster Write Performance Capacity	%	The ratio of the current write rate to the write rate limit. This metric type is available only for Dedicated clusters of the Enterprise edition or BYOC clusters. For more information on cluster types, refer to Select the Right Cluster Plan.	If the ratio is high (over 80% is the suggested threshold), lower the write rate.
Number of Flush Operations	count/min	The number of flush operations on a cluster. This metric type is available only for Dedicated clusters of the Enterprise edition or BYOC clusters. For more information on cluster types, refer to Select the Right Cluster Plan.	Performing flush operations too frequently can negatively impact the overall performance of the cluster. For more information, refer to Zilliz Cloud Limits.
Data
Collection Count	count	The number of collections created in a cluster.	-
Entity Count	count	The number of entities inserted into a cluster. Selecting a specific collection from the expanded dropdown menu on the right displays the number of entities at the collection level.	-
Loaded Entities	count	The number of entities loaded (actively served) by a cluster. Selecting a specific collection from the expanded dropdown menu on the right displays the number of loaded entities at the collection level. This metric is available only for Dedicated or BYOC clusters. For more information on cluster plan tiers, refer to Select the Right Cluster Plan.	-
Number of Unloaded Collections	count	The number of unloaded collections in a cluster. This metric type is available only for Dedicated clusters of the Enterprise edition or BYOC clusters. For more information on cluster types, refer to Select the Right Cluster Plan.	-
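
As a rough illustration of the thresholds in the table above, the following Python sketch maps a CU Computation or CU Capacity reading to its recommended action and computes Cluster Write Performance Capacity as a ratio. The function names are hypothetical and not part of any Zilliz Cloud SDK. The table lists 70%-80% and > 90% without covering the range in between, so the sketch assumes the "prepare for scaling up" band extends up to 90%.

```python
from typing import Optional


def recommended_cu_action(utilization_pct: float, is_capacity: bool = False) -> Optional[str]:
    """Map a CU Computation or CU Capacity reading (in %) to a recommended action.

    Hypothetical helper: the thresholds come from the metrics table, the
    function itself is only an illustration.
    """
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    if is_capacity and utilization_pct >= 100:
        # At 100% CU Capacity, data can no longer be written to the cluster.
        return "Writes are blocked: scale up immediately"
    if utilization_pct > 90:
        return "Scale up immediately"
    if utilization_pct >= 70:
        # Assumption: the table names 70%-80%; this sketch extends the band to 90%.
        return "Check service status and prepare for scaling up"
    return None


def write_performance_capacity(current_rate: float, rate_limit: float) -> float:
    """Return the current write rate as a percentage of the write rate limit."""
    if rate_limit <= 0:
        raise ValueError("rate_limit must be positive")
    return 100.0 * current_rate / rate_limit


if __name__ == "__main__":
    print(recommended_cu_action(75))
    print(recommended_cu_action(100, is_capacity=True))
    print(write_performance_capacity(40, 50))
```

A write performance capacity above 80% is the point at which the table suggests lowering the write rate.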

Organization alerts

Organization alerts keep you informed about billing-related issues such as expiring credit cards, the status of free credits, balance alerts for advance payments, and notifications regarding usage costs.

Alert Target	Unit	Description	Recommended Action	Default Trigger Condition
Expiration Date of Credit card	Day	Monitor the remaining days until the credit card's expiration to ensure uninterrupted service.	Renew or update credit card information before the expiration date.	WARNING: Trigger alerts within 30 days of card expiration. CRITICAL: Trigger alerts within 7 days of card expiration.
Remaining Credits	$	Track the balance of free credits, alerting the user when it falls low to prompt a top-up.	Top up credits to maintain account functionality.	Trigger WARNING alerts when the balance of free credits falls below $10.
Credit Validity Period	Day	Monitor the remaining validity period of free credits, alerting the user to encourage usage or extension.	Extend the validity period or use the credits before they expire.	Trigger WARNING alerts when the validity period of free credits reaches 0 days.
Advance Pay Balance	$	Monitor the advance pay balance, alerting the user when it falls low to prevent service disruption.	Add funds to the advance pay balance to avoid service interruption.	Trigger CRITICAL alerts when the balance falls below $100.
Usage Amount	$	Track the usage amount, informing the user when it exceeds a set threshold to suggest monitoring and management.	Monitor and manage usage to stay within budget limits.	Trigger WARNING alerts when the amount of usage exceeds $100.
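
The default trigger conditions in the table above can be expressed as a small set of comparisons. The Python sketch below is only an illustration: the `BillingSnapshot` class and its field names are hypothetical, and it assumes that "within N days" includes day N itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BillingSnapshot:
    # Hypothetical field names that mirror the alert targets in the table.
    days_until_card_expiry: int
    remaining_credits_usd: float
    credit_validity_days: int
    advance_pay_balance_usd: float
    usage_amount_usd: float


def organization_alerts(s: BillingSnapshot) -> List[Tuple[str, str]]:
    """Return (severity, alert target) pairs for the default trigger conditions."""
    alerts: List[Tuple[str, str]] = []
    # Assumption: "within N days" is treated as inclusive of day N.
    if s.days_until_card_expiry <= 7:
        alerts.append(("CRITICAL", "Expiration Date of Credit card"))
    elif s.days_until_card_expiry <= 30:
        alerts.append(("WARNING", "Expiration Date of Credit card"))
    if s.remaining_credits_usd < 10:
        alerts.append(("WARNING", "Remaining Credits"))
    if s.credit_validity_days <= 0:
        alerts.append(("WARNING", "Credit Validity Period"))
    if s.advance_pay_balance_usd < 100:
        alerts.append(("CRITICAL", "Advance Pay Balance"))
    if s.usage_amount_usd > 100:
        alerts.append(("WARNING", "Usage Amount"))
    return alerts


if __name__ == "__main__":
    snapshot = BillingSnapshot(20, 5.0, 12, 250.0, 120.0)
    for severity, target in organization_alerts(snapshot):
        print(f"{severity}: {target}")
```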

Project alerts

Project alerts focus on the operational aspects of your clusters, including notifications on the CU usage, QPS thresholds, latency issues, and request anomalies, ensuring you maintain optimal cluster performance.

For each project alert target, the trigger condition consists of a comparison operator, a threshold value, and a duration. The operator is one of the following: >, >=, <, <=, =. The threshold is a numeric value for metrics such as query latency, query QPS, search QPS, CU Capacity, and CU Computation. The duration specifies how long the condition must hold before the alert is triggered, and can be set from 1 minute to 30 minutes.
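
To make the interplay of operator, threshold, and duration concrete, here is a minimal Python sketch. It assumes one metric sample per minute and treats the alert as triggered when the most recent samples all satisfy the condition; how Zilliz Cloud actually samples and evaluates metrics is not stated here, and `alert_triggered` is a hypothetical name.

```python
import operator
from typing import Callable, Dict, Sequence

# The five operators a trigger condition can use.
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def alert_triggered(samples: Sequence[float], op: str, threshold: float,
                    duration_minutes: int) -> bool:
    """Return True if the last `duration_minutes` samples all meet the condition.

    Assumption: `samples` holds one value per minute, oldest first.
    """
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    if not 1 <= duration_minutes <= 30:
        raise ValueError("duration must be between 1 and 30 minutes")
    if len(samples) < duration_minutes:
        return False  # not enough history to cover the whole duration
    compare = OPERATORS[op]
    return all(compare(value, threshold) for value in samples[-duration_minutes:])


if __name__ == "__main__":
    cu_computation = [60, 75, 80, 91]  # hypothetical per-minute readings in %
    print(alert_triggered(cu_computation, ">", 70, 3))
```

In this example the last three readings (75, 80, 91) are all above 70, so the condition holds for the full 3-minute duration.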

Default alert targets

Zilliz Cloud predefines common alert targets to ensure that critical issues are quickly identified and addressed with the appropriate actions.

For more information about recommended actions, refer to Cluster metrics.

Alert Target	Unit	Default Trigger Condition
CU Computation	%	WARNING: Trigger alerts at >70% utilized computational power for 10+ minutes. CRITICAL: Trigger alerts at >90% utilized computational power for 10+ minutes.
CU Capacity	%	WARNING: Trigger alerts at >70% utilized CU capacity for 10+ minutes. CRITICAL: Trigger alerts at >90% utilized CU capacity for 10+ minutes.
Search (QPS)	QPS	Trigger WARNING alerts at >50 search operations per second for 10+ minutes.
Query (QPS)	QPS	Trigger WARNING alerts at >50 query operations per second for 10+ minutes.
Search Latency (P99)	ms	Trigger WARNING alerts at P99 latency >1,000ms for 10+ minutes.
Query Latency (P99)	ms	Trigger WARNING alerts at P99 latency >1,000ms for 10+ minutes.
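
The defaults in the table above can also be written down as data. The Python sketch below is illustrative only: the `DEFAULT_ALERTS` dictionary and `default_severity` function are hypothetical, and "for 10+ minutes" is interpreted as the last ten one-minute samples all exceeding the threshold.

```python
from typing import Dict, List, Optional, Tuple

# (severity, threshold) pairs per alert target, most severe first.
# Every default condition uses the ">" operator.
DEFAULT_ALERTS: Dict[str, List[Tuple[str, float]]] = {
    "CU Computation": [("CRITICAL", 90), ("WARNING", 70)],        # %
    "CU Capacity": [("CRITICAL", 90), ("WARNING", 70)],           # %
    "Search (QPS)": [("WARNING", 50)],
    "Query (QPS)": [("WARNING", 50)],
    "Search Latency (P99)": [("WARNING", 1000)],                  # ms
    "Query Latency (P99)": [("WARNING", 1000)],                   # ms
}
DEFAULT_DURATION_MINUTES = 10


def default_severity(target: str, per_minute_values: List[float]) -> Optional[str]:
    """Return the highest severity whose threshold was exceeded for the full duration."""
    window = per_minute_values[-DEFAULT_DURATION_MINUTES:]
    if len(window) < DEFAULT_DURATION_MINUTES:
        return None  # assumption: fewer than 10 samples cannot trigger an alert
    for severity, threshold in DEFAULT_ALERTS[target]:
        if all(value > threshold for value in window):
            return severity
    return None


if __name__ == "__main__":
    print(default_severity("CU Computation", [95] * 10))
    print(default_severity("Search (QPS)", [40] * 10))
```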

Custom alert targets

In addition to the predefined default project alerts, you can configure custom alert targets as needed.

Alert Target	Description
Resource
Storage	Monitor storage usage and send notifications if the usage exceeds a threshold for a certain duration.
Performance (read/write)
Bulk Insert (QPS)	Monitor the rate of bulk insert operations and send notifications if the rate exceeds a threshold for a certain duration.
Delete (QPS)	Monitor the rate of delete operations and send notifications if the rate exceeds a threshold for a certain duration.
Insert (QPS)	Monitor the rate of insert operations and send notifications if the rate exceeds a threshold for a certain duration.
Insert (VPS)	Monitor the rate of vector insert operations and send notifications if the rate exceeds a threshold for a certain duration.
Search (VPS)	Monitor the rate of vector search operations and send notifications if the rate exceeds a threshold for a certain duration.
Upsert (QPS)	Monitor the rate of upsert operations and send notifications if the rate exceeds a threshold for a certain duration.
Upsert (VPS)	Monitor the rate of vector upsert operations and send notifications if the rate exceeds a threshold for a certain duration.
Writes to Cluster Are Disabled	Monitor the write operations to the cluster to ensure they are not prohibited. Please scale out immediately if write prohibition has been triggered.
Performance (latency)
Delete Latency (Average)	Monitor the average latency for delete requests and send notifications if the latency exceeds a threshold for a certain duration.
Delete Latency (P99)	Monitor the P99 latency for delete requests and send notifications if the latency exceeds a threshold for a certain duration.
Insert Latency (Average)	Monitor the average latency for insert requests and send notifications if the latency exceeds a threshold for a certain duration.
Insert Latency (P99)	Monitor the P99 latency for insert requests and send notifications if the latency exceeds a threshold for a certain duration.
Query Latency (Average)	Monitor the average latency for query requests and send notifications if the latency exceeds a threshold for a certain duration.
Search Request Latency (Average)	Monitor the average latency for search requests and send notifications if the latency exceeds a threshold for a certain duration.
Upsert Latency (Average)	Monitor the average latency for upsert requests and send notifications if the latency exceeds a threshold for a certain duration.
Upsert Latency (P99)	Monitor the P99 latency for upsert requests and send notifications if the latency exceeds a threshold for a certain duration.
Performance (request failure rate)
Bulk Insert Failure Rate	Monitor the failure rate of bulk insert requests and send notifications if the rate exceeds a threshold for a certain duration.
Delete Failure Rate	Monitor the failure rate of delete requests and send notifications if the rate exceeds a threshold for a certain duration.
Insert Failure Rate	Monitor the failure rate of insert requests and send notifications if the rate exceeds a threshold for a certain duration.
Query Failure Rate	Monitor the failure rate of query requests and send notifications if the rate exceeds a threshold for a certain duration.
Search Failure Rate	Monitor the failure rate of search requests and send notifications if the rate exceeds a threshold for a certain duration.
Slow Query Count	Monitor the number of slow queries and send notifications if the value exceeds a threshold for a certain duration. By default, requests with a latency of 5 seconds or longer are considered slow queries. This alert target is available only for BYOC clusters or clusters of the Enterprise edition.
Upsert Failure Rate	Monitor the failure rate of upsert requests and send notifications if the rate exceeds a threshold for a certain duration.
Data
Loaded Entities	Monitor the number of loaded entities and send notifications if the count exceeds a threshold for a certain duration.
Total Collections	Monitor the number of total collections and send notifications if the count exceeds a threshold for a certain duration.
Total Entities	Monitor the number of total entities and send notifications if the count exceeds a threshold for a certain duration.
Others
Cluster Is Abnormal	Monitor the status of a cluster to ensure it is functioning properly. This includes checking the cluster load and usage.
