Version: User Guides (Cloud)

Metrics & Alerts Reference

This reference describes the monitoring metrics for Zilliz Cloud clusters and the alert targets that you can set up at the organization and project levels.

Cluster metrics

The Metrics tab in the Zilliz Cloud console presents cluster metrics as graphs.

The following table describes each metric and the action that you are advised to take when your cluster's resource usage exceeds a threshold.

Metric Name | Unit | Description | Recommended Action
Resources
CU Computation | % | A measure of the utilized computational power relative to the total computational capacity of the CU. |
CU Capacity | % | A measure of the used capacity relative to the total capacity of the CU. |
Storage | GB | The total amount of persistent storage consumed by data and indexes. | Configure alerts for monitoring storage usage.
Performance
QPS/VPS (Read) | QPS/VPS | QPS: The number of read requests (search and query) per second. VPS: The number of read requests (search) on vectors per second. VPS is not available for query requests as query operations do not involve vectors. | Refer to benchmark for system performance monitoring.
QPS/VPS (Write) | QPS/VPS | QPS: The number of write requests (insert, bulk insert, upsert, and delete) per second. VPS: The number of write requests (insert, bulk insert, upsert, and delete) on vectors per second. | Refer to benchmark for system performance monitoring.
Latency (Read) | ms | The time elapsed between a client sending a read request (search and query) to a server and the client receiving a response. Selecting Average or P99 from the expanded dropdown menu on the right displays an average or P99 latency. | -
Latency (Write) | ms | The time elapsed between a client sending a write request (insert, upsert, and delete) to a server and the client receiving a response. Selecting Average or P99 from the expanded dropdown menu on the right displays an average or P99 latency. | -
Request Failure Rate (Read) | % | The percentage of timeout read requests (search and query) in all read requests per second. | Configure alerts to monitor read request failure rate.
Request Failure Rate (Write) | % | The percentage of timeout write requests (insert, bulk insert, upsert, and delete) in all write requests per second. | Configure alerts to monitor write request failure rate.
Data
Collection Count | count | The number of collections created in a cluster. | -
Entity Count | count | The number of entities created in a cluster. Selecting a specific collection from the expanded dropdown menu on the right displays the number of entities at the collection level. | -
Loaded Entities | count | The number of entities loaded (actively served) by a cluster. Selecting a specific collection from the expanded dropdown menu on the right displays the number of entities at the collection level. | -
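
To make the Average and P99 latency options and the request failure rate definition more concrete, the following is a minimal Python sketch that computes them from a small sample. The sample values, the function names, and the nearest-rank percentile method are assumptions made for illustration; this reference does not state how Zilliz Cloud computes these values internally.

```python
import math


def average(values):
    """Arithmetic mean of a non-empty list of latencies in ms."""
    return sum(values) / len(values)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def failure_rate(timed_out, total):
    """Percentage of timed-out requests among all requests in a window."""
    if total == 0:
        return 0.0
    return 100.0 * timed_out / total


# Hypothetical read latencies in ms; one slow outlier among fast requests.
latencies_ms = [12, 15, 11, 14, 13, 16, 12, 15, 14, 900]

print(average(latencies_ms))                      # 102.2
print(percentile_nearest_rank(latencies_ms, 99))  # 900
print(failure_rate(3, 200))                       # 1.5
```

In this sample a single slow request dominates the P99 value while the other nine requests stay under 20 ms, which is why the two views can tell different stories about the same traffic.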

Organization alerts

Organization alerts keep you informed about billing-related issues such as expiring credit cards, the status of free credits, balance alerts for advance payments, and notifications regarding usage costs.

Alert Target | Unit | Description | Recommended Action | Default Trigger Condition
Expiration Date of Credit card | Day | Monitor the remaining days until the credit card's expiration to ensure uninterrupted service. | Renew or update credit card information before the expiration date. |
Remaining Credits | $ | Track the balance of free credits, alerting the user when it falls low to prompt a top-up. | Top up credits to maintain account functionality. | Trigger WARNING alerts when the balance of free credits falls below $10.
Credit Validity Period | Day | Monitor the remaining validity period of free credits, alerting the user to encourage usage or extension. | Extend the validity period or use the credits before they expire. | Trigger WARNING alerts when the validity period of free credits reaches 0 days.
Advance Pay Balance | $ | Monitor the advance pay balance, alerting the user when it falls low to prevent service disruption. | Add funds to the advance pay balance to avoid service interruption. | Trigger CRITICAL alerts when the balance falls below $100.
Usage Amount | $ | Track the usage amount, informing the user when it exceeds a set threshold to suggest monitoring and management. | Monitor and manage usage to stay within budget limits. | Trigger WARNING alerts when the amount of usage exceeds $100.
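
The default trigger conditions above are simple threshold comparisons. The sketch below expresses them in Python, assuming a hypothetical `BillingSnapshot` structure and `evaluate_org_alerts` function that are not part of any Zilliz Cloud API. The credit card expiration target is left out because the table lists no default condition for it.

```python
from dataclasses import dataclass


@dataclass
class BillingSnapshot:
    """Hypothetical billing state for one organization."""
    remaining_credits: float    # free credits, in $
    credit_validity_days: int   # days until free credits expire
    advance_pay_balance: float  # in $
    usage_amount: float         # in $


def evaluate_org_alerts(snapshot):
    """Return (severity, alert target) pairs whose default condition is met."""
    alerts = []
    if snapshot.remaining_credits < 10:
        alerts.append(("WARNING", "Remaining Credits"))
    if snapshot.credit_validity_days <= 0:
        alerts.append(("WARNING", "Credit Validity Period"))
    if snapshot.advance_pay_balance < 100:
        alerts.append(("CRITICAL", "Advance Pay Balance"))
    if snapshot.usage_amount > 100:
        alerts.append(("WARNING", "Usage Amount"))
    return alerts


# Example: low free credits and usage above $100.
snapshot = BillingSnapshot(
    remaining_credits=5.0,
    credit_validity_days=30,
    advance_pay_balance=250.0,
    usage_amount=120.0,
)
for severity, target in evaluate_org_alerts(snapshot):
    print(severity, target)
```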

Project alerts

Project alerts focus on the operational aspects of your clusters, including notifications about CU usage, QPS thresholds, latency issues, and request anomalies, so that you can maintain optimal cluster performance.

Default alert targets

Zilliz Cloud predefines common alert targets to ensure that critical issues are quickly identified and addressed with the appropriate actions.

For more information about recommended actions, refer to Cluster metrics.

Alert Target | Unit | Default Trigger Condition
CU Computation | % |
CU Capacity | % |
Search (QPS) | QPS | Trigger WARNING alerts at >50 search operations per second for 10+ minutes.
Query (QPS) | QPS | Trigger WARNING alerts at >50 query operations per second for 10+ minutes.
Search Latency (P99) | ms | Trigger WARNING alerts at P99 latency >1,000ms for 10+ minutes.
Query Latency (P99) | ms | Trigger WARNING alerts at P99 latency >1,000ms for 10+ minutes.
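
Conditions such as ">1,000ms for 10+ minutes" combine a threshold with a minimum duration. The following Python sketch shows one way to evaluate such a condition over time-ordered samples. The `breached_for` function, the one-sample-per-minute interval, and the rule that any dip to or below the threshold resets the timer are assumptions for illustration, not a description of how Zilliz Cloud evaluates alerts.

```python
def breached_for(samples, threshold, min_duration_s):
    """Return True if the value stays above `threshold` for at least
    `min_duration_s` seconds without interruption.

    samples: list of (timestamp_seconds, value) pairs sorted by time.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # start of a new breach window
            if timestamp - breach_start >= min_duration_s:
                return True
        else:
            breach_start = None  # a dip resets the window
    return False


# Hypothetical P99 search latency, sampled once per minute for 11 samples.
sustained = [(minute * 60, 1200) for minute in range(11)]
with_dip = [(minute * 60, 800 if minute == 5 else 1200) for minute in range(11)]

print(breached_for(sustained, 1000, 600))  # True
print(breached_for(with_dip, 1000, 600))   # False
```

Requiring a sustained breach rather than a single sample keeps short spikes from raising alerts.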

Custom alert targets

In addition to the predefined default alert targets, you can configure custom alert targets as needed.

Alert Target | Description
Resource
Storage | Monitor storage usage and send notifications if the usage exceeds a threshold for a certain duration.
Performance (read/write)
Bulk Insert (QPS) | Monitor the rate of bulk insert operations and send notifications if the rate exceeds a threshold for a certain duration.
Delete (QPS) | Monitor the rate of delete operations and send notifications if the rate exceeds a threshold for a certain duration.
Insert (QPS) | Monitor the rate of insert operations and send notifications if the rate exceeds a threshold for a certain duration.
Insert (VPS) | Monitor the rate of vector insert operations and send notifications if the rate exceeds a threshold for a certain duration.
Search (VPS) | Monitor the rate of vector search operations and send notifications if the rate exceeds a threshold for a certain duration.
Upsert (QPS) | Monitor the rate of upsert operations and send notifications if the rate exceeds a threshold for a certain duration.
Upsert (VPS) | Monitor the rate of vector upsert operations and send notifications if the rate exceeds a threshold for a certain duration.
Performance (latency)
Delete Latency (Average) | Monitor the average latency for delete requests and send notifications if the latency exceeds a threshold for a certain duration.
Delete Latency (P99) | Monitor the P99 latency for delete requests and send notifications if the latency exceeds a threshold for a certain duration.
Insert Latency (Average) | Monitor the average latency for insert requests and send notifications if the latency exceeds a threshold for a certain duration.
Insert Latency (P99) | Monitor the P99 latency for insert requests and send notifications if the latency exceeds a threshold for a certain duration.
Query Latency (Average) | Monitor the average latency for query requests and send notifications if the latency exceeds a threshold for a certain duration.
Search Request Latency (Average) | Monitor the average latency for search requests and send notifications if the latency exceeds a threshold for a certain duration.
Upsert Latency (Average) | Monitor the average latency for upsert requests and send notifications if the latency exceeds a threshold for a certain duration.
Upsert Latency (P99) | Monitor the P99 latency for upsert requests and send notifications if the latency exceeds a threshold for a certain duration.
Performance (request failure rate)
Bulk Insert Failure Rate | Monitor the failure rate of bulk insert requests and send notifications if the rate exceeds a threshold for a certain duration.
Delete Failure Rate | Monitor the failure rate of delete requests and send notifications if the rate exceeds a threshold for a certain duration.
Insert Failure Rate | Monitor the failure rate of insert requests and send notifications if the rate exceeds a threshold for a certain duration.
Query Failure Rate | Monitor the failure rate of query requests and send notifications if the rate exceeds a threshold for a certain duration.
Search Failure Rate | Monitor the failure rate of search requests and send notifications if the rate exceeds a threshold for a certain duration.
Upsert Failure Rate | Monitor the failure rate of upsert requests and send notifications if the rate exceeds a threshold for a certain duration.
Data
Loaded Entities | Monitor the number of loaded entities and send notifications if the count exceeds a threshold for a certain duration.
Total Collections | Monitor the number of total collections and send notifications if the count exceeds a threshold for a certain duration.
Total Entities | Monitor the number of total entities and send notifications if the count exceeds a threshold for a certain duration.
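
Every custom alert target follows the same pattern: a target, a threshold, and a duration. As a rough sketch of how you might record such rules in your own tooling, the Python below models one rule and validates the target name against a few names taken from the table. The `CustomAlertRule` class, its field names, and the example threshold are hypothetical and do not correspond to any Zilliz Cloud API.

```python
from dataclasses import dataclass

# A subset of the target names from the table above, grouped by category.
CUSTOM_TARGETS = {
    "Resource": ["Storage"],
    "Performance (read/write)": ["Insert (QPS)", "Search (VPS)", "Upsert (QPS)"],
    "Performance (latency)": ["Insert Latency (P99)", "Query Latency (Average)"],
    "Performance (request failure rate)": ["Search Failure Rate"],
    "Data": ["Loaded Entities", "Total Collections", "Total Entities"],
}


@dataclass(frozen=True)
class CustomAlertRule:
    """Hypothetical record of one custom alert rule."""
    target: str
    threshold: float
    duration_minutes: int
    severity: str = "WARNING"

    def __post_init__(self):
        known = {name for group in CUSTOM_TARGETS.values() for name in group}
        if self.target not in known:
            raise ValueError(f"unknown alert target: {self.target}")
        if self.duration_minutes <= 0:
            raise ValueError("duration_minutes must be positive")


# Example: notify when P99 insert latency stays above 500 ms for 5 minutes.
rule = CustomAlertRule("Insert Latency (P99)", threshold=500, duration_minutes=5)
print(rule)
```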