Metrics Reference
Zilliz Cloud categorizes metrics into two levels, Organization and Project:
- Organization-level metrics: Reflect account-wide status (e.g., license credits, usage) across all projects.
- Project-level metrics: Reflect cluster resources, capacity, performance, and data within a single project.
Most metrics support alerts. An alert evaluates a metric against a condition (operator + threshold) over a time window and notifies you when it’s met. For configuration, refer to Manage Organization Alerts and Manage Project Alerts.
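As a rough illustration of how an alert condition (operator + threshold) can be checked against a time window, the sketch below evaluates a list of metric samples locally. The function name, the operator table, and the rule that every sample in the window must satisfy the condition are assumptions made for this example; they are not Zilliz Cloud APIs, and the actual evaluation logic may differ.

```python
import operator
from typing import Callable, Dict, Sequence

# Hypothetical operator table for illustration; not Zilliz Cloud settings.
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def alert_fires(samples: Sequence[float], op: str, threshold: float) -> bool:
    """Return True when every sample in the window satisfies the condition.

    Assumption: the condition must hold for the whole window. An empty
    window never fires.
    """
    if not samples:
        return False
    compare = OPERATORS[op]
    return all(compare(value, threshold) for value in samples)


# Storage (GB) sampled over a hypothetical window, checked against "> 80".
window = [81.0, 84.5, 90.2]
print(alert_fires(window, ">", 80.0))  # True
```

Requiring the whole window to satisfy the condition avoids firing on a single spike; a looser rule (for example, any sample or the window average) would be an equally plausible assumption.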
Organization-level metrics
Organization-level metrics help you track billing-related issues across all projects in an organization.
Metric | Unit | Description | Recommended action |
|---|---|---|---|
Usage Amount | $ | Cumulative usage charges over a period. | Monitor vs. budget; optimize usage or adjust budget as needed. |
Credit Validity | day | Days left before free credits expire. | Use or extend credits before expiry. |
Remaining Credits | $ | Balance of free credits. | Top up when low to maintain account functionality. |
Credit Card Validity | day | Days until the saved card expires. | Update or replace card before expiry to avoid payment failures. |
Advance Pay Balance | $ | Remaining pre-paid funds. | Add funds when low to prevent service interruption. |
Project-level metrics (cluster metrics)
These metrics describe resource usage and performance within a project’s clusters.
In this section, Availability refers to the cluster’s plan tier. All means the metric is available across all current cluster plan tiers. For detailed plan tiers, refer to Detailed Plan Comparison.
Resources
Metric | Unit | Description | Recommended action | Availability |
|---|---|---|---|---|
Read vCUs | count | A measure of vCU consumption of search and query operations. Note: Alerts are not supported for this metric. | Monitor trends to understand read cost/throughput. | Free / Serverless |
Write vCUs | count | A measure of vCU consumption of insert, delete, and upsert operations. Note: Alerts are not supported for this metric. | Monitor trends to understand write cost/throughput. | Free / Serverless |
Query CU Computation | % | A measure of the utilized computational power relative to the total computational capacity of the CU. | - | Dedicated / BYOC |
Query CU Capacity | % | A measure of the used capacity relative to the total capacity of the CU. | - | Dedicated / BYOC |
Total Query CU | count | The total number of query CUs in the current cluster, calculated as the number of query CUs multiplied by the number of replicas. (For example, if your cluster has 2 query CUs and 2 replicas, the Total Query CU displayed here is 4.) | Track to identify query-CU scaling events. | Dedicated / BYOC |
Replica | count | The number of cluster replicas. | Track to identify replica scaling events. | Dedicated / BYOC |
Storage | GB | The total amount of persistent storage consumed by data and indexes. | Configure alerts for monitoring storage usage. | All |
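The Total Query CU calculation described in the table above can be sketched as a one-line product. The function name and the input validation are illustrative assumptions, not part of any Zilliz Cloud SDK.

```python
def total_query_cu(query_cus: int, replicas: int) -> int:
    """Total Query CU = number of query CUs x number of replicas."""
    if query_cus < 1 or replicas < 1:
        raise ValueError("query_cus and replicas must both be at least 1")
    return query_cus * replicas


# The example from the table: 2 query CUs and 2 replicas.
print(total_query_cu(2, 2))  # 4
```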
Performance
Metric | Unit | Description | Recommended action | Availability |
|---|---|---|---|---|
QPS (Read) | - | The number of read requests (search and query) per second. | Refer to benchmark for system performance monitoring. | All |
QPS (Write) | - | The number of write requests (insert, bulk insert, upsert, and delete) per second. | Refer to benchmark for system performance monitoring. | All |
Search NQ per Second | - | The total number of query vectors (NQ) carried by search requests per second. | Refer to benchmark for system performance monitoring. | All |
Write Throughput (Entities/sec) | - | Measures the number of entities written per second across all write operations (insert, upsert, bulk insert, and delete). | Refer to benchmark for system performance monitoring. | All |
Latency (Read) | ms | The time elapsed between a client sending a read request (search and query request) to a server and the client receiving a response. It includes an average latency and a P99 latency. | - | All |
Latency (Write) | ms | The time elapsed between a client sending a write request (insert and upsert request) to a server and the client receiving a response. It includes an average latency and a P99 latency. | - | All |
Request Failure Rate (Read) | % | The percentage of failed read requests among all requests per second. | Configure alerts for monitoring read request failure rate. | All |
Request Failure Rate (Write) | % | The percentage of failed write requests among all requests per second. | Configure alerts for monitoring write request failure rate. | All |
Slow Query Count | counts/min | The number of queries that take an unusually long time to execute. | Identify problematic queries and tune performance by adjusting cluster configuration as necessary. | Dedicated (Enterprise) / BYOC |
Cluster Write Performance Capacity | % | The current rate of write operations divided by the write rate limit. | When the value exceeds 80%, lower the rate of your write operations (insert and upsert). | Dedicated (Enterprise) / BYOC |
Number of Flush Operations | counts/min | The number of flush operations on a cluster. | Avoid flushing too frequently, because it can degrade the overall performance of the cluster. For more information, refer to Zilliz Cloud Limits. | Dedicated (Enterprise) / BYOC |
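To make the definitions in the table above concrete, the sketch below computes three of the values locally from raw numbers: Cluster Write Performance Capacity (current write rate divided by the write rate limit), a request failure rate, and a P99 latency. All function names are hypothetical, the failure-rate denominator is assumed to be the total request count in the same interval, and the nearest-rank method used for P99 is an assumption; Zilliz Cloud may aggregate these values differently.

```python
from typing import Sequence


def write_capacity_percent(current_rate: float, rate_limit: float) -> float:
    """Current write rate as a percentage of the write rate limit."""
    if rate_limit <= 0:
        raise ValueError("rate_limit must be positive")
    return 100.0 * current_rate / rate_limit


def failure_rate_percent(failed: int, total: int) -> float:
    """Failed requests as a percentage of total requests (0 when idle)."""
    if total == 0:
        return 0.0
    return 100.0 * failed / total


def p99(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of the given latency samples."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.99 * n), which avoids floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


capacity = write_capacity_percent(current_rate=850, rate_limit=1000)
if capacity > 80:
    # Matches the recommended action: lower the write rate above 80%.
    print(f"Write capacity at {capacity:.1f}%, consider lowering the write rate")
```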
Data
Metric | Unit | Description | Recommended action | Availability |
|---|---|---|---|---|
Collection Count | count | The number of collections created in a cluster. | Monitor growth; enforce per-project limits if needed. | All |
Entity Count | count | The total number of entities inserted into the cluster, including both single inserts and bulk inserts. | Investigate unexpected growth; plan storage and indexing. | All |
Loaded Entities (Approx.) | count | The approximate number of entities loaded (actively served). | For a more accurate and real-time value, please refer to the 'Loaded Entities' value on the collection overview page or use count(*). | Dedicated / BYOC |
Number of Unloaded Collections | count | The number of unloaded collections in a cluster. | Load critical collections; review memory headroom. | Dedicated (Enterprise) / BYOC |
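The count(*) approach mentioned for Loaded Entities (Approx.) might look like the sketch below. It assumes a client object that exposes a query method in the style of pymilvus MilvusClient.query and returns a list of rows such as [{"count(*)": 123}]; the helper name and the collection name are hypothetical, and the call should be checked against the SDK version you use.

```python
from typing import Any


def exact_entity_count(client: Any, collection_name: str) -> int:
    """Return the entity count reported by a count(*) query.

    Assumption: `client` behaves like a pymilvus MilvusClient, whose query
    call accepts collection_name, filter, and output_fields.
    """
    rows = client.query(
        collection_name=collection_name,
        filter="",
        output_fields=["count(*)"],
    )
    return int(rows[0]["count(*)"])


# Usage with a real client (requires pymilvus and valid credentials):
#   from pymilvus import MilvusClient
#   client = MilvusClient(uri="<cluster endpoint>", token="<API key>")
#   print(exact_entity_count(client, "my_collection"))
```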