Metrics Reference
Zilliz Cloud categorizes metrics into two levels, organization and project:
- Organization-level metrics: Reflect account-wide status (e.g., license credits, usage) across all projects.
- Project-level metrics: Reflect cluster resources, capacity, performance, and data within a single project.
Most metrics support alerts. An alert evaluates a metric against a condition (operator + threshold) over a time window and notifies you when it’s met. For configuration, refer to Manage Organization Alerts and Manage Project Alerts.
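As a rough illustration of how an alert condition (operator + threshold over a time window) can be evaluated, here is a minimal Python sketch. The `AlertRule` class, the operator strings, and the "every sample in the window must meet the condition" rule are assumptions made for illustration; they are not Zilliz Cloud's actual implementation or API.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical operator names; the console may label them differently.
OPERATORS: dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass
class AlertRule:
    metric: str       # metric name, e.g. "Storage"
    op: str           # one of the keys in OPERATORS
    threshold: float  # value the metric is compared against
    window: int       # number of most recent samples to evaluate


def is_triggered(rule: AlertRule, samples: Sequence[float]) -> bool:
    """Return True if every sample in the trailing window meets the condition.

    Assumption: the condition must hold for the whole window. A real
    alerting system might aggregate the window (e.g. average) instead.
    """
    recent = samples[-rule.window:]
    if len(recent) < rule.window:
        return False  # not enough data to cover the window
    compare = OPERATORS[rule.op]
    return all(compare(value, rule.threshold) for value in recent)


rule = AlertRule(metric="Storage", op=">", threshold=80.0, window=3)
print(is_triggered(rule, [70.0, 85.0, 90.0, 95.0]))  # True
print(is_triggered(rule, [85.0, 90.0, 70.0, 95.0]))  # False
```

The second call returns False because one sample inside the three-sample window (70.0) does not exceed the threshold.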
Organization-level metrics
Organization-level metrics help you track billing-related issues across all projects in an organization.
Metric | Unit | Description | Recommended action |
|---|---|---|---|
Usage Amount | $ | Cumulative usage charges over a period. | Monitor vs. budget; optimize usage or adjust budget as needed. |
Credit Validity | day | Days left before free credits expire. | Use or extend credits before expiry. |
Remaining Credits | $ | Balance of free credits. | Top up when low to maintain account functionality. |
Credit Card Validity | day | Days until the saved card expires. | Update or replace card before expiry to avoid payment failures. |
Advance Pay Balance | $ | Remaining pre-paid funds. | Add funds when low to prevent service interruption. |
Project-level metrics (cluster metrics)
These metrics describe resource usage and performance within a project’s clusters.
In this section, the Availability column lists the project plans and deployment options for which each metric is available. For a detailed plan comparison, refer to Detailed Plan Comparison.
Resources
Metric | Unit | Description | Recommended action | Availability |
|---|---|---|---|---|
Read vCUs | count | A measure of vCU consumption of search and query operations. Note: Alerts are not supported for this metric. | Monitor trends to understand read cost/throughput. | Free / Serverless |
Write vCUs | count | A measure of vCU consumption of insert, delete, and upsert operations. Note: Alerts are not supported for this metric. | Monitor trends to understand write cost/throughput. | Free / Serverless |
Query CU Computation | % | A measure of the utilized computational power relative to the total computational capacity of the CU. | - | Dedicated / BYOC |
Query CU Capacity | % | A measure of the used capacity relative to the total capacity of the CU. | - | Dedicated / BYOC |
Total Query CU | count | The total number of query CUs in the current cluster, calculated as the number of query CUs multiplied by the number of replicas. For example, if your cluster has 2 query CUs and 2 replicas, the Total Query CU displayed here is 4. | Track to identify query-CU scaling events. | Dedicated / BYOC |
Replica | count | The number of cluster replicas. | Track to identify replica scaling events. | Dedicated / BYOC |
Storage | GB | The total amount of persistent storage consumed by data and indexes. | Configure alerts for monitoring storage usage. | All |
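The arithmetic behind Total Query CU and the percentage-based resource metrics above can be sketched as follows. The function names are hypothetical, and the utilization helper only restates the "used relative to total" definition from the table; it does not reflect how Zilliz Cloud measures the underlying values.

```python
def total_query_cu(query_cus: int, replicas: int) -> int:
    """Total Query CU = number of query CUs x number of replicas."""
    if query_cus < 1 or replicas < 1:
        raise ValueError("query_cus and replicas must be positive")
    return query_cus * replicas


def utilization_percent(used: float, total: float) -> float:
    """Used amount relative to the total, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


# The example from the table: 2 query CUs and 2 replicas.
print(total_query_cu(2, 2))           # 4
# Illustrative numbers only: 6 units used out of 8 available.
print(utilization_percent(6.0, 8.0))  # 75.0
```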
Performance
Metric | Unit | Description | Recommended action | Availability |
|---|---|---|---|---|
QPS (Read) | - | The number of read requests (search and query) per second. | Refer to benchmark for system performance monitoring. | All |
QPS (Write) | - | The number of write requests (insert, bulk insert, upsert, and delete) per second. | Refer to benchmark for system performance monitoring. | All |
Search NQ per Second | - | The total number of query vectors (NQ) carried by search requests per second. | Refer to benchmark for system performance monitoring. | All |
Write Throughput (Entities/sec) | - | Measures the number of entities written per second across all write operations (insert, upsert, bulk insert, and delete). | Refer to benchmark for system performance monitoring. | All |
Latency (Read) | ms | The time elapsed between a client sending a read request (search or query) to the server and the client receiving a response. Reported as both an average latency and a P99 latency. | - | All |
Latency (Write) | ms | The time elapsed between a client sending a write request (insert or upsert) to the server and the client receiving a response. Reported as both an average latency and a P99 latency. | - | All |
Request Failure Rate (Read) | % | The percentage of failed read requests among all requests per second. | Configure alerts for monitoring read request failure rate. | All |
Request Failure Rate (Write) | % | The percentage of failed write requests among all requests per second. | Configure alerts for monitoring write request failure rate. | All |
Slow Query Count | counts/min | The number of queries that take an unusually long time to execute. | Identify problematic queries and tune performance by adjusting cluster configuration as necessary. | Dedicated (Enterprise or Business Critical) / BYOC |
Cluster Write Performance Capacity | % | The current rate of write operations as a percentage of the write rate limit (current write rate / write rate limit). | When this value exceeds 80%, reduce the rate of your write operations (insert and upsert). | Dedicated (Enterprise or Business Critical) / BYOC |
Number of Flush Operations | counts/min | The number of flush operations on a cluster. | Avoid flushing too frequently, as it can negatively impact the overall performance of the cluster. For more information, refer to Zilliz Cloud Limits. | Dedicated (Enterprise or Business Critical) / BYOC |
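Two of the performance metrics above involve simple calculations that can be sketched in Python: Cluster Write Performance Capacity (current write rate divided by the write rate limit, with the 80% guideline) and P99 latency. The function names and sample numbers are hypothetical, and the nearest-rank percentile method is an assumption; the source does not state how Zilliz Cloud computes P99.

```python
from typing import Sequence


def write_capacity_percent(current_rate: float, rate_limit: float) -> float:
    """Current write rate as a percentage of the write rate limit."""
    if rate_limit <= 0:
        raise ValueError("rate_limit must be positive")
    return 100.0 * current_rate / rate_limit


def should_reduce_writes(current_rate: float, rate_limit: float,
                         threshold_percent: float = 80.0) -> bool:
    """Apply the 80% guideline from the table (threshold is adjustable)."""
    return write_capacity_percent(current_rate, rate_limit) > threshold_percent


def p99(latencies_ms: Sequence[float]) -> float:
    """99th percentile using the nearest-rank method (an assumed method)."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.99 * n, avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Illustrative numbers only: 4500 writes/sec against a limit of 5000.
print(write_capacity_percent(4500, 5000))  # 90.0
print(should_reduce_writes(4500, 5000))    # True

# 100 latency samples of 1.0 ms through 100.0 ms.
print(p99([float(i) for i in range(1, 101)]))  # 99.0
```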
Data
Metric | Unit | Description | Recommended action | Availability |
|---|---|---|---|---|
Collection Count | count | The number of collections created in a cluster. | Monitor growth; enforce per-project limits if needed. | All |
Entity Count | count | The total number of entities inserted into the cluster, including both single inserts and bulk inserts. | Investigate unexpected growth; plan storage and indexing. | All |
Loaded Entities (Approx.) | count | The approximate number of entities loaded (actively served). | For a more accurate, real-time value, check the 'Loaded Entities' value on the collection overview page or use count(*). | Dedicated / BYOC |
Number of Unloaded Collections | count | The number of unloaded collections in a cluster. | Load critical collections; review memory headroom. | Dedicated (Enterprise or Business Critical) / BYOC |