Integrate with Prometheus
Prometheus is a monitoring system that collects metrics from configured targets at specified intervals, evaluates rule expressions, displays the results, and can trigger alerts based on specific conditions.
By integrating Zilliz Cloud with Prometheus, you can collect and monitor metrics related to your Zilliz Cloud deployment.
This feature is available only to Dedicated clusters in an Enterprise project.
Configure Prometheus to scrape Zilliz Cloud metrics
To monitor Zilliz Cloud clusters with Prometheus, follow these steps:
Open the prometheus.yml configuration file on your Prometheus server. For more information, refer to Configuration in the Prometheus documentation.
Add the following snippet to the scrape_configs section of the prometheus.yml file. Replace the placeholders with the appropriate values:
- {{apiKey}}: Your Zilliz Cloud API key for accessing cluster metrics.
- {{clusterId}}: The ID of the Zilliz Cloud cluster you wish to monitor.
scrape_configs:
  - job_name: {{clusterId}}
    scheme: https
    metrics_path: /v2/clusters/{{clusterId}}/metrics/export
    scrape_interval: 60s
    scrape_timeout: 30s
    authorization:
      type: Bearer
      credentials: {{apiKey}}
    static_configs:
      - targets: ["api.cloud.zilliz.com"]
The cluster must contain no more than 10,000 collections. Clusters exceeding this limit may experience incomplete or degraded metrics export.
| Parameter | Description |
|---|---|
| job_name | Human-readable label assigned to scraped metrics. |
| scheme | The protocol scheme used to scrape metrics from the Zilliz Cloud endpoints, which is set to https. |
| metrics_path | The path on the target service that provides the metric data. |
| scrape_interval | How frequently to scrape the target. The minimum supported value is |
| type | The authentication type used to access the Zilliz Cloud metrics. Set the value to Bearer. |
| credentials | The API key used for authorization to access the Zilliz Cloud metrics endpoints. |
| targets | The static target that Prometheus will scrape, which should be api.cloud.zilliz.com. |
Save the changes to the prometheus.yml file, then restart or reload Prometheus so that the new scrape job takes effect.
For more details, refer to the official Prometheus documentation.
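Before pointing Prometheus at the endpoint, it can help to confirm that the API key and cluster ID are accepted. The following is a minimal Python sketch, not an official client. It builds the same HTTPS request that the scrape configuration above describes: the host, metrics path, and Bearer authorization come from that snippet, while the function names and the default timeout are illustrative assumptions.

```python
import urllib.request

# Host taken from the static_configs target in the scrape configuration above.
METRICS_HOST = "api.cloud.zilliz.com"


def build_metrics_request(cluster_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that a Prometheus scrape of the cluster would send."""
    url = f"https://{METRICS_HOST}/v2/clusters/{cluster_id}/metrics/export"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_metrics(cluster_id: str, api_key: str, timeout: float = 30.0) -> str:
    """Return the raw metrics text.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, for example
    when the API key is rejected.
    """
    request = build_metrics_request(cluster_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metrics with your own cluster ID and API key and printing the first few lines of the result should show text in the Prometheus exposition format if the credentials are valid.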
Example scraped metrics
The following are example Prometheus metrics scraped from the Zilliz Cloud /metrics/export endpoint. Per-collection metrics include collection_name and db_name labels, while cluster-only metrics remain unchanged.
# HELP zilliz_entities Total number of entities stored
# TYPE zilliz_entities gauge
zilliz_entities{cluster_id="in01-xxx", collection_name="prod_embedding", db_name="default"} 5000000
zilliz_entities{cluster_id="in01-xxx", collection_name="user_profile", db_name="default"} 120000
# HELP zilliz_loaded_entities Number of entities loaded in memory
# TYPE zilliz_loaded_entities gauge
zilliz_loaded_entities{cluster_id="in01-xxx", collection_name="prod_embedding", db_name="default"} 3000000
zilliz_loaded_entities{cluster_id="in01-xxx", collection_name="user_profile", db_name="default"} 200000
# HELP zilliz_requests_total Total number of requests processed
# TYPE zilliz_requests_total counter
zilliz_requests_total{cluster_id="in01-xxx", request_type="search", status="success", collection_name="prod_embedding", db_name="default"} 30000
zilliz_requests_total{cluster_id="in01-xxx", request_type="search", status="success", collection_name="user_profile", db_name="default"} 12850
# HELP zilliz_request_duration_seconds_bucket Latency distribution of requests
# TYPE zilliz_request_duration_seconds_bucket histogram
zilliz_request_duration_seconds_bucket{cluster_id="in01-xxx", request_type="search", le="0.1", collection_name="prod_embedding", db_name="default"} 28000
zilliz_request_duration_seconds_bucket{cluster_id="in01-xxx", request_type="search", le="0.1", collection_name="user_profile", db_name="default"} 10000
# HELP zilliz_request_vectors_total Total number of vectors in requests
# TYPE zilliz_request_vectors_total counter
zilliz_request_vectors_total{cluster_id="in01-xxx", request_type="search", collection_name="prod_embedding", db_name="default"} 50000
zilliz_request_vectors_total{cluster_id="in01-xxx", request_type="insert", collection_name="prod_embedding", db_name="default"} 10000
# --- Cluster-only metrics ---
# HELP zilliz_cluster_capacity Cluster capacity ratio
# TYPE zilliz_cluster_capacity gauge
zilliz_cluster_capacity 0.88
# HELP zilliz_cluster_computation Cluster computation ratio
# TYPE zilliz_cluster_computation gauge
zilliz_cluster_computation 0.1
# HELP zilliz_storage_bytes Cluster storage usage
# TYPE zilliz_storage_bytes gauge
zilliz_storage_bytes 8.9342782E7
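To inspect output like the above outside Prometheus, the text can be parsed with a few lines of Python. This is a simplified sketch written for the sample shown here, not a complete parser for the exposition format: it assumes no timestamps and no escaped quotes inside label values, and the names Sample, parse_samples, and sum_by_label are hypothetical helpers.

```python
import re
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    name: str
    labels: Dict[str, str]
    value: float


# A sample line is: metric name, optional {label="value", ...}, then the value.
_SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_samples(text: str) -> List[Sample]:
    """Parse sample lines, skipping blank lines and # HELP / # TYPE comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a line this simplified parser understands.
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append(Sample(name, labels, float(raw_value)))
    return samples


def sum_by_label(samples: List[Sample], metric: str, label: str) -> Dict[str, float]:
    """Sum the values of one metric, grouped by the value of one label."""
    totals: Dict[str, float] = {}
    for sample in samples:
        if sample.name == metric and label in sample.labels:
            key = sample.labels[label]
            totals[key] = totals.get(key, 0.0) + sample.value
    return totals
```

For example, sum_by_label(parse_samples(text), "zilliz_entities", "collection_name") groups the entity counts by collection, which mirrors how per-collection metrics form one time series per collection.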
Zilliz Cloud metric labels
The metrics exposed by Zilliz Cloud are labeled with the following identifiers.
| Label Name | Description | Values |
|---|---|---|
| cluster_id | The ID of the Zilliz Cloud cluster that the metrics are from. | - |
|  | The ID of the organization that owns the Zilliz Cloud cluster. | - |
|  | The ID of the project within the organization that the cluster belongs to. | - |
| collection_name | The name of the collection. Present on all per-collection metrics, including request metrics. | - |
| db_name | The name of the database the collection belongs to. Present on all per-collection metrics alongside collection_name. | Defaults to default. |
| request_type | The type of operation performed on the data. |  |
| status | The outcome of the data operation. |  |
Available metrics
The following table lists the metrics available for Zilliz Cloud, along with their types, descriptions, and associated labels. Per-collection metrics are returned with collection_name and db_name labels, producing separate time series for each collection. Cluster-only metrics are returned as a single series per cluster.
| Metric Name | Type | Description | Labels |
|---|---|---|---|
| zilliz_cluster_computation | Gauge | The current computation capacity utilization. |  |
| zilliz_cluster_capacity | Gauge | The current storage capacity utilization. |  |
| zilliz_storage_bytes | Gauge | The total storage space used. |  |
|  | Gauge | The current write throughput. |  |
| zilliz_requests_total | Counter | The total number of requests processed. |  |
| zilliz_request_vectors_total | Counter | The total number of vectors manipulated across all requests. |  |
| zilliz_request_duration_seconds_bucket | Histogram | The latency distribution of requests processed. |  |
| zilliz_slow_queries_total | Counter | The number of queries exceeding the latency threshold. |  |
| zilliz_entities | Gauge | The total number of entities stored. |  |
| zilliz_loaded_entities | Gauge | The number of entities currently loaded in memory. |  |
|  | Gauge | The number of entities that have been indexed. |  |
|  | Gauge | The total number of collections. |  |
|  | Gauge | The number of unloaded collections. |  |
Example Prometheus queries
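Here are some example queries you can use to analyze Zilliz Cloud metrics with Prometheus. The queries use $__rate_interval, which is a Grafana variable; when querying Prometheus directly, replace it with a fixed range such as 5m.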
- Calculate insert QPS:
  rate(zilliz_requests_total{cluster_id='in01-xxxxx',request_type='insert'}[$__rate_interval])
- Calculate insert VPS (vectors per second):
  rate(zilliz_request_vectors_total{cluster_id='in01-xxxxx',request_type='insert'}[$__rate_interval])
- Calculate 70th percentile insert latency:
  histogram_quantile(0.70, sum(rate(zilliz_request_duration_seconds_bucket{cluster_id='in01-xxxxx',request_type='insert'}[$__rate_interval])) by (le))
- Calculate insert request fail rate:
  sum(rate(zilliz_requests_total{cluster_id='in01-xxxxx',request_type='insert',status!='success'}[$__rate_interval])) / sum(rate(zilliz_requests_total{cluster_id='in01-xxxxx',request_type='insert'}[$__rate_interval]))
- Calculate the number of slow queries per minute:
  sum(increase(zilliz_slow_queries_total{cluster_id='in01-xxxxx'}[1m]))
- Calculate the number of slow queries per 5 minutes:
  sum(increase(zilliz_slow_queries_total{cluster_id='in01-xxxxx'}[5m]))
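To run one of these queries outside Grafana, the expression can be sent to the Prometheus HTTP API at /api/v1/query. The Python sketch below is one possible way to do that: it substitutes a fixed window for $__rate_interval, which Prometheus itself does not resolve, and builds the request URL. The Prometheus address, the 5m window, and the function names are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# The insert QPS query from the list above, with the Grafana variable left in place.
INSERT_QPS = (
    "rate(zilliz_requests_total"
    "{cluster_id='in01-xxxxx',request_type='insert'}[$__rate_interval])"
)


def resolve_interval(expr: str, window: str = "5m") -> str:
    """Replace the Grafana-only $__rate_interval variable with a fixed range."""
    return expr.replace("$__rate_interval", window)


def build_query_url(base_url: str, expr: str, window: str = "5m") -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": resolve_interval(expr, window)})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def run_query(base_url: str, expr: str, window: str = "5m", timeout: float = 10.0):
    """Run the query and return the list under data.result in the JSON response."""
    url = build_query_url(base_url, expr, window)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["data"]["result"]
```

With a Prometheus server assumed to be listening on http://localhost:9090, run_query("http://localhost:9090", INSERT_QPS) would return one result per matching time series. Choose a window that spans several scrape intervals so that rate() has at least two samples to work with.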