Image Data
The Zilliz Cloud web UI provides a simple, intuitive way to create, run, and manage Pipelines, while the RESTful API offers more flexibility and customization.
This guide walks you through the necessary steps to create image pipelines, conduct a reverse image search on your embedded image data, and delete the pipeline if it is no longer needed.
Prerequisites and limitations
- Ensure you have created a cluster deployed in us-west1 on Google Cloud Platform (GCP).
- In one project, you can only create up to 100 pipelines of the same type. For more information, refer to Zilliz Cloud Limits.
Ingest image data
To ingest any data, you need to first create an ingestion pipeline and then run it.
Create image ingestion pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click on Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Ingestion Pipeline column.
- Configure the Ingestion pipeline you wish to create.
  | Parameters | Description |
  |---|---|
  | Target Cluster | The cluster where a new collection will be automatically created with this Ingestion pipeline. Currently, this can only be a cluster deployed on GCP us-west1. |
  | Collection Name | The name of the auto-created collection. |
  | Pipeline Name | The name of the new Ingestion pipeline. It should only contain lowercase letters, numbers, and underscores. |
  | Description (Optional) | The description of the new Ingestion pipeline. |
- Add an INDEX function to the Ingestion pipeline by clicking + Function. For each Ingestion pipeline, you can add exactly one INDEX function.
  - Enter function name.
  - Select INDEX_IMAGE as the function type. An INDEX_IMAGE function generates vector embeddings for the images at the provided URLs.
  - Choose the embedding model used to generate vector embeddings. Currently, two models are available: zilliz/vit-base-patch16-224 and zilliz/clip-vit-base-patch32. The following table briefly introduces each embedding model.
    | Embedding Model | Description |
    |---|---|
    | zilliz/vit-base-patch16-224 | The Vision Transformer (ViT) is a transformer encoder model (BERT-like) open-sourced by Google. The model is pretrained on a large collection of images to embed the semantics of image content into a vector space. The model is hosted on Zilliz Cloud to minimize latency. |
    | zilliz/clip-vit-base-patch32 | A multi-modal model released by OpenAI. This vision model and its paired text model embed images and texts into the same vector space, enabling semantic search between visual and textual information. The model is hosted on Zilliz Cloud to minimize latency. |
  - Click Add to save your function.
- (Optional) Add a PRESERVE function if you need to preserve metadata for your images. A PRESERVE function adds additional scalar fields to the collection along with data ingestion.
  📘 Notes
  For each Ingestion pipeline, you can add up to 50 PRESERVE functions.
  - Click + Function.
  - Enter function name.
  - Configure the input field name and type. Supported input field types include Bool, Int8, Int16, Int32, Int64, Float, Double, and VarChar.
    📘 Notes
    Currently, the output field name must be identical to the input field name. The input field name defines the field name used when running the Ingestion pipeline. The output field name defines the field name in the vector collection schema where the preserved value is kept.
    For VarChar fields, the value should be a string with a maximum length of 4,000 alphanumeric characters.
    When storing date-time in scalar fields, it is recommended to use the Int16 data type for year data, and Int32 for timestamps.
  - Click Add to save your function.
- Click Create Ingestion Pipeline.
- Continue creating a Search pipeline and a Deletion pipeline that are auto-configured to be compatible with the just-created Ingestion pipeline.
The following example creates an Ingestion pipeline named my_image_ingestion_pipeline with an INDEX_IMAGE function and a PRESERVE function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"name": "my_image_ingestion_pipeline",
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"projectId": "proj-xxxx",
"collectionName": "my_collection",
"description": "A pipeline that converts an image into vector embeddings and store in efficient index for search.",
"type": "INGESTION",
"functions": [
{
"name": "index_my_image",
"action": "INDEX_IMAGE",
"embedding": "zilliz/vit-base-patch16-224"
},
{
"name": "keep_image_tag",
"action": "PRESERVE",
"inputField": "image_title",
"outputField": "image_title",
"fieldType": "VarChar"
}
]
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- collectionName: The name of the collection automatically created along with the Ingestion pipeline. Alternatively, you can also specify an existing collection.
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. An Ingestion pipeline can have only one INDEX function and up to 50 PRESERVE functions.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
  - action: The type of the function to add. Currently, available options include INDEX_DOC, INDEX_TEXT, INDEX_IMAGE, and PRESERVE.
  - embedding: The embedding model used to generate vector embeddings for your image. Available options are as follows. (This parameter is only used in the INDEX function.)
    | Embedding Model | Description |
    |---|---|
    | zilliz/vit-base-patch16-224 | The Vision Transformer (ViT) is a transformer encoder model (BERT-like) open-sourced by Google. The model is pretrained on a large collection of images to embed the semantics of image content into a vector space. The model is hosted on Zilliz Cloud to minimize latency. |
    | zilliz/clip-vit-base-patch32 | A multi-modal model released by OpenAI. This vision model and its paired text model embed images and texts into the same vector space, enabling semantic search between visual and textual information. The model is hosted on Zilliz Cloud to minimize latency. |
  - inputField: The name of the input field. You can customize the value, but it must be identical to the outputField. (This parameter is only used in the PRESERVE function.)
  - outputField: The name of the output field, which will be used in the collection schema. Currently, the output field name must be identical to the input field name. (This parameter is only used in the PRESERVE function.)
  - fieldType: The data type of the input and output fields. Possible values include Bool, Int8, Int16, Int32, Int64, Float, Double, and VarChar. (This parameter is only used in the PRESERVE function.)
    📘 Notes
    When storing date-time in scalar fields, it is recommended to use the Int16 data type for year data, and Int32 for timestamps.
    For the VarChar field type, the max_length of the data in this field cannot exceed 4,000.
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxxx",
"name": "my_image_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187300000,
"clusterId": "in03-***************",
"collectionName": "my_collection",
"description": "A pipeline that converts an image into vector embeddings and store in efficient index for search.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"action": "INDEX_IMAGE",
"name": "index_my_image",
"inputFields": ["image_url", "image_id"],
"embedding": "zilliz/vit-base-patch16-224"
},
{
"action": "PRESERVE",
"name": "keep_image_tag",
"inputField": "image_title",
"outputField": "image_title",
"fieldType": "VarChar"
}
]
}
}
The total usage data may be delayed by a few hours due to technical limitations.
When the Ingestion pipeline is created, a collection named my_collection is automatically created.
This collection contains three fields: two output fields of the INDEX_IMAGE function, and one output field for each PRESERVE function. The collection schema is as follows.
| image_id (Data type: Int64) | embedding (Data type: FLOAT_VECTOR) | image_title (Data type: VarChar) |
|---|---|---|
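The request body for creating an Ingestion pipeline can also be assembled in code before it is sent. The following is a minimal Python sketch, not part of the Zilliz Cloud API: the helper build_ingestion_payload, the generated function names (index_my_image, keep_<field>), and the regular expression are assumptions that encode the naming and function-count rules described above.

```python
import json
import re

# Assumption: this pattern encodes the documented rule of 3-64 characters,
# limited to alphanumeric characters and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,64}")
MAX_PRESERVE_FUNCTIONS = 50


def build_ingestion_payload(name, project_id, cluster_id, collection_name,
                            embedding, preserve_fields=None, description=""):
    """Build the JSON body for creating an image Ingestion pipeline.

    preserve_fields maps a field name to its data type,
    for example {"image_title": "VarChar"}.
    """
    preserve_fields = preserve_fields or {}
    if len(preserve_fields) > MAX_PRESERVE_FUNCTIONS:
        raise ValueError("an Ingestion pipeline accepts up to 50 PRESERVE functions")

    # Exactly one INDEX function, followed by the PRESERVE functions.
    functions = [{
        "name": "index_my_image",  # hypothetical function name
        "action": "INDEX_IMAGE",
        "embedding": embedding,
    }]
    for field_name, field_type in preserve_fields.items():
        functions.append({
            "name": f"keep_{field_name}",  # hypothetical naming scheme
            "action": "PRESERVE",
            # The output field name must currently match the input field name.
            "inputField": field_name,
            "outputField": field_name,
            "fieldType": field_type,
        })

    for candidate in [name] + [func["name"] for func in functions]:
        if not NAME_PATTERN.fullmatch(candidate):
            raise ValueError(f"invalid pipeline or function name: {candidate!r}")

    return {
        "name": name,
        "clusterId": cluster_id,
        "projectId": project_id,
        "collectionName": collection_name,
        "description": description,
        "type": "INGESTION",
        "functions": functions,
    }


payload = build_ingestion_payload(
    "my_image_ingestion_pipeline", "proj-xxxx", "inxx-xxxxxxxxxxxxxxx",
    "my_collection", "zilliz/vit-base-patch16-224",
    preserve_fields={"image_title": "VarChar"},
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed to curl with -d. Validating names locally only catches obvious mistakes early; the service remains the authority on what it accepts.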
Run image ingestion pipeline
- Cloud Console
- Bash
- Click the "▶︎" button next to your Ingestion pipeline. Alternatively, you can also click on the Playground tab.
- Enter the image ID and URL in the image_id and image_url fields. If you have added a PRESERVE function, enter the value in the defined preserved field as well. Click Run.
- Check the results.
- Enter another image ID and URL to run the pipeline again.
The following example runs the Ingestion pipeline my_image_ingestion_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"image_id": "my-img-123456",
"image_url": "xxx",
"image_title": "A cute yellow cat"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- image_id: The ID of the image stored on an object storage.
- image_url: The URL of the image stored on an object storage. You should use a URL that is either not encoded or encoded in UTF-8. Ensure that the URL remains valid for at least one hour.
- image_title: The metadata field that needs to be preserved.
Below is an example response.
{
"code": 200,
"data": {
"num_entities": 1,
"usage": {
"embedding": 1
}
}
}
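The same run request can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it; the helper names build_run_request and ingested_entities are hypothetical, and treating any "code" other than 200 as a failure is an assumption based on the example response above.

```python
import json
import urllib.request


def build_run_request(cloud_region, pipeline_id, api_key, data, params=None):
    """Prepare (but do not send) a POST request that runs a pipeline."""
    url = (f"https://controller.api.{cloud_region}.zillizcloud.com"
           f"/v1/pipelines/{pipeline_id}/run")
    body = {"data": data}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def ingested_entities(response):
    """Return the number of ingested entities from a parsed run response."""
    # Assumption: a non-200 "code" signals a failed run.
    if response.get("code") != 200:
        raise RuntimeError(f"pipeline run failed: {response}")
    return response["data"]["num_entities"]


request = build_run_request(
    "gcp-us-west1", "pipe-xxxx", "YOUR_API_KEY",
    {
        "image_id": "my-img-123456",
        "image_url": "xxx",
        "image_title": "A cute yellow cat",
    },
)
# To send it: urllib.request.urlopen(request), then json.load() the reply.
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before any embedding usage is incurred.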
Search image data
To search any data, you need to first create a search pipeline and then run it. Unlike Ingestion and Deletion pipelines, when creating a Search pipeline, the cluster and collection are defined at the function level instead of the pipeline level. This is because Zilliz Cloud allows you to search from multiple collections at a time.
There are two ways to search image data: conduct a reverse image search or search image by text.
Conduct a reverse image search
Create image search pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click on Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Search Pipeline column.
- Configure the Search pipeline you wish to create.
  | Parameters | Description |
  |---|---|
  | Pipeline Name | The name of the new Search pipeline. It should only contain lowercase letters, numbers, and underscores. |
  | Description (Optional) | The description of the new Search pipeline. |
- Add a function to the Search pipeline by clicking + Function. You can add exactly one function.
  - Enter function name.
  - Choose Target Cluster and Target Collection. The Target Cluster must be a cluster deployed in us-west1 on Google Cloud Platform (GCP), and the Target Collection must be created by an Ingestion pipeline; otherwise, the Search pipeline will not be compatible.
  - Select SEARCH_IMAGE_BY_IMAGE as the Function Type. A SEARCH_IMAGE_BY_IMAGE function converts the query image to a vector embedding and retrieves the top-K most similar images.
  - Click Add to save your function.
- Click Create Search Pipeline.
The following example creates a Search pipeline named my_image_search_pipeline with a SEARCH_IMAGE_BY_IMAGE function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"projectId": "proj-xxxx",
"name": "my_image_search_pipeline",
"description": "A pipeline that searches image by image.",
"type": "SEARCH",
"functions": [
{
"name": "search_image_by_image",
"action": "SEARCH_IMAGE_BY_IMAGE",
"embedding": "zilliz/vit-base-patch16-224",
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"collectionName": "my_collection"
}
]
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. A Search pipeline can only have one function.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
  - action: The type of the function to add. Currently, available options include SEARCH_DOC_CHUNK, SEARCH_TEXT, SEARCH_IMAGE_BY_IMAGE, and SEARCH_IMAGE_BY_TEXT.
  - clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
  - collectionName: The name of the collection in which you want to create a pipeline.
  - embedding: The embedding model used during vector search. The model should be consistent with the one chosen in the compatible collection.
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxxx",
"name": "my_image_search_pipeline",
"type": "SEARCH",
"createTimestamp": 1721187300000,
"description": "A pipeline that searches image by image.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions":
{
"action": "SEARCH_IMAGE_BY_IMAGE",
"name": "search_image_by_image",
"inputFields": ["query_image_url"],
"clusterId": "in03-***************",
"collectionName": "my_collection",
"embedding": "zilliz/vit-base-patch16-224"
}
}
}
The total usage data may be delayed by a few hours due to technical limitations.
Run image search pipeline
- Cloud Console
- Bash
- Click the "▶︎" button next to your Search pipeline. Alternatively, you can also click on the Playground tab.
- Input the query image URL. Click Run.
- Check the results.
- Enter a new query image URL to rerun the pipeline.
The following example runs the Search pipeline named my_image_search_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"query_image_url": "xxx"
},
"params":{
"limit": 1,
"offset": 0,
"outputFields": ["image_id", "image_title"],
"filter": "id >= 0"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- query_image_url: The URL of the query image used to conduct a similarity search.
- params: The search parameters to configure.
  - limit: The maximum number of entities to return. The value should be an integer ranging from 1 to 500. The sum of this value and that of offset should be less than 1024.
  - offset: The number of entities to skip in the search results. The sum of this value and that of limit should not be greater than 1024. The maximum value is 1024.
  - outputFields: An array of fields to return along with the search results. Note that id (entity ID) and distance are returned in the search result by default. If you need other output fields in the returned result, you can configure this parameter.
  - filter: The filter, written as a boolean expression, used to find matches for the search.
Below is an example response.
{
"code": 200,
"data": {
"result": [
{
"id": "my-img-123456",
"distance": 0.40448662638664246,
"image_id": "my-img-123456",
"image_title": "A cute yellow cat"
}
],
"usage": {
"embedding": 1
}
}
}
The usage data may be delayed by a few hours due to technical limitations.
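The constraints on limit and offset are easy to violate when paging through results, so it can help to check them before sending a request. The sketch below is an assumption-laden helper, not part of the API: build_search_params is a hypothetical name, and because the parameter descriptions bound the sum of limit and offset slightly differently, it takes the stricter reading (the sum must be less than 1024).

```python
def build_search_params(limit, offset=0, output_fields=None, filter_expr=None,
                        max_limit=500):
    """Validate paging values and return the "params" object for a search run.

    max_limit defaults to 500, the upper bound documented for this search;
    pass a different value if another bound applies to your pipeline.
    """
    if not 1 <= limit <= max_limit:
        raise ValueError(f"limit must be between 1 and {max_limit}")
    if offset < 0:
        raise ValueError("offset must not be negative")
    # Assumption: the stricter of the two documented bounds is enforced.
    if limit + offset >= 1024:
        raise ValueError("limit + offset must be less than 1024")

    params = {"limit": limit, "offset": offset}
    if output_fields:
        params["outputFields"] = list(output_fields)
    if filter_expr:
        params["filter"] = filter_expr
    return params


# Second page of ten results, returning the preserved title with each hit.
page_two = build_search_params(10, offset=10, output_fields=["image_id", "image_title"])
```

The returned dictionary goes under the "params" key of the run request body, next to "data".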
Search image by text
Create image search pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click on Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Search Pipeline column.
- Configure the Search pipeline you wish to create.
  | Parameters | Description |
  |---|---|
  | Pipeline Name | The name of the new Search pipeline. It should only contain lowercase letters, numbers, and underscores. |
  | Description (Optional) | The description of the new Search pipeline. |
- Add a function to the Search pipeline by clicking + Function. You can add exactly one function.
  - Enter function name.
  - Choose Target Cluster and Target Collection. The Target Cluster must be a cluster deployed in us-west1 on Google Cloud Platform (GCP), and the Target Collection must be created by an Ingestion pipeline; otherwise, the Search pipeline will not be compatible.
    📘 Notes
    The SEARCH_IMAGE_BY_TEXT function is only available when there is a compatible image ingestion pipeline using the multimodal image model service zilliz/clip-vit-base-patch32.
  - Select SEARCH_IMAGE_BY_TEXT as the Function Type. A SEARCH_IMAGE_BY_TEXT function converts the query text to a vector embedding and retrieves the top-K most similar images.
    If you choose the SEARCH_IMAGE_BY_TEXT function, the multimodal text embedding service zilliz/clip-vit-base-patch32-multilingual-v1 will be used by default to match the corresponding ingestion pipeline and target collection.
  - Click Add to save your function.
- Click Create Search Pipeline.
The following example creates a Search pipeline named my_image_search_pipeline with a SEARCH_IMAGE_BY_TEXT function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"projectId": "proj-xxxx",
"name": "my_image_search_pipeline",
"description": "A pipeline that searches image by text.",
"type": "SEARCH",
"functions": [
{
"name": "search_image_by_text",
"action": "SEARCH_IMAGE_BY_TEXT",
"embedding": "zilliz/clip-vit-base-patch32-multilingual-v1",
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"collectionName": "my_collection"
}
]
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. A Search pipeline can only have one function.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
  - action: The type of the function to add. Currently, available options include SEARCH_DOC_CHUNK, SEARCH_TEXT, SEARCH_IMAGE_BY_IMAGE, and SEARCH_IMAGE_BY_TEXT.
  - clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
  - collectionName: The name of the collection in which you want to create a pipeline.
  - embedding: The embedding model used during vector search. Here, you should use the embedding model zilliz/clip-vit-base-patch32-multilingual-v1. This model is a multilingual variant of OpenAI's CLIP-ViT-B32 model. It is designed to work together with the zilliz/clip-vit-base-patch32 vision model and can process text in more than 50 languages.
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxxx",
"name": "my_image_search_pipeline",
"type": "SEARCH",
"createTimestamp": 1721187300000,
"description": "A pipeline that searches image by text.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions":
{
"action": "SEARCH_IMAGE_BY_TEXT",
"name": "search_image_by_text",
"inputFields": ["query_text"],
"clusterId": "in03-***************",
"collectionName": "my_collection",
"embedding": "zilliz/clip-vit-base-patch32-multilingual-v1"
}
}
}
The total usage data may be delayed by a few hours due to technical limitations.
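Two details of Search pipelines are easy to get wrong: the cluster and collection belong inside the function rather than at the pipeline level, and the embedding model must match the one used at ingestion. The following Python sketch illustrates both under stated assumptions: the helper names are hypothetical, the function name is simply the lowercased action, and the model pairing table contains only the pairing named in this guide.

```python
# Pairing stated in this guide: the CLIP vision model used at ingestion and
# the multilingual text model used by SEARCH_IMAGE_BY_TEXT.
TEXT_MODEL_FOR_IMAGE_MODEL = {
    "zilliz/clip-vit-base-patch32": "zilliz/clip-vit-base-patch32-multilingual-v1",
}


def search_embedding_for(action, ingestion_embedding):
    """Pick the search-side embedding model for a given ingestion model."""
    if action == "SEARCH_IMAGE_BY_IMAGE":
        # Reverse image search reuses the ingestion model.
        return ingestion_embedding
    if action == "SEARCH_IMAGE_BY_TEXT":
        if ingestion_embedding not in TEXT_MODEL_FOR_IMAGE_MODEL:
            raise ValueError(
                "SEARCH_IMAGE_BY_TEXT needs a collection ingested with "
                "zilliz/clip-vit-base-patch32"
            )
        return TEXT_MODEL_FOR_IMAGE_MODEL[ingestion_embedding]
    raise ValueError(f"unsupported image search action: {action}")


def build_search_payload(name, project_id, cluster_id, collection_name,
                         action, ingestion_embedding, description=""):
    """Build the JSON body for creating an image Search pipeline."""
    return {
        "projectId": project_id,
        "name": name,
        "description": description,
        "type": "SEARCH",
        "functions": [{
            "name": action.lower(),  # hypothetical naming choice
            "action": action,
            "embedding": search_embedding_for(action, ingestion_embedding),
            # For Search pipelines the target lives at the function level.
            "clusterId": cluster_id,
            "collectionName": collection_name,
        }],
    }


text_search = build_search_payload(
    "my_image_search_pipeline", "proj-xxxx", "inxx-xxxxxxxxxxxxxxx",
    "my_collection", "SEARCH_IMAGE_BY_TEXT", "zilliz/clip-vit-base-patch32",
)
```

Failing early on a mismatched model avoids creating a Search pipeline that is incompatible with its target collection.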
Run image search pipeline
- Cloud Console
- Bash
- Click the "▶︎" button next to your Search pipeline. Alternatively, you can also click on the Playground tab.
- Input the query text. Click Run.
- Check the results.
- Enter a new query text to rerun the pipeline.
The following example runs the Search pipeline named my_image_search_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"query_text": "Can you show me the image of a cat?"
},
"params":{
"limit": 1,
"offset": 0,
"outputFields": ["image_id", "image_title"],
"filter": "id >= 0"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- query_text: The query text used to conduct a similarity search.
- params: The search parameters to configure.
  - limit: The maximum number of entities to return. The value should be an integer ranging from 1 to 100. The sum of this value and that of offset should be less than 1024.
  - offset: The number of entities to skip in the search results. The sum of this value and that of limit should not be greater than 1024. The maximum value is 1024.
  - outputFields: An array of fields to return along with the search results. Note that id (entity ID) and distance are returned in the search result by default. If you need other output fields in the returned result, you can configure this parameter.
  - filter: The filter, written as a boolean expression, used to find matches for the search.
Below is an example response.
{
"code": 200,
"data": {
"result": [
{
"id": "my-img-123456",
"distance": 0.40448662638664246,
"image_id": "my-img-123456",
"image_title": "A cute yellow cat"
}
],
"usage": {
"embedding": 1
}
}
}
The usage data may be delayed by a few hours due to technical limitations.
Delete image data
To delete any data, you need to first create a deletion pipeline and then run it.
You must create an Ingestion pipeline first. Upon successful creation of an Ingestion pipeline, you can create a Search pipeline and a Deletion pipeline to work with your newly created Ingestion pipeline.
Create image deletion pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click on Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Deletion Pipeline column.
- Configure the Deletion pipeline you wish to create.
  | Parameters | Description |
  |---|---|
  | Pipeline Name | The name of the new Deletion pipeline. It should only contain lowercase letters, numbers, and underscores. |
  | Description (Optional) | The description of the new Deletion pipeline. |
- Add a function to the Deletion pipeline by clicking + Function. You can add exactly one function.
  - Enter function name.
  - Select either PURGE_IMAGE_INDEX or PURGE_BY_EXPRESSION as the Function Type. A PURGE_IMAGE_INDEX function deletes all images with the specified image_id, while a PURGE_BY_EXPRESSION function deletes all entities matching the specified filter expression.
  - Click Add to save your function.
- Click Create Deletion Pipeline.
The example below creates a Deletion pipeline named my_image_deletion_pipeline with a PURGE_IMAGE_INDEX function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"projectId": "proj-xxxx",
"name": "my_image_deletion_pipeline",
"description": "A pipeline that deletes image by id",
"type": "DELETION",
"functions": [
{
"name": "purge_image_by_id",
"action": "PURGE_IMAGE_INDEX"
}
],
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"collectionName": "my_collection"
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. A Deletion pipeline can only have one function.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric characters and underscores.
  - action: The type of the function to add. Available options include PURGE_DOC_INDEX, PURGE_TEXT_INDEX, PURGE_BY_EXPRESSION, and PURGE_IMAGE_INDEX.
- clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed on GCP us-west1. Learn more about How can I find my CLUSTER_ID?
- collectionName: The name of the collection in which you want to create a pipeline.
Below is an example output.
{
"code": 200,
"data": {
"id": 0,
"name": "my_image_deletion_pipeline",
"type": "DELETION",
"createTimestamp": 1721187655000,
"description": "A pipeline that deletes image by id",
"status": "SERVING",
"functions": [
{
"name": "purge_image_by_id",
"action": "PURGE_IMAGE_INDEX",
"inputFields": ["image_id"]
}
],
"clusterId": "in03-xxxx",
"collectionName": "my_collection"
}
}
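For comparison with Search pipelines, a Deletion pipeline names its target cluster and collection at the pipeline level. The short Python sketch below builds such a request body; build_deletion_payload is a hypothetical helper, and the default function name is taken from the example request above.

```python
# Actions listed for Deletion pipelines in the parameter description.
DELETION_ACTIONS = {
    "PURGE_DOC_INDEX",
    "PURGE_TEXT_INDEX",
    "PURGE_BY_EXPRESSION",
    "PURGE_IMAGE_INDEX",
}


def build_deletion_payload(name, project_id, cluster_id, collection_name,
                           action="PURGE_IMAGE_INDEX",
                           function_name="purge_image_by_id", description=""):
    """Build the JSON body for creating a Deletion pipeline."""
    if action not in DELETION_ACTIONS:
        raise ValueError(f"unsupported deletion action: {action}")
    return {
        "projectId": project_id,
        "name": name,
        "description": description,
        "type": "DELETION",
        # A Deletion pipeline can only have one function.
        "functions": [{"name": function_name, "action": action}],
        # The target is set on the pipeline itself, not on the function.
        "clusterId": cluster_id,
        "collectionName": collection_name,
    }


deletion = build_deletion_payload(
    "my_image_deletion_pipeline", "proj-xxxx",
    "inxx-xxxxxxxxxxxxxxx", "my_collection",
)
```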
Run image deletion pipeline
- Cloud Console
- Bash
- Click the "▶︎" button next to your Deletion pipeline. Alternatively, you can also click on the Playground tab.
- Input the image ID for a PURGE_IMAGE_INDEX function, or the filter expression for a PURGE_BY_EXPRESSION function. Click Run.
- Check the results.
The following example runs the Deletion pipeline named my_image_deletion_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"image_id": "my-img-123456"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- image_id: The ID of the image to delete.
Below is an example response.
{
"code": 200,
"data": {
"num_deleted_entities": 1
}
}
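The body of a deletion run depends on the function the pipeline was created with. The sketch below is a hypothetical helper, not an official client: it assumes the input field names shown in the example outputs in this guide (image_id for PURGE_IMAGE_INDEX and expression for PURGE_BY_EXPRESSION) and that a "code" other than 200 signals failure.

```python
# Assumption: input field names as they appear in this guide's example outputs.
INPUT_FIELD_FOR_ACTION = {
    "PURGE_IMAGE_INDEX": "image_id",
    "PURGE_BY_EXPRESSION": "expression",
}


def build_deletion_run_body(action, value):
    """Return the request body for running a Deletion pipeline."""
    if action not in INPUT_FIELD_FOR_ACTION:
        raise ValueError(f"no known input field for action: {action}")
    return {"data": {INPUT_FIELD_FOR_ACTION[action]: value}}


def deleted_count(response):
    """Return how many entities a deletion run removed."""
    if response.get("code") != 200:
        raise RuntimeError(f"deletion run failed: {response}")
    return response["data"]["num_deleted_entities"]


body = build_deletion_run_body("PURGE_IMAGE_INDEX", "my-img-123456")
```

Checking num_deleted_entities after each run is a cheap way to confirm that the ID or expression actually matched something.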
Manage pipeline
The following operations manage the pipelines created in the steps above.
View pipeline
- Cloud Console
- Bash
Click Pipelines on the left navigation. Choose the Pipelines tab. You will see all the available pipelines.
Click on a specific pipeline to view its detailed information including its basic information, total usage, functions, and related connectors.
The total usage data may be delayed by a few hours due to technical limitations.
You can also check the pipeline activities on the web UI.
You can call the API to list all existing pipelines or view the details of a particular pipeline.
- View all existing pipelines
Follow the example below and specify your projectId. Learn more about how to obtain the project ID.
curl --request GET \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines?projectId=proj-xxxx"
Below is an example output.
{
"code": 200,
"data": [
{
"pipelineId": "pipe-xxxx",
"name": "my_text_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187655000,
"clusterId": "in03-***************",
"collectionName": "my_collection",
"description": "A pipeline that generates text embeddings and stores additional fields.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"action": "INDEX_TEXT",
"name": "index_my_text",
"inputFields": ["text_list"],
"language": "ENGLISH",
"embedding": "zilliz/bge-base-en-v1.5"
},
{
"action": "PRESERVE",
"name": "keep_text_info",
"inputField": "source",
"outputField": "source",
"fieldType": "VarChar"
}
]
},
{
"pipelineId": "pipe-xxxx",
"name": "my_text_search_pipeline",
"type": "SEARCH",
"createTimestamp": 1721187655000,
"description": "A pipeline that receives text and search for semantically similar texts",
"status": "SERVING",
"totalUsage": {
"embedding": 0,
"rerank": 0
},
"functions":
{
"action": "SEARCH_TEXT",
"name": "search_text",
"inputFields": "query_text",
"clusterId": "in03-***************",
"collectionName": "my_collection",
"embedding": "zilliz/bge-base-en-v1.5",
"reranker": "zilliz/bge-reranker-base"
}
},
{
"pipelineId": "pipe-xxxx",
"name": "my_text_deletion_pipeline",
"type": "DELETION",
"createTimestamp": 1721187655000,
"description": "A pipeline that deletes entities by expression",
"status": "SERVING",
"functions":
{
"action": "PURGE_BY_EXPRESSION",
"name": "purge_data_by_expression",
"inputFields": ["expression"]
},
"clusterId": "in03-***************",
"collectionName": "my_collection"
}
]
}
📘 Notes
The total usage data may be delayed by a few hours due to technical limitations.
- View the details of a specific pipeline
Follow the example below to view the details of a pipeline.
curl --request GET \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}"
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxx",
"name": "my_text_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187300000,
"description": "A pipeline that generates text embeddings and stores additional fields.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"name": "index_my_text",
"action": "INDEX_TEXT",
"inputFields": ["text_list"],
"language": "ENGLISH",
"embedding": "zilliz/bge-base-en-v1.5"
},
{
"name": "keep_text_info",
"action": "PRESERVE",
"inputField": "source",
"outputField": "source",
"fieldType": "VarChar"
}
],
"clusterId": "inxx-xxxx",
"collectionName": "my_collection"
}
}
📘 Notes
The total usage data may be delayed by a few hours due to technical limitations.
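The list response can be used to see how close a project is to the limit of 100 pipelines per type mentioned in the prerequisites. The following Python sketch works on an already parsed response; remaining_pipeline_quota is a hypothetical helper, and treating a "code" other than 200 as an error is an assumption.

```python
from collections import Counter

# Per-project limit for pipelines of the same type, from the prerequisites.
PIPELINES_PER_TYPE_LIMIT = 100
PIPELINE_TYPES = ("INGESTION", "SEARCH", "DELETION")


def remaining_pipeline_quota(list_response):
    """Return how many more pipelines of each type the project can hold."""
    if list_response.get("code") != 200:
        raise RuntimeError(f"listing pipelines failed: {list_response}")
    used = Counter(pipeline["type"] for pipeline in list_response["data"])
    return {
        pipeline_type: PIPELINES_PER_TYPE_LIMIT - used[pipeline_type]
        for pipeline_type in PIPELINE_TYPES
    }


# A trimmed-down response with only the fields this helper reads.
sample_response = {
    "code": 200,
    "data": [
        {"pipelineId": "pipe-xxxx", "type": "INGESTION"},
        {"pipelineId": "pipe-xxxx", "type": "SEARCH"},
        {"pipelineId": "pipe-xxxx", "type": "DELETION"},
    ],
}
quota = remaining_pipeline_quota(sample_response)
```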
Delete pipeline
If you no longer need a pipeline, you can drop it.
Dropped pipelines cannot be recovered. Proceed with caution.
Dropping an Ingestion pipeline does not remove the collection created along with the pipeline. Your data is safe.
- Cloud Console
- Bash
To drop a pipeline on the web UI, click the ... button under the Actions column. Then click Drop.
Follow the example below to drop a pipeline.
curl --request DELETE \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}"
The following is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxx",
"name": "my_text_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187300000,
"description": "A pipeline that generates text embeddings and stores additional fields.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"name": "index_my_text",
"action": "INDEX_TEXT",
"inputFields": ["text_list"],
"language": "ENGLISH",
"embedding": "zilliz/bge-base-en-v1.5"
},
{
"name": "keep_text_info",
"action": "PRESERVE",
"inputField": "source",
"outputField": "source",
"fieldType": "VarChar"
}
],
"clusterId": "inxx-xxxx",
"collectionName": "my_collection"
}
}
The total usage data may be delayed by a few hours due to technical limitations.