Image Data
The Zilliz Cloud web UI provides a simplified and intuitive way to create, run, and manage pipelines, while the RESTful API offers more flexibility and customization than the web UI.
This guide walks you through the necessary steps to create image pipelines, conduct a reverse image search on your embedded image data, and delete the pipeline if it is no longer needed.
Prerequisites and limitations
- Ensure you have created a cluster deployed in us-west1 on Google Cloud Platform (GCP).
- In one project, you can only create up to 100 pipelines of the same type. For more information, refer to Zilliz Cloud Limits.
Ingest image data
To ingest any data, you need to first create an ingestion pipeline and then run it.
Create image ingestion pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Ingestion Pipeline column.
- Configure the Ingestion pipeline you wish to create.

| Parameters | Description |
|---|---|
| Target Cluster | The cluster where a new collection will be automatically created with this Ingestion pipeline. Currently, this can only be a cluster deployed in us-west1 on GCP. |
| Collection Name | The name of the auto-created collection. |
| Pipeline Name | The name of the new Ingestion pipeline. It should contain only lowercase letters, numbers, and underscores. |
| Description (Optional) | The description of the new Ingestion pipeline. |
- Add an INDEX function to the Ingestion pipeline by clicking + Function. For each Ingestion pipeline, you can add exactly one INDEX function.
  - Enter the function name.
  - Select INDEX_IMAGE as the function type. An INDEX_IMAGE function generates vector embeddings for the images at the provided URLs.
  - Choose the embedding model used to generate vector embeddings. Currently, two models are available: zilliz/vit-base-patch16-224 and zilliz/clip-vit-base-patch32. The following table briefly introduces each embedding model.

    | Embedding Model | Description |
    |---|---|
    | zilliz/vit-base-patch16-224 | The Vision Transformer (ViT) is a transformer encoder model (BERT-like) open-sourced by Google. The model is pretrained on a large collection of images to embed the semantics of image content into a vector space. The model is hosted on Zilliz Cloud to provide the best latency. |
    | zilliz/clip-vit-base-patch32 | A multi-modal model released by OpenAI. This vision model and its pairing text model can embed images and texts into the same vector space, enabling semantic search between visual and textual information. The model is hosted on Zilliz Cloud to provide the best latency. |
  - Click Add to save your function.
- (Optional) Continue to add a PRESERVE function if you need to preserve metadata for your images. A PRESERVE function adds additional scalar fields to the collection along with data ingestion.

  Notes: For each Ingestion pipeline, you can add up to 50 PRESERVE functions.

  - Click + Function.
  - Enter the function name.
  - Configure the input field name and type. Supported input field types include Bool, Int8, Int16, Int32, Int64, Float, Double, and VarChar.

    Notes: Currently, the output field name must be identical to the input field name. The input field name defines the field name used when running the Ingestion pipeline. The output field name defines the field name in the vector collection schema where the preserved value is kept.
    For VarChar fields, the value should be a string with a maximum length of 4,000 alphanumeric characters.
    When storing date-time in scalar fields, it is recommended to use the Int16 data type for year data and Int32 for timestamps.
  - Click Add to save your function.
- Click Create Ingestion Pipeline.
- Continue to create a Search pipeline and a Deletion pipeline that are auto-configured to be compatible with the just-created Ingestion pipeline.
The following example creates an Ingestion pipeline named my_image_ingestion_pipeline
with an INDEX_IMAGE function and a PRESERVE function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"name": "my_image_ingestion_pipeline",
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"projectId": "proj-xxxx",
"collectionName": "my_collection",
"description": "A pipeline that converts an image into vector embeddings and store in efficient index for search.",
"type": "INGESTION",
"functions": [
{
"name": "index_my_image",
"action": "INDEX_IMAGE",
"embedding": "zilliz/vit-base-patch16-224"
},
{
"name": "keep_image_tag",
"action": "PRESERVE",
"inputField": "image_title",
"outputField": "image_title",
"fieldType": "VarChar"
}
]
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- collectionName: The name of the collection automatically created with the Ingestion pipeline. Alternatively, you can also specify an existing collection.
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. An Ingestion pipeline can have only one INDEX function and up to 50 PRESERVE functions.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
  - action: The type of the function to add. Currently, available options include INDEX_DOC, INDEX_TEXT, INDEX_IMAGE, and PRESERVE.
  - embedding: The embedding model used to generate vector embeddings for your images. (This parameter is only used in the INDEX function.) Available options are as follows.

    | Embedding Model | Description |
    |---|---|
    | zilliz/vit-base-patch16-224 | The Vision Transformer (ViT) is a transformer encoder model (BERT-like) open-sourced by Google. The model is pretrained on a large collection of images to embed the semantics of image content into a vector space. The model is hosted on Zilliz Cloud to provide the best latency. |
    | zilliz/clip-vit-base-patch32 | A multi-modal model released by OpenAI. This vision model and its pairing text model can embed images and texts into the same vector space, enabling semantic search between visual and textual information. The model is hosted on Zilliz Cloud to provide the best latency. |
  - inputField: The name of the input field. You can customize the value, but it should be identical to the outputField. (This parameter is only used in the PRESERVE function.)
  - outputField: The name of the output field, which will be used in the collection schema. Currently, the output field name must be identical to the input field name. (This parameter is only used in the PRESERVE function.)
  - fieldType: The data type of the input and output fields. Possible values include Bool, Int8, Int16, Int32, Int64, Float, Double, and VarChar. (This parameter is only used in the PRESERVE function.)

    Notes: When storing date-time in scalar fields, it is recommended to use the Int16 data type for year data and Int32 for timestamps.
    For the VarChar field type, the max_length of the data in this field cannot exceed 4,000.
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxxx",
"name": "my_image_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187300000,
"clusterId": "in03-***************",
"collectionName": "my_collection"
"description": "A pipeline that converts an image into vector embeddings and store in efficient index for search.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"action": "INDEX_IMAGE",
"name": "index_my_image",
"inputFields": ["image_url", "image_id"],
"embedding": "zilliz/vit-base-patch16-224"
},
{
"action": "PRESERVE",
"name": "keep_image_tag",
"inputField": "image_title",
"outputField": "image_title",
"fieldType": "VarChar"
}
]
}
}
The total usage data may be delayed by a few hours due to technical limitations.
When the Ingestion pipeline is created, a collection named my_collection is automatically created. This collection contains three fields: two output fields of the INDEX_IMAGE function, and one output field for each PRESERVE function. The collection schema is as follows.

| image_id (Data type: Int64) | embedding (Data type: FLOAT_VECTOR) | image_title (Data type: VarChar) |
|---|---|---|
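The run examples below reference the pipeline by its ID. If you script against the API, one convenient pattern is to capture data.pipelineId from the creation response and reuse it in later calls. The following is a minimal sketch of that pattern, assuming the jq JSON processor is installed; it reuses the creation request shown above with a trimmed-down function list.

# Create the pipeline and keep the returned pipeline ID for the run calls below (requires jq).
PIPELINE_ID=$(curl --silent --request POST \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${YOUR_API_KEY}" \
  --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
  -d '{
    "name": "my_image_ingestion_pipeline",
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "projectId": "proj-xxxx",
    "collectionName": "my_collection",
    "type": "INGESTION",
    "functions": [
      {
        "name": "index_my_image",
        "action": "INDEX_IMAGE",
        "embedding": "zilliz/vit-base-patch16-224"
      }
    ]
  }' | jq -r '.data.pipelineId')

# Export the ID so the ${YOUR_PIPELINE_ID} placeholder used later resolves to it.
export YOUR_PIPELINE_ID="${PIPELINE_ID}"
echo "Created ingestion pipeline: ${YOUR_PIPELINE_ID}"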
Run image ingestion pipeline
- Cloud Console
- Bash
- Click the "▶️" button next to your Ingestion pipeline. Alternatively, you can also click the Playground tab.
- Input the image ID and URL in the image_id and image_url fields. If you have added a PRESERVE function, enter the value in the defined preserved field as well. Click Run.
- Check the results.
- Input another image to run the pipeline again.
The following example runs the Ingestion pipeline my_image_ingestion_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"image_id": "my-img-123456",
"image_url": "xxx",
"image_title": "A cute yellow cat"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- image_id: The ID of the image stored in an object storage.
- image_url: The URL of the image stored in an object storage. You should use a URL that is either not encoded or encoded in UTF-8. Ensure that the URL remains valid for at least one hour.
- image_title: The metadata field that needs to be preserved.
Below is an example response.
{
"code": 200,
"data": {
"num_entities": 1,
"usage": {
"embedding": 1
}
}
}
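If you need to ingest many images, you can call the same run endpoint once per image, for example by looping over a local list. The sketch below is one possible pattern; the images.txt file and its tab-separated layout are assumptions made for illustration.

# images.txt is a hypothetical tab-separated file with one image per line: <image_id> <image_url> <image_title>
while IFS=$'\t' read -r IMAGE_ID IMAGE_URL IMAGE_TITLE; do
  curl --silent --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer ${YOUR_API_KEY}" \
    --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
    -d "{
      \"data\": {
        \"image_id\": \"${IMAGE_ID}\",
        \"image_url\": \"${IMAGE_URL}\",
        \"image_title\": \"${IMAGE_TITLE}\"
      }
    }"
  echo    # print a newline between responses
done < images.txt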
Search image data
To search any data, you need to first create a search pipeline and then run it. Unlike Ingestion and Deletion pipelines, when creating a Search pipeline, the cluster and collection are defined at the function level instead of the pipeline level. This is because Zilliz Cloud allows you to search from multiple collections at a time.
There are two ways to search image data: conduct a reverse image search or search image by text.
Conduct a reverse image search
Create image search pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Search Pipeline column.
- Configure the Search pipeline you wish to create.

| Parameters | Description |
|---|---|
| Pipeline Name | The name of the new Search pipeline. It should contain only lowercase letters, numbers, and underscores. |
| Description (Optional) | The description of the new Search pipeline. |
- Add a function to the Search pipeline by clicking + Function. You can add exactly one function.
  - Enter the function name.
  - Choose the Target Cluster and Target Collection. The Target Cluster must be a cluster deployed in us-west1 on Google Cloud Platform (GCP), and the Target Collection must be created by an Ingestion pipeline; otherwise, the Search pipeline will not be compatible.
  - Select SEARCH_IMAGE_BY_IMAGE as the Function Type. A SEARCH_IMAGE_BY_IMAGE function converts the query image to a vector embedding and retrieves the topK most similar images.
  - Click Add to save your function.
- Click Create Search Pipeline.
The following example creates a Search pipeline named my_image_search_pipeline
with a SEARCH_IMAGE_BY_IMAGE function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"projectId": "proj-xxxx",
"name": "my_image_search_pipeline",
"description": "A pipeline that searches image by image.",
"type": "SEARCH",
"functions": [
{
"name": "search_image_by_image",
"action": "SEARCH_IMAGE_BY_IMAGE",
"embedding": "zilliz/vit-base-patch16-224",
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"collectionName": "my_collection"
}
]
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. A Search pipeline can have only one function.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
  - action: The type of the function to add. Currently, available options include SEARCH_DOC_CHUNK, SEARCH_TEXT, SEARCH_IMAGE_BY_IMAGE, and SEARCH_IMAGE_BY_TEXT.
  - clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
  - collectionName: The name of the collection in which you want to create a pipeline.
  - embedding: The embedding model used during vector search. The model should be consistent with the one chosen in the compatible collection.
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxxx",
"name": "my_image_search_pipeline",
"type": "SEARCH",
"createTimestamp": 1721187300000,
"description": "A pipeline that searches image by image.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions":
{
"action": "SEARCH_IMAGE_BY_IMAGE",
"name": "search_image_by_image",
"inputFields": ["query_image_url"],
"clusterId": "in03-***************",
"collectionName": "my_collection",
"embedding": "zilliz/vit-base-patch16-224"
}
}
}
The total usage data may be delayed by a few hours due to technical limitations.
Run image search pipeline
- Cloud Console
- Bash
- Click the "▶️" button next to your Search pipeline. Alternatively, you can also click the Playground tab.
- Input the query image URL. Click Run.
- Check the results.
- Enter a new query image URL to rerun the pipeline.
The following example runs the Search pipeline named my_image_search_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"query_image_url": "xxx"
},
"params":{
"limit": 1,
"offset": 0,
"outputFields": ["image_id", "image_title"],
"filter": "id >= 0"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- query_image_url: The URL of the query image used to conduct a similarity search.
- params: The search parameters to configure.
  - limit: The maximum number of entities to return. The value should be an integer ranging from 1 to 500. The sum of this value and that of offset should be less than 1024.
  - offset: The number of entities to skip in the search results. The sum of this value and that of limit should not be greater than 1024. The maximum value is 1024.
  - outputFields: An array of fields to return along with the search results. Note that id (entity ID) and distance are returned in the search results by default. If you need other output fields in the returned results, you can configure this parameter.
  - filter: The filter in boolean expression used to find matches for the search.
Below is an example response.
{
"code": 200,
"data": {
"result": [
{
"id": "my-img-123456",
"distance": 0.40448662638664246,
"image_id": "my-img-123456",
"image_title": "A cute yellow cat"
}
],
"usage": {
"embedding": 1
}
}
}
The usage data may be delayed by a few hours due to technical limitations.
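When scripting a reverse image search, you may want to pull specific fields out of the response. The sketch below is one possible approach, assuming jq is installed; it reuses the request above and prints each hit's image_id, image_title, and distance as tab-separated values.

curl --silent --request POST \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${YOUR_API_KEY}" \
  --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
  -d '{
    "data": { "query_image_url": "xxx" },
    "params": {
      "limit": 5,
      "offset": 0,
      "outputFields": ["image_id", "image_title"]
    }
  }' \
  | jq -r '.data.result[] | "\(.image_id)\t\(.image_title)\t\(.distance)"'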
Search image by text
Create image search pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Search Pipeline column.
- Configure the Search pipeline you wish to create.

| Parameters | Description |
|---|---|
| Pipeline Name | The name of the new Search pipeline. It should contain only lowercase letters, numbers, and underscores. |
| Description (Optional) | The description of the new Search pipeline. |

- Add a function to the Search pipeline by clicking + Function. You can add exactly one function.
  - Enter the function name.
  - Choose the Target Cluster and Target Collection. The Target Cluster must be a cluster deployed in us-west1 on Google Cloud Platform (GCP), and the Target Collection must be created by an Ingestion pipeline; otherwise, the Search pipeline will not be compatible.

    Notes: The SEARCH_IMAGE_BY_TEXT function is only available when there is a compatible image Ingestion pipeline using the multimodal image model service zilliz/clip-vit-base-patch32.
  - Select SEARCH_IMAGE_BY_TEXT as the Function Type. A SEARCH_IMAGE_BY_TEXT function converts the query text to a vector embedding and retrieves the topK most similar images.

    If you choose the SEARCH_IMAGE_BY_TEXT function, the multimodal text embedding service zilliz/clip-vit-base-patch32-multilingual-v1 will be used by default to match the corresponding Ingestion pipeline and target collection.
  - Click Add to save your function.
- Click Create Search Pipeline.
The following example creates a Search pipeline named my_image_search_pipeline
with a SEARCH_IMAGE_BY_TEXT function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines" \
-d '{
"projectId": "proj-xxxx",
"name": "my_image_search_pipeline",
"description": "A pipeline that searches image by text.",
"type": "SEARCH",
"functions": [
{
"name": "search_image_by_text",
"action": "SEARCH_IMAGE_BY_TEXT",
"embedding": "zilliz/clip-vit-base-patch32-multilingual-v1",
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"collectionName": "my_collection"
}
]
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. A Search pipeline can have only one function.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
  - action: The type of the function to add. Currently, available options include SEARCH_DOC_CHUNK, SEARCH_TEXT, SEARCH_IMAGE_BY_IMAGE, and SEARCH_IMAGE_BY_TEXT.
  - clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
  - collectionName: The name of the collection in which you want to create a pipeline.
  - embedding: The embedding model used during vector search. Here, you should use the embedding model zilliz/clip-vit-base-patch32-multilingual-v1. This model is a multi-lingual variant of OpenAI's CLIP-ViT-B32 model. It is designed to work together with the zilliz/clip-vit-base-patch32 vision model and can process text in more than 50 languages.
Below is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxxx",
"name": "my_image_search_pipeline",
"type": "SEARCH",
"createTimestamp": 1721187300000,
"description": "A pipeline that searches image by image.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions":
{
"action": "SEARCH_IMAGE_BY_TEXT",
"name": "search_image_by_text",
"inputFields": ["query_text"],
"clusterId": "in03-***************",
"collectionName": "my_collection",
"embedding": "zilliz/clip-vit-base-patch32-multilingual-v1"
}
}
}
The total usage data may be delayed by a few hours due to technical limitations.
Run image search pipeline
- Cloud Console
- Bash
- Click the "▶️" button next to your Search pipeline. Alternatively, you can also click the Playground tab.
- Input the query text. Click Run.
- Check the results.
- Enter a new query text to rerun the pipeline.
The following example runs the Search pipeline named my_image_search_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"query_text": "Can you show me the image of a cat?",
},
"params":{
"limit": 1,
"offset": 0,
"outputFields": ["image_id", "image_title"],
"filter": "id >= 0"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- query_text: The query text used to conduct a similarity search.
- params: The search parameters to configure.
  - limit: The maximum number of entities to return. The value should be an integer ranging from 1 to 100. The sum of this value and that of offset should be less than 1024.
  - offset: The number of entities to skip in the search results. The sum of this value and that of limit should not be greater than 1024. The maximum value is 1024.
  - outputFields: An array of fields to return along with the search results. Note that id (entity ID) and distance are returned in the search results by default. If you need other output fields in the returned results, you can configure this parameter.
  - filter: The filter in boolean expression used to find matches for the search.
Below is an example response.
{
"code": 200,
"data": {
"result": [
{
"id": "my-img-123456",
"distance": 0.40448662638664246,
"image_id": "my-img-123456",
"image_title": "A cute yellow cat"
}
],
"usage": {
"embedding": 1
}
}
}
The usage data may be delayed by a few hours due to technical limitations.
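If a query matches more entities than one call returns, you can page through results by increasing offset while keeping limit fixed, so long as limit + offset stays within the 1024 constraint described above. A minimal sketch of this pattern, assuming jq is installed:

LIMIT=10
for OFFSET in 0 10 20; do
  curl --silent --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer ${YOUR_API_KEY}" \
    --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
    -d "{
      \"data\": { \"query_text\": \"Can you show me the image of a cat?\" },
      \"params\": { \"limit\": ${LIMIT}, \"offset\": ${OFFSET}, \"outputFields\": [\"image_id\", \"image_title\"] }
    }" \
    | jq -r '.data.result[] | "\(.image_id)\t\(.image_title)"'
done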
Delete image data
To delete any data, you need to first create a deletion pipeline and then run it.
You must create an Ingestion pipeline first. Upon successful creation of an Ingestion pipeline, you can create a Search pipeline and a Deletion pipeline to work with your newly created Ingestion pipeline.
Create image deletion pipeline
- Cloud Console
- Bash
- Navigate to your project.
- Click Pipelines from the navigation panel. Then switch to the Overview tab and click Pipelines. To create a pipeline, click + Pipeline.
- Choose the type of pipeline to create. Click the + Pipeline button in the Deletion Pipeline column.
- Configure the Deletion pipeline you wish to create.

| Parameters | Description |
|---|---|
| Pipeline Name | The name of the new Deletion pipeline. It should contain only lowercase letters, numbers, and underscores. |
| Description (Optional) | The description of the new Deletion pipeline. |

- Add a function to the Deletion pipeline by clicking + Function. You can add exactly one function.
  - Enter the function name.
  - Select either PURGE_IMAGE_INDEX or PURGE_BY_EXPRESSION as the Function Type. A PURGE_IMAGE_INDEX function deletes all images with the specified image_id, while a PURGE_BY_EXPRESSION function deletes all entities matching the specified filter expression.
  - Click Add to save your function.
- Click Create Deletion Pipeline.
The example below creates a Deletion pipeline named my_image_deletion_pipeline
with a PURGE_IMAGE_INDEX function added.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zilliz.cloud.com/v1/pipelines" \
-d '{
"projectId": "proj-xxxx",
"name": "my_image_deletion_pipeline",
"description": "A pipeline that deletes image by id",
"type": "DELETION",
"functions": [
{
"name": "purge_image_by_id",
"action": "PURGE_IMAGE_INDEX"
}
],
"clusterId": "inxx-xxxxxxxxxxxxxxx",
"collectionName": "my_collection"
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- projectId: The ID of the project in which you want to create a pipeline. Learn more about How Can I Obtain the Project ID?
- name: The name of the pipeline to create. The pipeline name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
- description (optional): The description of the pipeline to create.
- type: The type of the pipeline to create. Currently, available pipeline types include INGESTION, SEARCH, and DELETION.
- functions: The function(s) to add to the pipeline. A Deletion pipeline can have only one function.
  - name: The name of the function. The function name should be a string of 3-64 characters and can contain only alphanumeric letters and underscores.
  - action: The type of the function to add. Available options include PURGE_DOC_INDEX, PURGE_TEXT_INDEX, PURGE_BY_EXPRESSION, and PURGE_IMAGE_INDEX.
- clusterId: The ID of the cluster in which you want to create a pipeline. Currently, you can only choose a cluster deployed in us-west1 on GCP. Learn more about How can I find my CLUSTER_ID?
- collectionName: The name of the collection in which you want to create a pipeline.
Below is an example output.
{
"code": 200,
"data": {
"id": 0,
"name": "my_image_deletion_pipeline",
"type": "DELETION",
"createTimestamp": 1721187655000,
"description": "A pipeline that deletes image by id",
"status": "SERVING",
"functions": [
{
"name": "purge_image_by_id",
"action": "PURGE_IMAGE_INDEX",
"inputFields": ["image_id"]
}
],
"clusterId": "in03-xxxx",
"collectionName":" my_collection"
}
}
Run image deletion pipeline
- Cloud Console
- Bash
- Click the "▶️" button next to your Deletion pipeline. Alternatively, you can also click the Playground tab.
- Input the image ID or the filter expression, depending on the function you selected. Click Run.
- Check the results.
The following example runs the Deletion pipeline named my_image_deletion_pipeline.
curl --request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
-d '{
"data": {
"image_id": "my-img-123456"
}
}'
The parameters in the above code are described as follows:
- YOUR_API_KEY: The credential used to authenticate API requests. Learn more about how to View API Keys.
- cloud-region: The ID of the cloud region where your cluster exists. Currently, only gcp-us-west1 is supported.
- image_id: The ID of the image to delete.
Below is an example response.
{
"code": 200,
"data": {
"num_deleted_entities": 1
}
}
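To remove several images in one pass, you can call the same run endpoint once per ID and check num_deleted_entities in each response. A minimal sketch, assuming jq is installed; the list of image IDs is purely illustrative.

for IMAGE_ID in my-img-123456 my-img-123457 my-img-123458; do
  DELETED=$(curl --silent --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer ${YOUR_API_KEY}" \
    --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}/run" \
    -d "{ \"data\": { \"image_id\": \"${IMAGE_ID}\" } }" \
    | jq -r '.data.num_deleted_entities')
  echo "${IMAGE_ID}: deleted ${DELETED} entities"
done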
Manage pipeline
The following are relevant operations for managing the pipelines created in the preceding steps.
View pipeline
- Cloud Console
- Bash
Click Pipelines on the left navigation. Choose the Pipelines tab. You will see all the available pipelines.
Click on a specific pipeline to view its detailed information including its basic information, total usage, functions, and related connectors.
The total usage data may be delayed by a few hours due to technical limitations.
You can also check the pipeline activities on the web UI.
You can call the API to list all existing pipelines or view the details of a particular pipeline.
- View all existing pipelines
Follow the example below and specify your projectId. Learn more about how to obtain the project ID.
curl --request GET \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines?projectId=proj-xxxx"
Below is an example output.
{
"code": 200,
"data": [
{
"pipelineId": "pipe-xxxx",
"name": "my_text_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187655000,
"clusterId": "in03-***************",
"collectionName": "my_collection"
"description": "A pipeline that generates text embeddings and stores additional fields.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"action": "INDEX_TEXT",
"name": "index_my_text",
"inputFields": ["text_list"],
"language": "ENGLISH",
"embedding": "zilliz/bge-base-en-v1.5"
},
{
"action": "PRESERVE",
"name": "keep_text_info",
"inputField": "source",
"outputField": "source",
"fieldType": "VarChar"
}
]
},
{
"pipelineId": "pipe-xxxx",
"name": "my_text_search_pipeline",
"type": "SEARCH",
"createTimestamp": 1721187655000,
"description": "A pipeline that receives text and search for semantically similar texts",
"status": "SERVING",
"totalUsage": {
"embedding": 0,
"rerank": 0
},
"functions":
{
"action": "SEARCH_TEXT",
"name": "search_text",
"inputFields": "query_text",
"clusterId": "in03-***************",
"collectionName": "my_collection",
"embedding": "zilliz/bge-base-en-v1.5",
"reranker": "zilliz/bge-reranker-base"
}
},
{
"pipelineId": "pipe-xxxx",
"name": "my_text_deletion_pipeline",
"type": "DELETION",
"createTimestamp": 1721187655000,
"description": "A pipeline that deletes entities by expression",
"status": "SERVING",
"functions":
{
"action": "PURGE_BY_EXPRESSION",
"name": "purge_data_by_expression",
"inputFields": ["expression"]
},
"clusterId": "in03-***************",
"collectionName": "my_collection"
}
]
}

Notes: The total usage data may be delayed by a few hours due to technical limitations.
- View the details of a specific pipeline
Follow the example below to view the details of a pipeline.
curl --request GET \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}"Below is example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxx",
"name": "my_text_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187300000,
"description": "A pipeline that generates text embeddings and stores additional fields.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"name": "index_my_text",
"action": "INDEX_TEXT",
"inputFields": ["text_list"],
"language": "ENGLISH",
"embedding": "zilliz/bge-base-en-v1.5"
},
{
"name": "keep_text_info",
"action": "PRESERVE",
"inputField": "source",
"outputField": "source",
"fieldType": "VarChar"
}
],
"clusterId": "inxx-xxxx",
"collectionName": "my_collection"
}
}

Notes: The total usage data may be delayed by a few hours due to technical limitations.
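When a project accumulates many pipelines, it can be convenient to filter the listing on the client side. The sketch below is one way to do this with jq (assumed to be installed); it reuses the list endpoint above and prints the ID, name, and status of every SEARCH pipeline in the project.

curl --silent --request GET \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${YOUR_API_KEY}" \
  --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines?projectId=proj-xxxx" \
  | jq -r '.data[] | select(.type == "SEARCH") | "\(.pipelineId)\t\(.name)\t\(.status)"'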
Delete pipeline
If you no longer need a pipeline, you can drop it. Note that dropping a pipeline will not remove the auto-created collection where it ingested data.
Dropped pipelines cannot be recovered. Please be cautious with this action.
Dropping a data-ingestion pipeline does not affect the collection created along with the pipeline. Your data is safe.
- Cloud Console
- Bash
To drop a pipeline on the web UI, click the ... button under the Actions column. Then click Drop.
Follow the example below to drop a pipeline.
curl --request DELETE \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${YOUR_API_KEY}" \
--url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${YOUR_PIPELINE_ID}"
The following is an example output.
{
"code": 200,
"data": {
"pipelineId": "pipe-xxx",
"name": "my_text_ingestion_pipeline",
"type": "INGESTION",
"createTimestamp": 1721187300000,
"description": "A pipeline that generates text embeddings and stores additional fields.",
"status": "SERVING",
"totalUsage": {
"embedding": 0
},
"functions": [
{
"name": "index_my_text",
"action": "INDEX_TEXT",
"inputFields": ["text_list"],
"language": "ENGLISH",
"embedding": "zilliz/bge-base-en-v1.5"
},
{
"name": "keep_text_info",
"action": "PRESERVE",
"inputField": "source",
"outputField": "source",
"fieldType": "VarChar"
}
],
"clusterId": "inxx-xxxx",
"collectionName": "my_collection"
}
}
The total usage data may be delayed by a few hours due to technical limitations.
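If you track pipelines by name rather than ID, you can resolve the ID from the listing endpoint and then drop the pipeline. The sketch below is one way to do this, assuming jq is installed and that the drop operation accepts the DELETE request shown above; the pipeline name is illustrative.

PIPELINE_NAME="my_image_ingestion_pipeline"

# Resolve the pipeline ID from its name using the list endpoint.
PIPELINE_ID=$(curl --silent --request GET \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${YOUR_API_KEY}" \
  --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines?projectId=proj-xxxx" \
  | jq -r --arg name "${PIPELINE_NAME}" '.data[] | select(.name == $name) | .pipelineId')

# Drop the resolved pipeline.
curl --request DELETE \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${YOUR_API_KEY}" \
  --url "https://controller.api.{cloud-region}.zillizcloud.com/v1/pipelines/${PIPELINE_ID}"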