Version: User Guides (Cloud)

Import Data (RESTful API)

This page describes how to import prepared data into a cluster via the Zilliz Cloud RESTful API.

Before you start

Make sure the following conditions are met:

  • You have obtained an API key for your cluster. For details, see API Keys.

  • You have prepared your data in either of the supported formats.

    For details on how to prepare your data, refer to Storage Options and Format Options. You can also refer to the end-to-end notebook Data Import Hands-On for a complete walkthrough.

  • You have created a collection with a schema matching the example dataset.

    For details on creating a collection, see Manage Collections (Console).

Import data via volume

To import data from a volume, first create a managed or external volume. For a managed volume, upload your data files to the volume. For an external volume, ensure the data files are in the mapped cloud storage bucket. Then import the data as follows:

curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/create" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "dbName": "default",
    "collectionName": "medium_articles",
    "partitionName": "",
    "volumeName": "my_volume",
    "dataPaths": [
      [
        "json-folder/1.json"
      ]
    ]
  }'

To import data into a specific partition, you need to include partitionName in the request.
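For example, a request body that targets a specific partition might look like the following. The partition name articles_2023 is hypothetical; all other fields are as in the request above:

```json
{
  "clusterId": "inxx-xxxxxxxxxxxxxxx",
  "dbName": "default",
  "collectionName": "medium_articles",
  "partitionName": "articles_2023",
  "volumeName": "my_volume",
  "dataPaths": [
    [
      "json-folder/1.json"
    ]
  ]
}
```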

After Zilliz Cloud processes the above request, you will receive a job ID. Use this job ID to monitor the import progress with the following command:

curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/getProgress" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "jobId": "job-xxxxxxxxxxxxxxxxxxxxx"
  }'

Import data via external storage

To import data from files via external storage, you must first upload the files to an object storage bucket, such as AWS S3 or Google Cloud Storage (GCS). Once uploaded, obtain the path to the files in the remote bucket and bucket credentials for Zilliz Cloud to pull data from your bucket. For details on supported object paths, refer to Storage Options.

Based on your data security requirements, you can use either long-term or short-term credentials during data import.

For more information about obtaining long-term or short-term credentials, refer to the documentation for your cloud provider. For more information about using session tokens, refer to this FAQ.

📘Notes

For the import to succeed, ensure the target collection has fewer than 10,000 running or pending import jobs.

Once the object path and bucket credentials are obtained, call the API as follows:

# Replace the cluster ID, object URL, API key, and bucket credentials with your own
curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/create" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "collectionName": "medium_articles",
    "partitionName": "",
    "objectUrl": "https://assets.zilliz.com/docs/example-data-import.json",
    "accessKey": "",
    "secretKey": ""
  }'

To import data into a specific partition, you need to include partitionName in the request.

After Zilliz Cloud processes the above request, you will receive a job ID. Use this job ID to monitor the import progress with the following command:

curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/getProgress" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "jobId": "job-xxxxxxxxxxxxxxxxxxxxx"
  }'

For details, see Import and Get Import Progress.
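The monitoring step above can be sketched as a simple polling loop. In this sketch, get_progress is a stub standing in for the getProgress call shown earlier, and the data.state field and its Completed value are assumptions about the response shape:

```shell
# Poll the import job until it reports completion.
# get_progress is a stub; in practice, replace its body with the
# curl call to /v2/vectordb/jobs/import/getProgress shown above.
get_progress() {
  echo '{"code": 0, "data": {"jobId": "job-xxxxxxxxxxxxxxxxxxxxx", "state": "Completed"}}'
}

state=""
while [ "$state" != "Completed" ]; do
  # Extract data.state from the JSON response
  state=$(get_progress | python3 -c "import json, sys; print(json.load(sys.stdin)['data']['state'])")
  [ "$state" = "Completed" ] || sleep 5   # avoid hammering the endpoint
done
echo "Import job state: $state"
```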

Verify the result

If the command output is similar to the following, the import job has been submitted successfully:

{
  "code": 0,
  "data": {
    "jobId": "job-xxxxxxxxxxxxxxxxxxxxx"
  }
}
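To chain the two calls in a script, extract data.jobId from the create response and pass it to getProgress. The sketch below parses a saved sample response (resp.json is a hypothetical file name) with Python's standard json module:

```shell
# Sample create-import response, as returned above
cat > resp.json <<'EOF'
{"code": 0, "data": {"jobId": "job-xxxxxxxxxxxxxxxxxxxxx"}}
EOF

# Pull out data.jobId for use in the getProgress request body
JOB_ID=$(python3 -c "import json; print(json.load(open('resp.json'))['data']['jobId'])")
echo "$JOB_ID"
```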

You can also call the RESTful APIs to get the progress of the current import job or to list all import jobs. Alternatively, go to the job center on the Zilliz Cloud console to view the result and job details.

FAQ

What is the difference between an external volume and importing directly from external storage?

Both allow you to import data from your own S3 or GCS bucket. The key differences are:

  • External volume uses a storage integration for credential management. Credentials are set up once and reused across multiple volumes and operations. Data engineers do not need direct access to cloud storage keys.

  • Direct external storage import requires you to provide credentials (access key, secret key) inline with each import request. This is simpler for one-time imports but does not offer credential separation or reusability.