Import Data (RESTful API)
This page describes how to import prepared data using the Zilliz Cloud RESTful API.
Before you start
Make sure the following conditions are met:
- You have obtained an API key for your cluster. For details, see API Keys and On Zilliz Cloud Console.
- You have prepared your data in one of the supported formats. For details on how to prepare your data, refer to Prepare Source Data. For a complete walkthrough, see the end-to-end notebook Data Import from Zero to Hero.
- You have created a collection whose schema matches the example dataset, and the collection is already indexed and loaded. For details, see Example Dataset and Manage Collections.
Import data using the RESTful API
To import data from files using the RESTful API, you must first upload the files to an object storage bucket, such as AWS S3 or Google Cloud Storage (GCS). Once uploaded, obtain the path to the files in the remote bucket and bucket credentials for Zilliz Cloud to pull data from your bucket. For details on supported object paths, refer to Import Data (RESTful API).
For the import to succeed, ensure the target collection has fewer than 10 running or pending import jobs.
Once the object path and bucket credentials are obtained, call the API as follows:
# replace url and token with your own
curl --request POST \
    --url "https://controller.api.${CLOUD_REGION_ID}.zillizcloud.com/v1/vector/collections/import" \
    --header "Authorization: Bearer ${TOKEN}" \
    --header "accept: application/json" \
    --header "content-type: application/json" \
    -d '{
        "clusterId": "${CLUSTER_ID}",
        "collectionName": "medium_articles",
        "objectUrl": "gs://publicdataset-zillizcloud-com/medium_articles_2020.json",
        "accessKey": "your-access-key",
        "secretKey": "your-secret-key"
    }'
In the command above, replace ${CLOUD_REGION_ID}, ${TOKEN}, and ${CLUSTER_ID} with your cloud region identifier, API key, and cluster ID, respectively.
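Note that the request body above is wrapped in single quotes, so the shell does not expand ${CLUSTER_ID} inside it; the placeholder has to be replaced by hand. If you would rather keep the values in shell variables, one possible approach is to build the body first and pass it to curl with -d "$PAYLOAD". The sketch below assumes a hypothetical helper named build_import_payload (not part of any Zilliz tooling) and assumes the values contain no double quotes or backslashes, since it performs no JSON escaping.

```shell
# Hypothetical helper (not part of Zilliz tooling): print the import request
# body with the given values filled in.
# Arguments: $1 cluster ID, $2 collection name, $3 object URL,
#            $4 access key, $5 secret key.
# Assumption: no value contains a double quote or backslash (no escaping is done).
build_import_payload() {
  printf '{"clusterId": "%s", "collectionName": "%s", "objectUrl": "%s", "accessKey": "%s", "secretKey": "%s"}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example values taken from this page; replace them with your own.
CLUSTER_ID="in03-3bf3c31f4248e22"
PAYLOAD="$(build_import_payload "$CLUSTER_ID" "medium_articles" \
  "gs://publicdataset-zillizcloud-com/medium_articles_2020.json" \
  "your-access-key" "your-secret-key")"

echo "$PAYLOAD"
```

Because the body is assembled in a double-quoted variable, the import command can then use -d "$PAYLOAD" in place of the single-quoted literal.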
You can obtain CLOUD_REGION_ID and CLUSTER_ID from your cluster's public endpoint. For instance, in the public endpoint https://in03-3bf3c31f4248e22.api.aws-us-east1.zillizcloud.com, CLOUD_REGION_ID is aws-us-east1 and CLUSTER_ID is in03-3bf3c31f4248e22. To find your cluster endpoint on the Zilliz Cloud console, refer to On Zilliz Cloud Console.
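The same split can be done in the shell rather than by eye. This is a minimal sketch that assumes the endpoint always has the form https://<cluster-id>.api.<region-id>.zillizcloud.com, as in the example above; the variable names ENDPOINT, host, and rest are illustrative.

```shell
# Split a public endpoint into CLUSTER_ID and CLOUD_REGION_ID using
# POSIX parameter expansion.
# Assumption: the endpoint looks like
#   https://<cluster-id>.api.<region-id>.zillizcloud.com
ENDPOINT="https://in03-3bf3c31f4248e22.api.aws-us-east1.zillizcloud.com"

host="${ENDPOINT#https://}"                  # strip the scheme
host="${host%%/*}"                           # drop any trailing path
CLUSTER_ID="${host%%.api.*}"                 # everything before ".api."
rest="${host#*.api.}"                        # everything after ".api."
CLOUD_REGION_ID="${rest%.zillizcloud.com}"   # drop the domain suffix

echo "CLUSTER_ID=${CLUSTER_ID}"              # prints CLUSTER_ID=in03-3bf3c31f4248e22
echo "CLOUD_REGION_ID=${CLOUD_REGION_ID}"    # prints CLOUD_REGION_ID=aws-us-east1
```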
Upon executing the request, you will receive a job ID. Use this job ID to monitor the import progress with the following command:
curl --request GET \
    --url "https://controller.api.${CLOUD_REGION_ID}.zillizcloud.com/v1/vector/collections/import/get?jobId=${JOBID}&clusterId=${CLUSTER_ID}" \
    --header "Authorization: Bearer ${TOKEN}" \
    --header "accept: application/json" \
    --header "content-type: application/json"
For details, see Import and Get Import Progress.
Verify the result
If the command output is similar to the following, the import request has been submitted successfully:
{
"code": 200,
"data": {
"jobId": "string"
}
}
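To feed the returned job ID into the progress query, you can capture it in the JOBID variable. The sketch below is one way to do this with sed alone; it assumes the response has the shape shown above, and the value job-123 is a made-up placeholder, not a real job ID. A proper JSON parser is more robust if one is available on your system.

```shell
# Sample response in the shape shown above.
# "job-123" is a placeholder value used only for illustration.
RESPONSE='{
  "code": 200,
  "data": {
    "jobId": "job-123"
  }
}'

# Extract the string value of the "jobId" field.
# sed works line by line, so this handles both compact and pretty-printed output.
JOBID="$(printf '%s\n' "$RESPONSE" \
  | sed -n 's/.*"jobId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"

echo "$JOBID"
```

In practice, RESPONSE would hold the output of the import curl command, for example by wrapping that command in RESPONSE="$(...)".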
You can also call the RESTful APIs to get the progress of the current import job or to list all import jobs. Alternatively, go to the Zilliz Cloud console to view the result and job details.