Version: User Guides (Cloud)

Import Data (RESTful API)

This page describes how to import prepared data into a cluster via the Zilliz Cloud RESTful API.

Before you start

Make sure the following conditions are met:

  • You have obtained an API key for your cluster. For details, see API Keys.

  • You have prepared your data in either of the supported formats.

    For details on how to prepare your data, refer to Storage Options and Format Options. You can also refer to the end-to-end notebook Data Import Hands-On for a complete walkthrough.

  • You have created a collection with a schema matching the example dataset.

    For details on creating a collection, see Manage Collections (Console).

Import data via volume

To import data from a volume, first create a managed or external volume. For a managed volume, upload your data files to the volume. For an external volume, ensure the data files are in the mapped cloud storage bucket. Then import the data as follows:

curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/create" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "dbName": "default",
    "collectionName": "medium_articles",
    "partitionName": "",
    "volumeName": "my_volume",
    "dataPaths": [
      [
        "json-folder/1.json"
      ]
    ]
  }'

To import data into a specific partition, you need to include partitionName in the request.
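For example, a request body that targets a specific partition might look like the following. The partition name articles_2023 is hypothetical; all other fields are as in the request above:

```json
{
  "clusterId": "inxx-xxxxxxxxxxxxxxx",
  "dbName": "default",
  "collectionName": "medium_articles",
  "partitionName": "articles_2023",
  "volumeName": "my_volume",
  "dataPaths": [
    [
      "json-folder/1.json"
    ]
  ]
}
```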

After Zilliz Cloud processes the above request, you will receive a job ID. Use this job ID to monitor the import progress with the following command:

curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/getProgress" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "jobId": "job-xxxxxxxxxxxxxxxxxxxxx"
  }'

Import data via external storage

To import data from files via external storage, you must first upload the files to an object storage bucket, such as AWS S3 or Google Cloud Storage (GCS). Once uploaded, obtain the path to the files in the remote bucket and bucket credentials for Zilliz Cloud to pull data from your bucket. For details on supported object paths, refer to Storage Options.

Based on your data security requirements, you can use either long-term or short-term credentials during data import.

For more information about obtaining long-term or short-term credentials, refer to the documentation for your cloud provider. For more information about using session tokens, refer to this FAQ.

📘Notes

For the import to succeed, ensure the target collection has fewer than 10,000 running or pending import jobs.

Once the object path and bucket credentials are obtained, call the API as follows:

# Replace the cluster ID, object URL, API key, and bucket credentials with your own
curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/create" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "collectionName": "medium_articles",
    "partitionName": "",
    "objectUrl": "https://assets.zilliz.com/docs/example-data-import.json",
    "accessKey": "",
    "secretKey": ""
  }'

To import data into a specific partition, you need to include partitionName in the request.

After Zilliz Cloud processes the above request, you will receive a job ID. Use this job ID to monitor the import progress with the following command:

curl --request POST \
  --url "https://api.cloud.zilliz.com/v2/vectordb/jobs/import/getProgress" \
  --header "Authorization: Bearer ${API_KEY}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  -d '{
    "clusterId": "inxx-xxxxxxxxxxxxxxx",
    "jobId": "job-xxxxxxxxxxxxxxxxxxxxx"
  }'

For details, see Import and Get Import Progress.
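The monitoring step above can be sketched as a simple polling loop. In this sketch, get_progress is a stub standing in for the getProgress call shown earlier, and the data.state field and its Completed value are assumptions about the response shape:

```shell
# Poll the import job until it reports completion.
# get_progress is a stub; in practice, replace its body with the
# curl call to /v2/vectordb/jobs/import/getProgress shown above.
get_progress() {
  echo '{"code": 0, "data": {"jobId": "job-xxxxxxxxxxxxxxxxxxxxx", "state": "Completed"}}'
}

state=""
while [ "$state" != "Completed" ]; do
  # Extract data.state from the JSON response
  state=$(get_progress | python3 -c "import json, sys; print(json.load(sys.stdin)['data']['state'])")
  [ "$state" = "Completed" ] || sleep 5   # avoid hammering the endpoint
done
echo "Import job state: $state"
```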

Verify the result

If the command output is similar to the following, the import job has been submitted successfully:

{
  "code": 0,
  "data": {
    "jobId": "job-xxxxxxxxxxxxxxxxxxxxx"
  }
}
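To chain the two calls in a script, extract data.jobId from the create response and pass it to getProgress. The sketch below parses a saved sample response (resp.json is a hypothetical file name) with Python's standard json module:

```shell
# Sample create-import response, as returned above
cat > resp.json <<'EOF'
{"code": 0, "data": {"jobId": "job-xxxxxxxxxxxxxxxxxxxxx"}}
EOF

# Pull out data.jobId for use in the getProgress request body
JOB_ID=$(python3 -c "import json; print(json.load(open('resp.json'))['data']['jobId'])")
echo "$JOB_ID"
```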

You can also call the RESTful APIs to get the progress of the current import job or to list all import jobs. Alternatively, go to the job center on the Zilliz Cloud console to view the result and job details.

FAQ

What is the difference between an external volume and importing directly from external storage?

Both allow you to import data from your own S3 or GCS bucket. The key differences are:

  • External volume uses a storage integration for credential management. Credentials are set up once and reused across multiple volumes and operations. Data engineers do not need direct access to cloud storage keys.

  • Direct external storage import requires you to provide credentials (access key, secret key) inline with each import request. This is simpler for one-time imports but does not offer credential separation or reusability.