
Set Collection TTL

Once data is inserted into a collection, it remains there by default. However, in some scenarios, you may want to remove or clean up data after a certain period. In such cases, you can configure the collection’s Time-to-Live (TTL) property so that Zilliz Cloud automatically deletes the data once the TTL expires.

Overview

Time-to-Live (TTL) is commonly used in databases when data should remain valid or accessible only for a certain period after it is inserted or modified. Once that period has passed, the data can be removed automatically.

For instance, if you ingest data daily but only need to retain records for 14 days, you can configure Zilliz Cloud to automatically remove any data older than that by setting the collection’s TTL to 14 × 24 × 3600 = 1209600 seconds. This ensures that only the most recent 14 days’ worth of data remain in the collection.
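The arithmetic above can be checked with a short helper. This is only a convenience sketch; the function name `ttl_seconds` is a hypothetical choice and is not part of any SDK.

```python
from datetime import timedelta


def ttl_seconds(days: int) -> int:
    """Convert a retention period in days to whole seconds."""
    return int(timedelta(days=days).total_seconds())


print(ttl_seconds(14))  # 1209600
print(ttl_seconds(30))  # 2592000
```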

📘Notes

Expired entities will not appear in any search or query results. However, they may stay in the storage until the subsequent data compaction, which should be carried out within the next 24 hours.

The TTL property of a Zilliz Cloud collection is specified as an integer number of seconds. Once it is set, any data older than the TTL is automatically deleted from the collection.

Because the deletion process is asynchronous, data might not be removed from search results at the exact moment the specified TTL elapses. There may be a delay, because removal depends on the garbage collection (GC) and compaction processes, which run at non-deterministic intervals.

Examples

Generally, Collection TTL is closely related to when TTL settings are enforced and when entities are inserted or updated. To better understand the TTL mechanism, consider the following examples.

Example 1: Set TTL upon collection creation

You set the TTL to 2592000 (30 days) when you create a collection.

At 00:00 on January 1st, you insert 10 billion entities, and no other write operations follow.

After 00:00 on January 31st, the 10 billion entities will become unsearchable, and the result of a query with the output fields set to count(*) will be 0.
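The count check mentioned above could be written roughly as follows. This is a sketch under assumptions: `client` is taken to be an already connected `pymilvus` `MilvusClient`, the helper name `count_entities` is hypothetical, and the query is assumed to return a single row keyed by `count(*)`.

```python
def count_entities(client, collection_name: str) -> int:
    """Return the number of entities currently visible to queries.

    Assumes `client` behaves like a connected pymilvus MilvusClient and
    that a count(*) query returns one row such as {"count(*)": 0}.
    """
    rows = client.query(
        collection_name=collection_name,
        filter="",                     # no filter: count the whole collection
        output_fields=["count(*)"],
    )
    return rows[0]["count(*)"]

# Usage (requires a live cluster):
# print(count_entities(client, "my_collection"))
```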

Example 2: Set TTL for an existing collection

You have created a collection without TTL.

At 00:00 on January 1st, you insert 10 billion entities.

At 00:00 on January 31st, you insert another 20 billion entities, and no other write operations follow.

At 10:00 on February 28th, you set TTL to 2592000 (30 days) for the collection.

The 10 billion entities inserted on January 1st become unsearchable immediately after the TTL has been set, and the result of a query with the output fields set to count(*) will be 20 billion.

Example 3: Upsert entities

You set the TTL to 2592000 (30 days) when you create a collection.

At 00:00 on January 1st, you insert 20 billion entities, and no other write operations follow.

From 00:00 to 23:59:59 on January 15th, you upsert all the 20 billion entities in merge mode, and no other write operations follow.

During the period from January 31st to February 13th, the 20 billion entities remain searchable, and the query count remains 20 billion.

Starting at 00:00 on February 14th, the query count decreases and reaches 0 by 00:00 on February 15th.
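The three examples can be reproduced with a simplified model in which an entity expires once the TTL has elapsed since its last insert or upsert. This is an illustrative sketch, not how Zilliz Cloud implements TTL; the function names and the year 2025 are arbitrary assumptions, since the examples do not name a year.

```python
from datetime import datetime, timedelta

TTL_SECONDS = 2592000  # 30 days, as in the examples


def expires_at(last_write: datetime, ttl_seconds: int = TTL_SECONDS) -> datetime:
    """Time at which an entity last written at `last_write` expires."""
    return last_write + timedelta(seconds=ttl_seconds)


def is_expired(last_write: datetime, now: datetime,
               ttl_seconds: int = TTL_SECONDS) -> bool:
    """True if the TTL has fully elapsed since the last write."""
    return now >= expires_at(last_write, ttl_seconds)


# Example 1: inserted on January 1st, expires on January 31st.
print(expires_at(datetime(2025, 1, 1)))             # 2025-01-31 00:00:00

# Example 2: TTL is set at 10:00 on February 28th.
now = datetime(2025, 2, 28, 10)
print(is_expired(datetime(2025, 1, 1), now))        # True
print(is_expired(datetime(2025, 1, 31), now))       # False

# Example 3: an upsert on January 15th restarts the clock.
print(expires_at(datetime(2025, 1, 15)))            # 2025-02-14 00:00:00
```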

Set TTL

You can set the TTL property when you create a collection or alter an existing one.

Set TTL when creating a collection

The following code snippet demonstrates how to set the TTL property when you create a collection.

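from pymilvus import MilvusClient, DataType

# Placeholder values: replace with your own cluster endpoint and token.
client = MilvusClient(uri="YOUR_CLUSTER_ENDPOINT", token="YOUR_CLUSTER_TOKEN")

# Minimal schema assumed for illustration: a primary key and a vector field.
schema = MilvusClient.create_schema()
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)

# With TTL
client.create_collection(
    collection_name="my_collection",
    schema=schema,
    properties={
        "collection.ttl.seconds": 1209600  # 14 days
    }
)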

Set TTL for an existing collection

The following code snippet demonstrates how to alter the TTL property in an existing collection.

client.alter_collection_properties(
    collection_name="my_collection",
    properties={"collection.ttl.seconds": 1209600}
)
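To confirm that the change took effect, you could read the property back from the collection description. The helper below is a sketch under assumptions: it expects `describe_collection` to return a dictionary with a `properties` entry, the value may come back as a string or an integer, and the name `get_collection_ttl` is hypothetical.

```python
def get_collection_ttl(client, collection_name: str):
    """Return the collection TTL in seconds, or None if no TTL is set.

    Assumes `client` behaves like a connected pymilvus MilvusClient and
    that the description dictionary exposes a "properties" mapping.
    """
    desc = client.describe_collection(collection_name=collection_name)
    props = desc.get("properties") or {}
    value = props.get("collection.ttl.seconds")
    # Normalize to int because the value may be stored as a string.
    return int(value) if value is not None else None

# Usage (requires a live cluster):
# print(get_collection_ttl(client, "my_collection"))
```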

Drop TTL setting

If you decide to keep the data in a collection indefinitely, you can simply drop the TTL setting from that collection.

client.drop_collection_properties(
    collection_name="my_collection",
    property_keys=["collection.ttl.seconds"]
)

FAQs

When does data expire due to TTL settings?

Currently, data expires based on the time at which it was last inserted or upserted. Expired data is not returned in search results. For details, refer to Examples.

When will the expired data be physically deleted?

Once the data expires, it will not be included in any search results. However, it will be physically deleted only after the subsequent system compaction, according to your cluster's compaction policies.

If you need to delete the data shortly after it expires, contact us.

When will the CU capacity decrease?

The CU capacity of a cluster is determined by the higher of its memory usage and its storage usage. If storage usage is the determining factor, you can see the CU capacity decrease on the Zilliz Cloud console after the expired data is physically deleted.