Release Notes (Sept 4, 2024)

This release brings several key updates to Zilliz Cloud, starting with the GA of Zilliz Cloud Serverless, which offers automatic scaling and enables up to 50x cost savings. Milvus 2.4 features are now GA, introducing capabilities such as sparse vectors, multi-vector hybrid search, and inverted index with fuzzy matching. This release also includes multi-replica functionality in public preview, allowing users to improve query throughput and availability by distributing workloads across replicas in multiple Availability Zones (AZs). Additionally, Zilliz Cloud’s new Migration Service supports migration from open-source Milvus, pgvector, and Elasticsearch, and enables intra-organization and cross-organization data migrations within Zilliz Cloud. The expanded RESTful APIs for backup, restore, migration, and job management empower users to build automated operational workflows. Further enhancements include support for Project Read-only roles and the ability to rename clusters and snapshots.

Milvus Compatibility

This release is compatible with Milvus 2.4.x.

Serverless GA

After a year of refinement, Zilliz Cloud Serverless is now generally available. Designed as a hassle-free serverless vector database for GenAI applications, Zilliz Cloud Serverless offers automatic scaling that adjusts to your app's demands, delivering up to 50x cost savings. Its cost-efficiency is made possible by a tiered storage system that optimizes data placement across DRAM, SSD, and object storage, ensuring quick access to active data while reducing costs for less frequently used data — all without manual intervention.

Unlike dedicated clusters, the serverless service ensures you only pay for what you use, eliminating the costs for idle servers. With convenient migration features, you can easily transfer your data from open-source Milvus to Zilliz Cloud Serverless, or from Serverless to a Dedicated Cluster, to accommodate your changing needs.

Learn more or get a free trial.

Milvus 2.4.x New Features GA on Zilliz Cloud

Milvus 2.4 offers many highly practical capabilities for RAG and multimodal data search. If you want to try these new features, you can update your cluster to Public Preview. Since Milvus 2.4 has not reached a stable version, be cautious when you adopt Milvus 2.4 features in your production environment.

Sparse Vector

Sparse vectors differ from their dense counterparts in that they tend to have several orders of magnitude more dimensions, with only a handful being non-zero. This feature offers better interpretability due to its term-based nature and can be more effective in certain domains. Learned sparse models such as SPLADEv2/BGE-M3 have proven to be very useful for common first-stage ranking tasks. The main use case for this new feature is efficient approximate semantic nearest neighbor search over sparse vectors generated by neural models such as SPLADEv2/BGE-M3 and statistical models such as the BM25 algorithm. Zilliz Cloud now supports effective and high-performance storage, indexing, and searching (MIPS, Maximum Inner Product Search) of sparse vectors.

Example code can be found in hello_sparse.py.
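
As a rough illustration of what MIPS over sparse vectors computes, the following sketch stores each vector as a mapping from dimension index to non-zero weight and ranks a small corpus by inner product. It is a brute-force toy, not how Zilliz Cloud indexes sparse vectors; the names SparseVector, sparse_inner_product, and top_k_mips are hypothetical and introduced only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: dimension index -> non-zero weight.
SparseVector = Dict[int, float]


def sparse_inner_product(a: SparseVector, b: SparseVector) -> float:
    """Inner product that only touches dimensions present in both vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(weight * b[dim] for dim, weight in a.items() if dim in b)


def top_k_mips(
    query: SparseVector, corpus: Dict[str, SparseVector], k: int
) -> List[Tuple[str, float]]:
    """Brute-force maximum inner product search (no index, illustration only)."""
    scored = [
        (doc_id, sparse_inner_product(query, vec)) for doc_id, vec in corpus.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Dimension indices can be very large, but each vector stores only a few entries.
corpus = {
    "doc_a": {3: 0.5, 1024: 1.0},
    "doc_b": {3: 2.0, 77: 0.25},
    "doc_c": {9000: 4.0},
}
query = {3: 1.0, 1024: 2.0}
results = top_k_mips(query, corpus, k=2)
```

Because only shared non-zero dimensions contribute, the cost of one comparison depends on the number of stored entries rather than on the nominal dimensionality.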

Multi-vector support is the cornerstone for applications that require multi-model data processing or a mix of dense and sparse vectors. With multi-vector support, you can now:

  • Store vector embeddings generated for unstructured text, image, or audio samples from multiple models.

  • Conduct ANN searches that include multiple vectors of each entity.

  • Customize search strategies by assigning weights to different embedding models.

  • Experiment with various embedding models to find the optimal model combination.

Multi-vector support allows storing, indexing, and applying reranking strategies to multiple vector fields of different types, such as FLOAT_VECTOR and SPARSE_FLOAT_VECTOR, in a collection. Currently, two reranking strategies are available: Reciprocal Rank Fusion (RRF) and Average Weighted Scoring. Both strategies combine the search results from different vector fields into a unified result set. RRF considers the positions of items in the original rankings, giving higher importance to those that rank higher across multiple lists and prioritizing entities that consistently appear in different vector fields. Average Weighted Scoring assigns weights to the search results of each vector field to determine their importance in the final result set.

Example code can be found in hybrid_search.py.
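
To make the two reranking strategies concrete, here is a minimal sketch of both fusion rules in plain Python. It assumes the common RRF formulation, score = sum of 1 / (k + rank), with k = 60 as an illustrative constant, and it treats weighted scoring as a weighted sum of scores that are already on a comparable scale. The function names are hypothetical, and the service's actual normalization and defaults may differ.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per entity."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entity_id in enumerate(ranking, start=1):
            scores[entity_id] += 1.0 / (k + rank)
    return dict(scores)


def weighted_fuse(
    score_lists: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Weighted sum of per-field scores (assumes comparable score scales)."""
    if len(score_lists) != len(weights):
        raise ValueError("need exactly one weight per vector field")
    fused: Dict[str, float] = defaultdict(float)
    for field_scores, weight in zip(score_lists, weights):
        for entity_id, score in field_scores.items():
            fused[entity_id] += weight * score
    return dict(fused)


def ranked_ids(scores: Dict[str, float]) -> List[str]:
    """Entity IDs ordered from highest to lowest fused score."""
    return sorted(scores, key=scores.get, reverse=True)


# Result lists from a dense field and a sparse field (best hit first).
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["b", "d", "a"]
rrf_order = ranked_ids(rrf_fuse([dense_ranking, sparse_ranking]))

# Same idea with raw scores, weighting the dense field more heavily.
dense_scores = {"a": 0.9, "b": 0.8}
sparse_scores = {"b": 0.5, "d": 0.7}
weighted_scores = weighted_fuse([dense_scores, sparse_scores], [0.7, 0.3])
weighted_order = ranked_ids(weighted_scores)
```

In this toy data, entity "b" ends up first under both rules because it ranks well in both fields, which is the behavior the RRF description above refers to.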

Improved Metadata Filtering and Substring Matching

In this release, we made two key improvements to metadata filtering. First, we improved the performance for filtering scalar data types by introducing a new scalar inverted index. Second, we expanded our support for substring matching during metadata filtering.

In previous releases of Milvus, metadata filtering was implemented with memory-based binary search indexes and Marisa Trie indexes. These methods are memory-intensive. The latest release of Zilliz Cloud now employs a Tantivy-based inverted index, which can be applied to all numeric and string data types. This new index improves the performance of scalar queries on strings by 10x. It also consumes less memory by applying data compression and a memory-mapped storage (MMap) mechanism to the internal indexing structure. Example code can be found in inverted_index_example.py.
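
For readers unfamiliar with the idea, an inverted index maps each distinct scalar value to the rows that contain it, so a filter becomes a lookup instead of a scan. The sketch below is a toy in-memory version with a hypothetical class name; it says nothing about Tantivy's actual data structures, compression, or MMap usage.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set


class ToyInvertedIndex:
    """Maps each distinct scalar value to the set of row IDs that hold it."""

    def __init__(self) -> None:
        self._postings: Dict[Hashable, Set[int]] = defaultdict(set)

    def add(self, row_id: int, value: Hashable) -> None:
        """Record that the given row stores this value."""
        self._postings[value].add(row_id)

    def equals(self, value: Hashable) -> List[int]:
        """Row IDs whose field equals the value (a single dictionary lookup)."""
        return sorted(self._postings.get(value, set()))

    def is_in(self, values: Iterable[Hashable]) -> List[int]:
        """Row IDs whose field matches any of the values (union of postings)."""
        matched: Set[int] = set()
        for value in values:
            matched |= self._postings.get(value, set())
        return sorted(matched)


index = ToyInvertedIndex()
for row_id, color in enumerate(["red", "blue", "red", "green"]):
    index.add(row_id, color)

red_rows = index.equals("red")
red_or_green_rows = index.is_in(["red", "green"])
```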

This release also adds support for more flexible string matching, including prefix, infix, postfix, and wildcard patterns.
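
The four pattern shapes can be illustrated with a small matcher. This sketch assumes a SQL-style "%" wildcard that matches any run of characters; the exact filter syntax accepted by Zilliz Cloud is not shown in these notes, so treat the pattern strings and the helper names like_to_regex and like as assumptions made for the example.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a '%'-wildcard pattern into a compiled regular expression."""
    # Escape the literal pieces so characters such as '.' are not special.
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(parts), re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """True when the whole value matches the wildcard pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


name = "release-notes.md"
prefix_match = like(name, "release%")   # prefix: starts with "release"
infix_match = like(name, "%notes%")     # infix: contains "notes"
postfix_match = like(name, "%.md")      # postfix: ends with ".md"
wildcard_match = like(name, "rel%.md")  # wildcard in the middle
```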

You can now aggregate search results by the values in a specific scalar field. This is useful in RAG for retrieving document chunks as well as returning unique document IDs related to the search query. Consider a collection of documents where each document is split into several chunks, with each chunk represented by a vector embedding. You can use the group_by_field argument in the search() operation to group results by document ID, which lets you find the list of relevant documents while searching for semantically relevant chunks.

Example code can be found in example_group_by.py.
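
The effect of grouping can be sketched without a running cluster: given chunk-level hits, keep the best-scoring chunk per document and return the top documents. The Hit type and group_hits_by_doc function below are hypothetical stand-ins for illustration, and the sketch assumes that a higher score means a better match.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    chunk_id: int
    doc_id: str
    score: float  # assumption: higher is better


def group_hits_by_doc(hits: List[Hit], limit: int) -> List[Hit]:
    """Keep the best hit per doc_id and return at most `limit` groups."""
    if limit <= 0:
        return []
    best_per_doc: Dict[str, Hit] = {}
    # Visit hits from best to worst so the first hit seen per document wins.
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if hit.doc_id not in best_per_doc:
            best_per_doc[hit.doc_id] = hit
            if len(best_per_doc) == limit:
                break
    return list(best_per_doc.values())


hits = [
    Hit(chunk_id=1, doc_id="doc_1", score=0.95),
    Hit(chunk_id=2, doc_id="doc_1", score=0.93),
    Hit(chunk_id=3, doc_id="doc_2", score=0.90),
    Hit(chunk_id=4, doc_id="doc_3", score=0.72),
    Hit(chunk_id=5, doc_id="doc_2", score=0.70),
]
top_docs = group_hits_by_doc(hits, limit=2)
top_doc_ids = [hit.doc_id for hit in top_docs]
```

Without grouping, the top two hits would both come from doc_1; with grouping, the second slot goes to the best chunk of doc_2.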

Float16 and BFloat16 Vector DataType

Machine learning and neural networks often use half-precision data types, such as Float16 and BFloat16. These data types can improve query efficiency and reduce memory usage at the cost of reduced accuracy. With this release, Zilliz Cloud now supports these data types for vector fields.

Example code can be found in float16_example.py and bfloat16_example.py.
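
The accuracy trade-off can be seen by round-tripping a value through each 16-bit format. This sketch uses only the Python standard library. The helper names are hypothetical, and the bfloat16 conversion truncates the float32 encoding rather than rounding to nearest, which keeps the example short but is a simplification.

```python
import struct


def to_float16(value: float) -> float:
    """Round-trip through IEEE 754 half precision (5 exponent, 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_bfloat16(value: float) -> float:
    """Round-trip through bfloat16 (8 exponent, 7 mantissa bits).

    Keeps the top 16 bits of the float32 encoding, i.e. truncates the mantissa.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", value))[0]
    return struct.unpack("<f", struct.pack("<I", bits32 & 0xFFFF0000))[0]


original = 0.1
as_float16 = to_float16(original)
as_bfloat16 = to_bfloat16(original)

# Absolute error introduced by each format for this value.
float16_error = abs(as_float16 - original)
bfloat16_error = abs(as_bfloat16 - original)

# bfloat16 keeps float32's exponent range, so large magnitudes stay finite.
large_as_bfloat16 = to_bfloat16(70000.0)
```

Float16 spends more bits on the mantissa, so it is more precise for values like 0.1, while BFloat16 keeps the wider exponent range of float32 and can represent magnitudes beyond float16's maximum finite value of 65504.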

Multi-replica

Multi-replica is now available in Zilliz Cloud, enabling cluster-level replication to boost both query throughput and availability.

  • Improved Query Performance: For users requiring high query-per-second (QPS), multi-replica allows query workloads to be distributed across replicas. This parallel processing enhances overall throughput, reduces latency, and improves efficiency for query-intensive applications. In most cases, the overall QPS can be improved linearly as replicas are added.

  • Enhanced Availability: Multi-replica strengthens availability by distributing replicas across multiple Availability Zones (AZs). This setup ensures continuous access to data, even in the event of an AZ outage, offering greater reliability for mission-critical applications.

Currently, the multi-replica feature is in public preview and is available on the Enterprise Plan. To learn more, read Manage Replica.

Migration Service

Zilliz Cloud now offers a comprehensive Migration Service, allowing users to complete migration tasks with ease. Currently, three types of migrations are supported:

Backup/Restore/Migration/Jobs RESTful API

With this update, Zilliz Cloud has expanded the control plane API, introducing new functionalities that support backup, restore, migration, and job management.

These RESTful APIs allow users to build their own automated operational workflows, providing greater flexibility and control over their data management and maintenance processes.

Learn more about the API details.

Other Enhancements

This release also includes a series of enhancements: