Release Notes (June 18, 2024)

In this release, Zilliz Cloud has unveiled an array of new features backed by Milvus 2.4, including sparse vector support, enhanced multi-vector and hybrid search, an inverted index and fuzzy matching for faster queries, and grouping search for document-level recall. It also introduces Float16 and BFloat16 data types for improved search efficiency. In addition, the Pipelines feature now tracks token usage stats with every data ingestion and search request, streamlining performance monitoring. You can find detailed stats on the invoice page.

Milvus Compatibility

This release is compatible with Milvus 2.3.x.

If you upgrade your clusters to BETA, the Milvus 2.4.x features become available after the upgrade.

Milvus 2.4.x New Features Available on Zilliz Cloud

Milvus 2.4 offers many efficient capabilities for RAG and multimodal data search. To try these new features, upgrade your cluster to BETA.

📘Notes

Milvus 2.4 has not reached a stable version. Be cautious when you adopt Milvus 2.4 features in your production environment.

Sparse Vector

Sparse vectors differ from their dense counterparts in that they tend to have several orders of magnitude more dimensions, with only a handful of them being non-zero. Sparse vectors offer better interpretability due to their term-based nature and can be more effective in certain domains. Learned sparse models such as SPLADEv2 and BGE-M3 have proven useful for common first-stage ranking tasks. The main use case for this new feature is efficient approximate semantic nearest neighbor search over sparse vectors generated by neural models such as SPLADEv2 and BGE-M3 and by statistical models such as the BM25 algorithm. Zilliz Cloud now supports effective, high-performance storage, indexing, and searching (MIPS, Maximum Inner Product Search) of sparse vectors.

For details, check out the Sparse Vector guide and the example code in hello_sparse.py. Be sure to update the connection details in the example code with your Zilliz Cloud cluster credentials.
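To make the MIPS idea concrete, the following is a minimal local sketch, not the indexing that Zilliz Cloud performs. It assumes a sparse vector is written as a dictionary from dimension index to non-zero weight, a representation chosen here for illustration, and it runs an exact brute-force search rather than an approximate one. The corpus, the query, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

SparseVector = Dict[int, float]  # dimension index -> non-zero weight


def inner_product(a: SparseVector, b: SparseVector) -> float:
    """Inner product that only visits dimensions present in both vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(weight * b[dim] for dim, weight in a.items() if dim in b)


def mips(query: SparseVector, corpus: List[SparseVector], top_k: int = 2) -> List[Tuple[int, float]]:
    """Brute-force maximum inner product search; returns (corpus index, score) pairs."""
    scored = [(i, inner_product(query, doc)) for i, doc in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical term-weight vectors over a vocabulary with tens of thousands of dimensions.
corpus = [
    {12: 0.8, 4051: 0.3},
    {12: 0.1, 907: 0.9, 29999: 0.4},
    {4051: 0.7, 29999: 0.2},
]
query = {12: 1.0, 29999: 0.5}

results = mips(query, corpus)
```

Because only shared non-zero dimensions contribute to the score, the cost of each comparison depends on the number of non-zero entries rather than on the nominal dimensionality.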

Multi-Vector and Hybrid Search

Multi-vector support is the cornerstone for applications that require multi-model data processing or a mix of dense and sparse vectors. With multi-vector support, you can now:

  • Store vector embeddings generated for unstructured text, image, or audio samples from multiple models.

  • Conduct ANN searches against a collection that carries multiple vectors in each entity.

  • Customize search strategies by assigning weights to different embedding models.

  • Experiment with various embedding models to find the optimal model combination.

Multi-vector support allows storing, indexing, and applying reranking strategies to multiple vector fields of different types, such as FLOAT_VECTOR and SPARSE_FLOAT_VECTOR, in a collection. Currently, two reranking strategies are available: Reciprocal Rank Fusion (RRF) and Average Weighted Scoring. Both strategies combine the search results from different vector fields into a unified result set. The first strategy prioritizes the entities that consistently appear in the search results from various vector fields, and the other strategy assigns weights to the search results from each vector field to determine their importance in the final result set.

For details, check out the Basic ANN Search and Hybrid Search guides and the example code in hybrid_search.py. Be sure to update the connection details in the example code with your Zilliz Cloud cluster credentials.
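The difference between the two reranking strategies can be sketched locally in a few lines. This is an illustration under stated assumptions, not the server-side implementation: the RRF constant `k = 60` is a commonly used default assumed here, the weighted variant is a plain weighted sum that assumes the scores from each vector field are already normalized to a comparable range, and the entity IDs, scores, and weights are made up.

```python
from collections import defaultdict
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank), with rank starting at 1."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entity_id in enumerate(ranking, start=1):
            scores[entity_id] += 1.0 / (k + rank)
    return dict(scores)


def weighted(score_lists: List[Dict[str, float]], weights: List[float]) -> Dict[str, float]:
    """Weighted scoring: sum of weight * score over the vector fields that returned the entity."""
    scores: Dict[str, float] = defaultdict(float)
    for field_scores, weight in zip(score_lists, weights):
        for entity_id, score in field_scores.items():
            scores[entity_id] += weight * score
    return dict(scores)


def top(scores: Dict[str, float], n: int = 3) -> List[str]:
    """Return the n entity IDs with the highest fused score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical per-field results, listed from best to worst.
dense_hits = {"a": 0.9, "b": 0.8, "c": 0.3}
sparse_hits = {"b": 0.7, "d": 0.6, "a": 0.2}

rrf_order = top(rrf([list(dense_hits), list(sparse_hits)]))
weighted_order = top(weighted([dense_hits, sparse_hits], weights=[0.7, 0.3]))
```

In this toy data, RRF looks only at rank positions, so entity `d` (second in the sparse results) outranks `c`, while the weighted sum favors `c` because the dense field carries more weight.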

Inverted Index and Fuzzy Match

In previous releases of Milvus, memory-based binary search indexes and Marisa Trie indexes were used for scalar field indexing. However, these methods were memory-intensive. The latest release of Zilliz Cloud now employs AUTOINDEX to optimize these mechanisms, and it can be applied to all numeric and string data types. This new index dramatically improves scalar query performance, making keyword queries on strings about ten times faster. In addition, the inverted index consumes less memory, thanks to additional optimizations in data compression and the memory-mapped storage (MMap) mechanism of the internal indexing structure.
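For readers unfamiliar with the data structure, here is a toy inverted index in plain Python. It only illustrates the general idea of mapping terms to the rows that contain them; it says nothing about how the index is actually built, compressed, or memory-mapped in Zilliz Cloud, and the rows are invented.

```python
from collections import defaultdict
from typing import Dict, Set


def build_inverted_index(rows: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase term to the set of row IDs whose text contains it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for row_id, text in rows.items():
        for term in text.lower().split():
            index[term].add(row_id)
    return index


rows = {
    1: "vector database release notes",
    2: "sparse vector search",
    3: "billing and invoice page",
}
index = build_inverted_index(rows)

# A keyword lookup is a single dictionary access instead of a scan over every row.
vector_rows = sorted(index["vector"])
# Intersecting the posting sets answers a two-keyword query.
both = sorted(index["vector"] & index["search"])
```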

This release also supports fuzzy matches in scalar filtering using prefixes, infixes, and suffixes.

For details, check out the Binary Vector, Index Scalar Fields, and Use the like Operator guides as well as the example code in inverted_index_example.py and fuzzy_match.py. Be sure to update the connection details in the example code with your Zilliz Cloud cluster credentials and replace the index type used in the examples with AUTOINDEX.
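The three fuzzy-match forms correspond to where the `%` wildcard sits in a `like` pattern, for example a filter such as `title like "notes%"` (the field name `title` is hypothetical). The sketch below emulates those semantics locally with regular expressions so the behavior can be checked without a cluster. It is an assumption-laden illustration that handles only the `%` wildcard, not the matcher used by the service.

```python
import re
from typing import List


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE-style pattern, where % matches any run of characters, into a regex."""
    parts = (re.escape(piece) for piece in pattern.split("%"))
    return re.compile("^" + ".*".join(parts) + "$", re.DOTALL)


def like(values: List[str], pattern: str) -> List[str]:
    """Return the values that match the LIKE-style pattern."""
    regex = like_to_regex(pattern)
    return [v for v in values if regex.match(v)]


titles = ["release notes", "notes on sparse vectors", "hybrid search notes", "pipelines"]

prefix = like(titles, "notes%")   # starts with "notes"
infix = like(titles, "%notes%")   # contains "notes"
suffix = like(titles, "%notes")   # ends with "notes"
```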

Grouping Search

You can now aggregate the search results by the values in a specific scalar field. This helps RAG applications implement document-level recall. Consider a collection of documents, each of which is split into several passages. Each passage is represented by one vector embedding and belongs to one document. To find the most relevant documents instead of scattered passages, you can include the group_by_field argument in the search() operation to group results by the document ID.

For details, check out the Grouping Search guide and the example code in example_group_by.py. Be sure to update the connection details in the example code with your Zilliz Cloud cluster credentials.
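The effect of grouping can be pictured with a small post-processing sketch. In practice the grouping happens on the server when group_by_field is passed to search(); the code below only emulates the outcome on a hypothetical list of passage hits, assuming that a higher score means a better match and that each group is represented by its best passage.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, str, float]  # (passage ID, document ID, similarity score)


def group_by_document(hits: List[Hit], limit: int) -> List[Hit]:
    """Keep the best-scoring passage per document and return the top `limit` documents."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        _, doc_id, score = hit
        if doc_id not in best or score > best[doc_id][2]:
            best[doc_id] = hit
    return sorted(best.values(), key=lambda h: h[2], reverse=True)[:limit]


# Hypothetical passage-level results, already sorted by score.
hits = [
    ("p1", "doc_a", 0.95),
    ("p2", "doc_a", 0.93),
    ("p3", "doc_b", 0.90),
    ("p4", "doc_a", 0.88),
    ("p5", "doc_c", 0.70),
]

plain_top3 = [h[1] for h in hits[:3]]
grouped_top3 = [h[1] for h in group_by_document(hits, limit=3)]
```

Without grouping, two of the top three hits come from the same document; with grouping, the three results cover three distinct documents.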

Float16 and BFloat16 Vector DataType

Machine learning and neural networks often use half-precision data types, such as Float16 and BFloat16. While these data types can improve query efficiency and reduce memory usage, they come with a tradeoff of reduced accuracy. With this release, Zilliz Cloud now supports these data types for vector fields.

For details, refer to the Search & Rerank guide and the example code in float16_example.py and bfloat16_example.py. Be sure to update the connection details in the example code with your Zilliz Cloud cluster credentials.
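The accuracy tradeoff can be seen with the Python standard library alone. The sketch below round-trips a few made-up values through Float16 (via the `struct` format code `e`) and through BFloat16, which is emulated here by truncating a 32-bit float to its upper 16 bits. Truncation is a simplifying assumption; real converters typically round to nearest.

```python
import struct


def to_float16(x: float) -> float:
    """Round-trip through IEEE 754 half precision (1 sign, 5 exponent, 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bfloat16(x: float) -> float:
    """Round-trip through bfloat16 (1 sign, 8 exponent, 7 mantissa bits)
    by truncating a float32 to its upper 16 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


vector = [0.1, -0.25, 3.14159, 1234.5678]

f16 = [to_float16(v) for v in vector]
bf16 = [to_bfloat16(v) for v in vector]

# Largest absolute deviation from the original values for each format.
f16_error = max(abs(a - b) for a, b in zip(vector, f16))
bf16_error = max(abs(a - b) for a, b in zip(vector, bf16))

bytes_per_value_f32 = struct.calcsize("<f")  # 4
bytes_per_value_f16 = struct.calcsize("<e")  # 2
```

Both 16-bit formats halve the storage per dimension compared with 32-bit floats. Float16 keeps more mantissa bits and is therefore more precise within its range, while BFloat16 keeps the 8-bit exponent of a 32-bit float and so covers a much wider range at lower precision.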

Refined MilvusClient Interfaces

MilvusClient is an easy-to-use alternative to the ORM module. It adopts a purely functional approach to simplify interactions with the server. Instead of maintaining a connection pool, each MilvusClient establishes a gRPC connection to the server. The MilvusClient module has implemented most of the functionalities of the ORM module. To learn more about the MilvusClient module, visit pymilvus and the reference documents.

Pipelines

Zilliz Cloud now monitors token usage for your pipeline requests, with details available on the invoice page and within each API response. However, you will not be charged until this feature is generally available.

The image embedding model has been upgraded to clip-vit-base-patch32 from the previous clip-vit-base-patch16 to fulfill a wider range of requirements. Additionally, support for multilingual text embeddings is scheduled for implementation shortly.

Enhancements

This release also includes a series of enhancements:

  • Now you can scale your dedicated cluster to 256 CUs in a self-service manner. You can also contact us for an even larger cluster.