Analyzer
In text processing, an analyzer converts raw text into a structured, searchable format. Each analyzer typically consists of two core elements: a tokenizer and a filter. Together, they split input text into tokens, refine those tokens, and prepare them for efficient indexing and retrieval. This chapter describes how to use analyzers in Zilliz Cloud.
Overview
In text processing, an analyzer is a crucial component that converts raw text into a structured, searchable format. Each analyzer typically consists of two core elements: a tokenizer and a filter. Together, they transform input text into tokens, refine these tokens, and prepare them for efficient indexing and retrieval.
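The tokenizer-then-filter flow described above can be illustrated with a minimal sketch in plain Python. This is a toy model of the concept only, not the Zilliz Cloud implementation or API; the function names (`tokenize`, `lowercase_filter`, `stop_filter`, `analyze`) and the stop-word list are assumptions made for illustration.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = frozenset({"a", "an", "the", "of"})

def tokenize(text):
    """Tokenizer step: split raw text into word tokens."""
    return re.findall(r"\w+", text)

def lowercase_filter(tokens):
    """Filter step: normalize every token to lowercase."""
    return [t.lower() for t in tokens]

def stop_filter(tokens):
    """Filter step: drop tokens that carry little search value."""
    return [t for t in tokens if t not in STOP_WORDS]

def analyze(text):
    """Run the tokenizer first, then each filter in order."""
    tokens = tokenize(text)
    tokens = lowercase_filter(tokens)
    return stop_filter(tokens)

print(analyze("The Speed of Light"))  # ['speed', 'light']
```

The order matters in this sketch: lowercasing runs before stop-word removal so that "The" is matched against the lowercase stop-word list.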
Built-in Analyzer
This section provides detailed information about built-in analyzers.
Tokenizer
This section provides a detailed reference for tokenizers.
Filter
This section provides a detailed reference for filters in analyzers.
Multi-language Analyzers
When Zilliz Cloud performs text analysis, it typically applies a single analyzer across an entire text field in a collection. If that analyzer is optimized for English, it struggles with the very different tokenization and stemming rules required by other languages, such as Chinese, Spanish, or French, resulting in a lower recall rate. For instance, a search for the Spanish word "teléfono" (meaning "phone") would trip up an English‑focused analyzer: it may drop the accent and apply no Spanish‑specific stemming, causing relevant results to be overlooked.
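To make the "teléfono" example concrete, here is a small plain-Python sketch of how an English-oriented pipeline and a Spanish-oriented pipeline might treat the same word, and how an analyzer could be chosen per language. Everything here is a simplified assumption for illustration: `english_like`, `spanish_like`, `analyze_by_language`, and the one-rule plural stripper are hypothetical and do not reflect the actual Zilliz Cloud analyzers or their configuration.

```python
import re
import unicodedata

def _tokens(text):
    """Lowercase and split into word tokens (NFC keeps accented letters whole)."""
    return re.findall(r"\w+", unicodedata.normalize("NFC", text).lower())

def ascii_fold(token):
    """Drop accents by decomposing characters and removing combining marks."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def english_like(text):
    """Toy English-focused analyzer: folds accents, no Spanish stemming."""
    return [ascii_fold(t) for t in _tokens(text)]

def spanish_like(text):
    """Toy Spanish-focused analyzer: keeps accents, strips a plural 's'."""
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in _tokens(text)]

# Hypothetical mapping from a language code to the analyzer to apply.
ANALYZERS = {"en": english_like, "es": spanish_like}

def analyze_by_language(text, lang, default="en"):
    """Pick the analyzer for the given language, falling back to a default."""
    return ANALYZERS.get(lang, ANALYZERS[default])(text)

print(analyze_by_language("Teléfonos", "en"))  # ['telefonos']
print(analyze_by_language("Teléfonos", "es"))  # ['teléfono']
```

In this sketch, the English-like pipeline produces a token that matches neither "teléfono" nor its singular form, which is the kind of mismatch that lowers recall when one analyzer is applied to every language.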
Best Practice
This guide helps you select and configure the most suitable analyzer for your text content in Zilliz Cloud.