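You can use the MongoDB Search autocomplete type to index text values in string fields for autocompletion. You can query fields indexed as the autocomplete type by using the autocomplete operator.
You can also use the autocomplete type to index:
Fields whose value is an array of strings. To learn more, see How to Index the Elements of an Array.
String fields inside an array of documents indexed as the embeddedDocuments type.
For index build time considerations, see Index Build Time.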
For dynamic mapping considerations, see Dynamic Mappings.
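As a rough illustration of the query side, the following Python sketch builds a $search stage that uses the autocomplete operator. The index name default, the field name title, and the helper autocomplete_stage are assumptions made for this example only. The stage is a plain dictionary, so a driver could pass it to an aggregation pipeline unchanged.

```python
from typing import Any


def autocomplete_stage(query: str, path: str, index: str = "default") -> dict[str, Any]:
    """Build a $search stage that runs the autocomplete operator.

    `index` and `path` are placeholders; `path` must name a field that is
    indexed as the autocomplete type.
    """
    return {
        "$search": {
            "index": index,
            "autocomplete": {"query": query, "path": path},
        }
    }


# Hypothetical usage: suggest titles for the partial input "off".
pipeline = [
    autocomplete_stage("off", "title"),
    {"$limit": 5},
    {"$project": {"_id": 0, "title": 1}},
]
```

The $limit and $project stages only keep the suggestion list short; they are not required by the autocomplete operator.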
Define the Index for the autocomplete Type
Configure autocomplete Field Properties
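The MongoDB Search autocomplete type takes the following parameters:
| Option | Type | Necessity | Description | Default |
|---|---|---|---|---|
| type | string | required | Human-readable label that identifies this field type. Value must be autocomplete. | |
| | string | optional | | |
| maxGrams | int | optional | Maximum number of characters per indexed sequence. The value limits the character length of indexed tokens. For configuration considerations, see maxGrams Configuration. | |
| minGrams | int | optional | Minimum number of characters per indexed sequence. | |
| tokenization | enum | optional | Tokenization strategy to use when indexing the field for autocompletion. Value can be one of the following: edgeGram, rightEdgeGram, nGram. For performance considerations, see Tokenization Performance. | |
| | boolean | optional | Flag that indicates whether to perform normalizations, such as including or removing diacritics from the indexed text. Value can be true or false. | |
| | string | optional | Name of the similarity algorithm to use with string mappings when scoring with the autocomplete operator. To learn more about the available similarity algorithms, see Score Details. | |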
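To make the parameters more concrete, here is a minimal sketch of an index definition that maps one field as the autocomplete type. The field name title and the chosen minGrams and maxGrams values are assumptions for illustration, not recommended settings. The definition is written as a Python dictionary and printed as JSON.

```python
import json

# Hypothetical index definition: the field name "title" and the option
# values below are assumptions chosen for illustration.
index_definition = {
    "mappings": {
        "dynamic": False,  # static mapping: only the listed fields are indexed
        "fields": {
            "title": {
                "type": "autocomplete",
                "tokenization": "edgeGram",
                "minGrams": 3,
                "maxGrams": 15,
            }
        },
    }
}

print(json.dumps(index_definition, indent=2))
```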
Try an Example for the autocomplete Type
Considerations
maxGrams Configuration
The maxGrams option specifies the maximum length of substrings generated during indexing. Increasing maxGrams improves matching for longer queries by generating more substrings. Setting it beyond what you need can increase index size and affect indexing performance.
Consider the following best practices when you configure maxGrams:
Default to no more than 15. Set maxGrams to no more than 15 when possible to avoid unnecessary index growth.
Align with query length. Set maxGrams based on the typical length of user queries, rather than indexing for worst-case scenarios.
Avoid over-indexing. If your queries are shorter than your current maxGrams value, you may be indexing more data than necessary.
Use an alternative for longer queries. If your queries regularly exceed 15 characters, use a custom analyzer for prefix, contains, and suffix patterns.
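One way to apply these practices is to derive maxGrams from observed query lengths instead of guessing. The sketch below is only an illustration: the helper suggest_max_grams, the 95th-percentile choice, and the sample data are all assumptions, not part of MongoDB Search. It picks a typical query length and caps it at 15.

```python
import math


def suggest_max_grams(query_lengths: list[int], percentile: float = 0.95, cap: int = 15) -> int:
    """Suggest a maxGrams value from observed query lengths.

    Picks the length at the given percentile (nearest-rank method) so that
    rare, very long queries do not drive the setting, then caps the result.
    """
    if not query_lengths:
        raise ValueError("query_lengths must not be empty")
    ordered = sorted(query_lengths)
    rank = max(1, math.ceil(percentile * len(ordered)))  # nearest-rank, 1-based
    return min(ordered[rank - 1], cap)


# Hypothetical sample of query lengths collected from application logs.
observed = [3, 4, 4, 5, 6, 6, 7, 8, 9, 30]
print(suggest_max_grams(observed))
```

If the suggested value regularly hits the cap, that is a hint that the queries are long enough to warrant the custom analyzer alternative mentioned above.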
Tokenization Performance
Indexing a field for autocomplete with an edgeGram, rightEdgeGram, or nGram tokenization strategy requires more computation and index storage than indexing a string field.
For the specified tokenization strategy, MongoDB Search concatenates sequential tokens before emitting them ("shingling"). MongoDB Search emits tokens between minGrams and maxGrams characters in length:
Keeps tokens shorter than minGrams.
Joins tokens longer than minGrams but shorter than maxGrams to the next tokens to create tokens up to the specified maximum number of characters in length.
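To see why the tokenization strategy affects index size, the following simplified sketch generates grams for a single word under each strategy. It is an approximation for illustration only: it ignores the analyzer and the shingling step described above, and the function names are assumptions, not MongoDB Search APIs.

```python
def edge_grams(word: str, min_grams: int, max_grams: int) -> list[str]:
    """Prefixes of `word` whose length is between min_grams and max_grams."""
    top = min(max_grams, len(word))
    return [word[:n] for n in range(min_grams, top + 1)]


def right_edge_grams(word: str, min_grams: int, max_grams: int) -> list[str]:
    """Suffixes of `word` whose length is between min_grams and max_grams."""
    top = min(max_grams, len(word))
    return [word[-n:] for n in range(min_grams, top + 1)]


def n_grams(word: str, min_grams: int, max_grams: int) -> list[str]:
    """Every substring of `word` whose length is between min_grams and max_grams."""
    top = min(max_grams, len(word))
    return [
        word[start:start + n]
        for n in range(min_grams, top + 1)
        for start in range(len(word) - n + 1)
    ]


# Compare how many grams each strategy produces for the same word.
word = "search"
for name, fn in [("edgeGram", edge_grams), ("rightEdgeGram", right_edge_grams), ("nGram", n_grams)]:
    grams = fn(word, 2, 4)
    print(f"{name}: {len(grams)} grams -> {grams}")
```

In this simplified model, the sliding window produces several times as many grams as either edge strategy for the same word, and the gap widens as maxGrams grows.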
Dynamic Mappings
The default field types that MongoDB Search uses for dynamic mappings do not include the autocomplete type. Using the autocomplete type in dynamic mappings can increase index size and resource usage, and produce unexpected scoring results. Use autocomplete in static mappings.
However, if you need to include autocomplete in dynamic mappings, you can add it to a custom typeSet definition. To learn more about autocomplete and custom typeSet configurations, see Index Size and Configuration.
Index Build Time
If your dataset has many documents or a wide data range, building this index for the autocomplete operator can take some time. To reduce the impact on other indexes and queries while the new index builds, create a separate index with only the autocomplete type.
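The separation described above can be sketched as two index definitions, one for general search and one that contains only the autocomplete mapping. The index names, the field names plot and title, and the helper field_types are hypothetical and used only for illustration.

```python
# Hypothetical field and index names, used only for illustration.
general_index = {
    "name": "default",
    "definition": {
        "mappings": {
            "dynamic": False,
            "fields": {"plot": {"type": "string"}},
        }
    },
}

# A separate index that maps only the autocomplete field.
autocomplete_index = {
    "name": "title_autocomplete",
    "definition": {
        "mappings": {
            "dynamic": False,
            "fields": {"title": {"type": "autocomplete"}},
        }
    },
}


def field_types(index: dict) -> set[str]:
    """Collect the field types that an index definition maps statically."""
    fields = index["definition"]["mappings"]["fields"]
    return {spec["type"] for spec in fields.values()}


print(field_types(autocomplete_index))
```

With a driver such as PyMongo, each dictionary could then be supplied to a helper like Collection.create_search_index. Check the driver documentation for the exact form it expects.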
For index performance considerations, see Index Performance Considerations.
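Learn More
To learn more about the autocomplete operator and to see example queries, see autocomplete.
For examples that show how to run the case-insensitive, prefix, starts-with, and contains queries commonly written with regular expressions, see Use MongoDB Search Instead of Regex Queries.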