
Whitespace Analyzer

The whitespace analyzer divides text into searchable terms (tokens) wherever it finds a whitespace character. It leaves all text in its original letter case.
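The behavior described above can be approximated in a few lines of JavaScript. This is only a sketch for building intuition, not the analyzer's actual implementation: the function name `whitespaceTokens` is hypothetical, and JavaScript's `\s` class is assumed to be close enough to the whitespace definition the analyzer uses.

```javascript
// Approximation of whitespace tokenization: split on runs of whitespace and
// keep every piece exactly as written (no lowercasing, no punctuation removal).
function whitespaceTokens(text) {
  return text.split(/\s+/).filter((token) => token.length > 0);
}

console.log(whitespaceTokens("The Lion's Mouth Opens").join(" | "));
// prints: The | Lion's | Mouth | Opens
```

Note that the apostrophe and the capital letters survive untouched, which is the property the rest of this page relies on.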

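If you select Refine Your Index, the Atlas UI displays a section titled View text analysis of your selected index configuration within the Index Configurations section. If you expand this section, the Atlas UI displays the index and search tokens that the whitespace analyzer generates for each sample string. When you create or edit an index in the Atlas UI Visual Editor, you can see the tokens that the whitespace analyzer creates for a built-in sample document and query string.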

Important

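MongoDB Search doesn't index string fields for which an analyzer token exceeds 32766 bytes in size. If you use the keyword analyzer, string fields that exceed 32766 bytes are not indexed.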
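To get a rough idea of whether a value is at risk of hitting this limit, the check can be sketched as follows. The helper name `oversizedTokens` is hypothetical, tokens are approximated by splitting on whitespace, and the limit is assumed to be measured in UTF-8 encoded bytes.

```javascript
const MAX_TOKEN_BYTES = 32766; // limit stated in the note above

// Hypothetical helper: returns the whitespace-delimited tokens whose
// UTF-8 encoded size exceeds the limit (assumption: size means UTF-8 bytes).
function oversizedTokens(text, limit = MAX_TOKEN_BYTES) {
  const encoder = new TextEncoder();
  return text
    .split(/\s+/)
    .filter((token) => encoder.encode(token).length > limit);
}

const shortValue = "short tokens only";
const longValue = "x".repeat(40000) + " tail";
console.log(oversizedTokens(shortValue).length); // 0
console.log(oversizedTokens(longValue).length);  // 1
```

With the keyword analyzer the whole string is a single token, so the same comparison would apply to the full field value rather than to each piece.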

Example

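The following example index definition specifies an index on the title field in the sample_mflix.movies collection using the whitespace analyzer. To follow along with this example, load the sample data on your cluster, then either use mongosh or navigate to the Create a Search Index page in the Atlas UI by following the steps in the Create a MongoDB Search Index tutorial.

Then, using the movies collection as your data source, follow the example procedure to create an index from mongosh or from the Atlas UI Visual Editor or JSON Editor.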


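Visual Editor

Click Refine Your Index to configure your index.
In the Index Configurations section, toggle Dynamic Mapping to off.
In the Field Mappings section, click Add Field to open the Add Field Mapping window.
Select title from the Field Name dropdown.
Click Customized Configuration.
Click the Data Type dropdown and select String if it isn't already selected.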

Expand String Properties and make the following changes:

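Index Analyzer	Select `lucene.whitespace` from the dropdown.
Search Analyzer	Select `lucene.whitespace` from the dropdown.
Index Options	Use the default `offsets`.
Store	Use the default `true`.
Ignore Above	Keep the default setting.
Norms	Use the default `include`.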

Click Add.
Click Save Changes.
Click Create Search Index.

JSON Editor

Replace the default index definition with the following index definition.

{
  "mappings": {
    "fields": {
      "title": {
        "type": "string",
        "analyzer": "lucene.whitespace",
        "searchAnalyzer": "lucene.whitespace"
      }
    }
  }
}

Click Next.
Click Create Search Index.

mongosh

// Assumes the current database is sample_mflix (run: use sample_mflix).
db.movies.createSearchIndex(
  "default",
  {
    "mappings": {
      "fields": {
        "title": {
          "type": "string",
          "analyzer": "lucene.whitespace",
          "searchAnalyzer": "lucene.whitespace"
        }
      }
    }
  }
)
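If the same analyzer needs to be applied to more than one string field, the definition shown above could be generated rather than typed by hand. The following is a minimal sketch: `whitespaceIndexDefinition` is a hypothetical helper, not part of any MongoDB API, and it only reproduces the shape of the definition used in this example.

```javascript
// Hypothetical helper (not a MongoDB API): builds a static-mapping index
// definition that applies one analyzer to each listed string field,
// for both indexing and searching.
function whitespaceIndexDefinition(fieldNames, analyzer = "lucene.whitespace") {
  const fields = {};
  for (const name of fieldNames) {
    fields[name] = { type: "string", analyzer, searchAnalyzer: analyzer };
  }
  return { mappings: { fields } };
}

const definition = whitespaceIndexDefinition(["title"]);
console.log(JSON.stringify(definition, null, 2));
```

In mongosh, the resulting object could then be passed as the second argument, as in `db.movies.createSearchIndex("default", definition)`.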

The following query searches the title field for the term Lion's.

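Atlas UI

Click the Query button for your index.
Click Edit Query to edit your query.
Click on the query bar and select the database and collection.

Replace the default query with the following and click Find: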

[
  {
    "$search": {
      "text": {
        "query": "Lion's",
        "path": "title"
      }
    }
  }
]

SCORE: 3.7370920181274414  _id:  "573a13ebf29313caabdcfc8d"
   awards: Object
   cast: Array (4)
   countries: Array (1)
   directors: Array (1)
   fullplot: "A documentary on young actress, Marianna Palka, as she confronts her r…"
   genres: Array (3)
   imdb: Object
   languages: Array (1)
   lastupdated: "2015-09-03 00:37:45.227000000"
   num_mflix_comments: 0
   plot: "A documentary on young actress, Marianna Palka, as she confronts her r…"
   poster: "https://m.media-amazon.com/images/M/MV5BMTgzMTc2OTg2N15BMl5BanBnXkFtZT…"
   released: 2014-01-18T00:00:00.000+00:00
   runtime: 15
   title: "The Lion's Mouth Opens"
   type: "movie"
   writers: Array (1)
   year: 2014

mongosh

db.movies.aggregate([
  {
    "$search": {
      "text": {
         "query": "Lion's",
         "path": "title"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1
    }
  }
])

[ { title: "The Lion's Mouth Opens" } ]

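MongoDB Search returns these documents by doing the following for the text in the title field using the lucene.whitespace analyzer:

Leaves the text in its original letter case.
Splits the text into tokens wherever it finds a whitespace character.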

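The following table shows the tokens (searchable terms) that MongoDB Search creates for the document in the results using the whitespace analyzer and, for comparison, the simple analyzer and the keyword analyzer:

Title	Whitespace Analyzer Tokens	Simple Analyzer Tokens	Keyword Analyzer Tokens
`The Lion's Mouth Opens`	`The`, `Lion's`, `Mouth`, `Opens`	`the`, `lion`, `s`, `mouth`, `opens`	`The Lion's Mouth Opens`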
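The three columns of the table can be reproduced with a small JavaScript approximation. This is a sketch under stated assumptions, not the analyzers' real implementation: the simple analyzer is modeled as lowercasing plus splitting on any non-letter character, the keyword analyzer as returning the whole string as one token, and the `analyzers` object is a hypothetical name.

```javascript
// Rough models of the three analyzers compared in the table (assumptions):
// - whitespace: split on whitespace, keep case and punctuation
// - simple: lowercase, then split on every non-letter character
// - keyword: the entire string is a single token
const analyzers = {
  whitespace: (text) => text.split(/\s+/).filter(Boolean),
  simple: (text) => text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean),
  keyword: (text) => [text],
};

const title = "The Lion's Mouth Opens";
for (const [name, analyze] of Object.entries(analyzers)) {
  console.log(name + ": " + analyze(title).join(" | "));
}
// whitespace: The | Lion's | Mouth | Opens
// simple: the | lion | s | mouth | opens
// keyword: The Lion's Mouth Opens
```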

The index that uses the whitespace analyzer is case-sensitive. Therefore, MongoDB Search is able to match the query term Lion's to the token Lion's created by the whitespace analyzer.
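The effect of case sensitivity on matching can be illustrated with a simplified model. The sketch below assumes that the query string is analyzed with the same whitespace rules as the indexed text and that a document matches when any query token equals an indexed token exactly; it ignores scoring, and both function names are hypothetical.

```javascript
// Approximate whitespace tokenization (same assumption as for indexing).
function whitespaceTokens(text) {
  return text.split(/\s+/).filter((token) => token.length > 0);
}

// Simplified match model: true if any query token is identical,
// including letter case and punctuation, to an indexed token.
function matchesTitle(indexedTitle, query) {
  const indexed = new Set(whitespaceTokens(indexedTitle));
  return whitespaceTokens(query).some((token) => indexed.has(token));
}

const movieTitle = "The Lion's Mouth Opens";
console.log(matchesTitle(movieTitle, "Lion's")); // true
console.log(matchesTitle(movieTitle, "lion's")); // false (different case)
console.log(matchesTitle(movieTitle, "Lion"));   // false (token is Lion's)
```

Under this model, a lowercase query such as lion's would need an analyzer that normalizes case in order to match.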

