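$text (Query Predicate Operator)

Note

MongoDB provides an improved full-text search solution, MongoDB Search, and a semantic search solution, MongoDB Vector Search. We recommend using the $search, $searchMeta, or $vectorSearch stages instead of the $text operator.

This page describes the $text operator for self-managed deployments.

Definition

$text: Performs a text query on fields indexed with a text index.

Compatibility

You can use $text for deployments hosted in the following environments:

MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud

MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB

Syntax

A $text expression has the following syntax: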

{
  $text: {
    $search: <string>,
    $language: <string>,
    $caseSensitive: <boolean>,
    $diacriticSensitive: <boolean>
  }
}

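The $text operator accepts the following fields:

Field	Type	Description
`$search`	string	A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical `OR` query on the terms unless you specify an exact string. For details, see Behavior.
`$language`	string	Optional. The language that determines the stop words and the rules for the stemmer and tokenizer. Defaults to the index language. For supported languages, see Text Query Languages on Self-Managed Deployments. If you specify a `default_language` value of `none`, the text index parses each word in the field, including stop words, and ignores suffix stemming.
`$caseSensitive`	boolean	Optional. Enables case sensitivity. Defaults to `false`. See Case Insensitivity.
`$diacriticSensitive`	boolean	Optional. Enables diacritic sensitivity for version 3 text indexes. Defaults to `false`. Earlier text index versions are always diacritic sensitive. See Diacritic Insensitivity.

By default, $text does not sort results by score. For details on sorting by score, see Text Score.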
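
As a minimal sketch of how the optional fields fit together, the following plain JavaScript builds a $text filter document and includes only the fields the caller sets. The helper name buildTextFilter and its option names are hypothetical and are not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $text filter document.
// Only $search is required; the optional fields are added when provided.
function buildTextFilter(search, options = {}) {
  const text = { $search: search };
  if (options.language !== undefined) text.$language = options.language;
  if (options.caseSensitive !== undefined) text.$caseSensitive = options.caseSensitive;
  if (options.diacriticSensitive !== undefined) {
    text.$diacriticSensitive = options.diacriticSensitive;
  }
  return { $text: text };
}

const filter = buildTextFilter("leche", { language: "es" });
console.log(JSON.stringify(filter));
// In mongosh, the resulting document can be passed as the filter to find().
```

Leaving out an optional field keeps the server defaults described in the table above.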

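Behavior

Restrictions

A query can specify only one $text expression.
$text cannot appear in $nor expressions.
$text cannot appear in $elemMatch query or projection expressions.
All clauses of an $or expression must be indexed to use $text.
If a query includes a $text expression, you cannot use hint() to specify the index to use for the query.
Queries that use $text cannot use $natural sort.
You cannot combine a $text expression, which requires a special text index, with a query operator that requires a different type of special index. For example, you cannot combine a $text expression with the $near operator.
Views do not support $text.
Stable API V1 does not support $text for index creation.

If you use the $text operator in an aggregation, the following restrictions also apply.

The $match stage that includes $text must be the first stage in the pipeline.
The $text operator can appear only once in the stage.
The $text operator expression cannot appear in $or or $not expressions.
By default, $text does not return matching documents in order of matching score. To sort by descending score, use the $meta aggregation expression in the $sort stage.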
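
The aggregation restrictions above can be sketched as a pipeline definition. The search term, the subject field, and the score field name are illustrative assumptions; the pipeline is built as a plain array so that it can be inspected before it is passed to aggregate() in mongosh.

```javascript
// Sketch of a pipeline that respects the aggregation restrictions.
// The search term and field names are illustrative.
const pipeline = [
  // The $match stage that contains $text comes first, and $text appears once.
  { $match: { $text: { $search: "cake" } } },
  // Sort by descending relevance with the $meta aggregation expression.
  { $sort: { score: { $meta: "textScore" } } },
  // Optionally expose the score in the output documents.
  { $project: { _id: 0, subject: 1, score: { $meta: "textScore" } } }
];

console.log(JSON.stringify(pipeline, null, 2));
// In mongosh: db.<collection>.aggregate(pipeline)
```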

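`$search` Field

In the $search field, specify the words that MongoDB uses to query the text index.

Note

The $search field is different from the MongoDB Atlas $search aggregation stage. The $search stage provides full-text search and is available only on MongoDB Atlas.

Exact Strings

To match an exact multi-word string instead of individual terms, enclose the string in escaped double quotes (\"), as in the following example: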

"\"ssl certificate\""

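If the $search string of a $text operation includes a multi-word string and individual terms, $text matches only the documents that include the multi-word string.

For example, this $search string returns documents that contain the exact string "ssl certificate":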

"\"ssl certificate\" authority key"

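Negation

Prefix a word with a hyphen-minus (-) to negate it:

A negated term excludes documents that contain the term from the result set.
A string that contains only negated terms matches no documents.
A hyphenated word such as pre-market is not a negation. MongoDB treats the hyphen as a delimiter. To negate market, use pre -market.

MongoDB applies all negations to the operation with a logical AND.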
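
The exact-string and negation rules above can be combined when assembling a $search string in application code. The following is a minimal sketch; the helper name buildSearch and its parameter names are hypothetical, and the helper does no escaping beyond adding the surrounding double quotes.

```javascript
// Hypothetical helper: assembles a $search string from exact strings,
// individual terms, and terms to exclude.
function buildSearch({ exact = [], terms = [], exclude = [] }) {
  if (exact.length + terms.length === 0) {
    // A string that contains only negated terms matches no documents.
    throw new Error("at least one exact string or non-negated term is required");
  }
  return [
    ...exact.map((s) => '"' + s + '"'), // wrap exact strings in double quotes
    ...terms,
    ...exclude.map((t) => "-" + t)      // prefix negated terms with a hyphen-minus
  ].join(" ");
}

console.log(buildSearch({ exact: ["ssl certificate"], terms: ["authority", "key"] }));
// "ssl certificate" authority key
console.log(buildSearch({ terms: ["baseball", "colorado"], exclude: ["sport"] }));
// baseball colorado -sport
```

Using single-quoted JavaScript literals for the quote characters avoids the backslash escapes that a double-quoted literal needs.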

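Match Operation

Stop Words

MongoDB ignores language-specific stop words, such as the and and in English.

Stemmed Words

With case and diacritic insensitivity, $text matches the complete stemmed word. If a document field contains blueberry, a $search term of blue does not match. However, blueberry or blueberries does match.

Case Sensitivity and Stemmed Words

With case sensitivity enabled ($caseSensitive: true), if the suffix stem contains uppercase letters, $text matches the exact word.

Diacritic Sensitivity and Stemmed Words

With diacritic sensitivity enabled ($diacriticSensitive: true), if the suffix stem contains diacritics, $text matches the exact word.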

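Case Insensitivity

$text defaults to the case insensitivity of the text index:

Version 3 text indexes are case insensitive for Latin characters with or without diacritics and for non-Latin alphabets such as Cyrillic.
Earlier versions are case insensitive for Latin characters without diacritics ([A-z]).

Enable Case Sensitivity

Specify $caseSensitive: true to enable case sensitivity when the text index is case insensitive.

Case-Sensitive Process

When $caseSensitive: true and the text index is case insensitive, $text:

Queries the text index for case-insensitive and diacritic-insensitive matches.
Filters the results to return only documents that match the specified case.

When $caseSensitive: true and the suffix stem contains uppercase letters, $text matches the exact word.

Enabling $caseSensitive: true may decrease performance.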
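
The two-step case-sensitive process can be pictured with a toy in-memory model. This is an illustration only, under simplifying assumptions: it splits on whitespace and ignores stemming, stop words, and diacritics, and it does not reflect how MongoDB implements text indexes. The sample documents and the function name are made up for the sketch.

```javascript
// Toy model of the two-step process; not MongoDB's implementation.
const sampleDocs = [
  { _id: 1, subject: "coffee" },
  { _id: 2, subject: "Coffee Shopping" },
  { _id: 7, subject: "coffee and cream" }
];

function caseSensitiveSearch(docs, term) {
  const words = (doc) => doc.subject.split(/\s+/);
  // Step 1: find candidates with a case-insensitive comparison.
  const candidates = docs.filter((doc) =>
    words(doc).some((w) => w.toLowerCase() === term.toLowerCase())
  );
  // Step 2: keep only the candidates that match the specified case.
  return candidates.filter((doc) => words(doc).includes(term));
}

console.log(caseSensitiveSearch(sampleDocs, "Coffee").map((d) => d._id)); // [ 2 ]
```

The second pass over the candidates is the extra work that can make a case-sensitive query slower than the default.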

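Diacritic Insensitivity

$text defaults to the diacritic insensitivity of the text index:

Version 3 text indexes are diacritic insensitive. The index does not distinguish between characters that carry diacritics and their unmarked counterparts (é, ê, e).
Earlier versions are diacritic sensitive.

Enable Diacritic Sensitivity

Specify $diacriticSensitive: true to enable diacritic sensitivity for version 3 text indexes.

Earlier text index versions are always diacritic sensitive, so $diacriticSensitive has no effect.

Diacritic-Sensitive Process

For version 3 text indexes with $diacriticSensitive: true, $text:

Queries the text index, which is diacritic insensitive.
Filters the results to return only documents that match the diacritics in the specified terms.

Enabling $diacriticSensitive: true may decrease performance.

For earlier text index versions, $diacriticSensitive: true queries a text index that is already diacritic sensitive.

When $diacriticSensitive: true and the suffix stem contains diacritics, $text matches the exact word.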
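
The diacritic-sensitive process for version 3 text indexes can be modeled in the same toy fashion. The sketch below is an assumption-laden illustration, not MongoDB's implementation: it folds diacritics with Unicode normalization, splits on whitespace, and ignores stemming and stop words. The sample documents and function names are made up.

```javascript
// Toy model of the diacritic-sensitive process; not MongoDB's implementation.
const sampleDocs = [
  { _id: 5, subject: "Café Con Leche" },
  { _id: 8, subject: "Cafe con Leche" }
];

// Decompose characters, strip combining marks, and lowercase.
const fold = (s) => s.normalize("NFD").replace(/\p{M}/gu, "").toLowerCase();
// Lowercase only, keeping diacritics (normalized so equal text compares equal).
const lower = (s) => s.normalize("NFC").toLowerCase();

function diacriticSensitiveSearch(docs, term) {
  const words = (doc) => doc.subject.split(/\s+/);
  // Step 1: find candidates with a diacritic-insensitive comparison.
  const candidates = docs.filter((doc) =>
    words(doc).some((w) => fold(w) === fold(term))
  );
  // Step 2: keep only the candidates whose diacritics match the term.
  return candidates.filter((doc) =>
    words(doc).some((w) => lower(w) === lower(term))
  );
}

console.log(diacriticSensitiveSearch(sampleDocs, "CAFÉ").map((d) => d._id)); // [ 5 ]
```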

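Tip

Stemmed Words

Text Score

The $text operator assigns a score to each result document. The score represents the relevance of the document to the given query. The score can be part of a sort() method specification or part of the projection expression. The { $meta: "textScore" } expression provides information about the processing of the $text operation. For details on accessing the score for projection or sort, see the $meta projection operator.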

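Memory Limits

Changed in version 8.3.

Starting in MongoDB 8.3, the query engine limits the memory usage of the TextOr stage to 100 MB. The TextOr stage processes $text queries that read text score metadata. For example, TextOr processes queries that sort results by text score. If the TextOr stage exceeds this limit:

If allowDiskUse is true, the stage spills intermediate results to disk.
If allowDiskUse is false, the query fails with a memory limit exceeded error.

In earlier versions, the TextOr stage has no memory limit and uses RAM without bound, which risks out-of-memory (OOM) errors.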
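
As a hedged sketch, a query that sorts by text score, and so reads text score metadata, can opt in to disk use through the aggregate() options document. The search terms and the score field name are illustrative. Whether a given query actually reaches the limit depends on the data.

```javascript
// Sketch: a score-sorted $text aggregation with allowDiskUse enabled.
// The search terms and field names are illustrative.
const pipeline = [
  { $match: { $text: { $search: "baseball colorado" } } },
  { $sort: { score: { $meta: "textScore" } } }
];
// With allowDiskUse set to true, a stage that exceeds its memory limit
// can spill intermediate results to disk instead of failing.
const options = { allowDiskUse: true };

console.log(JSON.stringify({ pipeline, options }));
// In mongosh: db.<collection>.aggregate(pipeline, options)
```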

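Examples

The examples on this page use data from the sample_mflix sample dataset. For details on how to load this dataset into your self-managed MongoDB deployment, see Load the Sample Dataset. If you made any modifications to the sample databases, you may need to drop and recreate the databases to run the examples on this page.

These examples assume a version 3 text index on the title and fullplot fields: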

db.movies.createIndex( { title: "text", fullplot: "text" } )

Search for a Single Word

This example specifies baseball in the $search string. The query returns documents that contain a stemmed version of baseball in the indexed title or fullplot field:

db.movies.find(
   { $text: { $search: "baseball" }, runtime: { $gt: 1000 } },
   { _id: 0, title: 1, year: 1, runtime: 1 }
)

[ { title: 'Baseball', year: 1994, runtime: 1140 } ]

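Match Any of the Search Terms

A space-delimited $search string performs a logical OR on each term. MongoDB returns documents that contain any of the terms.

This example specifies two space-delimited terms. The query returns documents that contain a stemmed version of baseball or colorado in the indexed title or fullplot field: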

db.movies.find(
   { $text: { $search: "baseball colorado" },
     runtime: { $gt: 1000 } },
   { _id: 0, title: 1, year: 1, runtime: 1, fullplot: 1 }
)

[
  {
    runtime: 1140,
    title: 'Baseball',
    fullplot: 'Ken Burns relates the history of baseball in a fashion similar to that of his Civil War mini series. Old-time photos and illustrations depict the games early years, while newsreels and video clips highlight more recent developments. Players and participants speak in their own words, and sports writers and broadcasters offer commentary on the sport and events they witnessed.',
    year: 1994
  },
  {
    runtime: 1256,
    title: 'Centennial',
    fullplot: 'This is the story of the evolution of the town Centennial, Colorado. It follows the paths of dozens of people who come to the area for many reasons: money, freedom, or crime. It also shows the bigoted treatment of the Native Indians by the advancing US colonists. It is topped off with a murder mystery that takes 100 years to solve.',
    year: 1978
  }
]

Search for an Exact String

Escape the quotes to match an exact multi-word string.

This example matches the exact phrase ken burns:

db.movies.find(
   { $text: { $search: "\"ken burns\"" },
     runtime: { $gt: 1000 } },
   { _id: 0, title: 1, year: 1, runtime: 1, fullplot: 1 }
)

[
  {
    runtime: 1140,
    title: 'Baseball',
    fullplot: 'Ken Burns relates the history of baseball in a fashion similar to that of his Civil War mini series. Old-time photos and illustrations depict the games early years, while newsreels and video clips highlight more recent developments. Players and participants speak in their own words, and sports writers and broadcasters offer commentary on the sport and events they witnessed.',
    year: 1994
  }
]

This example performs a logical OR on two exact strings:

db.movies.find(
   { $text: { $search: "\'ken burns\' \'centennial\'" },
     runtime: { $gt: 1000 } },
   { _id: 0, title: 1, year: 1, runtime: 1, fullplot: 1 }
)

[
  {
    runtime: 1140,
    title: 'Baseball',
    fullplot: 'Ken Burns relates the history of baseball in a fashion similar to that of his Civil War mini series. Old-time photos and illustrations depict the games early years, while newsreels and video clips highlight more recent developments. Players and participants speak in their own words, and sports writers and broadcasters offer commentary on the sport and events they witnessed.',
    year: 1994
  },
  {
    runtime: 1256,
    title: 'Centennial',
    fullplot: 'This is the story of the evolution of the town Centennial, Colorado. It follows the paths of dozens of people who come to the area for many reasons: money, freedom, or crime. It also shows the bigoted treatment of the Native Indians by the advancing US colonists. It is topped off with a murder mystery that takes 100 years to solve.',
    year: 1978
  }
]

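Exclude Documents That Contain a Term

Prefix a term with - to exclude documents that contain the term.

This example matches documents that contain baseball or colorado but not sport (stemmed versions):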

db.movies.find(
   { $text: { $search: "baseball colorado -sport" },
     runtime: { $gt: 1000 } },
   { _id: 0, title: 1, year: 1, runtime: 1 }
)

[ { title: 'Centennial', year: 1978, runtime: 1256 } ]

Query a Different Language

The remaining examples on this page use an articles collection with a version 3 text index on subject:

db.articles.createIndex( { subject: "text" } )

The collection contains the following documents:

db.articles.insertMany( [
   { _id: 1, subject: "coffee", author: "xyz", views: 50 },
   { _id: 2, subject: "Coffee Shopping", author: "efg", views: 5 },
   { _id: 3, subject: "Baking a cake", author: "abc", views: 90  },
   { _id: 4, subject: "baking", author: "xyz", views: 100 },
   { _id: 5, subject: "Café Con Leche", author: "abc", views: 200 },
   { _id: 6, subject: "Сырники", author: "jkl", views: 80 },
   { _id: 7, subject: "coffee and cream", author: "efg", views: 10 },
   { _id: 8, subject: "Cafe con Leche", author: "xyz", views: 10 }
] )

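Use $language to specify the language that determines the stop words and the stemmer and tokenizer rules for the $search string.

If you specify a default_language value of none, the text index parses each word in the field, including stop words, and ignores suffix stemming.

This example specifies es (Spanish) as the language: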

db.articles.find(
   { $text: { $search: "leche", $language: "es" } }
)

[
  { _id: 5, subject: 'Café Con Leche', author: 'abc', views: 200 },
  { _id: 8, subject: 'Cafe con Leche', author: 'xyz', views: 10 }
]

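You can also specify a language by name, such as spanish. For supported languages, see Text Query Languages on Self-Managed Deployments.

Sort by Relevance Score

You can specify the { $meta: "textScore" } expression in sort() without also specifying the expression in the projection. For example: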

db.articles.find(
   { $text: { $search: "cake" } }
).sort( { score: { $meta: "textScore" } } )

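As a result, you can sort the result documents by relevance without projecting the textScore.

If you include the { $meta: "textScore" } expression in both the projection and sort(), the projection and sort documents can use different field names for the expression. For example, in the following operation, the projection uses a field named score for the expression and sort() uses a field named ignoredName: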

db.articles.find(
   { $text: { $search: "cake" } },
   { score: { $meta: "textScore" } }
).sort( { ignoredName: { $meta: "textScore" } } )

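Case and Diacritic Insensitivity

$text defaults to the case and diacritic insensitivity of the text index. Version 3 text indexes are diacritic insensitive and case insensitive for Latin characters with diacritics and for non-Latin alphabets such as Cyrillic. See Text Index Case Insensitivity and Text Index Diacritic Insensitivity.

This example performs a case-insensitive and diacritic-insensitive query. With a version 3 text index, the query matches documents that contain stemmed versions of the search terms: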

db.articles.find( { $text: { $search: "сы́рники CAFÉS" } } )

[
  { _id: 6, subject: 'Сырники', author: 'jkl', views: 80 },
  { _id: 5, subject: 'Café Con Leche', author: 'abc', views: 200 },
  { _id: 8, subject: 'Cafe con Leche', author: 'xyz', views: 10 }
]

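Earlier text index versions do not match any documents.

Case Sensitivity

Use $caseSensitive: true to enable case sensitivity. This may decrease performance.

Case-Sensitive Search for a Term

This example performs a case-sensitive query for Coffee: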

db.articles.find(
   { $text: { $search: "Coffee", $caseSensitive: true } }
)

[ { _id: 2, subject: 'Coffee Shopping', author: 'efg', views: 5 } ]

Case-Sensitive Search for an Exact String

This example performs a case-sensitive query for an exact multi-word string:

db.articles.find( {
   $text: { $search: "\"Café Con Leche\"", $caseSensitive: true }
} )

[ { _id: 5, subject: 'Café Con Leche', author: 'abc', views: 200 } ]

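Case Sensitivity with Negated Terms

You can use case sensitivity with negated terms (terms prefixed with -).

This example performs a case-sensitive query for documents that contain Coffee but not shop (stemmed versions):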

db.articles.find(
   { $text: { $search: "Coffee -shop", $caseSensitive: true } }
)

[ { _id: 2, subject: 'Coffee Shopping', author: 'efg', views: 5 } ]

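Diacritic Sensitivity

Use $diacriticSensitive: true with a version 3 text index to enable diacritic sensitivity. This may decrease performance.

Diacritic-Sensitive Search for a Term

This example performs a diacritic-sensitive query for CAFÉ (stemmed version):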

db.articles.find(
   { $text: { $search: "CAFÉ", $diacriticSensitive: true } }
)

[ { _id: 5, subject: 'Café Con Leche', author: 'abc', views: 200 } ]

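Diacritic Sensitivity with Negated Terms

You can use diacritic sensitivity with negated terms (terms prefixed with -).

This example performs a diacritic-sensitive query for documents that contain leches but not cafés (stemmed versions):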

db.articles.find(
   { $text: { $search: "leches -cafés", $diacriticSensitive: true } }
)

[ { _id: 8, subject: 'Cafe con Leche', author: 'xyz', views: 10 } ]

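Tip

Text Indexes on Self-Managed Deployments
Stemmed Words
Exact Strings
Negation
Case Insensitivity
Case Sensitivity and Stemmed Words
Diacritic Insensitivity
Diacritic Sensitivity and Stemmed Words
$meta
$text in the Aggregation Pipeline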
