Update $text Queries to Use Atlas Search for Improved Search Performance

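If your queries rely heavily on the $text query operator, you can rewrite them to use the $search aggregation pipeline stage instead, which improves both their flexibility and their performance.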

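The $search aggregation stage provides the following features, which are either unavailable through the $text operator, available but less performant, or available only after significant implementation work by the user: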

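The examples in the following sections use queries against the sample_mflix.movies collection in the sample data to illustrate how Atlas Search improves on $text in flexibility and performance. You can run the queries in both examples using the following indexes: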

Text index:

db.movies.createIndex(
  {
    genres: "text",
    plot: "text",
    year: -1
  }
)

Atlas Search index:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "genres": {
        "type": "string"
      },
      "plot": {
        "type": "string"
      },
      "year": {
        "type": "number"
      }
    }
  }
}

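Either index definition indexes the genres and plot fields as text and the year field as a number. For instructions on creating a $text index, see Create a Text Index. For instructions on creating an Atlas Search index, see Create an Atlas Search Index.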

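You can update $text-based queries to use $search for greater flexibility and convenience. In this example, you query the sample_mflix.movies collection in the sample data to retrieve entries whose plot field contains the word "poet", sorted by year in ascending order.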

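The index definitions listed in the previous section illustrate one way in which $search is more flexible: to create the $text index on sample_mflix.movies, you must first drop any existing text index on the sample data, because MongoDB supports only one text index per collection.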
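The drop-then-create step described above can be sketched as a small script. This is only an illustration under stated assumptions: it assumes index descriptors shaped like the output of db.collection.getIndexes(), where a text index carries `_fts: "text"` in its key. The helper names findTextIndexNames and recreateTextIndex and the sample descriptors are hypothetical, not part of MongoDB.

```javascript
// Hypothetical helper: return the names of text indexes among index
// descriptors shaped like the output of db.collection.getIndexes().
// Assumption: a text index is reported with `_fts: "text"` in its key.
function findTextIndexNames(indexes) {
  return indexes
    .filter((idx) => idx.key && idx.key._fts === "text")
    .map((idx) => idx.name);
}

// Hypothetical helper: drop any existing text index, then create the
// text index used in this page's examples. `collection` is expected to
// be a mongosh collection object such as db.movies.
// Caution: this drops indexes, so it is defined here but never called.
function recreateTextIndex(collection) {
  for (const name of findTextIndexNames(collection.getIndexes())) {
    collection.dropIndex(name);
  }
  return collection.createIndex({ genres: "text", plot: "text", year: -1 });
}

// Illustrative descriptors, not real output from sample_mflix.movies.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { _fts: "text", _ftsx: 1 }, name: "existing_text" },
];

const textIndexNames = findTextIndexNames(sampleIndexes);
console.log(textIndexNames);
```

In mongosh, calling recreateTextIndex(db.movies) would apply the same logic to the real collection. Review which indexes it would drop before running anything like this against data you care about.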

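In contrast, you can create multiple Atlas Search indexes on a single collection, which lets your application run different kinds of full-text queries side by side.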
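As a rough sketch of what several Atlas Search indexes on one collection might look like, the following builds two static index definitions in the same shape as the one shown earlier. The helper names buildStaticMappings and createSearchIndexes and the index names "plot_only" and "genres_year" are made up for illustration. The creation step assumes a mongosh and cluster version that provide db.collection.createSearchIndex(name, definition).

```javascript
// Hypothetical helper: build a static Atlas Search index definition
// from a map of field name -> Atlas Search field type.
function buildStaticMappings(fieldTypes) {
  const fields = {};
  for (const [name, type] of Object.entries(fieldTypes)) {
    fields[name] = { type };
  }
  return { mappings: { dynamic: false, fields } };
}

// Two independent definitions intended for the same collection.
// The index names are illustrative assumptions.
const searchIndexDefinitions = {
  plot_only: buildStaticMappings({ plot: "string" }),
  genres_year: buildStaticMappings({ genres: "string", year: "number" }),
};

// Hypothetical helper: create one Atlas Search index per definition.
// `collection` is expected to be a mongosh collection such as db.movies.
// It is defined here but not called, so the script runs without a cluster.
function createSearchIndexes(collection, definitions) {
  return Object.entries(definitions).map(([name, definition]) =>
    collection.createSearchIndex(name, definition)
  );
}

console.log(JSON.stringify(searchIndexDefinitions, null, 2));
```

A $search stage can then select one of these indexes by name through its index option. Without that option, the stage uses the index named "default".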

The following queries return five movies whose plot field contains "poet", showing each movie's title, genres, plot, and release year.

Text index:

db.movies.find(
  {
    $text: { $search: "poet" }
  },
  {
    _id: 0,
    title: 1,
    genres: 1,
    plot: 1,
    year: 1
  }
).limit(5)

Atlas Search index:

db.movies.aggregate([
  {
    "$search": {
      "text": {
        "path": "plot",
        "query": "poet"
      }
    }
  },
  {
    "$limit": 5
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Both queries return the following results:

{
plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
genres: [ 'Crime', 'Drama' ],
title: 'Who Killed Pasolini?',
year: 1995
},
{
plot: 'Friendship and betrayal between two poets during the French Revolution.',
genres: [ 'Biography', 'Drama' ],
title: 'Pandaemonium',
year: 2000
},
{
year: 2003,
plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
genres: [ 'Biography', 'Drama', 'Romance' ],
title: 'Sylvia'
},
{
year: 2003,
plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
genres: [ 'Biography', 'Drama', 'Romance' ],
title: 'Sylvia'
},
{
plot: 'A love-struck Italian poet is stuck in Iraq at the onset of an American invasion.',
genres: [ 'Comedy', 'Drama', 'Romance' ],
title: 'The Tiger and the Snow',
year: 2005
}
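The correspondence between the two queries above can be sketched as a small helper that takes the parts of a $text find query and produces the matching $search pipeline. The function name textQueryToSearchPipeline and its parameter names are hypothetical. It covers only the simple single-term case shown here.

```javascript
// Hypothetical helper: build a $search pipeline that corresponds to
//   db.coll.find({ $text: { $search: term } }, projection).limit(limit)
// Unlike $text, the $search text operator needs an explicit field path.
function textQueryToSearchPipeline({ term, path, projection, limit }) {
  return [
    { $search: { text: { path, query: term } } },
    { $limit: limit },
    { $project: projection },
  ];
}

const searchPipeline = textQueryToSearchPipeline({
  term: "poet",
  path: "plot",
  projection: { _id: 0, title: 1, genres: 1, plot: 1, year: 1 },
  limit: 5,
});

console.log(JSON.stringify(searchPipeline, null, 2));

// In mongosh, where `db` is defined, the pipeline can be run directly.
if (typeof db !== "undefined") {
  console.log(db.movies.aggregate(searchPipeline).toArray());
}
```

One difference to keep in mind when migrating: a $text query searches every field covered by the text index, while this sketch searches only the field named in path.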

Unique to Atlas Search, you can highlight your results, displaying each match in the context in which it was found. To do so, replace the Atlas Search query above with the following:

db.movies.aggregate([
  {
    "$search": {
      "text": {
        "path": "plot",
        "query": "poet"
      },
      "highlight": {
        "path": "plot"
      }
    }
  },
  {
    "$limit": 1
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

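The results of the query above include a highlights field, which contains the context in which each match occurs along with a relevance score for each match. For example, the following shows the highlights field for the first document in the $search results.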

{
plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
genres: [ 'Crime', 'Drama' ],
title: 'Who Killed Pasolini?',
year: 1995,
highlights: [
{
score: 1.0902210474014282,
path: 'plot',
texts: [
{ value: "It's the story of the murder of a ", type: 'text' },
{ value: 'poet', type: 'hit' },
{
value: ', a man, a great film director: Pier Paolo Pasolini. ',
type: 'text'
}
]
},
{
score: 1.0202842950820923,
path: 'plot',
texts: [
{
value: 'The story begin with the arrest of "Pelosi", a young man then accused of the murder of the ',
type: 'text'
},
{ value: 'poet', type: 'hit' },
{ value: '. ...', type: 'text' }
]
}
]
}
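An application typically turns the highlights array into display snippets. The following is a minimal sketch of one way to do that, using the highlights shown above as input. The helper name renderHighlights and the default markers are assumptions for illustration. The sketch does not escape HTML, which a real page would need to do.

```javascript
// Hypothetical helper: join the `texts` fragments of each highlight into
// one string, wrapping fragments of type "hit" in the given markers.
function renderHighlights(highlights, open = "<em>", close = "</em>") {
  return highlights.map((h) =>
    h.texts
      .map((t) => (t.type === "hit" ? open + t.value + close : t.value))
      .join("")
  );
}

// The highlights array from the result document shown above.
const highlights = [
  {
    score: 1.0902210474014282,
    path: "plot",
    texts: [
      { value: "It's the story of the murder of a ", type: "text" },
      { value: "poet", type: "hit" },
      { value: ", a man, a great film director: Pier Paolo Pasolini. ", type: "text" },
    ],
  },
  {
    score: 1.0202842950820923,
    path: "plot",
    texts: [
      {
        value: 'The story begin with the arrest of "Pelosi", a young man then accused of the murder of the ',
        type: "text",
      },
      { value: "poet", type: "hit" },
      { value: ". ...", type: "text" },
    ],
  },
];

const snippets = renderHighlights(highlights);
snippets.forEach((s) => console.log(s));
```

Each snippet keeps the surrounding text intact and marks only the fragments that Atlas Search reported as hits.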

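In addition to greater flexibility and convenience, Atlas Search offers significant performance advantages over comparable $text queries. Consider a query against the sample_mflix.movies collection that retrieves comedies released between 2000 and 2010 whose plot field contains "poet".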

Run the following queries:

Text index:

db.movies.aggregate([
  {
    $match: {
      year: { $gte: 2000, $lte: 2010 },
      $text: { $search: "poet" },
      genres: { $eq: "Comedy" }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Atlas Search index:

db.movies.aggregate([
  {
    "$search": {
      "compound": {
        "filter": [
          {
            "range": {
              "gte": 2000,
              "lte": 2010,
              "path": "year"
            }
          },
          {
            "text": {
              "path": "plot",
              "query": "poet"
            }
          },
          {
            "text": {
              "path": "genres",
              "query": "comedy"
            }
          }
        ]
      }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Both queries return the following three documents.

{
year: 2000,
plot: 'A film poem inspired by the Peruvian poet Cèsar Vallejo. A story about our need for love, our confusion, greatness and smallness and, most of all, our vulnerability. It is a story with many...',
genres: [ 'Comedy', 'Drama' ],
title: 'Songs from the Second Floor'
},
{
plot: 'When his mother, who has sheltered him his entire 40 years, dies, Elling, a sensitive, would-be poet, is sent to live in a state institution. There he meets Kjell Bjarne, a gentle giant and...',
genres: [ 'Comedy', 'Drama' ],
title: 'Elling',
year: 2001
},
{
plot: 'Heart-broken after several affairs, a woman finds herself torn between a Poet and a TV Host.',
genres: [ 'Comedy', 'Romance', 'Drama' ],
title: 'Easy',
year: 2003
}
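The $search version above places all three conditions in the filter clause of the compound operator, so they narrow the result set without contributing to the relevance score. A parameterized sketch of that stage might look like the following. The helper name buildFilteredSearchStage and its parameter names are hypothetical.

```javascript
// Hypothetical helper: build a $search stage whose conditions all go in
// compound.filter, mirroring the Atlas Search query shown above.
function buildFilteredSearchStage({ yearFrom, yearTo, plotTerm, genre }) {
  return {
    $search: {
      compound: {
        filter: [
          { range: { path: "year", gte: yearFrom, lte: yearTo } },
          { text: { path: "plot", query: plotTerm } },
          { text: { path: "genres", query: genre } },
        ],
      },
    },
  };
}

// Same parameters as the example query: comedies from 2000 to 2010
// whose plot mentions "poet".
const filteredPipeline = [
  buildFilteredSearchStage({
    yearFrom: 2000,
    yearTo: 2010,
    plotTerm: "poet",
    genre: "comedy",
  }),
  { $sort: { year: 1 } },
  { $limit: 3 },
  { $project: { _id: 0, title: 1, genres: 1, plot: 1, year: 1 } },
];

console.log(JSON.stringify(filteredPipeline, null, 2));

// In mongosh, where `db` is defined, the pipeline can be run directly.
if (typeof db !== "undefined") {
  console.log(db.movies.aggregate(filteredPipeline).toArray());
}
```

Keeping the stage in a builder like this makes it easier to vary the year range or search term while leaving the rest of the pipeline unchanged.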

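While $text is sufficient for simple, narrow searches like this one, the performance advantages of $search grow with dataset size and query breadth, and they significantly improve your application's responsiveness. We recommend that you run Atlas Search queries through the $search aggregation pipeline stage.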
