Update $text Queries to Use Atlas Search for Improved Search Performance

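If your queries rely heavily on the $text operator, you can rewrite them to use the $search aggregation pipeline stage, which improves both their flexibility and their performance.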

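Atlas Search Feature Benefits

The $search aggregation stage provides the following features, which are either unavailable through the $text operator, available but less performant, or available only after significant implementation work by the user:

Language awareness
Case-insensitive and diacritic-insensitive search
Highlighting of result text
Geospatial-aware queries
Character and word autocompletion using different tokenization strategies
Fuzzy matching
Filtering on 10 or more strings using the compound operator
Customizable relevance scoring and sorting
Single compound index on arrays
Synonym search
Bucketing for faceted navigation
Custom analyzers
Partial matching
Phrase queries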
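
Two of the features listed above, fuzzy matching and phrase queries, can be sketched as plain pipeline objects. This is a minimal illustration, not a definitive implementation: the helper names fuzzyPlotSearch and phrasePlotSearch are hypothetical, and the sketch assumes the collection has an Atlas Search index named "default" that indexes the plot field as a string.

```javascript
// Hypothetical helper: build a $search stage that tolerates small typos.
// "fuzzy.maxEdits" is the maximum number of single-character edits
// allowed between the query term and a matching indexed term.
function fuzzyPlotSearch(query, maxEdits = 1) {
  return {
    $search: {
      text: {
        path: "plot",
        query: query,
        fuzzy: { maxEdits: maxEdits }
      }
    }
  };
}

// Hypothetical helper: build a $search stage that matches an ordered
// sequence of terms by using the phrase operator.
function phrasePlotSearch(phrase) {
  return {
    $search: {
      phrase: { path: "plot", query: phrase }
    }
  };
}

// "poat" is a deliberate misspelling of "poet".
const fuzzyPipeline = [fuzzyPlotSearch("poat"), { $limit: 5 }];
const phrasePipeline = [phrasePlotSearch("french revolution"), { $limit: 5 }];

// In mongosh, connected to a cluster with a suitable index, you could run:
//   db.movies.aggregate(fuzzyPipeline)
//   db.movies.aggregate(phrasePipeline)
console.log(JSON.stringify(fuzzyPipeline, null, 2));
```

Because neither stage sets the index option, $search falls back to the index named "default".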

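Examples

Create the Index

The examples in the following sections use queries against the sample_mflix.movies collection in the sample data to illustrate how Atlas Search improves on $text in flexibility and performance. You can run the queries in both examples with the following indexes: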

Text index definition, followed by the equivalent Atlas Search index definition:

db.movies.createIndex(
  {
    genres: "text",
    plot: "text",
    year: -1
  }
)

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "genres": {
        "type": "string"
      },
      "plot": {
        "type": "string"
      },
      "year": {
        "type": "number"
      }
    }
  }
}

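Either index definition indexes the genres and plot fields as text and the year field as a number. For instructions on creating a $text index, see Create a Text Index. For instructions on creating an Atlas Search index, see Create an Atlas Search Index.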

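Improve the Flexibility of Full-Text Queries with Atlas Search

You can update your $text-based queries to use $search for greater flexibility and convenience. In this example, you query the sample_mflix.movies collection in the sample data to retrieve entries with the word "poet" in the plot field, sorted in ascending order by year.

The index definitions listed in the previous section illustrate one of the flexibility enhancements of $search: to create a $text index on sample_mflix.movies, you must first drop any existing text index on the sample data, because MongoDB supports only one text index per collection.

In contrast, you can create multiple Atlas Search indexes for a single collection, which lets your applications run different full-text queries in parallel.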
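
As a rough sketch of what several Atlas Search indexes on one collection might look like, the following defines two index definitions and a helper that targets one of them by name. The index names plot_only and genres_year and the helper searchWithIndex are hypothetical, and the mongosh calls in the comments assume a deployment and shell version that support db.collection.createSearchIndex().

```javascript
// Two hypothetical Atlas Search index definitions for the same collection.
const plotIndex = {
  name: "plot_only", // hypothetical index name
  definition: {
    mappings: {
      dynamic: false,
      fields: { plot: { type: "string" } }
    }
  }
};

const genresYearIndex = {
  name: "genres_year", // hypothetical index name
  definition: {
    mappings: {
      dynamic: false,
      fields: {
        genres: { type: "string" },
        year: { type: "number" }
      }
    }
  }
};

// Hypothetical helper: build a $search stage that names the index to use.
// Without the "index" option, $search uses the index named "default".
function searchWithIndex(indexName, path, query) {
  return {
    $search: {
      index: indexName,
      text: { path: path, query: query }
    }
  };
}

const plotStage = searchWithIndex(plotIndex.name, "plot", "poet");
const genreStage = searchWithIndex(genresYearIndex.name, "genres", "comedy");

// In mongosh (assumption: createSearchIndex is available in your version):
//   db.movies.createSearchIndex(plotIndex.name, plotIndex.definition)
//   db.movies.createSearchIndex(genresYearIndex.name, genresYearIndex.definition)
//   db.movies.aggregate([plotStage, { $limit: 5 }])
console.log(JSON.stringify([plotStage, genreStage], null, 2));
```

Each query names its own index, so neither index has to be dropped to make room for the other.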

The following queries return five movies with "poet" in the plot field, displaying their title, genres, plot, and release year.

Text index query, followed by the equivalent Atlas Search query:

db.movies.find(
   {
     $text: { $search: "poet" }
   },
   {
     _id: 0,
     title: 1,
     genres: 1,
     plot: 1,
     year: 1
   }
).limit(5)

db.movies.aggregate([
   {
     "$search": {
       "text": {
         "path": "plot",
         "query": "poet"
       }
     }
   },
   {
     "$limit": 5
   },
   {
     "$project": {
       "_id": 0,
       "title": 1,
       "genres": 1,
       "plot": 1,
       "year": 1,
     }
   }
])

Both queries return the following results:

{
 plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
 genres: [ 'Crime', 'Drama' ],
 title: 'Who Killed Pasolini?',
 year: 1995
},
{
 plot: 'Friendship and betrayal between two poets during the French Revolution.',
 genres: [ 'Biography', 'Drama' ],
 title: 'Pandaemonium',
 year: 2000
},
{
 year: 2003,
 plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
 genres: [ 'Biography', 'Drama', 'Romance' ],
 title: 'Sylvia'
},
{
 year: 2003,
 plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
 genres: [ 'Biography', 'Drama', 'Romance' ],
 title: 'Sylvia'
},
{
 plot: 'A love-struck Italian poet is stuck in Iraq at the onset of an American invasion.',
 genres: [ 'Comedy', 'Drama', 'Romance' ],
 title: 'The Tiger and the Snow',
 year: 2005
}

A feature unique to Atlas Search is highlighting, which displays each match in the context in which it was found. To use it, replace the Atlas Search query above with the following:

db.movies.aggregate([
  {
    "$search": {
      "text": {
        "path": "plot",
        "query": "poet"
      },
      "highlight": {
        "path": "plot"
      }
    }
  },
  {
    "$limit": 1
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

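The results of the above query include a highlights field, which contains the context in which each match occurred and a relevance score for each match. For example, the following shows the first document in the $search results, including its highlights field.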

{
  plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
  genres: [ 'Crime', 'Drama' ],
  title: 'Who Killed Pasolini?',
  year: 1995,
  highlights: [
    {
      score: 1.0902210474014282,
      path: 'plot',
      texts: [
        { value: "It's the story of the murder of a ", type: 'text' },
        { value: 'poet', type: 'hit' },
        {
          value: ', a man, a great film director: Pier Paolo Pasolini. ',
          type: 'text'
        }
      ]
    },
    {
      score: 1.0202842950820923,
      path: 'plot',
      texts: [
        {
          value: 'The story begin with the arrest of "Pelosi", a young man then accused of the murder of the ',
          type: 'text'
        },
        { value: 'poet', type: 'hit' },
        { value: '. ...', type: 'text' }
      ]
    }
  ]
}
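
One way an application might consume the highlights field is to join the texts entries back into a snippet and wrap each hit in a marker. The following is a minimal sketch under that assumption. The function name renderHighlight and the `<em>` markers are illustrative choices, not part of Atlas Search, and the sample object copies the first highlight from the result above.

```javascript
// Hypothetical helper: turn one highlight entry into a display snippet.
// Entries with type "hit" are wrapped in markers; "text" entries are
// passed through unchanged. No HTML escaping is performed here.
function renderHighlight(highlight, open = "<em>", close = "</em>") {
  return highlight.texts
    .map((t) => (t.type === "hit" ? open + t.value + close : t.value))
    .join("");
}

// First highlight entry from the result shown above.
const sampleHighlight = {
  score: 1.0902210474014282,
  path: "plot",
  texts: [
    { value: "It's the story of the murder of a ", type: "text" },
    { value: "poet", type: "hit" },
    {
      value: ", a man, a great film director: Pier Paolo Pasolini. ",
      type: "text"
    }
  ]
};

const snippet = renderHighlight(sampleHighlight);
console.log(snippet);
```

If the snippet is rendered as HTML, the text values should be escaped before the markers are added.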

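Improve Query Performance with Atlas Search

In addition to greater flexibility and convenience, Atlas Search offers significant performance advantages over comparable $text queries. Consider a query against the sample_mflix.movies collection that retrieves comedies released between 2000 and 2010 with "poet" in the plot field.

Run the following queries: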

Text index query, followed by the equivalent Atlas Search query:

db.movies.aggregate([
  {
    $match: {
      year: {$gte: 2000, $lte: 2010},
      $text: { $search: "poet" },
      genres : { $eq: "Comedy" }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    },
  }
])

db.movies.aggregate([
  {
    "$search": {
      "compound": {
        "filter": [{
          "range": {
            "gte": 2000,
            "lte": 2010,
            "path": "year"
          }
        },
        {
          "text": {
            "path": "plot",
            "query": "poet"
          }
        },
        {
          "text": {
            "path": "genres",
            "query": "comedy"
          }
        }]
      }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Both queries return the following three documents.

{
  year: 2000,
  plot: 'A film poem inspired by the Peruvian poet Cèsar Vallejo. A story about our need for love, our confusion, greatness and smallness and, most of all, our vulnerability. It is a story with many...',
  genres: [ 'Comedy', 'Drama' ],
  title: 'Songs from the Second Floor'
},
{
  plot: 'When his mother, who has sheltered him his entire 40 years, dies, Elling, a sensitive, would-be poet, is sent to live in a state institution. There he meets Kjell Bjarne, a gentle giant and...',
  genres: [ 'Comedy', 'Drama' ],
  title: 'Elling',
  year: 2001
},
{
  plot: 'Heart-broken after several affairs, a woman finds herself torn between a Poet and a TV Host.',
  genres: [ 'Comedy', 'Romance', 'Drama' ],
  title: 'Easy',
  year: 2003
}
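
The compound filter query above can also be parameterized so that the same pipeline shape serves other terms, genres, and year ranges. The following sketch assumes the Atlas Search index defined earlier on this page. The function name buildFilteredSearch and its option names are hypothetical.

```javascript
// Hypothetical helper: build the compound-filter pipeline shown above
// for an arbitrary search term, genre, and inclusive year range.
function buildFilteredSearch({ term, genre, fromYear, toYear, limit = 3 }) {
  return [
    {
      $search: {
        compound: {
          // "filter" clauses must all match and do not affect the score.
          filter: [
            { range: { path: "year", gte: fromYear, lte: toYear } },
            { text: { path: "plot", query: term } },
            { text: { path: "genres", query: genre } }
          ]
        }
      }
    },
    { $sort: { year: 1 } },
    { $limit: limit },
    { $project: { _id: 0, title: 1, genres: 1, plot: 1, year: 1 } }
  ];
}

const pipeline = buildFilteredSearch({
  term: "poet",
  genre: "comedy",
  fromYear: 2000,
  toYear: 2010
});

// In mongosh: db.movies.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

With these arguments, the helper produces the same stages as the Atlas Search query above.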

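While $text is adequate for simple, narrow searches like this one, the performance advantages of $search can significantly improve the responsiveness of your application as data set size and query breadth increase. We recommend that you run Atlas Search queries through the $search aggregation pipeline stage.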

Learn More

To learn more about Atlas Search queries, see Create and Run Atlas Search Queries.
MongoDB University offers a free course on optimizing MongoDB performance. To learn more, see Monitoring and Insights.
