Perform Hybrid Search with Atlas Vector Search and Atlas Search

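On this page

  • Why Use Hybrid Search?
  • What Is Reciprocal Rank Fusion?
  • Prerequisites
  • Create the Atlas Vector Search and Atlas Search Indexes
  • Procedure
  • Run a Combined Semantic Search and Full-Text Search Query
  • Procedure
  • About the Query
  • Learn by Watching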

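You can combine Atlas Vector Search and Atlas Search queries into a hybrid search for unified results.

This tutorial demonstrates how to run a hybrid search on the sample_mflix.embedded_movies collection, which contains details about movies. Specifically, this tutorial takes you through the following steps:

  1. Create an Atlas Vector Search index on the plot_embedding field. This field contains vector embeddings that represent the summary of a movie's plot.

  2. Create an Atlas Search index on the title field in the sample_mflix.embedded_movies collection. This field contains the movie's name as a text string.

  3. Run a query that uses reciprocal rank fusion to combine the results from a $vectorSearch query against the plot_embedding field and a $search query against the title field.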

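A hybrid search is an aggregation of different search methods, such as full-text search and semantic search, for the same query criteria. While full-text search is effective at finding exact matches for query terms, semantic search adds the benefit of identifying semantically similar documents even when they don't contain the exact query term. This ensures that synonymous and contextually similar matches are also included in the combined results of both search methods.

Conversely, if your dataset contains tokens for proper nouns or specific keywords that you don't expect the embedding model to have seen in training in the same context in which your dataset uses them, your vector search might benefit from being combined with a full-text search.

You can also set a weight for each search method on a per-query basis. Depending on whether the full-text or the semantic search results are more relevant and appropriate for a query, you can increase the weight of that search method for the query.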

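Reciprocal rank fusion is a technique for combining the results of different search methods, such as semantic search and full-text search, into a single result set by performing the following actions:

  1. Calculate the reciprocal rank of the documents in the results.

    For each ranked document in each search result, first add the rank ( r ) of the document to the constant number 60 ( rank_constant ), which smooths the score, and then divide 1 by the sum of r and rank_constant to get the reciprocal rank of the document in the results.

    reciprocal_rank = 1 / ( r + rank_constant )

    For each search method, apply a different weight ( w ) to give more importance to that search method. For each document, calculate the weighted reciprocal rank by multiplying the weight by the reciprocal rank of the document.

    weighted_reciprocal_rank = w x reciprocal_rank

  2. Combine the rank-derived and weighted scores of the documents in the results.

    For each document across all search results, add the calculated weighted reciprocal ranks to get a single score for the document.

  3. Sort the results by the combined score of the documents in the results.

    Sort the documents by their combined score to produce a single ranked list of documents.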
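The three actions above can be sketched in plain JavaScript. This is a minimal illustration rather than anything Atlas runs: the function name reciprocalRankFusion and the document ids are hypothetical, and the sketch assumes that ranks start at 0 for the first document in each list.

```javascript
// Hypothetical helper: fuse several ranked lists with reciprocal rank fusion.
// Each list holds document ids ordered best-first; weights[i] applies to list i.
const RANK_CONSTANT = 60;

function reciprocalRankFusion(rankedLists, weights, rankConstant = RANK_CONSTANT) {
  const scores = new Map();
  rankedLists.forEach((list, i) => {
    const w = weights[i];
    list.forEach((id, r) => {
      // weighted_reciprocal_rank = w x (1 / (r + rank_constant))
      const weighted = w * (1 / (r + rankConstant));
      // Step 2: sum the weighted reciprocal ranks per document.
      scores.set(id, (scores.get(id) || 0) + weighted);
    });
  });
  // Step 3: sort by combined score, highest first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Two made-up result lists with equal weights.
const fused = reciprocalRankFusion([["A", "B", "C"], ["B", "D"]], [1, 1]);
console.log(fused);
```

Document B appears in both lists, so its two reciprocal ranks add up and it moves ahead of A, which ranks first in only one list.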

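Before you begin, you must have the following:

  • An Atlas cluster running MongoDB v6.0.11, v7.0.2, or later.

    Note

    Ensure that your Atlas cluster has enough memory to store both the Atlas Search and Atlas Vector Search indexes and to run performant queries.

  • The sample data loaded into your Atlas cluster.

  • One of the following applications to run queries on your Atlas cluster:

    • Search Tester

    • mongosh

    • Compass

    • C#

    • Java

    • MongoDB Node Driver

    • pymongo

  • Project Data Access Admin access to the project to create the Atlas Vector Search and Atlas Search indexes.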

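This section demonstrates how to create the following indexes on fields in the sample_mflix.embedded_movies collection:

  • An Atlas Vector Search index on the plot_embedding field for running vector queries against that field.

  • An Atlas Search index on the title field for running full-text search against that field.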

1
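  1. If it is not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

  2. If it is not already displayed, select your desired project from the Projects menu in the navigation bar.

  3. If the Clusters page is not already displayed, click Database in the sidebar.

    The Clusters page displays.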

2

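You can go to the Atlas Search page from the sidebar, the Data Explorer, or your cluster details page.

From the sidebar:

  1. In the sidebar, click Atlas Search under the Services heading.

  2. From the Select data source dropdown, select your cluster and click Go to Atlas Search.

    The Atlas Search page displays.

From the Data Explorer:

  1. Click the Browse Collections button for your cluster.

  2. Expand the database and select the collection.

  3. Click the Search Indexes tab for the collection.

    The Atlas Search page displays.

From the cluster details page:

  1. Click your cluster's name.

  2. Click the Atlas Search tab.

    The Atlas Search page displays.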

3
4
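  1. Click Create Search Index.

  2. Under Atlas Vector Search, select JSON Editor and then click Next.

  3. In the Database and Collection section, find the sample_mflix database, and select the embedded_movies collection.

  4. In the Index Name field, enter rrf-vector-search.

  5. Replace the default definition with the following index definition and then click Next.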

5
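  1. Replace the default definition with the following index definition.

    This index definition indexes the plot_embedding field as the vector type. The plot_embedding field contains embeddings created using OpenAI's text-embedding-ada-002 embedding model. The index definition specifies 1536 vector dimensions and measures similarity using dotProduct.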

    {
      "fields": [
        {
          "type": "vector",
          "path": "plot_embedding",
          "numDimensions": 1536,
          "similarity": "dotProduct"
        }
      ]
    }
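If you prefer the shell to the Atlas UI, the same definition can probably be submitted from mongosh. The following is a sketch under assumptions, not part of the UI procedure: it assumes a mongosh version whose createSearchIndex helper accepts an index type as its second argument, and it guards the call so that the snippet also loads outside mongosh, where db is not defined.

```javascript
// Index definition copied from the tutorial step above.
const vectorIndexDefinition = {
  fields: [
    {
      type: "vector",
      path: "plot_embedding",
      numDimensions: 1536,
      similarity: "dotProduct",
    },
  ],
};

// `db` exists only inside mongosh, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  db.getSiblingDB("sample_mflix")
    .getCollection("embedded_movies")
    .createSearchIndex("rrf-vector-search", "vectorSearch", vectorIndexDefinition);
}
```

The index name matches the one that the $vectorSearch stage in this tutorial references.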
6

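A modal window displays to let you know that your index is building.

7

The index should take about one minute to build. While it is building, the Status column reads Initial Sync. When it finishes building, the Status column reads Active.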

8
9
  • For a guided experience, select the Atlas Search Visual Editor.

  • To edit the raw index definition, select the Atlas Search JSON Editor.

10
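  1. In the Index Name field, enter rrf-full-text-search.

    Note

    If you name your index default, you don't need to specify an index parameter when using the $search pipeline stage. Otherwise, you must specify the index name using the index parameter.

  2. In the Database and Collection section, find the sample_mflix database, and select the embedded_movies collection.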

11

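The following index definition indexes the title field as the string type for querying the field.

You can use the Atlas Search Visual Editor or the Atlas Search JSON Editor in the Atlas user interface to create the index.

Visual Editor:

  1. Click Next.

  2. Click Refine Your Index.

  3. In the Index Configurations section, toggle to disable Dynamic Mapping.

  4. In the Field Mappings section, click Add Field to display the Add Field Mapping window.

  5. Click Customized Configuration.

  6. Select title from the Field Name dropdown.

  7. Select String from the Data Type dropdown.

  8. Click Add.

JSON Editor:

  1. Replace the default index definition with the following definition.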

    {
      "mappings": {
        "dynamic": false,
        "fields": {
          "title": [{
            "type": "string"
          }]
        }
      }
    }
  2. Click Next.
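As an alternative to the editors, the Atlas Search index can likely be created from mongosh as well. This is only a sketch: it assumes that your mongosh version provides the createSearchIndex helper in its two-argument form (name, definition), and the call is guarded so that the snippet also loads outside mongosh.

```javascript
// Index definition copied from the JSON Editor step above.
const searchIndexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      title: [{ type: "string" }],
    },
  },
};

// `db` exists only inside mongosh, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  db.getSiblingDB("sample_mflix")
    .getCollection("embedded_movies")
    .createSearchIndex("rrf-full-text-search", searchIndexDefinition);
}
```

Because the index is not named default, any $search stage that uses it has to pass the name in its index parameter.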

12

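A modal window displays to let you know that your index is building.

13

The index should take about one minute to build. While it is building, the Status column reads Initial Sync. When it finishes building, the Status column reads Active.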

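This section demonstrates how to query the data in the sample_mflix.embedded_movies collection for star wars in the plot_embedding and title fields by using the $vectorSearch and $search pipeline stages, and how to combine each document's score from both stages to re-sort the documents in the results. This ensures that documents that appear in both searches appear at the top of the combined results.

1

Open mongosh in a terminal window and connect to your cluster. For detailed instructions on connecting, see Connect via mongosh.

2

Run the following command at the mongosh prompt: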

use sample_mflix
switched to db sample_mflix
3
1var vector_weight = 0.1;
2var full_text_weight = 0.9;
3db.embedded_movies.aggregate([
4 {
5 "$vectorSearch": {
6 "index": "rrf-vector-search",
7 "path": "plot_embedding",
8 "queryVector": [-0.003091304,-0.018973768,-0.021206735,-0.031700388,-0.027724933,-0.008654361,-0.022407115,-0.01577275,-0.0026105063,-0.03009988,0.023659125,0.0020603316,0.009796659,0.0029944992,0.012152246,0.03180365,0.026343849,-0.022265134,0.022562003,-0.030358028,0.0007740361,0.012294226,0.008228419,-0.018638177,0.012462022,-0.0106098205,0.0037366704,-0.033507414,0.00019018135,-0.02445938,0.02609861,-0.030564545,-0.020574275,-0.021787563,-0.011500426,-0.012384578,0.008796342,-0.013449432,0.016856967,0.0030412883,0.016766615,0.01871562,-0.019709485,-0.01524355,-0.028396113,0.0024749795,0.022587817,-0.024097975,-0.0053339517,0.010500109,-0.0028057296,0.040942032,-0.03585655,-0.013772115,-0.008880239,0.008525288,-0.0075314236,-0.007621775,0.02909311,-0.016844058,0.0084413905,0.0137979295,-0.023839828,0.014572369,-0.03270716,0.0031042115,-0.00033397702,-0.008564009,0.009828928,0.0016053484,0.041458327,0.016844058,0.0059954524,0.0076088677,-0.006840882,-0.009996722,0.010500109,0.014223872,-0.006450435,0.009757937,0.016960224,-0.0116940355,-0.0078992825,0.008796342,0.021348715,-0.006756984,-0.013230007,0.0031687482,-0.031622946,0.0047628027,0.0011084165,-0.0027347393,0.009473976,0.023426794,-0.020961495,0.019980539,-0.009048034,0.03337834,0.0051629297,-0.035133738,0.0029089882,0.0053436323,-0.014946681,-0.02368494,-0.009538513,0.009396533,0.016805336,0.0078992825,0.0183284,-0.0040625804,-0.011351991,0.021593954,0.0052952296,-0.039186638,-0.016856967,-0.022058617,-0.00041948803,-0.00967404,0.007563692,-0.029144738,0.026718162,0.017011853,0.016431024,-0.012087709,0.028860778,0.02199408,-0.015785657,0.00085349684,-0.019709485,-0.017153835,0.035469327,0.012655632,0.024007624,-0.0023087976,-0.029970808,0.022226412,0.0058566984,0.016637541,-0.016108342,-0.010680811,0.0043917173,0.025337078,-0.007821838,-0.0019619134,-0.0063568573,0.009906371,0.012223236,0.030074066,0.010048352,-0.020651719,0.017683035,-0.023052482,0.021529417,0.00477571,0.007757302,0.0057986155,-0.0046756
784,-0.0053210445,-0.0015359716,-0.006744077,-0.00543721,0.004853154,0.01087442,-0.017863737,0.021723026,0.021581046,0.017670127,0.0116940355,-0.018367123,-0.0026153466,-0.015333901,0.030512914,-0.018225143,0.028525187,0.00008117497,0.024381936,0.009841835,0.019864373,-0.032836232,-0.027466787,-0.023710756,0.012003812,0.021116383,0.027879821,-0.0039270534,0.0070925746,0.0060664425,0.01288151,0.01794118,-0.011939275,0.00028375947,0.023465516,0.010635635,0.0024523917,-0.6790285,-0.02618896,0.019477153,-0.008680176,0.009693401,0.017360352,0.022794334,0.017295815,-0.006134206,-0.013655949,-0.014314223,0.005788935,0.011235826,-0.012074802,-0.011042216,-0.007783117,0.018405845,-0.012087709,0.008241327,-0.00088011817,-0.026330942,0.02324609,0.0039431877,-0.015217735,0.023568774,0.013513968,0.00024806263,-0.009067396,-0.028241226,0.026318034,-0.020509738,0.020148333,0.0045788735,0.004349768,0.044375382,0.002174884,-0.031080836,0.03714728,0.024807878,0.020535553,-0.012119978,-0.0034720702,0.0059567303,0.008131614,-0.015398438,0.017876644,0.027079567,-0.0037721656,-0.014159335,-0.0016731119,0.0030719433,-0.0023910818,0.005634047,0.0029380298,0.010151611,0.0074088043,0.025711391,-0.0116294995,0.002274916,0.018354215,0.00010487201,0.0156824,-0.0054565715,-0.0010777616,-0.006489157,0.03719891,-0.01601799,0.016650448,0.0004594201,-0.018831786,0.013081573,0.010784069,-0.0026718162,-0.0036624533,0.01170049,0.038024977,0.020612998,-0.000036679994,-0.0066795405,0.024381936,0.008531742,0.0004892683,-0.024446473,0.008428483,0.023749476,0.0026992443,-0.028499372,0.014533647,0.0126879,0.0044239853,0.015656585,0.003555968,-0.013552691,-0.007931551,0.009706308,-0.0113649,-0.034049522,0.0019651402,0.01577275,-0.030332213,-0.0024991806,0.013617227,-0.0048789685,-0.0025830783,0.0066795405,0.0022733025,0.008860879,0.020083796,0.03392045,-0.04269743,0.0071764723,-0.010093528,-0.023078296,-0.012216782,0.008409122,-0.021929543,0.036269583,0.00044772282,0.012642724,-0.024627175,0.023220276,-0.0038
23795,0.004243283,-0.007512063,0.0021361622,0.0027234454,0.0041367975,-0.027492601,-0.003910919,-0.0054952935,-0.0027089247,0.020651719,0.015953453,-0.008209058,0.0069957697,-0.0016666582,-0.0041335705,-0.008467205,0.004033539,-0.030693617,-0.02334935,0.006931233,-0.0067505306,0.012016719,0.0041529317,-0.025905,-0.0015287113,-0.005840564,-0.009796659,0.00021236582,-0.01117129,-0.0013730166,-0.017786292,0.0052468274,0.005705037,-0.0032106969,-0.022652354,-0.008525288,-0.00045538653,-0.018689806,0.005059671,0.0007611288,-0.0021603634,-0.008931869,0.017915366,-0.020651719,0.0014464271,0.011132567,-0.026214777,-0.016560096,0.028551001,-0.0038334753,-0.0042142416,0.028551001,-0.024704618,0.026692348,-0.023697848,-0.02373657,0.0077056726,0.0016521375,0.0005279902,0.003762485,-0.013029944,0.013785022,0.019425523,0.007699219,0.012068348,0.0094288,0.0043917173,0.0024604588,-0.01847038,0.015411345,-0.01432713,0.0035688751,-0.00634395,-0.016147062,-0.007860561,0.009377171,0.02315574,0.020961495,0.034746516,-0.013436525,0.020225778,-0.029118923,0.009002859,-0.04796362,0.0033107288,-0.020729164,0.01668917,-0.000113342445,0.019696577,0.013552691,0.0073378137,-0.015927639,0.021309992,0.03487559,-0.0053920345,0.00051024265,-0.021671398,0.00791219,-0.0033817189,-0.014623999,0.009048034,0.0013923777,-0.02436903,0.007860561,0.019851465,0.016895687,0.017050575,-0.020380665,-0.008331678,-0.012132885,-0.0057857083,0.026163146,-0.015269365,0.015475882,0.010945411,-0.027776562,0.031080836,-0.0027557136,-0.0065181986,0.0029218956,0.039005935,0.012765344,-0.0005126628,0.006957048,0.04274906,-0.008964137,-0.010674357,-0.0029138285,-0.010280684,-0.016727893,-0.004817659,0.0148176085,0.00536622,-0.03193272,0.015475882,0.0024120563,0.03944478,0.020032167,0.014572369,0.013720485,0.009106117,-0.013501061,-0.0060406276,0.013371988,0.0017086071,0.025943723,0.0030316077,0.007344268,0.00026258337,-0.00012907325,0.008150975,-0.01582438,-0.0011447184,-0.012662085,0.009086756,0.0021232548,0.0012544306,0.
013513968,0.011055123,-0.0315455,0.010190332,0.011777934,-0.009996722,-0.0227298,-0.01934808,-0.022329671,0.0027476468,0.02870589,-0.02195536,0.021180918,0.013423617,-0.004885422,-0.037947536,0.0068666968,0.0133203585,-0.01582438,0.022639448,0.010938957,-0.002100667,0.012455568,-0.014288408,-0.020587182,0.009893464,-0.009828928,0.005521108,-0.024214141,0.014933774,-0.018173512,-0.005959957,-0.0067376234,-0.030796876,-0.0040625804,0.0027815285,0.002558877,-0.017734664,-0.006208423,0.048170134,0.0101387035,0.009461069,-0.014830516,-0.0038818778,0.002010316,0.074655965,0.0007425745,0.0125781875,0.011487518,0.0021668172,-0.0100031765,-0.024485195,-0.0022023122,-0.014223872,-0.017153835,-0.0016569778,-0.007144204,0.01949006,0.010319406,-0.0013334879,0.012468475,0.018263863,-0.0052629616,0.012739529,0.001032586,-0.01683115,-0.011907007,0.019309357,-0.0053984886,0.028551001,-0.030306397,0.0108808745,0.011080938,-0.009499791,-0.037018206,-0.02407216,-0.006379445,-0.020587182,0.013939911,0.011777934,-0.0063310424,0.0047079464,0.015282272,0.016289044,-0.02137453,0.0012996062,0.020187056,-0.0010172585,-0.013552691,0.0045401515,-0.008118707,-0.0118295625,0.027286084,-0.005563057,-0.03381719,0.018496197,-0.011500426,-0.012907324,-0.000307759,-0.00030332213,0.011513334,-0.004946732,-0.01275889,-0.019541688,-0.005743759,-0.011139021,-0.030228954,0.018534917,0.0074023507,-0.007344268,-0.013042851,-0.015475882,-0.02301376,-0.007931551,-0.001060014,-0.008363946,0.005708264,0.0013342947,0.006976409,0.019128654,-0.02049683,0.014159335,0.00548884,0.013746301,0.021000218,-0.011732758,0.0008591438,-0.008731805,-0.018831786,0.011532694,0.00048684815,0.026924679,-0.0046950392,0.0024959538,0.0025330624,0.019890187,-0.0016271296,0.0036979485,-0.00046305027,0.015475882,0.005133888,0.007970273,-0.0005586451,0.017205464,0.006685994,-0.0046982663,-0.015695307,0.01126164,0.0057857083,-0.002473366,-0.0038334753,0.009248098,0.014056076,-0.014933774,-0.010099981,-0.007944458,-0.028886592,0.004791844,
-0.009609503,0.004736988,0.033481598,-0.0008470432,0.0063955793,0.002445938,-0.02248456,0.0040399926,-0.040270854,-0.0066279112,0.023710756,-0.0056275935,0.0008333291,0.01177148,-0.01934808,-0.003113892,0.0031848822,-0.024665898,0.013668857,0.009383624,0.019502968,-0.040270854,-0.007292638,-0.017631406,0.016740799,-0.00464341,0.0052984566,-0.03676006,-0.013346174,0.01799281,-0.024678804,0.003475297,-0.026511645,-0.010480748,-0.0022862097,-0.007492702,0.005156476,-0.022987945,0.008822156,-0.0011713397,-0.02199408,-0.0045369244,-0.0437042,-0.012216782,-0.03603725,0.026847234,0.020096704,0.036011435,-0.0075765993,0.024175419,-0.014740164,-0.00399159,0.010990587,0.008092892,0.016366487,0.0017925047,0.034178596,0.029454514,-0.0008704377,0.009364264,0.006340723,0.028499372,0.01804444,0.0015504924,0.008344585,-0.008228419,-0.0037528046,-0.005524335,0.013888281,-0.008822156,-0.00588574,-0.014081891,-0.007299092,0.009002859,0.013836652,0.0007349108,0.006363311,0.036682617,-0.022549096,0.018741434,-0.015901824,0.021439066,-0.0162116,0.00012140952,-0.009435254,0.009131932,-0.0062632794,0.01808316,-0.017502332,-0.027983079,0.017153835,-0.0022410343,0.03608888,-0.011151928,0.001871562,0.00022749159,-0.022497466,-0.0065440135,-0.019567505,-0.011894099,-0.044736788,-0.016869873,0.00032772502,-0.004278778,0.023852736,-0.018354215,-0.015024126,0.013836652,0.0062181037,0.025814649,0.0026347076,0.037457056,0.00745398,-0.010629182,-0.0040141777,-0.005459798,-0.0218521,0.0029186688,0.0071893795,0.015230643,-0.025362892,-0.003133253,0.0042336024,-0.016818244,-0.039109193,-0.028137967,0.007202287,0.0004933018,0.029480329,-0.028008893,-0.022820149,-0.032939494,0.0077121262,-0.016637541,0.002531449,-0.02489823,-0.039780375,-0.015811473,-0.0075314236,-0.009880557,0.01996763,-0.010945411,-0.02580174,0.010442025,-0.010119342,0.0070086773,-0.016534282,-0.030564545,0.023168648,-0.0027557136,0.00060906436,0.018625269,0.0084413905,-0.022161877,-0.000673601,0.016250322,0.022936316,-0.014778886,-0.0
16456839,-0.0030461287,0.005098393,0.02001926,-0.002992886,-0.011939275,-0.017695941,-0.012436207,-0.0036398654,0.006666633,-0.000830909,-0.02171012,-0.020806607,-0.005388808,-0.020858236,-0.016392302,-0.005840564,0.008583371,-0.03131317,-0.006744077,-0.003843156,-0.031003393,0.006014813,-0.0005441244,-0.0100031765,0.0069957697,0.040012706,-0.02754423,-0.010145157,-0.018238049,0.013617227,-0.032681346,0.001777984,-0.0055695106,-0.023568774,0.0253758,-0.020419387,-0.019283542,0.00065424,0.016521376,0.0005844598,0.012352309,0.008860879,0.024588453,0.023697848,-0.010222601,-0.025117652,-0.015024126,0.01177148,-0.0015650131,-0.0005465445,0.010835699,-0.030564545,0.01755396,-0.0050015883,0.011042216,-0.00568245,-0.029170552,-0.010261323,-0.01963204,0.04215532,-0.015540418,-0.011351991,-0.032190867,0.003459163,-0.0073378137,0.034901407,-0.0024523917,-0.008396215,0.0033591313,0.033455785,0.018935045,0.0006772312,0.005653408,0.00340108,0.00967404,-0.018534917,-0.0121006165,-0.0049596396,0.001681179,0.02301376,0.0058954204,-0.016314859,-0.0068279747,0.009190015,-0.019373894,-0.00075185165,-0.0038947852,0.013359081,0.0055275615,-0.0010293592,0.00006201566,0.017760478,-0.010087074,-0.010041898,-0.0036398654,0.015604955,0.023517145,-0.010074167,0.010822792,0.0070603066,-0.022678168,0.0028218639,0.017205464,0.0062019695,0.013849559,-0.0074733407,0.004817659,-0.01046784,-0.019193191,-0.0038528363,-0.005727625,0.017670127,0.014314223,-0.027311899,0.001294766,0.0009309408,0.0044239853,-0.016314859,-0.0021894048,0.019709485,-0.021439066,0.0013157404,0.006095484,-0.021826286,-0.014611091,-0.029454514,0.0101387035,0.007776663,-0.01203608,0.021142198,0.013055759,-0.0035624215,-0.01085506,-0.012887963,0.0039076926,-0.013772115,-0.0018199327,-0.018702714,0.007860561,-0.013100934,0.0043271803,-0.045898445,0.031855278,-0.019219005,0.008351039,-0.026330942,0.014094798,0.004217468,-0.0058115227,0.011726304,0.009073849,0.01504994,-0.013436525,0.00025391128,-0.0007175666,-0.0025604905,0.009073
849,0.020625904,-0.0061761546,-0.012042534,0.0017505558,0.0027524868,-0.004569193,0.036889132,0.22551677,-0.011422982,0.0031510005,0.045330524,-0.00017263547,0.03632121,0.016495561,0.003342997,-0.025388706,0.009499791,-0.027002122,0.012326495,0.013694671,0.00037007718,0.0026056662,-0.028576816,-0.01630195,-0.01741198,-0.037353795,-0.019864373,-0.001844134,-0.0023555867,-0.016043805,-0.019231914,0.006769892,-0.011836017,-0.0029218956,-0.0087124435,0.018973768,0.027828192,-0.008525288,-0.0021329354,0.004178746,-0.0054178494,-0.016611727,0.008635,-0.004891876,-0.0011818269,0.0036366386,0.005937369,-0.019606225,0.010596913,-0.00615034,0.030177325,-0.01256528,0.02493695,-0.00948043,0.01263627,0.015075755,0.014791794,-0.027802376,-0.020522647,0.03392045,0.061438866,-0.015669491,0.010261323,-0.003820568,0.003514019,-0.007370082,0.00032328814,-0.0041174367,0.015398438,-0.025479058,0.017670127,-0.012113524,0.009686947,-0.03864453,0.019954724,0.016844058,-0.013643042,0.0046143685,-0.03053873,-0.015992176,-0.01683115,-0.032965306,-0.01640521,0.029015666,0.003910919,-0.010332313,0.017089298,0.011345538,-0.0366568,-0.010054805,-0.021064753,-0.025078932,-0.046931032,0.015927639,-0.0025298356,-0.009777298,0.02402053,-0.013230007,-0.0069828625,-0.015024126,-0.02010961,0.01760559,-0.011371353,0.009396533,0.00726037,-0.026589088,-0.008002541,-0.024317399,-0.013927003,0.009641771,0.005714718,-0.016121248,0.020225778,0.0010366195,0.012784705,-0.01237167,0.0050144955,0.012029626,-0.019412616,0.01073244,0.007099028,-0.019993445,0.018418752,0.0027783015,0.007918644,0.027105382,-0.03193272,0.0015980881,0.011248733,0.012384578,-0.0057243984,0.0045756465,-0.024730433,-0.007563692,0.0094288,0.0025943723,-0.02981592,-0.0077895704,-0.017089298,0.018212235,-0.011061577,-0.0068989648,0.007963819,-0.000080267426,0.0051693833,-0.004314273,0.016327765,-0.01111966,0.0049402784,-0.0058825132,0.020819515,0.022432929,-0.0154242525,0.008880239,0.009015766,0.0031493872,-0.013668857,-0.010112889,-0.0154371
6,0.00764759,-0.02629222,0.012804066,-0.026356757,-0.036734246,-0.02803471,-0.016469747,0.029273812,-0.030796876,0.010461386,0.02513056,0.002694404,-0.024446473,-0.030693617,-0.16603982,0.03203598,0.02329772,-0.004624049,0.018289678,0.0037366704,0.011777934,0.001595668,0.02010961,0.0014803087,0.021684306,0.0029590041,-0.034953035,0.009712761,0.026460014,0.014198056,0.001739262,0.013100934,0.0018279998,0.008312317,0.023891458,-0.020819515,-0.0058599254,-0.011797295,-0.003005793,-0.012081255,0.007415258,0.022497466,-0.0024201234,0.005459798,-0.017773386,-0.009570781,0.033068564,0.004998361,0.0109518645,0.012971861,-0.01635358,-0.022148969,0.00041041258,0.02909311,0.02151651,0.007834746,0.029867548,-0.0014561075,0.0048047514,0.020264499,0.0057405327,0.00075548183,0.013836652,-0.015992176,-0.006308455,-0.019838559,-0.008964137,-0.010822792,0.009506244,0.023839828,0.014727257,0.007053853,-0.0016400369,-0.02301376,-0.008538195,-0.018457474,0.005369447,-0.017902458,-0.016069619,-0.020483924,-0.0007768596,-0.007279731,-0.010345221,0.012752436,0.00029182652,-0.0003874214,-0.0017973449,0.0029025346,0.016676264,0.000081225386,-0.013759208,0.030409656,-0.01281052,-0.005598552,-0.022252228,0.032991122,0.011093846,-0.0009761164,0.006989316,0.0114939725,-0.010654996,-0.007776663,-0.023258999,-0.015385531,0.020587182,-0.012010265,-0.00366568,-0.0014367466,0.012694353,0.026563274,0.00372699,-0.009712761,0.00733136,0.004069034,-0.0016860192,-0.0072732773,-0.00032490154,0.03087432,0.021284178,0.024420658,0.016882781,0.011132567,0.019141562,0.010209694,-0.004081941,-0.00056832563,0.014456203,0.017373258,0.004010951,0.024975672,0.0059954524,-0.0114939725,0.033791374,0.0020022488,0.0488155,-0.0007268437,-0.021103475,-0.0019231914,-0.010132249,-0.007376536,-0.06908,-0.021981174,0.02320737,-0.00017374469,-0.01452074,0.012203875,-0.008280048,0.00582443,-0.014004447,0.009577234,0.00085027,-0.046724513,-0.0006606937,-0.012081255,0.008822156,-0.0060051326,-0.01053883,-0.001085022,-0.008744712,
0.015037033,0.0039786827,0.011887645,0.011429436,0.006553694,-0.011635953,-0.0018167059,-0.021542324,0.035236996,0.009467523,0.012210329,0.0012850855,0.010945411,-0.003685041,-0.01924482,-0.02160686,0.0018957633,-0.021426158,-0.01256528,0.0034882044,-0.056895487,0.008486566,0.025066024,-0.013139656,-0.03211342,-0.014598184,-0.009519151,0.010713079,0.01111966,0.016727893,-0.022213506,-0.034462556,-0.02373657,-0.014172242,0.0023975356,0.023955993,-0.006553694,0.016856967,-0.008157429,-0.00274442,-0.00054896466,-0.0016126088,0.002073239,-0.0033559042,0.017670127,0.00063891255,0.004543378,-0.0064343014,-0.021400344,0.010519469,-0.019167377,-0.020006353,0.0033881727,-0.035004664,-0.0036430922,-0.033507414,-0.016082527,-0.01804444,-0.013552691,0.036966577,-0.01510157,-0.011732758,-0.011584324,0.023413887,-0.023568774,0.03781846,0.019812742,-0.007641136,0.010590459,0.0005154863,-0.00523392,-0.021361621,0.01640521,0.015617862,-0.028137967,-0.008570463,0.015398438,0.006511745,0.026279312,0.015617862,-0.0060470817,-0.0014754685,0.012642724,-0.056482453,0.016663356,-0.00073975103,-0.00044731947,0.015127384,0.0018538145,0.0026314808,-0.0015593661,0.013965725,0.0052113323,-0.020329036,0.0011616593,0.0051242076,0.008822156,-0.03286205,-0.007796024,0.006418167,0.018108977,0.005059671,-0.0050403103,0.0023733343,0.016702078,0.0072668237,-0.00027851586,-0.018935045,-0.012655632,-0.0039044656,0.007370082,-0.019709485,-0.0044562537,0.02010961,-0.027002122,-0.026589088,0.042516727,-0.009544967,-0.031752016,0.008415575,0.008718898,0.032061793,0.018922137,-0.010893782,-0.008951229,0.011861831,-0.026421294,-0.015204828,0.01261691,-0.0047724834,0.017115112,0.013888281,-0.012087709,-0.0188576,0.023930179,0.005362993,-0.015475882,-0.00940944,-0.035133738,0.029996622,-0.0118295625,0.008518834,-0.008835063,0.030074066,0.014533647,0.021619769,0.0013907643,0.014727257,-0.016418116,-0.0070022233,0.008467205,0.011603685,-0.052713513,-0.016624633,-0.006363311,0.013010583,0.018935045,-0.004817659,0.0
10048352,0.0034688434,-0.0025685576,-0.009351357,0.0162116,0.020432295,-0.008112254,-0.04086459,0.004217468,0.029609403,0.030512914,0.0010366195,0.0035269265,0.00047636093,0.010584006,-0.012074802,0.008757619,0.0042949123,-0.0037108557,-0.018922137,0.040064335,-0.022123154,-0.013384895,0.0016779521,0.016250322,-0.010016084,-0.006169701,0.0044820686,-0.030358028,-0.023271905,0.01679243,-0.029454514,-0.01996763,0.001184247,0.0051984247,0.036992393,0.011061577,-0.017812107,0.0058986475,-0.00928682,0.017115112,-0.0103387665,-0.023452608,-0.027286084,0.019451339,0.018147698,0.022161877,-0.0008631773,-0.03714728,0.010603367,0.024394844,0.026124425,-0.014236779,-0.006279413,0.011739211,0.008209058,-0.011268094,0.008822156,0.0047595757,0.011287455,0.012081255,-0.024007624,0.03226831,-0.017050575,0.03892849,0.009764391,-0.022949222,0.0088996,-0.036114693,0.010164518,0.02137453,-0.004262644,-0.011235826,-0.015863102,-0.013397803,-8.476734e-7,0.025775926,0.0067505306,-0.035185367,-0.014314223,-0.029196369,0.0077895704,-0.002473366,-0.020045076,0.015179014,0.00095272186,0.030616174,-0.009351357,-0.007602414,-0.013617227,0.030667802,0.02195536,-0.0010705012,-0.028783333,-0.0087834345,-0.013384895,-0.017683035,-0.03231994,0.02363331,0.0010196787,0.015540418,-0.0067892526,0.01237167,0.015876008,0.008551102,0.0058728326,-0.020729164,-0.0326039,0.020290313,-0.0016174491,-0.0043045925,0.012739529,-0.012190968],
9 "numCandidates": 100,
10 "limit": 20
11 }
12 }, {
13 "$group": {
14 "_id": null,
15 "docs": {"$push": "$$ROOT"}
16 }
17 }, {
18 "$unwind": {
19 "path": "$docs",
20 "includeArrayIndex": "rank"
21 }
22 }, {
23 "$addFields": {
24 "vs_score": {
25 "$multiply": [
26 vector_weight, {
27 "$divide": [
28 1.0, {
29 "$add": ["$rank", 60]
30 }
31 ]
32 }
33 ]
34 }
35 }
36 }, {
37 "$project": {
38 "vs_score": 1,
39 "_id": "$docs._id",
40 "title": "$docs.title"
41 }
42 },
43 {
44 "$unionWith": {
45 "coll": "movies",
46 "pipeline": [
47 {
48 "$search": {
49 "index": "rrf-full-text-search",
50 "phrase": {
51 "query": "star wars",
52 "path": "title"
53 }
54 }
55 }, {
56 "$limit": 20
57 }, {
58 "$group": {
59 "_id": null,
60 "docs": {"$push": "$$ROOT"}
61 }
62 }, {
63 "$unwind": {
64 "path": "$docs",
65 "includeArrayIndex": "rank"
66 }
67 }, {
68 "$addFields": {
69 "fts_score": {
70 "$multiply": [
71 full_text_weight, {
72 "$divide": [
73 1.0, {
74 "$add": ["$rank", 60]
75 }
76 ]
77 }
78 ]
79 }
80 }
81 },
82 {
83 "$project": {
84 "fts_score": 1,
85 "_id": "$docs._id",
86 "title": "$docs.title"
87 }
88 }
89 ]
90 }
91 },
92 {
93 "$group": {
94 "_id": "$title",
95 "vs_score": {"$max": "$vs_score"},
96 "fts_score": {"$max": "$fts_score"}
97 }
98 },
99 {
100 "$project": {
101 "_id": 1,
102 "title": 1,
103 "vs_score": {"$ifNull": ["$vs_score", 0]},
104 "fts_score": {"$ifNull": ["$fts_score", 0]}
105 }
106 },
107 {
108 "$project": {
109 "score": {"$add": ["$fts_score", "$vs_score"]},
110 "_id": 1,
111 "title": 1,
112 "vs_score": 1,
113 "fts_score": 1
114 }
115 },
116 {"$sort": {"score": -1}},
117 {"$limit": 10}
118])
[
{
_id: 'Star Wars: Episode IV - A New Hope',
vs_score: 0.0016666666666666668,
fts_score: 0,
score: 0.0016666666666666668
},
{
_id: 'Star Wars: Episode I - The Phantom Menace',
vs_score: 0.0016393442622950822,
fts_score: 0,
score: 0.0016393442622950822
},
{
_id: 'Star Wars: Episode V - The Empire Strikes Back',
vs_score: 0.0016129032258064516,
fts_score: 0,
score: 0.0016129032258064516
},
{
_id: 'Star Wars: Episode VI - Return of the Jedi',
vs_score: 0.0015873015873015873,
fts_score: 0,
score: 0.0015873015873015873
},
{
_id: 'Star Wars: The Clone Wars',
vs_score: 0.0015625,
fts_score: 0,
score: 0.0015625
},
{
_id: 'Message from Space',
vs_score: 0.0015384615384615387,
fts_score: 0,
score: 0.0015384615384615387
},
{
_id: 'Star Wars: Episode II - Attack of the Clones',
vs_score: 0.0014925373134328358,
fts_score: 0,
score: 0.0014925373134328358
},
{
_id: 'Guardians of the Galaxy',
vs_score: 0.0014705882352941176,
fts_score: 0,
score: 0.0014705882352941176
},
{
_id: 'Abiogenesis',
vs_score: 0.0014285714285714286,
fts_score: 0,
score: 0.0014285714285714286
},
{
_id: 'Dune',
vs_score: 0.0014084507042253522,
fts_score: 0,
score: 0.0014084507042253522
}
]

If you sort the results in ascending order by replacing the value of score on line 116 (the $sort stage) with 1, Atlas Vector Search returns the following results:

[
{
_id: 'Cowboys & Aliens',
vs_score: 0.0012658227848101266,
fts_score: 0,
score: 0.0012658227848101266
},
{
_id: 'Planet of the Apes',
vs_score: 0.001298701298701299,
fts_score: 0,
score: 0.001298701298701299
},
{
_id: 'Starcrash',
vs_score: 0.0013157894736842105,
fts_score: 0,
score: 0.0013157894736842105
},
{
_id: 'Zathura: A Space Adventure',
vs_score: 0.0013333333333333335,
fts_score: 0,
score: 0.0013333333333333335
},
{
_id: 'Space Raiders',
vs_score: 0.0013513513513513514,
fts_score: 0,
score: 0.0013513513513513514
},
{
_id: 'Star Wars: Episode III - Revenge of the Sith',
vs_score: 0.0013698630136986301,
fts_score: 0,
score: 0.0013698630136986301
},
{
_id: 'The Ewok Adventure',
vs_score: 0.001388888888888889,
fts_score: 0,
score: 0.001388888888888889
},
{
_id: 'Dune',
vs_score: 0.0014084507042253522,
fts_score: 0,
score: 0.0014084507042253522
},
{
_id: 'Abiogenesis',
vs_score: 0.0014285714285714286,
fts_score: 0,
score: 0.0014285714285714286
},
{
_id: 'Guardians of the Galaxy',
vs_score: 0.0014705882352941176,
fts_score: 0,
score: 0.0014705882352941176
}
]

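The sample query retrieves the sorted search results from the semantic search and the full-text search, and assigns a reciprocal rank score to the documents in the results based on their position in the results array. The reciprocal rank score is calculated by using the following formula:

1.0/{document position in the results + constant value}

The query then adds the scores from both searches for each document, ranks the documents based on the combined score, and sorts the documents to return a single result.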

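The sample query defines the following variables to add weight to the score. Because each weight multiplies the reciprocal rank, a larger value gives that search method more influence on the combined score: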

  • vector_weight

  • full_text_weight

The weighted reciprocal rank score is calculated by using the following formula:

weight x reciprocal rank
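To see how the weights shift the outcome, the formula can be evaluated for two hypothetical documents: x, which ranks first in the semantic results only, and y, which ranks twentieth (zero-based position 19) in the full-text results only. The helper name weighted and both documents are made up for illustration. The weights 0.1 and 0.9 and the constant 60 come from the sample query.

```javascript
const RANK_CONSTANT = 60;

// weight x reciprocal rank, with a zero-based position in the results array.
const weighted = (weight, position) => weight * (1.0 / (position + RANK_CONSTANT));

// Weights from the sample query: vector_weight = 0.1, full_text_weight = 0.9.
const withTutorialWeights = { x: weighted(0.1, 0), y: weighted(0.9, 19) };

// The same two documents with the weights swapped.
const withSwappedWeights = { x: weighted(0.9, 0), y: weighted(0.1, 19) };

console.log(withTutorialWeights, withSwappedWeights);
```

With the tutorial's weights, even the last full-text match outscores the best semantic match. Swapping the weights reverses that order.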

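The sample query uses the following pipeline stages to perform a semantic search on the collection and retrieve the reciprocal rank of the documents in the results:

  • $vectorSearch: Searches the plot_embedding field for the string Star Wars, specified as vector embeddings in the queryVector field of the query. The query uses text-embedding-ada-002 embeddings, the same model as the vector embeddings in the plot_embedding field. The query also specifies a search of up to 100 nearest neighbors and limits the results to 20 documents. This stage returns the sorted documents from the semantic search.
  • $group: Groups all the documents in the semantic search results into a field named docs.
  • $unwind: Unwinds the array of documents in the docs field and stores the position of each document in the results array in a field named rank.
  • $addFields: Adds a new field named vs_score that contains the reciprocal rank score for each document in the results. Here, the reciprocal rank score is calculated by dividing 1.0 by the sum of rank and the rank constant value of 60. The weighted reciprocal rank is then calculated by multiplying the vector_weight weight by the reciprocal rank score.
  • $project: Includes only the following fields in the results:

    • vs_score

    • _id

    • title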
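The effect of the $group, $unwind, $addFields, and $project stages described above can be imitated outside the database. The sketch below is plain JavaScript, not an aggregation pipeline: the function name addWeightedReciprocalRank and the _id values are hypothetical, and the input array stands in for the sorted $vectorSearch output.

```javascript
// Imitates $group + $unwind (includeArrayIndex) + $addFields + $project.
function addWeightedReciprocalRank(docs, weight, scoreField, rankConstant = 60) {
  // The array index plays the role of the zero-based `rank` field.
  return docs.map((doc, rank) => ({
    _id: doc._id,
    title: doc.title,
    [scoreField]: weight * (1.0 / (rank + rankConstant)),
  }));
}

// Stand-in for the first two documents returned by the semantic search.
const vectorResults = [
  { _id: 1, title: "Star Wars: Episode IV - A New Hope" },
  { _id: 2, title: "Star Wars: Episode I - The Phantom Menace" },
];

const scored = addWeightedReciprocalRank(vectorResults, 0.1, "vs_score");
console.log(scored);
```

The two scores come out at roughly 0.00167 and 0.00164, in line with the first two vs_score values in the query output.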

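The sample query uses the $unionWith stage to perform a text search on the collection and retrieve the reciprocal rank of the documents in the results. The $unionWith pipeline contains the following stages:

  • $search: Searches for movies that contain the phrase star wars in the title field. This stage returns the sorted documents from the full-text search.
  • $limit: Limits the output to 20 results.
  • $group: Groups all the documents from the full-text search into a field named docs.
  • $unwind: Unwinds the array of documents in the docs field and stores the position of each document in the results array in a field named rank.
  • $addFields: Adds a new field named fts_score that contains the reciprocal rank score for each document in the results. Here, the reciprocal rank score is calculated by dividing 1.0 by the sum of rank and the rank constant value of 60. The weighted reciprocal rank is then calculated by multiplying the full_text_weight weight by the reciprocal rank score.
  • $project: Includes only the following fields in the results:

    • fts_score

    • _id

    • title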

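The sample query uses the following stages to combine the results of the semantic search and the text search and return a single ranked list of documents:

  • $group: Groups the documents in the results from the preceding stages by title and keeps the highest vs_score and fts_score for each title.
  • $project: Replaces a missing vs_score or fts_score with 0 and includes only the following fields in the results:

    • vs_score

    • fts_score

    • _id

    • title

  • $project: Adds a field named score that contains the sum of vs_score and fts_score to the results.
  • $sort: Sorts the results by score in descending order.
  • $limit: Limits the output to 10 results.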
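The merging stages can be mimicked in plain JavaScript to make the data flow easier to follow. This is an illustrative sketch only: mergeByTitle and the movie titles are hypothetical, and the input array stands in for the combined output of the semantic and full-text branches.

```javascript
function mergeByTitle(docs, limit = 10) {
  const groups = new Map();
  for (const doc of docs) {
    // $group on title with $max: keep the highest score of each kind.
    const g = groups.get(doc.title) || { _id: doc.title, vs_score: null, fts_score: null };
    for (const field of ["vs_score", "fts_score"]) {
      if (doc[field] !== undefined && (g[field] === null || doc[field] > g[field])) {
        g[field] = doc[field];
      }
    }
    groups.set(doc.title, g);
  }
  return [...groups.values()]
    // $ifNull: a title found by only one search gets 0 for the other score.
    .map((g) => ({
      _id: g._id,
      vs_score: g.vs_score === null ? 0 : g.vs_score,
      fts_score: g.fts_score === null ? 0 : g.fts_score,
    }))
    // Second $project: add the combined score.
    .map((g) => ({ ...g, score: g.vs_score + g.fts_score }))
    // $sort descending, then $limit.
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const merged = mergeByTitle([
  { title: "Movie A", vs_score: 0.1 / 60 },
  { title: "Movie B", vs_score: 0.1 / 61 },
  { title: "Movie B", fts_score: 0.9 / 60 },
]);
console.log(merged);
```

Movie B is found by both searches, so its two scores are summed and it ranks above Movie A, whose fts_score defaults to 0.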

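Watch a demonstration of an application that showcases hybrid search queries that combine Atlas Search full-text search and vector search to return a single merged result set. The application implements Relative Score Fusion (RSF) and Reciprocal Rank Fusion (RRF) to return a merged set created by using a rank fusion algorithm.

Duration: 2.43 minutes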
