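Perform Hybrid Search with Atlas Vector Search and Atlas Search
You can combine Atlas Vector Search and Atlas Search queries into a hybrid search for unified results.
This tutorial demonstrates how to run a hybrid search on the sample_mflix.embedded_movies collection, which contains details about movies. Specifically, this tutorial takes you through the following steps:
Create an Atlas Vector Search index on the plot_embedding field. This field contains vector embeddings that represent the summary of a movie's plot.
Create an Atlas Search index on the title field in the sample_mflix.embedded_movies collection. This field contains the movie's name as a text string.
Run a query that uses reciprocal rank fusion to combine the results from a $vectorSearch query against the plot_embedding field and a $search query against the title field.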
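Why Hybrid Search?
A hybrid search is an aggregation of different search methods, such as a full-text search and a semantic search, for the same query criteria. While full-text search is effective at finding exact matches for query terms, semantic search has the added benefit of identifying semantically similar documents even if they don't contain the exact query term. This ensures that synonymous and contextually similar matches are also included in the combined results of both search methods.
Conversely, if your dataset contains tokens for proper nouns or specific keywords that you don't expect an embedding model to have been trained on in the same context in which your dataset uses them, your vector search might benefit from being combined with a full-text search.
You can also set a weight for each search method on a per-query basis. Depending on whether the full-text or the semantic search results are more relevant and appropriate for a query, you can increase the weight of that search method for that query.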
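What is Reciprocal Rank Fusion?
Reciprocal rank fusion is a technique to combine results from different search methods, such as a semantic search and a full-text search, into a single result set by performing the following actions:
Calculate the reciprocal rank of the documents in the results.
For each ranked document in each search result, first add the rank (r) of the document to a constant number, 60, that smooths the score (rank_constant), and then divide 1 by the sum of r and rank_constant to get the reciprocal rank of the document in the results.
reciprocal_rank = 1 / ( r + rank_constant )
For each search method, apply a different weight (w) to give more importance to that search method. For each document, the weighted reciprocal rank is calculated by multiplying the weight by the reciprocal rank of the document.
weighted_reciprocal_rank = w x reciprocal_rank
Combine the rank-derived and weighted scores of the documents in the results.
For each document across all search results, add the calculated reciprocal ranks to get a single score for the document.
Sort the results by the combined score of the documents in the results.
Order the documents by their combined score, highest first, to produce a single ranked list of documents.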
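The four actions above can be sketched in plain JavaScript, independent of any database. This is a minimal illustration, not the Atlas implementation: the function name reciprocalRankFusion, the method names vector and fullText, and the sample document ids are all hypothetical, and the sketch assumes ranks start at 1 (the query later in this tutorial uses the zero-based array position instead).

```javascript
// Minimal sketch of weighted reciprocal rank fusion over plain arrays.
// rankedLists: { methodName: [docId, ...] } with each list ordered best-first.
// weights:     { methodName: number }; a missing weight defaults to 1.
function reciprocalRankFusion(rankedLists, weights, rankConstant = 60) {
  const combined = new Map();
  for (const [method, ids] of Object.entries(rankedLists)) {
    const w = weights[method] ?? 1;
    ids.forEach((id, index) => {
      const r = index + 1; // assumption: ranks start at 1
      const reciprocalRank = 1 / (r + rankConstant);
      const weighted = w * reciprocalRank;
      // A document found by several methods accumulates one score per method.
      combined.set(id, (combined.get(id) ?? 0) + weighted);
    });
  }
  // Sort by combined score, highest first.
  return [...combined.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical ranked results from two search methods.
const fused = reciprocalRankFusion(
  { vector: ["a", "b", "c"], fullText: ["b", "d"] },
  { vector: 0.5, fullText: 0.5 }
);
console.log(fused.map((d) => d.id));
```

Document b appears in both lists, so its two weighted reciprocal ranks are added together and it moves ahead of a, which only one method returned.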
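Prerequisites
Before you begin, you must have the following:
An Atlas cluster with MongoDB version v6.0.11, v7.0.2, or later.
Note
Ensure that your Atlas cluster has enough memory to store both the Atlas Search and Atlas Vector Search indexes and to run performant queries.
The sample data loaded into your Atlas cluster.
An application to run queries on your Atlas cluster. This tutorial uses mongosh.
Project Data Access Admin access to the project to create the Atlas Vector Search and Atlas Search indexes.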
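Create the Atlas Vector Search and Atlas Search Indexes
This section demonstrates how to create the following indexes on fields in the sample_mflix.embedded_movies collection:
An Atlas Vector Search index on the plot_embedding field for running vector queries against that field.
An Atlas Search index on the title field for running full-text search against that field.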
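Procedure
In Atlas, go to the Clusters page for your project.
If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it's not already displayed, select your desired project from the Projects menu in the navigation bar.
If the Clusters page is not already displayed, click Database in the sidebar.
The Clusters page displays.
Go to the Atlas Search page for your cluster.
You can go to the Atlas Search page from the sidebar, the Data Explorer, or your cluster details page.
From the sidebar:
In the sidebar, click Atlas Search under the Services heading.
From the Select data source dropdown, select your cluster and click Go to Atlas Search.
The Atlas Search page displays.
From the Data Explorer:
Click the Browse Collections button for your cluster.
Expand the database and select the collection.
Click the Search Indexes tab for the collection.
The Atlas Search page displays.
From the cluster details page:
Click your cluster's name.
Click the Atlas Search tab.
The Atlas Search page displays.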
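Define the Atlas Vector Search index.
Replace the default definition with the following index definition. The query in this tutorial refers to this index by the name rrf-vector-search.
This index definition indexes the plot_embedding field as the vector type. The plot_embedding field contains embeddings created using OpenAI's text-embedding-ada-002 embedding model. The index definition specifies 1536 vector dimensions and measures similarity using dotProduct.
{
  "fields": [
    {
      "type": "vector",
      "path": "plot_embedding",
      "numDimensions": 1536,
      "similarity": "dotProduct"
    }
  ]
}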
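Enter the index name, and set the database and collection.
In the Index Name field, enter rrf-full-text-search.
Note
If you name your index default, you don't need to specify an index parameter when using the $search pipeline stage. Otherwise, you must specify the index name using the index parameter.
In the Database and Collection section, find the sample_mflix database, and select the embedded_movies collection.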
Define the Atlas Search index.
The following index definition indexes the title field as the string type for querying the field.
You can use the Atlas Search Visual Editor or the Atlas Search JSON Editor in the Atlas user interface to create the index.
Visual Editor:
Click Next.
Click Refine Your Index.
In the Index Configurations section, toggle to disable Dynamic Mapping.
In the Field Mappings section, click Add Field to display the Add Field Mapping window.
Click Customized Configuration.
Select title from the Field Name dropdown.
Select String from the Data Type dropdown.
Click Add.
JSON Editor:
Replace the default index definition with the following definition.
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [{
        "type": "string"
      }]
    }
  }
}
Click Next.
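As an alternative to the Atlas user interface, the same two definitions can be submitted from mongosh. The sketch below assumes a recent mongosh whose db.collection.createSearchIndex() helper accepts a name, a type, and a definition, and a cluster version that supports creating search indexes by command. The wrapper function createHybridSearchIndexes is a hypothetical name used only for illustration; the index names are the ones that the query in this tutorial references.

```javascript
// Sketch: create both tutorial indexes from mongosh instead of the Atlas UI.
// Assumption: `db` is a handle to the sample_mflix database.
function createHybridSearchIndexes(db) {
  const coll = db.getCollection("embedded_movies");

  // Vector index on plot_embedding (1536 dimensions, dotProduct similarity).
  const vectorIndexName = coll.createSearchIndex("rrf-vector-search", "vectorSearch", {
    fields: [
      { type: "vector", path: "plot_embedding", numDimensions: 1536, similarity: "dotProduct" },
    ],
  });

  // Full-text index on title with dynamic mapping disabled.
  const searchIndexName = coll.createSearchIndex("rrf-full-text-search", "search", {
    mappings: { dynamic: false, fields: { title: [{ type: "string" }] } },
  });

  return [vectorIndexName, searchIndexName];
}

// In mongosh, after `use sample_mflix`:
//   createHybridSearchIndexes(db)
```

Index builds run asynchronously, so the indexes might not be queryable immediately after the helper returns.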
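Run a Combined Semantic Search and Full-Text Search Query
This section demonstrates how to query the data in the sample_mflix.embedded_movies collection for star wars in the plot_embedding and title fields by using the $vectorSearch and $search pipeline stages, and how to combine each document's score from both stages to re-sort the documents in the results. This ensures that documents that appear in both searches appear at the top of the combined results.
Procedure
Connect to your cluster using mongosh.
Open mongosh in a terminal window and connect to your cluster. For detailed instructions on connecting, see Connect via mongosh.
Use the sample_mflix database.
Run the following command at the mongosh prompt: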
use sample_mflix
switched to db sample_mflix
Run the following Atlas Search query against the embedded_movies collection.
1 var vector_weight = 0.1; 2 var full_text_weight = 0.9; 3 db.embedded_movies.aggregate([ 4 { 5 "$vectorSearch": { 6 "index": "rrf-vector-search", 7 "path": "plot_embedding", 8 "queryVector": [-0.003091304,-0.018973768,-0.021206735,-0.031700388,-0.027724933,-0.008654361,-0.022407115,-0.01577275,-0.0026105063,-0.03009988,0.023659125,0.0020603316,0.009796659,0.0029944992,0.012152246,0.03180365,0.026343849,-0.022265134,0.022562003,-0.030358028,0.0007740361,0.012294226,0.008228419,-0.018638177,0.012462022,-0.0106098205,0.0037366704,-0.033507414,0.00019018135,-0.02445938,0.02609861,-0.030564545,-0.020574275,-0.021787563,-0.011500426,-0.012384578,0.008796342,-0.013449432,0.016856967,0.0030412883,0.016766615,0.01871562,-0.019709485,-0.01524355,-0.028396113,0.0024749795,0.022587817,-0.024097975,-0.0053339517,0.010500109,-0.0028057296,0.040942032,-0.03585655,-0.013772115,-0.008880239,0.008525288,-0.0075314236,-0.007621775,0.02909311,-0.016844058,0.0084413905,0.0137979295,-0.023839828,0.014572369,-0.03270716,0.0031042115,-0.00033397702,-0.008564009,0.009828928,0.0016053484,0.041458327,0.016844058,0.0059954524,0.0076088677,-0.006840882,-0.009996722,0.010500109,0.014223872,-0.006450435,0.009757937,0.016960224,-0.0116940355,-0.0078992825,0.008796342,0.021348715,-0.006756984,-0.013230007,0.0031687482,-0.031622946,0.0047628027,0.0011084165,-0.0027347393,0.009473976,0.023426794,-0.020961495,0.019980539,-0.009048034,0.03337834,0.0051629297,-0.035133738,0.0029089882,0.0053436323,-0.014946681,-0.02368494,-0.009538513,0.009396533,0.016805336,0.0078992825,0.0183284,-0.0040625804,-0.011351991,0.021593954,0.0052952296,-0.039186638,-0.016856967,-0.022058617,-0.00041948803,-0.00967404,0.007563692,-0.029144738,0.026718162,0.017011853,0.016431024,-0.012087709,0.028860778,0.02199408,-0.015785657,0.00085349684,-0.019709485,-0.017153835,0.035469327,0.012655632,0.024007624,-0.0023087976,-0.029970808,0.022226412,0.0058566984,0.016637541,-0.016108342,-0.010680811,0.0043917173,0.025337078,-0.0078218
38,-0.0019619134,-0.0063568573,0.009906371,0.012223236,0.030074066,0.010048352,-0.020651719,0.017683035,-0.023052482,0.021529417,0.00477571,0.007757302,0.0057986155,-0.0046756784,-0.0053210445,-0.0015359716,-0.006744077,-0.00543721,0.004853154,0.01087442,-0.017863737,0.021723026,0.021581046,0.017670127,0.0116940355,-0.018367123,-0.0026153466,-0.015333901,0.030512914,-0.018225143,0.028525187,0.00008117497,0.024381936,0.009841835,0.019864373,-0.032836232,-0.027466787,-0.023710756,0.012003812,0.021116383,0.027879821,-0.0039270534,0.0070925746,0.0060664425,0.01288151,0.01794118,-0.011939275,0.00028375947,0.023465516,0.010635635,0.0024523917,-0.6790285,-0.02618896,0.019477153,-0.008680176,0.009693401,0.017360352,0.022794334,0.017295815,-0.006134206,-0.013655949,-0.014314223,0.005788935,0.011235826,-0.012074802,-0.011042216,-0.007783117,0.018405845,-0.012087709,0.008241327,-0.00088011817,-0.026330942,0.02324609,0.0039431877,-0.015217735,0.023568774,0.013513968,0.00024806263,-0.009067396,-0.028241226,0.026318034,-0.020509738,0.020148333,0.0045788735,0.004349768,0.044375382,0.002174884,-0.031080836,0.03714728,0.024807878,0.020535553,-0.012119978,-0.0034720702,0.0059567303,0.008131614,-0.015398438,0.017876644,0.027079567,-0.0037721656,-0.014159335,-0.0016731119,0.0030719433,-0.0023910818,0.005634047,0.0029380298,0.010151611,0.0074088043,0.025711391,-0.0116294995,0.002274916,0.018354215,0.00010487201,0.0156824,-0.0054565715,-0.0010777616,-0.006489157,0.03719891,-0.01601799,0.016650448,0.0004594201,-0.018831786,0.013081573,0.010784069,-0.0026718162,-0.0036624533,0.01170049,0.038024977,0.020612998,-0.000036679994,-0.0066795405,0.024381936,0.008531742,0.0004892683,-0.024446473,0.008428483,0.023749476,0.0026992443,-0.028499372,0.014533647,0.0126879,0.0044239853,0.015656585,0.003555968,-0.013552691,-0.007931551,0.009706308,-0.0113649,-0.034049522,0.0019651402,0.01577275,-0.030332213,-0.0024991806,0.013617227,-0.0048789685,-0.0025830783,0.0066795405,0.0022733025,0.008860879,0.02008
3796,0.03392045,-0.04269743,0.0071764723,-0.010093528,-0.023078296,-0.012216782,0.008409122,-0.021929543,0.036269583,0.00044772282,0.012642724,-0.024627175,0.023220276,-0.003823795,0.004243283,-0.007512063,0.0021361622,0.0027234454,0.0041367975,-0.027492601,-0.003910919,-0.0054952935,-0.0027089247,0.020651719,0.015953453,-0.008209058,0.0069957697,-0.0016666582,-0.0041335705,-0.008467205,0.004033539,-0.030693617,-0.02334935,0.006931233,-0.0067505306,0.012016719,0.0041529317,-0.025905,-0.0015287113,-0.005840564,-0.009796659,0.00021236582,-0.01117129,-0.0013730166,-0.017786292,0.0052468274,0.005705037,-0.0032106969,-0.022652354,-0.008525288,-0.00045538653,-0.018689806,0.005059671,0.0007611288,-0.0021603634,-0.008931869,0.017915366,-0.020651719,0.0014464271,0.011132567,-0.026214777,-0.016560096,0.028551001,-0.0038334753,-0.0042142416,0.028551001,-0.024704618,0.026692348,-0.023697848,-0.02373657,0.0077056726,0.0016521375,0.0005279902,0.003762485,-0.013029944,0.013785022,0.019425523,0.007699219,0.012068348,0.0094288,0.0043917173,0.0024604588,-0.01847038,0.015411345,-0.01432713,0.0035688751,-0.00634395,-0.016147062,-0.007860561,0.009377171,0.02315574,0.020961495,0.034746516,-0.013436525,0.020225778,-0.029118923,0.009002859,-0.04796362,0.0033107288,-0.020729164,0.01668917,-0.000113342445,0.019696577,0.013552691,0.0073378137,-0.015927639,0.021309992,0.03487559,-0.0053920345,0.00051024265,-0.021671398,0.00791219,-0.0033817189,-0.014623999,0.009048034,0.0013923777,-0.02436903,0.007860561,0.019851465,0.016895687,0.017050575,-0.020380665,-0.008331678,-0.012132885,-0.0057857083,0.026163146,-0.015269365,0.015475882,0.010945411,-0.027776562,0.031080836,-0.0027557136,-0.0065181986,0.0029218956,0.039005935,0.012765344,-0.0005126628,0.006957048,0.04274906,-0.008964137,-0.010674357,-0.0029138285,-0.010280684,-0.016727893,-0.004817659,0.0148176085,0.00536622,-0.03193272,0.015475882,0.0024120563,0.03944478,0.020032167,0.014572369,0.013720485,0.009106117,-0.013501061,-0.0060406276,0.01337
1988,0.0017086071,0.025943723,0.0030316077,0.007344268,0.00026258337,-0.00012907325,0.008150975,-0.01582438,-0.0011447184,-0.012662085,0.009086756,0.0021232548,0.0012544306,0.013513968,0.011055123,-0.0315455,0.010190332,0.011777934,-0.009996722,-0.0227298,-0.01934808,-0.022329671,0.0027476468,0.02870589,-0.02195536,0.021180918,0.013423617,-0.004885422,-0.037947536,0.0068666968,0.0133203585,-0.01582438,0.022639448,0.010938957,-0.002100667,0.012455568,-0.014288408,-0.020587182,0.009893464,-0.009828928,0.005521108,-0.024214141,0.014933774,-0.018173512,-0.005959957,-0.0067376234,-0.030796876,-0.0040625804,0.0027815285,0.002558877,-0.017734664,-0.006208423,0.048170134,0.0101387035,0.009461069,-0.014830516,-0.0038818778,0.002010316,0.074655965,0.0007425745,0.0125781875,0.011487518,0.0021668172,-0.0100031765,-0.024485195,-0.0022023122,-0.014223872,-0.017153835,-0.0016569778,-0.007144204,0.01949006,0.010319406,-0.0013334879,0.012468475,0.018263863,-0.0052629616,0.012739529,0.001032586,-0.01683115,-0.011907007,0.019309357,-0.0053984886,0.028551001,-0.030306397,0.0108808745,0.011080938,-0.009499791,-0.037018206,-0.02407216,-0.006379445,-0.020587182,0.013939911,0.011777934,-0.0063310424,0.0047079464,0.015282272,0.016289044,-0.02137453,0.0012996062,0.020187056,-0.0010172585,-0.013552691,0.0045401515,-0.008118707,-0.0118295625,0.027286084,-0.005563057,-0.03381719,0.018496197,-0.011500426,-0.012907324,-0.000307759,-0.00030332213,0.011513334,-0.004946732,-0.01275889,-0.019541688,-0.005743759,-0.011139021,-0.030228954,0.018534917,0.0074023507,-0.007344268,-0.013042851,-0.015475882,-0.02301376,-0.007931551,-0.001060014,-0.008363946,0.005708264,0.0013342947,0.006976409,0.019128654,-0.02049683,0.014159335,0.00548884,0.013746301,0.021000218,-0.011732758,0.0008591438,-0.008731805,-0.018831786,0.011532694,0.00048684815,0.026924679,-0.0046950392,0.0024959538,0.0025330624,0.019890187,-0.0016271296,0.0036979485,-0.00046305027,0.015475882,0.005133888,0.007970273,-0.0005586451,0.017205464,0.0
06685994,-0.0046982663,-0.015695307,0.01126164,0.0057857083,-0.002473366,-0.0038334753,0.009248098,0.014056076,-0.014933774,-0.010099981,-0.007944458,-0.028886592,0.004791844,-0.009609503,0.004736988,0.033481598,-0.0008470432,0.0063955793,0.002445938,-0.02248456,0.0040399926,-0.040270854,-0.0066279112,0.023710756,-0.0056275935,0.0008333291,0.01177148,-0.01934808,-0.003113892,0.0031848822,-0.024665898,0.013668857,0.009383624,0.019502968,-0.040270854,-0.007292638,-0.017631406,0.016740799,-0.00464341,0.0052984566,-0.03676006,-0.013346174,0.01799281,-0.024678804,0.003475297,-0.026511645,-0.010480748,-0.0022862097,-0.007492702,0.005156476,-0.022987945,0.008822156,-0.0011713397,-0.02199408,-0.0045369244,-0.0437042,-0.012216782,-0.03603725,0.026847234,0.020096704,0.036011435,-0.0075765993,0.024175419,-0.014740164,-0.00399159,0.010990587,0.008092892,0.016366487,0.0017925047,0.034178596,0.029454514,-0.0008704377,0.009364264,0.006340723,0.028499372,0.01804444,0.0015504924,0.008344585,-0.008228419,-0.0037528046,-0.005524335,0.013888281,-0.008822156,-0.00588574,-0.014081891,-0.007299092,0.009002859,0.013836652,0.0007349108,0.006363311,0.036682617,-0.022549096,0.018741434,-0.015901824,0.021439066,-0.0162116,0.00012140952,-0.009435254,0.009131932,-0.0062632794,0.01808316,-0.017502332,-0.027983079,0.017153835,-0.0022410343,0.03608888,-0.011151928,0.001871562,0.00022749159,-0.022497466,-0.0065440135,-0.019567505,-0.011894099,-0.044736788,-0.016869873,0.00032772502,-0.004278778,0.023852736,-0.018354215,-0.015024126,0.013836652,0.0062181037,0.025814649,0.0026347076,0.037457056,0.00745398,-0.010629182,-0.0040141777,-0.005459798,-0.0218521,0.0029186688,0.0071893795,0.015230643,-0.025362892,-0.003133253,0.0042336024,-0.016818244,-0.039109193,-0.028137967,0.007202287,0.0004933018,0.029480329,-0.028008893,-0.022820149,-0.032939494,0.0077121262,-0.016637541,0.002531449,-0.02489823,-0.039780375,-0.015811473,-0.0075314236,-0.009880557,0.01996763,-0.010945411,-0.02580174,0.010442025,-0.010119
342,0.0070086773,-0.016534282,-0.030564545,0.023168648,-0.0027557136,0.00060906436,0.018625269,0.0084413905,-0.022161877,-0.000673601,0.016250322,0.022936316,-0.014778886,-0.016456839,-0.0030461287,0.005098393,0.02001926,-0.002992886,-0.011939275,-0.017695941,-0.012436207,-0.0036398654,0.006666633,-0.000830909,-0.02171012,-0.020806607,-0.005388808,-0.020858236,-0.016392302,-0.005840564,0.008583371,-0.03131317,-0.006744077,-0.003843156,-0.031003393,0.006014813,-0.0005441244,-0.0100031765,0.0069957697,0.040012706,-0.02754423,-0.010145157,-0.018238049,0.013617227,-0.032681346,0.001777984,-0.0055695106,-0.023568774,0.0253758,-0.020419387,-0.019283542,0.00065424,0.016521376,0.0005844598,0.012352309,0.008860879,0.024588453,0.023697848,-0.010222601,-0.025117652,-0.015024126,0.01177148,-0.0015650131,-0.0005465445,0.010835699,-0.030564545,0.01755396,-0.0050015883,0.011042216,-0.00568245,-0.029170552,-0.010261323,-0.01963204,0.04215532,-0.015540418,-0.011351991,-0.032190867,0.003459163,-0.0073378137,0.034901407,-0.0024523917,-0.008396215,0.0033591313,0.033455785,0.018935045,0.0006772312,0.005653408,0.00340108,0.00967404,-0.018534917,-0.0121006165,-0.0049596396,0.001681179,0.02301376,0.0058954204,-0.016314859,-0.0068279747,0.009190015,-0.019373894,-0.00075185165,-0.0038947852,0.013359081,0.0055275615,-0.0010293592,0.00006201566,0.017760478,-0.010087074,-0.010041898,-0.0036398654,0.015604955,0.023517145,-0.010074167,0.010822792,0.0070603066,-0.022678168,0.0028218639,0.017205464,0.0062019695,0.013849559,-0.0074733407,0.004817659,-0.01046784,-0.019193191,-0.0038528363,-0.005727625,0.017670127,0.014314223,-0.027311899,0.001294766,0.0009309408,0.0044239853,-0.016314859,-0.0021894048,0.019709485,-0.021439066,0.0013157404,0.006095484,-0.021826286,-0.014611091,-0.029454514,0.0101387035,0.007776663,-0.01203608,0.021142198,0.013055759,-0.0035624215,-0.01085506,-0.012887963,0.0039076926,-0.013772115,-0.0018199327,-0.018702714,0.007860561,-0.013100934,0.0043271803,-0.045898445,0.031855278
,-0.019219005,0.008351039,-0.026330942,0.014094798,0.004217468,-0.0058115227,0.011726304,0.009073849,0.01504994,-0.013436525,0.00025391128,-0.0007175666,-0.0025604905,0.009073849,0.020625904,-0.0061761546,-0.012042534,0.0017505558,0.0027524868,-0.004569193,0.036889132,0.22551677,-0.011422982,0.0031510005,0.045330524,-0.00017263547,0.03632121,0.016495561,0.003342997,-0.025388706,0.009499791,-0.027002122,0.012326495,0.013694671,0.00037007718,0.0026056662,-0.028576816,-0.01630195,-0.01741198,-0.037353795,-0.019864373,-0.001844134,-0.0023555867,-0.016043805,-0.019231914,0.006769892,-0.011836017,-0.0029218956,-0.0087124435,0.018973768,0.027828192,-0.008525288,-0.0021329354,0.004178746,-0.0054178494,-0.016611727,0.008635,-0.004891876,-0.0011818269,0.0036366386,0.005937369,-0.019606225,0.010596913,-0.00615034,0.030177325,-0.01256528,0.02493695,-0.00948043,0.01263627,0.015075755,0.014791794,-0.027802376,-0.020522647,0.03392045,0.061438866,-0.015669491,0.010261323,-0.003820568,0.003514019,-0.007370082,0.00032328814,-0.0041174367,0.015398438,-0.025479058,0.017670127,-0.012113524,0.009686947,-0.03864453,0.019954724,0.016844058,-0.013643042,0.0046143685,-0.03053873,-0.015992176,-0.01683115,-0.032965306,-0.01640521,0.029015666,0.003910919,-0.010332313,0.017089298,0.011345538,-0.0366568,-0.010054805,-0.021064753,-0.025078932,-0.046931032,0.015927639,-0.0025298356,-0.009777298,0.02402053,-0.013230007,-0.0069828625,-0.015024126,-0.02010961,0.01760559,-0.011371353,0.009396533,0.00726037,-0.026589088,-0.008002541,-0.024317399,-0.013927003,0.009641771,0.005714718,-0.016121248,0.020225778,0.0010366195,0.012784705,-0.01237167,0.0050144955,0.012029626,-0.019412616,0.01073244,0.007099028,-0.019993445,0.018418752,0.0027783015,0.007918644,0.027105382,-0.03193272,0.0015980881,0.011248733,0.012384578,-0.0057243984,0.0045756465,-0.024730433,-0.007563692,0.0094288,0.0025943723,-0.02981592,-0.0077895704,-0.017089298,0.018212235,-0.011061577,-0.0068989648,0.007963819,-0.000080267426,0.0051693833,
-0.004314273,0.016327765,-0.01111966,0.0049402784,-0.0058825132,0.020819515,0.022432929,-0.0154242525,0.008880239,0.009015766,0.0031493872,-0.013668857,-0.010112889,-0.01543716,0.00764759,-0.02629222,0.012804066,-0.026356757,-0.036734246,-0.02803471,-0.016469747,0.029273812,-0.030796876,0.010461386,0.02513056,0.002694404,-0.024446473,-0.030693617,-0.16603982,0.03203598,0.02329772,-0.004624049,0.018289678,0.0037366704,0.011777934,0.001595668,0.02010961,0.0014803087,0.021684306,0.0029590041,-0.034953035,0.009712761,0.026460014,0.014198056,0.001739262,0.013100934,0.0018279998,0.008312317,0.023891458,-0.020819515,-0.0058599254,-0.011797295,-0.003005793,-0.012081255,0.007415258,0.022497466,-0.0024201234,0.005459798,-0.017773386,-0.009570781,0.033068564,0.004998361,0.0109518645,0.012971861,-0.01635358,-0.022148969,0.00041041258,0.02909311,0.02151651,0.007834746,0.029867548,-0.0014561075,0.0048047514,0.020264499,0.0057405327,0.00075548183,0.013836652,-0.015992176,-0.006308455,-0.019838559,-0.008964137,-0.010822792,0.009506244,0.023839828,0.014727257,0.007053853,-0.0016400369,-0.02301376,-0.008538195,-0.018457474,0.005369447,-0.017902458,-0.016069619,-0.020483924,-0.0007768596,-0.007279731,-0.010345221,0.012752436,0.00029182652,-0.0003874214,-0.0017973449,0.0029025346,0.016676264,0.000081225386,-0.013759208,0.030409656,-0.01281052,-0.005598552,-0.022252228,0.032991122,0.011093846,-0.0009761164,0.006989316,0.0114939725,-0.010654996,-0.007776663,-0.023258999,-0.015385531,0.020587182,-0.012010265,-0.00366568,-0.0014367466,0.012694353,0.026563274,0.00372699,-0.009712761,0.00733136,0.004069034,-0.0016860192,-0.0072732773,-0.00032490154,0.03087432,0.021284178,0.024420658,0.016882781,0.011132567,0.019141562,0.010209694,-0.004081941,-0.00056832563,0.014456203,0.017373258,0.004010951,0.024975672,0.0059954524,-0.0114939725,0.033791374,0.0020022488,0.0488155,-0.0007268437,-0.021103475,-0.0019231914,-0.010132249,-0.007376536,-0.06908,-0.021981174,0.02320737,-0.00017374469,-0.01452074,0
.012203875,-0.008280048,0.00582443,-0.014004447,0.009577234,0.00085027,-0.046724513,-0.0006606937,-0.012081255,0.008822156,-0.0060051326,-0.01053883,-0.001085022,-0.008744712,0.015037033,0.0039786827,0.011887645,0.011429436,0.006553694,-0.011635953,-0.0018167059,-0.021542324,0.035236996,0.009467523,0.012210329,0.0012850855,0.010945411,-0.003685041,-0.01924482,-0.02160686,0.0018957633,-0.021426158,-0.01256528,0.0034882044,-0.056895487,0.008486566,0.025066024,-0.013139656,-0.03211342,-0.014598184,-0.009519151,0.010713079,0.01111966,0.016727893,-0.022213506,-0.034462556,-0.02373657,-0.014172242,0.0023975356,0.023955993,-0.006553694,0.016856967,-0.008157429,-0.00274442,-0.00054896466,-0.0016126088,0.002073239,-0.0033559042,0.017670127,0.00063891255,0.004543378,-0.0064343014,-0.021400344,0.010519469,-0.019167377,-0.020006353,0.0033881727,-0.035004664,-0.0036430922,-0.033507414,-0.016082527,-0.01804444,-0.013552691,0.036966577,-0.01510157,-0.011732758,-0.011584324,0.023413887,-0.023568774,0.03781846,0.019812742,-0.007641136,0.010590459,0.0005154863,-0.00523392,-0.021361621,0.01640521,0.015617862,-0.028137967,-0.008570463,0.015398438,0.006511745,0.026279312,0.015617862,-0.0060470817,-0.0014754685,0.012642724,-0.056482453,0.016663356,-0.00073975103,-0.00044731947,0.015127384,0.0018538145,0.0026314808,-0.0015593661,0.013965725,0.0052113323,-0.020329036,0.0011616593,0.0051242076,0.008822156,-0.03286205,-0.007796024,0.006418167,0.018108977,0.005059671,-0.0050403103,0.0023733343,0.016702078,0.0072668237,-0.00027851586,-0.018935045,-0.012655632,-0.0039044656,0.007370082,-0.019709485,-0.0044562537,0.02010961,-0.027002122,-0.026589088,0.042516727,-0.009544967,-0.031752016,0.008415575,0.008718898,0.032061793,0.018922137,-0.010893782,-0.008951229,0.011861831,-0.026421294,-0.015204828,0.01261691,-0.0047724834,0.017115112,0.013888281,-0.012087709,-0.0188576,0.023930179,0.005362993,-0.015475882,-0.00940944,-0.035133738,0.029996622,-0.0118295625,0.008518834,-0.008835063,0.030074066,0.01
4533647,0.021619769,0.0013907643,0.014727257,-0.016418116,-0.0070022233,0.008467205,0.011603685,-0.052713513,-0.016624633,-0.006363311,0.013010583,0.018935045,-0.004817659,0.010048352,0.0034688434,-0.0025685576,-0.009351357,0.0162116,0.020432295,-0.008112254,-0.04086459,0.004217468,0.029609403,0.030512914,0.0010366195,0.0035269265,0.00047636093,0.010584006,-0.012074802,0.008757619,0.0042949123,-0.0037108557,-0.018922137,0.040064335,-0.022123154,-0.013384895,0.0016779521,0.016250322,-0.010016084,-0.006169701,0.0044820686,-0.030358028,-0.023271905,0.01679243,-0.029454514,-0.01996763,0.001184247,0.0051984247,0.036992393,0.011061577,-0.017812107,0.0058986475,-0.00928682,0.017115112,-0.0103387665,-0.023452608,-0.027286084,0.019451339,0.018147698,0.022161877,-0.0008631773,-0.03714728,0.010603367,0.024394844,0.026124425,-0.014236779,-0.006279413,0.011739211,0.008209058,-0.011268094,0.008822156,0.0047595757,0.011287455,0.012081255,-0.024007624,0.03226831,-0.017050575,0.03892849,0.009764391,-0.022949222,0.0088996,-0.036114693,0.010164518,0.02137453,-0.004262644,-0.011235826,-0.015863102,-0.013397803,-8.476734e-7,0.025775926,0.0067505306,-0.035185367,-0.014314223,-0.029196369,0.0077895704,-0.002473366,-0.020045076,0.015179014,0.00095272186,0.030616174,-0.009351357,-0.007602414,-0.013617227,0.030667802,0.02195536,-0.0010705012,-0.028783333,-0.0087834345,-0.013384895,-0.017683035,-0.03231994,0.02363331,0.0010196787,0.015540418,-0.0067892526,0.01237167,0.015876008,0.008551102,0.0058728326,-0.020729164,-0.0326039,0.020290313,-0.0016174491,-0.0043045925,0.012739529,-0.012190968], 9 "numCandidates": 100, 10 "limit": 20 11 } 12 }, { 13 "$group": { 14 "_id": null, 15 "docs": {"$push": "$$ROOT"} 16 } 17 }, { 18 "$unwind": { 19 "path": "$docs", 20 "includeArrayIndex": "rank" 21 } 22 }, { 23 "$addFields": { 24 "vs_score": { 25 "$multiply": [ 26 vector_weight, { 27 "$divide": [ 28 1.0, { 29 "$add": ["$rank", 60] 30 } 31 ] 32 } 33 ] 34 } 35 } 36 }, { 37 "$project": { 38 "vs_score": 1, 39 
"_id": "$docs._id", 40 "title": "$docs.title" 41 } 42 }, 43 { 44 "$unionWith": { 45 "coll": "movies", 46 "pipeline": [ 47 { 48 "$search": { 49 "index": "rrf-full-text-search", 50 "phrase": { 51 "query": "star wars", 52 "path": "title" 53 } 54 } 55 }, { 56 "$limit": 20 57 }, { 58 "$group": { 59 "_id": null, 60 "docs": {"$push": "$$ROOT"} 61 } 62 }, { 63 "$unwind": { 64 "path": "$docs", 65 "includeArrayIndex": "rank" 66 } 67 }, { 68 "$addFields": { 69 "fts_score": { 70 "$multiply": [ 71 full_text_weight, { 72 "$divide": [ 73 1.0, { 74 "$add": ["$rank", 60] 75 } 76 ] 77 } 78 ] 79 } 80 } 81 }, 82 { 83 "$project": { 84 "fts_score": 1, 85 "_id": "$docs._id", 86 "title": "$docs.title" 87 } 88 } 89 ] 90 } 91 }, 92 { 93 "$group": { 94 "_id": "$title", 95 "vs_score": {"$max": "$vs_score"}, 96 "fts_score": {"$max": "$fts_score"} 97 } 98 }, 99 { 100 "$project": { 101 "_id": 1, 102 "title": 1, 103 "vs_score": {"$ifNull": ["$vs_score", 0]}, 104 "fts_score": {"$ifNull": ["$fts_score", 0]} 105 } 106 }, 107 { 108 "$project": { 109 "score": {"$add": ["$fts_score", "$vs_score"]}, 110 "_id": 1, 111 "title": 1, 112 "vs_score": 1, 113 "fts_score": 1 114 } 115 }, 116 {"$sort": {"score": -1}}, 117 {"$limit": 10} 118 ])
[
  { _id: 'Star Wars: Episode IV - A New Hope', vs_score: 0.0016666666666666668, fts_score: 0, score: 0.0016666666666666668 },
  { _id: 'Star Wars: Episode I - The Phantom Menace', vs_score: 0.0016393442622950822, fts_score: 0, score: 0.0016393442622950822 },
  { _id: 'Star Wars: Episode V - The Empire Strikes Back', vs_score: 0.0016129032258064516, fts_score: 0, score: 0.0016129032258064516 },
  { _id: 'Star Wars: Episode VI - Return of the Jedi', vs_score: 0.0015873015873015873, fts_score: 0, score: 0.0015873015873015873 },
  { _id: 'Star Wars: The Clone Wars', vs_score: 0.0015625, fts_score: 0, score: 0.0015625 },
  { _id: 'Message from Space', vs_score: 0.0015384615384615387, fts_score: 0, score: 0.0015384615384615387 },
  { _id: 'Star Wars: Episode II - Attack of the Clones', vs_score: 0.0014925373134328358, fts_score: 0, score: 0.0014925373134328358 },
  { _id: 'Guardians of the Galaxy', vs_score: 0.0014705882352941176, fts_score: 0, score: 0.0014705882352941176 },
  { _id: 'Abiogenesis', vs_score: 0.0014285714285714286, fts_score: 0, score: 0.0014285714285714286 },
  { _id: 'Dune', vs_score: 0.0014084507042253522, fts_score: 0, score: 0.0014084507042253522 }
]
If you sort the results in ascending order by replacing the value of score on line 116 with 1, Atlas Vector Search returns the following results:
[
  { _id: 'Cowboys & Aliens', vs_score: 0.0012658227848101266, fts_score: 0, score: 0.0012658227848101266 },
  { _id: 'Planet of the Apes', vs_score: 0.001298701298701299, fts_score: 0, score: 0.001298701298701299 },
  { _id: 'Starcrash', vs_score: 0.0013157894736842105, fts_score: 0, score: 0.0013157894736842105 },
  { _id: 'Zathura: A Space Adventure', vs_score: 0.0013333333333333335, fts_score: 0, score: 0.0013333333333333335 },
  { _id: 'Space Raiders', vs_score: 0.0013513513513513514, fts_score: 0, score: 0.0013513513513513514 },
  { _id: 'Star Wars: Episode III - Revenge of the Sith', vs_score: 0.0013698630136986301, fts_score: 0, score: 0.0013698630136986301 },
  { _id: 'The Ewok Adventure', vs_score: 0.001388888888888889, fts_score: 0, score: 0.001388888888888889 },
  { _id: 'Dune', vs_score: 0.0014084507042253522, fts_score: 0, score: 0.0014084507042253522 },
  { _id: 'Abiogenesis', vs_score: 0.0014285714285714286, fts_score: 0, score: 0.0014285714285714286 },
  { _id: 'Guardians of the Galaxy', vs_score: 0.0014705882352941176, fts_score: 0, score: 0.0014705882352941176 }
]
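About the Query
The sample query retrieves the sorted search results from the semantic search and the full-text search, and assigns a reciprocal rank score to each document based on its position in the results array. The reciprocal rank score is calculated by using the following formula:
1.0 / (document position in the results + constant value)
The query then adds the scores from both searches for each document, ranks the documents based on the combined score, and sorts the documents to return a single result.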
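Query Variables
The sample query defines the following variables to add weight to the score. Because each weight multiplies the reciprocal rank, a higher number gives that search method more weight:
vector_weight
full_text_weight
The weighted reciprocal rank score is calculated by using the following formula:
weight x reciprocal rank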
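As a worked check of the formula, the following sketch reproduces the arithmetic that the sample query's $addFields stages perform, using the weights defined in the query (0.1 and 0.9), the rank constant 60, and a zero-based position. The function name weightedReciprocalRank is a hypothetical helper, not part of any MongoDB API.

```javascript
// Weighted reciprocal rank as the sample query computes it:
// weight * (1.0 / (rank + 60)), where rank is the zero-based array position.
function weightedReciprocalRank(weight, rank, rankConstant = 60) {
  return weight * (1.0 / (rank + rankConstant));
}

const vectorWeight = 0.1;   // vector_weight in the sample query
const fullTextWeight = 0.9; // full_text_weight in the sample query

// Score of the first document (position 0) returned by each search method.
const topVectorScore = weightedReciprocalRank(vectorWeight, 0);
const topFullTextScore = weightedReciprocalRank(fullTextWeight, 0);

console.log(topVectorScore, topFullTextScore);
```

With these weights, the top vector result scores roughly 0.00167, which corresponds to the first vs_score in the sample output, while a top full-text match would score 0.9 / 60 = 0.015. The larger weight therefore contributes about nine times as much to the combined score at the same position.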
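Query Stages
The sample query uses the following pipeline stages to perform a semantic search on the collection and retrieve the reciprocal rank of the documents in the results:
$vectorSearch: Searches the plot_embedding field for the string star wars, specified as vector embeddings in the queryVector field of the query. The query uses text-embedding-ada-002 embeddings, the same as the vector embeddings in the plot_embedding field. The query also specifies a search for up to 100 nearest neighbors and limits the results to 20 documents. This stage returns the sorted documents from the semantic search in the results.
$group: Groups all the documents in the results from the semantic search in a field named docs.
$unwind: Unwinds the array of documents in the docs field and stores the zero-based position of each document in the results array in a field named rank.
$addFields: Adds a new field named vs_score that contains the reciprocal rank score for each document in the results. Here, the reciprocal rank score is calculated by dividing 1.0 by the sum of rank and the rank constant value of 60. Then, the weighted reciprocal rank is calculated by multiplying the vector_weight weight by the reciprocal rank score.
$project: Includes only the following fields in the results: vs_score, _id, title.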
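The sample query uses the $unionWith stage, with the following pipeline stages, to perform a text search on the collection and retrieve the reciprocal rank of the documents in the results:
$search: Searches for movies that contain the term star wars in the title field. This stage returns the sorted documents from the full-text search in the results.
$limit: Limits the output to 20 results.
$group: Groups all the documents from the full-text search in a field named docs.
$unwind: Unwinds the array of documents in the docs field and stores the zero-based position of each document in the results array in a field named rank.
$addFields: Adds a new field named fts_score that contains the reciprocal rank score for each document in the results. Here, the reciprocal rank score is calculated by dividing 1.0 by the sum of rank and the rank constant value of 60. Then, the weighted reciprocal rank is calculated by multiplying the full_text_weight weight by the reciprocal rank score.
$project: Includes only the following fields in the results: fts_score, _id, title.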
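Both halves of the sample query repeat the same four rank-scoring stages ($group, $unwind, $addFields, $project) and differ only in the score field name and the weight. One way to make that pattern explicit is a small builder function. The name rankScoreStages is a hypothetical helper for illustration and is not part of mongosh or the Atlas API.

```javascript
// Hypothetical helper: builds the four stages that turn an ordered result set
// into weighted reciprocal rank scores, mirroring the sample query.
function rankScoreStages(scoreField, weight, rankConstant = 60) {
  return [
    // Collect every result document into one array, preserving order.
    { $group: { _id: null, docs: { $push: "$$ROOT" } } },
    // Re-expand the array, recording each document's zero-based position.
    { $unwind: { path: "$docs", includeArrayIndex: "rank" } },
    // weight * (1.0 / (rank + rankConstant))
    {
      $addFields: {
        [scoreField]: {
          $multiply: [weight, { $divide: [1.0, { $add: ["$rank", rankConstant] }] }],
        },
      },
    },
    // Keep only the score, the original _id, and the title.
    { $project: { [scoreField]: 1, _id: "$docs._id", title: "$docs.title" } },
  ];
}

const vectorStages = rankScoreStages("vs_score", 0.1);
const fullTextStages = rankScoreStages("fts_score", 0.9);
console.log(JSON.stringify(fullTextStages[2]));
```

In a pipeline, these arrays could be spread after the $vectorSearch stage and, inside $unionWith, after the $search and $limit stages, for example [searchStage, { $limit: 20 }, ...fullTextStages].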
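The sample query uses the following stages to combine the results of the semantic search and the text search, and to return a single ranked list of documents in the results:
$group: Groups the documents from both searches by title and keeps the maximum vs_score and fts_score for each title.
$project: Replaces a missing vs_score or fts_score with 0, so that a document returned by only one search still has both fields.
$project: Adds a field named score that contains the sum of fts_score and vs_score.
$sort: Sorts the documents by score in descending order.
$limit: Limits the output to 10 results.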
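Learn by Watching
Watch a demonstration of an application that showcases hybrid search queries, which combine Atlas Search full-text search and vector search to return a single merged result set. The application implements Relative Score Fusion (RSF) and Reciprocal Rank Fusion (RRF) to return a merged set created by using a rank fusion algorithm.
Duration: 2.43 minutes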