Overview
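In this guide, you can learn how to use the MongoDB Vector Search feature in the Java driver. The Aggregates builders class provides the vectorSearch() helper method, which you can use to create a $vectorSearch pipeline stage. This pipeline stage allows you to perform a semantic search on your documents. A semantic search is a type of search that locates information that is similar in meaning, but not necessarily identical, to the search term or phrase you provide.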
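To make "similar in meaning" more concrete, the following is a minimal sketch of how closeness between two embedding vectors can be measured with cosine similarity. It uses only the Java standard library and does not call the driver. The class name, the method name, and the three-dimensional vectors are hypothetical values chosen for illustration. The similarity function that a real search uses depends on how the vector search index is defined.

```java
public class CosineSimilaritySketch {

    // Returns the cosine of the angle between two vectors of equal length.
    // Values near 1 mean the vectors point in nearly the same direction.
    public static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        // Note: the result is NaN if either vector contains only zeros
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical embeddings; real embeddings usually have hundreds
        // or thousands of dimensions and come from an embedding model
        float[] timeTravel = {0.9f, 0.1f, 0.3f};
        float[] timeMachine = {0.8f, 0.2f, 0.4f};
        float[] cookingShow = {0.1f, 0.9f, 0.0f};

        System.out.println("time travel vs. time machine: "
                + cosineSimilarity(timeTravel, timeMachine));
        System.out.println("time travel vs. cooking show: "
                + cosineSimilarity(timeTravel, cookingShow));
    }
}
```

In this sketch, the first pair of vectors scores higher than the second, which is the kind of ordering a semantic search relies on when it ranks documents.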
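Perform a Vector Search
To use this feature, you must create a vector search index and index your vector embeddings. To learn how to programmatically create a vector search index, see the MongoDB Search and Vector Search Indexes section in the Indexes guide. To learn more about vector embeddings, see How to Index Vector Embeddings for Vector Search in the Atlas documentation.
After you create a vector search index on your vector embeddings, you can reference this index in your pipeline stage, as shown in the following section.
Vector Search Example
The following example shows how to build an aggregation pipeline that uses the vectorSearch() and project() methods to compute a vector search score: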
// Create an instance of the BinaryVector class as the query vector
BinaryVector queryVector = BinaryVector.floatVector(
        new float[]{0.0001f, 1.12345f, 2.23456f, 3.34567f, 4.45678f});

// Specify the index name for the vector embedding index
String indexName = "mflix_movies_embedding_index";

// Specify the path of the field to search on
FieldSearchPath fieldSearchPath = fieldPath("plot_embedding");

// Limit the number of matches to 1
int limit = 1;

// Create a pre-filter to only search within a subset of documents
VectorSearchOptions options = exactVectorSearchOptions()
        .filter(gte("year", 2016));

// Create the vectorSearch pipeline stage
List<Bson> pipeline = asList(
        vectorSearch(
                fieldSearchPath,
                queryVector,
                indexName,
                limit,
                options),
        project(
                metaVectorSearchScore("vectorSearchScore")));
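Tip
Query Vector Type
The preceding example creates an instance of BinaryVector to serve as the query vector, but you can also create a List of Double instances. However, we recommend that you use the BinaryVector type to improve storage efficiency.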
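As a rough illustration of the storage difference, the following sketch estimates how many bytes a vector value occupies in BSON when it is encoded as an array of doubles versus as a packed float32 binary vector. It uses only the Java standard library. The class and method names are hypothetical, and the byte layouts in the comments are assumptions based on the BSON array and binary vector encodings, so treat the numbers as estimates rather than exact measurements.

```java
public class VectorStorageSketch {

    // Estimated size of a BSON array that holds n doubles.
    // Assumed layout: int32 length, then for each element a type byte,
    // the element index as a null-terminated string, and an 8-byte double,
    // followed by a single terminating byte.
    public static int doubleArrayBytes(int n) {
        int size = 4 + 1; // int32 length + terminating 0x00
        for (int i = 0; i < n; i++) {
            int keyBytes = String.valueOf(i).length() + 1; // index digits + null terminator
            size += 1 + keyBytes + 8;
        }
        return size;
    }

    // Estimated size of a BSON binary value that holds n float32 elements.
    // Assumed layout: int32 length, a subtype byte, a 2-byte vector header
    // (data type and padding), then 4 bytes per element.
    public static int float32VectorBytes(int n) {
        return 4 + 1 + 2 + 4 * n;
    }

    public static void main(String[] args) {
        // 5 matches the query vector in the example above;
        // 1536 is a hypothetical embedding size
        for (int n : new int[]{5, 1536}) {
            System.out.println(n + " dimensions: double array ~"
                    + doubleArrayBytes(n) + " bytes, float32 vector ~"
                    + float32VectorBytes(n) + " bytes");
        }
    }
}
```

Both estimates cover only the value itself and leave out the field name and type byte of the enclosing document, which are the same in either case.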
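The following example shows how to run the aggregation and print the vector search score from the result of the preceding aggregation pipeline: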
Document found = collection.aggregate(pipeline).first();
double score = found.getDouble("vectorSearchScore").doubleValue();

System.out.println("vectorSearch score: " + score);
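Conceptually, the pipeline in this example narrows the documents with the pre-filter, compares the query vector against every remaining embedding, and keeps the closest matches up to the limit. The following sketch models that behavior in memory with the Java standard library only. It is not how the server runs the search, and the Movie class, the method names, and the use of squared Euclidean distance are assumptions made for illustration. In a real search, the similarity function comes from the index definition.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class ExactSearchSketch {

    // Hypothetical stand-in for a stored document with an embedding field
    public static class Movie {
        public final String title;
        public final int year;
        public final float[] plotEmbedding;

        public Movie(String title, int year, float[] plotEmbedding) {
            this.title = title;
            this.year = year;
            this.plotEmbedding = plotEmbedding;
        }
    }

    // Squared Euclidean distance; a smaller value means a closer match
    public static double squaredDistance(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Applies the pre-filter, ranks every remaining movie by distance
    // to the query vector, and returns at most `limit` movies
    public static List<Movie> exactSearch(
            List<Movie> movies, float[] query, int minYear, int limit) {
        return movies.stream()
                .filter(m -> m.year >= minYear)
                .sorted(Comparator.comparingDouble(
                        (Movie m) -> squaredDistance(m.plotEmbedding, query)))
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical two-dimensional embeddings
        List<Movie> movies = Arrays.asList(
                new Movie("Older match", 2010, new float[]{1.0f, 0.0f}),
                new Movie("Recent, unrelated", 2018, new float[]{0.0f, 1.0f}),
                new Movie("Recent, related", 2020, new float[]{0.9f, 0.1f}));

        List<Movie> hits = exactSearch(movies, new float[]{1.0f, 0.0f}, 2016, 1);
        for (Movie hit : hits) {
            System.out.println(hit.title);
        }
    }
}
```

The sketch also shows why the pre-filter matters: the closest vector overall belongs to a movie from 2010, but it is excluded before any comparison takes place.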
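Query an Auto-Embedding Index
You can query an auto-embedding MongoDB Vector Search index to automatically generate vectors for your text searches. To learn how to create an auto-embedding index, see MongoDB Auto-Embedding Search Index Model.
The following example constructs a MongoDB Vector Search query that searches the plot field for semantic similarity to the phrase time travel. The query uses an auto-embedding MongoDB Vector Search index named auto_embedding_index on the plot field: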
List<Bson> pipeline = asList(
        vectorSearch(
                fieldPath("plot"),
                textQuery("time travel"),
                "auto_embedding_index",
                10L,
                approximateVectorSearchOptions(150L)
        ),
        project(
                fields(include("title", "plot"), excludeId())
        )
);

List<Document> results = collection.aggregate(pipeline).into(new ArrayList<>());

for (Document doc : results) {
    System.out.println("Title: " + doc.getString("title"));
    System.out.println("Plot: " + doc.getString("plot"));
    System.out.println("---");
}
Title: Manuel on the Island of Wonders
Plot: Manuel's fantasy travel through Time goes from Long Ago (Episode 1 - O jardim proibido / Le Jardin interdit) through Now (Episode 2 - O pique-nique dos sonhos / Le Pique-nique des rèves), ...
---
Title: 11 Minutes Ago
Plot: Traveling in 11-minute increments, a time-tumbler from 48-years in the future spends two years of his life weaving through a two-hour wedding reception.
---
Title: Time Freak
Plot: A neurotic inventor creates a time machine and gets lost traveling around yesterday.
---
Title: Timecrimes
Plot: A man accidentally gets into a time machine and travels back in time nearly an hour. Finding himself will be the first of a series of disasters of unforeseeable consequences.
---
Title: The Little Girl Who Conquered Time
Plot: A high-school girl acquires the ability to time travel.
---
Title: Time Traveller
Plot: A high-school girl acquires the ability to time travel.
---
Title: Je t'aime je t'aime
Plot: Recovering from an attempted suicide, a man is selected to participate in a time travel experiment that has only been tested on mice. A malfunction in the experiment causes the man to ...
---
Title: A.P.E.X.
Plot: A time-travel experiment in which a robot probe is sent from the year 2073 to the year 1973 goes terribly wrong thrusting one of the project scientists, a man named Nicholas Sinclair into a...
---
Title: The Ah of Life
Plot: Theoretical mathematician, Nigel Kline finds himself the subject of his own vertical time study. Entering into Einstein's relativity, three versions of Nigel face off with each other, weaving time and space in a world of fluid moments.
---
Title: About Time
Plot: At the age of 21, Tim discovers he can travel in time and change what happens and has happened in his own life. His decision to make his world a better place by getting a girlfriend turns out not to be as easy as you might think.
---
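Note
When you use an auto-embedding index, provide the text to search for directly instead of a vector representation of that text.
To learn more about auto-embedding MongoDB Vector Search indexes, see the MongoDB Auto-Embedding Search Index Model section of the Indexes guide.
API Documentation
To learn more about the methods and types mentioned in this guide, see the following API documentation: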