How to Automatically Generate Vector Embeddings for Text Data in Your Collection and Queries

Automated embedding is available as a Preview feature only for MongoDB Community Edition v8.2 and later. The feature and the corresponding documentation might change at any time during the Preview period. To learn more, see Preview Features.

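You can deploy MongoDB Vector Search and enable AI-powered semantic search on your text data through automated embedding in MongoDB Vector Search indexes. The automated embedding feature turns the traditionally complex process of implementing vector search into a single-step solution. Instead of separately managing embedding infrastructure, model selection, and integration code, you can implement semantic search with a simple field configuration.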

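When you configure a MongoDB Vector Search index, MongoDB automatically generates vector embeddings for the text data in your collection by using the state-of-the-art embedding model that you choose, and keeps the embeddings in sync whenever your data changes. You can also query by using natural-language text. These vector embeddings capture meaningful relationships in your data and enable search based on intent rather than keywords.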

Overview

With a simple configuration change, you can enable semantic search, RAG, and memory for AI Agents, without writing embedding code, managing model infrastructure, or handling vector pipelines. That is, when deploying the MongoDB Community Edition with the MongoDB Search and Vector Search process, mongot, you can provide the Voyage AI API keys to use for generating embeddings, ideally one for indexing operations and another for query operations from different projects.

After deployment:

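Choose the text field in your collection for which you want to enable semantic search.
Choose an embedding model from the list of available embedding models.
Configure automated embedding by using the autoEmbed type in your MongoDB Vector Search index definition.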

To learn more about configuring a MongoDB Vector Search index for automated embedding, see Index Text Fields.

MongoDB Vector Search automatically generates embeddings for existing and new documents that you insert or update by using the API keys that you specified while initializing MongoDB Community Edition.

Note

The generated embeddings are stored in a separate system collection on the same cluster.

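For queries, use the query.text option in the $vectorSearch pipeline stage. MongoDB Vector Search generates embeddings for the text query by using the same embedding model that is in the index definition. You can specify a different embedding model by using the model option in the $vectorSearch pipeline stage, but the specified embedding model must be compatible with the embedding model used at index-time. MongoDB Vector Search generates embeddings at query-time by using the query API key that you provided while initializing MongoDB Community. To learn more, see Run a Text Query.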
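As a minimal sketch of how these two options fit together, the following Python helper builds a $vectorSearch stage as a plain dictionary. The helper name vector_search_stage, the index name vector_index, and the choice of voyage-4-large as an override are assumptions made for illustration; whether a given override model is compatible with the index-time model must be confirmed separately.

```python
from typing import Any, Dict, Optional


def vector_search_stage(
    index: str,
    path: str,
    text: str,
    num_candidates: int,
    limit: int,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a $vectorSearch stage that sends raw text rather than a vector.

    Hypothetical helper: it only assembles the stage document and does not
    contact a MongoDB deployment.
    """
    stage: Dict[str, Any] = {
        "index": index,
        "path": path,
        # query.text: the text is embedded on the server at query-time.
        "query": {"text": text},
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if model is not None:
        # Optional override. It must be compatible with the model that was
        # used at index-time; compatibility is assumed here, not verified.
        stage["model"] = model
    return {"$vectorSearch": stage}


# Default: the query uses the embedding model from the index definition.
default_stage = vector_search_stage(
    "vector_index", "summary", "close to amusement parks", 100, 10
)

# Override: name a different model explicitly (illustrative choice only).
override_stage = vector_search_stage(
    "vector_index", "summary", "close to amusement parks", 100, 10,
    model="voyage-4-large",
)
```

Omitting the model argument leaves the stage without a model key, which corresponds to the default behavior described above.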

You will incur charges for generating embeddings using the API keys. To learn more, see Costs.

To automate embeddings and run a sample query, see Get Started.

Voyage AI API Keys

While you can use a single API key for generating embeddings at index-time and at query-time, we recommend that you use separate API keys to prevent query operations from negatively impacting indexing operations.

You can generate API keys in the following ways:

(Recommended) Using your Atlas account, which allows you to manage your Voyage AI embedding model API key from the Atlas UI.
To learn more about generating and managing API keys including configuring the rate limits (which is a combination of TPM and RPM) and monitoring API key usage, see Model API Keys.
Directly from Voyage AI.
To learn more about managing the API keys created from Voyage AI, see API Key.

After creating the keys, you must specify the keys you want to use for automated embedding when configuring mongot during deployment with MongoDB Community Edition. MongoDB Vector Search uses the Voyage AI API key that you provided during deployment of mongot to automatically generate embeddings for your data at index-time and for your query text at query-time.

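Supported Embedding Models

MongoDB Vector Search integrates with Voyage AI's state-of-the-art embedding models, each optimized for specific use cases.

Embedding Model	Description
`voyage-4-lite`	Optimized for high-volume, cost-sensitive applications.
`voyage-4`	(Recommended) Balanced performance for general text search.
`voyage-4-large`	Maximum accuracy for complex semantic relationships.
`voyage-code-3`	Specialized for code search and technical documentation.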

Costs

Embedding model pricing is usage-based, with charges billed to the account linked to the API key used for access. Pricing is based on the number of tokens in your text field and queries.

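Note

In the context of embedding models and LLMs, tokens are the basic units of text, such as words, subwords, or characters, that the model processes to create embeddings or generate text. Tokens are the unit by which usage of embedding models and LLMs is billed.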

If you use the API key that you created using your Atlas account, you can monitor API key usage from the Atlas UI. To learn more, see Billing.

If you generated the API key directly from Voyage AI, see Pricing to learn more about the charge for requests to the embedding service endpoint.

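Limitations

While the automated embedding feature is in Preview, it is not yet available for the following deployment types:

Atlas clusters
Local Atlas deployments that use the Atlas CLI
MongoDB Enterprise Edition

The feature is available only for deployments of MongoDB Search and MongoDB Vector Search that use Docker, a tarball, or a package manager, and for deployments that use the MongoDB Controllers for Kubernetes Operator, with MongoDB Community Edition 8.2 or later.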

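Get Started

Use the following tutorial to learn how to configure MongoDB Vector Search to automatically generate vector embeddings. Specifically, you perform the following actions:

Index the field in your collection that contains the text data for which you want to automatically generate embeddings at index-time.
Run a text query against the indexed field by using embeddings that are automatically generated at query-time.

This tutorial uses the sample_airbnb.listingsAndReviews namespace to demonstrate how to index a text field (summary) in the collection so that embeddings are generated automatically at index-time, and how to run a text query against the indexed field (summary) by using embeddings generated at query-time.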

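Prerequisites

This tutorial uses the sample_airbnb.listingsAndReviews collection from the Atlas sample datasets. To complete it, you need the following:

A self-managed deployment of MongoDB Community Edition v8.2 or later with MongoDB Search and MongoDB Vector Search
To learn more, see Install MongoDB Community Edition.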
Valid Voyage AI API key or keys
To learn more, see Voyage AI API Keys.

Index Text Fields

The index definition that you create in this tutorial indexes the following fields in the sample_airbnb.listingsAndReviews collection:

summary field as the autoEmbed type to automatically generate embeddings for the text data in the field using the voyage-4 embedding model.
address.country field as the filter type to prefilter the data for the semantic search using the string value in the field.
bedrooms field as the filter type to prefilter the data for the semantic search using the numeric value in the field.

To create this index:
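As a minimal sketch, the index definition described above could be assembled in Python as follows. The autoEmbed and filter types, the field paths, and the voyage-4 model come from this page; the index name vector_index, the helper name build_auto_embed_index, and the "modality": "text" key are assumptions and should be checked against the index reference for your release.

```python
import json
from typing import Any, Dict, List

INDEX_NAME = "vector_index"  # hypothetical index name used for illustration


def build_auto_embed_index(
    text_path: str, model: str, filter_paths: List[str]
) -> Dict[str, Any]:
    """Return a vector search index definition with one autoEmbed field.

    Hypothetical helper: it only builds the definition document.
    """
    fields: List[Dict[str, Any]] = [
        {
            "type": "autoEmbed",
            "modality": "text",  # assumption: not stated on this page
            "path": text_path,
            "model": model,
        }
    ]
    # Each filter field lets queries prefilter documents on that path.
    fields.extend({"type": "filter", "path": p} for p in filter_paths)
    return {"fields": fields}


index_definition = build_auto_embed_index(
    "summary", "voyage-4", ["address.country", "bedrooms"]
)
print(json.dumps(index_definition, indent=2))
```

With PyMongo, a definition like this can be wrapped in SearchIndexModel(definition=index_definition, name=INDEX_NAME, type="vectorSearch") and passed to Collection.create_search_index. Whether a particular driver version accepts the autoEmbed type during the Preview period is not confirmed here.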

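Run a Text Query

The query in this tutorial does the following:

Runs against the indexed summary field in the sample_airbnb.listingsAndReviews collection.
Prefilters the properties by using the following criteria:
- Properties that have 3 or more bedrooms.
- Properties in the country named United States.
Performs a semantic search for properties that are close to amusement parks by using embeddings automatically generated with the voyage-4 embedding model. The query:
- Considers up to 100 nearest neighbors.
- Limits the results to 10 documents.

To run this query: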
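The query described above can be sketched as a Python aggregation pipeline. The filter values, the candidate count, the limit, and the query text come from this page; the index name vector_index, the function name build_tutorial_pipeline, and the $project stage (including the name field) are assumptions added to make the output readable.

```python
from typing import Any, Dict, List


def build_tutorial_pipeline() -> List[Dict[str, Any]]:
    """Build the tutorial query as an aggregation pipeline (a sketch only)."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",  # hypothetical index name
                "path": "summary",
                # Raw text; the server generates the embedding at query-time.
                "query": {"text": "close to amusement parks"},
                # Prefilter on the fields indexed with the filter type.
                "filter": {
                    "$and": [
                        {"bedrooms": {"$gte": 3}},
                        {"address.country": {"$eq": "United States"}},
                    ]
                },
                "numCandidates": 100,  # consider up to 100 nearest neighbors
                "limit": 10,           # return at most 10 documents
            }
        },
        {
            # Assumed projection: keep a few fields plus the similarity score.
            "$project": {
                "_id": 0,
                "name": 1,
                "summary": 1,
                "address.country": 1,
                "bedrooms": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_tutorial_pipeline()
```

With PyMongo, a pipeline like this would typically be passed to client["sample_airbnb"]["listingsAndReviews"].aggregate(pipeline) once the index has finished building.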
