MongoDB Vector Search enables AI-powered semantic search capabilities in your
Kubernetes environment by deploying the mongot process alongside your
MongoDB database deployment (mongod). The mongot process manages
vector indexes, sources data from the database, and processes
$vectorSearch queries. This removes the need to keep separate systems
in sync while still providing advanced search features.
To deploy Vector Search, you apply the MongoDBSearch Custom Resource
(CR). The Kubernetes Operator picks up the CR and uses it to deploy the
mongot pods and request the persistent storage specified in the CR's spec.
For deployment procedures, see Deploy MongoDB Search and Vector Search.
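As a rough illustration of the Custom Resource described above, the following Python sketch builds a MongoDBSearch manifest as a plain dictionary and prints it as JSON. The field names spec.source.mongodbResourceRef, spec.persistence, and spec.resourceRequirements come from this page. The apiVersion value, the name subfield, the nested layout under persistence and resourceRequirements, and all sample values are assumptions, so check them against the MongoDBSearch CR settings reference before use.

```python
import json


def build_mongodb_search_manifest(name, mongodb_resource_name, storage="10Gi"):
    """Return a MongoDBSearch manifest as a dict (illustrative values only)."""
    return {
        # Assumption: API group/version used by the Kubernetes Operator CRDs.
        "apiVersion": "mongodb.com/v1",
        "kind": "MongoDBSearch",
        "metadata": {"name": name},
        "spec": {
            # Points the search resource at the MongoDB CR it serves.
            # Assumption: the reference carries a "name" subfield.
            "source": {"mongodbResourceRef": {"name": mongodb_resource_name}},
            # Assumption: nested layout of the persistence settings.
            "persistence": {"single": {"storage": storage}},
            # Assumption: Kubernetes-style requests block; values are samples.
            "resourceRequirements": {
                "requests": {"cpu": "2", "memory": "2Gi"},
            },
        },
    }


manifest = build_mongodb_search_manifest("mdb-rs", "mdb-rs")
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` once the assumed fields have been verified.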
The mongot Process
Each mongot process has its own persistent volume that is not shared
with the database or other search nodes. Storage is used to maintain
vector indexes that are built from the data continuously sourced from the
database. The index definitions (metadata) are stored in the database
itself.
The mongot performs the following actions:
Manages the vector index. The mongot is responsible for updating the index definitions in the database.
Sources the data from the database. The mongot nodes establish permanent connections to the database in order to update indexes from the database in real time.
Processes vector search queries. When mongod receives a $vectorSearch query, it directs the query to one of the mongot nodes. The mongot that receives the query processes the query, aggregates the data, and returns the results to mongod, which forwards the results to the user.
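To make the query path above concrete, here is a minimal sketch of a $vectorSearch aggregation pipeline built in Python. The index name, field path, and vector values are hypothetical placeholders. The stage options shown (index, path, queryVector, numCandidates, limit) follow the $vectorSearch stage as generally documented, but verify them against the version you deploy.

```python
def build_vector_search_pipeline(query_vector, index="vector_index",
                                 path="embedding", limit=5,
                                 num_candidates=100):
    """Build an aggregation pipeline that starts with a $vectorSearch stage.

    The index and path defaults are hypothetical; use the names from your
    own vector index definition.
    """
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Expose the similarity score alongside each matching document.
        {"$project": {"_id": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search_pipeline([0.12, -0.03, 0.58], limit=3)
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on a normal connection to mongod. The client never opens a connection to mongot.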
The mongot components are tightly coupled with a single MongoDB
replica set and cannot be shared across multiple databases or replica
sets. Each replica set deployment has its own dedicated search nodes.
Networking
Network connectivity between mongot and mongod goes in both
directions:
mongot establishes connections to the replica set to source the data used to build indexes and run queries.
mongod connects to mongot to forward search-related operations such as index management and querying the data.
The mongod acts as the proxy for all search queries. You never
interact directly with the mongot.
Operator-Managed Deployment
When both the mongot and mongod processes are deployed inside
the Kubernetes cluster, the Kubernetes Operator performs configuration for both
processes automatically. Specifically, the Kubernetes Operator:
Finds the MongoDB CR referenced by MongoDBSearch using spec.source.mongodbResourceRef, or by a naming convention by looking for the MongoDB CR with the same name as MongoDBSearch.
Generates mongot configuration in a YAML file and saves it to a config map named <MongoDBSearch.metadata.name>-search-config.
Deploys the Vector Search stateful set named <MongoDBSearch.metadata.name>-search with storage and resource requirements configured according to spec.persistence and spec.resourceRequirements in the CR.
Updates configuration of every mongod process by adding the necessary setParameter options, including the hostnames and port numbers of the mongot hosts.
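The lookup and naming rules above can be sketched in a few lines of Python. This is not the Kubernetes Operator's code. It only shows how the names relate, and it assumes that the explicit reference carries a name subfield and takes precedence over the same-name convention when both could apply.

```python
def resolve_source_name(search_cr):
    """Return the name of the MongoDB CR that a MongoDBSearch CR points to.

    Assumption: an explicit spec.source.mongodbResourceRef.name wins;
    otherwise the MongoDB CR is expected to share the MongoDBSearch name.
    """
    spec = search_cr.get("spec") or {}
    source = spec.get("source") or {}
    ref = source.get("mongodbResourceRef") or {}
    return ref.get("name") or search_cr["metadata"]["name"]


def derived_resource_names(search_cr):
    """Return the resource names derived from MongoDBSearch.metadata.name."""
    name = search_cr["metadata"]["name"]
    return {
        "mongodb_resource": resolve_source_name(search_cr),
        "config_map": f"{name}-search-config",
        "stateful_set": f"{name}-search",
    }


# Hypothetical CRs: one with an explicit reference, one relying on the name.
explicit = {
    "metadata": {"name": "search-a"},
    "spec": {"source": {"mongodbResourceRef": {"name": "mdb-rs"}}},
}
by_convention = {"metadata": {"name": "mdb-rs"}, "spec": {}}

names_explicit = derived_resource_names(explicit)
names_convention = derived_resource_names(by_convention)
```

Knowing the derived names is mostly useful when inspecting a deployment, for example when looking up the generated config map or the stateful set with kubectl.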
External MongoDB Deployment
When the MongoDB replica set is outside of Kubernetes, you deploy mongot
using the Kubernetes Operator and perform some steps manually. The
Kubernetes Operator handles configuration of the search pods, but you must
reconfigure your MongoDB nodes and the networking.
Security
If the MongoDB server is inside the Kubernetes cluster, the Kubernetes Operator automatically sets up keyfile authentication for Vector Search. If the MongoDB server is external, you must create a Kubernetes Secret containing the replica set's keyfile credential and reference it in the MongoDBSearch CR.
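For the external case, the Secret could be prepared along the following lines. This sketch only builds a standard Kubernetes Secret manifest with the keyfile contents base64-encoded. The secret name, the key name inside the Secret, and the sample keyfile value are hypothetical, and this page does not name the MongoDBSearch CR field that references the Secret, so it is not shown here.

```python
import base64
import json


def build_keyfile_secret(secret_name, keyfile_contents, key="keyfile"):
    """Return a Kubernetes Secret manifest holding a replica set keyfile.

    Assumption: the key name "keyfile" is a placeholder; use the key that
    the MongoDBSearch CR settings reference expects.
    """
    encoded = base64.b64encode(keyfile_contents.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name},
        "type": "Opaque",
        # Values under "data" must be base64-encoded strings.
        "data": {key: encoded},
    }


secret = build_keyfile_secret("mdb-rs-keyfile", "example-keyfile-contents")
print(json.dumps(secret, indent=2))
```

In practice you would read the keyfile contents from the file used by the external replica set rather than hard-coding a string.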
Limitations
You can't deploy Vector Search on the following architectures:
IBM Power (ppc64le)
IBM Z (s390x)
Tip
Deploy MongoDB Search and Vector Search: deployment procedures
MongoDB Search and Vector Search Settings: MongoDBSearch CR settings