You can implement multi-tenancy with MongoDB Vector Search so that a single instance of an application serves multiple tenants. This page describes design recommendations that apply specifically to MongoDB Vector Search. These recommendations differ from our multi-tenancy recommendations for Atlas.
Recommendations
Refer to the following recommendations when designing a multi-tenant architecture for MongoDB Vector Search.
Important
This guidance assumes that you can colocate tenants within a single VPC. Otherwise, you must maintain separate projects for each tenant, which we don't recommend for MongoDB Vector Search.
One Collection for all Tenants
We recommend storing all tenant data in a single collection,
as well as in a single database and cluster. You can distinguish between
tenants by including a tenant_id field within each document. This field
can be any unique identifier for the tenant, such as a UUID or
a tenant name. You can use this field as a pre-filter in your MongoDB Vector Search indexes and queries.
This centralized approach offers the following benefits:
Easy to model and scale.
Simplifies maintenance operations.
Efficient query routing through pre-filtering by tenant_id.
Note
Pre-filtering by tenant_id guarantees that query results never include documents from tenants that don't match the filter.
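For example, a $vectorSearch aggregation stage that pre-filters by tenant_id might look like the following sketch. The index name (vector_index), embedding path (embedding), dimension count, and tenant value are placeholder assumptions, not values from this page:

```javascript
// Sketch of a $vectorSearch pipeline that pre-filters by tenant_id.
// "vector_index", "embedding", the 1536 dimensions, and "tenant_a"
// are placeholder assumptions for illustration.
const queryVector = Array(1536).fill(0.1); // placeholder query embedding

const pipeline = [
  {
    $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector: queryVector,
      numCandidates: 100,
      limit: 10,
      // Pre-filter so results come only from a single tenant
      filter: { tenant_id: "tenant_a" }
    }
  }
];

// With a connected driver client, you would then run:
// const results = await collection.aggregate(pipeline).toArray();
```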
One Collection per Tenant or One Database per Tenant
We do not recommend storing each tenant in a separate collection or database for the following reasons:
Performance Impact: This approach can lead to varying change stream loads depending on the number of collections, which might negatively impact performance and monitoring capabilities.
No Additional Isolation: Data isolation guarantees in Atlas apply at the database level. Using separate collections within the same database provides no additional data isolation benefit. Using separate databases introduces operational complexity without meaningful security advantages for most use cases.
Instead, use one collection for all tenants. For an example of how to migrate from a collection-per-tenant model to a single collection model, see Migrating from a Collection-Per-Tenant Model.
Considerations
Consider the following strategies to mitigate potential performance issues with the recommended approach.
Tenant Size Discrepancies
If you have many tenants with relatively few vectors each, or if you experience performance issues due to unequal distribution of data (some large tenants and many small tenants), consider the following strategies.
Use Flat Indexes for Many Small Tenants
If you have many tenants (up to 1 million) and each tenant has
relatively few vectors (fewer than 10,000 vectors each), use a
flat index instead of the default Hierarchical Navigable Small Worlds index. To create a flat
index, set the indexingMethod field to flat in your
index definition.
When each tenant has a small number of vectors, queries filtered to a specific tenant are already searched exhaustively. In these cases, the Hierarchical Navigable Small Worlds graph provides no benefit but adds memory and maintenance overhead. Flat indexes eliminate this unnecessary overhead.
Flat indexes offer the following benefits for multi-tenant workloads:
Optimized for selective filters: For highly selective queries where each tenant has a small number of vectors, exhaustive scan is already the fastest path. Flat indexes support this directly, improving both latency and recall.
Predictable performance: Query latency stays within a tight band regardless of which tenant is targeted, eliminating noisy-neighbor effects across tenants.
Resource efficiency: Flat indexes eliminate the memory and maintenance overhead associated with building Hierarchical Navigable Small Worlds graphs.
Example
The following index definition creates a flat index with
a tenant_id filter field:
{
  "fields": [
    {
      "type": "vector",
      "path": "<fieldToIndex>",
      "numDimensions": <numberOfDimensions>,
      "similarity": "euclidean | cosine | dotProduct",
      "indexingMethod": "flat"
    },
    {
      "type": "filter",
      "path": "tenant_id"
    }
  ]
}
Note
Flat indexes are compatible with
scalar and binary quantization.
Always include the tenant_id field as a
pre-filter in your
queries when using flat indexes.
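Combining both points above, a flat index definition that also enables scalar quantization and keeps tenant_id as a filter field might look like the following sketch. The field path (embedding) and dimension count (1536) are placeholder assumptions:

```javascript
// Sketch of a flat vector index definition with scalar quantization
// enabled and tenant_id as a pre-filter field. "embedding" and 1536
// are placeholder assumptions.
const indexDefinition = {
  fields: [
    {
      type: "vector",
      path: "embedding",
      numDimensions: 1536,
      similarity: "cosine",
      indexingMethod: "flat",
      quantization: "scalar" // flat indexes also accept "binary"
    },
    {
      type: "filter",
      path: "tenant_id" // enables tenant pre-filtering
    }
  ]
};
```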
Use HNSW Indexes on Views for Larger Tenants
For larger tenants that have more than 10,000 vectors each, use Hierarchical Navigable Small Worlds indexes on MongoDB Views to separate large tenants from smaller tenants:
Large Tenants (Top 1%):
Create a view for each large tenant.
Create a Hierarchical Navigable Small Worlds index for each view.
Maintain a record of large tenants that you check at query-time to route queries accordingly.
Small Tenants (Remaining Tenants):
Create a single view for all small tenants.
Build a single flat index for this view.
Use the tenant_id field as a pre-filter to route queries accordingly.
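The query-routing step above can be sketched as the following function. The view-naming convention (view_<tenant_id> per large tenant, a shared small_tenants_view) and the set of large-tenant IDs are assumptions for illustration:

```javascript
// Routes a tenant's query to the right target, assuming each large tenant
// has a dedicated view named "view_<tenant_id>" (HNSW index) and all small
// tenants share "small_tenants_view" (flat index). Names are hypothetical.
const largeTenants = new Set(["tenant_a", "tenant_b"]); // record of large tenants

function routeQuery(tenantId) {
  if (largeTenants.has(tenantId)) {
    // Large tenant: query its dedicated view; no pre-filter needed
    return { target: `view_${tenantId}`, filter: null };
  }
  // Small tenant: query the shared view with a tenant_id pre-filter
  return { target: "small_tenants_view", filter: { tenant_id: tenantId } };
}
```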
Example
The following example demonstrates how to create views
for large and small tenants by using mongosh:
Keep a record of your large tenants and their corresponding
tenant_id values, and then create a view for each of these
tenants:
db.createView(
  "<viewName>",
  "<collectionName>",
  [
    { "$match": { "tenant_id": "<largeTenantId>" } }
  ]
)
Create a view for the small tenants, filtering out the large tenants:
db.createView(
  "<viewName>",
  "<collectionName>",
  [
    { "$match": { "tenant_id": { "$nin": [ "<largeTenantId1>", "<largeTenantId2>", ... ] } } }
  ]
)
After creating the views, create the indexes for each view. Verify the following:
When specifying the collection name for the index, use the view name instead of the original collection name.
For large tenant views, create a Hierarchical Navigable Small Worlds index (the default).
For the small tenant view, create an index with the flat indexing method and include the tenant_id field as a pre-filter.
Refer to the Create Indexes page for instructions on creating indexes.
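For example, in mongosh you might create the flat index on the small-tenant view as follows. The view name, index name, field path, and dimension count are placeholders, and this sketch assumes mongosh's createSearchIndex helper with a vectorSearch index type:

```javascript
// mongosh sketch: create a flat vector index on the shared small-tenant view.
// "small_tenants_view", "vector_index", "embedding", and 1536 are
// placeholder assumptions.
db.getCollection("small_tenants_view").createSearchIndex(
  "vector_index",
  "vectorSearch",
  {
    fields: [
      {
        type: "vector",
        path: "embedding",
        numDimensions: 1536,
        similarity: "cosine",
        indexingMethod: "flat"
      },
      { type: "filter", path: "tenant_id" } // pre-filter field for routing
    ]
  }
);
```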
Many Large Tenants
If you have many tenants that each have a large number of vectors, consider using a partition-based system by distributing data across shards.
You can use the tenant_id field as a
shard key
to distribute the data across specific ranges
based on the tenant ID. For more information,
see Ranged Sharding.
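For example, in mongosh you could shard the shared collection on tenant_id as follows; the database and collection names are placeholders:

```javascript
// mongosh sketch: shard the shared collection on tenant_id.
// "mydb.tenant_data" is a placeholder namespace. A range-based shard key
// on tenant_id keeps each tenant's documents together on the same shard.
sh.shardCollection(
  "mydb.tenant_data",
  { tenant_id: 1 }
);
```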
Migrating from a Collection-Per-Tenant Model
To migrate from a collection-per-tenant model to a single collection model, process each tenant collection and insert the documents into a new collection.
For example, the following script uses the Node.js driver
to migrate your data from a collection-per-tenant model to a single collection model.
The script also includes a tenant_id field
for each document based on the source collection's name.
import { MongoClient } from 'mongodb';

const uri = "<connectionString>";
const sourceDbName = "<sourceDatabaseName>";
const targetDbName = "<targetDatabaseName>";
const targetCollectionName = "<targetCollectionName>";

async function migrateCollections() {
  const client = new MongoClient(uri);
  try {
    await client.connect();

    const sourceDb = client.db(sourceDbName);
    const targetDb = client.db(targetDbName);
    const targetCollection = targetDb.collection(targetCollectionName);

    const collections = await sourceDb.listCollections().toArray();
    console.log(`Found ${collections.length} collections.`);

    const BATCH_SIZE = 1000; // Define a suitable batch size based on your requirements
    let totalProcessed = 0;

    for (const collectionInfo of collections) {
      const collection = sourceDb.collection(collectionInfo.name);
      let documentsProcessed = 0;
      let batch = [];
      const tenantId = collectionInfo.name; // Uses the collection name as the tenant_id

      const cursor = collection.find({});
      for await (const doc of cursor) {
        doc.tenant_id = tenantId; // Adds a tenant_id field to each document
        batch.push(doc);

        if (batch.length >= BATCH_SIZE) {
          await targetCollection.insertMany(batch);
          totalProcessed += batch.length;
          documentsProcessed += batch.length;
          console.log(`Processed ${documentsProcessed} documents from ${collectionInfo.name}. Total processed: ${totalProcessed}`);
          batch = [];
        }
      }

      if (batch.length > 0) {
        await targetCollection.insertMany(batch);
        totalProcessed += batch.length;
        documentsProcessed += batch.length;
        console.log(`Processed ${documentsProcessed} documents from ${collectionInfo.name}. Total processed: ${totalProcessed}`);
      }
    }

    console.log(`Migration completed. Total documents processed: ${totalProcessed}`);
  } catch (err) {
    console.error('An error occurred:', err);
  } finally {
    await client.close();
  }
}

await migrateCollections().catch(console.error);