This section answers critical planning questions about your mongot
deployment and provides actionable steps to design your search
architecture. For a comprehensive overview, read this section linearly.
Alternatively, you can use this section as a reference, skipping to
relevant sections as needed. The goal of this section is to help you
accurately estimate resource requirements and build a robust deployment.
Data Characteristics and Indexes
Your data's structure and index definition are the primary drivers of disk usage and RAM requirements. Use this section to estimate the final index size and the memory needed to manage indexes efficiently.
Calculate a baseline index size
Use the following formula as a starting point for your disk space estimation:
Estimated Index Size = (Number of Documents) x (Avg. Size of Indexed Text per Doc) x (Expansion Factor)
The Expansion Factor is critical and depends heavily on your tokenization strategy. Use these multipliers as a guide:
Use Case | Multiplier
---|---
Standard/Language Tokenization |
Keyword/Simple Tokenization |
edgeGram (Autocomplete) |
nGram (Partial Match) |
Note
The preceding expansion factor multipliers are generalized examples and may not apply to all text-based content. Test with your own data to determine the correct index size estimate for your deployment.
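The baseline formula can be sketched in code as follows. The expansion factor of 2.0 below is an assumed placeholder, not a recommended value; substitute the multiplier appropriate for your tokenization strategy after testing with your own data.

```python
def estimate_index_size_bytes(num_docs: int,
                              avg_indexed_text_bytes: float,
                              expansion_factor: float) -> float:
    """Baseline disk estimate: documents x avg indexed text per doc x expansion."""
    return num_docs * avg_indexed_text_bytes * expansion_factor

# Example: 10 million documents with 2 KB of indexed text each,
# using an assumed 2.0x expansion factor.
size = estimate_index_size_bytes(10_000_000, 2_048, 2.0)
print(f"Estimated index size: {size / 1024**3:.1f} GiB")  # ~38.1 GiB
```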
Add vector index size
For vector search, calculate the index size separately and add it to the total. The raw vector data size is a reasonable baseline for this value.
Vector Index Size = (Total Number of Vectors) x (Vector Dimensions) x 4 bytes / dimension
Note
This formula is for float32 vectors. The final HNSW index on disk will have additional overhead.
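The raw vector data calculation is straightforward; this sketch computes the baseline for an assumed example workload (the 1 million vectors and 768 dimensions below are illustrative figures, not recommendations).

```python
def vector_index_size_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw float32 vector data size: 4 bytes per dimension."""
    BYTES_PER_FLOAT32 = 4
    return num_vectors * dimensions * BYTES_PER_FLOAT32

# Example: 1 million 768-dimensional float32 embeddings.
# Remember the final HNSW index on disk adds overhead on top of this.
raw = vector_index_size_bytes(1_000_000, 768)
print(f"Raw vector data: {raw / 1024**2:.0f} MiB")
```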
Manage memory consumption to prevent instability
- Use Static Mappings
- If your document structure is unpredictable, disable dynamic mappings. Dynamic mapping can lead to a "mapping explosion" where thousands of unintended fields are created, consuming excessive RAM and causing instability. Explicitly define your index with dynamic: false. To learn more, see Dynamic and Static Mappings.
- Limit Stored Source Fields
- Store only the minimum set of fields required for your search results within the mongot index itself. Fetching fields from the primary MongoDB collection is often more efficient and reduces the disk space used by the index. To learn more, see Define Stored Source Fields in Your MongoDB Search Index.
- Account for High-Memory Features
- Synonym collections and deeply nested embeddedDocuments consume significant memory. If you use these features heavily, you must allocate more Java heap space to the mongot process.
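A hypothetical index definition applying the first two guidelines might look like the following. The field names are illustrative assumptions; the dict mirrors the JSON shape of a search index definition with static mappings and a restricted storedSource.

```python
import json

# Hypothetical search index definition: static mappings only, and
# storedSource limited to the fields the application actually displays.
index_definition = {
    "mappings": {
        "dynamic": False,  # prevent mapping explosions from unpredictable documents
        "fields": {
            "title": {"type": "string"},
            "description": {"type": "string"},
        },
    },
    "storedSource": {
        "include": ["title"]  # store only what search results need
    },
}
print(json.dumps(index_definition, indent=2))
```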
Indexing Workload
The rate and type of your writes (inserts, updates, deletes) determine the CPU and Disk IOPS needed to keep your search index synchronized with your database.
To provision for write throughput and maintenance:
Match resources to your write rate
- High Write Rate (>1,000 operations per second)
- This workload is CPU-bound and I/O-bound. Provision servers with a high CPU count and fast storage. Monitor the mongot process's CPU utilization and disk I/O queue length. If these metrics are consistently high and replication lag is growing, you need to scale up your hardware.
- Low Write Rate (<100 operations per second)
- Standard hardware configurations are usually sufficient.
Minimize replication lag
- Consolidate Indexes
- Avoid defining multiple, separate search indexes on a single collection. Each index adds overhead. Instead, create a single, comprehensive index definition.
- Materialize Complex Views
- If you are indexing a complex view, change stream performance can be a constraint. Consider creating a materialized view to provide a simple, pre-aggregated data source for mongot to index.
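One common way to build a materialized view is an aggregation that writes its output with a $merge stage, re-run on a schedule. The pipeline below is an illustrative sketch; the collection and field names (orders, customerId, order_summaries, and so on) are assumptions, not names from this document.

```python
# Hypothetical pipeline: pre-aggregate raw order data into a
# materialized view collection ("order_summaries") that mongot can
# index directly, instead of indexing a complex on-demand view.
pipeline = [
    {"$group": {
        "_id": "$customerId",
        "totalSpent": {"$sum": "$amount"},
        "lastOrder": {"$max": "$orderDate"},
    }},
    # $merge writes the results into the target collection,
    # replacing documents that already exist.
    {"$merge": {"into": "order_summaries", "whenMatched": "replace"}},
]
# With pymongo, you would run: db.orders.aggregate(pipeline)
print(pipeline)
```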
Plan for index rebuilds
Changing an index definition triggers a full, resource-intensive rebuild.
A rebuild creates a new index in addition to the old one before switching query requests over to the new index. Always ensure you have at least 125% of your current index size available as free disk space to accommodate this process without running out of storage.
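The 125% headroom rule translates directly into a small sizing check; this sketch applies it to an assumed 40 GiB index.

```python
def required_free_disk_bytes(current_index_bytes: int,
                             headroom: float = 1.25) -> float:
    """Free space needed during a rebuild: the new index is built
    alongside the old one before queries switch over."""
    return current_index_bytes * headroom

# Example: a 40 GiB index needs at least 50 GiB of free disk
# space to rebuild safely.
need = required_free_disk_bytes(40 * 1024**3)
print(f"{need / 1024**3:.0f} GiB free required")
```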
Scale horizontally with sharding
For extremely demanding write workloads, scaling a single mongot node vertically might not be sufficient.
If you anticipate sustained write loads exceeding 10,000 operations per second, sharding is the most effective scaling strategy. You need a minimum of one mongot process per shard. mongot only indexes collections within the shard it is connected to, which distributes the indexing load horizontally.
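These two rules (the 10,000 operations-per-second threshold and one mongot per shard minimum) can be captured in a small planning helper; the thresholds come directly from the guidance above.

```python
def should_consider_sharding(sustained_writes_per_sec: int) -> bool:
    """Sharding is the most effective strategy above ~10,000 sustained writes/sec."""
    return sustained_writes_per_sec > 10_000

def min_mongot_processes(num_shards: int) -> int:
    """You need at least one mongot process per shard."""
    return num_shards

print(should_consider_sharding(15_000))  # sustained 15k writes/sec -> shard
print(min_mongot_processes(4))           # 4 shards -> at least 4 mongot processes
```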
Query Workload
Query performance is primarily a function of CPU for processing and RAM for caching. The complexity and volume of your queries determine the resources needed to meet your latency targets.
To estimate required deployment sizes for query performance:
Estimate required CPU cores
To establish a CPU baseline, use your Queries per Second (QPS) target. A general starting point is 1 CPU core for every 10 QPS.
This baseline assumes simple queries. For workloads heavy with complex aggregations, fuzzy matching, or regex queries, you may only achieve 2-5 QPS per core. Conversely, simple term matching might yield 20+ QPS per core. Always test performance with a realistic query mix.
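As a sketch, the baseline and the adjusted rates can be expressed as one estimator; the 250 QPS target and the 3 QPS/core figure for a heavy fuzzy/regex mix are assumed example values within the 2-5 QPS range stated above.

```python
import math

def estimate_cpu_cores(target_qps: float, qps_per_core: float = 10.0) -> int:
    """Baseline of ~10 QPS per core for simple queries; pass a lower
    qps_per_core for complex aggregations, fuzzy matching, or regex."""
    return math.ceil(target_qps / qps_per_core)

# Simple term matching at 250 QPS vs. a heavy fuzzy/regex mix
# (assumed ~3 QPS per core). Always validate with a realistic query mix.
print(estimate_cpu_cores(250))                   # baseline estimate
print(estimate_cpu_cores(250, qps_per_core=3))   # complex-query estimate
```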
Allocate RAM for low latency
RAM is the most important factor for fast queries.
- Full-Text Search
- The goal is to fit as much of the index as possible into the operating system's file system cache. For optimal performance, provision enough RAM on the mongot node to match the total size of the index on disk. This minimizes slow disk reads.
- Vector Search
- For low-latency vector search, the HNSW graph index must exist in memory. To calculate the minimum required RAM, use the steps in the index size estimation procedure and add a buffer of 20-25% for overhead. If the vector index doesn't fit in RAM, query latency increases.
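The vector RAM calculation combines the raw float32 size with the 20-25% overhead buffer described above; the 5 million vectors and 1536 dimensions below are assumed example figures.

```python
def vector_ram_bytes(num_vectors: int, dimensions: int,
                     overhead: float = 0.25) -> float:
    """Minimum RAM for an in-memory HNSW index: raw float32 vector
    size plus a 20-25% overhead buffer."""
    raw = num_vectors * dimensions * 4  # float32: 4 bytes per dimension
    return raw * (1 + overhead)

# Example: 5 million 1536-dimensional vectors with a 25% buffer.
print(f"{vector_ram_bytes(5_000_000, 1536) / 1024**3:.1f} GiB minimum RAM")
```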
Note
To reduce RAM requirements, consider using vector quantization. Vector quantization can compress vector embeddings, which lowers their memory footprint (and thus the RAM required to hold them) often with minimal impact on recall or precision.
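Scalar quantization to int8 is one common quantization scheme; it stores 1 byte per dimension instead of 4 for float32, a 4x reduction in vector memory. This sketch illustrates the saving for the same assumed example workload.

```python
def quantized_vector_bytes(num_vectors: int, dimensions: int,
                           bytes_per_dim: int = 1) -> int:
    """Vector data size after scalar (int8) quantization: 1 byte per
    dimension instead of 4 bytes for float32."""
    return num_vectors * dimensions * bytes_per_dim

full = 5_000_000 * 1536 * 4  # float32 baseline
quant = quantized_vector_bytes(5_000_000, 1536)
print(f"Quantized vectors are {full // quant}x smaller")
```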