For most general use cases, a small or medium High-CPU node provides your mongot deployment with a healthy balance of resources and is an effective starting point. This configuration offers a solid foundation for common search workloads.
For more precise resource provisioning tailored to a specific workload, review the following pages:
These pages offer guidance for mission-critical applications and for cases where higher-fidelity optimization is required.
Note
Sizing resources for search and vector search workloads is an iterative process. These examples represent a starting point, but advanced considerations and measurements may be required for sizing a specific workload.
Workload Classes
mongot deployments fall into two classes:
Low-CPU (suitable for lower data volumes and vector search)
High-CPU (suitable for higher data volumes and full-text search)
Use the following guidance to select a starting configuration that matches your application's needs.
Low-CPU Workloads
The Low-CPU archetype is ideal for vector search applications or low data volumes where memory is prioritized over raw CPU power. These nodes typically have an 8:1 RAM-to-CPU ratio. A key factor in determining the appropriate size category is an estimate of your expected total vector size. To see reference vector size ranges, refer to the table in the Select a starting size step of the Introduction.
The following table shows recommendations for memory, storage, and CPU cores based on your expected workload in Low-CPU deployments:
| Workload Size Category | Default Memory (GB) | Default Storage (GB) | CPU Cores |
| --- | --- | --- | --- |
| Small | 8 - 16 | 50 - 100 | 1 - 2 |
| Medium | 32 - 64 | 200 - 400 | 4 - 8 |
| Large | 128 - 256 | 800 - 1600 | 16 - 32 |
Additional considerations:
Small: Suitable for initial testing or very small vector search applications.
Medium: Suitable for growing vector search use cases or moderate data volumes.
Large: Suitable for substantial vector search applications or other large workloads in the Low-CPU class.
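As a rough illustration of the vector-size estimate described above, the following Python sketch computes the raw size of float32 embeddings (4 bytes per dimension) and maps the result onto the Default Memory ranges in the table. The function names and thresholds are illustrative assumptions, not official sizing rules, and the estimate excludes index overhead.

```python
def estimated_vector_data_gb(num_vectors: int, dimensions: int,
                             bytes_per_value: int = 4) -> float:
    """Rough size of full-fidelity float32 vector data, excluding index overhead."""
    return num_vectors * dimensions * bytes_per_value / 1024**3


def suggest_low_cpu_category(vector_data_gb: float) -> str:
    # Thresholds loosely derived from the Default Memory column above;
    # illustrative only, not an official sizing rule.
    if vector_data_gb <= 16:
        return "Small"
    if vector_data_gb <= 64:
        return "Medium"
    return "Large"


# Example: 5 million 768-dimensional float32 embeddings
size_gb = estimated_vector_data_gb(5_000_000, 768)
print(f"{size_gb:.1f} GB -> {suggest_low_cpu_category(size_gb)}")  # prints "14.3 GB -> Small"
```

A real deployment should also budget headroom for index structures and growth, so treat a result near a threshold as a reason to size up.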
High-CPU Workloads
The High-CPU archetype is designed for general-purpose full-text search workloads where queries are more CPU-intensive. These nodes typically have a 2:1 RAM-to-CPU ratio. Key factors in determining the appropriate size category include the required throughput (QPS) and the expected indexing load. The volume of inserts can serve as a proxy for indexing load: more inserts generally indicate higher indexing activity. To see reference QPS ranges, refer to the table in the Select a starting size step of the Introduction.
The following table shows recommendations for memory, storage, and CPU cores based on your expected workload in High-CPU deployments:
| Workload Size Category | Default Memory (GB) | Default Storage (GB) | CPU Cores |
| --- | --- | --- | --- |
| Small | 4 - 8 | 100 - 200 | 2 - 4 |
| Medium | 16 - 32 | 400 - 800 | 8 - 16 |
| Large | 64 - 96 | 1600 - 2400 | 32 - 48 |
Additional considerations:
Small: A starting point for full-text search with moderate query rates. A minimum setup of two small nodes (4 CPUs total) supports roughly 40 QPS.
Medium: Suitable for more active full-text search applications with higher query throughput.
Large: Suitable for demanding full-text search, heavy indexing, or substantial query workloads.
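The two-small-node figure above (4 CPUs for roughly 40 QPS) implies a heuristic of about 10 QPS per core. As an illustrative sketch only, the following Python snippet uses that heuristic to estimate a core count for a target query rate; the function and the headroom factor are assumptions for illustration, and actual throughput varies widely with query complexity, index size, and concurrency.

```python
import math

# Heuristic from the Small-node figure above: two small nodes
# (4 CPUs total) handle roughly 40 QPS, i.e. ~10 QPS per core.
QPS_PER_CORE = 10


def cores_for_target_qps(target_qps: float, headroom: float = 1.5) -> int:
    """Estimate High-CPU cores needed, with headroom for traffic spikes.

    Illustrative only: real throughput depends heavily on query
    complexity, index size, and concurrency.
    """
    return math.ceil(target_qps * headroom / QPS_PER_CORE)


print(cores_for_target_qps(200))  # 200 QPS with 1.5x headroom -> 30
```

A result of 30 cores would land in the Large category of the table above; validating with a load test against representative queries remains essential.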
Considerations for Large Vector Search Workloads
Vector search is a key focus area for AI applications. Modern techniques like automatic binary quantization shift the primary resource constraint from RAM to storage, making indexes more storage-bound than memory-bound.
In these cases, consider a Low-CPU class node with ample storage. The storage must hold both the full-fidelity vector embeddings and the quantized versions of those embeddings. Aligning resources to the workload in this way lets you build and scale modern AI applications efficiently and economically.
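To see why such workloads become storage-bound, the following sketch estimates the combined footprint of full-fidelity float32 embeddings (4 bytes per dimension) and their binary-quantized counterparts, assuming quantization to 1 bit per dimension. The figures exclude index metadata and other overhead, so treat them as lower bounds.

```python
GIB = 1024**3  # bytes per GiB


def quantized_index_storage_gb(num_vectors: int, dimensions: int) -> dict:
    """Rough per-component storage for a binary-quantized vector index.

    Assumes float32 source embeddings (4 bytes/dimension) and binary
    quantization at 1 bit/dimension; ignores index metadata overhead.
    """
    full_fidelity = num_vectors * dimensions * 4   # float32: 4 bytes/dim
    quantized = num_vectors * dimensions / 8       # binary: 1 bit/dim
    return {
        "full_fidelity_gb": full_fidelity / GIB,
        "quantized_gb": quantized / GIB,
        "total_gb": (full_fidelity + quantized) / GIB,
    }


# Example: 50 million 1024-dimensional embeddings
sizes = quantized_index_storage_gb(50_000_000, 1024)
print({k: round(v, 1) for k, v in sizes.items()})
```

Note that the quantized representation is 32x smaller than the full-fidelity data, which is why the memory requirement drops while the storage requirement (which holds both copies) dominates.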