
Improve Vector Search Performance

On this page

  • Reduce Vector Dimensions
  • Avoid Indexing Vectors When Running Queries
  • Exclude Vector Fields From the Results
  • Ensure Enough Memory
  • Warm up the Filesystem Cache

Atlas Vector Search lets you run approximate nearest neighbor (ANN) queries, for example to find results similar to a selected product or to search for images. To improve indexing speed and query performance, review the following best practices.

Reduce Vector Dimensions

Atlas Vector Search supports vectors with up to 4096 dimensions, inclusive. However, vector search indexing and queries are computationally intensive, because larger vectors require more floating-point comparisons. Where possible, we recommend reducing the number of dimensions, but only after you can measure how changing the embedding model affects the accuracy of your vector queries.
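One way to measure the accuracy impact before switching to a lower-dimensional embedding model is to compare the top results that each model returns for the same query. The following is a minimal sketch, not an Atlas feature: the function name `recall_at_k` and the sample document IDs are hypothetical, and it assumes you have already collected the ordered result IDs for both models.

```python
def recall_at_k(baseline_ids, candidate_ids, k):
    """Return the fraction of the top-k baseline results that also
    appear in the top-k candidate results."""
    if k <= 0:
        raise ValueError("k must be positive")
    baseline_top = set(baseline_ids[:k])
    candidate_top = set(candidate_ids[:k])
    if not baseline_top:
        return 0.0
    return len(baseline_top & candidate_top) / len(baseline_top)


# Hypothetical document IDs returned for the same query text.
baseline = ["a", "b", "c", "d", "e"]   # results with the larger embedding model
candidate = ["a", "c", "e", "x", "y"]  # results with the smaller embedding model

print(recall_at_k(baseline, candidate, k=5))  # 3 of 5 results overlap: 0.6
```

Averaging this value over a representative set of queries gives a rough indication of how much result quality changes when you reduce dimensions.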

Avoid Indexing Vectors When Running Queries

Indexing vector embeddings consumes computational resources. We recommend that you avoid indexing and re-indexing while vector search queries are running. If you change the embedding model that produces the vectors, we recommend that you index the new vectors into a new index rather than updating the index that is currently in use.
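As an illustration of indexing new vectors into a separate index, the following sketch builds two vector index definitions side by side. The helper `build_vector_index_definition`, the field paths `embedding` and `embedding_v2`, the index names, and the dimension counts are all assumptions chosen for the example. The sketch only constructs the definitions, so it runs without a cluster.

```python
def build_vector_index_definition(path, num_dimensions, similarity="cosine"):
    """Build an Atlas Vector Search index definition for one vector field."""
    allowed = {"cosine", "euclidean", "dotProduct"}
    if similarity not in allowed:
        raise ValueError(f"similarity must be one of {sorted(allowed)}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


# Hypothetical setup: the index in use keeps serving queries on the old
# field while a second index is built for the new model's vectors.
indexes = {
    "vector_index_v1": build_vector_index_definition("embedding", 1536),
    "vector_index_v2": build_vector_index_definition("embedding_v2", 768),
}

for name, definition in indexes.items():
    print(name, definition["fields"][0]["path"])
```

With PyMongo, a definition like this is typically passed to `Collection.create_search_index` through a `SearchIndexModel`. Once the new index is ready, queries can be pointed at it and the old index dropped.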

Exclude Vector Fields From the Results

In the $project stage, you can return existing fields from the matching documents as well as newly computed fields. To improve query performance, use the $project stage to return only the fields that you need. In particular, we recommend excluding the vector field in the $project stage, because vector embeddings can be large and returning them increases query latency.
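The following sketch shows one way to build such a pipeline. The helper `build_vector_search_pipeline`, the index name, and the field names (`title`, `plot`, `plot_embedding`) are hypothetical. The projection includes only the requested fields and the search score, so the vector field is never returned.

```python
def build_vector_search_pipeline(query_vector, index_name, vector_path,
                                 return_fields, num_candidates=100, limit=10):
    """Build a $vectorSearch pipeline whose $project stage omits the vector field."""
    # Inclusion projection: any field not listed here, including the
    # embedding, is left out of the results.
    projection = {field: 1 for field in return_fields if field != vector_path}
    projection["_id"] = 0
    projection["score"] = {"$meta": "vectorSearchScore"}
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": vector_path,
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {"$project": projection},
    ]


pipeline = build_vector_search_pipeline(
    query_vector=[0.1, 0.2, 0.3],   # placeholder; use a real query embedding
    index_name="vector_index",
    vector_path="plot_embedding",
    return_fields=["title", "plot", "plot_embedding"],
)
print(pipeline[1])
```

You would then pass `pipeline` to your driver's aggregate method on the collection.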

Ensure Enough Memory

Hierarchical Navigable Small Worlds (HNSW) works efficiently when vector data is held in memory. Ensure that the data nodes have enough RAM to hold the vector data and indexes. We recommend deploying separate Search Nodes for workload isolation without data isolation, which enables more efficient use of memory for vector search use cases.
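For a rough sense of how much RAM the raw vector data needs, you can multiply the number of vectors by the number of dimensions and the size of each value. The sketch below assumes 4-byte (32-bit float) values and ignores the graph structure and any other index overhead, so treat the result as a lower bound rather than a sizing guarantee. The function names and the example counts are hypothetical.

```python
def estimate_vector_bytes(num_vectors, num_dimensions, bytes_per_value=4):
    """Estimate the bytes needed for raw vector values only
    (assumes 4-byte floats; excludes graph and index overhead)."""
    return num_vectors * num_dimensions * bytes_per_value


def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / (1024 ** 3)


for dims in (1536, 768):
    size = estimate_vector_bytes(1_000_000, dims)
    print(f"{dims} dimensions: {to_gib(size):.2f} GiB")
# 1536 dimensions: 5.72 GiB
# 768 dimensions: 2.86 GiB
```

The same arithmetic shows why reducing dimensions helps: halving the dimension count halves the raw vector footprint.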

Warm up the Filesystem Cache

When you perform vector search, your initial queries perform random seeks on disk as they traverse the Hierarchical Navigable Small Worlds (HNSW) graph and read vector values into memory. This causes very high latency for the initial queries. Latency improves once HNSW traversal has read the indexed vectors into memory, where subsequent queries can access them much more quickly.

However, this cache warming process must be repeated after large writes or when your index is rebuilt.
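One possible way to warm the cache, sketched below, is to replay a set of representative query vectors before serving production traffic, for example after an index rebuild. This is an assumption about how you might apply the guidance above, not a documented Atlas procedure, and it does not guarantee that every indexed vector is loaded. The helper `warm_up` and the `run_query` callable are hypothetical. In practice, `run_query` would execute your vector search pipeline against the collection.

```python
import time


def warm_up(run_query, sample_vectors):
    """Run each sample query once and return the per-query latencies in seconds."""
    latencies = []
    for vector in sample_vectors:
        start = time.perf_counter()
        run_query(vector)
        latencies.append(time.perf_counter() - start)
    return latencies


# Stand-in for a real query function so the sketch runs without a cluster.
executed = []

def fake_run_query(vector):
    executed.append(vector)


samples = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
timings = warm_up(fake_run_query, samples)
print(f"ran {len(timings)} warm-up queries")
```

Comparing the returned latencies across repeated runs can indicate when queries have stopped hitting disk.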
