POST /embeddings

Creates vector embeddings for the provided text input(s). This endpoint accepts a single text string or a list of text strings and returns their corresponding vector embeddings.

For semantic search and retrieval tasks, set the input_type parameter to query or document to optimize how the model creates the vectors.

application/json

Body Required

  • input string | array[string] Required

    A single text string or a list of text strings to be embedded, such as ["I like cats", "I also like dogs"].

    Constraints:

    • Maximum list length: 1,000 items
    • Maximum total tokens: 1M for voyage-3.5-lite and voyage-4-lite; 320K for voyage-3.5, voyage-4, and voyage-2; 120K for voyage-3-large, voyage-4-large, voyage-code-3, voyage-finance-2, and voyage-law-2
    One of:

    • string: Minimum length is 1.
    • array[string]: At least 1 but not more than 1,000 elements. Minimum length of each element is 1.
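
A large corpus can be kept within the list-length limit by splitting it into batches on the client before sending. The following is a minimal sketch; batch_inputs is a hypothetical helper name, and it enforces only the 1,000-item limit. The per-model token limit is not checked here, because that would need a tokenizer.

```python
from typing import Iterator, List

MAX_ITEMS_PER_REQUEST = 1000  # list-length limit stated in the constraints


def batch_inputs(
    texts: List[str], batch_size: int = MAX_ITEMS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each within the list-length limit."""
    if not 1 <= batch_size <= MAX_ITEMS_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 1000")
    if any(len(text) < 1 for text in texts):
        raise ValueError("each input string must contain at least 1 character")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]
```

Each yielded batch can then be sent as the input value of a separate request.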

  • model string Required

    The embedding model to use. Recommended models: voyage-4-large, voyage-4, voyage-4-lite, voyage-code-3, voyage-finance-2, voyage-law-2.

    Values are voyage-context-3, voyage-4, voyage-4-lite, voyage-4-large, voyage-3.5, voyage-3.5-lite, voyage-3-large, voyage-code-3, voyage-multimodal-3, voyage-finance-2, voyage-law-2, or voyage-code-2.

  • input_type string | null

    The type of input text. Use this parameter to optimize embeddings for semantic search and retrieval tasks.

    Options:

    • null (default): The model directly converts the input into numerical vectors without any additional prompts.
    • query: Use when the input represents a search query. The model prepends "Represent the query for retrieving supporting documents: " to optimize the embedding for retrieval.
    • document: Use when the input represents a document to be searched. The model prepends "Represent the document for retrieval: " to optimize the embedding for retrieval.

    For semantic search and retrieval tasks, always set this parameter to query or document as appropriate. Embeddings generated with and without the input_type argument are compatible.

    Values are query, document, or null.
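
In a retrieval workflow, documents are typically embedded with input_type set to document at indexing time, and queries with input_type set to query at search time. The sketch below builds the two request bodies. build_payload is a hypothetical helper, and omitting the field when it is None assumes that omission is treated the same as null, the documented default.

```python
import json
from typing import List, Optional, Union


def build_payload(
    texts: Union[str, List[str]], model: str, input_type: Optional[str] = None
) -> dict:
    """Build a JSON-serializable request body for POST /embeddings."""
    if input_type not in (None, "query", "document"):
        raise ValueError("input_type must be 'query', 'document', or None")
    payload = {"input": texts, "model": model}
    if input_type is not None:
        # Assumption: leaving the field out is equivalent to sending null.
        payload["input_type"] = input_type
    return payload


# Same model for both sides; only input_type differs.
doc_body = build_payload(["I like cats", "I also like dogs"], "voyage-4", "document")
query_body = build_payload("Which pets do I like?", "voyage-4", "query")
print(json.dumps(query_body))
```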

  • truncation boolean

    Whether to truncate input texts that exceed the context length.

    • true (default): Input texts that exceed the context length are automatically truncated before vectorization.
    • false: An error is returned if any input text exceeds the context length.

    Default value is true.

  • output_dimension integer | null

    The number of dimensions for the output embeddings.

    Most models support only a single default dimension. The models voyage-4-large, voyage-4, voyage-4-lite, voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3 support the following values: 256, 512, 1024 (default), and 2048.

    Set to null to use the model's default dimension.

    Values are 256, 512, 1024, 2048, or null.
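
A client can reject unsupported combinations before sending a request. This is a conservative sketch based only on the model list above; check_output_dimension is a hypothetical helper, and it refuses any explicit dimension for models outside that list because their default dimensions are not stated here.

```python
from typing import Optional

# Models listed above as supporting 256, 512, 1024, and 2048.
FLEXIBLE_DIMENSION_MODELS = {
    "voyage-4-large", "voyage-4", "voyage-4-lite", "voyage-3-large",
    "voyage-3.5", "voyage-3.5-lite", "voyage-code-3",
}
ALLOWED_DIMENSIONS = {256, 512, 1024, 2048}


def check_output_dimension(model: str, output_dimension: Optional[int]) -> None:
    """Raise ValueError if the combination is not covered by the list above."""
    if output_dimension is None:
        return  # null selects the model's default dimension
    if model not in FLEXIBLE_DIMENSION_MODELS:
        # Conservative choice: the default dimension of other models is unknown here.
        raise ValueError(f"{model} supports only its default dimension; pass None")
    if output_dimension not in ALLOWED_DIMENSIONS:
        raise ValueError("output_dimension must be 256, 512, 1024, or 2048")
```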

  • output_dtype string

    The data type for the returned embeddings.

    Options:

    • float (default): 32-bit single-precision floating-point numbers. Provides the highest precision and retrieval accuracy. Supported by all models.
    • int8: 8-bit signed integers ranging from -128 to 127. Supported by voyage-4-large, voyage-4, voyage-4-lite, voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3.
    • uint8: 8-bit unsigned integers ranging from 0 to 255. Supported by voyage-4-large, voyage-4, voyage-4-lite, voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3.
    • binary: Bit-packed, quantized single-bit embedding values represented as int8. The returned list length is 1/8 of output_dimension. Uses the offset binary method. Supported by voyage-4-large, voyage-4, voyage-4-lite, voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3.
    • ubinary: Bit-packed, quantized single-bit embedding values represented as uint8. The returned list length is 1/8 of output_dimension. Supported by voyage-4-large, voyage-4, voyage-4-lite, voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3.

    Values are float, int8, uint8, binary, or ubinary. Default value is float.
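
Bit-packed embeddings can be expanded back into individual bits on the client. The sketch below rests on two assumptions that are not spelled out above: that the offset binary method means each int8 value equals the corresponding uint8 value minus 128, and that bits are packed most-significant bit first. unpack_bits is a hypothetical helper name.

```python
from typing import List


def unpack_bits(packed: List[int], output_dtype: str) -> List[int]:
    """Expand a bit-packed embedding into a list of 0/1 values."""
    if output_dtype == "binary":
        # Assumption: offset binary, so int8 value = uint8 value - 128.
        as_bytes = [value + 128 for value in packed]
    elif output_dtype == "ubinary":
        as_bytes = list(packed)
    else:
        raise ValueError("output_dtype must be 'binary' or 'ubinary'")
    if any(not 0 <= byte <= 255 for byte in as_bytes):
        raise ValueError("value out of range for the given output_dtype")
    # Assumption: most-significant bit first within each byte.
    return [(byte >> shift) & 1 for byte in as_bytes for shift in range(7, -1, -1)]
```

Under these assumptions, a 256-dimension embedding arrives as 32 integers and unpacks to 256 bits.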

  • encoding_format string | null

    The format in which embeddings are encoded in the response.

    Options:

    • null (default): Embeddings are returned as arrays. When output_dtype is float, each embedding is an array of floating-point numbers. For other output_dtype values (int8, uint8, binary, ubinary), each embedding is an array of integers.
    • base64: Embeddings are returned as Base64-encoded NumPy arrays with the following data types:
      • numpy.float32 when output_dtype is float
      • numpy.int8 when output_dtype is int8 or binary
      • numpy.uint8 when output_dtype is uint8 or ubinary

    Values are base64 or null.
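
A Base64-encoded embedding can be decoded with the standard library alone. This sketch assumes the decoded bytes are the raw array buffer in little-endian order with no header; decode_embedding is a hypothetical helper, and the dtype mapping follows the list above.

```python
import base64
import struct
from typing import List, Union

# struct format character for each output_dtype, following the mapping above.
_FORMATS = {"float": "f", "int8": "b", "binary": "b", "uint8": "B", "ubinary": "B"}


def decode_embedding(encoded: str, output_dtype: str = "float") -> List[Union[int, float]]:
    """Decode one Base64 embedding string into a list of numbers."""
    if output_dtype not in _FORMATS:
        raise ValueError(f"unsupported output_dtype: {output_dtype}")
    raw = base64.b64decode(encoded)
    fmt = _FORMATS[output_dtype]
    # Assumption: little-endian raw buffer ("<"), 4 bytes per float32, 1 per int8/uint8.
    count, remainder = divmod(len(raw), struct.calcsize("<" + fmt))
    if remainder:
        raise ValueError("byte length is not a multiple of the element size")
    return list(struct.unpack(f"<{count}{fmt}", raw))
```

With NumPy available, numpy.frombuffer on the decoded bytes with the matching dtype would do the same job under the same assumptions.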

Responses

  • 200 application/json

    Success

    • object string Required

      The object type. Always returns "list".

      Value is list.

    • data array[object] Required

      An array of embedding objects, one for each input text.

      • object string Required

        The object type. Always returns "embedding".

        Value is embedding.

      • embedding array[number] | string Required

        The embedding vector. The format depends on the encoding_format parameter:

        • When encoding_format is null: An array of numbers (floats when output_dtype is float, integers for int8, uint8, binary, and ubinary)
        • When encoding_format is base64: A Base64-encoded string

      • index integer Required

        The index of this embedding in the input list.

    • model string Required

      The name of the model used to generate the embeddings.

    • usage object Required
      • total_tokens integer Required

        The total number of tokens processed across all input texts.
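
A successful response can be mapped back to the inputs through the index field of each embedding object. The sketch below sorts by index rather than relying on the order of the data array; embeddings_in_input_order is a hypothetical helper, and the sample response contains made-up values for illustration only.

```python
from typing import Any, Dict, List


def embeddings_in_input_order(response: Dict[str, Any]) -> List[Any]:
    """Return the embeddings sorted by their position in the input list."""
    if response.get("object") != "list":
        raise ValueError("unexpected response object type")
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative response with made-up values, deliberately out of order.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "embedding": [0.3, 0.4], "index": 1},
        {"object": "embedding", "embedding": [0.1, 0.2], "index": 0},
    ],
    "model": "voyage-4",
    "usage": {"total_tokens": 8},
}
vectors = embeddings_in_input_order(sample_response)
```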

  • 400 application/json

    Invalid Request

    • detail string

      The request is invalid. Possible causes include malformed JSON, invalid parameter types or incorrect data types, a batch size that is too large, total tokens exceeding the limit, or an input text whose tokens exceed the context length.

  • 401 application/json

    Unauthorized

    • detail string

      Invalid authentication. Ensure that your model API key is specified correctly in the Authorization header, in the form Bearer VOYAGE_API_KEY.

  • 403 application/json

    Forbidden

    • detail string

      Access forbidden. This may occur if the IP address you are sending the request from is not allowed.

  • 429 application/json

    Rate Limit Exceeded

    • detail string

      Rate limit exceeded. Your request frequency or token usage is too high. Reduce your request rate or wait before retrying.

  • 500 application/json

    Internal Server Error

    • detail string

      An unexpected error occurred on the server. Retry your request after a brief wait.

  • 502 application/json

    Bad Gateway

    • detail string

      The server received an invalid response from an upstream server. Retry your request after a brief wait.

  • 503 application/json

    Service Unavailable

    • detail string

      The service is temporarily unavailable due to high traffic or maintenance. Retry your request after a brief wait.

  • 504 application/json

    Gateway Timeout

    • detail string

      The server did not receive a timely response from an upstream server. Retry your request after a brief wait.
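
The 429, 500, 502, 503, and 504 responses all suggest waiting and retrying, while 400, 401, and 403 point to problems that a retry will not fix. One way to act on this is a small retry wrapper. The sketch below uses exponential backoff, which is a design choice of this example and not something the API prescribes; send_with_retry and its parameters are hypothetical names, and send stands in for whatever function performs the HTTP call and returns a status code and parsed body.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def send_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE_STATUSES:
            break
        # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        sleep(base_delay * 2 ** attempt)
        status, body = send()
    return status, body
```

The sleep parameter is injectable so the wrapper can be exercised in tests without real delays.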

POST /embeddings
curl \
 --request POST 'https://ai.mongodb.com/v1/embeddings' \
 --header "Authorization: Bearer $ACCESS_TOKEN" \
 --header "Content-Type: application/json" \
 --data '{"input":"string","model":"voyage-context-3","input_type":"query","truncation":true,"output_dimension":256,"output_dtype":"float","encoding_format":"base64"}'
Request examples
{
  "input": "string",
  "model": "voyage-context-3",
  "input_type": "query",
  "truncation": true,
  "output_dimension": 256,
  "output_dtype": "float",
  "encoding_format": "base64"
}
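
The same request can be prepared in Python with only the standard library. This is a sketch that mirrors the curl example above; build_request is a hypothetical helper, the input text and model are illustrative, and reading the key from a VOYAGE_API_KEY environment variable is an assumption of this example.

```python
import json
import os
import urllib.request

API_URL = "https://ai.mongodb.com/v1/embeddings"  # endpoint from the curl example


def build_request(body: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST /embeddings request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    {"input": "I like cats", "model": "voyage-4", "input_type": "query"},
    os.environ.get("VOYAGE_API_KEY", "YOUR_API_KEY"),  # assumed variable name
)
# To send it:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```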
Response examples (200)
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        42.0
      ],
      "index": 42
    }
  ],
  "model": "string",
  "usage": {
    "total_tokens": 42
  }
}
Response examples (400)
{
  "detail": "string"
}
Response examples (401)
{
  "detail": "string"
}
Response examples (403)
{
  "detail": "string"
}
Response examples (429)
{
  "detail": "string"
}
Response examples (500)
{
  "detail": "string"
}
Response examples (502)
{
  "detail": "string"
}
Response examples (503)
{
  "detail": "string"
}
Response examples (504)
{
  "detail": "string"
}