POST /rerank

Reranks a list of documents based on their relevance to a query.

This endpoint accepts a query and a list of documents, then returns the documents sorted by relevance score in descending order.

application/json

Body Required

  • query string Required

    The search query as a string.

    Maximum query length:

    • 8,000 tokens for rerank-2.5 and rerank-2.5-lite
    • 4,000 tokens for rerank-2
    • 2,000 tokens for rerank-2-lite

    Minimum length is 1.

  • documents array[string] Required

    A list of documents to be reranked, provided as strings.

    Constraints:

    • Maximum number of documents: 1,000
    • Maximum tokens per query + document pair:
      • 32,000 for rerank-2.5 and rerank-2.5-lite
      • 16,000 for rerank-2
      • 8,000 for rerank-2-lite
    • Maximum total tokens (query tokens × number of documents + sum of all document tokens):
      • 600K for rerank-2.5, rerank-2.5-lite, rerank-2, and rerank-2-lite

    At least 1 but not more than 1,000 elements. Minimum length of each element is 1.

  • model string Required

    The reranking model to use. Recommended models: rerank-2.5, rerank-2.5-lite.

    Values are rerank-2.5, rerank-2.5-lite, rerank-2, or rerank-2-lite.

  • top_k integer | null

    The number of most relevant documents to return. If not specified, all documents are returned with their reranking scores.

    Minimum value is 1.

  • return_documents boolean

    Whether to include the document text in the response.

    • false (default): Returns only {"index", "relevance_score"} for each document
    • true: Returns {"index", "document", "relevance_score"} for each document

    Default value is false.

  • truncation boolean

    Whether to truncate inputs that exceed the context length limit.

    • true (default): The query and documents are automatically truncated to fit within the context length limit.
    • false: An error is returned if the query or any query-document pair exceeds the context length limit.

    Default value is true.
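The total-token formula in the documents constraints can be checked before sending a request. The following is a minimal sketch, not part of the API: it assumes you already have token counts from the model's tokenizer, it reads "600K" as 600,000, and the function names `total_tokens` and `check_limits` are hypothetical.

```python
# Limits copied from the constraints listed for the documents parameter.
MAX_DOCUMENTS = 1_000
MAX_TOTAL_TOKENS = 600_000  # assumption: "600K" means 600,000
MAX_PAIR_TOKENS = {
    "rerank-2.5": 32_000,
    "rerank-2.5-lite": 32_000,
    "rerank-2": 16_000,
    "rerank-2-lite": 8_000,
}


def total_tokens(query_tokens: int, document_tokens: list[int]) -> int:
    """Query tokens x number of documents + sum of all document tokens."""
    return query_tokens * len(document_tokens) + sum(document_tokens)


def check_limits(model: str, query_tokens: int, document_tokens: list[int]) -> list[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if not 1 <= len(document_tokens) <= MAX_DOCUMENTS:
        problems.append("document count must be between 1 and 1,000")
    pair_limit = MAX_PAIR_TOKENS[model]
    for i, n in enumerate(document_tokens):
        # Each query + document pair has its own per-model limit.
        if query_tokens + n > pair_limit:
            problems.append(f"document {i}: query + document exceeds {pair_limit} tokens")
    if total_tokens(query_tokens, document_tokens) > MAX_TOTAL_TOKENS:
        problems.append("total tokens exceed 600K")
    return problems
```

For example, a 10-token query with documents of 100, 200, and 300 tokens gives 10 × 3 + 600 = 630 total tokens. The per-pair check matters most when truncation is false, since over-length inputs are otherwise truncated rather than rejected.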

Responses

  • 200 application/json

    Success

    • object string Required

      The object type. Always returns "list".

      Value is list.

    • data array[object] Required

      An array of reranking results, sorted by relevance score in descending order.

      • index integer Required

        The index of the document in the original input list.

      • relevance_score number Required

        The relevance score of the document with respect to the query.

      • document string

        The document text. Only included when return_documents is set to true.

    • model string Required

      The name of the model used for reranking.

    • usage object Required
      • total_tokens integer Required

        The total number of tokens processed for the reranking operation.

  • 400 application/json

    Invalid Request

    • detail string

      The request is invalid. This error can occur due to invalid JSON, invalid parameter types, a batch size that is too large, total tokens exceeding the limit, or an input exceeding the context length.

  • 401 application/json

    Unauthorized

    • detail string

      Invalid authentication. Ensure your model API key is correctly specified in the Authorization header as Bearer VOYAGE_API_KEY.

  • 403 application/json

    Forbidden

    • detail string

      Access forbidden. This may occur if the IP address you are sending the request from is not allowed.

  • 429 application/json

    Rate Limit Exceeded

    • detail string

      Rate limit exceeded. Your request frequency or token usage is too high. Reduce your request rate or wait before retrying.

  • 500 application/json

    Internal Server Error

    • detail string

      An unexpected error occurred on the server. Retry your request after a brief wait.

  • 502 application/json

    Bad Gateway

    • detail string

      The server received an invalid response from an upstream server. Retry your request after a brief wait.

  • 503 application/json

    Service Unavailable

    • detail string

      The service is temporarily unavailable due to high traffic or maintenance. Retry your request after a brief wait.

  • 504 application/json

    Gateway Timeout

    • detail string

      The server did not receive a timely response from an upstream server. Retry your request after a brief wait.
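The 429, 500, 502, 503, and 504 responses above all suggest waiting and retrying, while 400, 401, and 403 indicate a problem that a retry will not fix. One way to act on that is sketched below. The exponential backoff schedule, the default of five attempts, and the `call_with_retry` helper are assumptions for illustration; the reference itself only says to retry "after a brief wait".

```python
import time
from typing import Callable

# Status codes whose descriptions recommend waiting and retrying.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def call_with_retry(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable returning (status_code, parsed_json).
    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, payload = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE_STATUSES:
            break
        sleep(base_delay * 2 ** attempt)
        status, payload = send()
    return status, payload
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. The last status and payload are returned even when every attempt fails, so the caller can still inspect the `detail` message.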

POST /rerank
curl \
 --request POST 'https://ai.mongodb.com/v1/rerank' \
 --header "Authorization: Bearer $ACCESS_TOKEN" \
 --header "Content-Type: application/json" \
 --data '{"query":"string","documents":["string"],"model":"rerank-2.5","top_k":42,"return_documents":false,"truncation":true}'
Request examples
{
  "query": "string",
  "documents": [
    "string"
  ],
  "model": "rerank-2.5",
  "top_k": 42,
  "return_documents": false,
  "truncation": true
}
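The same request can be assembled in Python using only the standard library. This is a sketch rather than an official client: `build_rerank_body` and `build_rerank_request` are hypothetical helper names, the URL and headers are taken from the curl example, and the validation mirrors the constraints listed under Body.

```python
import json
import urllib.request
from typing import Any, Optional

RERANK_URL = "https://ai.mongodb.com/v1/rerank"  # from the curl example
ALLOWED_MODELS = {"rerank-2.5", "rerank-2.5-lite", "rerank-2", "rerank-2-lite"}


def build_rerank_body(
    query: str,
    documents: list[str],
    model: str = "rerank-2.5",
    top_k: Optional[int] = None,
    return_documents: bool = False,
    truncation: bool = True,
) -> dict[str, Any]:
    """Validate the documented constraints and return the JSON body as a dict."""
    if len(query) < 1:
        raise ValueError("query must not be empty")
    if not 1 <= len(documents) <= 1000:
        raise ValueError("documents must contain between 1 and 1,000 elements")
    if any(len(doc) < 1 for doc in documents):
        raise ValueError("each document must not be empty")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model}")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")
    body: dict[str, Any] = {
        "query": query,
        "documents": documents,
        "model": model,
        "return_documents": return_documents,
        "truncation": truncation,
    }
    # Leave top_k out when unset so that all documents are returned.
    if top_k is not None:
        body["top_k"] = top_k
    return body


def build_rerank_request(body: dict[str, Any], api_key: str) -> urllib.request.Request:
    """Wrap the body in a POST request with the headers from the curl example."""
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The resulting object can be passed to `urllib.request.urlopen`, which raises `urllib.error.HTTPError` for the 4xx and 5xx responses described above.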
Response examples (200)
{
  "object": "list",
  "data": [
    {
      "index": 42,
      "relevance_score": 42.0,
      "document": "string"
    }
  ],
  "model": "string",
  "usage": {
    "total_tokens": 42
  }
}
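Because each result carries the index of the document in the original input list, the ranking can be joined back to your own data even when return_documents is false. A minimal sketch, assuming the response has already been parsed into a dict (the `rank_documents` helper is hypothetical):

```python
def rank_documents(response: dict, documents: list[str]) -> list[tuple[int, float, str]]:
    """Return (index, relevance_score, text) tuples in the order the API returned them.

    The text comes from the response when return_documents was true,
    otherwise it is looked up in the original input list by index.
    """
    ranked = []
    for item in response["data"]:
        idx = item["index"]
        text = item.get("document", documents[idx])
        ranked.append((idx, item["relevance_score"], text))
    return ranked
```

The data array is already sorted by relevance score in descending order, so no additional sorting is needed on the client side.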
Response examples (400)
{
  "detail": "string"
}
Response examples (401)
{
  "detail": "string"
}
Response examples (403)
{
  "detail": "string"
}
Response examples (429)
{
  "detail": "string"
}
Response examples (500)
{
  "detail": "string"
}
Response examples (502)
{
  "detail": "string"
}
Response examples (503)
{
  "detail": "string"
}
Response examples (504)
{
  "detail": "string"
}