# Create multimodal embeddings

**POST /multimodalembeddings**

Creates vector embeddings for multimodal inputs consisting of text, images, or a combination of both. This endpoint accepts inputs that can contain text and images in any combination and returns their vector representations.

## Servers

- https://ai.mongodb.com/v1

## Authentication methods

- API key auth

## Parameters

### Body: application/json (object)

- **inputs** (array[object])

  A list of multimodal inputs to be vectorized. A single input in the list is a dictionary containing a single key `"content"`, whose value represents a sequence of text and images.

  - The value of `"content"` is a list of dictionaries, each representing a single piece of text or image. The dictionaries have four possible keys:
    1. **type**: Specifies the type of the piece of the content. Allowed values are `text`, `image_url`, or `image_base64`.
    2. **text**: Only present when `type` is `text`. The value should be a text string.
    3. **image_base64**: Only present when `type` is `image_base64`. The value should be a Base64-encoded image in the [data URL](https://developer.mozilla.org/en-US/docs/Web/URI/Schemes/data) format `data:[<mediatype>];base64,<data>`. Currently supported `mediatypes` are: `image/png`, `image/jpeg`, `image/webp`, and `image/gif`.
    4. **image_url**: Only present when `type` is `image_url`. The value should be a URL linking to the image. We support PNG, JPEG, WEBP, and GIF images.
  - **Note**: Only one of the keys, `image_base64` or `image_url`, should be present in each dictionary for image data. Consistency is required within a request, meaning each request should use either `image_base64` or `image_url` exclusively for images, not both.

  **Example payload where `inputs` contains an image as a URL:**

  The `inputs` list contains a single input, which consists of a piece of text and an image (which is provided via a URL).

  ```json
  {
    "inputs": [
      {
        "content": [
          {
            "type": "text",
            "text": "This is a banana."
          },
          {
            "type": "image_url",
            "image_url": "https://raw.githubusercontent.com/voyage-ai/voyage-multimodal-3/refs/heads/main/images/banana.jpg"
          }
        ]
      }
    ],
    "model": "voyage-multimodal-3.5"
  }
  ```

  **Example payload where `inputs` contains a Base64 image:**

  Below is an equivalent example to the one above where the image content is a Base64 image instead of a URL. (Base64 images can be lengthy, so the example only shows a shortened version.)

  ```json
  {
    "inputs": [
      {
        "content": [
          {
            "type": "text",
            "text": "This is a banana."
          },
          {
            "type": "image_base64",
            "image_base64": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA..."
          }
        ]
      }
    ],
    "model": "voyage-multimodal-3.5"
  }
  ```

  **The following constraints apply to the `inputs` list:**

  - The list must not contain more than 1000 inputs.
  - Each image must not contain more than 16 million pixels or be larger than 20 MB in size.
  - With every 560 pixels of an image being counted as a token, each input in the list must not exceed 32,000 tokens, and the total number of tokens across all inputs must not exceed 320,000.

- **model** (string)

  The multimodal embedding model to use. Recommended model: `voyage-multimodal-3.5`.

- **input_type** (string | null)

  Type of the input. Defaults to `null`. Other options: `query`, `document`.

  - When `input_type` is `null`, the embedding model directly converts the `inputs` into numerical vectors. For retrieval or search purposes, where a "query", which can be text or image in this case, searches for relevant information among a collection of data referred to as "documents," specify whether your `inputs` are queries or documents by setting `input_type` to `query` or `document`, respectively. In these cases, Voyage automatically prepends a prompt to your `inputs` before vectorizing them, creating vectors more tailored for retrieval or search tasks. Since inputs can be multimodal, "queries" and "documents" can be text, images, or an interleaving of both modalities.
    Embeddings generated with and without the `input_type` argument are compatible.
  - For transparency, the following prompts are prepended to your input:
    - For `query`, the prompt is _"Represent the query for retrieving supporting documents: "._
    - For `document`, the prompt is _"Represent the document for retrieval: "._

- **truncation** (boolean)

  Whether to truncate the inputs to fit within the context length. Defaults to `true`.

  - If `true`, over-length inputs are truncated to fit within the context length before vectorization by the embedding model. If the truncation happens in the middle of an image, the entire image is discarded.
  - If `false`, an error occurs if any input exceeds the context length.

- **output_encoding** (string | null)

  Format in which the embeddings are encoded. Defaults to `null`.

  - If `null`, the embeddings are represented as a list of floating-point numbers.
  - If `base64`, the embeddings are represented as a Base64-encoded NumPy array of single-precision floats.

## Responses

### 200 Success

#### Body: application/json (object)

- **object** (string)

  The object type, which is always `list`.

- **data** (array[object])

  An array of embedding objects.

- **model** (string)

  Name of the model.

- **usage** (object)

### 400 Invalid Request

#### Body: application/json (object)

- **detail** (string)

  The request is invalid. This error can occur due to invalid JSON, invalid parameter types, incorrect data types, batch size too large, total tokens exceeding the limit, or tokens in an example exceeding context length.

### 401 Unauthorized

#### Body: application/json (object)

- **detail** (string)

  Invalid authentication. Ensure your model API key is correctly specified in the Authorization header as `Bearer VOYAGE_API_KEY`.

### 403 Forbidden

#### Body: application/json (object)

- **detail** (string)

  Access forbidden. This may occur if the IP address you are sending the request from is not allowed.

### 429 Rate Limit Exceeded

#### Body: application/json (object)

- **detail** (string)

  Rate limit exceeded. Your request frequency or token usage is too high. Reduce your request rate or wait before retrying.

### 500 Internal Server Error

#### Body: application/json (object)

- **detail** (string)

  An unexpected error occurred on the server. Retry your request after a brief wait.

### 502 Bad Gateway

#### Body: application/json (object)

- **detail** (string)

  The server received an invalid response from an upstream server. Retry your request after a brief wait.

### 503 Service Unavailable

#### Body: application/json (object)

- **detail** (string)

  The service is temporarily unavailable due to high traffic or maintenance. Retry your request after a brief wait.

### 504 Gateway Timeout

#### Body: application/json (object)

- **detail** (string)

  The server did not receive a timely response from an upstream server. Retry your request after a brief wait.
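
## Example client (sketch)

The request body, the `Authorization` header, and the retry guidance described on this page can be combined into a small client. The following is a minimal sketch using only the Python standard library, not an official client. The helper names (`image_to_data_url`, `build_payload`, `post_with_retry`, `decode_base64_embedding`), the `Content-Type` header, the attempt count, and the backoff delays are all illustrative assumptions. The decoder additionally assumes that a `base64` embedding is the raw little-endian bytes of the single-precision float array, which this page does not spell out.

```python
import base64
import json
import struct
import time
import urllib.error
import urllib.request

# Server URL and path as listed on this page.
API_URL = "https://ai.mongodb.com/v1/multimodalembeddings"
SUPPORTED_MEDIATYPES = {"image/png", "image/jpeg", "image/webp", "image/gif"}
# Statuses this page describes as worth retrying after a wait.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def image_to_data_url(image_bytes, mediatype="image/jpeg"):
    """Encode raw image bytes as data:<mediatype>;base64,<data>."""
    if mediatype not in SUPPORTED_MEDIATYPES:
        raise ValueError(f"unsupported mediatype: {mediatype}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mediatype};base64,{encoded}"


def build_payload(text, image_bytes, model="voyage-multimodal-3.5", input_type=None):
    """Build a request body with one input: a text piece plus a Base64 image."""
    content = [
        {"type": "text", "text": text},
        {"type": "image_base64", "image_base64": image_to_data_url(image_bytes)},
    ]
    payload = {"inputs": [{"content": content}], "model": model}
    if input_type is not None:  # leave unset to keep the default (null)
        payload["input_type"] = input_type
    return payload


def is_retryable(status):
    """Return True for 429 and the 5xx statuses listed above."""
    return status in RETRYABLE_STATUSES


def post_with_retry(payload, api_key, max_attempts=4, base_delay=1.0):
    """POST the payload, backing off exponentially on retryable statuses."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed; body is application/json
    }
    for attempt in range(max_attempts):
        request = urllib.request.Request(API_URL, data=body, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(request, timeout=60) as response:
                return json.load(response)
        except urllib.error.HTTPError as err:
            # Give up on non-retryable errors (400, 401, 403) or the last attempt.
            if not is_retryable(err.code) or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)


def decode_base64_embedding(encoded):
    """Decode a base64 embedding, ASSUMING raw little-endian float32 bytes."""
    raw = base64.b64decode(encoded)
    count = len(raw) // 4  # four bytes per single-precision float
    return list(struct.unpack(f"<{count}f", raw))
```

A call such as `post_with_retry(build_payload("This is a banana.", image_bytes, input_type="document"), api_key)` would send one text-plus-image input. Errors such as 400, 401, and 403 are raised immediately, since retrying an invalid or unauthorized request unchanged is unlikely to succeed.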