Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,13 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer

### Chat and Vision models

| Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License | Model card |
| Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License \* | Model card |
|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| Google (Preview) | `gemma-3-27b-it` | 40k | 8192 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/google/gemma-3-27b-it) |
| Mistral | `mistral-small-3.2-24b-instruct-2506` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506) |
| H | `holo2-30b-a3b` | 22k | 8192 | [CC-BY-NC-4.0](https://spdx.org/licenses/CC-BY-NC-4.0) | [HF](https://huggingface.co/Hcompany/Holo2-30B-A3B) |

\*Licences which are not open-weight and may restrict commercial usage (such as `CC-BY-NC-4.0`), do not apply to usage through Scaleway Products due to existing partnerships between Scaleway and the corresponding providers. Original licences are provided for transparency only.

### Chat and Audio models

Expand Down
18 changes: 17 additions & 1 deletion pages/managed-inference/reference-content/model-catalog.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib

## Models technical summary

| Model name | Provider | Maximum Context length (tokens) | Modalities | Compatible Instances (Max Context in tokens\*) | License |
| Model name | Provider | Maximum Context length (tokens) | Modalities | Compatible Instances (Max Context in tokens\*) | License \** |
|------------|----------|--------------|------------|-----------|---------|
| [`gpt-oss-120b`](#gpt-oss-120b) | OpenAI | 128k | Text | H100 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`whisper-large-v3`](#whisper-large-v3) | OpenAI | - | Audio transcription | L4, L40S, H100, H100-SXM-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
Expand All @@ -38,13 +38,15 @@ A quick overview of available models in Scaleway's catalog and their core attrib
| [`devstral-small-2505`](#devstral-small-2505) | Mistral | 128k | Text | H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`pixtral-12b-2409`](#pixtral-12b-2409) | Mistral | 128k | Text, Vision | L40S (50k), H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`molmo-72b-0924`](#molmo-72b-0924) | Allen AI | 50k | Text, Vision | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) and [Twonyi Qianwen license](https://huggingface.co/Qwen/Qwen2-72B/blob/main/LICENSE)|
| [`holo2-30b-a3b`](#holo2-30b-a3b) | H | 22k | Text, Vision | H100-SXM-2 | [CC-BY-NC-4.0](https://spdx.org/licenses/CC-BY-NC-4.0)|
| [`qwen3-embedding-8b`](#qwen3-embedding-8b) | Qwen | 32k | Embeddings | L4, L40S, H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Qwen | 128k | Code | L40S, H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | Qwen | 32k | Code | H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`bge-multilingual-gemma2`](#bge-multilingual-gemma2) | BAAI | 8k | Embeddings | L4, L40S, H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
| [`sentence-t5-xxl`](#sentence-t5-xxl) | Sentence transformers | 512 | Embeddings | L4 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |

\*Maximum context length is only mentioned when instances VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
\**Licences which are not open-weight and may restrict commercial usage (such as `CC-BY-NC-4.0`), do not apply to usage through Scaleway Products due to existing partnerships between Scaleway and the corresponding providers. Original licences are provided for transparency only.

## Models feature summary
| Model name | Structured output supported | Function calling | Supported languages |
Expand All @@ -71,6 +73,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
| `devstral-small-2505` | Yes | Yes | English, French, German, Spanish, Portuguese, Italian, Japanese, Korean, Russian, Chinese, Arabic, Persian, Indonesian, Malay, Nepali, Polish, Romanian, Serbian, Swedish, Turkish, Ukrainian, Vietnamese, Hindi, Bengali |
| `pixtral-12b-2409` | Yes | Yes | English |
| `molmo-72b-0924` | Yes | No | English |
| `holo2-30b-a3b` | Yes | No | English |
| `qwen3-embedding-8b` | No | No | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
| `qwen3-coder-30b-a3b-instruct` | Yes | Yes | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
| `qwen2.5-coder-32b-instruct` | Yes | Yes | English, French, Spanish, Portuguese, German, Italian, Russian, Chinese, Japanese, Korean, Vietnamese, Thai, Arabic and 16 additional languages. |
Expand Down Expand Up @@ -160,6 +163,19 @@ It can analyze images and offer insights from visual content alongside text.
```
mistral/pixtral-12b-2409:bf16
```
### Holo2-30b-a3b
Holo2 30B is a text and vision model optimized to analyse Graphical User Interface, such as Web browser or software, and take actions.

| Attribute | Value |
|-----------|-------|
| Supports parallel tool calling | Yes |
| Supported images formats | PNG, JPEG, WEBP, and non-animated GIFs |
| Token dimension (pixels)| 16x16 |

#### Model name
```
hcompany/holo2-30b-a3b:bf16
```

### Molmo-72b-0924
Molmo 72B is the powerhouse of the Molmo family, multimodal models developed by the renowned research lab Allen Institute for AI.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -217,8 +217,9 @@ Generative APIs are rate limited based on:
| qwen3-235b-a22b-instruct-2507 | 200k | 400k |
| qwen3-embedding-8b | 200k | 400k |
| qwen3-coder-30b-a3b-instruct | 200k | 400k |
| qwen2.5-coder-32b-instruct | 200k | 400k |
| gpt-oss-120b | 200k | 400k |
| qwen2.5-coder-32b-instruct | 200k | 400k |
| gpt-oss-120b | 200k | 400k |
| holo2-30b-a3b | 200k | 400k |
| bge-multilingual-gemma2 | 200k | 400k |

| Audio seconds per minute | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
Expand All @@ -244,7 +245,8 @@ Generative APIs are rate limited based on:
| qwen3-coder-30b-a3b-instruct | 300 | 600 |
| qwen2.5-coder-32b-instruct | 300 | 600 |
| gpt-oss-120b | 300 | 600 |
| bge-multilingual-gemma2 | 300 | 600 |
| holo2-30b-a3b | 300 | 600 |
| bge-multilingual-gemma2 | 300 | 600 |
| whisper-large-v3 | 300 | 600 |

| Concurrent requests | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
Expand Down