Large Language Models

Browse our supported open-source models and deploy them on dedicated endpoints.

New

GLM-4.7-Flash

$0.07/MtInput

$0.01/MtCache Read

$0.4/MtOutput

200000Context

128000Max Output

LLMServerless

Hot

DeepSeek V3.2

$0.269/MtInput

$0.1345/MtCache Read

$0.4/MtOutput

163840Context

65536Max Output

LLMServerless

New

Qwen3.5-27B

$0.3/MtInput

$2.4/MtOutput

262144Context

65536Max Output

LLMServerless

New

Qwen3.5-122B-A10B

$0.4/MtInput

$3.2/MtOutput

262144Context

65536Max Output

LLMServerless

New

Qwen3.5-35B-A3B

$0.25/MtInput

$2/MtOutput

262144Context

65536Max Output

LLMServerless

New

Qwen3.5-397B-A17B

$0.6/MtInput

$3.6/MtOutput

262144Context

65536Max Output

LLMServerless

New

MiniMax M2.5

$0.3/MtInput

$0.03/MtCache Read

$1.2/MtOutput

204800Context

131100Max Output

LLMServerless

New

GLM-5

$1/MtInput

$0.2/MtCache Read

$3.2/MtOutput

202800Context

131072Max Output

LLMServerless

New

Qwen3 Coder Next

$0.2/MtInput

$1.5/MtOutput

262144Context

65536Max Output

LLMServerless

New

DeepSeek-OCR 2

$0.03/MtInput

$0.03/MtOutput

8192Context

8192Max Output

LLMServerless

New

Kimi K2.5

$0.6/MtInput

$0.1/MtCache Read

$3/MtOutput

262144Context

262144Max Output

LLMServerless

New

MiniMax M2.1

$0.3/MtInput

$0.03/MtCache Read

$1.2/MtOutput

204800Context

131072Max Output

LLMServerless

New

GLM-4.7

$0.6/MtInput

$0.11/MtCache Read

$2.2/MtOutput

204800Context

131072Max Output

LLMServerless

New

XiaomiMiMo/MiMo-V2-Flash

$0.1/MtInput

$0.02/MtCache Read

$0.3/MtOutput

262144Context

32000Max Output

LLMServerless

New

AutoGLM-Phone-9B-Multilingual

$0.035/MtInput

$0.138/MtOutput

65536Context

65536Max Output

LLMServerless

New

Kimi K2 Thinking

$0.6/MtInput

$0.15/MtCache Read

$2.5/MtOutput

262144Context

262144Max Output

LLMServerless

New

MiniMax-M2

$0.3/MtInput

$0.03/MtCache Read

$1.2/MtOutput

204800Context

131072Max Output

LLMServerless

PaddleOCR-VL

$0.02/MtInput

$0.02/MtOutput

16384Context

16384Max Output

LLM

New

DeepSeek V3.2 Exp

$0.27/MtInput

$0.41/MtOutput

163840Context

65536Max Output

LLMServerless

New

Qwen3 VL 235B A22B Thinking

$0.98/MtInput

$3.95/MtOutput

131072Context

32768Max Output

LLMServerless

New

GLM 4.6V

$0.3/MtInput

$0.055/MtCache Read

$0.9/MtOutput

131072Context

32768Max Output

LLMServerless

New

GLM 4.6

$0.55/MtInput

$0.11/MtCache Read

$2.2/MtOutput

204800Context

131072Max Output

LLMServerless

Kat Coder Pro

$0.3/MtInput

$0.06/MtCache Read

$1.2/MtOutput

256000Context

128000Max Output

LLMServerless

Qwen3 Next 80B A3B Instruct

$0.15/MtInput

$1.5/MtOutput

131072Context

32768Max Output

LLMServerless

Qwen3 Next 80B A3B Thinking

$0.15/MtInput

$1.5/MtOutput

131072Context

32768Max Output

LLMServerless

New

DeepSeek-OCR

$0.03/MtInput

$0.03/MtOutput

8192Context

8192Max Output

LLM

New

DeepSeek V3.1 Terminus

$0.27/MtInput

$0.135/MtCache Read

$1/MtOutput

131072Context

32768Max Output

LLMServerless

New

Qwen3 VL 235B A22B Instruct

$0.3/MtInput

$1.5/MtOutput

131072Context

32768Max Output

LLMServerless

Qwen3 Max

$2.11/MtInput

$8.45/MtOutput

262144Context

65536Max Output

Partner

LLMServerless

New

DeepSeek V3.1

$0.27/MtInput

$0.135/MtCache Read

$1/MtOutput

131072Context

32768Max Output

LLMServerless

New

Kimi K2 0905

$0.6/MtInput

$2.5/MtOutput

262144Context

262144Max Output

LLMServerless

Qwen3 Coder 480B A35B Instruct

$0.3/MtInput

$1.3/MtOutput

262144Context

65536Max Output

LLMServerless

New

Qwen3 Coder 30B A3B Instruct

$0.07/MtInput

$0.27/MtOutput

160000Context

32768Max Output

LLMServerless

OpenAI GPT OSS 120B

$0.05/MtInput

$0.25/MtOutput

131072Context

32768Max Output

LLMServerless

Kimi K2 Instruct

$0.57/MtInput

$2.3/MtOutput

131072Context

131072Max Output

LLMServerless

Hot

DeepSeek V3 0324

$0.27/MtInput

$0.135/MtCache Read

$1.12/MtOutput

163840Context

163840Max Output

LLMServerless

GLM-4.5

$0.6/MtInput

$0.11/MtCache Read

$2.2/MtOutput

131072Context

98304Max Output

LLMServerless

Qwen3 235B A22B Thinking 2507

$0.3/MtInput

$3/MtOutput

131072Context

32768Max Output

LLMServerless

Llama 3.1 8B Instruct

$0.02/MtInput

$0.05/MtOutput

16384Context

16384Max Output

LLMServerless

New

Gemma 3 12B

$0.05/MtInput

$0.1/MtOutput

131072Context

8192Max Output

LLM

GLM 4.5V

$0.6/MtInput

$0.11/MtCache Read

$1.8/MtOutput

65536Context

16384Max Output

LLMServerless

OpenAI GPT OSS 20B

$0.04/MtInput

$0.15/MtOutput

131072Context

32768Max Output

LLMServerless

Qwen3 235B A22B Instruct 2507

$0.09/MtInput

$0.58/MtOutput

131072Context

16384Max Output

LLMServerless

DeepSeek R1 Distill Qwen 14B

$0.15/MtInput

$0.15/MtOutput

32768Context

16384Max Output

LLM

Llama 3.3 70B Instruct

$0.135/MtInput

$0.4/MtOutput

131072Context

120000Max Output

LLMServerless

Qwen 2.5 72B Instruct

$0.38/MtInput

$0.4/MtOutput

32000Context

8192Max Output

LLMServerless

Mistral Nemo

$0.04/MtInput

$0.17/MtOutput

60288Context

16000Max Output

LLMServerless

MiniMax M1

$0.55/MtInput

$2.2/MtOutput

1000000Context

40000Max Output

LLMServerless

DeepSeek R1 0528

$0.7/MtInput

$0.35/MtCache Read

$2.5/MtOutput

163840Context

32768Max Output

LLMServerless

DeepSeek R1 Distill Qwen 32B

$0.3/MtInput

$0.3/MtOutput

64000Context

32000Max Output

LLM

Llama 3 8B Instruct

$0.04/MtInput

$0.04/MtOutput

8192Context

8192Max Output

LLMServerless

WizardLM 2 8x22B

$0.62/MtInput

$0.62/MtOutput

65535Context

8000Max Output

LLMServerless

Dedicated

DeepSeek R1 0528 Qwen3 8B

$0.06/MtInput

$0.09/MtOutput

128000Context

32000Max Output

LLM

DeepSeek R1 Distill Llama 70B

$0.8/MtInput

$0.8/MtOutput

8192Context

8192Max Output

LLMServerless

Llama 3 70B Instruct

$0.51/MtInput

$0.74/MtOutput

8192Context

8000Max Output

LLMServerless

Qwen3 235B A22B

$0.2/MtInput

$0.8/MtOutput

40960Context

20000Max Output

LLMServerless

Llama 4 Maverick Instruct

$0.27/MtInput

$0.85/MtOutput

1048576Context

8192Max Output

LLMServerless

Dedicated

Llama 4 Scout Instruct

$0.18/MtInput

$0.59/MtOutput

131072Context

131072Max Output

LLMServerless

Hermes 2 Pro Llama 3 8B

$0.14/MtInput

$0.14/MtOutput

8192Context

8192Max Output

LLMServerless

Qwen2.5 VL 72B Instruct

$0.8/MtInput

$0.8/MtOutput

32768Context

32768Max Output

LLMServerless

L3 70B Euryale V2.1

$1.48/MtInput

$1.48/MtOutput

8192Context

8192Max Output

LLMServerless

ERNIE-4.5-21B-A3B-Thinking

$0.07/MtInput

$0.28/MtOutput

131072Context

65536Max Output

LLMServerless

Sao10k L3 8B Lunaris

$0.05/MtInput

$0.05/MtOutput

8192Context

8192Max Output

LLMServerless

Baichuan M2 32B

$0.07/MtInput

$0.07/MtOutput

131072Context

131072Max Output

LLM

ERNIE 4.5 VL 424B A47B

$0.42/MtInput

$1.25/MtOutput

123000Context

16000Max Output

LLMServerless

ERNIE 4.5 300B A47B

$0.28/MtInput

$1.1/MtOutput

123000Context

12000Max Output

LLMServerless

DeepSeek Prover V2 671B

$0.7/MtInput

$2.5/MtOutput

160000Context

160000Max Output

LLMServerless

Qwen3 32B

$0.1/MtInput

$0.45/MtOutput

40960Context

20000Max Output

LLMServerless

Qwen3 30B A3B

$0.09/MtInput

$0.45/MtOutput

40960Context

20000Max Output

LLMServerless

Gemma 3 27B

$0.119/MtInput

$0.2/MtOutput

98304Context

16384Max Output

LLMServerless

DeepSeek V3 (Turbo)

$0.4/MtInput

$1.3/MtOutput

64000Context

16000Max Output

LLMServerless

DeepSeek R1 (Turbo)

$0.7/MtInput

$2.5/MtOutput

64000Context

16000Max Output

LLMServerless

L3 8B Stheno V3.2

$0.05/MtInput

$0.05/MtOutput

8192Context

32000Max Output

LLMServerless

MythoMax L2 13B

$0.09/MtInput

$0.09/MtOutput

4096Context

3200Max Output

LLM

ERNIE-4.5-VL-28B-A3B-Thinking

$0.39/MtInput

$0.39/MtOutput

131072Context

65536Max Output

LLMServerless

qwen/qwen3-vl-8b-instruct

$0.08/MtInput

$0.5/MtOutput

131072Context

32768Max Output

LLMServerless

zai-org/glm-4.5-air

$0.13/MtInput

$0.025/MtCache Read

$0.85/MtOutput

131072Context

98304Max Output

LLMServerless

qwen/qwen3-vl-30b-a3b-instruct

$0.2/MtInput

$0.7/MtOutput

131072Context

32768Max Output

LLMServerless

qwen/qwen3-vl-30b-a3b-thinking

$0.2/MtInput

$1/MtOutput

131072Context

32768Max Output

LLMServerless

Qwen3 Omni 30B A3B Thinking

$0.25/MtInput

$0.97/MtOutput

65536Context

16384Max Output

LLMServerless

Qwen3 Omni 30B A3B Instruct

$0.25/MtInput

$0.97/MtOutput

65536Context

16384Max Output

LLMServerless

New

Qwen MT Plus

$0.25/MtInput

$0.75/MtOutput

16384Context

8192Max Output

LLMServerless

ERNIE 4.5 VL 28B A3B

$0.14/MtInput

$0.56/MtOutput

30000Context

8000Max Output

LLMServerless

ERNIE 4.5 21B A3B

$0.07/MtInput

$0.28/MtOutput

120000Context

8000Max Output

LLMServerless

Dedicated

Qwen3 8B

$0.035/MtInput

$0.138/MtOutput

128000Context

20000Max Output

LLM

Qwen3 4B

$0.03/MtInput

$0.03/MtOutput

128000Context

20000Max Output

LLMServerless

Qwen2.5 7B Instruct

$0.07/MtInput

$0.07/MtOutput

32000Context

32000Max Output

LLMServerless

Llama 3.2 3B Instruct

$0.03/MtInput

$0.05/MtOutput

32768Context

32000Max Output

LLM

L3.1 70B Euryale V2.2

$1.48/MtInput

$1.48/MtOutput

8192Context

8192Max Output

LLMServerless
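
The prices above can be turned into a rough per-request estimate. The sketch below is illustrative only: it assumes that "/Mt" means "per one million tokens" and that tokens served from cache are billed at the Cache Read rate instead of the Input rate. Neither assumption is stated in the listing, and the names `Pricing` and `estimate_cost` are hypothetical helpers, not part of any provider API. The sample rates are the GLM-4.7-Flash figures from the list.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: "/Mt" in the listing means "per one million tokens".
TOKENS_PER_MT = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in USD, copied by hand from the listing."""
    input_per_mt: float
    output_per_mt: float
    cache_read_per_mt: Optional[float] = None  # not every model lists one


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached input tokens are charged at the cache-read rate
    instead of the input rate. If a model lists no cache-read rate,
    cached tokens fall back to the normal input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    cache_rate = (pricing.cache_read_per_mt
                  if pricing.cache_read_per_mt is not None
                  else pricing.input_per_mt)
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * pricing.input_per_mt
             + cached_tokens * cache_rate
             + output_tokens * pricing.output_per_mt)
    return total / TOKENS_PER_MT


# GLM-4.7-Flash rates as listed: $0.07 input, $0.01 cache read, $0.4 output.
GLM_4_7_FLASH = Pricing(input_per_mt=0.07, output_per_mt=0.4,
                        cache_read_per_mt=0.01)

if __name__ == "__main__":
    cost = estimate_cost(GLM_4_7_FLASH, input_tokens=50_000,
                         output_tokens=2_000, cached_tokens=40_000)
    print(f"Estimated cost: ${cost:.6f}")
```

Actual billing may round or meter differently, so treat the result as an estimate and check it against the provider's pricing documentation.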
