Large Language Models

Browse our supported open-source models and deploy them on dedicated endpoints.

All models below are tagged LLM and Serverless. Prices are per Mt; "-" means the listing shows no value.

| Badge | ID / Provider | Model | Input | Cache Read | Output | Context | Max Output |
|---|---|---|---|---|---|---|---|
| New | MoonshotAI | Kimi K2 Thinking | $0.6/Mt | - | $2.5/Mt | 262144 | 262144 |
| New | minimax-m2 | MiniMax-M2 | $0.3/Mt | - | $1.2/Mt | 204800 | 131072 |
| - | paddleocr-vl | PaddleOCR-VL | $0.02/Mt | - | $0.02/Mt | 16384 | 16384 |
| New | deepseek-v3.2-exp | Deepseek V3.2 Exp | $0.27/Mt | - | $0.41/Mt | 163840 | 65536 |
| Partner | qwen3-max | Qwen3 Max | $2.11/Mt | - | $8.45/Mt | 262144 | 65536 |
| New | glm-4.6 | GLM 4.6 | $0.6/Mt | $0.11/Mt | $2.2/Mt | 204800 | 131072 |
| New | qwen3-vl-235b-a22b-thinking | Qwen3 VL 235B A22B Thinking | $0.98/Mt | - | $3.95/Mt | 131072 | 32768 |
| - | qwen3-next-80b-a3b-instruct | Qwen3 Next 80B A3B Instruct | $0.15/Mt | - | $1.5/Mt | 131072 | 32768 |
| - | qwen3-next-80b-a3b-thinking | Qwen3 Next 80B A3B Thinking | $0.15/Mt | - | $1.5/Mt | 131072 | 32768 |
| New | qwen3-vl-235b-a22b-instruct | Qwen3 VL 235B A22B Instruct | $0.3/Mt | - | $1.5/Mt | 131072 | 32768 |
| New | deepseek-ocr | DeepSeek-OCR | $0.03/Mt | - | $0.03/Mt | 8192 | 8192 |
| New | deepseek-v3.1-terminus | Deepseek V3.1 Terminus | $0.27/Mt | - | $1/Mt | 131072 | 65536 |
| Free | kat-coder | KAT-Coder-Pro V1 | $0/Mt | - | $0/Mt | 256000 | 32000 |
| New | MoonshotAI | Kimi K2 0905 | $0.6/Mt | - | $2.5/Mt | 262144 | 262144 |
| New | deepseek-v3.1 | DeepSeek V3.1 | $0.27/Mt | - | $1/Mt | 131072 | 32768 |
| - | qwen3-coder-480b-a35b-instruct | Qwen3 Coder 480B A35B Instruct | $0.29/Mt | - | $1.2/Mt | 262144 | 65536 |
| New | qwen3-coder-30b-a3b-instruct | Qwen3 Coder 30b A3B Instruct | $0.07/Mt | - | $0.27/Mt | 262144 | 32768 |
| - | OpenAI | OpenAI GPT OSS 120B | $0.05/Mt | - | $0.25/Mt | 131072 | 32768 |
| - | MoonshotAI | Kimi K2 Instruct | $0.57/Mt | - | $2.3/Mt | 131072 | 131072 |
| Hot | deepseek-v3-0324 | DeepSeek V3 0324 | $0.27/Mt | $0.135/Mt | $1.12/Mt | 163840 | 163840 |
| - | glm-4.5 | GLM-4.5 | $0.6/Mt | $0.11/Mt | $2.2/Mt | 131072 | 98304 |
| - | qwen3-235b-a22b-thinking-2507 | Qwen3 235B A22b Thinking 2507 | $0.3/Mt | - | $3/Mt | 131072 | 32768 |
| - | llama-3.1-8b-instruct | Llama 3.1 8B Instruct | $0.02/Mt | - | $0.05/Mt | 16384 | 16384 |
| New | gemma-3-12b-it | Gemma3 12B | $0.05/Mt | - | $0.1/Mt | 131072 | 8192 |
| - | glm-4.5v | GLM 4.5V | $0.6/Mt | $0.11/Mt | $1.8/Mt | 65536 | 16384 |
| - | OpenAI | OpenAI: GPT OSS 20B | $0.04/Mt | - | $0.15/Mt | 131072 | 32768 |
| - | qwen3-235b-a22b-instruct-2507 | Qwen3 235B A22B Instruct 2507 | $0.09/Mt | - | $0.58/Mt | 131072 | 16384 |
| - | deepseek-r1-distill-qwen-14b | DeepSeek R1 Distill Qwen 14B | $0.15/Mt | - | $0.15/Mt | 32768 | 16384 |
| - | llama-3.3-70b-instruct | Llama 3.3 70B Instruct | $0.13/Mt | - | $0.39/Mt | 131072 | 120000 |
| - | qwen-2.5-72b-instruct | Qwen 2.5 72B Instruct | $0.38/Mt | - | $0.4/Mt | 32000 | 8192 |
| - | Mistral | Mistral Nemo | $0.04/Mt | - | $0.17/Mt | 60288 | 16000 |
| - | minimax-m1-80k | MiniMax M1 | $0.55/Mt | - | $2.2/Mt | 1000000 | 40000 |
| - | deepseek-r1-0528 | DeepSeek R1 0528 | $0.7/Mt | $0.35/Mt | $2.5/Mt | 163840 | 32768 |
| - | deepseek-r1-distill-qwen-32b | DeepSeek R1 Distill Qwen 32B | $0.3/Mt | - | $0.3/Mt | 64000 | 32000 |
| - | llama-3-8b-instruct | Llama 3 8B Instruct | $0.04/Mt | - | $0.04/Mt | 8192 | 8192 |
| - | Azure | Wizardlm 2 8x22B | $0.62/Mt | - | $0.62/Mt | 65535 | 8000 |
| Dedicated | deepseek-r1-0528-qwen3-8b | DeepSeek R1 0528 Qwen3 8B | $0.06/Mt | - | $0.09/Mt | 128000 | 32000 |
| - | deepseek-r1-distill-llama-70b | DeepSeek R1 Distill LLama 70B | $0.8/Mt | - | $0.8/Mt | 8192 | 8192 |
| - | llama-3-70b-instruct | Llama3 70B Instruct | $0.51/Mt | - | $0.74/Mt | 8192 | 8000 |
| - | qwen3-235b-a22b-fp8 | Qwen3 235B A22B | $0.2/Mt | - | $0.8/Mt | 40960 | 20000 |
| - | llama-4-maverick-17b-128e-instruct-fp8 | Llama 4 Maverick Instruct | $0.17/Mt | - | $0.85/Mt | 1048576 | 8192 |
| Dedicated | llama-4-scout-17b-16e-instruct | Llama 4 Scout Instruct | $0.1/Mt | - | $0.5/Mt | 131072 | 131072 |
| - | hermes-2-pro-llama-3-8b | Hermes 2 Pro Llama 3 8B | $0.14/Mt | - | $0.14/Mt | 8192 | 8192 |
| - | qwen2.5-vl-72b-instruct | Qwen2.5 VL 72B Instruct | $0.8/Mt | - | $0.8/Mt | 32768 | 32768 |
| - | l3-70b-euryale-v2.1 | L3 70B Euryale V2.1 | $1.48/Mt | - | $1.48/Mt | 8192 | 8192 |
| - | Wenxin | ERNIE-4.5-21B-A3B-Thinking | $0.07/Mt | - | $0.28/Mt | 131072 | 65536 |
| - | l3-8b-lunaris | Sao10k L3 8B Lunaris | $0.05/Mt | - | $0.05/Mt | 8192 | 8192 |
| - | Baichuan | BaiChuan M2 32B | $0.07/Mt | - | $0.07/Mt | 131072 | 131072 |
| - | glm-4.1v-9b-thinking | GLM 4.1V 9B Thinking | $0.035/Mt | - | $0.138/Mt | 65536 | 8000 |
| - | Wenxin | ERNIE 4.5 VL 424B A47B | $0.42/Mt | - | $1.25/Mt | 123000 | 16000 |
| - | Wenxin | ERNIE 4.5 300B A47B | $0.28/Mt | - | $1.1/Mt | 123000 | 12000 |
| - | deepseek-prover-v2-671b | Deepseek Prover V2 671B | $0.7/Mt | - | $2.5/Mt | 160000 | 160000 |
| - | qwen3-32b-fp8 | Qwen3 32B | $0.1/Mt | - | $0.45/Mt | 40960 | 20000 |
| - | qwen3-30b-a3b-fp8 | Qwen3 30B A3B | $0.09/Mt | - | $0.45/Mt | 40960 | 20000 |
| - | gemma-3-27b-it | Gemma 3 27B | $0.119/Mt | - | $0.2/Mt | 98304 | 16384 |
| - | deepseek-v3-turbo | DeepSeek V3 (Turbo) | $0.4/Mt | - | $1.3/Mt | 64000 | 16000 |
| - | deepseek-r1-turbo | DeepSeek R1 (Turbo) | $0.7/Mt | - | $2.5/Mt | 64000 | 16000 |
| - | L3-8B-Stheno-v3.2 | L3 8B Stheno V3.2 | $0.05/Mt | - | $0.05/Mt | 8192 | 32000 |
| - | - | Mythomax L2 13B | $0.09/Mt | - | $0.09/Mt | 4096 | 32000 |
| - | qwen3-vl-8b-instruct | qwen/qwen3-vl-8b-instruct | $0.08/Mt | - | $0.5/Mt | 131072 | 32768 |
| - | glm-4.5-air | zai-org/glm-4.5-air | $0.13/Mt | - | $0.85/Mt | 131072 | 98304 |
| - | qwen3-vl-30b-a3b-instruct | qwen/qwen3-vl-30b-a3b-instruct | $0.2/Mt | - | $0.7/Mt | 131072 | 32768 |
| - | qwen3-vl-30b-a3b-thinking | qwen/qwen3-vl-30b-a3b-thinking | $0.2/Mt | - | $1/Mt | 131072 | 32768 |
| New | qwen-mt-plus | Qwen MT Plus | $0.25/Mt | - | $0.75/Mt | 4096 | 2048 |
| - | Wenxin | ERNIE 4.5 VL 28B A3B | $0.14/Mt | - | $0.56/Mt | 30000 | 8000 |
| - | Wenxin | ERNIE 4.5 21B A3B | $0.07/Mt | - | $0.28/Mt | 120000 | 8000 |
| Dedicated | qwen3-8b-fp8 | Qwen3 8B | $0.035/Mt | - | $0.138/Mt | 128000 | 20000 |
| - | qwen3-4b-fp8 | Qwen3 4B | $0.03/Mt | - | $0.03/Mt | 128000 | 20000 |
| - | qwen2.5-7b-instruct | Qwen2.5 7B Instruct | $0.07/Mt | - | $0.07/Mt | 32000 | 32000 |
| - | llama-3.2-3b-instruct | Llama 3.2 3B Instruct | $0.03/Mt | - | $0.05/Mt | 32768 | 32000 |
| - | l31-70b-euryale-v2.2 | L31 70B Euryale V2.2 | $1.48/Mt | - | $1.48/Mt | 8192 | 8192 |

Dedicated Endpoint

Enterprise-Grade Infrastructure for AI

For enterprises that require higher performance, tailored SLAs, or private hosting for custom models:
  • Custom pricing
  • Guaranteed uptime & latency
  • Unlimited scale
  • Dedicated clusters