Large Language Models

Browse our supported open-source models and deploy them on dedicated endpoints.

| Tag | Provider | Model ID | Model | Input | Output | Context | Type | Deployment |
|---|---|---|---|---|---|---|---|---|
| New | | qwen3-next-80b-a3b-instruct | Qwen3 Next 80B A3B Instruct | $0.15/M | $1.5/M | 65536 | LLM | Serverless |
| New | | qwen3-next-80b-a3b-thinking | Qwen3 Next 80B A3B Thinking | $0.15/M | $1.5/M | 65536 | LLM | Serverless |
| New | MoonshotAI | | Kimi K2 0905 | $0.6/M | $2.5/M | 262144 | LLM | Serverless |
| New | | deepseek-v3.1 | DeepSeek V3.1 | $0.27/M | $1/M | 163840 | LLM | Serverless |
| | | qwen3-coder-480b-a35b-instruct | Qwen3 Coder 480B A35B Instruct | $0.29/M | $1.2/M | 262144 | LLM | Serverless |
| | OpenAI | | OpenAI GPT OSS 120B | $0.1/M | $0.5/M | 131072 | LLM | Serverless |
| | MoonshotAI | | Kimi K2 Instruct | $0.57/M | $2.3/M | 131072 | LLM | Serverless |
| Hot | | deepseek-v3-0324 | DeepSeek V3 0324 | $0.28/M | $1.14/M | 163840 | LLM | Serverless |
| | | glm-4.5 | GLM-4.5 | $0.6/M | $2.2/M | 131072 | LLM | Serverless |
| | | qwen3-235b-a22b-thinking-2507 | Qwen3 235B A22B Thinking 2507 | $0.3/M | $3/M | 131072 | LLM | Serverless |
| | | llama-3.1-8b-instruct | Llama 3.1 8B Instruct | $0.02/M | $0.05/M | 16384 | LLM | Serverless |
| New | | gemma-3-12b-it | Gemma3 12B | $0.05/M | $0.1/M | 131072 | LLM | Serverless |
| | | glm-4.5v | GLM 4.5V | $0.6/M | $1.8/M | 65536 | LLM | Serverless |
| | OpenAI | | OpenAI: GPT OSS 20B | $0.05/M | $0.2/M | 131072 | LLM | Serverless |
| | | qwen3-235b-a22b-instruct-2507 | Qwen3 235B A22B Instruct 2507 | $0.15/M | $0.8/M | 131072 | LLM | Serverless |
| | | deepseek-r1-distill-qwen-14b | DeepSeek R1 Distill Qwen 14B | $0.15/M | $0.15/M | 64000 | LLM | Serverless |
| | | llama-3.3-70b-instruct | Llama 3.3 70B Instruct | $0.13/M | $0.39/M | 131072 | LLM | Serverless |
| | | qwen-2.5-72b-instruct | Qwen 2.5 72B Instruct | $0.38/M | $0.4/M | 32000 | LLM | Serverless |
| | Mistral | | Mistral Nemo | $0.04/M | $0.17/M | 60288 | LLM | Serverless |
| | | minimax-m1-80k | MiniMax M1 | $0.55/M | $2.2/M | 1000000 | LLM | Serverless |
| | | deepseek-r1-0528 | DeepSeek R1 0528 | $0.7/M | $2.5/M | 163840 | LLM | Serverless |
| | | deepseek-r1-distill-qwen-32b | DeepSeek R1 Distill Qwen 32B | $0.3/M | $0.3/M | 64000 | LLM | Serverless |
| | | llama-3-8b-instruct | Llama 3 8B Instruct | $0.04/M | $0.04/M | 8192 | LLM | Serverless |
| | Azure | | Wizardlm 2 8x22B | $0.62/M | $0.62/M | 65535 | LLM | Serverless |
| Dedicated | | deepseek-r1-0528-qwen3-8b | DeepSeek R1 0528 Qwen3 8B | $0.06/M | $0.09/M | 128000 | LLM | Serverless |
| Dedicated | | deepseek-r1-distill-llama-8b | DeepSeek R1 Distill Llama 8B | $0.04/M | $0.04/M | 32000 | LLM | Serverless |
| | | deepseek-r1-distill-llama-70b | DeepSeek R1 Distill Llama 70B | $0.8/M | $0.8/M | 32000 | LLM | Serverless |
| | Mistral | | Mistral 7B Instruct | $0.029/M | $0.059/M | 32768 | LLM | Serverless |
| | | llama-3-70b-instruct | Llama3 70B Instruct | $0.51/M | $0.74/M | 8192 | LLM | Serverless |
| | | qwen3-235b-a22b-fp8 | Qwen3 235B A22B | $0.2/M | $0.8/M | 40960 | LLM | Serverless |
| | | llama-4-maverick-17b-128e-instruct-fp8 | Llama 4 Maverick Instruct | $0.17/M | $0.85/M | 1048576 | LLM | Serverless |
| Dedicated | | llama-4-scout-17b-16e-instruct | Llama 4 Scout Instruct | $0.1/M | $0.5/M | 131072 | LLM | Serverless |
| | | hermes-2-pro-llama-3-8b | Hermes 2 Pro Llama 3 8B | $0.14/M | $0.14/M | 8192 | LLM | Serverless |
| | | qwen2.5-vl-72b-instruct | Qwen2.5 VL 72B Instruct | $0.8/M | $0.8/M | 32768 | LLM | Serverless |
| | | l3-70b-euryale-v2.1 | L3 70B Euryale V2.1 | $1.48/M | $1.48/M | 8192 | LLM | Serverless |
| | Mistral | | Dolphin Mixtral 8x22B | $0.9/M | $0.9/M | 16000 | LLM | Serverless |
| | | | Midnight Rose 70B | $0.8/M | $0.8/M | 4096 | LLM | Serverless |
| | | l3-8b-lunaris | Sao10k L3 8B Lunaris | $0.05/M | $0.05/M | 8192 | LLM | Serverless |
| | Baichuan | | BaiChuan M2 32B | $0.07/M | $0.07/M | 131072 | LLM | Serverless |
| | | glm-4.1v-9b-thinking | GLM 4.1V 9B Thinking | $0.035/M | $0.138/M | 65536 | LLM | Serverless |
| | Wenxin | | ERNIE 4.5 VL 424B A47B | $0.42/M | $1.25/M | 123000 | LLM | Serverless |
| | Wenxin | | ERNIE 4.5 300B A47B | $0.28/M | $1.1/M | 123000 | LLM | Serverless |
| | | deepseek-prover-v2-671b | DeepSeek Prover V2 671B | $0.7/M | $2.5/M | 160000 | LLM | Serverless |
| | | qwen3-32b-fp8 | Qwen3 32B | $0.1/M | $0.45/M | 40960 | LLM | Serverless |
| | | qwen3-30b-a3b-fp8 | Qwen3 30B A3B | $0.09/M | $0.45/M | 40960 | LLM | Serverless |
| | | gemma-3-27b-it | Gemma 3 27B | $0.119/M | $0.2/M | 32768 | LLM | Serverless |
| | | deepseek-v3-turbo | DeepSeek V3 (Turbo) | $0.4/M | $1.3/M | 64000 | LLM | Serverless |
| | | deepseek-r1-turbo | DeepSeek R1 (Turbo) | $0.7/M | $2.5/M | 64000 | LLM | Serverless |
| | | L3-8B-Stheno-v3.2 | L3 8B Stheno V3.2 | $0.05/M | $0.05/M | 8192 | LLM | Serverless |
| | | | Mythomax L2 13B | $0.09/M | $0.09/M | 4096 | LLM | Serverless |
| New | | qwen-mt-plus | Qwen MT Plus | $0.25/M | $0.75/M | 4096 | LLM | Serverless |
| | Wenxin | | ERNIE 4.5 VL 28B A3B | $0.14/M | $0.56/M | 30000 | LLM | Serverless |
| | Wenxin | | ERNIE 4.5 21B A3B | $0.07/M | $0.28/M | 120000 | LLM | Serverless |
| | Wenxin | | ERNIE 4.5 0.3B | $0/M | $0/M | 120000 | LLM | Serverless |
| Free | | gemma-3-1b-it | Gemma3 1B IT | $0/M | $0/M | 32768 | LLM | Serverless |
| Dedicated | | qwen3-8b-fp8 | Qwen3 8B | $0.035/M | $0.138/M | 128000 | LLM | Serverless |
| Free | | qwen3-4b-fp8 | Qwen3 4B | $0/M | $0/M | 128000 | LLM | Serverless |
| | | glm-4-32b-0414 | GLM-4-32B-0414 | $0.55/M | $1.66/M | 32000 | LLM | Serverless |
| | | qwen2.5-7b-instruct | Qwen2.5 7B Instruct | $0.07/M | $0.07/M | 32000 | LLM | Serverless |
| Free | | llama-3.2-1b-instruct | Llama 3.2 1B Instruct | $0/M | $0/M | 131000 | LLM | Serverless |
| | | llama-3.2-3b-instruct | Llama 3.2 3B Instruct | $0.03/M | $0.05/M | 32768 | LLM | Serverless |
| | | l31-70b-euryale-v2.2 | L31 70B Euryale V2.2 | $1.48/M | $1.48/M | 8192 | LLM | Serverless |
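
To compare models on cost, the listed prices can be turned into a per-request estimate. The sketch below is illustrative only: it assumes "/M In" and "/M Out" mean US dollars per million input and output tokens, and the `PRICES` table and `estimate_cost` helper are hypothetical names made up for this example, not part of any official SDK. The three price pairs are copied from the listing above.

```python
from decimal import Decimal

# Assumption: "/M" in the listing means "per one million tokens".
TOKENS_PER_MILLION = Decimal(1_000_000)

# (input price, output price) in USD per million tokens, copied from the listing.
# Decimal avoids binary floating-point rounding in money arithmetic.
PRICES = {
    "llama-3.1-8b-instruct": (Decimal("0.02"), Decimal("0.05")),
    "deepseek-v3.1": (Decimal("0.27"), Decimal("1")),
    "qwen3-4b-fp8": (Decimal("0"), Decimal("0")),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price_in, price_out = PRICES[model_id]  # KeyError for models not in the table
    total = price_in * input_tokens + price_out * output_tokens
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example: a 120,000-token prompt with an 8,000-token reply.
    cost = estimate_cost("deepseek-v3.1", 120_000, 8_000)
    print(f"Estimated cost: ${cost}")
```

Input and output tokens are priced separately, so a model with cheap input but expensive output can cost more than expected on generation-heavy workloads.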

Dedicated Endpoint

Enterprise-Grade Infrastructure for AI

For enterprises that require higher performance, tailored SLAs, or private hosting for custom models:
  • Custom pricing
  • Guaranteed uptime & latency
  • Unlimited scale
  • Dedicated clusters