Large Language Models

ID | Name | Type | Price ($ per MTokens, in/out) | Context
deepseek-prover-v2-671b | Deepseek Prover V2 671B | Chat | $0.7/2.5 | 160000
qwen3-235b-a22b-fp8 | Qwen3-235B-A22B | Chat | $0.2/0.8 | 128000
qwen3-30b-a3b-fp8 | Qwen3 30B A3B | Chat | $0.1/0.45 | 128000
qwen3-32b-fp8 | Qwen3 32B | Chat | $0.1/0.45 | 128000
deepseek-v3-0324 | DeepSeek V3 0324 | Chat | $0.33/1.3 | 128000
qwen2.5-vl-72b-instruct | Qwen:Qwen2.5-vl-72b-instruct | Chat | $0.8/0.8 | 96000
deepseek-v3-turbo | DeepSeek V3 (Turbo) | Chat | $0.4/1.3 | 64000
deepseek-r1-turbo | DeepSeek R1 (Turbo) | Chat | $0.7/2.5 | 64000
llama-4-maverick-17b-128e-instruct-fp8 | Llama 4 Maverick Instruct | Chat | $0.17/0.85 | 1048576
- | Gemma 3 27B | Chat | $0.119/0.2 | 32000
- | Qwen: QwQ 32B | Chat | $0.18/0.2 | 32768
- | L3 8B Stheno V3.2 | Chat | $0.05/0.05 | 8192
- | Mythomax L2 13B | Chat | $0.09/0.09 | 4096
llama-4-scout-17b-16e-instruct | Llama 4 Scout Instruct | Chat | $0.1/0.5 | 131072
deepseek-r1-distill-llama-8b | DeepSeek: DeepSeek R1 Distill Llama 8B | Chat | $0.04/0.04 | 32000
deepseek_v3 | DeepSeek V3 | Chat | $0.89/0.89 | 64000
llama-3.1-8b-instruct | Llama 3.1 8B Instruct | Chat | $0.02/0.05 | 16384
deepseek-r1-distill-qwen-14b | DeepSeek: DeepSeek R1 Distill Qwen 14B | Chat | $0.15/0.15 | 64000
llama-3.3-70b-instruct | Llama 3.3 70B Instruct | Chat | $0.13/0.39 | 131072
qwen-2.5-72b-instruct | Qwen 2.5 72B Instruct | Chat | $0.38/0.4 | 32000
- | Mistral Nemo | Chat | $0.04/0.17 | 131072
deepseek-r1-distill-qwen-32b | DeepSeek: DeepSeek R1 Distill Qwen 32B | Chat | $0.3/0.3 | 64000
llama-3-8b-instruct | Llama 3 8B Instruct | Chat | $0.04/0.04 | 8192
- | Wizardlm 2 8x22B | Chat | $0.62/0.62 | 65535
deepseek-r1-distill-llama-70b | DeepSeek R1 Distill LLama 70B | Chat | $0.8/0.8 | 32000
llama-3.1-70b-instruct | Llama 3.1 70B Instruct | Chat | $0.119/0.39 | 32768
- | Gemma 2 9B | Chat | $0.08/0.08 | 8192
- | Mistral 7B Instruct | Chat | $0.029/0.059 | 32768
llama-3-70b-instruct | Llama3 70b Instruct | Chat | $0.51/0.74 | 8192
deepseek-r1 | DeepSeek R1 | Chat | $4/4 | 64000
hermes-2-pro-llama-3-8b | Hermes 2 Pro Llama 3 8B | Chat | $0.14/0.14 | 8192
- | L3 70B Euryale V2.1 | Chat | $1.48/1.48 | 8192
- | Dolphin Mixtral 8x22B | Chat | $0.9/0.9 | 16000
- | Airoboros L2 70B | Chat | $0.5/0.5 | 4096
- | Midnight Rose 70B | Chat | $0.8/0.8 | 4096
- | Sao10k L3 8B Lunaris | Chat | $0.05/0.05 | 8192
qwen3-0.6b-fp8 | Qwen3 0.6B | Chat | $0/0 | 32000
qwen3-1.7b-fp8 | Qwen3 1.7B | Chat | $0/0 | 32000
qwen3-8b-fp8 | Qwen3 8B | Chat | $0.035/0.138 | 128000
qwen3-4b-fp8 | Qwen3 4B | Chat | $0/0 | 128000
qwen3-14b-fp8 | Qwen3 14B | Chat | $0.07/0.275 | 128000
- | THUDM/GLM-4-9B-0414 | Chat | $0/0 | 32000
- | THUDM/GLM-Z1-9B-0414 | Chat | $0/0 | 32000
- | THUDM/GLM-Z1-32B-0414 | Chat | $0.24/0.24 | 32000
- | THUDM/GLM-4-32B-0414 | Chat | $0.24/0.24 | 32000
- | THUDM/GLM-Z1-Rumination-32B-0414 | Chat | $0.24/0.24 | 32000
qwen2.5-7b-instruct | Qwen/Qwen2.5-7B-Instruct | Chat | $0/0 | 32000
llama-3.2-1b-instruct | Llama 3.2 1B Instruct | Chat | $0/0 | 131000
llama-3.2-11b-vision-instruct | Llama 3.2 11B Vision Instruct | Chat | $0.06/0.06 | 32768
llama-3.2-3b-instruct | Llama 3.2 3B Instruct | Chat | $0.03/0.05 | 32768
llama-3.1-8b-instruct-bf16 | Llama 3.1 8B Instruct BF16 | Chat | $0.06/0.06 | 8192
- | L31 70B Euryale V2.2 | Chat | $1.48/1.48 | 8192
- | BAAI:BGE-M3 | Embedding | $0 (in only) | 8192
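
As a rough illustration of how the listed in/out prices translate into the cost of a request, the sketch below assumes that "MTokens" means one million tokens and that input and output tokens are billed separately at the listed rates. The function name and the token counts are hypothetical and chosen only for the example; actual billing may differ.

```python
def estimate_cost_usd(input_tokens, output_tokens, price_in, price_out):
    """Estimate a request's cost from prices given in USD per million tokens.

    Assumption: input and output tokens are billed independently
    at price_in and price_out respectively.
    """
    return (input_tokens / 1_000_000) * price_in + (output_tokens / 1_000_000) * price_out


# deepseek-v3-0324 is listed at $0.33/1.3 in/out per MTokens.
# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
cost = estimate_cost_usd(200_000, 50_000, price_in=0.33, price_out=1.3)
print(f"${cost:.4f}")  # prints $0.1310
```

The same function can be used to compare models for a fixed workload by swapping in each model's listed in/out prices.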

Dedicated Endpoint

Enterprise-Grade Infrastructure for AI

For enterprises that require higher performance, tailored SLAs, or private hosting for custom models:
  • Custom pricing
  • Guaranteed uptime & latency
  • Unlimited scale
  • Dedicated clusters

Can't afford API credits? Get up to $500 in free credits!

Refer a friend to Novita and you each earn $10 in LLM API credits, up to $500 in total.
