Pricing

Explore pricing for our Model APIs and GPU resources. Find the right plan to match your needs with transparent rates and flexible options.

Batch inference is available at an introductory 50% discount on input and output tokens for supported models.

Deepseek

Advanced AI models from DeepSeek, offering cutting-edge reasoning capabilities and competitive pricing for enterprise and research applications.

Model Name	Context	Input	Output
deepseek/deepseek-v3-0324	163,840	$0.28 /M Tokens	$1.14 /M Tokens
deepseek/deepseek-r1-0528	163,840	$0.7 /M Tokens	$2.5 /M Tokens
deepseek/deepseek-r1-0528-qwen3-8b	128,000	$0.06 /M Tokens	$0.09 /M Tokens
deepseek/deepseek-v3-turbo	64,000	$0.4 /M Tokens	$1.3 /M Tokens
deepseek/deepseek-r1-turbo	64,000	$0.7 /M Tokens	$2.5 /M Tokens
deepseek/deepseek-prover-v2-671b	160,000	$0.7 /M Tokens	$2.5 /M Tokens
deepseek/deepseek-r1-distill-llama-8b	32,000	$0.04 /M Tokens	$0.04 /M Tokens
deepseek/deepseek-r1-distill-qwen-14b	64,000	$0.15 /M Tokens	$0.15 /M Tokens
deepseek/deepseek-r1-distill-qwen-32b	64,000	$0.3 /M Tokens	$0.3 /M Tokens
deepseek/deepseek-r1-distill-llama-70b	32,000	$0.8 /M Tokens	$0.8 /M Tokens
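
As a rough illustration of how the per-million-token rates translate into a per-request cost, here is a minimal Python sketch. The function name `token_cost` and the example token counts are hypothetical, and it assumes billing is strictly proportional to token count with no minimum charge or rounding, which this page does not confirm.

```python
from decimal import Decimal


def token_cost(input_tokens: int, output_tokens: int,
               input_rate: str, output_rate: str) -> Decimal:
    """Estimate the USD cost of one request.

    Rates are given in USD per million tokens, as listed in the tables.
    Assumes purely proportional billing (an assumption, not a documented rule).
    """
    per_million = Decimal(1_000_000)
    total = (Decimal(input_tokens) * Decimal(input_rate)
             + Decimal(output_tokens) * Decimal(output_rate))
    return total / per_million


# deepseek/deepseek-v3-0324: $0.28 input and $1.14 output per million tokens.
# The token counts below are made up for the example.
cost = token_cost(12_000, 3_000, "0.28", "1.14")
print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that small per-request amounts are not distorted by binary rounding when many requests are summed.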

Llama

Meta's Llama models providing state-of-the-art language understanding with open architecture designed for diverse applications.

Model Name	Context	Input	Output
meta-llama/llama-4-maverick-17b-128e-instruct-fp8	1,048,576	$0.17 /M Tokens	$0.85 /M Tokens
meta-llama/llama-4-scout-17b-16e-instruct	131,072	$0.1 /M Tokens	$0.5 /M Tokens
meta-llama/llama-3.1-8b-instruct	16,384	$0.02 /M Tokens	$0.05 /M Tokens
meta-llama/llama-3.3-70b-instruct	131,072	$0.13 /M Tokens	$0.39 /M Tokens
meta-llama/llama-3-8b-instruct	8,192	$0.04 /M Tokens	$0.04 /M Tokens
meta-llama/llama-3-70b-instruct	8,192	$0.51 /M Tokens	$0.74 /M Tokens
meta-llama/llama-3.2-1b-instruct	131,000	Free	Free
meta-llama/llama-3.2-3b-instruct	32,768	$0.03 /M Tokens	$0.05 /M Tokens

Qwen

Qwen series models offering efficient language processing with various parameter sizes, from lightweight to enterprise-grade solutions.

Model Name	Context	Input	Output
qwen/qwen3-coder-480b-a35b-instruct	262,144	$0.64 /M Tokens	$2.5 /M Tokens
qwen/qwen3-235b-a22b-thinking-2507	131,072	$0.3 /M Tokens	$3 /M Tokens
qwen/qwen3-235b-a22b-instruct-2507	262,144	$0.15 /M Tokens	$0.8 /M Tokens
qwen/qwen3-30b-a3b-fp8	40,960	$0.1 /M Tokens	$0.45 /M Tokens
qwen/qwen3-32b-fp8	40,960	$0.1 /M Tokens	$0.45 /M Tokens
qwen/qwen2.5-vl-72b-instruct	32,768	$0.8 /M Tokens	$0.8 /M Tokens
qwen/qwen3-235b-a22b-fp8	40,960	$0.2 /M Tokens	$0.8 /M Tokens
qwen/qwen-2.5-72b-instruct	32,000	$0.38 /M Tokens	$0.4 /M Tokens
qwen/qwen3-8b-fp8	128,000	$0.035 /M Tokens	$0.138 /M Tokens
qwen/qwen3-4b-fp8	128,000	Free	Free
qwen/qwen2.5-7b-instruct	32,000	$0.07 /M Tokens	$0.07 /M Tokens

Baidu

Baidu's ERNIE models providing advanced Chinese language understanding and multimodal capabilities, optimized for Chinese applications with competitive pricing.

Model Name	Context	Input	Output
baidu/ernie-4.5-vl-424b-a47b	123,000	$0.42 /M Tokens	$1.25 /M Tokens
baidu/ernie-4.5-300b-a47b-paddle	123,000	$0.28 /M Tokens	$1.1 /M Tokens
baidu/ernie-4.5-vl-28b-a3b	30,000	$0.14 /M Tokens	$0.56 /M Tokens
baidu/ernie-4.5-21B-a3b	120,000	$0.07 /M Tokens	$0.28 /M Tokens
baidu/ernie-4.5-0.3b	120,000	Free	Free

Gemma

Google's Gemma models offering high-quality language processing with excellent performance for various NLP tasks.

Model Name	Context	Input	Output
google/gemma-3-27b-it	32,000	$0.119 /M Tokens	$0.2 /M Tokens
google/gemma-3-1b-it	32,768	Free	Free

THUDM

GLM series models from Tsinghua University, featuring advanced Chinese language understanding and generation capabilities.

Model Name	Context	Input	Output
zai-org/glm-4.5v	65,536	$0.6 /M Tokens	$1.8 /M Tokens
zai-org/glm-4.5	131,072	$0.6 /M Tokens	$2.2 /M Tokens
thudm/glm-4.1v-9b-thinking	65,536	$0.035 /M Tokens	$0.138 /M Tokens
thudm/glm-4-32b-0414	32,000	$0.24 /M Tokens	$0.24 /M Tokens

Sao10K

Specialized fine-tuned models optimized for creative and roleplay applications with enhanced storytelling capabilities.

Model Name	Context	Input	Output
Sao10K/L3-8B-Stheno-v3.2	8,192	$0.05 /M Tokens	$0.05 /M Tokens
sao10k/l3-70b-euryale-v2.1	8,192	$1.48 /M Tokens	$1.48 /M Tokens
sao10k/l3-8b-lunaris	8,192	$0.05 /M Tokens	$0.05 /M Tokens
sao10k/l31-70b-euryale-v2.2	8,192	$1.48 /M Tokens	$1.48 /M Tokens

Mistralai

Efficient and powerful language models from Mistral AI, designed for both commercial and open-source applications.

Model Name	Context	Input	Output
mistralai/mistral-nemo	60,288	$0.04 /M Tokens	$0.17 /M Tokens
mistralai/mistral-7b-instruct	32,768	$0.029 /M Tokens	$0.059 /M Tokens

Nousresearch

Research-focused AI models designed for advanced reasoning and enhanced instruction following capabilities.

Model Name	Context	Input	Output
nousresearch/hermes-2-pro-llama-3-8b	8,192	$0.14 /M Tokens	$0.14 /M Tokens

CognitiveComputations

Specialized AI models focused on advanced cognitive tasks and complex reasoning applications.

Model Name	Context	Input	Output
cognitivecomputations/dolphin-mixtral-8x22b	16,000	$0.9 /M Tokens	$0.9 /M Tokens

Sophosympatheia

Fine-tuned models designed for enhanced emotional intelligence and nuanced conversational capabilities.

Model Name	Context	Input	Output
sophosympatheia/midnight-rose-70b	4,096	$0.8 /M Tokens	$0.8 /M Tokens

Gryphe

Innovative AI models from Gryphe, delivering specialized language understanding with a focus on efficiency and adaptability for niche applications.

Model Name	Context	Input	Output
gryphe/mythomax-l2-13b	4,096	$0.09 /M Tokens	$0.09 /M Tokens

MiniMax

Minimax AI's advanced language models delivering robust conversational AI capabilities with optimized performance for customer service, content generation, and creative applications, featuring strong multilingual support and enterprise-ready scalability.

Model Name	Context	Input	Output
minimaxai/minimax-m1-80k	1,000,000	$0.55 /M Tokens	$2.2 /M Tokens

Microsoft

Microsoft’s models deliver state-of-the-art multilingual capabilities with enterprise-grade security, seamlessly integrated into the Azure cloud ecosystem. Optimized for cross-platform collaboration and business intelligence, these models excel in document understanding and generation—key strengths for Microsoft 365 workflows.

Model Name	Context	Input	Output
microsoft/wizardlm-2-8x22b	65,535	$0.62 /M Tokens	$0.62 /M Tokens

Mixture of Experts

Premium collection of state-of-the-art AI models featuring advanced reasoning, mathematical proof capabilities, and cutting-edge language understanding across multiple domains.

Model Name	Context	Input	Output
deepseek/deepseek-v3-0324	163,840	$0.28 /M Tokens	$1.14 /M Tokens
zai-org/glm-4.5v	65,536	$0.6 /M Tokens	$1.8 /M Tokens
openai/gpt-oss-120b	131,072	$0.1 /M Tokens	$0.5 /M Tokens
openai/gpt-oss-20b	131,072	$0.05 /M Tokens	$0.2 /M Tokens
zai-org/glm-4.5	131,072	$0.6 /M Tokens	$2.2 /M Tokens
qwen/qwen3-235b-a22b-thinking-2507	131,072	$0.3 /M Tokens	$3 /M Tokens
moonshotai/kimi-k2-instruct	131,072	$0.57 /M Tokens	$2.3 /M Tokens
deepseek/deepseek-r1-0528	163,840	$0.7 /M Tokens	$2.5 /M Tokens
baidu/ernie-4.5-vl-424b-a47b	123,000	$0.42 /M Tokens	$1.25 /M Tokens
baidu/ernie-4.5-300b-a47b-paddle	123,000	$0.28 /M Tokens	$1.1 /M Tokens
qwen/qwen3-30b-a3b-fp8	40,960	$0.1 /M Tokens	$0.45 /M Tokens
minimaxai/minimax-m1-80k	1,000,000	$0.55 /M Tokens	$2.2 /M Tokens
qwen/qwen3-32b-fp8	40,960	$0.1 /M Tokens	$0.45 /M Tokens
qwen/qwen3-235b-a22b-fp8	40,960	$0.2 /M Tokens	$0.8 /M Tokens
deepseek/deepseek-v3-turbo	64,000	$0.4 /M Tokens	$1.3 /M Tokens
meta-llama/llama-4-maverick-17b-128e-instruct-fp8	1,048,576	$0.17 /M Tokens	$0.85 /M Tokens
deepseek/deepseek-r1-turbo	64,000	$0.7 /M Tokens	$2.5 /M Tokens
deepseek/deepseek-prover-v2-671b	160,000	$0.7 /M Tokens	$2.5 /M Tokens
meta-llama/llama-4-scout-17b-16e-instruct	131,072	$0.1 /M Tokens	$0.5 /M Tokens
baidu/ernie-4.5-vl-28b-a3b	30,000	$0.14 /M Tokens	$0.56 /M Tokens
baidu/ernie-4.5-21B-a3b	120,000	$0.07 /M Tokens	$0.28 /M Tokens
baidu/ernie-4.5-0.3b	120,000	Free	Free

Embeddings

Image

Pricing may vary based on image dimensions, inference steps, and upscaling factors. Use the Pricing Calculator for an estimate.

API Name	Width&Height	Steps/Scale	Pricing
Text to Image	512*512	5	$0.001 /image
Image to Image	512*512	5	$0.001 /image
Remove Background	-	-	$0.017 /image
Replace Background	-	-	$0.0255 /image
Inpainting	512*512	5	$0.0015 /image
Remove Text	-	-	$0.017 /image
Cleanup	-	-	$0.017 /image
Merge Face	-	-	$0.0255 /image

API Name	Width&Height	Pricing
Seedream 3.0 Text to Image	-	$0.03 /image
Qwen-Image Text to Image	-	$0.02 /image

Video

Pricing may vary based on the number of frames, chosen model, and inference steps. Use the Pricing Calculator for an estimate.

API Name	Total Frames	Steps	Pricing
Text to Video	32	20	$0.0307 /video

API Name	Mode	Duration	Resolution	Pricing
Kling V1.6 Text to Video	Standard	5s	720P	$0.27 /video
Kling V1.6 Text to Video	Standard	10s	720P	$0.54 /video
Kling V1.6 Image to Video	Standard	5s	720P	$0.27 /video
Kling V1.6 Image to Video	Standard	10s	720P	$0.54 /video
Kling V1.6 Image to Video	Professional	5s	1080P	$0.46 /video
Kling V1.6 Image to Video	Professional	10s	1080P	$0.92 /video
MiniMax Video 01	-	6s	720P	$0.40 /video
MiniMax Video 02	-	6s	768P	$0.25 /video
MiniMax Video 02	-	10s	768P	$0.50 /video
MiniMax Video 02	-	6s	1080P	$0.44 /video
Hunyuan Video Fast	-	5s	1280*720 | 720*1280	$0.30 /video
Wan 2.1 Text to Video	-	5s	1280*720 | 720*1280	$0.30 /video
Wan 2.1 Text to Video	-	5s	832*480 | 480*832	$0.20 /video
Wan 2.1 Text to Video	fast_mode	5s	1280*720 | 720*1280	$0.225 /video
Wan 2.1 Text to Video	fast_mode	5s	832*480 | 480*832	$0.125 /video
Wan 2.1 Image to Video	-	5s	1280*720 | 720*1280	$0.30 /video
Wan 2.1 Image to Video	-	5s	832*480 | 480*832	$0.20 /video
Wan 2.1 Image to Video	fast_mode	5s	1280*720 | 720*1280	$0.225 /video
Wan 2.1 Image to Video	fast_mode	5s	832*480 | 480*832	$0.125 /video
Wan 2.2 Text to Video	-	5s	480P	$0.09 /video
Wan 2.2 Text to Video	-	5s	1080P	$0.40 /video
Wan 2.2 Image to Video	-	5s	480P	$0.09 /video
Wan 2.2 Image to Video	-	5s	1080P	$0.40 /video
Vidu Q1 Text to Video	general style	5s	1080P	$0.36 /video
Vidu Q1 Text to Video	anime style	5s	1080P	$0.36 /video
Vidu Q1 Image to Video	-	5s	1080P	$0.36 /video
Vidu Q1 Start End to Video	-	5s	1080P	$0.36 /video
Vidu Q1 Reference to Video	-	5s	1080P	$0.36 /video
Vidu 2.0 Image to Video	-	4s	360P	$0.09 /video
Vidu 2.0 Image to Video	-	4s	720P	$0.18 /video
Vidu 2.0 Image to Video	-	4s	1080P	$0.27 /video
Vidu 2.0 Image to Video	-	8s	720P	$0.27 /video
Vidu 2.0 Reference to Video	-	4s	360P	$0.09 /video
Vidu 2.0 Reference to Video	-	4s	720P	$0.18 /video
Vidu 2.0 Start End to Video	-	4s	360P	$0.09 /video
Vidu 2.0 Start End to Video	-	4s	720P	$0.18 /video
Vidu 2.0 Start End to Video	-	4s	1080P	$0.27 /video
Vidu 2.0 Start End to Video	-	8s	720P	$0.27 /video

API Name	Model	Steps	Pricing
Image to Video	SVD-XT	20	$0.024 /video
Image to Video	SVD	20	$0.0134 /video

Audio

API Name	Mode	Pricing
Text to Speech	-	$15 /1M characters
MiniMax speech-02-hd	T2A / T2A Async	$80 /1M characters
MiniMax speech-02-turbo	T2A / T2A Async	$48 /1M characters
MiniMax Voice-Cloning	-	$2.4 /voice
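
To compare the per-character speech rates for a given script length, a short Python sketch like the following may help. The names `TTS_RATES` and `tts_cost` and the 50,000-character script are hypothetical, and the calculation assumes cost scales linearly with character count, with no minimum charge, which this page does not state.

```python
from decimal import Decimal

# USD per 1M characters, copied from the Audio table.
TTS_RATES = {
    "Text to Speech": Decimal("15"),
    "MiniMax speech-02-hd": Decimal("80"),
    "MiniMax speech-02-turbo": Decimal("48"),
}


def tts_cost(characters: int, api_name: str) -> Decimal:
    """Estimate the USD cost of synthesizing `characters` characters.

    Assumes strictly linear billing (an assumption, not a documented rule).
    """
    return Decimal(characters) * TTS_RATES[api_name] / Decimal(1_000_000)


script_length = 50_000  # hypothetical script size in characters
for name in TTS_RATES:
    print(f"{name}: ${tts_cost(script_length, name)}")
```

Voice cloning is priced per voice rather than per character, so it is left out of this estimate.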

Ready to build smarter? Start today.

Get started with Novita AI and unlock the power of affordable, reliable, and scalable AI inference for your applications.
