On-demand deployments let you run qwen/qwen3-8b-fp8 on dedicated GPUs, backed by a high-performance serving stack, with high reliability and no rate limits.
200+ models, on-demand GPUs, and secure agent runtimes — unified under one API. Free to start, scales as you grow.