On-demand deployments let you run google/gemma-3-12b-it on dedicated GPUs, backed by a high-performance serving stack, with high reliability and no rate limits.
200+ models, on-demand GPUs, and secure agent runtimes — unified under one API. Free to start, scales as you grow.