---
name: Novitaai
description: Use when building AI applications that need model inference (LLMs, image/video/audio generation), running isolated code sandboxes for agents, or managing GPU compute resources. Reach for this skill when integrating OpenAI-compatible APIs, handling asynchronous tasks, deploying custom models, or executing untrusted code safely.
metadata:
    mintlify-proj: novitaai
    version: "1.0"
---

# Novita AI Skill

## Product summary

Novita AI is a cloud inference platform providing cost-effective access to 10,000+ AI models. Use it to ship LLM APIs (OpenAI-compatible), image/video/audio generation, GPU compute instances, and isolated sandboxes for agent code execution. Key endpoints: `https://api.novita.ai/openai` (LLM), `https://api.novita.ai/v3` (image/video/audio). Authenticate with Bearer tokens in the `Authorization` header. Primary docs: https://novita.ai/docs

## When to use

- **LLM inference**: Chat completions, text generation, embeddings, function calling, structured outputs, reasoning models
- **Media generation**: Text-to-image, image-to-image, text-to-video, image-to-video, text-to-speech, voice cloning
- **Batch processing**: Cost-effective offline inference for large datasets (up to 50k requests per batch, 24-hour completion window)
- **Agent sandboxes**: Securely execute AI-generated code with <200ms startup, multi-language support (Python, JavaScript, C++)
- **GPU compute**: Dedicated instances for long-running workloads or serverless endpoints for burstable tasks
- **Custom models**: Deploy fine-tuned models or LoRA adapters on dedicated endpoints with per-hour billing

## Quick reference

### API Authentication
```bash
Authorization: Bearer {{API_KEY}}
```
Get API keys from https://novita.ai/settings/key-management
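In Python, the header can be built once and reused; a minimal sketch, assuming the key is stored in an environment variable (`NOVITA_API_KEY` is an illustrative name, not an official one):

```python
import os
import requests

def auth_headers() -> dict:
    """Build the Bearer auth header from an environment variable."""
    api_key = os.environ.get("NOVITA_API_KEY", "<KEY>")
    return {"Authorization": f"Bearer {api_key}"}

# Example (uncomment to call the live API):
# resp = requests.get("https://api.novita.ai/openai/v1/models", headers=auth_headers())
# print(resp.status_code)
```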

### LLM Endpoints
| Task | Endpoint | Format |
|------|----------|--------|
| Chat | `/v1/chat/completions` | OpenAI-compatible |
| Completion | `/v1/completions` | OpenAI-compatible |
| Embeddings | `/v1/embeddings` | OpenAI-compatible |
| Batch | `/v1/batches` | JSONL input files |

### Image/Video/Audio Endpoints
| Task | Base URL | Pattern |
|------|----------|---------|
| Text-to-Image | `https://api.novita.ai/v3` | Async (returns task_id) |
| Image-to-Image | `https://api.novita.ai/v3` | Async (returns task_id) |
| Text-to-Video | `https://api.novita.ai/v3` | Async (returns task_id) |
| Task Result | `https://api.novita.ai/v3/async/task-result` | Poll with task_id |

### Common Parameters
- `model`: Model identifier (e.g., `deepseek/deepseek-r1`, `flux-1-schnell`)
- `max_tokens`: Output length limit
- `temperature`: Randomness (0–2, default 0.7)
- `stream`: Boolean for streaming responses
- `task_id`: Returned by async APIs; use to poll results

### Sandbox CLI Commands
```bash
npm install -g novita-sandbox-cli
novita-sandbox-cli sandbox list          # List running sandboxes
novita-sandbox-cli sandbox kill <id>     # Terminate sandbox
```

### GPU Instance Pricing Models
- **On-Demand**: Pay per second, no commitment
- **Spot**: 50% discount, 1-hour grace period, 1-hour advance notice before reclaim
- **Monthly Subscription**: 10%+ discount vs on-demand

## Decision guidance

| Scenario | Use This | Why |
|----------|----------|-----|
| Real-time chat/completion | Streaming LLM API | Low latency, immediate responses |
| Large dataset processing | Batch API | Higher rate limits, cost-effective, 24h window acceptable |
| Image/video generation | Async task API + polling | Async pattern, poll task result with task_id |
| Webhook notifications | Webhook listener | Avoid polling; receive ASYNC_TASK_RESULT events |
| Long-running model | GPU Instance | Dedicated resources, predictable performance |
| Variable workload | Serverless GPU | Auto-scaling, per-second billing, no cold starts |
| Custom/fine-tuned model | Dedicated Endpoint | Exclusive access, zero cold starts, LoRA support |
| Untrusted code execution | Agent Sandbox | System isolation, <200ms startup, multi-language |

## Workflow

### 1. Set up authentication
- Create account at https://novita.ai
- Generate API key in Key Management
- Store securely; include in `Authorization: Bearer {{KEY}}` header

### 2. Choose your integration path
- **LLM**: Use OpenAI SDK with base URL `https://api.novita.ai/openai`
- **Image/Video/Audio**: Call async endpoints, receive task_id, poll task result
- **Batch**: Upload JSONL file, create batch, check status, retrieve results
- **Sandbox**: Use Python/JavaScript SDK or CLI
- **GPU**: Use API or console to create/manage instances
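The batch path's JSONL input can be sketched as below. The `custom_id`/`method`/`url`/`body` field names follow the OpenAI batch input schema, which is an assumption based on the endpoint's OpenAI-compatible format; check the Batch API guide for the exact schema:

```python
import json

def build_batch_jsonl(prompts, model="deepseek/deepseek-r1"):
    """Build JSONL input for the Batch API: one request object per line,
    all targeting the same model (a batch must not mix models)."""
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }))
    return "\n".join(lines)
```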

### 3. Make your first request
For LLM (Python):
```python
from openai import OpenAI
client = OpenAI(base_url="https://api.novita.ai/openai", api_key="<KEY>")
response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)
```

For image generation (async pattern):
```bash
# 1. Submit task
curl -X POST https://api.novita.ai/v3/txt2img \
  -H "Authorization: Bearer <KEY>" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "cat", "model": "flux-1-schnell"}'
# Returns: {"task_id": "xxx"}

# 2. Poll result
curl "https://api.novita.ai/v3/async/task-result?task_id=xxx" \
  -H "Authorization: Bearer <KEY>"
# Returns: {"task": {"status": "TASK_STATUS_SUCCEED"}, "images": [...]}
```

### 4. Handle async tasks
- Image/video/audio APIs return `task_id` immediately
- Poll `/v3/async/task-result?task_id=<id>` until status is `TASK_STATUS_SUCCEED` or `TASK_STATUS_FAILED`
- Alternatively, set up webhook to receive `ASYNC_TASK_RESULT` events
- Results expire after 30 days; download promptly
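The polling loop above can be sketched in Python (hypothetical helper; the terminal status names are the ones this document uses):

```python
import time
import requests

RESULT_URL = "https://api.novita.ai/v3/async/task-result"
TERMINAL = ("TASK_STATUS_SUCCEED", "TASK_STATUS_FAILED")

def is_terminal(status: str) -> bool:
    """A task is done once it has succeeded or failed."""
    return status in TERMINAL

def wait_for_task(task_id: str, api_key: str,
                  interval: float = 2.0, timeout: float = 300.0) -> dict:
    """Poll the task-result endpoint until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(
            RESULT_URL,
            params={"task_id": task_id},
            headers={"Authorization": f"Bearer {api_key}"},
        )
        resp.raise_for_status()
        data = resp.json()
        status = data["task"]["status"]
        if status == "TASK_STATUS_FAILED":
            raise RuntimeError(f"task {task_id} failed: {data}")
        if is_terminal(status):
            return data
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {timeout}s")
```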

### 5. Monitor and optimize
- Check rate limits in response headers
- Use batch API for non-urgent bulk processing
- Monitor account balance; set up auto-top-up
- Review billing at https://novita.ai/billing

## Common gotchas

- **Async task polling**: Image/video/audio APIs are async. Don't expect immediate results; poll task result endpoint or use webhooks.
- **Task result expiration**: Results are deleted 30 days after completion. Retrieve promptly.
- **Batch file format**: JSONL only, one request per line. All requests in a batch must target the same model.
- **Rate limits**: 429 errors mean you've hit TPM (tokens per minute) or RPM (requests per minute). Retry with backoff or contact support.
- **API key security**: Never commit API keys. Use environment variables or secret managers.
- **Streaming timeout**: Long outputs may timeout without streaming. Use `stream=true` for large responses.
- **Model availability**: Not all models support all features (function calling, structured outputs, reasoning). Check model details page.
- **Sandbox persistence**: Paused sandboxes retain state but consume storage. Resume or kill when done.
- **GPU spot instances**: Can be reclaimed with 1-hour notice. Use on-demand or monthly subscription for critical workloads.
- **Batch expiration**: Batches expire after 24 hours. Incomplete requests are canceled; you only pay for completed ones.
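The rate-limit gotcha above is usually handled with exponential backoff plus jitter; a generic sketch, not a Novita-specific API:

```python
import random
import time

def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Yield exponentially growing delays (base, 2*base, 4*base, ...),
    capped at `cap` and randomized to avoid synchronized retries."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)

def call_with_retry(fn, max_retries: int = 5):
    """Retry `fn` with backoff; narrow the except clause to your
    client's rate-limit error (e.g. HTTP 429) in real code."""
    last_exc = None
    for delay in backoff_delays(max_retries):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```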

## Verification checklist

Before submitting work:
- [ ] API key is valid and has sufficient balance
- [ ] Correct base URL used (`https://api.novita.ai/openai` for LLM, `https://api.novita.ai/v3` for media)
- [ ] Authorization header includes `Bearer` prefix
- [ ] Model name matches available models (check `/v1/models` endpoint)
- [ ] For async tasks: polling logic handles all status values (`TASK_STATUS_QUEUED`, `TASK_STATUS_PROCESSING`, `TASK_STATUS_SUCCEED`, `TASK_STATUS_FAILED`)
- [ ] Batch JSONL is valid (one request per line, same model per batch)
- [ ] Rate limit handling includes exponential backoff
- [ ] Error responses checked for `BILLING_BALANCE_NOT_ENOUGH` or `RATE_LIMIT_EXCEEDED`
- [ ] Sandbox code tested locally before deployment
- [ ] GPU instance has sufficient disk/memory for workload
- [ ] Webhook endpoint (if used) is publicly accessible and returns 200 OK
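The last checklist item can be verified with a minimal standard-library receiver (a sketch; the event payload shape is an assumption, so consult the docs for the real `ASYNC_TASK_RESULT` schema):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        # Handle ASYNC_TASK_RESULT events here (e.g. persist `event`).
        self.send_response(200)  # acknowledge with 200 OK
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep test/demo output quiet

# HTTPServer(("", 8080), WebhookHandler).serve_forever()  # uncomment to run
```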

## Resources

- **Full navigation**: https://novita.ai/docs/llms.txt
- **LLM API guide**: https://novita.ai/docs/guides/llm-api
- **Batch API guide**: https://novita.ai/docs/guides/llm-batch-api
- **Agent Sandbox overview**: https://novita.ai/docs/guides/sandbox-overview
- **GPU instances**: https://novita.ai/docs/guides/gpu-instance-overview
- **Error codes**: https://novita.ai/docs/api-reference/basic-error-code
- **Rate limits**: https://novita.ai/docs/guides/llm-rate-limits

---

> For additional documentation and navigation, see: https://novita.ai/docs/llms.txt