---
name: Novitaai
description: Use when building AI applications with LLM APIs, generating images/videos, running code in isolated sandboxes, or deploying GPU instances. Reach for this skill when integrating OpenAI-compatible language models, creating multimodal content, executing AI-generated code safely, or managing GPU infrastructure.
metadata:
    mintlify-proj: novitaai
    version: "1.0"
---

# Novita AI Skill

## Product Summary

Novita AI is a cloud inference platform providing APIs for large language models (LLMs), image/video generation, audio processing, and isolated sandbox environments for AI agents. It offers OpenAI-compatible LLM endpoints at `https://api.novita.ai/openai`, multimodal generation APIs, GPU instance management, and Agent Sandbox for secure code execution. Authentication uses Bearer tokens with API keys from the Key Management dashboard. The platform supports streaming and non-streaming requests, batch processing, and integrates with popular frameworks like LangChain, Dify, and Claude Code.

## When to Use

Use Novita AI when:

- **Building LLM applications**: You need OpenAI-compatible chat/completion endpoints with 10,000+ open-source models (DeepSeek, Qwen, Llama, etc.)
- **Generating images or videos**: You need text-to-image, image-to-image, text-to-video, or image-to-video generation with models like FLUX, Kling, or Vidu
- **Processing audio**: You need text-to-speech, speech recognition, or voice cloning with MiniMax or GLM models
- **Running AI-generated code safely**: You need isolated sandbox environments for code execution with <200ms startup time
- **Managing GPU infrastructure**: You need dedicated GPU instances or serverless GPU endpoints for custom model deployment
- **Batch processing**: You need to process large volumes of LLM requests asynchronously
- **Integrating with agent frameworks**: You're building agents with Claude Code, Dify, LangChain, or other frameworks that support Novita

## Quick Reference

### API Base URLs

| Purpose | URL |
|---------|-----|
| Model APIs (general) | `https://api.novita.ai` |
| LLM (OpenAI-compatible) | `https://api.novita.ai/openai` |

### Authentication

All requests require a Bearer token in the `Authorization` header:
```
Authorization: Bearer {{API_KEY}}
```

Get API keys from: https://novita.ai/settings/key-management

### LLM API Endpoints

| Endpoint | Purpose |
|----------|---------|
| `/v1/chat/completions` | Chat-based LLM requests (streaming/non-streaming) |
| `/v1/completions` | Text completion requests |
| `/v1/embeddings` | Generate embeddings |
| `/v1/models` | List available models |
| `/v1/batches` | Submit batch jobs |

### Image/Video/Audio API Endpoints

| Category | Endpoints |
|----------|-----------|
| Image Generation | `/txt2img`, `/img2img`, `/reimagine` |
| Image Editing | `/upscale`, `/remove-background`, `/inpainting` |
| Video Generation | `/txt2video`, `/img2video` |
| Audio | `/txt2speech`, `/asr`, `/voice-cloning` |
| Task Status | `/task-result` (for async operations) |

### SDK Installation

```bash
# Python - LLM
pip install openai

# Python - Image/Video/Audio
pip install novita-sdk

# JavaScript/TypeScript - LLM
npm install openai

# JavaScript/TypeScript - Image/Video/Audio
npm install novita-sdk

# Agent Sandbox - Python
pip install novita-sandbox

# Agent Sandbox - JavaScript/TypeScript
npm install novita-sandbox

# Agent Sandbox - CLI
npm install -g novita-sandbox-cli
```

### Recommended Models by Use Case

| Use Case | Recommended Models |
|----------|-------------------|
| Code generation & reasoning | `deepseek/deepseek-r1-0528`, `deepseek/deepseek-v3-0324`, `qwen/qwen3-coder-480b` |
| General reasoning | `deepseek/deepseek-r1`, `qwen/qwen-2.5-72b-instruct`, `meta-llama/llama-3.3-70b-instruct` |
| Function calling & tools | Qwen 3 family models |
| Long context | `meta-llama/llama-4-maverick-17b-128e-instruct-fp8` |
| Vision & documents | `qwen/qwen2.5-vl-72b-instruct`, `meta-llama/llama-4-scout-17b-16e-instruct` |
| Low-latency extraction | `meta-llama/llama-3.1-8b-instruct`, `meta-llama/llama-3.2-3b-instruct` |

## Decision Guidance

### When to Use Chat Completion vs Completion

| Scenario | Use Chat Completion | Use Completion |
|----------|-------------------|-----------------|
| Multi-turn conversations | ✓ | |
| System prompts & role-playing | ✓ | |
| Simple text generation | | ✓ |
| Legacy prompt format | | ✓ |
| Function calling | ✓ | |

### When to Use Streaming vs Non-Streaming

| Scenario | Use Streaming | Use Non-Streaming |
|----------|---------------|-------------------|
| Long outputs (>1000 tokens) | ✓ | |
| Real-time user feedback needed | ✓ | |
| Timeout risk on slow networks | ✓ | |
| Simple, short responses | | ✓ |
| Batch processing | | ✓ |
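
For the streaming cases above, the response arrives as a sequence of chunks that the caller has to stitch together. The sketch below assumes the chunk shape used by the `openai` Python client (`chunk.choices[0].delta.content`); `collect_stream` and `stream_chat` are hypothetical helper names.

```python
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Concatenate the text deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks (e.g. usage-only) carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks may have no content
            parts.append(delta)
    return "".join(parts)


def stream_chat(prompt: str, model: str, api_key: str) -> str:
    """Request a streamed completion and return the full text."""
    # Imported lazily so collect_stream stays usable without the openai package.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.novita.ai/openai", api_key=api_key)
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(stream)
```

In an interactive application you would typically print or forward each delta as it arrives rather than joining them at the end.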

### When to Use GPU Instance vs Serverless GPU

| Scenario | GPU Instance | Serverless GPU |
|----------|-------------|-----------------|
| Long-running, predictable workloads | ✓ | |
| Full control over environment | ✓ | |
| Burstable, sporadic tasks | | ✓ |
| Pay-per-second usage | | ✓ |
| Custom model deployment | ✓ | ✓ |

### When to Use Agent Sandbox vs GPU Instance

| Scenario | Agent Sandbox | GPU Instance |
|----------|---------------|-------------|
| Running AI-generated code | ✓ | |
| Code agents & computer use | ✓ | |
| Data analysis & visualization | ✓ | |
| Training custom models | | ✓ |
| Long-running inference | | ✓ |

## Workflow

### 1. Set Up Account & Authentication

1. Log in to https://novita.ai (Google, GitHub, or email)
2. Navigate to https://novita.ai/settings/key-management
3. Create an API key and save it securely
4. Set environment variable: `export NOVITA_API_KEY=sk_...`
5. Verify balance at https://novita.ai/billing (add credit if needed)
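
Once the variable is exported, application code can read it at startup and fail fast if it is missing. A minimal sketch, where `load_api_key` is a hypothetical helper name:

```python
import os


def load_api_key(env_var: str = "NOVITA_API_KEY") -> str:
    """Return the API key from the environment, failing fast if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; export it before calling the API")
    return api_key
```

Failing at startup gives a clearer error than a 401 from the first request.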

### 2. Call LLM API (Chat Completion)

1. Choose a model from the recommended list or browse https://novita.ai/models
2. Initialize OpenAI client with Novita base URL:
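   ```python
   import os

   from openai import OpenAI

   # Read the key exported in step 1 instead of hardcoding it in source.
   client = OpenAI(
       base_url="https://api.novita.ai/openai",
       api_key=os.environ["NOVITA_API_KEY"],
   )
   ```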
3. Create chat completion request:
   ```python
   response = client.chat.completions.create(
       model="deepseek/deepseek-v3",
       messages=[{"role": "user", "content": "Your prompt"}],
       stream=False,
       max_tokens=512
   )
   ```
4. Handle response: `response.choices[0].message.content`
5. Monitor usage in console dashboard

### 3. Generate Images/Videos

1. Choose model from API reference (e.g., FLUX, Kling, Vidu)
2. Submit async request with prompt/image
3. Receive `task_id` in response
4. Poll `/task-result` endpoint with `task_id` until status is `completed`
5. Download result from returned URL
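
The polling step can be sketched as a loop with a deadline. The `completed` status value comes from the steps above; the flat `status` key and the `failed` value are assumptions, so check them against the actual response schema. `fetch_result` is a hypothetical callable that you supply to call `/task-result` for a given `task_id`.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    fetch_result: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reports `completed`, or raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_result(task_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":  # assumed failure value, not confirmed by the docs
            raise RuntimeError(f"task {task_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout_s}s")
        sleep(interval_s)
```

Injecting `fetch_result` and `sleep` keeps the loop independent of any HTTP client and easy to test.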

### 4. Run Code in Agent Sandbox

1. Install SDK: `pip install novita-sandbox`
2. Create `.env` with `NOVITA_API_KEY=sk_...`
3. Initialize sandbox:
   ```python
   from novita_sandbox.code_interpreter import Sandbox
   sandbox = Sandbox.create()
   ```
4. Execute code:
   ```python
   execution = sandbox.run_code("print('hello')")
   print(execution.logs)
   ```
5. Clean up: `sandbox.kill()`
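
If the executed code raises, the clean-up step is easy to skip by accident. One way to guard against that is a `try`/`finally` wrapper, sketched below. It relies only on the `run_code` and `kill` methods shown above; `run_in_sandbox` is a hypothetical helper name.

```python
from typing import Any, Callable


def run_in_sandbox(code: str, create_sandbox: Callable[[], Any]) -> Any:
    """Run `code` in a fresh sandbox and always kill the sandbox afterwards."""
    sandbox = create_sandbox()
    try:
        return sandbox.run_code(code)
    finally:
        sandbox.kill()  # runs on success and on error, so no sandbox is leaked
```

With the SDK installed, this would be called as `run_in_sandbox("print('hello')", Sandbox.create)`.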

### 5. Deploy GPU Instance

1. Visit https://novita.ai/console (GPU Instances section)
2. Click "Create Instance"
3. Select GPU type, CPU, RAM, storage
4. Choose template or custom image
5. Configure networking (VPC, ports)
6. Start instance and connect via SSH/JupyterLab
7. Monitor metrics in console

## Common Gotchas

- **Missing API key**: Ensure the `Authorization: Bearer` header is present and the key is valid. A 401 response means the key is missing or incorrect.
- **Rate limits (429)**: Check whether you are hitting the TPM (tokens per minute) or RPM (requests per minute) limit. Retry with exponential backoff.
- **Async image/video tasks**: Don't assume immediate results. Always poll `/task-result` with `task_id`; results are asynchronous.
- **Timeouts (503/504)**: Non-streaming requests can time out on long outputs or slow networks. Set `stream=True` for long outputs.
- **Model not found (400)**: Verify model name matches exactly (e.g., `deepseek/deepseek-v3`, not `deepseek-v3`). Check https://novita.ai/models for correct names.
- **Insufficient permissions (403)**: Some models require identity verification. Log in to console and complete verification if needed.
- **Sandbox startup delay**: First sandbox creation takes ~200ms; subsequent creations are faster. Don't assume instant execution.
- **GPU instance billing**: Instances charge per hour even when stopped. Use auto-shutdown or delete when not needed.
- **Batch API file format**: Batch input files must be JSONL (one JSON object per line), not JSON arrays.
- **Function calling not supported**: Not all models support function calling. Check model details or use Qwen 3 family for guaranteed support.
- **Deprecated endpoints**: V2 image APIs (`/v2/txt2img`) are deprecated. Use V3 endpoints (`/txt2img`).
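
The retry advice for 429 and 503/504 responses can be sketched as a small wrapper. `ApiError` is a hypothetical exception type standing in for whatever your HTTP client raises; the delays and attempt count are illustrative defaults, not documented limits.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_STATUS = (429, 503, 504)


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status code of a failed call."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on 429/503/504 with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except ApiError as error:
            last_attempt = attempt == max_attempts - 1
            if error.status_code not in RETRYABLE_STATUS or last_attempt:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay_s.
            delay = min(max_delay_s, base_delay_s * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
    raise RuntimeError("unreachable: the loop either returns or raises")
```

Errors such as 400, 401, and 403 are raised immediately, since retrying does not fix a bad model name, key, or permission.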

## Verification Checklist

Before submitting work with Novita AI:

- [ ] API key is set and valid (test with `/v1/models` list endpoint)
- [ ] Account has sufficient credit balance (check https://novita.ai/billing)
- [ ] Model name matches exactly (verify on https://novita.ai/models)
- [ ] Request headers include `Authorization: Bearer {{API_KEY}}`
- [ ] For async tasks (images/videos), polling logic handles `task_id` correctly
- [ ] Streaming is enabled for long outputs (>1000 tokens)
- [ ] Error handling includes retry logic with exponential backoff for 429/503/504
- [ ] Sandbox code is tested locally before deployment
- [ ] GPU instances are stopped/deleted when not in use to avoid unexpected charges
- [ ] Batch files are JSONL format (one JSON per line)
- [ ] Function calling is only used with supported models (Qwen 3, etc.)
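
For the JSONL item, a batch file can be produced by serializing each request on its own line. The per-line fields below (`custom_id`, `method`, `url`, `body`) follow the OpenAI Batch request shape, which is assumed rather than confirmed to apply here; both helper names are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List


def build_batch_line(custom_id: str, model: str, prompt: str) -> Dict[str, Any]:
    """One batch request in the OpenAI Batch shape (assumed to apply here)."""
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model, "messages": [{"role": "user", "content": prompt}]},
    }


def to_jsonl(requests: Iterable[Dict[str, Any]]) -> str:
    """Serialize requests as JSONL: one JSON object per line, no enclosing array."""
    # json.dumps escapes newlines inside strings, so each object stays on one line.
    lines: List[str] = [json.dumps(request) for request in requests]
    return ("\n".join(lines) + "\n") if lines else ""
```

Writing the returned string to a `.jsonl` file gives the upload format described in the checklist. A common mistake is calling `json.dump` on the whole list, which produces a JSON array instead.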

## Resources

**Comprehensive navigation**: https://novita.ai/docs/llms.txt

**Critical documentation pages**:
- [LLM API Guide](https://novita.ai/docs/guides/llm-api) — OpenAI-compatible endpoints, code examples, parameters
- [Model APIs Introduction](https://novita.ai/docs/api-reference/model-apis-introduction) — Image, video, audio generation overview
- [Agent Sandbox Overview](https://novita.ai/docs/guides/sandbox-overview) — Secure code execution for AI agents
