# Authentication

Source: https://novita.ai/docs/api-reference/basic-authentication

The Novita AI API uses API keys in the request headers to authenticate requests. You can view and manage your API keys on the [settings page](https://novita.ai/settings/key-management?utm_source=getstarted).

```js theme={"system"}
{
  "Authorization": "Bearer {{API Key}}"
}
```

# API Error Codes

Source: https://novita.ai/docs/api-reference/basic-error-code

## Image / Video / Audio
| Error Name | Status Code | Description |
|---|---|---|
| INVALID\_REQUEST\_BODY | 400 | Request parameter validation failed |
| IMAGE\_FILE\_EXCEEDS\_MAX\_SIZE | 400 | Image size exceeds limit |
| INVALID\_IMAGE\_FORMAT | 400 | Image format does not meet requirements |
| IMAGE\_EXCEEDS\_MAX\_RESOLUTION | 400 | Image resolution exceeds limit |
| INVALID\_IMAGE\_SIZE | 400 | Image width or height exceeds limit |
| API\_NOT\_FOUND | 404 | API not found |
| IMAGE\_NO\_FACE\_DETECTED | 400 | No face detected |
| INVALID\_CUSTOM\_OUTPUT\_PATH | 400 | Invalid OSS path |
| ILLEGAL\_PROMPT | 400 | Prompt contains inappropriate content |
| ILLEGAL\_IMAGE\_CONTENT | 400 | Image contains inappropriate content |
| INVALID\_AUDIO\_FILE | 400 | Invalid audio input |
| BILLING\_FAILED | 500 | Billing service error |
| BILLING\_AUTH\_FAILED | 403 | Billing service authentication failed |
| BILLING\_BALANCE\_NOT\_ENOUGH | 400 | Insufficient balance |
| MISSING\_API\_KEY | 400 | API Key not provided |
| INVALID\_API\_KEY | 403 | API Key validation failed |
| FEATURE\_NOT\_ALLOWED | 403 | No permission to upload model |
| API\_NOT\_ALLOWED | 403 | No permission to use this API |
| RATE\_LIMIT\_EXCEEDED | 429 | Rate limit exceeded |
| NEED\_REAL\_NAME\_VERIFY | 403 | Enterprise verification not completed |
| CREATE\_TASK\_FAILED | 500 | Failed to create task |
| TASK\_NOT\_FOUND | 404 | Task not found |
| GET\_RESULT\_FAILED | 500 | Failed to get task result |
| TASK\_FAILED | 500 | Task execution failed |

## LLM

| Error Name | Status Code | Description |
|---|---|---|
| INVALID\_API\_KEY | 403 | API Key not provided |
| MODEL\_NOT\_FOUND | 404 | Model not found |
| FAILED\_TO\_AUTH | 401 | Authentication failed |
| NOT\_ENOUGH\_BALANCE | 403 | Insufficient balance |
| INVALID\_REQUEST\_BODY | 400 | Request body format error, see message for details |
| RATE\_LIMIT\_EXCEEDED | 429 | Too many requests, please try again later |
| TOKEN\_LIMIT\_EXCEEDED | 429 | Token limit exceeded, please try again later |
| SERVICE\_NOT\_AVAILABLE | 503 | Service unavailable |
| ACCESS\_DENY | 403 | Access denied |

## GPU Instance

| Error Name | Status Code | Description |
|---|---|---|
| UNKNOWN | 500 | Unknown error |
| GET\_TOKEN\_FAILED | 400 | Failed to obtain token |
| FORBIDDEN | 403 | Access forbidden / No permission |
| UNAUTHORIZED | 401 | Unauthorized |
| USER\_ALREADY\_EXISTS | 400 | User already exists |
| INVALID\_USER\_OR\_PASSWORD | 400 | Invalid username or password |
| INVALID\_CODE | 400 | Invalid verification code |
| USER\_NOT\_FOUND | 400 | User not found |
| USER\_PHONE\_NOT\_CONSIST | 400 | User phone number mismatch |
| SEND\_CODE\_TOO\_FAST | 429 | Verification code sent too frequently |
| INVALID\_PUBLIC\_KEY | 400 | Invalid public key |
| USER\_NOT\_ACTIVATED | 400 | User not activated |
| USER\_ALREADY\_ACTIVATED | 400 | User already activated |
| INVALID\_USER\_TOKEN | 400 | Invalid user token |
| BANNED\_USER | 400 | Account has been banned |
| RATE\_LIMIT\_EXCEEDED | 429 | Request rate limit exceeded |
| RESOURCE\_NOT\_FOUND | 400 | Resource not found (e.g., container not found) |
| CONFLICT | 400 | Conflict (e.g., container conflict) |
| VALIDATOR\_PARAM | 400 | Parameter validation failed / Invalid parameter |
| REQUEST | 400 | Request error |
| OPERATION\_LIMIT | 400 | Operation limit reached |
| INSUFFICIENT\_RESOURCE | 400 | Insufficient resources |
| CLUSTER\_STATUS | 400 | Cluster status abnormal |
| NODE\_STATUS | 400 | Node status abnormal |
| DEPENDENT\_RESOURCE\_STATE | 400 | Dependent resource state abnormal |
| PREPAID\_INSTANCE\_NOT\_SUPPORT\_RELEASE | 400 | Prepaid instance does not support release |
| CREATING\_INSTANCE\_NOT\_SUPPORT\_RENEWAL | 400 | Instance in creation does not support renewal |
| INSTANCE\_LOCAL\_STORAGE\_NOT\_FOUND | 400 | Instance local storage not found |
| INVALID\_COMMAND\_PARAM | 400 | Invalid instance startup command parameter |
| GPU\_SPEC\_USED | 400 | GPU specification already in use |
| INCORRECT\_USER\_SYNCER\_REQUIRE\_PARAMS | 400 | Incorrect user syncer request parameters |
| MIGRATE\_INSUFFICIENT\_RESOURCE | 400 | Insufficient resources for migration |
| WALLET\_NOT\_FOUND | 500 | Wallet not found |
| WALLET\_UNSUPPORT\_RECHARGE\_METHOD | 400 | Unsupported wallet recharge method |
| BALANCE\_NOT\_ENOUGH | 400 | Insufficient balance |
| UNSUPPORTED\_BILLING\_MODE | 400 | Unsupported billing mode |
| EXPIRED\_OR\_BALANCE\_NOT\_ENOUGH | 400 | Expired or insufficient balance |
| ORDER\_NOT\_FOUND | 400 | Order not found |
| SAVING\_PLAN\_ALREADY\_EXISTS | 400 | Saving plan already exists |
| CREATE\_INSTANCE\_LIMIT | 400 | Instance creation limit reached, please recharge or delete other instances |
| NETWORK\_STORAGE\_TOO\_LARGE | 400 | Network storage size exceeds limit |
| CUR\_CLUSTER\_NETWORK\_STORAGE\_NOT\_SUPPORT | 400 | Network storage not supported in current region |
| NETWORK\_STORAGE\_IN\_USE | 400 | Network storage is in use |
| NETWORK\_STORAGE\_UNAVAILABLE | 400 | Network storage unavailable |
| NETWORK\_STORAGE\_NOT\_FOUND | 400 | Network storage not found |
| IMAGE\_NOT\_FOUND | 400 | Image not found |
| IMAGE\_AUTH\_IN\_USE | 400 | Image authentication in use |
| NETWORK\_NOT\_FOUND | 400 | Instance network not found |
| NETWORK\_IN\_USE | 400 | Instance network in use |
| NETWORK\_MAX\_LIMIT | 400 | Instance network creation limit exceeded |
| SEND\_MSG\_ERROR | 400 | Message sending error |
| JOB\_NOT\_FOUND | 400 | Instance job not found |
| SERVERLESS\_ENDPOINT\_NOT\_FOUND | 400 | Serverless endpoint not found |
| SERVERLESS\_WORKER\_NOT\_FOUND | 400 | Serverless worker not found |
| SERVERLESS\_PRODUCT\_NOT\_FOUND | 400 | Serverless product not found |
| SERVERLESS\_APP\_NAME\_IS\_EXIST | 400 | Serverless application name already exists |
| TEMPLATE\_IS\_PRIVATE | 400 | Template is private |
| TEMPLATE\_NOT\_FOUND | 400 | Template not found |

## Billing

| Error Name | Status Code | Description |
|---|---|---|
| UNKNOWN | 500 | Unknown error, please contact us |
| LIST\_BILL\_TOO\_FAST | 429 | Requests are too frequent, please try again later |
| INVALID\_PRODUCT\_CATEGORY | 400 | Invalid productCategory parameter |
| INVALID\_BILL\_CYCLE | 400 | Invalid cycle parameter |
| LIST\_BILL\_ERROR | 500 | Query error, please contact us |
# LoRA for Style Training
Source: https://novita.ai/docs/api-reference/model-apis-create-style-training
**You can train a LoRA model to generate images that emulate a specific artistic style.**
## Create Style Training Task
`POST https://api.novita.ai/v3/training/style`
**Use this API to start a style training task.**
> This is an **asynchronous API**; only the **task\_id** is returned initially. Use this **task\_id** to query the **Task Result API** at [Get Style Training Result API](#get-style-training-result) to retrieve the training results.
### Request Headers
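Like every endpoint in this reference, the request must carry a JSON `Content-Type` and a Bearer `Authorization` header. Below is a minimal sketch of creating a style training task; the JSON body is left empty because the training parameters are not reproduced here, and the subject training endpoint in the next section follows the same pattern:

```bash theme={"system"}
curl --location --request POST 'https://api.novita.ai/v3/training/style' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {{API Key}}' \
  --data-raw '{}'
```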
# LoRA for Subject Training
Source: https://novita.ai/docs/api-reference/model-apis-create-subject-training
**You can train a LoRA model to generate images featuring a subject, such as yourself.**
## Create Subject Training Task
`POST https://api.novita.ai/v3/training/subject`
**Use this API to start a subject training task.**
> This is an **asynchronous API**; only the **task\_id** is returned initially. Use this **task\_id** to query the **Task Result API** at [Get Subject Training Result API](#get-subject-training-result) to retrieve the training results.
### Request Headers
# FLUX.1 Kontext Dev
Source: https://novita.ai/docs/api-reference/model-apis-flux-1-kontext-dev
POST https://api.novita.ai/v3/async/flux-1-kontext-dev
FLUX.1 Kontext Dev is a model with greatly improved prompt adherence and typography generation, delivering premium consistency for editing without compromising on speed.
## Request Headers
`Request:`
```js theme={"system"}
{
"prompt": "a man standing on a rock near the ocean, Alejandro Iñárritu, Nadav Kander, Ignacio Fernández Ríos, Ignacio Fernández Ríos, Ignacio Ríos, Navid Negahban, Reza Afshar, Steven Klein, Ignacio Fernández Ríos, Lorenzo Lanfranconi, Peter Palombi, Alberto Mielgo"
}
```
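Since `/v3/async/flux-1-kontext-dev` is an asynchronous endpoint like the others in this reference, a request presumably follows the same curl pattern. This is a minimal sketch; only the `prompt` field shown above is assumed, and the full parameter set is documented in the parameter reference:

```bash theme={"system"}
curl --location --request POST 'https://api.novita.ai/v3/async/flux-1-kontext-dev' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {{API Key}}' \
  --data-raw '{
    "prompt": "a man standing on a rock near the ocean"
  }'
```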
# Image to Video
Source: https://novita.ai/docs/api-reference/model-apis-img2video
POST https://api.novita.ai/v3/async/img2video
**This API seamlessly transforms an image into a cohesive video. It is designed to create smooth transitions and animations from static images, making it ideal for producing dynamic visual content for presentations, social media, and marketing campaigns.**
> This is an **asynchronous** API; only the **task\_id** will be returned. You should use the **task\_id** to request the [**Task Result API**](/api-reference/model-apis-task-result) to retrieve the video generation results.
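Once you have the **task\_id**, you can poll for the generated video. A minimal sketch, assuming the Task Result API linked above follows the `GET /v3/async/task-result` pattern:

```bash theme={"system"}
curl --location --request GET 'https://api.novita.ai/v3/async/task-result?task_id={{task_id}}' \
  --header 'Authorization: Bearer {{API Key}}'
```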
## Request Headers
### I already have mask images. How do I convert `mask` images to base64?
You can use the following code to convert mask images to base64.
```python theme={"system"}
import base64

# Path to the mask image
filename_input = "mask_edited.png"

# Read the mask file and encode it as base64
with open(filename_input, "rb") as f:
    base64_pic = base64.b64encode(f.read()).decode("utf-8")

# Write the base64 string to a text file
with open("input.txt", "w") as f:
    f.write(base64_pic)
```
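Alternatively, on a Linux shell with GNU coreutils, the same conversion is a one-liner (`-w 0` disables line wrapping; on macOS use `base64 -i` instead):

```bash theme={"system"}
base64 -w 0 mask_edited.png > input.txt
```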
### Start requesting inpainting.
Please set the **`Content-Type`** header to **`application/json`** in your HTTP request to indicate that you are sending JSON data. Currently, **only JSON format is supported**.
The `"model_name": "realisticVisionV40_v40VAE-inpainting_81543.safetensors"` field in the request body specifies an inpainting model. Available inpainting models can be listed via the `/v3/model` API by filtering for `sd_name` values matching `%inpainting%`.
`Request:`
```bash theme={"system"}
curl --location --request POST 'https://api.novita.ai/v3/async/inpainting' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API Key}}' \
--data-raw '{
"extra": {
"response_image_type": "jpeg"
},
"request": {
"model_name": "realisticVisionV40_v40VAE-inpainting_81543.safetensors",
"prompt": "Leonardo DiCaprio",
"negative_prompt": "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime), text, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, BadDream, UnrealisticDream",
"image_num": 1,
"steps": 25,
"seed": -1,
"clip_skip": 1,
"guidance_scale": 7.5,
"sampler_name": "Euler a",
"mask_blur": 1,
"inpainting_full_res": 1,
"inpainting_full_res_padding": 32,
"inpainting_mask_invert": 0,
"initial_noise_multiplier": 1,
"strength": 0.85,
"image_base64": "{{base64 encoded image}}",
"mask_image_base64": "{{base64 encoded mask image}}"
}
}'
```
`Response:`
```js theme={"system"}
{
"code": 0,
"msg": "",
"data": {
"task_id": "270f4fba-2cb0-4a56-8b82-xxxx"
}
}
```
# Introduction
Source: https://novita.ai/docs/api-reference/model-apis-introduction
For the LLM API, Novita AI provides compatibility with the OpenAI API standard, allowing you to use official OpenAI SDKs like openai-python, openai-node, and others to interact with our API. This means you can easily migrate your existing OpenAI-based applications to Novita AI with minimal code changes. Additionally, you can develop your own custom SDKs by following our [API reference](/api-reference/model-apis-llm-create-chat-completion), which details all available endpoints, parameters, and response formats. Our API supports key features like chat completions, completions, model listings, and model information retrieval, making it a flexible solution for your LLM needs.
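For example, because the API follows the OpenAI standard, a chat completion can be requested with plain curl. This is a minimal sketch: the path is derived from the OpenAI-compatible base URL `https://api.novita.ai/openai` used in the SDK examples in this documentation, and the model name is one of the models referenced elsewhere in these docs:

```bash theme={"system"}
curl --location --request POST 'https://api.novita.ai/openai/chat/completions' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {{API Key}}' \
  --data-raw '{
    "model": "deepseek/deepseek-v3-0324",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```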
The `content` field of a chat message supports two options:

Option 1: A string containing the text contents of the message.

Option 2: An array of content parts (`object[]`). Image content parts can only be used with vision language models, video content parts only with models that support video, and audio content parts only with models that support audio. Output modality parameters are likewise specified as an array of content parts (`object[]`).
# Minimax Hailuo-02
Source: https://novita.ai/docs/api-reference/model-apis-minimax-hailuo-02
POST https://api.novita.ai/v3/async/minimax-hailuo-02
Minimax Hailuo-02 (also known as Hailuo) is an AI video generation model that supports both text-to-video and image-to-video generation. It can generate 6-second videos at 768p or 1080p resolution, and 10-second videos at 768p resolution.
# Reimagine
Source: https://novita.ai/docs/api-reference/model-apis-reimagine
POST https://api.novita.ai/v3/reimagine
**This feature allows you to automatically generate variations of a single image.**
## Request Headers
# Remove Background
Source: https://novita.ai/docs/api-reference/model-apis-remove-background
POST https://api.novita.ai/v3/remove-background
**Automatically remove the background from images.**
## Request Headers
# Remove Text
Source: https://novita.ai/docs/api-reference/model-apis-remove-text
POST https://api.novita.ai/v3/remove-text
**Automatically remove text from images.**
## Request Headers
# Replace Background
Source: https://novita.ai/docs/api-reference/model-apis-replace-background
POST https://api.novita.ai/v3/replace-background
**Replace the background of images according to your prompt.**
## Request Headers
# Seedance V1 Lite Image to Video
Source: https://novita.ai/docs/api-reference/model-apis-seedance-v1-lite-i2v
POST https://api.novita.ai/v3/async/seedance-v1-lite-i2v
Seedance V1 Lite is an AI video model designed for coherent multi-shot video generation, offering smooth motion and precise adherence to detailed prompts. It supports resolutions of 480p, 720p, and 1080p.
# Upscale V2
Source: https://novita.ai/docs/api-reference/model-apis-upscale-v2-deprecated
POST https://api.novita.ai/v2/upscale
* **A Working Knowledge Base** (Optional): If you plan to use your own documents for knowledge augmentation, prepare them for upload.
## How to install AnythingLLM on Linux locally?
For Linux users, you can install AnythingLLM by running this command in the terminal:
```bash theme={"system"}
curl -fsSL https://s3.us-west-1.amazonaws.com/public.useanything.com/latest/installer.sh | sh
```
This will download the latest version of AnythingLLM’s AppImage, unpack it, and create a symlink to seamlessly run AnythingLLM. The script unpacks the app in `$HOME/AnythingLLMDesktop`.
You can start the app at any time by running `./AnythingLLMDesktop/start`. This will boot the app with full logging.
## Integration Steps
### 1. Connect Novita AI to AnythingLLM
To connect Novita AI’s models with **AnythingLLM**, follow these steps:
* In the application, click on the **🔧 Settings** icon located in the lower-left corner.
* Go to the **LLM API Providers** section, **and then** select **Novita AI** from the dropdown menu.
* In the **Novita API Key** field, **paste your Novita AI API Key** (the one you generated earlier).
* Click **Save** to complete the integration.
Now, AnythingLLM will have access to all Novita AI’s models, allowing you to use them in your applications.
### 2. Enable Web Search
AnythingLLM allows you to search the web in real-time by enabling web search capabilities. Follow these steps to set up web search functionality:
* **Navigate to Agent Skills**: In the settings or main interface, locate the **Agent Skills** section.
* **Enable Scrape Websites and Web Search**: Toggle the option to turn on both **Scrape Websites** and **Web Search** capabilities.
* **Choose Web Search Providers**: You can choose from the following recommended search providers:
* **DuckDuckGo**: A free and privacy-focused web search using DuckDuckGo's HTML interface.
* **Google Search Engine**: Powered by a custom Google Search Engine. It’s free for up to 100 queries per day.
* **Bing Search**: Powered by the Bing Search API. It’s free for up to 1000 queries per month.
### 3. Create a Knowledge Base
Next, you’ll want to build a knowledge base for your assistant to use. Follow these steps:
* Click the Upload icon of the Workspace.
* Upload local documents or website links.
* Select the file(s) or webpage you wish to upload to the workspace.
* Once the file is selected, click **Move to Workspace** to transfer the document(s) to the AnythingLLM workspace.
* Save your files and click **Save and Embed** to finalize the process.
* Your knowledge base is now set up and ready for use.
### 4. Try Asking Questions
In AnythingLLM, ask questions using **@agent**.
The model will respond based on your documents and search results, with citations or references from the files.
# Automatic Top-Up
Source: https://novita.ai/docs/guides/auto-top-up
When you enable **"Automatic Top-Up"**, your account will automatically add credit if the balance falls below a specified threshold. Please follow the steps below to set up this feature:
### 1. Add a payment method.
Go to Payment Methods in the console.
If you don't have a payment method, click **"Add payment method"**. You'll be redirected to Stripe's secure page to add one, and we do not store any payment information.
### 2. Enable Automatic Top-Up and configure settings.
Click **"Modify"** in the **"Automatic Payments"** panel to configure your Automatic Top-Up settings.
After enabling the **"Enable Automatic Top-Up"** toggle, set valid values for **"When credit goes below"** and **"Bring credit back up to"**, then click **"Save settings"** to complete the process.
# Run Axolotl on Novita AI
Source: https://novita.ai/docs/guides/axolotl
Discover how to fine-tune large language models (LLMs) effortlessly with Axolotl on Novita AI.
Axolotl offers a robust, flexible framework for training LLMs using advanced techniques, supporting various model architectures and training strategies. Ideal for researchers and developers, Axolotl combined with Novita AI’s powerful, hardware-free infrastructure streamlines workflows, removing local hardware constraints.
This guide provides a step-by-step process to deploy and run Axolotl on Novita AI, unlocking the full potential of your AI model training projects.
## How to Use Axolotl:main-latest on Novita AI
Step 1: Access [**the GPU Instance Console**](https://novita.ai/gpus)
* Click `Get Started` to access the GPU Instance console.
Step 2: Choose a Template and GPU Type
* Browse various official templates and GPU card options.
* Select [**the Axolotl:main-latest template**](https://novita.ai/gpus-console?templateId=311).
* Click `Deploy` under the 4090 GPU card to proceed to the instance creation page.
Step 3: Adjust Disk and Configuration Parameters
* In the `Disk` section, adjust the size of the system disk and local disk.
* In the `Configuration` section, modify settings such as the image, startup commands, ports, and environment variables.
* Check the box for Start Jupyter Notebook to launch Jupyter.
Step 4: Confirm Configuration and Deploy
* Review the instance configuration and costs on the confirmation page.
* Click `Deploy` to start the deployment process.
Step 5: Wait for Deployment to Complete
* Wait for the instance to finish deploying.
Step 6: Manage and Monitor Instances
* Once deployment is complete, the system will redirect you to the `Instance Management` page.
* Locate your newly created instance, which will initially show a Pulling status (indicating the image is being downloaded).
* Click the small arrow on the right side of the instance to view details.
* Monitor the image pull progress. Once complete, the instance will transition to Running status.
* Click `Logs` to view deployment logs.
Step 7: Check Instance Logs
* Go to the `Instance Logs` tab to check if the service is starting.
* Wait for the service to finish initializing.
Step 8: Connect to Jupyter Lab
* Close the logs page.
* Click `Connect` to open the connection information page.
* Locate the `Connection Options` section and click `Connect to Jupyter Lab` to access the Jupyter interface.
Step 9: Access Jupyter Lab
* Wait for the Jupyter Lab web interface to load.
* Open `Terminal` to run an official example and verify the service is working correctly.
Step 10: Run a Fine-Tuning Example
* Execute the official example code to perform a fine-tuning task.
```bash theme={"system"}
# Fetch axolotl examples
axolotl fetch examples
# Or, specify a custom path
axolotl fetch examples --dest path/to/folder
# Train a model using LoRA
axolotl train examples/llama-3/lora-1b.yml
```
**Note:** You can't change the default mount path for the network volume in the console. It can only be set when creating an instance or via OpenAPI. Set your desired mount path during instance creation when attaching a volume.
# Browser Use
Source: https://novita.ai/docs/guides/browseruse
Easily enhance your browsing experience by integrating Novita AI with Browser Use for intelligent web interactions.
Browser Use is an open-source library that empowers LLMs to directly control web browsers, revolutionizing web interaction with advanced automation. By integrating Novita AI's powerful LLMs and tools, Browser Use enables seamless browsing, content generation, and task automation for an optimized user experience.
This tutorial will show you how to integrate the Novita AI API with Browser Use to automate browser interactions.
## **How to Use Browser Use with Novita AI**
### **Prerequisites**
* Python 3.11 or higher
* A Novita AI API key
### **Installation**
Step 1: Install Browser Use using pip:
```bash theme={"system"}
pip install browser-use
```
Step 2: Install Playwright (required for browser automation):
```bash theme={"system"}
playwright install chromium
```
### **Obtaining Novita AI LLM API Key**
* Create an account: Visit [Novita AI’s website](https://novita.ai/) and sign up for an account.
* Generate your API Key: After logging in, navigate to the [Key Management](https://novita.ai/settings/key-management) page to generate your API key. This key is essential to connect Novita AI’s models to Browser Use.

### **Environment Setup**
Create a `.env` file in your project root and add your Novita API key:
```bash theme={"system"}
NOVITA_API_KEY=your_api_key_here
```
### Basic Implementation
* Here's a complete example of using Browser Use with Novita AI's API:
```python theme={"system"}
"""
Web automation using Novita AI and Browser Use
"""
import asyncio
import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from pydantic import SecretStr

from browser_use import Agent

# Load environment variables
load_dotenv()
api_key = os.getenv('NOVITA_API_KEY', '')
if not api_key:
    raise ValueError('NOVITA_API_KEY is not set')


async def run_search():
    agent = Agent(
        task=(
            '1. Go to https://www.reddit.com/r/LocalLLaMA '
            "2. Search for 'browser use' in the search bar "
            '3. Click on first result '
            '4. Return the first comment'
        ),
        llm=ChatOpenAI(
            base_url='https://api.novita.ai/openai',
            model='deepseek/deepseek-v3-0324',
            api_key=SecretStr(api_key),
        ),
        use_vision=False,
    )
    await agent.run()


if __name__ == '__main__':
    asyncio.run(run_search())
```
### **Creating Your Own Tasks**
* You can customize the `task` parameter to perform a wide variety of web tasks:
```python theme={"system"}
task="Compare the price of gpt-4o and DeepSeek-V3"
```
* For more complex tasks, you might want to enable vision capabilities:
```python theme={"system"}
agent = Agent(
    task="Find and summarize the latest news about AI on TechCrunch",
    llm=ChatOpenAI(
        base_url='https://api.novita.ai/openai',
        model='deepseek/deepseek-v3-0324',
        api_key=SecretStr(api_key),
    ),
    use_vision=True,
)
```
# Budgets
Source: https://novita.ai/docs/guides/budgets
The **Team Member Budgets** feature allows you to set flexible spending limits for each member, helping you effectively control overall costs.
> **Note**: This feature is only available for team accounts. Personal accounts or accounts not part of a team will see the menu but cannot access budget management functionalities. You can convert your personal account into a team account or join an existing team to collaborate with others.
## Permissions
* Only the **Team Owner**, **Admin**, and **Billing Roles** can check and manage team budgets.
* **Developer** and **Basic Roles** can only view their own budget type and budget limit, and cannot make changes.
## Budget Control Modes
For all team members, the system enforces budget execution and reset logic based on each member’s **Budget Type**.
### 1. Budget Type Description
| **Budget Type** | **Description** | **Example** |
| :-------------: | :-------------: | :-------------------------: |
| **Unlimited** | No limit | Unlimited usage |
| **One-time** | One-time budget | Budget frozen once consumed |
### 2. Budget Type Switching
When an administrator changes a member’s budget type, the system applies the following transition rules:
| **Switch Path** | **Processing Logic** |
| :------------------- | :----------------------------------------------------------------------------------------------------------------------- |
| Unlimited → One-time | The new budget will take effect immediately, with current spending reset to zero and tracked under the new budget cycle. |
| One-time → Unlimited | Current quota and restrictions are immediately discarded, and unlimited quota begins. |
## Adjusting Member Budget Limits
Please follow the steps below to adjust a member’s budget:
1. Go to the [**Team Member Budgets**](https://novita.ai/billing/budgets) page.
2. Find the relevant member in the list, or use the search box to quickly locate them. Budgets can be configured for both current team members and invited members who are pending acceptance.
3. Click the "**Edit"** button and select the desired budget type.
* New members are set to **Unlimited** by default, and this can be changed at any time.
* The **One-time** budget type allows you to set a specific budget limit.
4. Click the "**Refresh"** button to get the latest budget and consumption data.
> **Note**: Budget changes take effect immediately. Once a member’s quota is exhausted, they will be unable to initiate new service calls.
## Budget Usage & Service Invocation Rules
* All API Keys created by a member share the same budget pool.
* Before any service is started, the system will automatically check both the wallet balance and the member’s remaining quota. If either is insufficient, the request will be denied.
* Upon reaching the budget limit, all related tasks will be automatically stopped.
# ChatBox
Source: https://novita.ai/docs/guides/chatbox
Integrate Novita AI with Chatbox to streamline your LLM experience across devices with powerful models and a seamless chat interface.
With the Novita AI & Chatbox integration, you gain comprehensive access to Novita AI’s powerful suite of LLMs, including GLM 4.5, Deepseek, Qwen, and more.
This guide will walk you through integrating Novita AI’s LLMs with the ChatBox platform.
## What is ChatBox
ChatBox is an open-source, cross-platform AI chat application designed to interact with a variety of LLMs. It supports a wide range of LLM providers including OpenAI, Claude, Google Gemini, and Novita AI. With features like multi-language support, advanced prompt management, image generation via DALL·E 3, and collaboration tools, Chatbox enables developers and users alike to create and manage AI workflows efficiently.
## Integration Steps
### Step 1 Open Settings
Open the Chatbox app and find the **Settings** at the bottom left.
### Step 2 Locate Novita AI
The Settings page defaults to the Model Providers tab. Scroll through the model list to find Novita AI.
### Step 3 Model Selection
Select Novita AI to view the model selection section, where several default models are available, including models supported by Novita such as deepseek-v3-0324, deepseek-r1-0528, glm 4.5, and more. For the complete list of available models, visit [https://novita.ai/models](https://novita.ai/models).
Click **Fetch** to open the model list. From there, you can add models to the default list or remove existing ones as needed.
### Step 4 Check API Key
Here, you can enter your API key. After entering it, click **Check**. If you see the message **“Connection successful”**, it means you're all set to start using it.
### Step 5 Start Using
Start a new chat. Enter "Hello" and receive a reply. You should see that the response is provided by Novita AI.
## Embedded Model Use
### Step 1 Locate Knowledge Base
Click **"Settings"**, then **"Knowledge Base"**. If there’s nothing there yet, you’ll be prompted to create a new one (the screenshot shows one already created).
### Step 2 Create Knowledge Base
When creating a new knowledge base, remember to select your desired **embedding model**, then click **"Create"**.
### Step 3 Start Using
Return to the **chat page**, and click the **knowledge base icon** below the input box. Then, select the knowledge base you just created. Now start your conversation with Novita AI LLM on ChatBox.
# Claude Code
Source: https://novita.ai/docs/guides/claude-code
Claude Code is an AI-powered coding assistant published by Anthropic that provides a terminal interface, allowing developers to delegate complex programming tasks directly from the terminal to Claude Code for completion.
Now, Novita provides [Anthropic SDK compatible LLM API services](/guides/llm-anthropic-compatibility), enabling you to easily use Novita LLM models in Claude Code to complete tasks. Please refer to the guide below to complete the integration process.
## Quick Start
### 1. Install Claude Code
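The installation steps are abbreviated here. As a rough sketch, Claude Code can be installed via npm and pointed at Novita through the standard `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables; the base URL below is an assumption, so verify it against the [Anthropic compatibility guide](/guides/llm-anthropic-compatibility):

```bash theme={"system"}
# Install the Claude Code CLI (requires Node.js)
npm install -g @anthropic-ai/claude-code

# Point Claude Code at Novita's Anthropic-compatible API
# (the base URL is an assumption; verify it in the compatibility guide)
export ANTHROPIC_BASE_URL="https://api.novita.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="{{API Key}}"

# Start an interactive session
claude
```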
### 3. Build a web game from scratch
Input your task description, then press `Enter` to start this task.
```bash Bash icon=terminal theme={"system"}
> Create a ping-pong web game. Use only HTML, CSS, and JavaScript, try to create some novel content, and the final output should be a single HTML file.
```
Claude Code will analyze your requirements, create a multi-step plan, and automatically begin executing the tasks.
After completing each task, Claude Code will mark it as complete and proceed to plan and explain the details of the next task.
### 4. Task Results and Preview
After all tasks are completed, you will see the following messages in the terminal:
At this point, you can open the `gravity-pong.html` file in your browser to view and play the game.
### 5. Use Git with Claude Code
Claude Code makes Git operations conversational:
```bash Bash icon=terminal theme={"system"}
> what files have I changed?
```
```bash Bash icon=terminal theme={"system"}
> commit my changes with a descriptive message
```
You can also prompt for more complex Git operations:
```bash Bash icon=terminal theme={"system"}
> create a new branch called feature/quickstart
```
```bash Bash icon=terminal theme={"system"}
> show me the last 5 commits
```
```bash Bash icon=terminal theme={"system"}
> help me resolve merge conflicts
```
### 6. Improve the game
As we can see, this game needs improvement: the orbs' position overlaps with the paddle's position, which affects the gaming experience. Next, we will reposition the orbs to the top-right corner and add game restart functionality.
```bash Bash icon=terminal theme={"system"}
> Position the orbs in the top-right corner and support game restart functionality.
```
This is the game preview after the improvement:
## Try More Workflows
For reference, the following provides some prompt examples for different workflows:
* Code Refactoring
```bash Bash icon=terminal theme={"system"}
> Please refactor the current project using Next.js framework.
```
* Write Unit Tests
```bash Bash icon=terminal theme={"system"}
> Please write some unit tests for the pricing policy in the project.
```
* Update Documentation
```bash Bash icon=terminal theme={"system"}
> Please update the installation dependencies section in the README.
```
* Code Review
```bash Bash icon=terminal theme={"system"}
> Please review the changes and provide optimization suggestions.
```
## Common Commands
| Command | Description | Example |
| :-------------------------- | :-------------------------------- | :---------------------------------- |
| `claude` | Start interactive mode | `claude` |
| `claude "task description"` | Run a one-time task | `claude "fix the build error"` |
| `claude -p "query"` | Run one-off query, then exit | `claude -p "explain this function"` |
| `claude -c` | Continue most recent conversation | `claude -c` |
| `claude -r` | Resume a previous conversation | `claude -r` |
| `claude commit` | Create a Git commit | `claude commit` |
| `/clear` | Clear conversation history | `> /clear` |
| `/help` | View available commands | `> /help` |
| `exit` or Ctrl+C | Exit Claude Code | `> exit` |
# CodeCompanion
Source: https://novita.ai/docs/guides/codecompanion
Supercharge Your Neovim Workflow with Novita AI and CodeCompanion.nvim.
CodeCompanion.nvim is a lightweight yet powerful Neovim plugin that connects advanced language models (LLMs) directly to your editor, enabling developers to work smarter and faster. With built-in support for Novita AI’s state-of-the-art models, this integration transforms your workflow by offering intelligent code suggestions, automated debugging, and streamlined refactoring tools.
In this comprehensive guide, we’ll walk you through the step-by-step process of setting up Novita AI with CodeCompanion.nvim. Learn how to optimize your Neovim setup and unlock the full power of AI-assisted coding for faster, smarter development.
## How to Leverage Novita AI with CodeCompanion.nvim
You can find the GitHub repository of CodeCompanion.nvim here: [olimorris/codecompanion.nvim](https://github.com/olimorris/codecompanion.nvim).
### Step 1: Generate Your Novita AI API Key
1. [Log in](https://novita.ai/user/login) to your Novita AI account.
2. Access the [Key Management Page](https://novita.ai/settings/key-management).
3. Create a new API key and copy it for later use.
### Step 2: Select a Model
2. Visit the [Novita AI Model Library](https://novita.ai/models).
2. Choose a model that suits your needs (e.g., `meta-llama/llama-3.1-8b-instruct`).
3. Note down the model name.
### Step 3: Configure CodeCompanion
### JetBrains
Step 1: Access IDE settings by pressing Ctrl + Alt + S in your JetBrains environment.
Step 2: Navigate to plugins and search Continue in the JetBrains marketplace.
Step 3: Click `Install` and find the Continue logo on your right toolbar.
## How to Integrate Novita AI with Continue Using an API Key
### Step 1: Open VS Code
### Step 2: Search Continue
* Navigate to extensions and search Continue in the top search bar.
### Step 3: Install Continue
* Install the Continue extension by selecting the first result.
### Step 4: Click Continue
* Click the Continue icon on your left sidebar after installation completes.
### Step 5: Add your Chat Model (e.g. Novita AI)
* Select Novita AI from the provider menu in each marked location.
### Step 6: Enter the API key from Novita AI and Get Connected
* Copy your Novita AI API Key from the user avatar section for authentication.
# Cursor
Source: https://novita.ai/docs/guides/cursor
Learn how to integrate your Novita AI API keys with Cursor to unlock powerful AI models for programming. Get step-by-step instructions for seamless setup.
This guide will walk you through the steps needed to integrate Novita AI's models with Cursor. Using your own API keys, you can leverage Novita AI's large language models (LLMs) for custom AI messages in Cursor. This integration ensures that you can run AI-powered conversations and interactions while keeping full control over your API usage and cost.
## What is Cursor?
Cursor is a code editor built for programming with AI. It integrates with multiple large language models (LLMs) and allows you to input your own API keys, giving you full control over AI usage and costs. Whether you're coding or interacting with AI, Cursor streamlines the experience with features like smart autocomplete, auto-suggestions, and multi-model support, all within your development environment.
## Prerequisites
Before you begin the integration, ensure you have the following:
### **Novita AI LLM API Key**
* **Create an account**: Visit [Novita AI’s website](https://novita.ai/) and sign up for an account.
* **Generate your API Key**: After logging in, navigate to the [Key Management](https://novita.ai/settings/key-management) page to generate your API key. This key is essential to connect Novita AI’s models to Cursor.

* **Select a Model Name**: You’ll need to copy the model name you want to use from Novita AI’s [Model Library](https://novita.ai/models/llm/deepseek-deepseek-r1). Some available models include:
* `deepseek/deepseek-r1`
* `deepseek/deepseek-v3`
* `deepseek/deepseek-r1-distill-llama-70b`
* `deepseek/deepseek-r1-distill-llama-8b`
* `deepseek/deepseek-r1-distill-qwen-32b`
* `deepseek/deepseek-r1-distill-qwen-14b`
### **Download the Cursor App**
* Go to the official [Cursor website](https://cursor.com/) and download the Cursor app.
* Download Cursor App from [official website](https://cursor.com/)
## Integration Steps
### Connect Novita AI to Cursor
* Open the **Cursor App** and go to **Settings**.
* Navigate to the **Models** section.
* Uncheck all the other models that are pre-configured in Cursor.
* In the **Model Name** field, paste the model name you copied from the **Novita AI Model Library** (e.g., `deepseek/deepseek-r1`).
* Enter your **Novita AI API key** in the designated field.
* Click the **Verify** button to ensure your API key is correct. Once validated, the API key will be activated.
* In the **Open AI Base URL** field, override the default URL with the Novita AI endpoint:[`https://api.novita.ai/openai`](https://api.novita.ai/openai)
By following these steps, you’ll link your Novita AI API key with the Cursor app, enabling you to use Novita AI’s models through Cursor.
### Start a Chat with AI in Cursor
To open the **Chat** interface, either:
* Click on **Toggle AI Pane**, or
* Press the keyboard shortcut **Ctrl + Alt + B** to start a new chat.
You can now send prompts and interact with the models you've added.
## Notes on Cursor Features
* **Tab Completion, Apply from Chat, and Composer**: These features require specialized models and will not work with custom API keys. If you wish to use these specific features, consider switching to the default models provided by Cursor.
# DeepSearcher
Source: https://novita.ai/docs/guides/deepsearcher
Easily access module support from Novita AI on DeepSearcher to build advanced search applications.
[DeepSearcher](https://github.com/zilliztech/deep-searcher) is an open-source solution designed to transform private data search and reasoning by integrating cutting-edge large language models (LLMs) with vector databases such as Milvus. With support for LLMs and embedding models from Novita AI, this powerful configuration delivers unmatched accuracy and efficiency in private data search.
This step-by-step guide will walk you through how to quickly and easily configure Novita AI with DeepSearcher.
## How to use DeepSearcher with Novita
Step 1: Follow the example in `examples/basic_example.py`.
Step 2: Add the following code below the line `config = Configuration()`:
```python theme={"system"}
config.set_provider_config("llm", "NOVITA", {"model": "deepseek/deepseek-r1-turbo"})
config.set_provider_config("embedding", "NovitaEmbedding", {"model": "baai/bge-m3"})
```
Step 3: Run `examples/basic_example.py` to execute the integration.
You'll get the following results:
# Dify
Source: https://novita.ai/docs/guides/dify
Learn how to integrate Novita AI’s DeepSeek LLMs with Dify to build intelligent, multi-turn AI applications for smarter, context-aware conversations.
With the Novita AI & Dify integration, you gain seamless access to a comprehensive suite of Novita AI LLM models, including DeepSeek, Llama, Qwen, and more, enabling you to effortlessly build and deploy advanced AI applications tailored to your needs.
This guide will walk you through integrating Novita AI’s DeepSeek R1 model with the Dify platform, enabling you to create AI applications with advanced multi-turn reasoning capabilities. DeepSeek ensures your AI applications understand the context and can hold natural, dynamic conversations, making interactions feel human-like.
## What is Dify?
**Dify** is an open-source platform that simplifies the development of generative AI applications. Whether you’re building a chatbot, knowledge assistant, or other AI-powered tools, Dify makes it easy to integrate advanced language models like **Novita AI’s DeepSeek** and deploy them quickly, with minimal coding.
### Key Features of Dify:
* **Visual Development**: Dify’s drag-and-drop interface allows you to quickly create and deploy applications without extensive coding, reducing development time.
* **Knowledge Base Augmentation**: Enhance AI responses using **Retrieval-Augmented Generation (RAG)**. This feature connects your AI to internal documents or specialized data for accurate, contextual, and informative answers.
* **Workflow Expansion**: Integrate sophisticated logic into your AI apps with **functional nodes**. You can also connect third-party platforms for additional functionality.
* **Data Insights**: Track important performance metrics such as conversations, engagement, and response quality. Dify also integrates with specialized analytics platforms to monitor and improve AI performance.
## Prerequisites
Before you begin, make sure you have:
* **Novita AI LLM API Key**:
* Visit [Novita AI’s website](https://novita.ai) and create an account.
* After logging in, go to the [**Key Management**](https://novita.ai/settings/key-management) page to generate your **API Key**. This key is required to connect Novita AI’s models to Dify.

* **Dify Account**:
* Sign up for a Dify account at [Dify.ai](https://dify.ai) to start building AI applications.
## Integration Steps
### 1. Connect Novita AI to Dify
To connect Novita AI’s models with Dify:
* Log in to your Dify account.
* Click on your profile icon or name in the top-right corner and select **Settings**.
* In the **Model Providers** section, find **Novita AI** in the list.
* Paste your **Novita AI API Key** into the provided field and click **Save**.
With this integration, you’ll now have access to **DeepSeek R1** and other Novita AI models directly in Dify.
### 2. Create a DeepSeek AI Application
Once the integration is complete, you can create an application powered by DeepSeek R1:
* From the Dify homepage, click **Create Blank App** in the left sidebar.
* Choose **Chatbot** as the application type.
* Give your app a name (e.g., “DeepSeek R1 Bot”) and click **Create**.
* From the **Model** dropdown, select **Novita AI DeepSeek R1**.
### 3. Enable Knowledge Base for Enhanced Text Analysis
To improve your AI’s response accuracy, augment it with a **knowledge base**. Using **Retrieval-Augmented Generation (RAG)**, your AI will be able to access documents and generate more contextually relevant responses.
#### Step 1: Create a Knowledge Base
* In Dify, go to the **Knowledge Base** section and click **Create Knowledge**.
* Upload documents (e.g., guides, FAQs, manuals) that provide relevant information for your AI to use.
* Use **Parent-Child Segmentation Mode** to maintain document hierarchy and context, ensuring DeepSeek processes the content correctly and understands relationships between sections.
#### Step 2: Integrate the Knowledge Base into Your AI App
* In your chatbot’s **Context Settings**, click the option to **Add Knowledge Base**.
* Choose the documents you uploaded and integrate them into your app’s context to improve its responses.
#### Step 3: Share Your AI Application
Once your AI app is ready, you can share or embed it on external platforms:
* **Public Link**: Generate a public link for others to access your AI application.
* **Embed on Websites**: Embed your app directly onto your website using Dify’s provided embed code.
### 4. Enhance AI Capabilities with Workflow-based Applications
If you need more than just a chatbot, Dify supports **workflow-based applications**. This allows you to add custom business logic and extend your AI’s capabilities by using functional nodes.
* Choose **Workflow** as the application type.
* Use **drag-and-drop nodes** to define your app’s behavior based on conditions or actions.
* Integrate external APIs (e.g., Google Search, databases) to provide richer data for your AI to process, enabling more insightful and automated responses.
Integrating Novita AI’s DeepSeek R1 with Dify provides a robust platform for creating advanced AI applications. With DeepSeek’s multi-turn reasoning, your AI App can have more dynamic, context-aware conversations, making it highly effective for building chatbots, knowledge assistants, and more.
# DocsGPT
Source: https://novita.ai/docs/guides/docsgpt
Seamlessly integrate Novita AI with DocsGPT to unlock powerful AI models for enhanced workflows.
DocsGPT simplifies documentation with AI-powered assistance. Integrating it with Novita AI enhances its performance, offering faster processing, scalable resources, and advanced model support for improved productivity.
This guide walks you through how to use DocsGPT with Novita AI based on the OpenAI API, offering a way to query your content and receive customized answers.
## **How to use DocsGPT**
### **Prerequisites:**
**Docker:** Ensure you have [Docker](https://docs.docker.com/engine/install/) installed and running on your system.
### **Launching DocsGPT (macOS and Linux)**
For macOS and Linux users, the easiest way to launch DocsGPT is using the provided `setup.sh` script. This script automates the configuration process and offers several setup options.
Step 1: Download the DocsGPT repository
* First, you need to download the DocsGPT repository to your local machine. You can do this using Git:
```bash theme={"system"}
git clone https://github.com/arc53/DocsGPT.git
cd DocsGPT
```
Step 2: Run the `setup.sh` script
* Navigate to the DocsGPT directory in your terminal and execute the `setup.sh` script:
```bash theme={"system"}
./setup.sh
```
This interactive script will guide you through setting up DocsGPT. It offers four options: using the public API, running locally, connecting to a local inference engine, or using a cloud API provider. The script will automatically configure your `.env` file and handle necessary downloads and installations based on your chosen option.
### **Launching DocsGPT (Windows)**
For Windows users, please refer to the [Docker Deployment documentation](https://docs.docsgpt.cloud/Deploying/Docker-Deploying) for detailed step-by-step instructions on setting up DocsGPT using Docker.
## **How to Integrate Novita AI API with DocsGPT**
Step 1: Log in to [Novita AI](https://novita.ai) and create an [API Key](https://novita.ai/settings/key-management)
Step 2: Select option 4 to connect a cloud API provider in your terminal
Step 3: Choose option 7 (Novita) and enter the API key you just created
Step 4: Wait for the startup process to complete
Step 5: Access DocsGPT in your browser
* Once the setup is complete and Docker containers are running, navigate to [http://localhost:5173/](http://localhost:5173) in your web browser to access the DocsGPT web application.
Step 6: Stopping DocsGPT
* To stop DocsGPT, simply open a new terminal in the `DocsGPT` directory and run:
```bash theme={"system"}
docker compose -f deployment/docker-compose.yaml down
```
* (or the specific `docker compose` command shown at the end of the `setup.sh` execution, which may include optional compose files depending on your choices).
# Common Error Codes
Source: https://novita.ai/docs/guides/error
This document summarizes the most common error codes returned by the Novita API platform, along with definitions, causes, and recommended solutions to help users troubleshoot efficiently.
***
## Error Code 400
**Description**: Invalid request parameters.\
**Solution**:\
Review the error message details and check whether the parameter formats, field names, or value ranges comply with the API documentation.
***
## Error Code 401
**Description**: API Key is missing or incorrect.\
**Solution**:
* Ensure the API Key is provided in the request;
* Verify that the API Key is correct and has not expired;
* If using environment variables or config files, confirm they are being read correctly during execution.
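As a quick sanity check, you can send a minimal authenticated request. This sketch uses the `/v3/model` model-listing path mentioned elsewhere in these docs; a `401` response here means the key itself is not being accepted:

```bash theme={"system"}
curl --location --request GET 'https://api.novita.ai/v3/model' \
  --header 'Authorization: Bearer {{API Key}}'
```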
***
## Error Code 403
**Description**: Access denied due to insufficient permissions.\
**Solution**:
* Verify that your account associated with the API Key has permission to access the requested model;
* Some models require identity verification to access:
* Log in to the console and check your account’s verification status;
* If not verified, complete identity verification first;
* Alternatively, use an API Key from an already verified account.
***
## Error Code 429
**Description**: Rate limit exceeded (Too Many Requests).\
**Solution**:
* Check if the limit is due to **TPM** (tokens per minute) or **RPM** (requests per minute);
* Refer to the official Rate Limits documentation;
* To raise your rate limit, contact support or use a verified account.
***
## Error Code 503 / 504
**Description**: Backend timeout or service unavailable, often caused by high system load or throttling.
### Possible Causes:
* GPU or CPU overload on model service nodes;
* Long generation time on non-streaming requests exceeds gateway timeout;
* Failures in downstream services (e.g., Redis, model engine);
* Traffic shaping module activated surge protection and returned 503.
### Recommended Solutions:
**For API Users**:
* **Enable retry mechanism**: Use exponential backoff to prevent repeated overload;
* **Switch to streaming mode**: Streaming responses return tokens as they’re generated, lowering latency and timeout risk (see the sketch after this list);
* **Optimize client settings**: Ensure `client_timeout` and `proxy_timeout` exceed 60 seconds;
* **Avoid peak periods**: For high concurrency scenarios, retry during off-peak hours.
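A streaming request differs only in setting `stream` to `true` in the body. A minimal sketch against the OpenAI-compatible endpoint (the path is assumed from the SDK base URL used in this documentation):

```bash theme={"system"}
curl --location --request POST 'https://api.novita.ai/openai/chat/completions' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {{API Key}}' \
  --data-raw '{
    "model": "deepseek/deepseek-v3-0324",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```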
**For Platform Ops**:
* Enhance monitoring and auto-scaling of model services;
* Adjust gateway-level `proxy_read_timeout` appropriately;
* Implement fine-grained throttling rules (e.g., priority queues, core-business prioritization);
* Use Prometheus + Alertmanager to trigger alerts on 503/504 spikes.
***
## Error Code 500
**Description**: Internal server error—typically caused by backend exceptions or model engine crashes.\
**Solution**:
* These issues usually require platform-side resolution. Contact support to investigate logs and system resources;
* Optionally, try switching models or falling back to a less resource-intensive configuration.
***
## Other Errors
For undefined or undocumented errors:
* First, refer to the `message` field in the API response;
* Next, check request logs or console traces;
* Finally, contact Novita support or submit a ticket for further assistance.
# FAQ
Source: https://novita.ai/docs/guides/faq
Here are some frequently asked questions about Novita AI. Before contacting our support team, please check the FAQs below to help you quickly find solutions.
### 6. No instance specifications with a specified CUDA version.
CUDA versions are backward compatible. For example, if your service relies on CUDA version 12.1, you can choose an instance specification with a CUDA version greater than or equal to 12.1.
### 7. What is the maximum CUDA version supported by the platform?
You can check the allowed CUDA versions in the "Filter" module at the bottom right corner of the Explore page.
### 8. How to diagnose the "Save Image" failure?
First, try to troubleshoot the problem through the logs of the "Save Image" task. If you are saving the image to a private repository address, please check whether your Container Registry Auth Configuration is correct. If the problem cannot be resolved, you can contact us.
### 9. Can dedicated IP be supported?
Yes. Currently, this capability is not open to the public. If you have such requirements, please contact us.
### 10. How to check the GPU usage of the instance?
Due to the PID isolation of Docker containers, the `nvidia-smi` command cannot be used to view the process. You can install the `py3nvml` library and use the shell command to check the GPU usage:
```bash theme={"system"}
# Install the py3nvml library.
$ pip install py3nvml
# Check the GPU usage.
$ py3smi
Fri Sep 20 12:17:39 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI Driver Version: 550.54.14 |
+---------------------------------+---------------------+---------------------+
| GPU Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
+=================================+=====================+=====================+
| 5 35% 28C 8 11W / 450W | 353MiB / 24564MiB | 0% Default |
+---------------------------------+---------------------+---------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU Owner PID Uptime Process Name Usage |
+=============================================================================+
+-----------------------------------------------------------------------------+
```
* Select the API Key Tab.
* Generate New Key: Click on `Generate new key`. During the API key creation process, you can enable `read` and `write` permissions. Write keys grant access to Helicone through the proxy service, feedback, and other Helicone services when calling `POST` endpoints or using the gateway.
### **Step 3: Pick Your Preferred Integration Method**
* Choose your provider from the options below to view specific instructions.
### **Step 4: Send Your First Request**
* Upon receiving your requests, you will see them in the `Requests` tab.
* Your new request will be immediately visible on the `Dashboard`.
## **Obtaining Novita AI API Key**
To obtain your [Novita AI API key](https://novita.ai/), simply log into your account, navigate to the LLM API key management section, and add the necessary credits to begin.
### **Step 1: Go to Novita AI and Log in**
* Novita AI offers multiple login options: use your Google or GitHub credentials for instant account creation, or register directly with your email address.
### **Step 2: Manage Novita AI LLM API Key**
Novita AI ensures secure API access by using Bearer authentication, where your API key is included in the header format "Authorization: Bearer \{\{API Key}}".
* Your first login generates a default key automatically - access all keys through the "Key Management" section in settings.
* Additional keys can be created using the “+ Add New Key” function.
## **How to Integrate Novita AI API with Helicone**
The Novita AI-Helicone integration requires three simple steps: accessing both platform accounts, configuring your API keys as environment variables (HELICONE\_API\_KEY and NOVITA\_API\_KEY), and updating the base URL with proper authentication headers.
### **Step 1: Log into your Novita AI Account and Helicone Account**
* Access your [Novita AI](https://novita.ai/) account or create one, then generate your API key directly from the dashboard.
* Sign in to [Helicone](https://www.helicone.ai/) (or create a new account) to obtain your API key.
### **Step 2: Set HELICONE\_API\_KEY and NOVITA\_API\_KEY as Environment Variables**
```bash theme={"system"}
# PowerShell syntax; replace the placeholder values with your actual keys.
$env:HELICONE_API_KEY="your-helicone-api-key"
$env:NOVITA_API_KEY="your-novita-api-key"
```
### **Step 2: Choose Inference API Modes**
* Custom Key Mode: Calls are sent directly to the inference provider, utilizing your own API key.
* HF-Routed Mode: In this mode, no provider token is required. Charges are applied to your Hugging Face account instead of the provider's account.
### **Step 3: Explore Compatible Providers on Model Pages**
* Model pages display the third-party inference providers compatible with the selected model, sorted by user preference.
## **Using Huggingface\_hub from Python via the Client SDKs**
### **Step 1: Install** [**Huggingface\_hub**](https://github.com/huggingface/huggingface_hub)
```bash theme={"system"}
pip install huggingface_hub
```
### **Step 2: Call model API in Python**
```python theme={"system"}
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="novita",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # get your key at https://novita.ai/settings/key-management
)

# An example question.
messages = [
    {
        "role": "user",
        "content": "Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?",
    },
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=512,
)

print(completion.choices[0].message)
```
## **Using Huggingface\_hub from JS via the Client SDKs**
```javascript theme={"system"}
import { HfInference } from "@huggingface/inference";

const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
  model: "deepseek-ai/DeepSeek-R1",
  messages: [
    {
      role: "user",
      content: "What is the capital of France?"
    }
  ],
  provider: "novita",
  max_tokens: 500
});

console.log(chatCompletion.choices[0].message);
```
# Introduction
Source: https://novita.ai/docs/guides/introduction
Novita provides simple, reliable, and cost-effective cloud infrastructure for running AI models. We empower developers with scalable, production-ready model inference so they can focus on building AI applications.
### Step 2: Select Kohya\_ss:GUI Template
* Locate and select the `Kohya_ss:GUI` official template.
* Click the `Deploy` button under the 4090 GPU card option to enter the instance creation page.
### Step 3: Configure Disk Parameters and Review Configuration Settings
* On the left panel, adjust the disk settings as needed:
* Set appropriate system disk size;
* Configure local disk capacity based on your storage needs.
* Check the right panel for configuration options:
* Verify image settings are correct;
* Confirm startup commands are properly configured;
* Ensure ports and environment variables meet your requirements.
* Confirm all settings are correct and then click the `Next` button to advance to the final confirmation page.
### Step 4: Proceed to Confirmation and Deploy Your Instance
* Review the complete instance configuration summary.
* Verify the cost details displayed on this page.
* Click `Deploy` to initiate the deployment process.
### Step 5: Wait as the System Creates Your Instance
### Step 6: Monitor Deployment Progress and Track Image Download
* After deployment, the system will automatically redirect you to the instance management page.
* Your new instance will display `Pulling` status while downloading the image.
* Click the arrow icon next to your instance name to expand the instance details panel and view the image download progress in real time.
* Once the image download completes, the instance status will change from `Pulling` to `Running`.
### Step 7: Check Instance Logs
* Click the `Logs` button on your instance and select `Instance Logs` from the available options.
* Observe the Kohya\_ss service startup process in the logs and wait for confirmation that all services have loaded successfully.
### Step 8: Connect to Your Instance
* Close the logs view when ready and click the `Connect` button to view connection options.
* View the various connection methods for your instance: SSH, TCP, and HTTP.
* For Kohya\_ss GUI access, focus on the HTTP connection details: in the Connection Options section, click `Connect to HTTP Service` to open the GUI in a new browser tab or window.
### Step 9: Begin Using Your Instance
* Allow a few moments for the web interface to fully load, and you are ready to run Kohya\_ss:GUI on Novita AI.
# LangChain
Source: https://novita.ai/docs/guides/langchain
Integrate Novita AI with LangChain to build intelligent, language model-driven applications. Learn setup, API usage, and function calling workflows.
This guide will walk you through the process of integrating Novita AI with LangChain. You’ll be able to use Novita AI’s powerful language models with LangChain’s robust tools for building language model-driven applications.
## What is LangChain?
LangChain is a framework for developing applications powered by language models. It enables applications that:
* **Are context-aware**: LangChain connects a language model to sources of context (such as prompt instructions, few-shot examples, content to ground its response in, etc.).
* **Reason**: LangChain allows language models to reason—whether it’s determining how to answer based on provided context or deciding what actions to take.
With LangChain, you can build complex workflows, enhance model behavior with external knowledge, and create intelligent systems that can interact dynamically with users and data sources.
## Prerequisites
Before you start, make sure you have the following:
* **Novita AI LLM API Key**:
* Visit [Novita AI’s website](https://novita.ai/) and create an account.
* After logging in, go to the [**Key Management**](https://novita.ai/settings/key-management) page to generate your **API Key**. This key is required to connect Novita AI’s models to LangChain.

* A basic understanding of Node.js, JavaScript, and how to use environment variables.
## Integration
### Step 1: Set the API Key
For most environments, set the environment variable `NOVITA_API_KEY` as follows:
```bash theme={"system"}
export NOVITA_API_KEY="your-api-key"
```
Make sure to replace `your-api-key` with the actual key you got from Novita AI.
### Step 2: Install the Required Packages
To integrate Novita AI with LangChain, you need to install the `@langchain/community` package, which includes the Novita AI integration.
Choose one of the following commands to install the necessary packages:
**Using npm:**
```bash theme={"system"}
npm install @langchain/community @langchain/core
```
**Using yarn:**
```bash theme={"system"}
yarn add @langchain/community @langchain/core
```
**Using pnpm:**
```bash theme={"system"}
pnpm add @langchain/community @langchain/core
```
### Step 3: Instantiate the Novita AI Model
Once you’ve installed the necessary packages, you can instantiate the Novita AI model using the `ChatNovitaAI` class.
Here’s an example that demonstrates how to do that:
```javascript theme={"system"}
import { ChatNovitaAI } from "@langchain/community/chat_models/novita";

const llm = new ChatNovitaAI({
  model: "deepseek/deepseek-r1", // You can choose the model you want to use
  temperature: 0, // Optional: controls randomness; 0 is deterministic
  // Other parameters can be set here...
});
```
### Step 4: Invoke the Model for Chat Completion
Once the model is instantiated, you can use it to generate chat completions by invoking it with a message.
Here's an example of how to send a message and get a response:
```javascript theme={"system"}
const aiMsg = await llm.invoke([
  {
    role: "system",
    content:
      "You are a helpful assistant that translates English to French. Translate the user sentence.",
  },
  {
    role: "human",
    content: "I love programming.",
  },
]);

console.log(aiMsg.content); // The model’s response will be printed here
```
### Step 5: Chain Model with Prompt Templates
LangChain allows you to create powerful workflows by chaining models together with prompt templates. This can be especially useful when you want to reuse the same format for multiple inputs.
Here’s an example where we chain the Novita AI model with a custom prompt template for translating between languages:
```javascript theme={"system"}
import { ChatPromptTemplate } from "@langchain/core/prompts";

// Create a template for translating languages
const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant that translates {input_language} to {output_language}.",
  ],
  ["human", "{input}"],
]);

// Chain the prompt with the model
const chain = prompt.pipe(llm);

// Invoke the chain with inputs for translation
const result = await chain.invoke({
  input_language: "English",
  output_language: "German",
  input: "I love programming.",
});

console.log(result.content); // The translated text will be printed here
```
### Step 6: Customize the Workflow
You can modify the temperature, add more messages, or tweak other parameters depending on your use case. LangChain is highly flexible, allowing you to design complex interactions by chaining multiple prompts, adding conditional logic, or working with different models.
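For instance, here is a minimal sketch of such a customization, reusing the `prompt` template from Step 5 (the temperature value and the `StringOutputParser` are illustrative choices, not requirements):
```javascript theme={"system"}
import { ChatNovitaAI } from "@langchain/community/chat_models/novita";
import { StringOutputParser } from "@langchain/core/output_parsers";

// A more exploratory configuration: higher temperature for more varied output.
const creativeLlm = new ChatNovitaAI({
  model: "deepseek/deepseek-r1",
  temperature: 0.8,
});

// `prompt` is the ChatPromptTemplate defined in Step 5; the parser turns
// the model's message object into a plain string.
const creativeChain = prompt.pipe(creativeLlm).pipe(new StringOutputParser());

const text = await creativeChain.invoke({
  input_language: "English",
  output_language: "German",
  input: "I love programming.",
});
console.log(text);
```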
## Function Calling with Novita AI and LangChain
To implement function calling (or tool usage) with Novita AI's LLM API, LangChain can serve as a convenient framework. In this example, we’ll create a simple math application that allows the model to perform addition and multiplication operations via function calls.
💡 While this guide uses LangChain for convenience, implementing function calling doesn’t require any specific framework. The key is designing the right prompts to make the model understand and correctly invoke functions. LangChain is used here simply to streamline the implementation.
### Prerequisites
First, install the required packages:
```bash theme={"system"}
pip install langchain-openai python-dotenv
```
### Setting Up the Environment
Create a `.env` file in your project root and add your Novita AI API key:
```
NOVITA_API_KEY=your_api_key_here
```
### Implementation Steps
1. **Define the Tools**
First, let’s create two simple mathematical tools using LangChain's `@tool` decorator:
```python theme={"system"}
from langchain_core.tools import tool

@tool
def multiply(x: float, y: float) -> float:
    """Multiply two numbers together."""
    return x * y

@tool
def add(x: int, y: int) -> int:
    """Add two numbers."""
    return x + y

tools = [multiply, add]
```
2. **Create the Tool Execution Function**
Next, implement a function to execute the tools:
```python theme={"system"}
from typing import Any, Dict, Optional, TypedDict

from langchain_core.runnables import RunnableConfig


class ToolCallRequest(TypedDict):
    name: str
    arguments: Dict[str, Any]


def invoke_tool(
    tool_call_request: ToolCallRequest,
    config: Optional[RunnableConfig] = None,
):
    """Execute the specified tool with given arguments."""
    tool_name_to_tool = {tool.name: tool for tool in tools}
    name = tool_call_request["name"]
    requested_tool = tool_name_to_tool[name]
    return requested_tool.invoke(tool_call_request["arguments"], config=config)
```
3. **Set Up the LangChain Pipeline**
Create a chain that uses Novita AI's LLM to select and prepare tool calls:
```python theme={"system"}
import os

from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import render_text_description


def create_chain():
    """Create a chain that uses the specified LLM model to select and prepare tool calls."""
    model = ChatOpenAI(
        model="meta-llama/llama-3.3-70b-instruct",
        api_key=os.getenv("NOVITA_API_KEY"),
        base_url="https://api.novita.ai/openai",
    )

    rendered_tools = render_text_description(tools)
    system_prompt = f"""\
You are an assistant that has access to the following set of tools.
Here are the names and descriptions for each tool:

{rendered_tools}

Given the user input, return the name and input of the tool to use.
Return your response as a JSON blob with 'name' and 'arguments' keys.
The `arguments` should be a dictionary, with keys corresponding
to the argument names and the values corresponding to the requested values.
"""

    prompt = ChatPromptTemplate.from_messages(
        [("system", system_prompt), ("user", "{input}")]
    )
    return prompt | model | JsonOutputParser()
```
4. **Create the Main Processing Function**
Implement the main function that processes mathematical queries:
```python theme={"system"}
def process_math_query(query: str):
    """Process a mathematical query by using an LLM to select the appropriate tool and execute it."""
    chain = create_chain()
    message = chain.invoke({"input": query})
    result = invoke_tool(message, config=None)
    return message, result
```
5. **Usage Example**
Here’s how to use the implementation:
```python theme={"system"}
if __name__ == "__main__":
    message, result = process_math_query("what's 3 plus 1132")
    print(result)  # Output: 1135
```
# Langflow
Source: https://novita.ai/docs/guides/langflow
Six steps to achieve Novita AI's LLM API integration with Langflow, enabling faster application deployment and simplified workflows.
The integration of Langflow and Novita AI creates a powerful development environment that streamlines the creation of multi-agent and RAG applications. This combination leverages Langflow's intuitive visual flow builder for rapid prototyping while enabling seamless workflow design and testing capabilities. Novita AI enhances this ecosystem with its advanced Model APIs, featuring industry-leading language models including Llama, DeepSeek, and Mistral.
This step-by-step guide will help you access Novita AI LLM API on Langflow and start building AI-powered applications.
## **Accessing Novita AI LLM API on Langflow**
### Step 1: **Expand the "Models" Tag**
* Find and expand the "Models" section on Langflow’s sidebar.
### **Step 2: Drag Novita AI to Your Canvas**
* Select and position the "Novita AI" component from the sidebar onto the Langflow canvas area.
### **Step 3: Create a Global Variable for the Novita AI API Key**
* Set up Langflow authentication instantly by adding your Novita AI API Key as a global variable.
### Step 4: **Get your Novita AI API Key**
* Generate your API key instantly on the [Key Management page](https://novita.ai/settings/key-management).
### Step 5: **Select a Model**
* Choose from Novita AI's powerful language models, such as DeepSeek, Llama, and Mistral, and select the one that best fits your project needs.
### **Step 6: Start Building Your AI App**
* Create advanced AI functionality by connecting Novita AI components to your Langflow workflow system.
# Langfuse
Source: https://novita.ai/docs/guides/langfuse
Step-by-step guide to set up Langfuse with Novita AI to develop, monitor, evaluate, and debug AI applications.
With Langfuse, your team can collaboratively debug, analyze, and iterate on their LLM applications built with Novita AI. Its fully integrated features streamline the development workflow, enhancing efficiency and accelerating progress.
This guide shows you how to integrate Novita AI with Langfuse. Novita AI's API endpoints for chat, language and code are fully compatible with OpenAI's API. This allows us to use the Langfuse OpenAI drop-in replacement to trace all parts of your application.
## Prerequisites
Before you begin, make sure you have the following:
* Novita AI LLM API Key:
* Visit Novita AI’s website to create an account.
* Log in and go to the Key Management page to generate your API Key, which is essential for integrating Novita AI’s models with Langfuse.
* Langfuse Account:
* Sign up for a Langfuse account on the official [Langfuse](https://langfuse.com) website to start building powerful AI applications.
## Integration Steps
### Step 1: Install Dependencies
Ensure you have installed all the required Python packages:
```shell theme={"system"}
pip install openai langfuse
```
### Step 2: Set Up Environment Variables
```python theme={"system"}
import os
# Get keys for your project from the project settings page
# https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-..." # DOCS EXAMPLE KEYS
os.environ["LANGFUSE_SECRET_KEY"] = "sk-..." # DOCS EXAMPLE KEYS
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region
# Get your Novita AI API key from the project settings page
os.environ["NOVITA_API_KEY"] = "..."
```
### Step 3: Langfuse OpenAI Drop-in Replacement
In this step we use the native [OpenAI drop-in replacement](https://langfuse.com/docs/integrations/openai/python/get-started) by importing `from langfuse.openai import openai`.
To start using Novita AI with OpenAI's client libraries, pass your Novita AI API key to the `api_key` option and change the `base_url` to `https://api.novita.ai/openai`:
```python theme={"system"}
# Instead of `import openai`:
from langfuse.openai import openai

client = openai.OpenAI(
    api_key=os.environ.get("NOVITA_API_KEY"),
    base_url="https://api.novita.ai/openai",
)
```
**Note:** The OpenAI drop-in replacement is fully compatible with the [Low-Level Langfuse Python SDKs](https://langfuse.com/docs/sdk/python/low-level-sdk) and [`@observe() decorator`](https://langfuse.com/docs/sdk/python/decorators) to trace all parts of your application.
### Step 4: Run an Example
The following code cell shows how to use the traced OpenAI client to call Novita AI's chat model. All API calls will be seamlessly traced by Langfuse.
```python theme={"system"}
client = openai.OpenAI(
    api_key=os.environ.get("NOVITA_API_KEY"),
    base_url="https://api.novita.ai/openai",
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.1-70b-instruct",
    messages=[
        {"role": "system", "content": "Act like you are a helpful assistant."},
        {"role": "user", "content": "What are the famous attractions in San Francisco?"},
    ],
)

print(response.choices[0].message.content)
```

Example output:

```
San Francisco, one of the most iconic cities in the world, is home to a plethora of famous attractions that cater to all interests and ages. Here are some of the most popular attractions in San Francisco:

1. **Golden Gate Bridge**: An engineering marvel and a symbol of San Francisco, the Golden Gate Bridge is a must-visit attraction. Take a walk or bike ride across the bridge for spectacular views of the city and the Bay.
2. **Alcatraz Island**: Explore the infamous former prison turned national park, which once housed notorious inmates like Al Capone. Take a ferry to the island and enjoy a guided tour of the prison and its surroundings.
3. **Fisherman's Wharf**: A bustling waterfront district, Fisherman's Wharf is famous for its seafood restaurants, street performers, and stunning views of the Bay Bridge and Alcatraz Island. Don't miss the sea lions at Pier 39!
4. **Chinatown**: San Francisco's Chinatown is one of the largest and oldest in the United States. Explore the colorful streets, try some authentic Chinese cuisine, and shop for unique souvenirs.
5. **Golden Gate Park**: A sprawling urban park that's home to several attractions, including the de Young Museum, the California Academy of Sciences, and the Japanese Tea Garden.
6. **Cable Cars**: A classic San Francisco experience, the cable cars offer a fun and historic way to explore the city. Take a ride on the Powell-Mason line to Fisherman's Wharf, or the Powell-Hyde line to Lombard Street.
7. **Lombard Street**: Known as the "crookedest street in the world," Lombard Street is a scenic and winding road that offers stunning views of the city.
8. **Union Square**: A vibrant public square in the heart of the city, Union Square is surrounded by shopping, dining, and entertainment options. Catch a show at the historic Curran Theatre or take a stroll through the square.
9. **The Painted Ladies**: A row of colorful Victorian houses on Alamo Square, the Painted Ladies are a iconic symbol of San Francisco's architecture. Take a photo in front of these stunning homes.
10. **The Exploratorium**: A museum of science, art, and human perception, the Exploratorium is a great place to learn and have fun. With interactive exhibits and stunning views of the Bay, it's a must-visit for families and science enthusiasts.
11. **Pier 39**: A popular shopping and dining destination, Pier 39 offers stunning views of the Bay Bridge, Alcatraz Island, and the sea lions that call the pier home.
12. **The de Young Museum**: A fine arts museum located in Golden Gate Park, the de Young Museum features a diverse collection of art and cultural exhibitions from around the world.

These are just a few of the many famous attractions in San Francisco. Whether you're interested in history, culture, science, or entertainment, San Francisco has something for everyone.
```
### Step 5: See Traces in Langfuse
After running the example model call, you can view the traces in Langfuse. These traces provide detailed information about your Novita AI API calls, including:
* Request parameters (model, messages, temperature, etc.)
* Response content
* Token usage statistics
* Latency metrics
[Public example trace link in Langfuse](https://cloud.langfuse.com/project/cm7ua5l6e05wlad07qr6ce2wn/traces/039cc8b2-dba0-479f-9cd6-63672bc08c71?timestamp=2025-03-06T02%3A15%3A15.184Z).
# LiteLLM
Source: https://novita.ai/docs/guides/litellm
Supercharge Your AI Applications with Novita AI and LiteLLM.
LiteLLM is an open-source Python library and proxy server that provides access, spend tracking, and fallbacks to over 100 LLMs through a unified interface in the OpenAI format. By leveraging Novita AI's cutting-edge models, the integration with LiteLLM empowers your AI applications with seamless model switching, dependable fallbacks, and intelligent request routing—all through a standardized completion API that ensures compatibility across multiple providers.
This guide will show you how to quickly get started with integrating Novita AI and LiteLLM, enabling you to set up this powerful combination and streamline your workflow with ease.
## **How to Integrate Novita AI with LiteLLM**
### Step 1: Install LiteLLM
* Install the LiteLLM library using pip to create a unified interface for working with different language models.
```bash theme={"system"}
pip install litellm
```
### Step 2: Set Up Your API Credentials
* Log in to [the key management page](https://novita.ai/settings/key-management) in Novita AI and click `Add New Key` to generate your API key.
### Step 3: Structure Your Basic API Call
* Create a completion request to Novita AI's models through LiteLLM's standardized interface.
```python theme={"system"}
from litellm import completion
import os

# Set ENV variables. Visit https://novita.ai/settings/key-management to get your API key.
os.environ["NOVITA_API_KEY"] = "novita-api-key"

response = completion(
    model="novita/deepseek/deepseek-r1",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)
```
### Step 4: Implement Streaming for Better User Experience
* Enable streaming mode for more interactive applications or when handling longer responses.
```python theme={"system"}
from litellm import completion
import os

# Set ENV variables. Visit https://novita.ai/settings/key-management to get your API key.
os.environ["NOVITA_API_KEY"] = "novita_api_key"

response = completion(
    model="novita/deepseek/deepseek-r1",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)

# Print tokens as they arrive.
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
)
```
# LlamaIndex
Source: https://novita.ai/docs/guides/llamaindex
Effortlessly integrate Novita AI with LlamaIndex to build intelligent, data-powered applications.
Designed for optimal indexing and retrieval, LlamaIndex excels in delivering high efficiency for applications requiring precise and fast data access. By combining [Novita AI](https://novita.ai/) with LlamaIndex, you will unlock key benefits such as superior data retrieval accuracy, unmatched scalability, and cost-effective performance.
This guide will walk you through how to use LlamaIndex with Novita AI via the OpenAI-compatible API, offering smarter, scalable, and highly efficient AI solutions that drive innovation and deliver exceptional results for developers.
## **How to Integrate Novita AI API with LlamaIndex**
Step 1: Visit [Model Library](https://novita.ai/llm-api) on Novita AI and select a model of interest.
Step 2: Navigate to the demo page of the chosen model and click the `Code` button on the right.
Step 3: Copy the model’s name and make a note of it.
Step 4: [Log in](https://novita.ai/user/login) to the Novita platform.
Step 5: After logging in, go to the platform’s [settings page](https://novita.ai/settings).
Step 6: Create a new [API key](https://novita.ai/settings/key-management) and copy it for service authentication.
Step 7: Install `llama_index` and related Python libraries by running:
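For example, assuming the `llama-index-llms-novita` integration package (which provides the `NovitaAI` class used in the next step):
```bash theme={"system"}
pip install llama-index llama-index-llms-novita
```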
Step 8: Write Python code and set the model name and API key as parameters in the NovitaAI class.
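A minimal sketch for Step 8, assuming the `NovitaAI` class accepts `model` and `api_key` parameters (substitute the model name and API key you copied earlier):
```python theme={"system"}
from llama_index.core.llms import ChatMessage
from llama_index.llms.novita import NovitaAI  # assumed import path

llm = NovitaAI(
    model="deepseek/deepseek-r1",  # the model name copied in Step 3
    api_key="your-api-key",        # the API key created in Step 6
)

response = llm.chat([ChatMessage(role="user", content="Who are you?")])
print(response)
```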
Step 9: Run the code to get the output.
| Status | Description |
|---|---|
| VALIDATING | The input file is being validated before the batch can begin |
| PROGRESS | Batch is in progress |
| COMPLETED | Batch processing completed successfully |
| FAILED | Batch processing failed |
| EXPIRED | Batch exceeded deadline |
| CANCELLING | Batch is being cancelled |
| CANCELLED | Batch was cancelled |

| Error Code | Description | Solution |
|---|---|---|
| 400 | Invalid request format | Check JSONL syntax and required fields |
| 401 | Authentication failed | Verify API key |
| 404 | Batch not found | Check batch ID |
| 429 | Rate limit exceeded | Reduce request frequency |
| 500 | Server error | Contact us |
Example output:
```
<|/ref|><|det|>[[37, 48, 279, 140]]<|/det|>
<|ref|>Deploy open-source and specialized models<|/ref|><|det|>[[42, 48, 857, 133]]<|/det|>
<|ref|>smarterandfasterwithsimpleApls.Accessthe<|/ref|><|det|>[[44, 185, 902, 246]]<|/det|>
<|ref|>latest chat, code, image, audio, video models and<|/ref|><|det|>[[41, 291, 945, 370]]<|/det|>
<|ref|>more,ready for production with built-in<|/ref|><|det|>[[40, 407, 756, 488]]<|/det|>
<|ref|>scalability.<|/ref|><|det|>[[39, 515, 232, 606]]<|/det|>
<|ref|>Explore<|/ref|><|det|>[[87, 813, 266, 879]]<|/det|>
<|ref|>Models<|/ref|><|det|>[[289, 816, 432, 878]]<|/det|>
```
# Function Calling
Source: https://novita.ai/docs/guides/llm-function-calling
| Tier | How to reach |
|---|---|
| T1 | Monthly top-ups did not exceed \$50 in any of the last 3 calendar months. |
| T2 | Monthly top-ups were at least \$50 but did not exceed \$500 in any of the last 3 calendar months. |
| T3 | Monthly top-ups were at least \$500 but did not exceed \$3,000 in any of the last 3 calendar months. |
| T4 | Monthly top-ups were at least \$3,000 but did not exceed \$10,000 in any of the last 3 calendar months. |
| T5 | Monthly top-ups were at least \$10,000 in at least one of the last 3 calendar months. |
### **Step 1:** Configure Novita AI API Key
To connect Novita AI’s models with LobeChat:
* Log in to your **LobeChat account** and access the [**Settings**](https://lobechat.com/settings/common).
* Navigate to the **AI Service Provider** tab.
* Scroll down to find the **Novita AI** section and expand it.
* Paste your **Novita AI API Key** into the provided field and click **Save**.
### **Step 2:** Select a Novita AI Model
Once the API Key is configured, you can select a Novita AI LLM model to power your AI application.
### Step 3: Launch Your First Chat
Begin your conversation with Novita AI LLM on LobeChat.
# LoLLMS WebUI
Source: https://novita.ai/docs/guides/lollmswebui
Easily integrate Novita AI with LoLLMS WebUI to enhance your productivity and simplify complex tasks.
LoLLMS WebUI, a centralized platform designed for effortless interaction with Large Language Models (LLMs) and multimodal AI systems, offers an intuitive interface to unlock the full potential of AI. The integration between Novita AI and LoLLMS WebUI unleashes transformative power, enabling you to simplify complex tasks, find answers, and explore new possibilities effortlessly.
By combining Novita AI with LoLLMS, you'll gain access to cutting-edge AI capabilities, making it your ultimate assistant for enhanced productivity. This guide provides a step-by-step walkthrough to help you maximize the benefits of this powerful integration.
## How to Leverage Novita AI with LoLLMS WebUI
You can find the GitHub repository of LoLLMS WebUI here: [ParisNeo/lollms-webui](https://github.com/ParisNeo/lollms-webui).
### Obtain Your Novita AI API Key
1. [Log in](https://novita.ai/user/login) to your Novita AI account.
2. Navigate to [the Key Management page](https://novita.ai/settings/key-management).
3. Generate a new API Key and copy it.
### Install LoLLMS WebUI
1. Automatic Installation:
* Download the installation script from [the scripts folder](https://github.com/ParisNeo/lollms-webui/tree/main/scripts) and run it:
* `lollms_installer.bat` for Windows.
* `lollms_installer.sh` for Linux.
* `lollms_installer_macos.sh` for Mac.
2. Manual Installation:
* Ensure Python 3.11 is installed. Check your version with `python --version`.
* If needed, download Python 3.11 from [Python.org](https://www.python.org/downloads/release/python-3118/).
Step 1: Clone the Repository
```bash theme={"system"}
git clone --recursive https://github.com/ParisNeo/lollms-webui.git
cd lollms-webui
git submodule update --init --recursive
```
Step 2: Create and Activate a Virtual Environment
* Create a Virtual Environment:
```bash theme={"system"}
python -m venv venv
```
* Activate the Virtual Environment:
* On Windows:
```bash theme={"system"}
.\venv\Scripts\activate
```
* On Linux/Mac:
```bash theme={"system"}
source venv/bin/activate
```
Step 3: Install Requirements
```bash theme={"system"}
pip install -r requirements.txt
pip install -e ./lollms_core
```
Step 4: Create global\_paths\_cfg.yaml
```bash theme={"system"}
mkdir -p $HOME/.lollms_personal_data
cat > global_paths_cfg.yaml << EOL
lollms_path: $(pwd)/lollms_core/lollms
lollms_personal_path: $HOME/.lollms_personal_data
EOL
```
### Configure Novita AI in LoLLMS WebUI
1. Install Novita AI Binding
* Set environment variables for the Novita AI binding. For example:
* On Windows:
```bash theme={"system"}
set NOVITA_AI_API_KEY="your_api_key_here"
set NOVITA_AI_MODEL_NAME="your_model_name_here"
```
* On Linux/Mac:
```bash theme={"system"}
export NOVITA_AI_API_KEY="your_api_key_here"
export NOVITA_AI_MODEL_NAME="your_model_name_here"
```
* Run the script to finalize the setup:
```bash theme={"system"}
python zoos/bindings_zoo/novita_ai/__init__.py
```
2. Run the Server
```bash theme={"system"}
python app.py
```
3. Use Novita AI in LoLLMS WebUI
* Open your browser and navigate to [http://localhost:9600](http://localhost:9600) (or the port shown in the terminal).
* Select Novita AI from the available bindings.
* Enter your Novita AI API key.
* Select your desired Novita AI model.
[**Here is an application: Vibe Coding Using Novita AI Bindings and Services on LoLLMS WebUI.**](https://www.youtube.com/watch?v=jyFaP4zTM9g)
# A Brief Introduction to Clip Skip
Source: https://novita.ai/docs/guides/model-apis-clip-skip
Clip Skip is a feature that literally skips part of the image generation process, resulting in slightly different outputs. This also leads to faster image rendering.
### But why would anyone want to skip a part of the diffusion process?
A typical Stable Diffusion 1.5 base model image goes through 12 "clip" layers, which represent levels of refinement. The early layers are very broad, while the later layers produce images that are clearer and more specific. In the case of some base models, especially those based on Danbooru tags, trial and error has shown that skipping certain layers can lead to better images, as the broad clip layers may introduce unwanted noise. You can literally skip layers, saving GPU time and achieving better art.
If you want a picture of "a cow", you might not care about the subcategories of "cow" that the text model might have. Especially since these can vary in quality. So if you want "a cow", you might not want "an Aberdeen Angus bull". (The full post is at the bottom of this page.)
You can think of CLIP Skip as a setting for "how accurate you want the text model to be". You can test it out and see that each clip stage adds more definition in terms of description. For example, if you have a detailed prompt about a young man standing in a field, with lower clip stages you’d get an image of "a man standing", while deeper stages would yield "a young man standing" or "a young man standing in a forest", and so on. CLIP Skip becomes particularly effective when using models that are structured in a specific way, where the "1girl" tag can break down into many sub-tags connected to that main tag.
### Do I need it?
It’s a minor optimization recommended only for hardcore, quality-obsessed enthusiasts. If you’re working on anime or semi-realism, it’s worth a try.
### Limitations
**How many layers to skip**
Generally speaking, skipping 1 or 2 layers can yield good results. However, skipping more than 2 layers may produce images that appear to have low guidance.
**Inconsistent compatibility**
Clip Skip has become one of those "wear your seatbelt" kinds of safe defaults, where many people prefer to set it and forget it. This approach isn’t wise. The feature can also yield unpredictable results when used with other technologies, such as LoRAs and Textual Inversions. Missing layers where layers are expected can degrade the image quality or have no effect at all.
It’s often faster to simply try it out than to compare results. Just make sure you also lock the seed, guidance, concept, and sampler to accurately assess the differences.
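On Novita AI's Model APIs, Clip Skip is exposed as the `clip_skip` field of the txt2img request. Here is a minimal sketch of such a comparison run, using the same V3 endpoint shown elsewhere in these docs, with the seed, sampler, and guidance locked so only `clip_skip` varies (the prompt and model name are placeholders):
```bash theme={"system"}
curl --location 'https://api.novita.ai/v3/async/txt2img' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
  "request": {
    "prompt": "a young man standing in a field",
    "model_name": "realisticVisionV51_v51VAE_94301.safetensors",
    "width": 512,
    "height": 512,
    "image_num": 1,
    "steps": 20,
    "seed": 42,
    "clip_skip": 2,
    "sampler_name": "Euler a",
    "guidance_scale": 7.5
  }
}'
```
Rerun the same request with `"clip_skip": 1` and compare the two outputs.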
# Configure Custom AWS S3 Bucket
Source: https://novita.ai/docs/guides/model-apis-configure-custom-s3-bucket
By default, Novita AI temporarily stores output results in our private S3 bucket and returns the results to the user through a temporary authorized S3 link.
However, you can set up a **Custom S3 Bucket** to allow us to save the results in your own bucket. Please follow the steps below to enable this.
## 1. Configure S3 Bucket Policy
First, change your S3 Bucket Policy configuration to the following format (replace `${BucketName}` with your bucket name):
```json theme={"system"}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"CanonicalUser": "e98cde8d11ec7c03ac08688f1a933b08b0f0f7746b21c4f2e7b2c8202cc0532f"
},
"Action": [
"s3:PutObject",
"s3:PutObjectAcl"
],
"Resource": "arn:aws:s3:::${BucketName}/*"
}
]
}
```
## 2. Enable Custom Storage in V3 APIs
For V3 API endpoints, Novita AI provides the `custom_storage` parameter in the request body, allowing you to configure your custom S3 bucket for storing generated images.
Here's an example using the `txt2img` API endpoint:
```bash theme={"system"}
curl --location 'https://api.novita.ai/v3/async/txt2img' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
"extra": {
"response_image_type": "jpeg",
"custom_storage": {
"aws_s3": {
"region": "us-east-2",
"bucket": "test_bucket",
"path": "/"
}
}
},
"request": {
"prompt": "a cute dog",
"model_name": "realisticVisionV51_v51VAE_94301.safetensors",
"negative_prompt": "",
"width": 512,
"height": 384,
"image_num": 2,
"steps": 20,
"seed": 123,
"clip_skip": 1,
"sampler_name": "Euler a",
"guidance_scale": 7.5
}
}'
```
# Upload Custom LoRA Models
Source: https://novita.ai/docs/guides/model-apis-custom-model
To produce an image, Stable Diffusion first generates a completely random image in the latent space. The noise predictor then estimates the noise of the image, which is subsequently subtracted from the image. This process is repeated multiple times, resulting in a clean image.
This denoising process is called sampling because Stable Diffusion generates a new sample image at each step. The method used in sampling is referred to as the `sampler` or `sampling method`.
Sampling is just one part of the Stable Diffusion model. Read the article "How Does Stable Diffusion Work?" if you want to understand the entire model.
Below is a sampling process in action. The sampler gradually produces cleaner and cleaner images.
While the framework is the same, there are many different ways to carry out this denoising process. It is often a trade-off between speed and accuracy.
### Samplers Overview
As of this writing, there are 19 samplers available in Novita AI, and this number seems to be growing over time. What are the differences?
You will learn about them in the later part of this article. The technical details can be overwhelming, so I will include a bird's-eye view in this section to help you get a general idea of what they are.
#### Old-School ODE Solvers
Let’s start with the easier ones. Some of the samplers on the list were invented more than a hundred years ago. They are old-school solvers for ordinary differential equations (ODE).
* Euler – The simplest possible solver.
* Heun – A more accurate but slower version of Euler.
* LMS (Linear Multi-Step Method) – Same speed as Euler but supposedly more accurate.
#### Ancestral Samplers
Do you notice that some sampler names have a single letter "a"?
* Euler a
* DPM2 a
* DPM++ 2S a
* DPM++ 2S a Karras
These are ancestral samplers. An ancestral sampler adds noise to the image at each sampling step. They are stochastic samplers because the sampling outcome contains some randomness.
Be aware that many other samplers are also stochastic, even if their names do not include an "a".
The drawback of using an ancestral sampler is that the image may not converge. Compare the images generated using Euler a and Euler below.
Euler a does not converge. (sampling steps 2–40)
Euler converges. (sampling steps 2–40)
Images generated with Euler a do not converge at high sampling steps, while images from Euler converge well.
For reproducibility, it is desirable to have the image converge. If you want to generate slight variations, you should use a variational seed.
#### Karras Noise Schedule
The samplers labeled "Karras" use the noise schedule recommended in the [Karras article](https://arxiv.org/abs/2206.00364). If you look closely, you will see that the noise step sizes are smaller near the end. This adjustment improves the quality of the images.
#### DDIM and PLMS
DDIM (Denoising Diffusion Implicit Model) and PLMS (Pseudo Linear Multi-Step Method) were the samplers included with the original Stable Diffusion v1. DDIM is one of the first samplers designed for diffusion models, while PLMS is a newer and faster alternative to DDIM. They are generally considered outdated and are not widely used anymore.
#### DPM and DPM++
DPM (Diffusion Probabilistic Model Solver) and DPM++ are new samplers designed for diffusion models released in 2022. They represent a family of solvers with similar architecture.
DPM and DPM2 are similar, with DPM2 being second-order (more accurate but slower).
DPM++ is an improvement over DPM.
DPM adaptive adjusts the step size adaptively. It can be slow since it doesn’t guarantee completion within the specified number of sampling steps.
#### UniPC
UniPC (Unified Predictor-Corrector) is a new sampler released in 2023. Inspired by the predictor-corrector method in ODE solvers, it can achieve high-quality image generation in 5-10 steps.
### How to Pick a Sampler
In this section, I will provide some objective comparisons to help you decide.
#### Image Convergence
In this section, I will generate the same image using different samplers with up to 40 sampling steps. The last image at the 40th step will be used as a reference for evaluating how quickly the sampling converges, with the Euler method serving as the benchmark.
> Euler, DDIM, PLMS, LMS Karras and Heun
First, let’s look at Euler, DDIM, PLMS, LMS Karras, and Heun as a group since they represent old-school ODE solvers or original diffusion solvers. DDIM converges at about the same number of steps as Euler but with more variations due to the injection of random noise during its sampling steps.
PLMS did not perform well in this test.
LMS Karras seems to have difficulty converging and stabilizes at a higher baseline.
Heun converges faster but is twice as slow since it is a second-order method. Therefore, we should compare Heun at 30 steps with Euler at 15 steps, for example.
> Ancestral Samplers
If a stable, reproducible image is your goal, you should avoid using ancestral samplers, as they all fail to converge.
> DPM and DPM2
DPM fast did not converge well. DPM2 and DPM2 Karras perform better than Euler, but at the expense of being twice as slow.
DPM adaptive performs deceptively well because it uses its own adaptive sampling steps. It can be very slow.
> DPM++ solvers
DPM++ SDE and DPM++ SDE Karras suffer the same shortcoming as ancestral samplers. They not only don’t converge, but the images also fluctuate significantly as the number of steps changes.
DPM++ 2M and DPM++ 2M Karras perform well. The Karras variant converges faster when the number of steps is high enough.
> UniPC
UniPC converges a bit slower than Euler, but not too bad.
#### Speed
Although DPM adaptive performs well in convergence, it is also the slowest.
You may have noticed that the rest of the rendering times fall into two groups: the first group takes about the same time (about 1x), and the other takes about twice as long (about 2x). This reflects the order of the solvers: second-order solvers, although more accurate, need to evaluate the denoising U-Net twice per step, so they are about 2x slower.
#### Quality
Of course, speed and convergence mean nothing if the images look crappy.
**Final images**
Let’s first look at samples of the image.
DPM fast failed pretty badly. Ancestral samplers did not converge to the image that the other samplers converged to.
Ancestral samplers tend to converge to an image of a kitten, while the deterministic ones tend to converge to a cat. There are no correct answers as long as they look good to you.
**Perceptual quality**
An image can still look good even if it hasn’t converged. Let’s look at how quickly each sampler can produce a high-quality image.
You will see perceptual quality measured with BRISQUE (Blind/Referenceless Image Spatial Quality Evaluator). It measures the quality of natural images.
DDIM is doing surprisingly well here, capable of producing the highest quality image within the group in as few as 8 steps.
### So… which one is the best?
Here are my recommendations:
1. If you want to use something fast, converging, new, and with decent quality, excellent choices are:
* DPM++ 2M Karras with 20–30 steps
* UniPC with 20–30 steps.
2. If you want good-quality images and don’t care about convergence, good choices are:
* DPM++ SDE Karras with 10–15 steps (Note: this is a slower sampler)
* DDIM with 10–15 steps.
3. Avoid using any ancestral samplers if you prefer stable, reproducible images.
4. Euler and Heun are fine choices if you prefer something simple. Reduce the number of steps for Heun to save time.
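If you want to try these recommendations on Novita AI, the sampler is selected via the `sampler_name` field of the txt2img request, alongside `steps`. A minimal sketch using the V3 endpoint shown elsewhere in these docs (the prompt and model name are placeholders, and the sampler label is assumed to match the names used above):
```bash theme={"system"}
curl --location 'https://api.novita.ai/v3/async/txt2img' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
  "request": {
    "prompt": "a cat sitting on a windowsill",
    "model_name": "realisticVisionV51_v51VAE_94301.safetensors",
    "width": 512,
    "height": 512,
    "image_num": 1,
    "steps": 25,
    "seed": 42,
    "sampler_name": "DPM++ 2M Karras",
    "guidance_scale": 7.5
  }
}'
```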
# SDKs for Model APIs
Source: https://novita.ai/docs/guides/model-apis-sdks
Novita AI provides official SDKs to help you easily integrate our services into your applications.
## JavaScript SDK
### Installation
Install our JavaScript SDK using `npm`:
```bash theme={"system"}
npm i novita-sdk
```
### Quick Start
Visit [https://github.com/novitalabs/javascript-sdk](https://github.com/novitalabs/javascript-sdk) for a quick start.
## Python SDK
### Installation
Install our Python SDK using `pip`:
```bash theme={"system"}
pip install novita-client
```
### Quick Start
Visit [https://github.com/novitalabs/python-sdk](https://github.com/novitalabs/python-sdk) for a quick start.
# Training Image Caption Guidance
Source: https://novita.ai/docs/guides/model-apis-training-guidance
### Preparation
Depending on how and what you are training, you may need to crop the photos to a specific width and height. Other types of training may categorize images into various sizes and do not require cropping. Look into what is required for the training method you are using, the model you are training, and the program you are using to train your model.
### Captioning – General Notes
#### Avoid Automated Captioning
* BLIP and deepbooru are exciting, but it is still a bit early for them.
* BLIP and deepbooru struggle with context and relative importance.
* It is faster to manually caption than to fix mistakes made by BLIP or deepbooru and still have to manually caption.
#### Caption in the Same Manner You Prompt
* Captioning and prompting are related.
* Recognize how you typically prompt. Do you use verbose sentences? Short descriptions? Vague or detailed prompts?
* Caption in a similar style and verbosity to how you usually prompt.
#### Follow a Set Structure per Concept
* Following a structure will benefit the learning process.
* You might have one structure for photographs and another for illustrations. However, try to avoid mixing and matching structures when captioning a single dataset.
#### Captions are Like Variables You Can Use in Your Prompts
Everything you describe in a caption can be thought of as a variable that you can manipulate in your prompt. This has two implications:
1. You want to describe as many details as possible about anything that isn’t the concept you are trying to implicitly teach. In other words, describe everything that you want to become a variable. For example, if you are teaching a specific face but want to be able to change the hair color, you should describe the hair color in each image so that "hair color" becomes one of your variables.
2. You don’t want to describe anything (beyond a class-level description) that you want to be implicitly taught. In other words, the thing you are trying to teach shouldn’t become a variable. For example, if you are teaching a specific face, you should not describe it as having a big nose. You don’t want the nose size to be variable because then it isn’t that specific face anymore. However, you can still caption "face" if you want, which provides context to the model you are training. This does have some implications described in the following point.
#### Leveraging Classes as Tags
* There are two concepts here.
1. Using generic class tags will bias that entire class toward your training data. This may or may not be desired depending on your goals.
2. Using generic class tags provides context to the learning process. Conceptually, it is easier to learn what a "face" is when the model already has a reasonable approximation of "face".
* If you want to bias the entire class of your model toward your training images, use broad class tags rather than specific ones. For example, if you want to teach your model that every man should look like Brad Pitt, your captions should contain the tag "man" but should not be more specific than that. This influences your model to produce a Brad Pitt-looking man whenever you use the word "man" in your prompt. This also allows your model to draw on and leverage what it already knows about the concept of "man" while it is training.
* If you want to reduce the impact of your training on the entire class, include specific tags and de-emphasize class tags. For example, if you want to teach your model that only "ohwxman" should look like Brad Pitt, and you don't want every "man" to look like Brad Pitt, you would not use "man" as a tag, only tagging it with "ohwxman". This reduces the impact of your training images on the tag "man" and strongly associates your training images with "ohwxman". Your model will draw on what it knows about "ohwxman", which is practically nothing (see note), thus building up knowledge almost solely from your training images, creating a very strong association.
* **Note:** This is simplified for the sake of understanding. This would actually be tokenized into two tokens, "ohwx" and "man", but these tokens would be strongly correlated for training purposes, which should still reduce the impact on the overall class of "man" compared to training with "man" as a token in the caption. The mathematics involved is quite complex and well beyond the scope of this document.
#### Consistent Captioning
* Use consistent captions across all of your training. This will help you consistently invoke your concept when prompting.
* Using inconsistent tags across your dataset will make the concept you are trying to teach harder for the model to grasp, as you are essentially forcing it to learn both the concept and the different phrasings for that concept. It’s much better to have it just learn the concept under a single term.
* For example, you probably don’t want to have both "legs raised in air" and "raised legs" if you are trying to teach a single concept of a person with their legs up in the air. You want to be able to consistently invoke this pose in your prompt, so choose one way to caption it.
#### Avoid Repetition
* Try to avoid repetition wherever possible. Similar to prompting, repeating words increases the weighting of those words.
* For example, if we repeat the word "background" too much, we may have three tags that say "background" (e.g., simple background, white background, lamp in background). Even though we want the background to have low weight, we have unintentionally increased the weighting significantly. It would be better to combine these or reword them (e.g., simple white background with a lamp).
#### Take Note of Ordering
* Again, just like with prompting, order matters for the relative weighting of tags.
* Having a specific structure or order that you generally use for captions can help you maintain the relative weightings of tags between images in your dataset, which should benefit the training process.
* Having a standardized ordering makes the whole captioning process faster as you become familiar with captioning in that structure.
#### Use Your Model's Existing Knowledge to Your Advantage
* Your model already produces decent results and reasonably understands what you are prompting. Take advantage of that by captioning with words that already work well in your prompts.
* You want to use descriptive words, but if you use words that are too obscure or niche, you likely can't leverage much of the existing knowledge. For example, you could say "sarcastic" or "mordacious". The model has some idea of what "sarcastic" conveys, but it likely has no clue what "mordacious" means.
* You can also look at this from the opposite perspective. If you were trying to teach the concept of "mordacious", you might have a dataset full of images that convey "sarcastic" and caption them with both the tags "sarcastic" and "mordacious" side by side (so that they are close in relative weighting).
### Captioning – Structure
This is mainly for people or characters, so it might not be quite as applicable to something like fantasy landscapes, but perhaps it can provide some inspiration.
#### General Format
#### Loose Associations
* This is where we put any final loose associations we have with the image.
* This could be anything that pops up in our head, usually “feelings” that we feel when looking at the image or concepts we feel are portrayed, really anything goes here as long as it exists in the image.
* Keep in mind this is for loose associations. If the image is very obviously portraying some feeling, we may want it tagged closer to the start of the caption for higher weighting.
* For example: happy, sad, joyous, hopeful, lonely, sombre
#### FULL EXAMPLE OF A SINGLE IMAGE
This is an example of how we would caption a single image we picked off of safebooru. We will assume that I want to train the style of this image and associate it with the tag "ohwxStyle", and we will assume that we have many images in this style within our dataset.
Sample Image: [https://safebooru.org/index.php?page=post\&s=view\&id=3887414](https://safebooru.org/index.php?page=post\&s=view\&id=3887414)
* Globals: ohwxStyle
* Type or Perspective Of a: anime, drawing, of a young woman, full body shot, from side
* Action words: sitting, looking at viewer, smiling, head tilt, holding a phone, eyes closed
* Subject description: short brown hair, pale pink dress with dark edges, stuffed animal in lap, brown slippers
* Notable details: sunlight through windows as lighting source
* Background or location: brown couch, red patterned fabric on couch, wooden floor, white water-stained paint on walls, refrigerator in background, coffee machine sitting on a countertop, table in front of couch, bananas and coffee pot on table, white board on wall, clock on wall, stuffed animal chicken on floor
* Loose associations: dreary environment
All together: ohwxStyle, anime, drawing, of a young woman, full body shot, from side, sitting, looking at viewer, smiling, head tilt, holding a phone, eyes closed, short brown hair, pale pink dress with dark edges, stuffed animal in lap, brown slippers, sunlight through windows as lighting source, brown couch, red patterned fabric on couch, wooden floor, white water-stained paint on walls, refrigerator in background, coffee machine sitting on a countertop, table in front of couch, bananas and coffee pot on table, white board on wall, clock on wall, stuffed animal chicken on floor, dreary environment
# API V2 to V3 Migration Guide
Source: https://novita.ai/docs/guides/model-apis-v2-to-v3-migration
## Text to Image
### Request Body Parameter Mapping
| V2 | V3 | Description |
|---|---|---|
| **extra** object | **extra** object | |
| enable\_nsfw\_detection boolean | enable\_nsfw\_detection boolean | |
| nsfw\_detection\_level Enum: `0, 1, 2` | nsfw\_detection\_level Enum: `0, 1, 2` | |
| enable\_progress\_info | Deprecated | |
| response\_image\_type Enum: `png`, `jpeg` | response\_image\_type Enum: `png`, `webp`, `jpeg` | V3 adds support for the `webp` image format |
| | **request** object | New Field. All image generation parameters must be passed via `request` in V3 |
| **prompt** string | prompt string | Moved Inside |
| | loras object\[] | Moved Inside. Migrate LoRA usage from the `prompt` to the `request.loras` parameter |
| | model\_name string | New Field. Name of the LoRA; retrieve the corresponding sd\_name\_in\_api value by invoking the [Get Model API](https://novita.ai/docs/api-reference/model-apis-get-model) endpoint with filter.types=lora as the query parameter |
| | strength number(float32) | New Field. The strength of the LoRA: the larger the value, the more the result is biased toward the LoRA. Range \[0, 1] |
| **negative\_prompt** string | negative\_prompt string | Moved Inside |
| **sampler\_name** string | sampler\_name string | Moved Inside |
| **batch\_size** integer | image\_num integer | Changed `batch_size` → `request.image_num` |
| **n\_iter** | Deprecated | |
| **steps** string | steps string | Moved Inside |
| **cfg\_scale** integer | guidance\_scale number(float32) | Changed `cfg_scale` → `request.guidance_scale` |
| **seed** integer | seed integer | Moved Inside |
| **height** integer | height integer | Moved Inside. Range change: \[128, 2048] |
| **width** integer | width integer | Moved Inside. Range change: \[128, 2048] |
| **model\_name** string | model\_name string | Moved Inside. Specifies the name of the model checkpoint; retrieve the corresponding sd\_name value by invoking the [Query Model](https://novita.ai/docs/api-reference/model-apis-get-model) API with filter.types=checkpoint as the query parameter |
| **restore\_faces** bool | restore\_faces bool | Moved Inside |
| **restore\_faces\_model** | Deprecated | |
| **sd\_vae** string | sd\_vae string | Moved Inside |
| **clip\_skip** integer | clip\_skip integer | Moved Inside |
| **enable\_hr** boolean | hires\_fix object | Changed `enable_hr` → `request.hires_fix` |
| **hr\_upscaler** Enum: `Latent`, `ESRGAN_4x`, `R-ESRGAN 4x+`, `R-ESRGAN 4x+ Anime6B` | upscaler Enum: `RealESRGAN_x4plus_anime_6B`, `RealESRNet_x4plus`, `Latent` | Changed `hr_upscaler` → `request.hires_fix.upscaler` |
| **hr\_scale** number | Deprecated | |
| **hr\_resize\_x** integer | target\_width integer | Changed `hr_resize_x` → `request.hires_fix.target_width` |
| **hr\_resize\_y** integer | target\_height integer | Changed `hr_resize_y` → `request.hires_fix.target_height` |
| **img\_expire\_ttl** integer | Deprecated | Default 3600s |
| **sd\_refiner** object | refiner object | Changed `sd_refiner` → `request.refiner` |
| checkpoint string | Deprecated | |
| switch\_at number(float32) | switch\_at number(float32) | Changed `sd_refiner.switch_at` → `request.refiner.switch_at` |
| **controlnet\_units** object\[] | Deprecated | `img2img` only |
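For orientation, here is a minimal before/after sketch of a text-to-image request body following the mapping above. The field values, LoRA name, and checkpoint name are illustrative placeholders, not values taken from this guide:
```python theme={"system"}
# Illustrative V2 vs. V3 text-to-image bodies; all values are placeholders.
v2_body = {
    "prompt": "a cat, <lora:my_lora:0.7>",  # V2 style: LoRA embedded in the prompt
    "negative_prompt": "blurry",
    "batch_size": 2,
    "cfg_scale": 7,
    "model_name": "some_checkpoint_sd_name",
    "width": 1024,
    "height": 1024,
}

v3_body = {
    "extra": {"response_image_type": "webp"},  # webp support is new in V3
    "request": {  # all generation parameters move under `request`
        "prompt": "a cat",
        "loras": [{"model_name": "my_lora", "strength": 0.7}],  # LoRA moves out of the prompt
        "negative_prompt": "blurry",
        "image_num": 2,          # batch_size -> image_num
        "guidance_scale": 7.0,   # cfg_scale -> guidance_scale
        "model_name": "some_checkpoint_sd_name",
        "width": 1024,
        "height": 1024,
    },
}
```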
### Response Parameter Mapping
| V2 | V3 | Description |
|---|---|---|
| **code** | Deprecated | |
| **msg** | Deprecated | |
| **data** | Deprecated | |
| task\_id | **task\_id** | Changed `data.task_id` → `task_id` |
| warn | Deprecated | |
## Image to Image
### Request Body Parameter Mapping
| V2 | V3 | Description |
|---|---|---|
| **extra** object | **extra** object | |
| enable\_nsfw\_detection boolean | enable\_nsfw\_detection boolean | |
| nsfw\_detection\_level Enum: `0, 1, 2` | nsfw\_detection\_level Enum: `0, 1, 2` | |
| enable\_progress\_info | Deprecated | |
| response\_image\_type Enum: `png`, `jpeg` | response\_image\_type Enum: `png`, `webp`, `jpeg` | V3 adds support for the `webp` image format |
| | **request** object | New Field. All image generation parameters must be passed via `request` in V3 |
| **prompt** string | prompt string | Moved Inside |
| | loras object\[] | Moved Inside. Migrate LoRA usage from the `prompt` to the `request.loras` parameter |
| | model\_name string | New Field. Name of the LoRA; retrieve the corresponding sd\_name\_in\_api value by invoking the [Get Model API](https://novita.ai/docs/api-reference/model-apis-get-model) endpoint with filter.types=lora as the query parameter |
| | strength number(float32) | New Field. The strength of the LoRA: the larger the value, the more the result is biased toward the LoRA. Range \[0, 1] |
| **negative\_prompt** string | negative\_prompt string | Moved Inside |
| **sampler\_name** string | sampler\_name string | Moved Inside |
| **batch\_size** integer | image\_num integer | Changed `batch_size` → `request.image_num` |
| **n\_iter** integer | Deprecated | |
| **steps** string | steps string | Moved Inside |
| **cfg\_scale** integer | guidance\_scale number(float32) | Changed `cfg_scale` → `request.guidance_scale` |
| **seed** integer | seed integer | Moved Inside |
| **height** integer | height integer | Moved Inside. Range change: \[128, 2048] |
| **width** integer | width integer | Moved Inside. Range change: \[128, 2048] |
| **model\_name** string | model\_name string | Moved Inside. Specifies the name of the model checkpoint; retrieve the corresponding sd\_name value by invoking the [Query Model](https://novita.ai/docs/api-reference/model-apis-get-model) API with filter.types=checkpoint as the query parameter |
| **init\_images** string\[] | image\_base64 string | Changed `init_images` → `request.image_base64` |
| **denoising\_strength** number(float) | strength number(float) | Changed `denoising_strength` → `request.strength` |
| **restore\_faces** bool | Deprecated | |
| **sd\_vae** string | sd\_vae string | Moved Inside |
| **clip\_skip** integer | clip\_skip integer | Moved Inside |
| **mask** string | Deprecated | Recommendation: use the V3 Inpainting API |
| **mask\_blur** integer | Deprecated | Recommendation: use the V3 Inpainting API |
| **resize\_mode** integer | Deprecated | |
| **image\_cfg\_scale** integer | Deprecated | |
| **inpainting\_fill** integer | Deprecated | Recommendation: use the V3 Inpainting API |
| **inpaint\_full\_res** integer | Deprecated | Recommendation: use the V3 Inpainting API |
| **inpaint\_full\_res\_padding** integer | Deprecated | Recommendation: use the V3 Inpainting API |
| **inpainting\_mask\_invert** integer | Deprecated | Recommendation: use the V3 Inpainting API |
| **initial\_noise\_multiplier** number(float32) | Deprecated | |
| **img\_expire\_ttl** integer | Deprecated | Default 3600s |
| **sd\_refiner** object | refiner object | Changed `sd_refiner` → `request.refiner` |
| checkpoint | Deprecated | |
| switch\_at number(float32) | switch\_at number(float32) | Moved Inside |
| | controlnet object | New Field |
| **controlnet\_units** object\[] | units object\[] | Changed `controlnet_units` → `request.controlnet.units` |
| model string | model\_name string | Changed `controlnet_units.model` → `request.controlnet.units.model_name` |
| weight number | strength number(float32) | Changed `controlnet_units.weight` → `request.controlnet.units.strength` |
| input\_image string | image\_base64 string | Changed `controlnet_units.input_image` → `request.controlnet.units.image_base64` |
| module string, Enum | preprocessor string, Enum | Changed `controlnet_units.module` → `request.controlnet.units.preprocessor` |
| control\_mode | Deprecated | |
| mask | Deprecated | Recommendation: use the V3 Inpainting API |
| resize\_mode | Deprecated | |
| processor\_res | Deprecated | |
| threshold\_a | Deprecated | |
| threshold\_b | Deprecated | |
| guidance\_start number(float32) | guidance\_start number(float32) | Moved Inside |
| guidance\_end number(float32) | guidance\_end number(float32) | Moved Inside |
| pixel\_perfect | Deprecated | |
### Response Parameter Mapping
| V2 | V3 | Description |
|---|---|---|
| code | Deprecated | |
| msg | Deprecated | |
| data | Deprecated | |
| task\_id | **task\_id** | Changed `data.task_id` → `task_id` |
| warn | Deprecated | |
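The response change is mechanical: V3 returns `task_id` at the top level instead of nesting it under `data`. A small sketch of reading each shape (the id value is a placeholder):
```python theme={"system"}
# Reading the task id from each response shape, per the mapping above.
v2_response = {"code": 0, "msg": "", "data": {"task_id": "abc123"}}
v3_response = {"task_id": "abc123"}

task_id = v2_response["data"]["task_id"]  # V2: nested under `data`
task_id = v3_response["task_id"]          # V3: top level
```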
### Variational Autoencoders (VAEs) Overview
Variational Autoencoders (VAEs) address some of the limitations of traditional autoencoders by introducing a probabilistic approach to encoding and decoding. The motivation behind VAEs lies in their ability to generate new data samples by sampling from a learned distribution in the latent space, rather than from a fixed latent vector as was the case with vanilla autoencoders. This makes them suitable for generative tasks.
* **Probabilistic Nature**: Unlike deterministic autoencoders, VAEs model the latent space as a probability distribution. This produces a probability distribution function over the input encodings instead of just a single fixed vector, allowing for a more nuanced representation of uncertainty in the data. The decoder then samples from this probability distribution.
* **Role of Latent Space**: The latent space in VAEs serves as a continuous, structured representation of the input data. Since it is continuous by design, this allows for easy interpolation. Each point in the latent space corresponds to a potential output, enabling smooth transitions between different data points and ensuring that points closer in the latent space lead to similar generations.
The concept can be elucidated through a straightforward example, as presented below. Encoders within a neural network are tasked with acquiring a representation of input images in the form of a vector. This vector encapsulates various features such as a subject’s smile, hair color, gender, age, etc., denoted as a vector similar to \[0.4, 0.03, 0.032, …]. In this illustration, the focus is narrowed to a singular latent representation, specifically the attribute of a "smile".
*Figure: Autoencoders vs VAEs (Sciforce, Medium).*
In the context of Vanilla Autoencoders (AE), the smile feature is encapsulated as a fixed, deterministic value. In contrast, Variational Autoencoders (VAEs) are deliberately designed to encapsulate this feature as a probability distribution. This design choice introduces variability into generated images by allowing values to be sampled from that distribution.
In summary, VAEs go beyond mere data reconstruction; they generate new samples and provide a probabilistic framework for understanding latent representations. The inclusion of probabilistic elements in the model’s architecture sets VAEs apart from traditional autoencoders. Compared to traditional autoencoders, VAEs offer a richer understanding of the data distribution, making them particularly powerful for generative tasks.
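To make the "smile" example concrete, here is a minimal NumPy sketch (illustrative only, with made-up numbers) contrasting the deterministic latent of a vanilla autoencoder with the sampled latent of a VAE via the reparameterization trick:
```python theme={"system"}
import numpy as np

rng = np.random.default_rng(0)

# Vanilla autoencoder: one fixed latent value for "smile";
# decoding it always reproduces the same face.
ae_smile = 0.4

# VAE: the encoder outputs a distribution (mean, log-variance) for "smile".
mu, log_var = 0.4, -2.0
sigma = np.exp(0.5 * log_var)

# Reparameterization: z = mu + sigma * epsilon, with epsilon ~ N(0, 1).
# Each sample decodes to a slightly different smile, which is what lets
# the VAE generate varied outputs rather than a single reconstruction.
z_samples = mu + sigma * rng.standard_normal(5)
print(z_samples)
```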
# LLM API Metrics
Source: https://novita.ai/docs/guides/observability-llm-api-metrics
Novita AI provides comprehensive monitoring metrics for your LLM API usage. These metrics give you insights into the availability and performance of your LLM API requests.
You can access these metrics through the [LLM Metrics Console](https://novita.ai/model-api/console/llm-metrics).
## Available Metrics
* Keep your API key safe.
* **Note:** API keys are securely encrypted on the server. If you lose it, you’ll need to delete the old key and create a new one.
3. **Identify the Model ID**
* meta-llama/llama-4-scout-17b-16e-instruct
### **Step 2: Install OWL**
* To begin, download and install OWL by following the step-by-step instructions on GitHub: [https://github.com/camel-ai/owl](https://github.com/camel-ai/owl).
### **Step 3: Configure OWL**
* Prepare OWL for use by setting up environment variables and adding your API key. Run the following commands in your terminal, then open the generated `.env` file and set your `NOVITA_API_KEY`:
```bash theme={"system"}
cd owl
cp .env_template .env
```
* Start OWL and specify your task by executing:
```bash theme={"system"}
python examples/run_novita_ai.py
```
### **Step 4: View Results**
1. **Terminal Output**
* Once launched, OWL will display the execution results directly in the terminal window.
2. **Web Interface**
* For a more intuitive experience, use OWL’s web-based interface. Launch it by running:
```bash theme={"system"}
cd owl
python webapp.py
```
* Steps to use the Web UI:
* Select `run_novita_ai` from the left-hand menu, then go to the `Environment Variable Management` tab on the right and enter your `NOVITA_API_KEY`.
* Click `Run` to execute your task.
* To perform a new task, update the input field and click `Run` again.
* Results will be displayed in the terminal, or a new file will be generated in the root directory, depending on the task.
* Alternatively, you can directly input your desired task into the content box and click the `Run` button to execute it.
# Page Assist
Source: https://novita.ai/docs/guides/pageassist
Elevate Your Browsing Experience with Novita AI and Page Assist Integration.
Page Assist is a browser extension that integrates AI capabilities directly into your web interactions. By combining Novita AI with Page Assist, you unlock an AI-enhanced browsing experience that boosts productivity and insight generation.
This guide will walk you through the process of deploying and running Page Assist with Novita AI to supercharge your web interactions.
## **How to Integrate Novita AI with Page Assist**
You can find the GitHub repository of Page Assist here: [n4ze3m/page-assist](https://github.com/n4ze3m/page-assist).
### Step 1: Prepare Your Environment
* Install Bun or npm: Follow the installation guide for [Bun](https://bun.sh/) or use npm as an alternative.
### Step 2: Clone Page Assist Repository
1. Open your terminal and run:
```bash theme={"system"}
git clone https://github.com/n4ze3m/page-assist.git
cd page-assist
```
2. Install dependencies using Bun or npm:
```bash theme={"system"}
bun install
```
### Step 3: Build Page Assist Extension
* Build the extension for Chrome (default):
```bash theme={"system"}
bun run build
```
* For Firefox, use:
```bash theme={"system"}
bun run build:firefox
```
### Step 4: Load the Extension
1. For Chrome:
* Navigate to `chrome://extensions`.
* Enable Developer Mode.
* Click `Load unpacked` and select the `build/chrome-xxx` (e.g. `build/chrome-mv3`) directory.
2. For Firefox:
* Go to `about:addons`.
* Click the `Extensions` tab.
* Click `Manage Your Extensions`.
* Select `Load Temporary Add-on` and choose the `manifest.json` file in the `build/firefox-xxx` (e.g. `build/firefox-mv3`) directory.
### Step 5: Configure Novita AI as OpenAI API Compatible Endpoint
1. Obtain Your Novita AI API Key:
* [Log in](https://novita.ai/user/login) to your Novita AI account.
* Navigate to [the Key Management page](https://novita.ai/settings/key-management).
* Generate a new API Key and copy it.
2. Set Up Novita AI Endpoint:
* In your Page Assist Settings, choose `OpenAI Compatible API` to add a provider.
* Choose Novita from the provider list, and use your API key for authentication. (If a base URL is required, Novita's OpenAI-compatible endpoint is typically `https://api.novita.ai/v3/openai`; confirm against your console.)
### Step 6: Choose your Model and Test Page Assist with Novita AI
* Choose your model from the model list provided by Novita AI.
* Interact with your Novita AI model by asking questions or analyzing web content.
# Payment Methods
Source: https://novita.ai/docs/guides/payment-methods
Novita AI uses **Stripe** for all payment processing.
# Portkey
Source: https://novita.ai/docs/guides/portkey
Streamline AI development by using Portkey AI Gateway with Novita AI for fast, secure, and reliable performance.
Portkey AI Gateway transforms how developers work with AI providers like Novita AI, offering a unified interface with fast, secure, and reliable routing across multiple language models. This integration simplifies AI development and improves application performance.
This guide will walk you through setting up Portkey AI Gateway and then integrating Novita AI API with Portkey.
## How to Set Up Portkey AI Gateway
Setting up Portkey AI Gateway is simple and efficient, requiring just three key steps: configuring the gateway, sending your first request, and optimizing routing and guardrails for seamless performance.
### Step 1: Setup your AI Gateway
To run the gateway locally, ensure Node.js and npm are installed on your system. Once ready, execute the following command:
```bash theme={"system"}
npx @portkey-ai/gateway
```
After the gateway starts, two key URLs will be displayed:
* The Gateway: `http://localhost:8787/v1`
* The Gateway Console: `http://localhost:8787/public/`
### Step 2: Make your first request
Begin by installing the Portkey AI Python library:
```bash theme={"system"}
pip install -qU portkey-ai
```
Next, execute the following Python code to send your first request:
```python theme={"system"}
from portkey_ai import Portkey

# OpenAI compatible client
client = Portkey(
    provider="openai",      # or 'anthropic', 'bedrock', 'groq', etc.
    Authorization="sk-***"  # the provider API key
)

# Make a request through your AI Gateway
client.chat.completions.create(
    messages=[{"role": "user", "content": "What's the weather like?"}],
    model="gpt-4o-mini"
)
```
Effortlessly monitor all your local logs in one centralized location using the Gateway Console at: `http://localhost:8787/public/`.
### Step 3: Routing & Guardrails
Portkey AI Gateway enables you to configure routing rules, add reliability features, and enforce guardrails. Below is an example configuration:
```python theme={"system"}
config = {
    "retry": {"attempts": 5},
    "output_guardrails": [{
        "default.contains": {"operator": "none", "words": ["Apple"]},
        "deny": True
    }]
}

# Attach the config to the client
client = client.with_options(config=config)

client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Reply randomly with Apple or Bat"}]
)

# The guardrail denies any reply containing "Apple", so the response will
# always be "Bat". The retry configuration attempts the request up to
# 5 times before giving up.
```
## How to Integrate Novita AI API with Portkey
To access the Novita AI API via the Portkey AI Gateway, follow these steps:
### Step 1: Install the Portkey SDK
Integrate the Portkey SDK into your application to seamlessly interact with Novita AI’s API through Portkey’s gateway.
**Node.JS**
```bash theme={"system"}
npm install --save portkey-ai
```
**Python**
```bash theme={"system"}
pip install portkey-ai
```
### Step 2: Initialize Portkey with the Virtual Key
To integrate Novita AI with Portkey, retrieve your LLM API key from [Novita AI](https://novita.ai/settings/key-management) and add it to Portkey to generate the virtual key.
**Node.JS SDK**
```js theme={"system"}
import Portkey from 'portkey-ai'

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY", // Replace with your Portkey API key
    virtualKey: "VIRTUAL_KEY"  // Replace with your virtual key for Novita AI
})
```
**Python SDK**
```python theme={"system"}
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VIRTUAL_KEY"   # Replace with your virtual key for Novita AI
)
```
### Step 3: Invoke Chat Completions with Novita AI
Utilize the Portkey instance to send requests to Novita AI. If necessary, you can override the virtual key directly within the API call.
**Node.JS SDK**
```js theme={"system"}
const chatCompletion = await portkey.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'Nous-Hermes-2-Mixtral-8x7B-DPO'
});

console.log(chatCompletion.choices);
```
**Python SDK**
```python theme={"system"}
completion = portkey.chat.completions.create(
    messages=[{"role": "user", "content": "Say this is a test"}],
    model="reka-core"
)
print(completion)
```
# Quickstart
Source: https://novita.ai/docs/guides/quickstart
Novita AI is the go-to inference platform for AI developers seeking a low-cost, reliable, and simple solution for shipping AI models. This is a quickstart guide to help you set up your account and get started.
## Template Management
* Display the list of available templates, including official templates provided by Novita and your [custom templates](/guides/sandbox-template).
* Support searching for a template by ID.
* Support filtering templates by visibility (public, private, or all), vCPUs, and memory.
## Usage Statistics
Console provides daily resource usage and cost analysis to help you monitor your consumption and optimize your usage.
* **vCPU Hours**
* Display the total vCPU usage time for all sandboxes on the current day (unit: hours).
* This data is accurate to the second and can be used to evaluate your current computing resource usage.
* **Memory Hours**
* Display the total memory usage time for all sandboxes on the current day (unit: GB·hours).
* The system calculates the memory allocation and usage time for each sandbox during runtime.
* **Usage Costs**
* Display the total cost of resources on the current day (unit: USD).
* Includes the total cost of all vCPUs and memory. The storage space and templates are currently free to use.
# E2B SDK Compatibility
Source: https://novita.ai/docs/guides/sandbox-e2b-compatible
Novita Agent Sandbox provides a compatibility API that allows you to use the E2B SDK and CLI. This is useful if you are already using the E2B SDK and CLI and want to switch to Novita Agent Sandbox. However, we recommend using the Novita [Agent Sandbox SDK](/guides/sandbox-stable-sdk) to access all features.
## Installation
### Code Interpreter SDK
A complete browser-use demo, which generates screenshots of the pages it visits, is available [here](https://github.com/novitalabs/Novita-CollabHub/tree/main/examples/browser-use).
# Internet access
Source: https://novita.ai/docs/guides/sandbox-internet-access
| Billing Item | Description |
|---|---|
| CPU | Billed based on the number of vCPU cores used and usage duration (accurate to the second). Current pricing can be found [here](https://novita.ai/pricing?sandbox=1). No billing occurs after the Sandbox is stopped. |
| RAM (memory) | Billed based on allocated memory capacity and usage duration (accurate to the second). Current pricing can be found [here](https://novita.ai/pricing?sandbox=1). No billing occurs after the Sandbox is stopped. |
| Storage | Currently free. Each Sandbox is allocated a fixed 20 GB storage space. |
| Templates | Currently free. |
| vCPUs (cores) | Unit Price |
|---|---|
| 1 | \$0.0000098/s |
| 2 | \$0.0000196/s |
| 3 | \$0.0000294/s |
| 4 | \$0.0000392/s |
| 5 | \$0.000049/s |
| 6 | \$0.0000588/s |
| 7 | \$0.0000686/s |
| 8 | \$0.0000784/s |
Memory is billed at \$0.0000032/GiB/s. Valid values are multiples of 512 MiB, from 512 MiB to 8192 MiB. For example:
| Memory | Unit Price |
|---|---|
| 512 MiB | \$0.0000016/s |
| 1 GiB | \$0.0000032/s |
| 2 GiB | \$0.0000064/s |
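To make the billing arithmetic concrete, here is a small sketch using the per-second rates above (storage and templates are currently free, so they contribute nothing):
```python theme={"system"}
# Estimate sandbox cost from the published per-second rates.
VCPU_RATE = 0.0000098  # USD per vCPU per second
MEM_RATE = 0.0000032   # USD per GiB of RAM per second

def sandbox_cost(vcpus: int, mem_gib: float, seconds: float) -> float:
    """Estimated cost in USD; billing stops once the sandbox is stopped."""
    return (vcpus * VCPU_RATE + mem_gib * MEM_RATE) * seconds

# Example: a 2-vCPU, 1-GiB sandbox running for one hour.
print(f"${sandbox_cost(2, 1.0, 3600):.4f}")  # ≈ $0.0821
```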
* If a Worker is configured with multiple GPUs, the unit price of that Worker will change; the final unit price on the configuration page shall prevail.
* Also, after creating a Serverless Endpoint, the actual pricing can be checked on the Serverless Endpoint management page.
## Account Delinquency Policy
### Handling of Delinquent Accounts
After your account becomes delinquent (both your account balance and vouchers are exhausted), the platform will send you notifications and suspend services. The specific impacts are as follows:
* Creation of new Serverless Endpoints is not supported;
* Only "viewing" or "deleting" of existing Serverless Endpoints is supported, modifications are not allowed, wherein:
* Workers in running state will no longer accept new requests, but Workers currently processing requests will continue to run until all existing requests are completed;
* Running Workers will be automatically released after processing existing requests;
* Eventually, the number of Workers in the Serverless Endpoint will scale down to 0, and no new Workers will be created.
### Service Restoration
When your account returns to a non-delinquent status, existing Serverless Endpoints will automatically scale up according to their configurations and resume service.
# Create Serverless Endpoint
Source: https://novita.ai/docs/guides/serverless-gpus-quickstart-create-endpoint
This guide takes deploying a **Llama 3.1 8B model** as an example to introduce how to create a Serverless Endpoint from scratch.
## Step 1: Prepare Container Image
You need to package the runtime environment into a Docker image and upload it to an image repository in advance. Currently, Novita AI supports specifying **"public image repository"** and **"private image repository"** (including image repository access credentials).
We use the vLLM official image repository for serving the Llama 3.1 8B model: `vllm/vllm-openai:latest`. You can use this image address directly.
## Step 2: Go to the Console
Go to the Serverless Console, select the appropriate GPU container instance specification, and click **"Create Endpoint"**.
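Once the endpoint is up, the vLLM image serves an OpenAI-compatible API, so a quick smoke test can be run against it. This is a hedged sketch: the endpoint URL is a placeholder for the address shown on your endpoint page, and the model id assumes the standard Hugging Face name for Llama 3.1 8B Instruct:
```python theme={"system"}
# Hypothetical smoke test against the deployed vLLM endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint-address>/v1",  # placeholder: copy from the endpoint page
    api_key="EMPTY",  # vLLM does not require a key unless one was configured
)

resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed model id served by the image
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```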
# Team
Source: https://novita.ai/docs/guides/team
With Team feature, you can convert your personal account into a team account or join an existing team to collaborate with others.
2. **Option 2**: In the Console, go to the [Template](https://novita.ai/gpus-console/templates) section and find the `New Template` button.
Once you click on it, you’ll open the `Create Template` popup.
**Here, you’ll need to configure your template:**
1. **Name Your Template:** Choose a clear name that makes it easy to identify and use. We suggest selecting a name that relates to the content of the image.
2. **Set up the image:** Bundle your runtime environment into an image and upload it to an image repository ahead of time. Then, paste the image URL into the `Container Image` field.\
Novita AI supports both public and private image repositories (with optional access credentials). If you're using a **private image repository**, you must provide Container Registry Credentials, which can be added under [**Settings > Container Registry Auth.**](https://novita.ai/gpus-console/settings)
3. **Set Template Visibility:** You can choose `Private`, which makes the template accessible only to you and your team. However, we strongly recommend selecting `Public` to share your work with the broader community. Public templates will appear in the Template Library, where all users can view, deploy, favorite, and share them—helping your work gain visibility and appreciation.
**Please note:** for security reasons, **public templates only support public image repositories**, and the Container Registry Credentials input will be disabled.
4. **Specify Container Disk Size:** Determine the disk size based on your needs. We provide 60 GB of free disk space by default.
5. **Advanced Configuration Options:** To improve usability, you can optionally provide advanced configuration options such as `Container Start Command`, `Local Mount`, `Expose HTTP/TCP Ports`, and `Environment Variables`.
You can find detailed explanations of these terms [here](https://novita.ai/docs/guides/gpu-instance-overview).
We strongly encourage you to create a `README` that clearly explains the purpose and configuration of your template. A well-written README helps both you and your team quickly understand the template and makes it easier for others to use—especially if you choose to share it publicly.
* To ensure your template is easy to adopt, we recommend keeping your README **concise and beginner-friendly**.
* If your template requires any **external dependencies**, please include **clear setup instructions** in the README. This guidance helps others deploy your template successfully and ensures a smooth experience.
**Unsure What Kind of Template to Create?**
* We suggest checking out popular open-source projects such as **vLLM**, **SGLang**, **Ollama**, **ComfyUI**, **Stable Diffusion WebUI**, or base environments like **CUDA**, **PyTorch**, or OS-specific setups (e.g., **CentOS** / **Ubuntu**, different versions).
We look forward to seeing what you create!
***
## Then try to explore some templates in the library!
If you've created a public template, you’ll see it in the [**Console > Template Library**](https://novita.ai/gpus-console/templates-library) section.
In the `Template Library`, you’ll find templates uploaded by both Novita and the community. Clicking on a template displays its information, including the `README` and `configuration details`. If you like the template, you can click `Deploy` to deploy it instantly.
If you really like a template, click `Favorite` to save it to your favorites. This makes it easier to find later, and favoriting also helps the template gain visibility with other users. Your favorited templates are displayed when you filter by `My favorites`.
**Want to share this template with your friends?** Click the `Copy Link` button to easily share it. This way, they can check it out and use it!
***
## Use a template to create an instance
The `Template Library` is integrated into the `Explore` section during instance creation.
Click `Change Template` to view both personal/team-created templates and all templates from the Template Library. Here, you can also filter to select your favorited templates.
Click on a template’s icon to view its details. Once you’ve found the template you need, click on it to select it and proceed with instance creation.
***
## Join Us!
**Now that you know how to create, upload, and share templates, come join us in building a community-friendly template ecosystem for tech enthusiasts!**
**Get in Touch:**
* Discord: [novita.ai](https://discord.com/invite/a3vd9r3uET)
# Verba
Source: https://novita.ai/docs/guides/verba
Integrate Novita AI with Verba to simplify data management and unlock contextual answers instantly.
Novita AI has redefined LLM application development through its seamless integration with Verba. By merging Novita AI's innovative platform with Verba's advanced NLP framework, developers gain access to superior performance, enhanced customizability, and a user-friendly experience.
This guide will walk you through how to integrate Novita AI API with Verba.
## Integration Steps
### Step 1: **Visit the Verba Website**
* Open your browser and navigate to [https://verba.weaviate.io/](https://verba.weaviate.io/).
### Step 2: **Choose Deployment Option**
* Select your preferred deployment method on the homepage to proceed.
### Step 3: **Configure Settings**
* Click the "Start" button at the bottom-right corner.
* On the left-hand side, click **Config**.
* Under **Generator**, select **Novita**.
* Enter your [Novita API Key](https://novita.ai/settings/key-management) in the **API Key** field.
* Finally, click the **Save Config** button to apply your settings.
### Step 4: **Import Data**
* Click the **Import Data** button in the top-right corner to open the file import page.
* Select the **Files** button at the bottom to open the file selection window.
* Choose the files you want to import and click the **Import Selected** button at the bottom-left corner.
* A successfully imported file will display on the right-hand side of the page.
### Step 5: **Start Chatting**
* Click the **Chat** button at the top to return to the chat page.
* Enter your questions based on the imported file to receive accurate results instantly.