Dear Developer,

We are writing to inform you of upcoming changes to our Serverless Endpoints. As part of our regular product lifecycle management to ensure high performance and resource optimization, we are deprecating two batches of models: an immediate batch effective today (May 22, 2026) and a scheduled batch on June 5, 2026 (UTC). For the scheduled batch, you have 14 days from today to update your integration before these models are removed from the Serverless Endpoints.
Impacted Models & Recommended Alternatives
Please refer to the tables below to see whether your application uses any of the impacted models and to find their recommended alternatives.

Effective Immediately (May 22, 2026)

| Deprecated model | Recommended alternatives | Notes |
|---|---|---|
| baidu/ernie-4.5-21B-a3b | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | - |
| baidu/ernie-4.5-vl-28b-a3b | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | - |
| baidu/ernie-4.5-300b-a47b-paddle | deepseek/deepseek-v3.2 | - |
| baidu/ernie-4.5-21B-a3b-thinking | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | - |
| qwen/qwen2.5-7b-instruct | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |
| qwen/qwen2.5-vl-72b-instruct | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |
| qwen/qwen2.5-32b-instruct | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |
| qwen/qwen3-4b | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |

Scheduled for June 5, 2026
| Deprecated model | Recommended alternatives | Notes |
|---|---|---|
| qwen/qwen3-30b-a3b | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | Also available on Dedicated Endpoints |
| qwen/qwen3-32b | qwen/qwen3.5-27b, qwen/qwen3.6-27b | Also available on Dedicated Endpoints |
| nousresearch/hermes-2-pro-llama-3-8b | None | Available on Dedicated Endpoints only |
| meta-llama/llama-3.2-1b-instruct | None | Available on Dedicated Endpoints only |
| sao10k/l3-70b-euryale-v2.1 | None | Available on Dedicated Endpoints only |
| deepseek/deepseek-ocr | deepseek/deepseek-ocr-2 | - |

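
To check an application against the two tables, a short audit script may help. The following is a minimal sketch, not an official tool: the function name `impacted_models` and the input list are hypothetical, model IDs are matched exactly as written in the tables, and a model is treated as removed from the start of its end-of-life date (the notice gives only the date in UTC, not a time of day).

```python
from datetime import date

# Serverless end-of-life dates, copied from the tables in this notice.
IMMEDIATE_EOL = date(2026, 5, 22)
SCHEDULED_EOL = date(2026, 6, 5)

DEPRECATIONS = {
    "baidu/ernie-4.5-21B-a3b": IMMEDIATE_EOL,
    "baidu/ernie-4.5-vl-28b-a3b": IMMEDIATE_EOL,
    "baidu/ernie-4.5-300b-a47b-paddle": IMMEDIATE_EOL,
    "baidu/ernie-4.5-21B-a3b-thinking": IMMEDIATE_EOL,
    "qwen/qwen2.5-7b-instruct": IMMEDIATE_EOL,
    "qwen/qwen2.5-vl-72b-instruct": IMMEDIATE_EOL,
    "qwen/qwen2.5-32b-instruct": IMMEDIATE_EOL,
    "qwen/qwen3-4b": IMMEDIATE_EOL,
    "qwen/qwen3-30b-a3b": SCHEDULED_EOL,
    "qwen/qwen3-32b": SCHEDULED_EOL,
    "nousresearch/hermes-2-pro-llama-3-8b": SCHEDULED_EOL,
    "meta-llama/llama-3.2-1b-instruct": SCHEDULED_EOL,
    "sao10k/l3-70b-euryale-v2.1": SCHEDULED_EOL,
    "deepseek/deepseek-ocr": SCHEDULED_EOL,
}


def impacted_models(models_in_use, today):
    """Map each deprecated model in use to the days left before removal.

    A value of 0 means the model is assumed to be already removed.
    Models that are not in the tables are left out of the result.
    """
    report = {}
    for model in models_in_use:
        eol = DEPRECATIONS.get(model)
        if eol is not None:
            report[model] = max(0, (eol - today).days)
    return report


if __name__ == "__main__":
    # Hypothetical list of models an application calls.
    in_use = ["qwen/qwen3-32b", "deepseek/deepseek-v3.2", "qwen/qwen3-4b"]
    print(impacted_models(in_use, date(2026, 5, 22)))
    # prints {'qwen/qwen3-32b': 14, 'qwen/qwen3-4b': 0}
```
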
What actions should you take?
To avoid service interruption, please choose one of the following options before the deadline:

Option 1: Switch to the Recommended Alternative
For deprecated models that have a Serverless alternative listed above, update your API code to point to the recommended alternative. These newer models offer improved reasoning capabilities, faster inference speeds, and better cost-efficiency.

Option 2: Continue Using Specific Models via Dedicated Endpoints
Several models in the June 5, 2026 batch remain available on Dedicated Endpoints:
- Dedicated Endpoints only (no Serverless alternative): nousresearch/hermes-2-pro-llama-3-8b, meta-llama/llama-3.2-1b-instruct, sao10k/l3-70b-euryale-v2.1
- Also available on Dedicated Endpoints (alongside Serverless alternatives): qwen/qwen3-30b-a3b, qwen/qwen3-32b
- Benefit: A Dedicated Endpoint guarantees exclusive access, zero cold starts, and consistent performance.
- Action: Set up a Dedicated Endpoint at https://novita.ai/dedicated-endpoint
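
For Option 1, the model swap can be done in one place rather than at every call site. The sketch below assumes the request body is a dictionary with a `model` key, which is a common convention but is not specified in this notice. The helper name `migrate_payload` is hypothetical, and picking the first listed alternative for each model is an arbitrary choice; either listed alternative may suit a given workload better.

```python
# First recommended Serverless alternative for each deprecated model,
# copied from the tables in this notice.
REPLACEMENTS = {
    "baidu/ernie-4.5-21B-a3b": "qwen/qwen3.5-35b-a3b",
    "baidu/ernie-4.5-vl-28b-a3b": "qwen/qwen3.5-35b-a3b",
    "baidu/ernie-4.5-300b-a47b-paddle": "deepseek/deepseek-v3.2",
    "baidu/ernie-4.5-21B-a3b-thinking": "qwen/qwen3.5-35b-a3b",
    "qwen/qwen2.5-7b-instruct": "qwen/qwen3.5-27b",
    "qwen/qwen2.5-vl-72b-instruct": "qwen/qwen3.5-27b",
    "qwen/qwen2.5-32b-instruct": "qwen/qwen3.5-27b",
    "qwen/qwen3-4b": "qwen/qwen3.5-27b",
    "qwen/qwen3-30b-a3b": "qwen/qwen3.5-35b-a3b",
    "qwen/qwen3-32b": "qwen/qwen3.5-27b",
    "deepseek/deepseek-ocr": "deepseek/deepseek-ocr-2",
}

# Models with no Serverless alternative (Dedicated Endpoints only).
DEDICATED_ONLY = {
    "nousresearch/hermes-2-pro-llama-3-8b",
    "meta-llama/llama-3.2-1b-instruct",
    "sao10k/l3-70b-euryale-v2.1",
}


def migrate_payload(payload):
    """Return a copy of the request body with a deprecated model swapped out.

    Assumes the body carries the model ID under the "model" key.
    """
    model = payload.get("model")
    if model in DEDICATED_ONLY:
        # No Serverless replacement exists, so this needs Option 2.
        raise ValueError(f"{model} has no Serverless alternative")
    return {**payload, "model": REPLACEMENTS.get(model, model)}


if __name__ == "__main__":
    body = {"model": "qwen/qwen3-32b", "max_tokens": 256}
    print(migrate_payload(body))
    # prints {'model': 'qwen/qwen3.5-27b', 'max_tokens': 256}
```

Raising an error for the Dedicated-only models, instead of silently passing them through, surfaces the cases that cannot be fixed by a model ID change alone.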
Timeline
- Announcement Date: May 22, 2026
- Immediate End of Life (EOL): May 22, 2026 (UTC), effective upon announcement
- Scheduled End of Life (EOL): June 5, 2026 (UTC)