Dear Developer,
We are writing to inform you of upcoming changes to our Serverless Endpoints. As part of our regular product lifecycle management to ensure high performance and resource optimization, we will be deprecating the models listed below on January 8, 2026 (UTC). You have 14 days from the announcement date (December 25, 2025) to update your integration before these models are removed from the Serverless Endpoints.
Impacted Models & Recommended Alternatives
Please refer to the table below to see if your application is using any of the impacted models and find their recommended alternatives:

| Deprecated model | Recommended alternatives |
|---|---|
| baichuan/baichuan-m2-32b | qwen/qwen3-32b-fp8 |
| paddlepaddle/paddleocr-vl | qwen/qwen3-vl-8b-instruct |
| qwen/qwen3-reranker-8b | baai/bge-reranker-v2-m3 |
| deepseek/deepseek-ocr | qwen/qwen3-vl-235b-a22b-instruct |
| thudm/glm-4.1v-9b-thinking | zai-org/glm-4.6v |
| deepseek/deepseek-r1-distill-qwen-14b | deepseek/deepseek-r1-0528 |
| meta-llama/llama-3.2-3b-instruct | meta-llama/llama-3.3-70b-instruct |
| google/gemma-3-12b-it | google/gemma-3-27b-it |
| gryphe/mythomax-l2-13b | deepseek/deepseek-v3-0324 |
| deepseek/deepseek-r1-0528-qwen3-8b | deepseek/deepseek-r1-0528 |
| deepseek/deepseek-r1-distill-qwen-32b | deepseek/deepseek-r1-0528 |
| qwen/qwen3-8b-fp8 | qwen/qwen3-next-80b-a3b-instruct |
What actions should you take?
To avoid service interruption, please choose one of the following options before the deadline:
Option 1: Switch to the Alternative (Recommended)
Update your API code to point to the Recommended Alternative listed above. These newer models offer improved reasoning capabilities, faster inference speeds, and better cost-efficiency.
Option 2: Continue Using Legacy Models (via Dedicated Endpoints)
If your workflow strictly requires a specific version from the deprecated list (e.g., for reproducibility or specific fine-tuning), you can deploy it on your own private GPU resources using our Dedicated Endpoints.
- Benefit: This guarantees exclusive access, zero cold starts, and consistent performance.
- Action: https://novita.ai/dedicated-endpoint
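For Option 1, one possible way to handle the switch is to keep the replacements from the table in a single lookup and resolve the model ID before each request. The sketch below is illustrative only: the names `REPLACEMENTS` and `resolve_model` are hypothetical and not part of any Novita SDK, and it assumes your code passes the model ID as a plain string.

```python
# Deprecated model -> recommended alternative, copied from the table in this notice.
REPLACEMENTS = {
    "baichuan/baichuan-m2-32b": "qwen/qwen3-32b-fp8",
    "paddlepaddle/paddleocr-vl": "qwen/qwen3-vl-8b-instruct",
    "qwen/qwen3-reranker-8b": "baai/bge-reranker-v2-m3",
    "deepseek/deepseek-ocr": "qwen/qwen3-vl-235b-a22b-instruct",
    "thudm/glm-4.1v-9b-thinking": "zai-org/glm-4.6v",
    "deepseek/deepseek-r1-distill-qwen-14b": "deepseek/deepseek-r1-0528",
    "meta-llama/llama-3.2-3b-instruct": "meta-llama/llama-3.3-70b-instruct",
    "google/gemma-3-12b-it": "google/gemma-3-27b-it",
    "gryphe/mythomax-l2-13b": "deepseek/deepseek-v3-0324",
    "deepseek/deepseek-r1-0528-qwen3-8b": "deepseek/deepseek-r1-0528",
    "deepseek/deepseek-r1-distill-qwen-32b": "deepseek/deepseek-r1-0528",
    "qwen/qwen3-8b-fp8": "qwen/qwen3-next-80b-a3b-instruct",
}


def resolve_model(model: str) -> str:
    """Return the recommended alternative if `model` is deprecated, else `model` unchanged."""
    return REPLACEMENTS.get(model, model)


if __name__ == "__main__":
    # A deprecated ID is swapped; any other ID passes through untouched.
    print(resolve_model("google/gemma-3-12b-it"))
    print(resolve_model("deepseek/deepseek-r1-0528"))
```

Because each alternative is a different model, outputs may change after the swap, so it is worth testing your prompts against the new model before the deadline.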
Timeline
- Announcement Date: December 25, 2025
- End of Life (EOL): January 8, 2026 (UTC)
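If you want your application to warn you as the deadline approaches, a small helper like the following could compute the days remaining. This is a sketch under one stated assumption: the notice gives only a date in UTC, so the cutoff is taken to be midnight at the start of January 8, 2026. The name `days_until_eol` is hypothetical.

```python
from datetime import datetime, timezone

# EOL date from the timeline in this notice. The exact time of day is not stated,
# so 00:00 UTC on January 8, 2026 is an assumption made for this sketch.
EOL = datetime(2026, 1, 8, tzinfo=timezone.utc)


def days_until_eol(now: datetime) -> int:
    """Whole days remaining before EOL (rounded down); negative once EOL has passed.

    `now` should be timezone-aware so the comparison is done in UTC.
    """
    return (EOL - now.astimezone(timezone.utc)).days


if __name__ == "__main__":
    print(days_until_eol(datetime.now(timezone.utc)))
```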