Impacted Models & Recommended Alternatives
Please refer to the table below to see if your application is using any of the impacted models and find their recommended alternatives:

| Deprecated model | Recommended alternatives |
|---|---|
| baichuan/baichuan-m2-32b | qwen/qwen3-32b-fp8 |
| paddlepaddle/paddleocr-vl | qwen/qwen3-vl-8b-instruct |
| qwen/qwen3-reranker-8b | baai/bge-reranker-v2-m3 |
| deepseek/deepseek-ocr | qwen/qwen3-vl-235b-a22b-instruct |
| thudm/glm-4.1v-9b-thinking | zai-org/glm-4.6v |
| deepseek/deepseek-r1-distill-qwen-14b | deepseek/deepseek-r1-0528 |
| meta-llama/llama-3.2-3b-instruct | meta-llama/llama-3.3-70b-instruct |
| google/gemma-3-12b-it | google/gemma-3-27b-it |
| gryphe/mythomax-l2-13b | deepseek/deepseek-v3-0324 |
| deepseek/deepseek-r1-0528-qwen3-8b | deepseek/deepseek-r1-0528 |
| deepseek/deepseek-r1-distill-qwen-32b | deepseek/deepseek-r1-0528 |
| qwen/qwen3-8b-fp8 | qwen/qwen3-next-80b-a3b-instruct |
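
If model IDs appear in several places in your codebase, one way to apply the table above is to route every request's model name through a single lookup. The following is a minimal sketch, not an official migration tool: the names `REPLACEMENTS` and `resolve_model` are hypothetical and not part of any Novita SDK, and the mapping is simply copied from the table.

```python
# Deprecated model ID -> recommended alternative, copied from the table above.
# REPLACEMENTS and resolve_model are illustrative names, not an official API.
REPLACEMENTS = {
    "baichuan/baichuan-m2-32b": "qwen/qwen3-32b-fp8",
    "paddlepaddle/paddleocr-vl": "qwen/qwen3-vl-8b-instruct",
    "qwen/qwen3-reranker-8b": "baai/bge-reranker-v2-m3",
    "deepseek/deepseek-ocr": "qwen/qwen3-vl-235b-a22b-instruct",
    "thudm/glm-4.1v-9b-thinking": "zai-org/glm-4.6v",
    "deepseek/deepseek-r1-distill-qwen-14b": "deepseek/deepseek-r1-0528",
    "meta-llama/llama-3.2-3b-instruct": "meta-llama/llama-3.3-70b-instruct",
    "google/gemma-3-12b-it": "google/gemma-3-27b-it",
    "gryphe/mythomax-l2-13b": "deepseek/deepseek-v3-0324",
    "deepseek/deepseek-r1-0528-qwen3-8b": "deepseek/deepseek-r1-0528",
    "deepseek/deepseek-r1-distill-qwen-32b": "deepseek/deepseek-r1-0528",
    "qwen/qwen3-8b-fp8": "qwen/qwen3-next-80b-a3b-instruct",
}


def resolve_model(model_id: str) -> str:
    """Return the recommended alternative for a deprecated model ID.

    IDs that are not in the deprecation table are returned unchanged.
    """
    # Normalize whitespace and case only for the lookup key.
    return REPLACEMENTS.get(model_id.strip().lower(), model_id)


if __name__ == "__main__":
    for name in ("google/gemma-3-12b-it", "deepseek/deepseek-v3-0324"):
        print(f"{name} -> {resolve_model(name)}")
```

Calling `resolve_model` wherever the `model` field of a request is set keeps the migration in one place. Because a replacement may differ in size, context length, or output style, it is worth re-running your own evaluations after the switch.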
What actions should you take?
To avoid service interruption, please choose one of the following options before the deadline:

Option 1: Switch to the Alternative (Recommended)
Update your API code to point to the Recommended Alternative listed above. These newer models offer improved reasoning capabilities, faster inference speeds, and better cost-efficiency.

Option 2: Continue Using Legacy Models (via Dedicated Endpoints)
If your workflow strictly requires a specific version from the deprecated list (e.g., for reproducibility or specific fine-tuning), you can deploy it on your own private GPU resources using our Dedicated Endpoints.

- Benefit: This guarantees exclusive access, zero cold starts, and consistent performance.
- Action: Set up a Dedicated Endpoint at https://novita.ai/dedicated-endpoint
Timeline
- Announcement Date: December 25, 2025
- End of Life (EOL): January 8, 2026 (UTC)