LLM Serverless Model Deprecation Notice

Dear Developer, We are writing to inform you of upcoming changes to our Serverless Endpoints. As part of our regular product lifecycle management to ensure high performance and resource optimization, we will be deprecating a list of specific models on January 8, 2026 (UTC). You have 14 days from today to update your integration before these models are removed from the Serverless Endpoints.

Impacted Models & Recommended Alternatives

Please refer to the table below to see if your application is using any of the impacted models and find their recommended alternatives:

Deprecated model	Recommended alternatives
baichuan/baichuan-m2-32b	qwen/qwen3-32b-fp8
paddlepaddle/paddleocr-vl	qwen/qwen3-vl-8b-instruct
qwen/qwen3-reranker-8b	baai/bge-reranker-v2-m3
deepseek/deepseek-ocr	qwen/qwen3-vl-235b-a22b-instruct
thudm/glm-4.1v-9b-thinking	zai-org/glm-4.6v
deepseek/deepseek-r1-distill-qwen-14b	deepseek/deepseek-r1-0528
meta-llama/llama-3.2-3b-instruct	meta-llama/llama-3.3-70b-instruct
google/gemma-3-12b-it	google/gemma-3-27b-it
gryphe/mythomax-l2-13b	deepseek/deepseek-v3-0324
deepseek/deepseek-r1-0528-qwen3-8b	deepseek/deepseek-r1-0528
deepseek/deepseek-r1-distill-qwen-32b	deepseek/deepseek-r1-0528
qwen/qwen3-8b-fp8	qwen/qwen3-next-80b-a3b-instruct

What actions should you take?

To avoid service interruption, please choose one of the following options before the deadline:

Option 1: Switch to the Alternative (Recommended)

Update your API code to point to the Recommended Alternative listed above. These newer models offer improved reasoning capabilities, faster inference speeds, and better cost-efficiency.

Option 2: Continue Using Legacy Models (via Dedicated Endpoints)

If your workflow strictly requires a specific version from the deprecated list (e.g., for reproducibility or specific fine-tuning), you can deploy it on your own private GPU resources using our Dedicated Endpoints.

Benefit: This guarantees exclusive access, zero cold starts, and consistent performance.
Action: https://novita.ai/dedicated-endpoint

Timeline

Announcement Date: December 25, 2025
End of Life (EOL): January 8, 2026 (UTC)

Note: After the EOL date, API requests to any models in the Deprecated list via Serverless Endpoints will result in an error. If you have any questions or need assistance with the migration, please reach out to us via our Discord community or submit a support ticket. Best regards, The Novita AI Team

Last modified on December 25, 2025

January 19, 2026 Product Updates Nov 11, 2025 Updates

⌘I

​Impacted Models & Recommended Alternatives

​What actions should you take?

​Option 1: Switch to the Alternative (Recommended)

​Option 2: Continue Using Legacy Models (via Dedicated Endpoints)

​Timeline

Impacted Models & Recommended Alternatives

What actions should you take?

Option 1: Switch to the Alternative (Recommended)

Option 2: Continue Using Legacy Models (via Dedicated Endpoints)

Timeline