Dear Developer,

We are writing to inform you of upcoming changes to our Serverless Endpoints. As part of our regular product lifecycle management to ensure high performance and resource optimization, we are deprecating two batches of models: an immediate batch effective today (May 22, 2026) and a scheduled batch on June 5, 2026 (UTC). For the scheduled batch, you have 14 days from today to update your integration before these models are removed from the Serverless Endpoints.

Please refer to the tables below to see if your application is using any of the impacted models and find their recommended alternatives.

Effective Immediately (May 22, 2026)

| Deprecated model | Recommended alternatives | Notes |
| --- | --- | --- |
| baidu/ernie-4.5-21B-a3b | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | - |
| baidu/ernie-4.5-vl-28b-a3b | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | - |
| baidu/ernie-4.5-300b-a47b-paddle | deepseek/deepseek-v3.2 | - |
| baidu/ernie-4.5-21B-a3b-thinking | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | - |
| qwen/qwen2.5-7b-instruct | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |
| qwen/qwen2.5-vl-72b-instruct | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |
| qwen/qwen2.5-32b-instruct | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |
| qwen/qwen3-4b | qwen/qwen3.5-27b, qwen/qwen3.6-27b | - |

Scheduled for June 5, 2026

| Deprecated model | Recommended alternatives | Notes |
| --- | --- | --- |
| qwen/qwen3-30b-a3b | qwen/qwen3.5-35b-a3b, qwen/qwen3.6-35b-a3b | Also available on Dedicated Endpoints |
| qwen/qwen3-32b | qwen/qwen3.5-27b, qwen/qwen3.6-27b | Also available on Dedicated Endpoints |
| nousresearch/hermes-2-pro-llama-3-8b | - | Available on Dedicated Endpoints only |
| meta-llama/llama-3.2-1b-instruct | - | Available on Dedicated Endpoints only |
| sao10k/l3-70b-euryale-v2.1 | - | Available on Dedicated Endpoints only |
| deepseek/deepseek-ocr | deepseek/deepseek-ocr-2 | - |
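
To check whether your application references any of the impacted models, a small scan over your configuration or source text may help. The following is a minimal sketch, not an official tool: the function name `find_deprecated_models` is hypothetical, and case-insensitive matching is an assumption on our side. The pattern only accepts whole model IDs, so a replacement such as deepseek/deepseek-ocr-2 is not mistaken for the deprecated deepseek/deepseek-ocr.

```python
import re

# Model IDs copied from the two tables above.
DEPRECATED_MODELS = [
    "baidu/ernie-4.5-21B-a3b",
    "baidu/ernie-4.5-vl-28b-a3b",
    "baidu/ernie-4.5-300b-a47b-paddle",
    "baidu/ernie-4.5-21B-a3b-thinking",
    "qwen/qwen2.5-7b-instruct",
    "qwen/qwen2.5-vl-72b-instruct",
    "qwen/qwen2.5-32b-instruct",
    "qwen/qwen3-4b",
    "qwen/qwen3-30b-a3b",
    "qwen/qwen3-32b",
    "nousresearch/hermes-2-pro-llama-3-8b",
    "meta-llama/llama-3.2-1b-instruct",
    "sao10k/l3-70b-euryale-v2.1",
    "deepseek/deepseek-ocr",
]

_ALTERNATION = "|".join(re.escape(m) for m in DEPRECATED_MODELS)

# Lookbehind: the ID must not be the tail of a longer identifier.
# Lookahead: the ID must not continue with a word character, a hyphen,
# or a dot followed by a word character (a sentence-ending dot is fine).
_PATTERN = re.compile(
    rf"(?<![\w./-])(?:{_ALTERNATION})(?![\w-]|\.\w)",
    re.IGNORECASE,  # assumption: IDs may appear in a different letter case
)


def find_deprecated_models(text: str) -> list[str]:
    """Return the sorted, de-duplicated deprecated IDs mentioned in text."""
    canonical = {m.lower(): m for m in DEPRECATED_MODELS}
    return sorted({canonical[hit.lower()] for hit in _PATTERN.findall(text)})
```

For example, `find_deprecated_models(open("config.yaml").read())` would list the deprecated IDs found in a (hypothetical) `config.yaml`.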

What actions should you take?

To avoid service interruption, please choose one of the following options before the deadline:

Option 1: Migrate to a Recommended Alternative

For deprecated models that have a Serverless alternative listed above, update your API code to point to the recommended alternative. These newer models offer improved reasoning capabilities, faster inference speeds, and better cost-efficiency.
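
One way to apply this update is to route every model ID through a single lookup before it reaches your request code. The sketch below is illustrative only: `ALTERNATIVES`, `DEDICATED_ONLY`, and `resolve_model` are hypothetical names, and defaulting to the first listed alternative is an arbitrary choice, so please evaluate both candidates against your own workload.

```python
# Deprecated Serverless model -> recommended alternatives, from the tables above.
_QWEN_35B = ["qwen/qwen3.5-35b-a3b", "qwen/qwen3.6-35b-a3b"]
_QWEN_27B = ["qwen/qwen3.5-27b", "qwen/qwen3.6-27b"]

ALTERNATIVES = {
    "baidu/ernie-4.5-21B-a3b": _QWEN_35B,
    "baidu/ernie-4.5-vl-28b-a3b": _QWEN_35B,
    "baidu/ernie-4.5-300b-a47b-paddle": ["deepseek/deepseek-v3.2"],
    "baidu/ernie-4.5-21B-a3b-thinking": _QWEN_35B,
    "qwen/qwen2.5-7b-instruct": _QWEN_27B,
    "qwen/qwen2.5-vl-72b-instruct": _QWEN_27B,
    "qwen/qwen2.5-32b-instruct": _QWEN_27B,
    "qwen/qwen3-4b": _QWEN_27B,
    "qwen/qwen3-30b-a3b": _QWEN_35B,
    "qwen/qwen3-32b": _QWEN_27B,
    "deepseek/deepseek-ocr": ["deepseek/deepseek-ocr-2"],
}

# Deprecated models with no Serverless alternative listed.
DEDICATED_ONLY = {
    "nousresearch/hermes-2-pro-llama-3-8b",
    "meta-llama/llama-3.2-1b-instruct",
    "sao10k/l3-70b-euryale-v2.1",
}


def resolve_model(model_id: str, prefer: int = 0) -> str:
    """Map a deprecated model ID to a listed alternative.

    IDs that are not deprecated are returned unchanged. `prefer` selects
    among the listed alternatives and is clamped to the last one.
    """
    if model_id in DEDICATED_ONLY:
        raise LookupError(
            f"{model_id} has no Serverless alternative; see Dedicated Endpoints"
        )
    options = ALTERNATIVES.get(model_id)
    if options is None:
        return model_id
    return options[min(prefer, len(options) - 1)]
```

With this in place, `resolve_model("qwen/qwen3-32b")` returns `"qwen/qwen3.5-27b"`, and passing `prefer=1` selects `"qwen/qwen3.6-27b"` instead.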

Option 2: Continue Using Specific Models via Dedicated Endpoints

Several models in the June 5, 2026 batch remain available on Dedicated Endpoints:
  • Dedicated Endpoints only (no Serverless alternative): nousresearch/hermes-2-pro-llama-3-8b, meta-llama/llama-3.2-1b-instruct, sao10k/l3-70b-euryale-v2.1
  • Also available on Dedicated Endpoints (alongside Serverless alternatives): qwen/qwen3-30b-a3b, qwen/qwen3-32b
If your workflow requires any of these specific versions (e.g., for reproducibility or specific fine-tuning), you can deploy them on your own private GPU resources using our Dedicated Endpoints.

Timeline

  • Announcement Date: May 22, 2026
  • Immediate End of Life (EOL): May 22, 2026 (UTC) — effective upon announcement
  • Scheduled End of Life (EOL): June 5, 2026 (UTC)
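
If you would like an automated reminder, the dates above can be encoded in a small check, for instance in a CI job. This is a minimal sketch under stated assumptions: `days_until` is a hypothetical helper, and because the notice gives dates without a time of day, the sketch conservatively compares calendar dates in UTC.

```python
from datetime import date, datetime, timezone
from typing import Optional

# EOL dates from the timeline above (UTC).
IMMEDIATE_EOL = date(2026, 5, 22)
SCHEDULED_EOL = date(2026, 6, 5)


def days_until(eol: date, now: Optional[datetime] = None) -> int:
    """Whole days from the current UTC date until `eol`.

    Zero or a negative value means the EOL date has arrived or passed.
    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    today_utc = now.astimezone(timezone.utc).date()
    return (eol - today_utc).days


# On the announcement date, the scheduled batch has 14 days left.
announcement = datetime(2026, 5, 22, 12, 0, tzinfo=timezone.utc)
print(days_until(SCHEDULED_EOL, announcement))  # 14
```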
Note: After each EOL date, API requests to the corresponding deprecated models via Serverless Endpoints will result in an error.

If you have any questions or need assistance with the migration, please reach out to us via our Discord community or submit a support ticket.

Best regards,
The Novita AI Team