Overview

Reasoning models are language models optimized for complex problem solving. By generating explicit reasoning steps (chain-of-thought) before the final answer, they improve accuracy on analytical tasks.

Typical Use Cases

  • Complex Problem Solving: Suitable for tasks requiring step-by-step logic, such as math or scientific reasoning.
  • Decision Support Systems: Helps explain the logic behind conclusions by providing detailed reasoning processes.
  • Education and Training: Assists learners in understanding complex concepts by presenting derivation processes clearly.

Installation & Setup

Before using reasoning models, make sure the latest OpenAI SDK is installed:

pip install -U openai

API Usage

Use the /chat/completions endpoint to invoke reasoning models.

Request Parameters

  • max_tokens: Sets the maximum number of tokens the model can return.
  • temperature: Recommended between 0.5 and 0.7 (suggested: 0.6) to balance creativity and logic.
  • top_p: Recommended value is 0.95.
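
These recommendations map directly onto request arguments. A minimal sketch of a request using the suggested values (the prompt is illustrative; the client setup matches the examples below):

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.novita.ai/v3/openai")

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    max_tokens=4096,   # upper bound on returned tokens
    temperature=0.6,   # suggested value within the 0.5-0.7 range
    top_p=0.95,        # recommended value
)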

Example Code

Streaming Response

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.novita.ai/v3/openai")
messages = [
    {"role": "user", "content": "Explain Newton's Second Law."}
]

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=messages,
    stream=True,
    max_tokens=4096
)

content = ""
reasoning_content = ""
for chunk in response:
    # Some chunks may arrive without choices (e.g. a trailing usage chunk).
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # The answer and the reasoning trace stream in separate delta fields;
    # reasoning_content is a provider extension, so read it defensively.
    if delta.content:
        content += delta.content
    if getattr(delta, "reasoning_content", None):
        reasoning_content += delta.reasoning_content

print("Final Answer:", content)
print("Reasoning Steps:", reasoning_content)

Non-Streaming Response

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[
        {"role": "user", "content": "What is the greenhouse effect? How can it be mitigated?"}
    ],
    stream=False,
    max_tokens=4096
)

content = response.choices[0].message.content
# reasoning_content is a provider extension on the message object; read it
# defensively in case a model variant omits the field.
reasoning_content = getattr(response.choices[0].message, "reasoning_content", None)

print("Final Answer:", content)
print("Reasoning Steps:", reasoning_content)

Context Management

The reasoning trace is not automatically carried over to the next round of dialogue. Maintain the message history yourself, appending only the final answer (content), never reasoning_content:

messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Please continue explaining the solution."})

Supported Models

DeepSeek Series

  • deepseek/deepseek-r1-0528
  • deepseek/deepseek-r1-0528-qwen3-8b
  • deepseek/deepseek-r1-turbo
  • deepseek/deepseek-r1-distill-qwen-32b
  • deepseek/deepseek-r1-distill-qwen-14b
  • deepseek/deepseek-r1-distill-llama-70b
  • deepseek/deepseek-r1-distill-llama-8b
  • deepseek/deepseek-r1/community

Qwen Series

  • qwen/qwen3-235b-a22b-fp8
  • qwen/qwen3-30b-a3b-fp8
  • qwen/qwen3-32b-fp8
  • qwen/qwen3-8b-fp8
  • qwen/qwen3-4b-fp8

GLM Series

  • thudm/glm-z1-rumination-32b-0414
  • thudm/glm-z1-9b-0414
  • thudm/glm-z1-32b-0414

Llama Series

  • meta-llama/llama-4-maverick-17b-128e-instruct-fp8

MiniMax Series

  • minimaxai/minimax-m1-80k

Gryphe Series

  • gryphe/mythomax-l2-13b

Sao10K Series

  • Sao10K/L3-8B-Stheno-v3.2

Mistral AI Series

  • mistralai/mistral-nemo
  • mistralai/mistral-7b-instruct

Other Series

  • microsoft/wizardlm-2-8x22b
  • nousresearch/hermes-2-pro-llama-3-8b
  • cognitivecomputations/dolphin-mixtral-8x22b
  • sophosympatheia/midnight-rose-70b

Visit the model library for the latest list and details.


Billing

  • Billing is based on the number of input and output tokens.
  • Please refer to each model’s pricing page for specific billing rules and token conversion details.

Notes & Best Practices

  • Avoid placing reasoning instructions in the system message. Instead, make the intent explicit in the user message.
  • For mathematical tasks, clearly instruct the model, e.g., “Please reason step by step and provide a final answer.”
  • To prevent the model from skipping its reasoning, consider asking it to put the final answer on a new line after the reasoning.
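
Putting these practices together, a prompt for a math task might look like the following sketch (the wording is illustrative):

messages = [
    {
        "role": "user",
        # Keep the reasoning instruction in the user message, not the system message.
        "content": (
            "Solve 3x + 5 = 20. Please reason step by step, "
            "and put the final answer on a new line."
        ),
    }
]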