Last Updated: 2025-12-03
Status: Supported (OpenAI-Compatible)
1. Overview
Interleaved Thinking is an advanced reasoning framework that enables models to perform explicit reasoning steps between tool calls. Models with Interleaved Thinking can:- Reflect on the current environment and tool outputs
- Decide the next action based on updated reasoning
- Maintain a continuous reasoning chain across multiple tool invocations
- Provide transparent, inspectable multi-step thinking through
reasoning_detailsorreasoning_content
2. Key Concepts
2.1 Interleaving
Instead of executing a single reasoning phase followed by a tool call, the model performs:2.2 Reasoning Details (reasoning_details)
For some models, the content of the model’s thinking will be returned in the form of a separate structure:
2.3 Conversation Memory Requirement
To maintain reasoning continuity: You must append the model’s full response includingreasoning_details, tool_calls, and content to subsequent messages.
Failing to preserve the chain may result in:
- Incorrect tool use
- Lost reasoning context
- Repeated or circular tool calls
- Reduced reliability
3. API Behavior
3.1 Request Format
No changes are required on the user side. Interleaved Thinking works with the standard OpenAI-compatible Chat Completions API.3.2 Response Format
The model may return the following fields:reasoning_content: original thinking contentreasoning_details: structured reasoning segments, this field is optioanltool_calls: tool invocation plancontent: natural language output
4. Example Request (MiniMax-M2)
5. Example Response (Non-Streaming)
6. Streaming Response Example
7. Developer Notes
7.1 Models That Support Interleaved Thinking
All models exposingreasoning_details through OpenAI-compatible APIs, including:
- MiniMax-M2
- (Upcoming) Novita Reasoning Series
- Other reasoning-enabled partner models
7.2 Pricing
Billing is based on reasoning tokens, following the model’s pricing rules.reasoning_details will increase token usage.
7.3 Error Handling
You may encounter:- Missing tool parameters
- Recursive or repeated tool calls
- Incorrect assumptions in the reasoning phase
8. Best Practices
✓ Always include full model messages in the next request Include:contenttool_callsreasoning_details
- Monitor the reasoning process
- Detect incorrect tool plans early
- Provide faster user feedback
- Parameter validators
- Execution sandbox
- Maximum recursion safeguards
9. Summary
Interleaved Thinking significantly enhances multi-step reasoning and tool-use reliability:- Transparent and inspectable step-wise reasoning
- Adaptive planning between tool invocations
- Stronger context retention across long workflows
- Fully compatible with OpenAI-style Chat Completions API