MiniMax M1

minimaxai/minimax-m1-80k
MiniMax-M1: The World's First Open-Weight, Large-Scale Hybrid-Attention Reasoning Model

MiniMax-M1 combines a Mixture of Experts (MoE) architecture with the lightning attention mechanism. The model has 456 billion parameters in total, of which 45.9 billion are activated per token. It natively supports a context length of 1 million tokens, 8 times that of DeepSeek R1. Trained with reinforcement learning using the CISPO algorithm on top of its efficient hybrid attention design, MiniMax-M1 is reported to perform strongly on long-context reasoning and real-world software engineering tasks.
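
To put the figures above in proportion, the short calculation below works out the share of parameters active per token and the context length that the "8 times" comparison implies for DeepSeek R1. The variable names are illustrative, and the implied figure is only as precise as the rounded multiple quoted in the description.

```python
# Figures quoted in the model description above.
TOTAL_PARAMS_B = 456.0      # total parameters, in billions
ACTIVE_PARAMS_B = 45.9      # parameters activated per token, in billions
CONTEXT_TOKENS = 1_000_000  # native context length, in tokens
CONTEXT_RATIO = 8           # stated multiple of DeepSeek R1's context

# Share of the model that participates in computing each token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Context length implied for DeepSeek R1 by the rounded "8 times" claim.
implied_r1_context = CONTEXT_TOKENS // CONTEXT_RATIO

print(f"Active per token: {active_fraction:.1%}")
print(f"Implied DeepSeek R1 context: {implied_r1_context:,} tokens")
```

Roughly one tenth of the parameters are used for any single token, which is the usual motivation for an MoE design: total capacity grows without a proportional increase in per-token compute.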

Features

Serverless API


minimaxai/minimax-m1-80k is available through Novita's serverless API, which is billed per token. The API can be called in several ways, including through OpenAI-compatible endpoints.

Available Serverless

Run queries immediately, pay only for usage

Input: $0.55 / M tokens
Output: $2.20 / M tokens
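
As a rough illustration of per-token billing, the sketch below estimates the charge for a single request from the listed prices. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes the bill is simply token count times the listed rate, with no minimums, discounts, or other fees.

```python
INPUT_USD_PER_M = 0.55   # listed input price per million tokens
OUTPUT_USD_PER_M = 2.20  # listed output price per million tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated charge in USD for one request.

    Assumes a flat per-token rate (an assumption, not a billing guarantee).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 40,000-token response.
print(f"${estimate_cost_usd(100_000, 40_000):.4f}")  # prints $0.1430
```

Because output tokens cost four times as much as input tokens, long generations dominate the bill even when the prompt is large.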

Use the following code examples to integrate with our API:

from openai import OpenAI

# Point the OpenAI client at Novita's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai"
)

# Send a chat completion request to MiniMax M1.
response = client.chat.completions.create(
    model="minimaxai/minimax-m1-80k",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    max_tokens=40000,  # matches the model's listed maximum output
    temperature=0.7
)

# Print the assistant's reply.
print(response.choices[0].message.content)
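
For long responses it can be more practical to stream tokens as they arrive. The sketch below is a variant of the example above that reads the key from an environment variable and passes `stream=True`. The variable name `NOVITA_API_KEY` and the helper names are hypothetical, and streaming support on this endpoint is assumed from its OpenAI compatibility rather than stated on this page.

```python
import os


def build_chat_request(prompt: str, max_tokens: int = 40000) -> dict:
    """Assemble keyword arguments for client.chat.completions.create."""
    return {
        "model": "minimaxai/minimax-m1-80k",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.7,
        "stream": True,  # assumption: the endpoint supports streaming
    }


def stream_reply(prompt: str) -> str:
    """Stream a reply to stdout and return the full text."""
    from openai import OpenAI  # imported lazily so the helper above has no dependencies

    client = OpenAI(
        api_key=os.environ["NOVITA_API_KEY"],  # hypothetical variable name
        base_url="https://api.novita.ai/openai",
    )
    parts = []
    for chunk in client.chat.completions.create(**build_chat_request(prompt)):
        # Some chunks carry no choices or no text, so guard before reading.
        if chunk.choices and chunk.choices[0].delta.content:
            text = chunk.choices[0].delta.content
            parts.append(text)
            print(text, end="", flush=True)
    print()
    return "".join(parts)


# Only call the API when run as a script and a key is configured.
if __name__ == "__main__" and os.environ.get("NOVITA_API_KEY"):
    stream_reply("Hello, how are you?")
```

Keeping the key in the environment avoids committing it to source control, and separating request construction from the network call makes the request easy to inspect or test offline.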

Info

Provider: MiniMax
Quantization: bf16

Supported Functionality

Context Length: 1,000,000 tokens
Max Output: 40,000 tokens
Serverless: Supported
Function Calling: Supported
Structured Output: Supported
Reasoning: Supported
Input Capabilities: text
Output Capabilities: text
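
To illustrate how the limits and the function calling support listed above might be used together, the sketch below builds a request that declares one tool and checks it against the context and output limits before sending. The `get_weather` tool, the helper name, and the caller-supplied `prompt_tokens` estimate are all hypothetical. It also assumes that the endpoint accepts the OpenAI-style `tools` parameter and that prompt and output share the context window; neither is stated on this page.

```python
import json

CONTEXT_LENGTH = 1_000_000  # listed context length, in tokens
MAX_OUTPUT = 40_000         # listed maximum output, in tokens

# Hypothetical tool definition in the OpenAI-style "tools" schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(prompt: str, prompt_tokens: int,
                       max_tokens: int = MAX_OUTPUT) -> dict:
    """Build chat-completion kwargs after checking the listed limits.

    prompt_tokens is an estimate supplied by the caller.
    """
    if max_tokens > MAX_OUTPUT:
        raise ValueError(f"max_tokens exceeds the {MAX_OUTPUT}-token output limit")
    # Assumption: prompt and output tokens share one context window.
    if prompt_tokens + max_tokens > CONTEXT_LENGTH:
        raise ValueError("prompt plus output would exceed the context length")
    return {
        "model": "minimaxai/minimax-m1-80k",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "tools": [WEATHER_TOOL],
    }


request = build_tool_request("What's the weather in Paris?", prompt_tokens=12)
print(json.dumps(request, indent=2))
```

The resulting dictionary can be passed to an OpenAI-compatible client as `client.chat.completions.create(**request)`. If the model chooses to call the tool, OpenAI-compatible responses typically carry the call in `response.choices[0].message.tool_calls`.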
