moonshotai/kimi-k2.6

Kimi K2.6

Kimi K2.6 is an open-source, natively multimodal agentic model aimed at long-horizon coding, coding-driven design, and swarm-based task orchestration. It carries out end-to-end development tasks across multiple programming languages and domains, turning simple prompts and visual inputs into production-ready interfaces and full-stack workflows. K2.6 can orchestrate up to 300 domain-specialized sub-agents through 4,000 coordinated steps, decomposing complex tasks to deliver outputs ranging from documents and spreadsheets to fully functional websites in a single autonomous run. It also supports persistent, 24/7 background agents that manage schedules, deploy code, and orchestrate cross-platform operations without human oversight, establishing it as a premier foundational model for next-gener

Features

Serverless API

moonshotai/kimi-k2.6 is available via Novita's serverless API, where you pay per token. There are several ways to call the API, including OpenAI-compatible endpoints.

Available Serverless

Run queries immediately, pay only for usage

Input: $0.8 / M Tokens
Cache Read: $0.16 / M Tokens
Output: $3.4 / M Tokens
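
As a rough illustration of how the per-token prices above add up, here is a minimal sketch of a cost estimator. The function name `estimate_cost` is hypothetical, and the billing rule it uses (cached input tokens are charged at the Cache Read rate instead of the Input rate) is an assumption, not something this page states.

```python
from decimal import Decimal

# Prices in USD per million tokens, copied from the pricing list above.
PRICE_PER_M = {
    "input": Decimal("0.8"),
    "cache_read": Decimal("0.16"),
    "output": Decimal("3.4"),
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from cache,
    billed at the cache-read rate instead of the regular input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * PRICE_PER_M["input"]
        + cached_tokens * PRICE_PER_M["cache_read"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 10,000 input tokens (2,000 of them cached) and 1,000 output tokens.
    print(estimate_cost(10_000, 1_000, cached_tokens=2_000))
```

Decimal is used here so that small per-request amounts do not pick up floating-point rounding noise when summed over many requests.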

Use the following code examples to integrate with our API:

from openai import OpenAI

# Point the OpenAI SDK at Novita's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai"
)

response = client.chat.completions.create(
    model="moonshotai/kimi-k2.6",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    # 262144 is the Max Output listed for this model; lower it for short replies.
    max_tokens=262144,
    temperature=0.7
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
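
The same request can probably be sent without the SDK. The sketch below builds it with Python's standard library only. It assumes the full URL is the `base_url` from the example above plus the usual OpenAI-style `/chat/completions` path, and that the key is sent as a Bearer token, which is what the OpenAI SDK does. The helper names `build_chat_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Assumed URL: base_url from the SDK example plus "/chat/completions".
CHAT_URL = "https://api.novita.ai/openai/chat/completions"


def build_chat_request(api_key: str, user_text: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": "moonshotai/kimi-k2.6",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(send(build_chat_request("<Your API Key>", "Hello, how are you?")))
```

Separating request construction from sending makes the payload easy to inspect or log before any tokens are billed.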

Info

Provider: MoonshotAI
Quantization: -

Supported Functionality

Context Length: 262144 tokens
Max Output: 262144 tokens
Serverless: Supported
Reasoning: Supported
Structured Output: Supported
Function Calling: Supported
Anthropic API: Supported
Input Capabilities: text, image, video
Output Capabilities: text
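
Since function calling is listed as supported, a request that offers the model a tool might look like the sketch below. It assumes the endpoint accepts the standard OpenAI `tools` format and returns tool calls in the standard OpenAI message shape; this page does not confirm either detail. The `get_weather` tool and both helper names are hypothetical.

```python
import json


def build_tool_request(user_text: str) -> dict:
    """Return keyword arguments for client.chat.completions.create(**kwargs)."""
    # Hypothetical tool definition in the OpenAI "tools" format.
    get_weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "moonshotai/kimi-k2.6",
        "messages": [{"role": "user", "content": user_text}],
        "tools": [get_weather_tool],
        "max_tokens": 1024,
    }


def parse_tool_calls(message: dict) -> list[tuple[str, dict]]:
    """Extract (function name, decoded arguments) pairs from an assistant message dict."""
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # Arguments arrive as a JSON-encoded string in the OpenAI format.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

With the SDK client from the earlier example, the request would be sent as `client.chat.completions.create(**build_tool_request("Weather in Paris?"))`, and `response.choices[0].message.model_dump()` would give a dict that `parse_tool_calls` can read.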
