Step 3.7 Flash

stepfun/step-3.7-flash

Step-3.7-Flash is StepFun's flagship high-efficiency multimodal reasoning model. It is built on a sparse Mixture-of-Experts architecture (198B total parameters, about 11B active) that delivers high throughput and low latency for real-time agentic workflows and high-frequency calls. It natively understands images and video, so agent frameworks need no separate vision model, and it offers a 256K context window with three selectable reasoning levels (low, medium, high) to balance speed, cost, and reasoning depth. With reliable multi-step tool calling, task decomposition, and plan execution, it excels at coding, solution planning, and vision-to-workflow tasks, and can produce a complete plan in a single call.

Features

Serverless API

Documentation

stepfun/step-3.7-flash is available through Novita's serverless API, billed per token. The API can be called in several ways, including through OpenAI-compatible endpoints.

Serverless available

Run queries immediately, pay only for what you use

Input: $0.2 / M tokens

Cache read: $0.04 / M tokens

Output: $1.15 / M tokens
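
As a rough illustration of the per-token pricing above, the following sketch estimates the cost of a single request. The function name `estimate_cost_usd` is hypothetical, and the billing rule it encodes (cached prompt tokens are charged at the cache-read rate instead of the input rate) is an assumption rather than something this page states.

```python
# Prices in USD per million tokens, copied from the pricing listed above.
INPUT_PER_M = 0.20
CACHE_READ_PER_M = 0.04
OUTPUT_PER_M = 1.15


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache, and it is billed at the cache-read rate instead of the input rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total_per_million = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total_per_million / 1_000_000


if __name__ == "__main__":
    # 10,000 prompt tokens (4,000 of them cached) and 2,000 completion tokens.
    print(f"${estimate_cost_usd(10_000, 2_000, cached_tokens=4_000):.6f}")
```

The token counts for a real request would normally come from the `usage` field of the API response; the figures in the example call are made up.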

Use the following code examples to integrate with our API:

from openai import OpenAI

# Point the OpenAI client at Novita's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai"
)

response = client.chat.completions.create(
    model="stepfun/step-3.7-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    # 256000 is the maximum output length listed for this model;
    # a smaller value is usually enough for short replies.
    max_tokens=256000,
    temperature=0.7
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
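
The sample above requests the maximum listed output length. The following is a minimal sketch of how one might cap `max_tokens` so that the prompt and the completion fit in the context window together. It assumes the common convention that prompt tokens plus completion tokens must not exceed the context length, which this page does not state explicitly. The helper name `clamp_max_tokens` is hypothetical; the two limits are the values listed on this page.

```python
CONTEXT_LENGTH = 262_144  # context length listed for this model, in tokens
MAX_OUTPUT = 256_000      # maximum output listed for this model, in tokens


def clamp_max_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return the largest usable max_tokens value that does not exceed `requested`.

    Assumption: prompt tokens and completion tokens share one context window.
    `prompt_tokens` must be supplied by the caller, for example from a
    tokenizer or from the `usage` field of an earlier response.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if requested < 1:
        raise ValueError("requested must be at least 1")

    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining < 1:
        raise ValueError("the prompt alone fills the context window")

    return min(requested, MAX_OUTPUT, remaining)
```

The result can be passed as `max_tokens` in the request; for a short prompt it simply returns the requested value.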

Information

Provider: StepFun

Quantization: fp8

Supported functionality

Context length: 262144 tokens

Maximum output: 256000 tokens

Serverless: Supported

Function calling: Supported

Structured output: Supported

Reasoning: Supported

Input capabilities: text, image, video

Output capabilities: text
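
Because image input is listed among the model's capabilities, a request could carry an image alongside text. The sketch below builds such a message using the content-part layout of the OpenAI Chat Completions API (`text` and `image_url` parts). Whether Novita's endpoint accepts this exact layout for this model is an assumption, and both the helper name `build_image_message` and the example URL are hypothetical.

```python
def build_image_message(prompt: str, image_url: str) -> dict:
    """Build one user message that combines a text prompt with an image.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    if not prompt:
        raise ValueError("prompt must not be empty")
    if not image_url:
        raise ValueError("image_url must not be empty")

    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


if __name__ == "__main__":
    # Hypothetical image URL, used only to show the resulting structure.
    message = build_image_message(
        "Describe the steps shown in this screenshot.",
        "https://example.com/screenshot.png",
    )
    print(message)
```

The returned dictionary would go into the `messages` list of a `client.chat.completions.create(...)` call in place of a plain-text user message.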
