StepFun

Step-3.7-Flash

stepfun/step-3.7-flash
Step-3.7-Flash is StepFun's flagship high-efficiency multimodal reasoning model, built on a sparse Mixture-of-Experts architecture (198B total / ~11B active parameters) that delivers high throughput and low latency for real-time agentic workflows and high-frequency calls. It natively understands images and video — no extra vision model needed in agent frameworks — and offers a 256K context window with three selectable reasoning levels (low/medium/high) to balance speed, cost and reasoning depth. With reliable multi-step tool calling, task decomposition and plan execution, it excels at coding, solution planning and vision-to-workflow tasks, producing complete plans in a single call.
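Since the model accepts images directly, a request can in principle carry an image alongside the text prompt. The sketch below builds such a message in the OpenAI-style "content parts" format. Whether this endpoint accepts the `image_url` part type for this model is an assumption, not something the listing confirms. The helper name `build_image_message` and the example URL are hypothetical.

```python
def build_image_message(prompt: str, image_url: str) -> dict:
    """Return one user message that combines a text prompt and an image reference.

    Assumption: the endpoint accepts OpenAI-style content parts
    ("text" and "image_url"); check the provider docs before relying on it.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Hypothetical image URL, used only for illustration.
message = build_image_message(
    "Describe the UI flow shown in this screenshot.",
    "https://example.com/screenshot.png",
)
```

The resulting dictionary would be passed as one element of the `messages` list in a chat completion request.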

Serverless API

stepfun/step-3.7-flash is available through Novita's serverless API, billed per token. The API can be called in several ways, including through OpenAI-compatible endpoints.

Serverless available

Run queries immediately and pay only for what you use

Input: $0.2 / M tokens
Cache read: $0.04 / M tokens
Output: $1.15 / M tokens
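As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. It assumes that cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate; the listing does not spell out the billing rule, so treat this as an approximation. The function name `estimate_cost` is hypothetical.

```python
# Prices in USD per million tokens, copied from the listing.
PRICE_INPUT = 0.20
PRICE_CACHE_READ = 0.04
PRICE_OUTPUT = 1.15


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens counts input tokens served from cache,
    billed at the cache-read rate instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * PRICE_INPUT
        + cached_tokens * PRICE_CACHE_READ
        + output_tokens * PRICE_OUTPUT
    )
    return total / 1_000_000


# 100,000 input tokens (20,000 of them cached) and 5,000 output tokens.
cost = estimate_cost(100_000, 5_000, cached_tokens=20_000)
print(f"${cost:.5f}")
```

Under these assumptions, output tokens dominate the bill for generation-heavy workloads, while caching mainly helps requests that resend long, repeated prompts.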

Use the following code example to integrate with our API:

from openai import OpenAI

# Point the OpenAI client at Novita's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai"
)

response = client.chat.completions.create(
    model="stepfun/step-3.7-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    max_tokens=256000,  # upper bound on generated tokens (the listed maximum output)
    temperature=0.7
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
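The listing marks Function Calling as supported. A minimal sketch of how that might look with the same client follows, using the OpenAI-style `tools` schema. The tool `get_weather`, its canned reply, and the helper `run_tool_call` are hypothetical and exist only for illustration. The network call is left as a comment so the block runs without an API key.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns canned data instead of a real lookup."""
    return f"Sunny in {city}"


# Tool description in the OpenAI-style function-calling schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Run one requested tool call; arguments arrive as a JSON-encoded string."""
    args = json.loads(arguments_json)
    return REGISTRY[name](**args)


# With a configured client, the request and dispatch would look roughly like:
# response = client.chat.completions.create(
#     model="stepfun/step-3.7-flash", messages=messages, tools=TOOLS
# )
# for call in response.choices[0].message.tool_calls or []:
#     result = run_tool_call(call.function.name, call.function.arguments)
```

Keeping the registry separate from the schema makes it easy to reject tool names the model was never offered, since an unknown name raises `KeyError`.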

Information

Provider: StepFun
Quantization: -

Supported functionality

Context length: 262144 tokens
Max output: 256000 tokens
Serverless: Supported
Function Calling: Supported
Structured Output: Supported
Reasoning: Supported
Input capabilities: text, image, video
Output capabilities: text
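The context length and maximum output limits interact: a long prompt leaves less room for the completion. The sketch below picks a `max_tokens` value that respects both limits. It assumes the prompt and the completion share the 262144-token context window, which is the usual convention but is not stated in the listing. The helper name `safe_max_tokens` is hypothetical, and the prompt token count must come from a tokenizer or a prior usage report.

```python
CONTEXT_LENGTH = 262_144  # listed context length, in tokens
MAX_OUTPUT = 256_000      # listed maximum output, in tokens


def safe_max_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Clamp a requested max_tokens to what fits after the prompt.

    Assumption: prompt and completion tokens share one context window.
    """
    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    return min(requested, MAX_OUTPUT, remaining)


# A 10,000-token prompt leaves 252,144 tokens of context for the reply.
print(safe_max_tokens(10_000))
```

For short prompts the output cap is the binding limit; for prompts longer than about 6,000 tokens, the remaining context becomes the tighter bound.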
