Gemma 4 26B A4B

google/gemma-4-26b-a4b-it

Gemma 4 26B A4B is built for developers who need scalable performance without sacrificing core capabilities. It retains the 256K-token context window of the 31B model, which makes it competitive for long-context RAG and for processing large, image-rich document sets. It supports the series' core features: native Thinking mode for multi-step reasoning, Interleaved Multimodal Input for mixed text-image workflows, and document/UI parsing. With native Function Calling and strong coding ability, the 26B A4B is a cost-effective engine for agentic workflows, visual automation, and multilingual applications across the 140+ languages covered in pre-training.
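As a rough illustration of the function-calling workflow mentioned above, the sketch below declares one tool in the OpenAI-style `tools` format and executes a tool call locally. It assumes the model is reached through an OpenAI-compatible chat endpoint that returns tool calls with a name and JSON-encoded arguments; `get_weather`, `TOOLS`, `REGISTRY`, and `run_tool_call` are hypothetical names made up for this example.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns canned data instead of calling a service."""
    return f"Sunny in {city}"


# Tool declaration in the OpenAI-style "tools" format (assumed to be accepted).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return a short weather summary for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call_id: str, name: str, arguments_json: str) -> dict:
    """Run one requested tool and build the 'tool' message to send back."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(arguments_json)  # arguments arrive as a JSON string
    result = REGISTRY[name](**kwargs)
    return {"role": "tool", "tool_call_id": tool_call_id, "content": result}


# Simulated tool call, shaped like what a model might request.
reply = run_tool_call("call_1", "get_weather", '{"city": "Paris"}')
print(reply["content"])
```

In a real exchange, `TOOLS` would be passed as the `tools` argument of the chat request, and the returned message would be appended to the conversation before asking the model to continue.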

Serverless API

google/gemma-4-26b-a4b-it is available through Novita's serverless API, billed per token. The API can be called in several ways, including through OpenAI-compatible endpoints.

Serverless available

Run queries immediately and pay only for what you use.

Input: $0.13 / M tokens

Output: $0.40 / M tokens
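The per-token prices above translate into a per-request cost with simple arithmetic. A minimal sketch, assuming the listed prices apply uniformly to every input and output token; `estimate_cost_usd` is a hypothetical helper, not part of any Novita SDK.

```python
# Listed serverless prices, in USD per million tokens.
INPUT_PRICE_PER_M = 0.13
OUTPUT_PRICE_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt that produces a 2,000-token answer.
print(f"${estimate_cost_usd(100_000, 2_000):.4f}")  # prints $0.0138
```

With the OpenAI Python client, the token counts for a finished request are typically available as `response.usage.prompt_tokens` and `response.usage.completion_tokens`, provided the endpoint returns usage data.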

Use the following code example to integrate with our API:

from openai import OpenAI

# Point the OpenAI client at Novita's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai"
)

response = client.chat.completions.create(
    model="google/gemma-4-26b-a4b-it",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    max_tokens=131072,  # the listed maximum output; lower it for short replies
    temperature=0.7
)

# Print the text of the first (and only) completion.
print(response.choices[0].message.content)
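Because the model accepts both text and image input, a user message can interleave the two. The sketch below builds such a message in the OpenAI-style content-parts format. It assumes the endpoint accepts `image_url` parts carrying base64 data URLs, which this page does not confirm; the helper names and the placeholder image bytes are hypothetical.

```python
import base64


def text_part(text: str) -> dict:
    """Build a text content part."""
    return {"type": "text", "text": text}


def image_part_from_bytes(data: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as a base64 data-URL content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def interleaved_user_message(*parts: dict) -> dict:
    """Combine content parts, in order, into one user message."""
    return {"role": "user", "content": list(parts)}


# Placeholder bytes stand in for real image files read from disk.
message = interleaved_user_message(
    text_part("Here is the login screen:"),
    image_part_from_bytes(b"\x89PNG placeholder 1"),
    text_part("And here is the settings screen:"),
    image_part_from_bytes(b"\x89PNG placeholder 2"),
    text_part("List the UI elements that appear on both."),
)
print([part["type"] for part in message["content"]])
```

The resulting dictionary can be placed in the `messages` list of the chat request in place of a plain-text user message.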

Information

Provider: Gemma
Quantization: bf16

Supported features

Context length: 262144 tokens
Max output: 131072 tokens
Serverless: Supported
Structured Output: Supported
Function Calling: Supported
Reasoning: Supported
Input capabilities: text, image
Output capabilities: text
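The context length and maximum output listed above constrain what `max_tokens` value a request can use. A minimal sketch, assuming the 262144-token context window is shared between the prompt and the generated output (a common convention, but not stated on this page); `clamp_max_tokens` is a hypothetical helper.

```python
CONTEXT_LENGTH = 262_144  # listed context length, in tokens
MAX_OUTPUT = 131_072      # listed maximum output, in tokens


def clamp_max_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return the largest max_tokens value that fits both listed limits."""
    if prompt_tokens < 0 or requested <= 0:
        raise ValueError("prompt_tokens must be >= 0 and requested must be > 0")
    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room for output")
    return min(requested, MAX_OUTPUT, remaining)


print(clamp_max_tokens(1_000))    # short prompt: the output cap applies
print(clamp_max_tokens(200_000))  # long prompt: the remaining context applies
```

Counting prompt tokens exactly requires the model's tokenizer; a rough estimate is enough for this check as long as it errs on the high side.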
