Pricing/GLM 4.5V
zai-org/glm-4.5v

GLM 4.5V

zai-org/glm-4.5v
Z.ai's GLM-4.5V sets a new standard in visual reasoning, achieving SOTA performance across 42 benchmarks among open-source models. Beyond benchmarks, it excels in real-world applications through hybrid training, enabling comprehensive visual understanding—from image/video analysis and GUI interaction to complex document processing and precise visual grounding. In China's GeoGuessr challenge, GLM-4.5V surpassed 99% of 21,000 human players within 16 hours, reaching 66th place in a week. Built on the GLM-4.5-Air foundation and inheriting GLM-4.1V-Thinking's approach, it leverages a 106B-parameter MoE architecture for scalable, efficient performance. This model bridges advanced AI research with practical deployment, delivering unmatched visual intelligence

Features

Serverless API

Docs

zai-org/glm-4.5v is available via Novita's serverless API, where you pay per token. There are several ways to call the API, including OpenAI-compatible endpoints with exceptional reasoning performance.

On-demand Deployments

Docs

On-demand deployments allow you to use zai-org/glm-4.5v on dedicated GPUs with high-performance serving stack with high reliability and no rate limits.

Available Serverless

Run queries immediately, pay only for usage

Input$0.6 / M Tokens
Cache Read$0.11 / M Tokens
Output$1.8 / M Tokens

Use the following code examples to integrate with our API:

1from openai import OpenAI
2
3client = OpenAI(
4    api_key="<Your API Key>",
5    base_url="https://api.novita.ai/openai"
6)
7
8response = client.chat.completions.create(
9    model="zai-org/glm-4.5v",
10    messages=[
11        {"role": "system", "content": "You are a helpful assistant."},
12        {"role": "user", "content": "Hello, how are you?"}
13    ],
14    max_tokens=16384,
15    temperature=0.7
16)
17
18print(response.choices[0].message.content)

Info

Provider
Zai-org
Quantization
fp8

Supported Functionality

Context Length
65536
Max Output
16384
Serverless
Supported
Function Calling
Supported
Structured Output
Supported
Reasoning
Supported
Input Capabilities
text, video, image
Output Capabilities
text

Everything you need to build production AI.

200+ models, on-demand GPUs, and secure agent runtimes — unified under one API. Free to start, scales as you grow.