Key capabilities
- OpenAI-compatible — Works as a drop-in replacement with the OpenAI SDK
- 1M context window — Solid, lossless 1M-token context that stays stable on long-horizon tasks, not just nominal length
- 128K max output — Generates up to 128K tokens in a single response
- Thinking mode — Chain-of-thought reasoning (forced thinking when enabled on GLM-5.2)
- Adjustable reasoning effort —
reasoning_efforttunes how hard the model thinks - Open-source SOTA coding — Top-ranked open model on long-horizon coding benchmarks, comparable to the strongest closed models
- Function calling, structured output & MCP — Robust tool use, JSON output, and MCP tool/data-source integration
- Streaming — Real-time token streaming via SSE
Output specifications
| Property | Value |
|---|---|
| Input modality | Text |
| Output modality | Text |
| Context window | 1M tokens |
| Max output tokens | 128K |
Quick example
Thinking mode
GLM-5.2 supports a chain-of-thought thinking mode. Whenthinking.type is enabled (the default), GLM-5.2 always thinks before answering. Set it to disabled to skip reasoning for lightweight tasks.
Python
reasoning_effort controls how hard the model reasons (effective only when thinking is enabled). GLM-5.2 accepts max, xhigh, high, medium, low, minimal, and none; for compatibility, none/minimal make the model skip thinking, low/medium map to high, and xhigh maps to max. Default: max.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
model | string | Yes | Must be glm-5.2 |
messages | array | Yes | List of { role, content } objects |
thinking | object | No | { "type": "enabled" | "disabled" }. Controls chain-of-thought. Default: enabled (forced thinking) |
reasoning_effort | string | No | max, xhigh, high, medium, low, minimal, none. Effective when thinking is enabled. Default: max |
max_tokens | integer | No | Maximum tokens to generate (up to 131072). Recommended ≥ 1024 |
temperature | float | No | 0–1. Controls randomness. Default: 1 |
top_p | float | No | Nucleus sampling threshold. Default: 0.95 |
stream | boolean | No | Enable SSE streaming. Default: false |
tools | array | No | Function/MCP tool definitions for tool use |
response_format | object | No | { "type": "json_object" } for structured JSON output |
stop | string / array | No | Sequences that stop generation |
API Reference
View the interactive API playground for GLM-5.2.