glm-5.2
Creates a model response for the given chat conversation.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
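As a minimal sketch, the header described above could be built like this. The helper name `auth_headers` and the `Content-Type` entry are illustrative assumptions; this reference only specifies the `Bearer <token>` form.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Build request headers carrying the Bearer token."""
    if not token:
        raise ValueError("an auth token is required")
    return {
        # Form taken from the reference: "Bearer <token>".
        "Authorization": f"Bearer {token}",
        # Assumption: the body is sent as JSON; the reference does not state a content type.
        "Content-Type": "application/json",
    }
```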
Body
Model ID
"glm-5.2"
A list of messages comprising the conversation so far.
[{ "role": "user", "content": "Hello!" }]
Controls chain-of-thought. GLM-5.2 performs forced thinking when enabled.
Controls how hard the model reasons; effective only when thinking is enabled. none/minimal skip thinking, low/medium map to high, xhigh maps to max.
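The mapping described above can be sketched as a small lookup. The function name `effective_effort` and the `thinking_enabled` flag are hypothetical. Treating `high` and `max` as mapping to themselves is an assumption, since the reference only spells out the other levels.

```python
from typing import Optional


def effective_effort(level: str, thinking_enabled: bool = True) -> Optional[str]:
    """Return the effort level actually applied, or None when thinking is skipped."""
    if not thinking_enabled:
        # The setting is effective only when thinking is enabled.
        return None
    if level in ("none", "minimal"):
        return None  # these levels skip thinking
    if level in ("low", "medium", "high"):
        return "high"  # low/medium map to high; high unchanged (assumption)
    if level in ("xhigh", "max"):
        return "max"  # xhigh maps to max; max unchanged (assumption)
    raise ValueError(f"unknown effort level: {level!r}")
```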
max, xhigh, high, medium, low, minimal, none
The maximum number of tokens to generate (up to 128K). Recommended >= 1024.
1 <= x <= 131072
Sampling temperature. Higher values make output more random.
0 <= x <= 11
Nucleus sampling threshold.
0.01 <= x <= 1
If true, stream partial message deltas using server-sent events (SSE).
A list of tools (functions or MCP) the model may call.
Output format. Use { "type": "json_object" } for structured JSON output.
Sequences where the model will stop generating further tokens.
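To tie the body parameters together, here is a hedged sketch that assembles a request body and checks the documented ranges before sending. This page does not show the field names, so `model`, `messages`, `max_tokens`, `top_p`, `stream`, `response_format`, and `stop` are assumptions borrowed from common chat-completion APIs, and `build_request_body` is a hypothetical helper. Only the model ID, the ranges, and the `{ "type": "json_object" }` value come from this reference.

```python
from typing import Any, Optional


def build_request_body(
    messages: list[dict[str, str]],
    *,
    max_tokens: int = 1024,
    top_p: Optional[float] = None,
    stream: bool = False,
    json_output: bool = False,
    stop: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a request body; all field names are assumed, not documented here."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    if not 1 <= max_tokens <= 131072:
        raise ValueError("max_tokens must satisfy 1 <= x <= 131072")
    if top_p is not None and not 0.01 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0.01 <= x <= 1")

    body: dict[str, Any] = {
        "model": "glm-5.2",  # model ID from the reference
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": stream,
    }
    if top_p is not None:
        body["top_p"] = top_p
    if json_output:
        # Value from the reference for structured JSON output.
        body["response_format"] = {"type": "json_object"}
    if stop:
        body["stop"] = stop
    return body
```

A call such as `build_request_body([{"role": "user", "content": "Hello!"}], json_output=True)` yields a dictionary that can be serialized with `json.dumps` and sent with any HTTP client. The endpoint URL is not given on this page and is left out on purpose.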