glm-5.2 - Anyfast

curl --request POST \
  --url https://www.anyfast.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "glm-5.2",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "thinking": {
    "type": "enabled"
  },
  "reasoning_effort": "max",
  "max_tokens": 65536,
  "temperature": 1,
  "top_p": 0.95,
  "stream": false,
  "tools": [
    {}
  ],
  "response_format": {
    "type": "text"
  },
  "stop": "<string>"
}
'

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 123,
  "model": "glm-5.2",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "reasoning_content": "<string>"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123
    },
    "total_tokens": 123
  }
}
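For readers who prefer Python over curl, the request above could be assembled roughly as follows. This is a minimal sketch using only the standard library; the helper names `build_request` and `send` are hypothetical and not part of any Anyfast SDK, and the payload includes only a subset of the documented fields.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://www.anyfast.ai/v1/chat/completions"


def build_request(token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for glm-5.2."""
    payload = {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled"},
        "reasoning_effort": "max",
        "max_tokens": 65536,
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping construction separate from sending makes it possible to inspect or log the request before any network traffic happens.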

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

model

enum<string>

required

Model ID

Available options:

glm-5.2

Example:

"glm-5.2"

messages

object[]

required

A list of messages comprising the conversation so far.

Minimum array length: 1

Example:

[{ "role": "user", "content": "Hello!" }]
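As an illustration of the minimum-length rule, a client might build the `messages` array from simple (role, content) pairs and reject an empty conversation before sending. The helper `make_messages` is hypothetical; only the `user` and `assistant` roles appear in this page, so no other roles are assumed here.

```python
from typing import Dict, Iterable, List, Tuple


def make_messages(turns: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Convert (role, content) pairs into the `messages` array.

    Raises ValueError for an empty conversation, mirroring the documented
    minimum array length of 1.
    """
    messages = [{"role": role, "content": content} for role, content in turns]
    if len(messages) < 1:
        raise ValueError("messages must contain at least one entry")
    return messages
```

Calling `make_messages([("user", "Hello!")])` yields the same array as the example above.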

thinking

object

Controls chain-of-thought reasoning. When enabled, GLM-5.2 always thinks before answering (forced thinking).

reasoning_effort

enum<string>

default: max

Controls how hard the model reasons; takes effect only when thinking is enabled. The values none and minimal skip thinking, low and medium map to high, and xhigh maps to max.

Available options:

max, xhigh, high, medium, low, minimal, none
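The mapping described for reasoning_effort can be written out as a small lookup table. This is a client-side sketch of the documented behavior, not the server's implementation; the function name `effective_effort` is hypothetical, and returning `None` to mean "no thinking is performed" is a convention chosen for this example.

```python
from typing import Optional

# Requested value -> effort level actually applied, per the description above.
# None means thinking is skipped.
_EFFORT_MAP = {
    "max": "max",
    "xhigh": "max",
    "high": "high",
    "medium": "high",
    "low": "high",
    "minimal": None,
    "none": None,
}


def effective_effort(requested: str = "max",
                     thinking_enabled: bool = True) -> Optional[str]:
    """Return the effort level the model would apply, or None if it skips thinking."""
    if requested not in _EFFORT_MAP:
        raise ValueError(f"unsupported reasoning_effort: {requested!r}")
    if not thinking_enabled:
        # The parameter is effective only when thinking is enabled.
        return None
    return _EFFORT_MAP[requested]
```

In practice this means only two distinct reasoning levels, high and max, are reachable once thinking is on.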

max_tokens

integer

default: 65536

The maximum number of tokens to generate, up to 128K (131072). A value of at least 1024 is recommended.

Required range: 1 <= x <= 131072

temperature

number

default: 1

Sampling temperature. Higher values make output more random.

Required range: 0 <= x <= 1

Example:

1

top_p

number

default: 0.95

Nucleus sampling threshold.

Required range: 0.01 <= x <= 1
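The documented ranges for max_tokens, temperature, and top_p can be checked on the client before a request is sent. The sketch below is one possible pre-flight check; `validate_sampling` is a hypothetical helper, and its defaults mirror the defaults listed on this page.

```python
from typing import List


def validate_sampling(max_tokens: int = 65536,
                      temperature: float = 1.0,
                      top_p: float = 0.95) -> List[str]:
    """Return a list of range violations; an empty list means all values are valid."""
    errors = []
    if not (isinstance(max_tokens, int) and 1 <= max_tokens <= 131072):
        errors.append("max_tokens must be an integer with 1 <= x <= 131072")
    if not 0 <= temperature <= 1:
        errors.append("temperature must satisfy 0 <= x <= 1")
    if not 0.01 <= top_p <= 1:
        errors.append("top_p must satisfy 0.01 <= x <= 1")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every invalid parameter at once.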

stream

boolean

default: false

If true, partial message deltas are streamed as server-sent events (SSE).
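This page does not show the shape of the streamed events, so the following sketch rests on assumptions borrowed from the common OpenAI-style convention: each event arrives as a `data: <json>` line, the stream ends with a `data: [DONE]` sentinel, and each chunk carries text under `choices[].delta.content`. None of these details are confirmed here, and the function names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the assumed `delta.content` fragments into the full reply."""
    parts = []
    for chunk in iter_sse_payloads(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            parts.append(delta.get("content") or "")
    return "".join(parts)
```

The functions accept any iterable of text lines, so they can be exercised with a plain list before being connected to a live HTTP response.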

tools

object[]

A list of tools (functions or MCP) the model may call.

response_format

object

Output format. Use { "type": "json_object" } for structured JSON output.
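A request that asks for JSON output, together with a matching parse step, might look like the sketch below. The helper names are hypothetical, and the assumption that the JSON document arrives as a string in `choices[0].message.content` is inferred from the response example on this page rather than stated explicitly.

```python
import json
from typing import Any, Dict


def json_mode_payload(prompt: str) -> Dict[str, Any]:
    """Build a request body that asks for structured JSON output."""
    return {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }


def parse_json_content(response: Dict[str, Any]) -> Any:
    """Decode the assistant's reply, assumed to be a JSON document in string form."""
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)
```

If the reply is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which a caller can catch and retry on.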

stop

Sequences where the model will stop generating further tokens.

Response

Completion generated successfully

id

string

required

Example:

"chatcmpl-abc123"

object

string

required

Example:

"chat.completion"

created

integer

required

Unix timestamp

model

string

required

Example:

"glm-5.2"

choices

object[]

required

usage

object

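To show how the response fields fit together, the sketch below pulls the reply text, the reasoning text, and the token counts out of a decoded response. It uses only the field names from the response example on this page; `summarize` is a hypothetical helper, and because usage is not marked as required, it is treated as optional.

```python
from typing import Any, Dict


def summarize(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the first choice's text and the token usage from a response."""
    message = response["choices"][0]["message"]
    usage = response.get("usage") or {}  # usage may be absent
    details = usage.get("prompt_tokens_details") or {}
    return {
        "content": message["content"],
        "reasoning": message.get("reasoning_content"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "cached_tokens": details.get("cached_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```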