POST /v1/chat/completions
Chat Completion
curl --request POST \
  --url https://www.anyfast.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "glm-5.2",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "thinking": {
    "type": "enabled"
  },
  "reasoning_effort": "max",
  "max_tokens": 65536,
  "temperature": 1,
  "top_p": 0.95,
  "stream": false,
  "tools": [
    {}
  ],
  "response_format": {
    "type": "text"
  },
  "stop": "<string>"
}
'
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 123,
  "model": "glm-5.2",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "reasoning_content": "<string>"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123
    },
    "total_tokens": 123
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
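
As a rough Python counterpart to the curl example, the sketch below prepares the same authenticated POST request with the standard library. The helper name `build_request` and the `YOUR_TOKEN` placeholder are illustrative assumptions; only the URL and the two headers come from this page.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://www.anyfast.ai/v1/chat/completions"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with Bearer authentication.

    `build_request` is a hypothetical helper, not part of any SDK.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# YOUR_TOKEN is a placeholder; substitute a real auth token.
req = build_request(
    "YOUR_TOKEN",
    {"model": "glm-5.2", "messages": [{"role": "user", "content": "Hello!"}]},
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

The request object is built without any network access, so it can be inspected or logged before it is sent.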

Body

application/json
model
enum<string>
required

Model ID

Available options: glm-5.2
Example:

"glm-5.2"

messages
object[]
required

A list of messages comprising the conversation so far.

Minimum array length: 1
Example:
[{ "role": "user", "content": "Hello!" }]
thinking
object

Controls chain-of-thought reasoning. When type is set to enabled, GLM-5.2 always thinks before answering (forced thinking).

reasoning_effort
enum<string>
default: max

Controls how much reasoning effort the model spends; takes effect only when thinking is enabled. none and minimal skip thinking; low and medium map to high; xhigh maps to max.

Available options: max, xhigh, high, medium, low, minimal, none
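
The mapping described for reasoning_effort can be mirrored on the client side to predict which level a request will effectively use. The sketch below is only an illustration of the documented rules; the function name `effective_effort` is hypothetical, and the actual mapping is applied by the server.

```python
VALID_EFFORTS = {"max", "xhigh", "high", "medium", "low", "minimal", "none"}
# Documented aliases: low/medium map to high, xhigh maps to max.
EFFORT_ALIASES = {"low": "high", "medium": "high", "xhigh": "max"}
# Documented: none/minimal skip thinking altogether.
SKIP_THINKING = {"none", "minimal"}


def effective_effort(reasoning_effort: str = "max", thinking_enabled: bool = True):
    """Return the effort level expected to apply, or None if thinking is skipped.

    Hypothetical helper that restates the documented mapping.
    """
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"unknown reasoning_effort: {reasoning_effort!r}")
    # reasoning_effort is effective only when thinking is enabled.
    if not thinking_enabled or reasoning_effort in SKIP_THINKING:
        return None
    return EFFORT_ALIASES.get(reasoning_effort, reasoning_effort)
```

For example, `effective_effort("medium")` yields `"high"`, while `effective_effort("minimal")` yields `None` because thinking is skipped.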
max_tokens
integer
default: 65536

The maximum number of tokens to generate (up to 128K, i.e. 131072). A value of at least 1024 is recommended.

Required range: 1 <= x <= 131072
temperature
number
default: 1

Sampling temperature. Higher values make output more random.

Required range: 0 <= x <= 1
Example:

1

top_p
number
default: 0.95

Nucleus sampling threshold.

Required range: 0.01 <= x <= 1
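
The documented ranges for max_tokens, temperature, and top_p can be checked before a request is sent, so that out-of-range values fail fast on the client. This is a minimal sketch; the helper name `validate_sampling` is an assumption, while the defaults and bounds are the ones listed above.

```python
def validate_sampling(
    max_tokens: int = 65536,
    temperature: float = 1.0,
    top_p: float = 0.95,
) -> dict:
    """Check the documented ranges and return the values as a payload fragment.

    Hypothetical helper; the API itself may report range errors differently.
    """
    if not isinstance(max_tokens, int) or not 1 <= max_tokens <= 131072:
        raise ValueError("max_tokens must be an integer with 1 <= x <= 131072")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must satisfy 0 <= x <= 1")
    if not 0.01 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0.01 <= x <= 1")
    return {"max_tokens": max_tokens, "temperature": temperature, "top_p": top_p}
```

The returned dictionary can be merged into the request body, for instance with `payload.update(validate_sampling(max_tokens=2048))`.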
stream
boolean
default: false

If true, stream partial message deltas using SSE.
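
This page does not show the format of the streamed events, so the sketch below rests on an assumption: that chunks follow the common OpenAI-style convention of `data: {json}` lines carrying `choices[].delta`, terminated by `data: [DONE]`. The function name `iter_deltas` and the sample lines are illustrative only and should be checked against a real stream.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the `delta` object of every choice found in a stream of SSE lines.

    ASSUMPTION: OpenAI-style chunk layout; not confirmed by this page.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # ASSUMPTION: end-of-stream marker
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            yield choice.get("delta", {})


# Hand-written sample lines standing in for a real response stream.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
text = "".join(delta.get("content") or "" for delta in iter_deltas(sample))
```

Because the parser only consumes an iterable of strings, it can be fed from any HTTP client that exposes the response body line by line.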

tools
object[]

A list of tools (functions or MCP) the model may call.

response_format
object

Output format. Use { "type": "json_object" } for structured JSON output.
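
A request for structured output, and the decoding of its result, might look like the sketch below. The helper names and the sample response are hypothetical; the only parts taken from this page are the `response_format` value and the response field layout shown in the example response.

```python
import json


def json_mode_payload(prompt: str) -> dict:
    """Build a request body that asks for JSON output (hypothetical helper)."""
    return {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }


def parse_json_content(response: dict):
    """Decode the assistant message of the first choice as JSON."""
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)  # raises json.JSONDecodeError on invalid JSON


# Hand-written sample response, used here instead of a live API call.
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": '{"city": "Paris"}'}}
    ]
}
result = parse_json_content(sample_response)
```

Letting `json.loads` raise on malformed content keeps the failure visible to the caller instead of silently returning a partial result.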

stop

Sequences where the model will stop generating further tokens.

Response

Completion generated successfully

id
string
required
Example:

"chatcmpl-abc123"

object
string
required
Example:

"chat.completion"

created
integer
required

Unix timestamp

model
string
required
Example:

"glm-5.2"

choices
object[]
required
usage
object
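
To show how the response fields fit together, the sketch below pulls the answer, the reasoning text, and the token counts out of a parsed response. The function name `summarize` and the sample values are assumptions; the field names follow the example response on this page, and `usage` is treated as optional because it is not marked required.

```python
def summarize(response: dict) -> dict:
    """Extract the main fields from a chat completion response (hypothetical helper)."""
    message = response["choices"][0]["message"]
    usage = response.get("usage") or {}  # usage is not marked required
    details = usage.get("prompt_tokens_details") or {}
    return {
        "content": message.get("content"),
        "reasoning": message.get("reasoning_content"),
        "total_tokens": usage.get("total_tokens"),
        "cached_tokens": details.get("cached_tokens", 0),
    }


# Hand-written sample mirroring the shape of the example response.
sample = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "glm-5.2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I help you today?",
                "reasoning_content": "The user greeted me.",
            },
        }
    ],
    "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 12,
        "prompt_tokens_details": {"cached_tokens": 0},
        "total_tokens": 20,
    },
}
summary = summarize(sample)
```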