POST /v1/chat/completions
Chat Completion
curl --request POST \
  --url https://www.anyfast.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "thinking": {},
  "max_completion_tokens": 2,
  "temperature": 1,
  "top_p": 0.5,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "response_format": {},
  "tools": [
    {}
  ],
  "stream": false,
  "stop": "<string>"
}
'
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 123,
  "model": "kimi-k2.6",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "reasoning_content": "<string>"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123
    }
  }
}
Kimi-K2.6 accepts text, image, and video input, supports both thinking and non-thinking modes, and supports tool use (function calling).

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
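
As a rough illustration of the header format described above, the following sketch builds the same two headers the curl example sends. The environment variable name `ANYFAST_API_KEY` is a hypothetical choice for this example, not something this page defines.

```python
import os


def build_headers(token: str) -> dict:
    """Return the headers used by the curl example: bearer auth plus JSON content type."""
    if not token:
        raise ValueError("an auth token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# ANYFAST_API_KEY is a hypothetical variable name; fall back to the
# placeholder from the curl example when it is unset or empty.
headers = build_headers(os.environ.get("ANYFAST_API_KEY") or "<token>")
```

Reading the token from the environment keeps it out of source control; any other secret store would work equally well.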

Body

application/json
model
enum<string>
required

Model ID

Available options:
kimi-k2.6
Example:

"kimi-k2.6"

messages
object[]
required

A list of messages comprising the conversation so far.

Minimum array length: 1
Example:
[{ "role": "user", "content": "Hello!" }]
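
Because messages carries the whole conversation so far, a multi-turn exchange is built by appending each new turn to the list before the next request. The sketch below is one way to do that; the helper names are hypothetical, and it only accepts the two roles that appear on this page ("user" and "assistant"), which may be narrower than what the API actually allows.

```python
# Only the roles shown on this page are accepted here; widen this set
# if the API documents additional roles.
KNOWN_ROLES = {"user", "assistant"}


def add_message(history, role, content):
    """Return a new message list with one more turn appended."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unexpected role: {role!r}")
    return history + [{"role": role, "content": content}]


def require_non_empty(messages):
    """Enforce the documented minimum array length of 1."""
    if len(messages) < 1:
        raise ValueError("messages must contain at least one entry")
    return messages


messages = add_message([], "user", "Hello!")
messages = add_message(messages, "assistant", "Hello! How can I help you today?")
messages = add_message(messages, "user", "Summarize our chat so far.")
require_non_empty(messages)
```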
thinking
object

Controls thinking mode and whether reasoning_content from previous turns is preserved. Default {"type": "enabled"}.
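
A minimal sketch of how the thinking object might be set in a request body. Only {"type": "enabled"} is confirmed by this page (as the default); the "disabled" value used below is an assumption, and `build_body` is a hypothetical helper.

```python
import json


def build_body(prompt, thinking=None):
    """Build a request body; omit `thinking` to let the server apply its default."""
    body = {
        "model": "kimi-k2.6",
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking is not None:
        body["thinking"] = thinking
    return body


# Relies on the documented default, {"type": "enabled"}.
default_body = build_body("Hello!")

# States the documented default explicitly.
explicit_body = build_body("Hello!", {"type": "enabled"})

# ASSUMPTION: "disabled" is not listed on this page; check the API
# before relying on it to switch to non-thinking mode.
non_thinking_body = build_body("Hello!", {"type": "disabled"})

print(json.dumps(explicit_body, indent=2))
```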

max_completion_tokens
integer

The maximum number of tokens to generate. (max_tokens is deprecated and not honored.)

Required range: x >= 1
temperature
number
default:1

Sampling temperature. Higher values make output more random.

Required range: 0 <= x <= 2
Example:

1

top_p
number

Nucleus sampling threshold. The model samples only from the smallest set of tokens whose cumulative probability reaches top_p.

Required range: 0 <= x <= 1
frequency_penalty
number
default:0

Penalizes repeated tokens based on their frequency in the text so far.

Required range: -2 <= x <= 2
presence_penalty
number
default:0

Penalizes tokens that have already appeared in the text.

Required range: -2 <= x <= 2
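
The numeric ranges documented above can be checked on the client before a request is sent. This is a sketch of such a check, not part of the API; `validate_params` is a hypothetical helper, and the ranges are copied from the parameter descriptions on this page.

```python
# Inclusive (low, high) bounds taken from the parameter descriptions above.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}


def validate_params(params):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")
    if "max_completion_tokens" in params and params["max_completion_tokens"] < 1:
        errors.append("max_completion_tokens must be >= 1")
    if "max_tokens" in params:
        errors.append("max_tokens is deprecated and not honored; use max_completion_tokens")
    return errors


ok = validate_params({"temperature": 1, "top_p": 0.5, "max_completion_tokens": 256})
bad = validate_params({"temperature": 3, "max_tokens": 100})
```

Here `ok` is an empty list, while `bad` reports the out-of-range temperature and the deprecated max_tokens field.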
response_format
object

Specifies the output format, e.g. JSON Mode.

tools
object[]

A list of tools the model may call (function calling).
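
This page does not spell out the shape of a tool object. The sketch below assumes the widely used OpenAI-style function schema ("type": "function" with a nested JSON Schema for parameters); treat that shape, and the `get_weather` tool itself, as assumptions for illustration only.

```python
import json

# ASSUMPTION: OpenAI-style tool schema; not confirmed by this page.
# get_weather is a hypothetical tool used only for illustration.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

body = {
    "model": "kimi-k2.6",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [weather_tool],
}

# Serialize exactly as it would be sent in the request body.
payload = json.dumps(body)
```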

stream
boolean
default:false

If true, the response is streamed as partial message deltas over server-sent events (SSE) instead of being returned as a single JSON object.
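
To illustrate how a streamed response could be consumed, the sketch below parses SSE `data:` lines and joins the text fragments. The `data:` prefix comes from the SSE format itself; the chunk shape (`choices[].delta.content`) and the `[DONE]` terminator follow the common OpenAI-style convention and are assumptions here, since this page does not document them. The sample lines are made up for the example.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of each SSE `data:` line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # ASSUMPTION: OpenAI-style end-of-stream marker
            break
        yield json.loads(data)


def collect_text(lines):
    """Concatenate the content fragments carried by each delta."""
    parts = []
    for event in iter_sse_events(lines):
        for choice in event.get("choices", []):
            # ASSUMPTION: fragments arrive under choices[].delta.content
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)


# Illustrative stream, not captured from the real service.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
]

text = collect_text(sample_stream)
```

The parser takes any iterable of text lines, so the same function can be fed from an HTTP client's line iterator or from a recorded stream in tests.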

stop

Sequences where the model will stop generating further tokens.

Response

Completion generated successfully

id
string
required
Example:

"chatcmpl-abc123"

object
string
required
Example:

"chat.completion"

created
integer
required

Unix timestamp of when the completion was created.

model
string
required
Example:

"kimi-k2.6"

choices
object[]
required
usage
object
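
As a sketch of how the response fields above fit together, the following reads the reply text, the optional reasoning_content, and the token counts from a decoded response. The field names match the response example on this page; the numeric values in the sample are illustrative, and `summarize` is a hypothetical helper.

```python
import json

# Same structure as the response example on this page; values are illustrative.
sample_response = json.loads("""
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 123,
  "model": "kimi-k2.6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "reasoning_content": "The user greeted me."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,
    "prompt_tokens_details": {"cached_tokens": 0}
  }
}
""")


def summarize(response):
    """Pull out the fields most callers need from a completion response."""
    message = response["choices"][0]["message"]
    # usage is not marked as required, so tolerate its absence.
    usage = response.get("usage", {})
    details = usage.get("prompt_tokens_details", {})
    return {
        "content": message["content"],
        "reasoning": message.get("reasoning_content"),
        "total_tokens": usage.get("total_tokens"),
        "cached_tokens": details.get("cached_tokens", 0),
    }


summary = summarize(sample_response)
```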