deepseek-v4-flash

curl --request POST \
  --url https://www.anyfast.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "deepseek-v4-flash",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "thinking": {},
  "max_tokens": 2,
  "response_format": {
    "type": "text"
  },
  "stop": "<string>",
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 1,
  "top_p": 1,
  "tools": [
    {
      "type": "function"
    }
  ],
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "user_id": "<string>"
}
'

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 123,
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "reasoning_content": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "function",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ]
      },
      "matched_stop": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123,
      "audio_tokens": 123,
      "text_tokens": 123
    },
    "completion_tokens_details": {
      "reasoning_tokens": 123,
      "accepted_prediction_tokens": 123,
      "rejected_prediction_tokens": 123
    }
  },
  "system_fingerprint": "<string>"
}

Authorizations

Authorization

string

header

required

Authentication via Bearer token. Create an API Key in the Anyfast console and pass it as Bearer YOUR_API_KEY in the Authorization header.
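
As a minimal sketch of the header format described above, the following Python builds (but does not send) a request using only the standard library. The `build_request` helper and the `ANYFAST_API_KEY` environment variable name are assumptions made for illustration; the endpoint URL is the one used in the curl example.

```python
import json
import os
import urllib.request

API_URL = "https://www.anyfast.ai/v1/chat/completions"  # taken from the curl example


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # ANYFAST_API_KEY is an assumed environment variable name.
    key = os.environ.get("ANYFAST_API_KEY", "YOUR_API_KEY")
    req = build_request(key, {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": "Hello!"}],
    })
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping the key in an environment variable rather than in source code is a common choice, not a requirement stated by this page.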

Body

application/json

model

enum<string>

required

Model ID

Available options:

deepseek-v4-flash

Example:

"deepseek-v4-flash"

messages

object[]

required

A list of messages comprising the conversation so far.

Minimum array length: 1

Example:

[{ "role": "user", "content": "Hello!" }]

thinking

object | null

Enable or disable thinking mode.

max_tokens

integer

The maximum number of tokens to generate.

Required range: x >= 1

response_format

object

Set to {"type": "json_object"} to enable JSON mode.
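
A rough sketch of how JSON mode might be used from Python follows. It assumes that, with JSON mode enabled, `choices[0].message.content` holds a JSON document encoded as a string; this page does not confirm that. The helper names and the sample response are hypothetical.

```python
import json


def json_mode_payload(prompt: str) -> dict:
    """Request body with JSON mode enabled (hypothetical helper)."""
    return {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }


def parse_json_content(response: dict) -> dict:
    """Decode the assistant message, assuming its content is a JSON string."""
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)


# Hand-written stand-in response, not real API output.
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": '{"greeting": "Hello!"}'}}
    ]
}
print(parse_json_content(sample))
```

If generation is cut off by `max_tokens`, the content may not be valid JSON, so wrapping `json.loads` in error handling is a reasonable precaution.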

stop

Sequences at which the model stops generating further tokens. Up to 16 strings.

stream

boolean

default:false

If true, stream partial message deltas using SSE.

stream_options

object

Options for streaming. Only valid when stream is true.
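
The sketch below shows one way a client could consume the SSE stream. It assumes the OpenAI-style conventions that each event arrives as a `data: <json>` line, that text arrives in `choices[].delta.content`, and that the stream ends with `data: [DONE]`; none of these details are confirmed by this page. The helper names and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE `data:` lines.

    Assumes an OpenAI-style stream terminated by `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate delta text; chunks with no choices (e.g. usage-only) add nothing."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Hand-written sample stream, not real API output.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    'data: {"choices": [], "usage": {"total_tokens": 5}}',
    "data: [DONE]",
]
print(collect_text(iter_sse_chunks(sample_lines)))
```

The usage-only chunk in the sample stands in for what `include_usage` might produce; its exact shape is an assumption.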

temperature

number

default:1

Sampling temperature. Higher values make output more random.

Required range: 0 <= x <= 2

Example:

1

top_p

number

default:1

Nucleus sampling threshold.

Required range: 0 <= x <= 1
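
The documented ranges for `max_tokens`, `temperature`, and `top_p` can be checked on the client before a request is sent. The following is a minimal sketch of such a check; `check_sampling_params` is a hypothetical helper, not part of any SDK.

```python
def check_sampling_params(payload: dict) -> list[str]:
    """Return a list of range violations (hypothetical client-side check)."""
    errors = []

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:
        errors.append("max_tokens must be >= 1")

    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        errors.append("temperature must be between 0 and 2")

    top_p = payload.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        errors.append("top_p must be between 0 and 1")

    return errors


print(check_sampling_params({"max_tokens": 0, "temperature": 2.5, "top_p": 1}))
```

Omitted parameters are treated as valid here because the server applies its own defaults.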

tools

object[]

A list of tools the model may call. Currently only functions are supported.

tool_choice

Controls which tool, if any, the model calls: none, auto, required, or a specific function.

Available options: none, auto, required
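
To illustrate how `tools` and `tool_choice` fit together with the `tool_calls` field of the response, here is a hedged sketch. The function schema (`name`, `description`, `parameters`) follows the OpenAI-style convention and is an assumption, since the child attributes of `tools` are not listed here. The sketch also assumes `function.arguments` is a JSON-encoded string. `get_weather` and the sample response are invented for the example.

```python
import json

# Assumed OpenAI-style function schema; `get_weather` is a made-up function.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",
}


def extract_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Return (function name, decoded arguments) for each tool call."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # Assumes `arguments` is a JSON-encoded string.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls


# Hand-written stand-in response, not real API output.
sample = {"choices": [{"message": {"role": "assistant", "content": None, "tool_calls": [
    {"id": "call_1", "type": "function",
     "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}
]}}]}
print(extract_tool_calls(sample))
```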

frequency_penalty

number

default:0

Deprecated by DeepSeek. Passed through but has no effect.

Required range: -2 <= x <= 2

presence_penalty

number

default:0

Deprecated by DeepSeek. Passed through but has no effect.

Required range: -2 <= x <= 2

user_id

string

Custom user ID for content safety and KVCache isolation.

Maximum string length: 512

Pattern: ^[a-zA-Z0-9\-_]+$
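
The length limit and pattern above can be enforced client-side before sending a request. This is a small sketch; `is_valid_user_id` is a hypothetical helper name.

```python
import re

# Pattern and length limit as documented for user_id.
USER_ID_PATTERN = re.compile(r"^[a-zA-Z0-9\-_]+$")
MAX_USER_ID_LENGTH = 512


def is_valid_user_id(user_id: str) -> bool:
    """Check a user_id against the documented pattern and maximum length."""
    if len(user_id) > MAX_USER_ID_LENGTH:
        return False
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    return USER_ID_PATTERN.fullmatch(user_id) is not None


print(is_valid_user_id("user-42_A"))
print(is_valid_user_id("user 42"))
```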

Response

Completion generated successfully

id

string

Example:

"chatcmpl-abc123"

object

string

Example:

"chat.completion"

created

integer

Unix timestamp

model

string

Example:

"deepseek-v4-flash"

choices

object[]

usage

object

system_fingerprint

string | null

Backend configuration fingerprint.
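
As a sketch of reading the response fields described above, the following Python pulls the assistant text, the optional `reasoning_content`, and token counts out of a decoded response body. The field names come from the example response; the `CompletionSummary` type, the `summarize` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompletionSummary:
    content: Optional[str]
    reasoning: Optional[str]
    total_tokens: int
    reasoning_tokens: int


def summarize(response: dict) -> CompletionSummary:
    """Extract the first choice and usage counts, tolerating missing fields."""
    message = response["choices"][0]["message"]
    usage = response.get("usage") or {}
    details = usage.get("completion_tokens_details") or {}
    return CompletionSummary(
        content=message.get("content"),
        reasoning=message.get("reasoning_content"),
        total_tokens=usage.get("total_tokens", 0),
        reasoning_tokens=details.get("reasoning_tokens", 0),
    )


# Hand-written stand-in response shaped like the example, not real API output.
sample = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "deepseek-v4-flash",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
    }],
    "usage": {
        "prompt_tokens": 3,
        "completion_tokens": 9,
        "total_tokens": 12,
        "completion_tokens_details": {"reasoning_tokens": 0},
    },
}
print(summarize(sample))
```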