Chat Compatible Thinking

curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "messages": [
    {}
  ],
  "reasoning_effort": "<string>",
  "temperature": 123,
  "top_p": 123,
  "max_tokens": 123,
  "stream": true
}
'

{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {}
  ],
  "usage": {}
}

POST /v1/chat/completions

Official documentation: https://ai.google.dev/gemini-api/docs/text-generation

Use Gemini models with thinking/reasoning capabilities through the standard OpenAI Chat Completions API format.

Overview

This endpoint enables Gemini’s thinking mode through the OpenAI-compatible chat format. By adding the reasoning_effort parameter, you can control how much reasoning the model performs before responding.

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY
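As a minimal sketch of how the header fits into a request, the snippet below builds (but does not send) a POST request using only the Python standard library. The URL is taken from the request example on this page; the `build_request` helper name is hypothetical and chosen for illustration only.

```python
import json
import urllib.request

# Endpoint as shown in the request example on this page.
API_URL = "https://www.anyfast.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Hypothetical helper: attach the Bearer token and JSON body to a POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "YOUR_API_KEY",
    {"model": "gemini-2.5-pro", "messages": [{"role": "user", "content": "Hello!"}]},
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a real key.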

Request Parameters

model (string, required)
The Gemini model ID, for example gemini-2.5-pro or gemini-2.5-flash.

messages (array, required)
A list of messages comprising the conversation.

reasoning_effort (string)
Controls the thinking effort level. Accepted values: low, medium, high.

temperature (number, default: 1)
Sampling temperature between 0 and 2.

top_p (number, default: 1)
Nucleus sampling parameter.

max_tokens (integer)
Maximum number of tokens to generate.

stream (boolean, default: false)
Whether to stream responses.
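The parameters above can be assembled into a request body along the following lines. This is a sketch, not the service's own client: the `build_payload` helper is hypothetical, and the client-side checks simply mirror the ranges and values stated on this page. How the server itself handles out-of-range values is not documented here.

```python
from typing import List, Optional

# Values listed for reasoning_effort on this page.
VALID_EFFORTS = {"low", "medium", "high"}


def build_payload(
    model: str,
    messages: List[dict],
    reasoning_effort: Optional[str] = None,
    temperature: float = 1.0,
    top_p: float = 1.0,
    max_tokens: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Hypothetical helper: validate inputs locally and return the JSON body as a dict."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if reasoning_effort is not None and reasoning_effort not in VALID_EFFORTS:
        raise ValueError("reasoning_effort must be one of low, medium, high")

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    # Optional fields are omitted entirely when not set.
    if reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


payload = build_payload(
    "gemini-2.5-flash",
    [{"role": "user", "content": "Hello!"}],
    reasoning_effort="low",
    temperature=0.1,
)
print(payload)
```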

Request Example

curl -X POST https://www.anyfast.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.1,
    "top_p": 1.0,
    "stream": false,
    "reasoning_effort": "low"
  }'

Response Example

{
  "id": "chatcmpl-gemini-think-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
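A response of this shape can be unpacked as sketched below. The `extract_reply` helper is hypothetical; it assumes a non-streaming response with at least one choice, as in the example above.

```python
import json
from typing import Tuple

# The response example from this page, as a JSON string.
RAW_RESPONSE = """
{
  "id": "chatcmpl-gemini-think-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18}
}
"""


def extract_reply(response: dict) -> Tuple[str, int]:
    """Hypothetical helper: return the first choice's text and the total token count."""
    first_choice = response["choices"][0]
    return first_choice["message"]["content"], response["usage"]["total_tokens"]


text, total_tokens = extract_reply(json.loads(RAW_RESPONSE))
print(text)          # Hello! How can I help you today?
print(total_tokens)  # 18
```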

Response Fields

id (string)
Unique identifier for the completion.

object (string)
Object type, which is chat.completion.

created (integer)
Unix timestamp of when the completion was created.

model (string)
The model used.

choices (array)
List of completion choices.

usage (object)
Usage statistics for the request.

Reasoning Effort Levels

Level	Description
`low`	Minimal thinking, faster responses
`medium`	Balanced thinking and speed
`high`	Maximum reasoning depth
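One way to see the speed trade-off between levels is to time the same prompt at each one. The sketch below assumes a caller-supplied `send` function (hypothetical, not part of this API) that posts a payload and returns the parsed response; a stub stands in for it here so the example runs without network access, which means the printed timings are not real measurements.

```python
import time
from typing import Callable, Dict

# Levels from the table above, in increasing order of reasoning depth.
EFFORT_LEVELS = ("low", "medium", "high")


def time_effort_levels(
    send: Callable[[dict], dict], model: str, prompt: str
) -> Dict[str, float]:
    """Send the same prompt once per effort level and record elapsed seconds."""
    timings: Dict[str, float] = {}
    for effort in EFFORT_LEVELS:
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "reasoning_effort": effort,
        }
        start = time.perf_counter()
        send(payload)
        timings[effort] = time.perf_counter() - start
    return timings


# Stub sender that only records what it was asked to send.
sent_efforts = []


def fake_send(payload: dict) -> dict:
    sent_efforts.append(payload["reasoning_effort"])
    return {"choices": []}


timings = time_effort_levels(fake_send, "gemini-2.5-flash", "Hello!")
for level, seconds in timings.items():
    print(f"{level}: {seconds:.6f}s")
```

Replacing `fake_send` with a function that actually posts to the endpoint would give real timings for a given prompt and model.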

Available Models

gemini-2.5-pro
gemini-2.5-flash
