POST /v1/chat/completions
Chat Compatible Thinking
Official documentation: https://ai.google.dev/gemini-api/docs/text-generation
Use Gemini models with thinking/reasoning capabilities through the standard OpenAI Chat Completions API format.

Overview

This endpoint enables Gemini’s thinking mode through the OpenAI-compatible chat format. By adding the reasoning_effort parameter, you can control how much reasoning the model performs before responding.

Authentication

All requests require a Bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
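
As a rough illustration of the header above, the following Python sketch builds (but does not send) a request carrying the Bearer token. The URL is taken from the request example on this page; the helper name build_request is hypothetical and only the standard library is used.

```python
import urllib.request

# URL copied from the request example on this page.
API_URL = "https://www.anyfast.ai/v1/chat/completions"


def build_request(api_key: str, body: bytes) -> urllib.request.Request:
    """Return a POST request with the Bearer token set; nothing is sent here.

    build_request is a hypothetical helper name used for illustration.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would perform the call; keeping construction separate makes the headers easy to inspect before anything goes over the network.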

Request Parameters

model (string, required)
  The Gemini model ID. For example: gemini-2.5-pro, gemini-2.5-flash.
messages (array, required)
  A list of messages comprising the conversation.
reasoning_effort (string)
  Controls the thinking effort level. Values: low, medium, high.
temperature (number, default: 1)
  Sampling temperature between 0 and 2.
top_p (number, default: 1)
  Nucleus sampling parameter.
max_tokens (integer)
  Maximum number of tokens to generate.
stream (boolean, default: false)
  Whether to stream responses.
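
The parameters above can be assembled and checked client-side before sending. The sketch below is one possible way to do that in Python; build_payload and ALLOWED_EFFORT are hypothetical names, and the checks cover only the constraints stated on this page (required fields, the three effort values, and the 0 to 2 temperature range). Whether the server rejects other values is not documented here.

```python
from typing import Optional

# Effort values listed on this page.
ALLOWED_EFFORT = {"low", "medium", "high"}


def build_payload(
    model: str,
    messages: list,
    reasoning_effort: Optional[str] = None,
    temperature: float = 1.0,
    top_p: float = 1.0,
    max_tokens: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Build a request body, validating only what this page documents."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if reasoning_effort is not None and reasoning_effort not in ALLOWED_EFFORT:
        raise ValueError(f"reasoning_effort must be one of {sorted(ALLOWED_EFFORT)}")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    # Optional fields are omitted entirely when unset.
    if reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

Serializing the result with json.dumps(payload).encode("utf-8") gives the bytes to place in the request body.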

Request Example

curl -X POST https://www.anyfast.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.1,
    "top_p": 1.0,
    "stream": false,
    "reasoning_effort": "low"
  }'
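
When stream is true, the reply arrives incrementally rather than as a single JSON object. This page does not document the streaming wire format, so the Python sketch below assumes the OpenAI-style server-sent events convention: lines prefixed with "data:", each carrying a JSON chunk with choices[].delta.content, and a final "data: [DONE]" line. The function name iter_content_deltas is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Feeding it the decoded lines of the HTTP response body and joining the yielded fragments reconstructs the full assistant message.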

Response Example

{
  "id": "chatcmpl-gemini-think-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}

Response Fields

id (string)
  Unique identifier for the completion.
object (string)
  Object type, which is chat.completion.
created (integer)
  Unix timestamp of when the completion was created.
model (string)
  The model used.
choices (array)
  List of completion choices.
usage (object)
  Usage statistics for the request.
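
To show how these fields fit together, here is a small Python sketch that pulls the assistant text, finish reason, and token total out of a non-streamed response. It assumes the nested shape shown in the response example on this page (choices[0].message.content, usage.total_tokens); summarize_completion is a hypothetical helper name.

```python
import json


def summarize_completion(raw: str) -> dict:
    """Extract the first choice's text and the token total from a response body."""
    data = json.loads(raw)
    choices = data.get("choices") or []
    first = choices[0] if choices else {}
    usage = data.get("usage") or {}
    return {
        "text": first.get("message", {}).get("content"),
        "finish_reason": first.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }
```

Missing fields come back as None instead of raising, which keeps the helper tolerant of responses that omit usage or return no choices.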

Reasoning Effort Levels

Level     Description
low       Minimal thinking, faster responses
medium    Balanced thinking and speed
high      Maximum reasoning depth

Available Models

  • gemini-2.5-pro
  • gemini-2.5-flash