Creates a model response for Gemini models using an OpenAI-compatible format.
Authorization: Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
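A minimal sketch of building the required headers. The endpoint accepts a Bearer token in the Authorization header; the helper name below is illustrative, not part of the API.

```python
def auth_headers(token: str) -> dict:
    """Build the request headers: Bearer auth plus a JSON content type."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```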
model: Gemini model ID to use for completion. One of: gemini-3.1-pro-preview, gemini-3.1-flash-image-preview, gemini-3.1-flash-lite-preview, gemini-3-pro-preview, gemini-3-pro-image-preview, gemini-3-flash-preview, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.0-flash. Default: "gemini-2.5-pro".
messages: A list of messages comprising the conversation so far. Minimum length: 1. Example: [{ "role": "user", "content": "Hello!" }]
max_tokens: The maximum number of tokens to generate in the chat completion. Required range: x >= 1.
temperature: Sampling temperature between 0 and 2. Required range: 0 <= x <= 2. Default: 1.
top_p: Nucleus sampling threshold. Required range: 0 <= x <= 1.
frequency_penalty: Penalizes new tokens based on their existing frequency in the text so far. Required range: -2 <= x <= 2.
presence_penalty: Penalizes new tokens based on whether they appear in the text so far. Required range: -2 <= x <= 2.
stream: If true, partial message deltas will be sent as server-sent events.
stop: Sequences where the API will stop generating further tokens.
n: How many chat completion choices to generate for each input message. Required range: x >= 1.
response_format: An object specifying the format that the model must output. Setting to {"type": "json_object"} enables JSON mode.