# Chat Completions

API reference for the `/v1/chat/completions` endpoint.
The Chat Completions endpoint creates a model response for a conversation.
## Endpoint

```
POST /v1/chat/completions
```
## Request Body

```json
{
  "model": "claude-3-opus-20240229",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
```
## Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID to use (e.g., `claude-3-opus-20240229`, `gpt-4-turbo`) |
| `messages` | array | Yes | Array of message objects (see below) |
| `temperature` | number | No | Sampling temperature (0 to 2). Default: 1 |
| `max_tokens` | integer | No | Maximum number of tokens to generate |
| `stream` | boolean | No | Enable streaming. Default: `false` |
| `top_p` | number | No | Nucleus sampling parameter (0 to 1) |
| `frequency_penalty` | number | No | Penalizes frequent tokens to reduce repetition (-2 to 2) |
| `presence_penalty` | number | No | Penalizes tokens already present to encourage new topics (-2 to 2) |
| `stop` | string or array | No | Sequences at which generation stops |
| `user` | string | No | End-user identifier for tracking |
| `tools` | array | No | Tools/functions the model may call |
| `tool_choice` | string or object | No | Controls how the model selects tools |
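
For example, a request combining several of the optional parameters (all values here are illustrative):

```json
{
  "model": "gpt-4-turbo",
  "messages": [
    {"role": "user", "content": "List three facts about the ocean."}
  ],
  "temperature": 0.2,
  "top_p": 0.9,
  "max_tokens": 256,
  "stop": ["\n\n"],
  "user": "user-1234"
}
```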
### Message Object

```json
{
  "role": "user",
  "content": "Hello!"
}
```

| Field | Type | Description |
|---|---|---|
| `role` | string | One of `system`, `user`, `assistant`, or `tool` |
| `content` | string | Message content |
| `name` | string | Optional name for the participant |
| `tool_calls` | array | Tool calls made by the assistant |
| `tool_call_id` | string | ID of the tool call being responded to |
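
A conversation is an ordered array of these objects, and the model receives the full history on every request. A minimal multi-turn sketch (contents are illustrative):

```json
{
  "model": "claude-3-opus-20240229",
  "messages": [
    {"role": "system", "content": "You are a terse math tutor."},
    {"role": "user", "content": "What is 12 * 12?"},
    {"role": "assistant", "content": "144."},
    {"role": "user", "content": "And squared?"}
  ]
}
```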
## Response

### Non-Streaming Response
```json
{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "claude-3-opus-20240229",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35
  }
}
```
### Streaming Response

When `stream: true`, responses are sent as Server-Sent Events:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
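
If you are not using an SDK, the stream can be consumed directly. Here is a minimal sketch using the `requests` library, with the same placeholder URL and key as the examples below. Note that the `[DONE]` sentinel is not JSON, so it must be checked before parsing:

```python
import json

import requests  # any streaming-capable HTTP client works

url = "https://api.yourdomain.com/v1/chat/completions"  # placeholder domain
headers = {"Authorization": "Bearer your-api-key"}      # placeholder key
payload = {
    "model": "claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": True,
}

with requests.post(url, json=payload, headers=headers, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue  # skip blank separators between events
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break  # end-of-stream sentinel, not JSON
        delta = json.loads(data)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)
print()
```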
## Examples

### Basic Request
```bash
curl -X POST https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
### With System Message
```bash
curl -X POST https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Write a Python function to reverse a string."}
    ],
    "temperature": 0.5
  }'
```
### Streaming
```bash
curl -N -X POST https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-1.5-pro",
    "messages": [
      {"role": "user", "content": "Tell me a short story."}
    ],
    "stream": true
  }'
```
### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.yourdomain.com/v1",
    api_key="your-api-key",
)

# Non-streaming
response = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Count to 10."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
### TypeScript Example

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.yourdomain.com/v1',
  apiKey: 'your-api-key',
});

// Non-streaming
const response = await client.chat.completions.create({
  model: 'claude-3-opus-20240229',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'claude-3-opus-20240229',
  messages: [{ role: 'user', content: 'Count to 10.' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```
## Tool/Function Calling

```json
{
  "model": "claude-3-opus-20240229",
  "messages": [
    {"role": "user", "content": "What's the weather in Tokyo?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```
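
If the model decides to call the tool, the assistant message in the response carries `tool_calls` instead of text, and `finish_reason` is typically `tool_calls`. The shape below follows the OpenAI convention this endpoint mirrors; the `id` is illustrative:

```json
{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\": \"Tokyo\"}"
      }
    }
  ]
}
```

Run the function yourself, then append a `tool` message whose `tool_call_id` references that `id` and send the conversation back; the model then produces the final answer:

```json
{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "{\"temp_c\": 18, \"conditions\": \"cloudy\"}"
}
```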
## Error Responses

### Invalid Model
```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "Model 'invalid-model' not found",
    "code": "model_not_found"
  }
}
```
### Rate Limited

```json
{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Please retry after 60 seconds.",
    "code": "rate_limit_exceeded"
  }
}
```
### Provider Error

```json
{
  "error": {
    "type": "api_error",
    "message": "Upstream provider returned an error",
    "code": "provider_error"
  }
}
```
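
A minimal retry sketch for rate limits using the Python SDK. It assumes the gateway maps `rate_limit_exceeded` to HTTP 429, which the SDK surfaces as `RateLimitError`; the backoff schedule is illustrative:

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI(base_url="https://api.yourdomain.com/v1", api_key="your-api-key")

def create_with_retry(max_attempts: int = 3, **kwargs):
    """Retry rate-limited requests with simple exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller handle it
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ... (illustrative)

response = create_with_retry(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Other error types shown above (`model_not_found`, `provider_error`) are not retryable, so the sketch lets them propagate.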