Chat Completions

API reference for the /v1/chat/completions endpoint.

The Chat Completions endpoint creates a model response for a conversation.

Endpoint

POST /v1/chat/completions

Request Body

{
  "model": "claude-3-opus-20240229",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}

Parameters

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| model | string | Yes | Model ID to use (e.g., claude-3-opus-20240229, gpt-4-turbo) |
| messages | array | Yes | Array of message objects |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Enable streaming. Default: false |
| top_p | number | No | Nucleus sampling parameter (0-1) |
| frequency_penalty | number | No | Reduce repetition (-2 to 2) |
| presence_penalty | number | No | Encourage new topics (-2 to 2) |
| stop | string/array | No | Stop sequences |
| user | string | No | User ID for tracking |
| tools | array | No | Available tools/functions |
| tool_choice | string/object | No | Tool selection mode |
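
The optional sampling parameters above can be combined in a single request. A sketch of a request body exercising them (the values shown are illustrative, not recommendations):

```json
{
  "model": "claude-3-opus-20240229",
  "messages": [
    {"role": "user", "content": "List three facts about Paris."}
  ],
  "temperature": 0.2,
  "top_p": 0.9,
  "presence_penalty": 0.5,
  "stop": ["\n\n"],
  "max_tokens": 200
}
```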

Message Object

{
  "role": "user",
  "content": "Hello!"
}

| Field | Type | Description |
| ----- | ---- | ----------- |
| role | string | system, user, assistant, or tool |
| content | string | Message content |
| name | string | Optional name for the participant |
| tool_calls | array | Tool calls made by assistant |
| tool_call_id | string | ID of tool call being responded to |
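
The tool-related fields appear only during function calling: tool_calls on an assistant message, and tool_call_id on the tool message that answers it. A sketch of the two shapes (the ID and arguments are illustrative):

```json
[
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"}
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "call_abc123",
    "content": "{\"temp_c\": 18, \"condition\": \"clear\"}"
  }
]
```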

Response

Non-Streaming Response

{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "claude-3-opus-20240229",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35
  }
}

Streaming Response

When stream is set to true, the response is delivered as Server-Sent Events. Each event carries a chat.completion.chunk object, and the stream is terminated by a final data: [DONE] event:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
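
If you are not using an SDK, each SSE line can be handled with a few lines of code. A minimal sketch of extracting the text delta from one event line (assuming the chunk format shown above):

```python
import json

def delta_content(line: str):
    """Return the text delta carried by one SSE line, or None if there is none."""
    if not line.startswith("data: "):
        return None  # blank lines and comments between events
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(payload)
    choices = chunk.get("choices") or []
    if not choices:
        return None  # some providers send chunks with no choices
    return choices[0].get("delta", {}).get("content")

line = 'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699000000,"model":"claude-3-opus-20240229","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}'
print(delta_content(line))  # Hello
```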

Examples

Basic Request

curl -X POST https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

With System Message

curl -X POST https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Write a Python function to reverse a string."}
    ],
    "temperature": 0.5
  }'

Streaming

curl -N -X POST https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-1.5-pro",
    "messages": [
      {"role": "user", "content": "Tell me a short story."}
    ],
    "stream": true
  }'

Python Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.yourdomain.com/v1",
    api_key="your-api-key"
)

# Non-streaming
response = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Count to 10."}],
    stream=True
)

for chunk in stream:
    # Some providers send chunks with an empty choices list (e.g., a final usage chunk)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

TypeScript Example

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.yourdomain.com/v1',
  apiKey: 'your-api-key',
});

// Non-streaming
const response = await client.chat.completions.create({
  model: 'claude-3-opus-20240229',
  messages: [{ role: 'user', content: 'Hello!' }],
});

console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'claude-3-opus-20240229',
  messages: [{ role: 'user', content: 'Count to 10.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Tool/Function Calling

{
  "model": "claude-3-opus-20240229",
  "messages": [
    {"role": "user", "content": "What's the weather in Tokyo?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
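
When the model decides to call a tool, the assistant message contains a tool_calls array rather than text. Your application executes the tool and appends a tool message with the result. A sketch of the local dispatch step (get_weather here is a stub returning fixed data; the tool-call ID is illustrative):

```python
import json

def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call locally and wrap the result as a 'tool' message."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])  # arguments arrive as a JSON string
    if name == "get_weather":
        result = {"location": args["location"], "temp_c": 18, "condition": "clear"}  # stubbed
    else:
        result = {"error": f"unknown tool: {name}"}
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],  # must match the assistant's tool_calls[i].id
        "content": json.dumps(result),
    }

tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"},
}
msg = run_tool_call(tool_call)
print(msg["role"])  # tool
```

Append msg to the messages array and call the endpoint again; the model then produces a final answer from the tool output.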

Error Responses

Invalid Model

{
  "error": {
    "type": "invalid_request_error",
    "message": "Model 'invalid-model' not found",
    "code": "model_not_found"
  }
}

Rate Limited

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Please retry after 60 seconds.",
    "code": "rate_limit_exceeded"
  }
}

Provider Error

{
  "error": {
    "type": "api_error",
    "message": "Upstream provider returned an error",
    "code": "provider_error"
  }
}
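
Rate-limit errors are transient, so clients typically retry with exponential backoff. A minimal sketch of that pattern (RateLimitError here is a local stand-in; the openai SDK raises its own openai.RateLimitError for these responses):

```python
import time

class RateLimitError(Exception):
    """Stand-in for the rate_limit_error response shown above."""

def with_retries(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Usage with a flaky stand-in for the API call:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return "ok"

print(with_retries(flaky, sleep=lambda _: None))  # ok
```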