Anthropic Claude
Configure and use Anthropic Claude models through InferXgate.
InferXgate provides full support for Anthropic’s Claude models, including the latest Claude 4 and Claude 3.5 series.
Configuration
Add your Anthropic API key to the environment:
ANTHROPIC_API_KEY=sk-ant-api03-...
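A missing or mistyped key is a common cause of failed requests, so it can help to check the variable before starting the gateway. The sketch below is only an illustration: the helper name `anthropic_key_status` is hypothetical, and the `sk-ant-` prefix check is a heuristic taken from the example value above, not a rule enforced by InferXgate.

```python
import os


def anthropic_key_status(env=os.environ):
    """Return a short status string for the ANTHROPIC_API_KEY variable.

    Hypothetical helper: the prefix check is a heuristic based on the
    example key format, not an official validation rule.
    """
    key = env.get("ANTHROPIC_API_KEY", "")
    if not key:
        return "missing"
    if not key.startswith("sk-ant-"):
        return "unexpected format"
    return "ok"


print(anthropic_key_status())
```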
Available Models
Claude 4 Series
| Model ID | Description | Context Window |
|---|---|---|
| claude-opus-4-5-20251101 | Most capable, extended thinking | 200K |
| claude-sonnet-4-5-20250929 | Advanced performance and speed | 200K |
| claude-opus-4-1-20250414 | Previous flagship model | 200K |
| claude-sonnet-4-20250514 | Balanced Claude 4 | 200K |
| claude-opus-4-20250514 | Claude 4 base | 200K |
Claude 3.5 & 3 Series
| Model ID | Description | Context Window |
|---|---|---|
| claude-3-5-haiku-20241022 | Fast and efficient | 200K |
| claude-3-haiku-20240307 | Legacy fast model | 200K |
Usage Example
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="claude-opus-4-5-20251101",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."}
    ],
    max_tokens=1000
)

print(response.choices[0].message.content)
Streaming
# Reuses the client created in the usage example
stream = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a poem."}],
    stream=True
)

# Print each text delta as it arrives
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
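If you need the full text after streaming finishes, the deltas can be accumulated instead of printed. The following is a minimal sketch, assuming the gateway emits chunks in the OpenAI streaming shape used above; `collect_text` and `_fake_chunk` are hypothetical names, and the fake chunks stand in for a real stream so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_text(stream):
    """Join all text deltas from an OpenAI-style chat completion stream."""
    parts = []
    for chunk in stream:
        # Some chunks may carry no choices (for example, a usage-only chunk).
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)


def _fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk (illustration only)."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


fake_stream = [
    _fake_chunk("Roses "),
    SimpleNamespace(choices=[]),  # chunk without choices is skipped
    _fake_chunk(None),            # chunk without text is skipped
    _fake_chunk("are red."),
]
full_text = collect_text(fake_stream)
print(full_text)
```

With a real request, pass the `stream` object returned by `client.chat.completions.create(..., stream=True)` to `collect_text`.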
Supported Features
- Chat completions
- Streaming responses
- System messages
- Multi-turn conversations
- Tool/function calling
- Vision (image inputs)
- Extended thinking (Opus 4.5)
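As an illustration of the tool calling and vision features, the sketch below builds a request body in the OpenAI chat format with an image part and a tool definition. It assumes InferXgate accepts OpenAI-style `tools` and `image_url` content parts for Claude models, which this page does not spell out. The `get_weather` tool and the `image_part` and `build_request` helpers are hypothetical.

```python
import base64


def image_part(image_bytes, media_type="image/png"):
    """Build an OpenAI-style image content part from raw bytes as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{encoded}"},
    }


# Hypothetical tool definition, for illustration only.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(model, question, image_bytes=None):
    """Assemble keyword arguments for client.chat.completions.create()."""
    content = [{"type": "text", "text": question}]
    if image_bytes is not None:
        content.append(image_part(image_bytes))
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "tools": [WEATHER_TOOL],
        "max_tokens": 500,
    }


# Placeholder bytes; in practice read them from an image file.
request = build_request(
    "claude-sonnet-4-5-20250929", "What is in this image?", b"\x89PNG"
)
# To send it: response = client.chat.completions.create(**request)
```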
Pricing
Costs are passed through from Anthropic:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Opus 4.5 | $5.00 | $25.00 |
| Claude Sonnet 4.5 | $3.00 | $15.00 |
| Claude Haiku 4.5 | $1.00 | $5.00 |
| Claude Opus 4.1 | $15.00 | $75.00 |
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Claude 3.5 Haiku | $0.80 | $4.00 |
| Claude 3 Haiku | $0.25 | $1.25 |
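To turn these rates into a per-request figure, multiply each token count by its rate and divide by one million. The sketch below copies three rows from the table above. `PRICES` and `estimate_cost` are hypothetical names, and it assumes you take the token counts from the response's `usage` field (`prompt_tokens` and `completion_tokens` in the OpenAI SDK), if the gateway populates it.

```python
# USD per 1M tokens as (input, output), copied from rows of the pricing table.
PRICES = {
    "claude-sonnet-4-5-20250929": (3.00, 15.00),
    "claude-3-5-haiku-20241022": (0.80, 4.00),
    "claude-3-haiku-20240307": (0.25, 1.25),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# 12,000 input tokens and 800 output tokens on Sonnet 4.5
cost = estimate_cost("claude-sonnet-4-5-20250929", 12_000, 800)
print(f"${cost:.4f}")
```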
Best Practices
- Use Haiku for simple tasks: it keeps costs down on classification and extraction.
- Use Opus 4.5 for complex reasoning: it is best for analysis, coding, and writing that benefit from extended thinking.
- Set an appropriate max_tokens: cap output length to avoid unnecessary token usage.
- Enable caching: reduce costs for repeated queries.
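The first three practices can be combined into a small routing table that picks a model and a `max_tokens` cap per task type. This is a sketch under assumptions: the task categories, the token caps, and the names `MODEL_BY_TASK` and `pick_model` are all hypothetical and should be tuned to your workload.

```python
# Hypothetical task categories mapped to (model ID, max_tokens cap).
MODEL_BY_TASK = {
    "classification": ("claude-3-5-haiku-20241022", 64),
    "extraction": ("claude-3-5-haiku-20241022", 256),
    "analysis": ("claude-opus-4-5-20251101", 2000),
    "coding": ("claude-opus-4-5-20251101", 4000),
}

# Fallback for tasks that are not listed above.
DEFAULT_CHOICE = ("claude-sonnet-4-5-20250929", 1000)


def pick_model(task):
    """Return (model ID, max_tokens) for a task type, or the default choice."""
    return MODEL_BY_TASK.get(task, DEFAULT_CHOICE)


model, max_tokens = pick_model("classification")
print(model, max_tokens)
```

The returned pair can be passed as the `model` and `max_tokens` arguments of `client.chat.completions.create`.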