API Reference Overview
Complete API reference for InferXgate endpoints, authentication, and response formats.
InferXgate provides an OpenAI-compatible REST API for interacting with multiple LLM providers through a single interface.
Base URL
http://localhost:3000/v1
For production deployments, replace it with your own domain:
https://api.yourdomain.com/v1
Authentication
All API requests require authentication via one of these methods:
Bearer Token (JWT)
curl -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..." \
https://api.yourdomain.com/v1/chat/completions
API Key
curl -H "Authorization: Bearer ix-api-key-..." \
https://api.yourdomain.com/v1/chat/completions
X-API-Key Header
curl -H "X-API-Key: ix-api-key-..." \
https://api.yourdomain.com/v1/chat/completions
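Any of the three header styles can be exercised from a script for a quick check. Below is a minimal Python sketch using the requests library, assuming the /v1/models endpoint listed further down and a placeholder key:

```python
import requests

BASE_URL = "https://api.yourdomain.com/v1"
API_KEY = "ix-api-key-..."  # placeholder; substitute a real API key or JWT

# Bearer-style header (accepts either a JWT or an API key)
resp = requests.get(f"{BASE_URL}/models",
                    headers={"Authorization": f"Bearer {API_KEY}"})
print(resp.status_code)

# X-API-Key header, as shown above
resp = requests.get(f"{BASE_URL}/models",
                    headers={"X-API-Key": API_KEY})
print(resp.status_code)
```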
Available Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Create a chat completion |
| /v1/models | GET | List available models |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |
| /stats | GET | Usage statistics |
| /auth/register | POST | Register a new user |
| /auth/login | POST | Log in and receive a JWT token |
| /auth/keys | GET/POST | Manage API keys |
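The non-chat endpoints are plain HTTP calls. A short sketch, assuming /health is served at the root without the /v1 prefix (as listed above) and does not require authentication, while /v1/models does:

```python
import requests

HOST = "https://api.yourdomain.com"
headers = {"Authorization": "Bearer ix-api-key-..."}  # placeholder key

print(requests.get(f"{HOST}/health").status_code)                  # liveness check
print(requests.get(f"{HOST}/v1/models", headers=headers).json())   # available models
```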
Request Format
All POST requests should include:
Content-Type: application/json
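Put together, a minimal chat completion request from Python looks like the sketch below (the model name and key are placeholders; the requests library sets this header automatically when the json= argument is used):

```python
import requests

resp = requests.post(
    "https://api.yourdomain.com/v1/chat/completions",
    headers={"Authorization": "Bearer ix-api-key-..."},  # placeholder key
    json={
        "model": "claude-3-opus-20240229",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.status_code, resp.json())
```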
Response Format
Success Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1699000000,
"model": "claude-3-opus-20240229",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 15,
"total_tokens": 25
}
}
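Given that shape, the assistant text lives under choices[0].message.content and the token counts under usage. A small sketch using the example values above:

```python
# Success response with the documented shape (values copied from the example above)
response = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 10, "completion_tokens": 15, "total_tokens": 25},
}

print(response["choices"][0]["message"]["content"])  # Hello! How can I help you today?
print(response["usage"]["total_tokens"])             # 25
```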
Error Response
{
"error": {
"type": "invalid_request_error",
"message": "Invalid API key provided",
"code": "invalid_api_key"
}
}
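A convenient pattern is to check for the error envelope before touching choices. A sketch of that (the formatting of the returned message is arbitrary, not part of the API):

```python
def extract_error(body: dict):
    """Return a readable message if the body carries the documented error envelope, else None."""
    err = body.get("error")
    if err is None:
        return None
    return f"{err.get('type')} ({err.get('code')}): {err.get('message')}"

# Using the example error body above
print(extract_error({
    "error": {
        "type": "invalid_request_error",
        "message": "Invalid API key provided",
        "code": "invalid_api_key",
    }
}))  # invalid_request_error (invalid_api_key): Invalid API key provided
```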
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing authentication |
| 403 | Forbidden - Insufficient permissions |
| 404 | Not Found - Endpoint doesn’t exist |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - Provider error |
| 503 | Service Unavailable |
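429, 500, 502, and 503 are the usual candidates for retrying. A minimal retry-with-backoff sketch (the retry set and backoff schedule are choices of this example, not documented gateway behaviour):

```python
import time
import requests

RETRYABLE = {429, 500, 502, 503}

def post_with_retry(url, headers, payload, attempts=4):
    """POST the payload, retrying with exponential backoff on retryable status codes."""
    for attempt in range(attempts):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code not in RETRYABLE or attempt == attempts - 1:
            return resp
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
```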
Rate Limiting
Rate limits are applied per API key:
- Default: 60 requests per minute
- Headers: Rate limit information is returned in the response headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 55
X-RateLimit-Reset: 1699000060
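Clients can read these headers to pace themselves instead of waiting for a 429. A sketch that sleeps until the reset time once the remaining budget hits zero (X-RateLimit-Reset is assumed to be a Unix timestamp, as the example value suggests):

```python
import time
import requests

def paced_post(url, headers, payload):
    """POST, then pause until the window resets if the rate-limit budget is exhausted."""
    resp = requests.post(url, headers=headers, json=payload)
    if resp.headers.get("X-RateLimit-Remaining") == "0":
        reset_at = int(resp.headers.get("X-RateLimit-Reset", "0"))
        time.sleep(max(0.0, reset_at - time.time()))
    return resp
```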
Streaming
Streaming responses use Server-Sent Events (SSE):
curl -N https://api.yourdomain.com/v1/chat/completions \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-opus-20240229",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}'
Response:
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"!"}}]}
data: [DONE]
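The same stream can be consumed from Python by iterating over the response lines, stripping the data: prefix, and stopping at the [DONE] sentinel. A sketch with error handling omitted (key and model are placeholders):

```python
import json
import requests

resp = requests.post(
    "https://api.yourdomain.com/v1/chat/completions",
    headers={"Authorization": "Bearer ix-api-key-..."},  # placeholder key
    json={
        "model": "claude-3-opus-20240229",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)
print()
```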
SDKs and Libraries
InferXgate is compatible with standard OpenAI SDKs:
Python
from openai import OpenAI
client = OpenAI(
base_url="https://api.yourdomain.com/v1",
api_key="your-api-key"
)
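With the client pointed at the gateway as above, the standard SDK calls work unchanged, for example:

```python
completion = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)
```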
TypeScript/JavaScript
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'https://api.yourdomain.com/v1',
apiKey: 'your-api-key',
});
Go
import "github.com/sashabaranov/go-openai"
config := openai.DefaultConfig("your-api-key")
config.BaseURL = "https://api.yourdomain.com/v1"
client := openai.NewClientWithConfig(config)
Next Steps
- Chat Completions - Main completion endpoint
- Models - List available models
- Authentication - Set up authentication