API Reference Overview
Complete API reference for InferXgate endpoints, authentication, and response formats.
InferXgate provides an OpenAI-compatible REST API for interacting with multiple LLM providers through a single interface.
Base URL
http://localhost:3000/v1
For production deployments, replace it with your own domain:
https://api.yourdomain.com/v1
Authentication
All API requests require authentication via one of these methods:
Bearer Token (JWT)
curl -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..." \
https://api.yourdomain.com/v1/chat/completions
API Key
curl -H "Authorization: Bearer ix-api-key-..." \
https://api.yourdomain.com/v1/chat/completions
X-API-Key Header
curl -H "X-API-Key: ix-api-key-..." \
https://api.yourdomain.com/v1/chat/completions
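Any of the three header styles can be exercised from a script for a quick check. Below is a minimal Python sketch using the requests library, assuming the /v1/models endpoint listed further down and a placeholder key:

```python
import requests

BASE_URL = "https://api.yourdomain.com/v1"
API_KEY = "ix-api-key-..."  # placeholder; substitute a real API key or JWT

# Bearer-style header (accepts either a JWT or an API key)
resp = requests.get(f"{BASE_URL}/models",
                    headers={"Authorization": f"Bearer {API_KEY}"})
print(resp.status_code)

# X-API-Key header, as shown above
resp = requests.get(f"{BASE_URL}/models",
                    headers={"X-API-Key": API_KEY})
print(resp.status_code)
```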
Available Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Create a chat completion |
| /v1/models | GET | List available models |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |
| /stats | GET | Usage statistics |
| /auth/register | POST | Register a new user |
| /auth/login | POST | Log in and receive a JWT token |
| /auth/keys | GET/POST | Manage API keys |
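The non-chat endpoints are plain HTTP calls. A short sketch, assuming /health is served at the root without the /v1 prefix (as listed above) and does not require authentication, while /v1/models does:

```python
import requests

HOST = "https://api.yourdomain.com"
headers = {"Authorization": "Bearer ix-api-key-..."}  # placeholder key

print(requests.get(f"{HOST}/health").status_code)                  # liveness check
print(requests.get(f"{HOST}/v1/models", headers=headers).json())   # available models
```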
Request Format
All POST requests should include:
Content-Type: application/json
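Put together, a minimal chat completion request from Python looks like the sketch below (the model name and key are placeholders; the requests library sets this header automatically when the json= argument is used):

```python
import requests

resp = requests.post(
    "https://api.yourdomain.com/v1/chat/completions",
    headers={"Authorization": "Bearer ix-api-key-..."},  # placeholder key
    json={
        "model": "claude-3-opus-20240229",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.status_code, resp.json())
```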
Response Format
Success Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1699000000,
"model": "claude-3-opus-20240229",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 15,
"total_tokens": 25
}
}
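Given that shape, the assistant text lives under choices[0].message.content and the token counts under usage. A small sketch using the example values above:

```python
# Success response with the documented shape (values copied from the example above)
response = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 10, "completion_tokens": 15, "total_tokens": 25},
}

print(response["choices"][0]["message"]["content"])  # Hello! How can I help you today?
print(response["usage"]["total_tokens"])             # 25
```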
Error Response
{
"error": {
"type": "invalid_request_error",
"message": "Invalid API key provided",
"code": "invalid_api_key"
}
}
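A convenient pattern is to check for the error envelope before touching choices. A sketch of that (the formatting of the returned message is arbitrary, not part of the API):

```python
def extract_error(body: dict):
    """Return a readable message if the body carries the documented error envelope, else None."""
    err = body.get("error")
    if err is None:
        return None
    return f"{err.get('type')} ({err.get('code')}): {err.get('message')}"

# Using the example error body above
print(extract_error({
    "error": {
        "type": "invalid_request_error",
        "message": "Invalid API key provided",
        "code": "invalid_api_key",
    }
}))  # invalid_request_error (invalid_api_key): Invalid API key provided
```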
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing authentication |
| 403 | Forbidden - Insufficient permissions |
| 404 | Not Found - Endpoint doesn’t exist |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - Provider error |
| 503 | Service Unavailable |
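429, 500, 502, and 503 are the usual candidates for retrying. A minimal retry-with-backoff sketch (the retry set and backoff schedule are choices of this example, not documented gateway behaviour):

```python
import time
import requests

RETRYABLE = {429, 500, 502, 503}

def post_with_retry(url, headers, payload, attempts=4):
    """POST the payload, retrying with exponential backoff on retryable status codes."""
    for attempt in range(attempts):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code not in RETRYABLE or attempt == attempts - 1:
            return resp
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
```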
Rate Limiting
Rate limits are applied per API key:
- Default: 60 requests per minute
- Headers: Rate limit information is returned in the response headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 55
X-RateLimit-Reset: 1699000060
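Clients can read these headers to pace themselves instead of waiting for a 429. A sketch that sleeps until the reset time once the remaining budget hits zero (X-RateLimit-Reset is assumed to be a Unix timestamp, as the example value suggests):

```python
import time
import requests

def paced_post(url, headers, payload):
    """POST, then pause until the window resets if the rate-limit budget is exhausted."""
    resp = requests.post(url, headers=headers, json=payload)
    if resp.headers.get("X-RateLimit-Remaining") == "0":
        reset_at = int(resp.headers.get("X-RateLimit-Reset", "0"))
        time.sleep(max(0.0, reset_at - time.time()))
    return resp
```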
Streaming
Streaming responses use Server-Sent Events (SSE):
curl -N https://api.yourdomain.com/v1/chat/completions \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-opus-20240229",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}'
Response:
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"!"}}]}
data: [DONE]
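The same stream can be consumed from Python by iterating over the response lines, stripping the data: prefix, and stopping at the [DONE] sentinel. A sketch with error handling omitted (key and model are placeholders):

```python
import json
import requests

resp = requests.post(
    "https://api.yourdomain.com/v1/chat/completions",
    headers={"Authorization": "Bearer ix-api-key-..."},  # placeholder key
    json={
        "model": "claude-3-opus-20240229",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)
print()
```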
SDKs and Libraries
InferXgate is compatible with standard OpenAI SDKs:
Python
from openai import OpenAI
client = OpenAI(
base_url="https://api.yourdomain.com/v1",
api_key="your-api-key"
)
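With the client pointed at the gateway as above, the standard SDK calls work unchanged, for example:

```python
completion = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)
```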
TypeScript/JavaScript
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'https://api.yourdomain.com/v1',
apiKey: 'your-api-key',
});
Go
import "github.com/sashabaranov/go-openai"
config := openai.DefaultConfig("your-api-key")
config.BaseURL = "https://api.yourdomain.com/v1"
client := openai.NewClientWithConfig(config)
Next Steps
- Chat Completions - Main completion endpoint
- Models - List available models
- Authentication - Set up authentication