API Reference Overview

Complete API reference for InferXgate endpoints, authentication, and response formats.

InferXgate provides an OpenAI-compatible REST API for interacting with multiple LLM providers through a single interface.

Base URL

http://localhost:3000/v1

For production deployments, replace the host with your own domain:

https://api.yourdomain.com/v1

Authentication

All API requests require authentication via one of these methods:

Bearer Token (JWT)

curl -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..." \
  https://api.yourdomain.com/v1/chat/completions

API Key

curl -H "Authorization: Bearer ix-api-key-..." \
  https://api.yourdomain.com/v1/chat/completions

X-API-Key Header

curl -H "X-API-Key: ix-api-key-..." \
  https://api.yourdomain.com/v1/chat/completions
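
The header styles shown above can be produced by one small helper. This is a minimal sketch: `auth_headers` and its `use_x_api_key` parameter are hypothetical names chosen for illustration, not part of InferXgate.

```python
def auth_headers(credential: str, use_x_api_key: bool = False) -> dict[str, str]:
    """Build the authentication header for a JWT or an API key.

    `credential` is either a JWT or an InferXgate API key (hypothetical helper).
    """
    if not credential:
        raise ValueError("credential must not be empty")
    if use_x_api_key:
        # The examples above only show API keys in the X-API-Key header.
        return {"X-API-Key": credential}
    # JWTs and API keys are both shown as bearer tokens above.
    return {"Authorization": f"Bearer {credential}"}
```

The returned dictionary can be merged into the headers of any HTTP client.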

Available Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Create a chat completion |
| /v1/models | GET | List available models |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |
| /stats | GET | Usage statistics |
| /auth/register | POST | Register a new user |
| /auth/login | POST | Log in and get a JWT |
| /auth/keys | GET/POST | Manage API keys |
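
The paths in the table can be turned into full URLs with a small lookup. This sketch assumes that every path, including `/health` and `/auth/...`, is resolved against the server root (scheme and host) rather than against the `/v1` base URL; the table suggests this but does not state it. The dictionary keys and `endpoint_url` are illustrative names.

```python
# Paths and methods copied from the endpoint table; the keys are illustrative.
ENDPOINTS = {
    "chat_completions": ("/v1/chat/completions", ("POST",)),
    "models": ("/v1/models", ("GET",)),
    "health": ("/health", ("GET",)),
    "metrics": ("/metrics", ("GET",)),
    "stats": ("/stats", ("GET",)),
    "register": ("/auth/register", ("POST",)),
    "login": ("/auth/login", ("POST",)),
    "keys": ("/auth/keys", ("GET", "POST")),
}


def endpoint_url(server_root: str, name: str) -> str:
    """Join the server root (without /v1) with a path from the table."""
    path, _methods = ENDPOINTS[name]
    return server_root.rstrip("/") + path
```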

Request Format

All POST requests should include this header:

Content-Type: application/json
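
As a rough illustration, a POST request with this header can be built with the Python standard library. The function name `build_post_request` is hypothetical, and the bearer-token header is just one of the authentication options described earlier.

```python
import json
import urllib.request


def build_post_request(url: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")  # JSON body as UTF-8 bytes
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends the request.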

Response Format

Success Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "claude-3-opus-20240229",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
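
A client usually needs only the assistant text and the token usage from this structure. The sketch below reads both from the first choice; `extract_reply` is a hypothetical helper name, and it assumes the response has the shape shown above.

```python
import json


def extract_reply(response_body: str) -> tuple[str, int]:
    """Return (assistant text, total tokens) from a success response."""
    data = json.loads(response_body)
    first_choice = data["choices"][0]  # the sample response has one choice
    text = first_choice["message"]["content"]
    total_tokens = data["usage"]["total_tokens"]
    return text, total_tokens
```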

Error Response

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid API key provided",
    "code": "invalid_api_key"
  }
}
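
One way to handle this payload is to convert it into an exception that carries the `type`, `code`, and `message` fields. This is a sketch only: `InferXgateError` and `parse_error` are hypothetical names, and the fallback values for missing fields are an assumption.

```python
import json


class InferXgateError(Exception):
    """Hypothetical exception wrapping the error object shown above."""

    def __init__(self, status: int, error_type: str, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.error_type = error_type
        self.code = code
        self.message = message


def parse_error(status: int, response_body: str) -> InferXgateError:
    """Build an exception from an HTTP status and an error response body."""
    error = json.loads(response_body).get("error", {})
    return InferXgateError(
        status,
        error.get("type", "unknown"),  # fallback values are assumptions
        error.get("code", "unknown"),
        error.get("message", ""),
    )
```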

HTTP Status Codes

| Code | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing authentication |
| 403 | Forbidden - Insufficient permissions |
| 404 | Not Found - Endpoint doesn’t exist |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - Provider error |
| 503 | Service Unavailable |
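
Clients often retry some of these codes and fail fast on others. The sketch below treats 429, 500, 502, and 503 as retryable and uses capped exponential backoff. That split and the delay values are assumptions made for illustration; the reference does not prescribe a retry policy.

```python
# Assumption: these codes indicate transient conditions worth retrying.
RETRYABLE_STATUS = frozenset({429, 500, 502, 503})


def should_retry(status: int) -> bool:
    """Return True if a request with this status might succeed on retry."""
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff in seconds for a zero-based attempt, capped at `cap`."""
    return min(cap, base * 2 ** attempt)
```

Client errors such as 400, 401, 403, and 404 are not retried here because repeating the same request would normally fail the same way.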

Rate Limiting

Rate limits are applied per API key:

  • Default: 60 requests per minute
  • Headers: Rate limit information is returned in the following response headers:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 55
X-RateLimit-Reset: 1699000060
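
A client can use these headers to pause before it hits the limit. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the reference does not state. `seconds_until_reset` is a hypothetical helper, and it expects a plain dictionary with the header names spelled exactly as above.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, in seconds."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # requests are still available in this window
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = float(headers["X-RateLimit-Reset"])
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```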

Streaming

Streaming responses use Server-Sent Events (SSE):

curl -N https://api.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Response:

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"!"}}]}

data: [DONE]
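
The event stream above can be reduced to plain text by reading each `data:` line, stopping at `[DONE]`, and collecting the `delta.content` fragments. This is a minimal sketch with a hypothetical function name; it handles only the fields shown in the sample and takes an iterable of already-decoded lines.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from SSE lines until the [DONE] marker."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the fragments, for example with `"".join(iter_stream_text(lines))`, rebuilds the full message.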

SDKs and Libraries

InferXgate is compatible with standard OpenAI SDKs:

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.yourdomain.com/v1",
    api_key="your-api-key"
)
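
With a client configured this way, a chat completion can be requested through the SDK's `chat.completions.create` method. The helper below is a sketch: `ask` is a hypothetical name, and it takes the client as a parameter so it works with any object that exposes the same interface.

```python
def ask(client, model: str, prompt: str) -> str:
    """Send one user message and return the assistant's reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # Read the text of the first choice, as in the success response format.
    return response.choices[0].message.content
```

For example, `ask(client, "claude-3-opus-20240229", "Hello")` would request the model used in the samples on this page.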

TypeScript/JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
    baseURL: 'https://api.yourdomain.com/v1',
    apiKey: 'your-api-key',
});

Go

import "github.com/sashabaranov/go-openai"

config := openai.DefaultConfig("your-api-key")
config.BaseURL = "https://api.yourdomain.com/v1"
client := openai.NewClientWithConfig(config)

Next Steps