Metrics

API reference for the /metrics endpoint, which exposes Prometheus metrics.

InferXgate exposes Prometheus-compatible metrics for monitoring and alerting.

Endpoint

GET /metrics

Available Metrics

Request Metrics

# Total requests, labeled by provider, model, and status
inferxgate_requests_total{provider="anthropic",model="claude-3-opus",status="success"} 1234

# Request latency histogram (cumulative buckets; le is the upper bound in seconds)
inferxgate_request_duration_seconds_bucket{provider="openai",le="0.1"} 500
inferxgate_request_duration_seconds_bucket{provider="openai",le="0.5"} 900
inferxgate_request_duration_seconds_bucket{provider="openai",le="1.0"} 950

# Active requests
inferxgate_active_requests{provider="anthropic"} 5
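
Because the histogram buckets are cumulative, a latency percentile can be estimated from them by linear interpolation, which is the same idea behind Prometheus's histogram_quantile function. The following is a minimal Python sketch rather than anything shipped with InferXgate: the function name estimate_quantile is hypothetical, and it assumes the three openai buckets shown above are the complete set (a real scrape would normally also include an le="+Inf" bucket).

```python
def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets: list of (upper_bound_seconds, cumulative_count), sorted by bound.
    The target rank is located in a bucket, then interpolated linearly
    between that bucket's lower and upper bound.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        raise ValueError("buckets must not be empty")
    total = buckets[-1][1]  # assumption: the last bucket holds every observation
    if total == 0:
        raise ValueError("histogram has no observations")

    rank = q * total
    prev_bound, prev_count = 0.0, 0  # the lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            fraction = (rank - prev_count) / in_bucket
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return buckets[-1][0]


# Bucket values copied from the sample output above.
openai_buckets = [(0.1, 500), (0.5, 900), (1.0, 950)]
print(round(estimate_quantile(0.5, openai_buckets), 3))  # 0.095
print(round(estimate_quantile(0.9, openai_buckets), 3))  # 0.455
```

The estimate is only as precise as the bucket boundaries: any percentile that falls between 0.1 s and 0.5 s is interpolated across that whole range.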

Token Metrics

# Tokens processed, split by type (prompt or completion)
inferxgate_tokens_total{provider="anthropic",type="prompt"} 500000
inferxgate_tokens_total{provider="anthropic",type="completion"} 250000

Cache Metrics

# Cache hits and misses
inferxgate_cache_hits_total 8000
inferxgate_cache_misses_total 2000

# Cache size in bytes (the sample value 104857600 is 100 MiB)
inferxgate_cache_size_bytes 104857600
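
The hit and miss counters are usually more useful combined into a hit ratio. A small Python sketch of that arithmetic, using the sample values above (the helper name cache_hit_ratio is hypothetical, not part of InferXgate):

```python
def cache_hit_ratio(hits, misses):
    """Return hits / (hits + misses), or None when there were no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


# Sample values from the metrics output above.
print(cache_hit_ratio(8000, 2000))  # 0.8
```

Both counters are cumulative since process start, so this gives a lifetime ratio. For a ratio over a recent window, apply the same formula to the difference between two scrapes.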

Provider Health

# Provider status (1=healthy, 0=unhealthy)
inferxgate_provider_healthy{provider="anthropic"} 1
inferxgate_provider_healthy{provider="openai"} 1

Grafana Dashboard

Import the pre-built dashboard:

# Dashboard ID: 12345
# Or download from: https://github.com/jasmedia/inferxgate/grafana

Example

curl https://api.yourdomain.com/metrics
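
The response is plain text in the Prometheus exposition format, so it can also be read from a script. The sketch below uses only the Python standard library. The names parse_metrics and fetch_metrics are hypothetical, and the parser is deliberately simplified: it skips comment lines, ignores optional timestamps, and does not unescape label values. For production use, a maintained parser such as the one in the prometheus_client package is likely the better choice.

```python
import re
import urllib.request

# metric_name{optional="labels"} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url):
    """Download the raw metrics text from a /metrics URL."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")
```

For example, parse_metrics(fetch_metrics("https://api.yourdomain.com/metrics")) returns one tuple per sample, such as ("inferxgate_cache_hits_total", {}, 8000.0).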

Prometheus Configuration

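scrape_configs:
  - job_name: 'inferxgate'
    metrics_path: /metrics    # Prometheus default, stated explicitly
    scrape_interval: 15s      # overrides the global scrape interval for this job
    static_configs:
      - targets: ['inferxgate:3000']    # host:port of the InferXgate instance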