Metrics
API reference for the /metrics endpoint.
InferXgate exposes Prometheus-compatible metrics at this endpoint for monitoring and alerting.
Endpoint
GET /metrics
Available Metrics
Request Metrics
# Total requests
inferxgate_requests_total{provider="anthropic",model="claude-3-opus",status="success"} 1234
# Request latency histogram
inferxgate_request_duration_seconds_bucket{provider="openai",le="0.1"} 500
inferxgate_request_duration_seconds_bucket{provider="openai",le="0.5"} 900
inferxgate_request_duration_seconds_bucket{provider="openai",le="1.0"} 950
# Active requests
inferxgate_active_requests{provider="anthropic"} 5
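The `le` buckets of a Prometheus histogram are cumulative: each bucket counts every request at or below its upper bound. The following is a minimal Python sketch, assuming the sample values shown above, that recovers how many requests fell into each interval. The helper name `per_bucket_counts` is hypothetical and not part of InferXgate.

```python
def per_bucket_counts(cumulative):
    """Convert cumulative (le, count) pairs into per-interval counts."""
    counts = []
    previous = 0
    # Sort by upper bound so each bucket is compared with the one below it.
    for le, total in sorted(cumulative):
        counts.append((le, total - previous))
        previous = total
    return counts


# Values copied from the sample output for provider="openai".
buckets = [(0.1, 500), (0.5, 900), (1.0, 950)]
intervals = per_bucket_counts(buckets)
```

With the sample values, 500 requests finished within 0.1 s, 400 took between 0.1 s and 0.5 s, and 50 took between 0.5 s and 1.0 s.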
Token Metrics
# Tokens processed
inferxgate_tokens_total{provider="anthropic",type="prompt"} 500000
inferxgate_tokens_total{provider="anthropic",type="completion"} 250000
Cache Metrics
# Cache hit/miss
inferxgate_cache_hits_total 8000
inferxgate_cache_misses_total 2000
# Cache size
inferxgate_cache_size_bytes 104857600
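The hit and miss counters can be combined into a cache hit ratio. The following is a rough Python sketch, assuming the two counters are exposed without labels as in the sample above. The function names `parse_unlabelled` and `cache_hit_ratio` are illustrative only.

```python
def parse_unlabelled(text):
    """Parse 'name value' lines, skipping blank lines and # comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.rsplit(" ", 1)
        samples[name] = float(value)
    return samples


def cache_hit_ratio(samples):
    """Return hits / (hits + misses), or 0.0 when no lookups were recorded."""
    hits = samples["inferxgate_cache_hits_total"]
    misses = samples["inferxgate_cache_misses_total"]
    total = hits + misses
    return hits / total if total else 0.0


# Sample values copied from the output above.
sample = """# Cache hit/miss
inferxgate_cache_hits_total 8000
inferxgate_cache_misses_total 2000
"""
ratio = cache_hit_ratio(parse_unlabelled(sample))
```

For the sample values this gives 8000 / (8000 + 2000), a hit ratio of 0.8. Because both metrics are cumulative counters, a ratio over a recent time window would need the difference between two scrapes rather than the raw totals.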
Provider Health
# Provider status (1=healthy, 0=unhealthy)
inferxgate_provider_healthy{provider="anthropic"} 1
inferxgate_provider_healthy{provider="openai"} 1
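A health check script could scan the endpoint output for providers reporting 0. The sketch below is one possible approach in Python. It assumes `provider` is the only label on `inferxgate_provider_healthy`, as in the sample above, and the sample text is altered to mark one provider unhealthy purely for illustration. The name `unhealthy_providers` is hypothetical.

```python
import re

# Matches lines such as: inferxgate_provider_healthy{provider="openai"} 1
HEALTH_LINE = re.compile(
    r'^inferxgate_provider_healthy\{provider="([^"]+)"\}\s+(\S+)$'
)


def unhealthy_providers(metrics_text):
    """Return the providers whose health gauge is 0."""
    down = []
    for line in metrics_text.splitlines():
        match = HEALTH_LINE.match(line.strip())
        if match and float(match.group(2)) == 0:
            down.append(match.group(1))
    return down


# Illustrative input: openai is set to 0 here, unlike the sample above.
sample = """# Provider status (1=healthy, 0=unhealthy)
inferxgate_provider_healthy{provider="anthropic"} 1
inferxgate_provider_healthy{provider="openai"} 0
"""
down = unhealthy_providers(sample)
```

For production alerting, a Prometheus alert rule on the same gauge is usually a better fit than a polling script; the sketch only shows how the values are read.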
Grafana Dashboard
Import the pre-built dashboard:
# Dashboard ID: 12345
# Or download from: https://github.com/jasmedia/inferxgate/grafana
Example
curl https://api.yourdomain.com/metrics
Prometheus Configuration
scrape_configs:
  - job_name: 'inferxgate'
    static_configs:
      - targets: ['inferxgate:3000']
    metrics_path: /metrics
    scrape_interval: 15s