Changelog

Track the latest updates, improvements, and fixes to InferXgate.

v0.3.0 (latest)

2025-12-01

Added
  • Semantic caching with configurable similarity threshold
  • Cost-based load balancing strategy
  • Azure OpenAI provider support
  • Request/response logging with configurable verbosity
  • Prometheus histogram metrics for latency percentiles
Improved
  • Redis connection pooling for better performance
  • Streaming response handling with proper backpressure
  • Error messages now include provider-specific details
  • Documentation site with comprehensive guides
Fixed
  • Memory leak in long-running streaming connections
  • Race condition in concurrent cache writes
  • Incorrect token counting for Claude 3 models
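
For readers unfamiliar with the semantic caching entry above, the sketch below shows one common way a similarity threshold can gate cache hits. It is an illustration under stated assumptions, not InferXgate's implementation: the `SemanticCache` class, its `threshold` parameter, and the use of cosine similarity over prompt embeddings are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by prompt embeddings (illustration only)."""

    def __init__(self, threshold=0.9):
        # Assumed meaning of the threshold: minimum similarity for a cache hit.
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((list(embedding), response))

    def get(self, embedding):
        # Linear scan for the most similar cached prompt.
        best_score, best_response = 0.0, None
        for cached_embedding, response in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        # Only return the cached response if it clears the threshold.
        return best_response if best_score >= self.threshold else None
```

In this sketch, a higher threshold means fewer but safer cache hits, and a lower threshold means more hits at the risk of returning a response for a prompt that only loosely matches.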

v0.2.0

2025-11-15

Added
  • Google Gemini provider support
  • JWT authentication with custom claims
  • Virtual API keys for team management
  • Health check endpoint with provider status
  • Configurable request timeout per provider
Improved
  • Reduced memory usage by 40% under load
  • Better error handling for network timeouts
  • Cache key generation is now deterministic
Fixed
  • Streaming responses not flushing properly
  • Rate limiter not resetting at window boundary
  • Config file hot-reload causing brief downtime
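
As background for the deterministic cache key entry above, here is a minimal sketch of one way to derive a stable key from a request. It is an assumption for illustration, not the project's actual scheme: the `cache_key` function and the choice of canonical JSON plus SHA-256 are hypothetical.

```python
import hashlib
import json


def cache_key(request: dict) -> str:
    """Hypothetical helper: hash a canonical JSON form of the request."""
    # sort_keys and fixed separators make the serialization independent of
    # dictionary insertion order and whitespace.
    canonical = json.dumps(
        request, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this approach, two requests that carry the same fields in a different order map to the same key, so they can share a cached response.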

v0.1.0

2025-11-01

Added
  • Initial release of InferXgate
  • OpenAI-compatible API endpoint
  • Anthropic Claude provider support
  • OpenAI provider support
  • Redis response caching
  • Round-robin load balancing
  • Basic rate limiting
  • Prometheus metrics endpoint
  • Docker image and docker-compose setup
