Changelog
Track the latest updates, improvements, and fixes to InferXgate.
v0.3.0
2025-12-01 (latest)
Added
- Semantic caching with configurable similarity threshold
- Cost-based load balancing strategy
- Azure OpenAI provider support
- Request/response logging with configurable verbosity
- Prometheus histogram metrics for latency percentiles
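As a rough illustration of what a configurable similarity threshold means for semantic caching, here is a minimal sketch. It is not InferXgate's implementation: the `SemanticCache` class, its `threshold` parameter, and the use of cosine similarity over precomputed embeddings are all assumptions made for this example.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by prompt embeddings."""

    def __init__(self, threshold: float = 0.9) -> None:
        # Assumed meaning: minimum similarity required to count as a hit.
        self.threshold = threshold
        self._entries: list[tuple[list[float], str]] = []

    def put(self, embedding: list[float], response: str) -> None:
        self._entries.append((embedding, response))

    def get(self, embedding: list[float]) -> Optional[str]:
        # Linear scan for the most similar stored prompt.
        best_score = 0.0
        best_response: Optional[str] = None
        for stored, response in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_response = score, response
        # Only return the best match if it clears the threshold.
        return best_response if best_score >= self.threshold else None
```

In this sketch, raising the threshold trades cache hit rate for stricter matching: a near-identical prompt still hits, while a loosely related one falls through to the provider.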
Improved
- Redis connection pooling for better performance
- Streaming response handling with proper backpressure
- Error messages now include provider-specific details
- Documentation site with comprehensive guides
Fixed
- Memory leak in long-running streaming connections
- Race condition in concurrent cache writes
- Incorrect token counting for Claude 3 models
v0.2.0
2025-11-15
Added
- Google Gemini provider support
- JWT authentication with custom claims
- Virtual API keys for team management
- Health check endpoint with provider status
- Configurable request timeout per provider
Improved
- Reduced memory usage by 40% under load
- Better error handling for network timeouts
- Cache key generation is now deterministic
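One common way to make cache keys deterministic is to hash a canonical serialization of the request. The sketch below assumes that approach; the `cache_key` function and its parameters are hypothetical and do not describe InferXgate's actual key format.

```python
import hashlib
import json


def cache_key(model: str, messages: list[dict], **params) -> str:
    """Return a stable SHA-256 hex digest for a request.

    Sorting dictionary keys and fixing the separators means that two
    logically identical requests serialize to the same bytes, regardless
    of the order in which fields were supplied.
    """
    payload = {"model": model, "messages": messages, "params": params}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this scheme, reordering keyword arguments or message fields does not change the key, while changing any value (such as the model name) does.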
Fixed
- Streaming responses not flushing properly
- Rate limiter not resetting at window boundary
- Config file hot-reload causing brief downtime
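For readers unfamiliar with window-boundary resets, the following sketch shows a fixed-window rate limiter that clears its counter when a new window starts. It is only an illustration under assumed semantics: the `FixedWindowLimiter` class is hypothetical, and the changelog does not say which algorithm InferXgate uses.

```python
from typing import Optional


class FixedWindowLimiter:
    """Hypothetical fixed-window limiter: at most `limit` requests per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._window_index: Optional[int] = None
        self._count = 0

    def allow(self, now: float) -> bool:
        # Identify the window that `now` falls into.
        index = int(now // self.window_seconds)
        if index != self._window_index:
            # A new window has started, so the counter resets.
            self._window_index = index
            self._count = 0
        if self._count < self.limit:
            self._count += 1
            return True
        return False
```

Passing the timestamp in explicitly (rather than reading the clock inside `allow`) keeps the boundary behaviour easy to test.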
v0.1.0
2025-11-01
Added
- Initial release of InferXgate
- OpenAI-compatible API endpoint
- Anthropic Claude provider support
- OpenAI provider support
- Redis response caching
- Round-robin load balancing
- Basic rate limiting
- Prometheus metrics endpoint
- Docker image and docker-compose setup