LLM Gateway#
Purpose#
Build a gateway layer that sits in front of one or more large language model (LLM) APIs and provides a unified interface for routing, authentication, rate limiting, logging, and cost control.
Status#
Active
Goals#
- Provide a single endpoint that abstracts multiple LLM providers (OpenAI, Anthropic, etc.)
- Enforce API key management and per-consumer authentication
- Track usage, costs, and latency per consumer and model
- Support request routing rules (e.g. route by model name, fall back to another provider on error)
- Enable rate limiting and budget caps per consumer
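The routing goal (route by model name, fall back on error) could be sketched roughly as follows. This is only an illustration under assumptions: the `Route` type, the prefix-matching rule, and the `dispatch` function are hypothetical names, not part of the project, and a provider is modelled as a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a provider is any callable that takes a request dict and
# returns a response dict, raising an exception when the call fails.
Provider = Callable[[dict], dict]


@dataclass
class Route:
    model_prefix: str      # hypothetical rule: match on model-name prefix
    providers: List[str]   # ordered: primary first, then fallbacks


def pick_route(routes: List[Route], model: str) -> Route:
    """Return the first route whose prefix matches the requested model."""
    for route in routes:
        if model.startswith(route.model_prefix):
            return route
    raise LookupError(f"no route for model {model!r}")


def dispatch(routes: List[Route], providers: Dict[str, Provider], request: dict) -> dict:
    """Try each provider on the matching route in order until one succeeds."""
    route = pick_route(routes, request["model"])
    last_error = None
    for name in route.providers:
        try:
            return providers[name](request)
        except Exception as exc:  # a real gateway would likely narrow this
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

Keeping the fallback order inside the route keeps the rule declarative, so it could later be loaded from the config file mentioned under Scope.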
Scope#
Included#
- HTTP proxy layer that accepts OpenAI-compatible requests
- Provider adapters for at least Anthropic and OpenAI
- Request and response logging with token counts
- Per-consumer API key issuance and validation
- Rate limiting and monthly spend caps
- Admin interface or config file for managing consumers and rules
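One possible shape for the rate limiting and monthly spend caps listed above is a token bucket per consumer plus a simple spend counter. This is a minimal sketch, not the project's design: `TokenBucket`, `SpendCap`, and their fields are hypothetical, state is held in memory only, and month rollover is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TokenBucket:
    """Allow `rate` requests per second, with bursts of up to `capacity`."""
    rate: float
    capacity: float
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity
        self.updated = self.clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


@dataclass
class SpendCap:
    """Track spend against a monthly limit (reset logic omitted)."""
    monthly_limit_usd: float
    spent_usd: float = 0.0

    def can_spend(self, estimated_usd: float) -> bool:
        return self.spent_usd + estimated_usd <= self.monthly_limit_usd

    def record(self, actual_usd: float) -> None:
        self.spent_usd += actual_usd
```

In this sketch the gateway would call `allow()` and `can_spend()` before forwarding a request and `record()` once the actual cost is known. A multi-instance deployment would need shared storage for both counters.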
Not included#
- Fine-tuning or model hosting
- Training data management
- End-user chat UI
- Multi-region deployment (deferred)
Tasks#
- Define the API contract (request/response schema)
- Stand up a basic HTTP proxy that forwards to one provider
- Add provider adapters (Anthropic, OpenAI)
- Implement consumer key management
- Add request logging with token and cost tracking
- Add rate limiting per consumer
- Add budget cap enforcement
- Write admin config or dashboard
- Document the setup and deployment steps
- Test with real traffic
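For the consumer key management task, one common approach is to hand the plaintext key to the consumer once and persist only a digest. The sketch below assumes that approach. The `gw_` prefix, the function names, and the in-memory `stored` mapping are illustrative choices, not decisions recorded in this project.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional, Tuple


def issue_key(prefix: str = "gw_") -> Tuple[str, str]:
    """Return (plaintext_key, stored_digest); only the digest is persisted."""
    key = prefix + secrets.token_urlsafe(32)
    digest = hashlib.sha256(key.encode()).hexdigest()
    return key, digest


def validate_key(presented: str, stored: Dict[str, str]) -> Optional[str]:
    """Return the consumer id whose digest matches the presented key."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    for consumer_id, known_digest in stored.items():
        # compare_digest avoids leaking match position through timing
        if hmac.compare_digest(digest, known_digest):
            return consumer_id
    return None
```

A plain SHA-256 digest is reasonable here because the keys are long random strings rather than user-chosen passwords. A real store would index by digest instead of scanning every consumer.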
Links and files#
- Repository: https://github.com/asrafuli/gato
- Docs:
- Notes:
- Related workflow:
- Related how-to:
Decisions#
| Date | Decision | Reason |
|---|---|---|
| 2026-05-10 | Start with OpenAI-compatible request schema | Maximises compatibility with existing tooling |
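To illustrate what this decision implies for a provider adapter, the sketch below translates an OpenAI-style chat request into the shape the Anthropic Messages API expects, where the system prompt is a top-level `system` field and `max_tokens` is required. It is a simplified example under assumptions: `to_anthropic` and `DEFAULT_MAX_TOKENS` are hypothetical names, message content is assumed to be plain strings, and model names are passed through without mapping.

```python
# Assumption: the gateway's default when the caller omits max_tokens.
DEFAULT_MAX_TOKENS = 1024


def to_anthropic(request: dict) -> dict:
    """Convert an OpenAI-style chat request into an Anthropic-style payload."""
    system_parts = []
    messages = []
    for msg in request["messages"]:
        if msg["role"] == "system":
            # System messages move out of the list into a top-level field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    payload = {
        "model": request["model"],
        "max_tokens": request.get("max_tokens", DEFAULT_MAX_TOKENS),
        "messages": messages,
    }
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    if "temperature" in request:
        payload["temperature"] = request["temperature"]
    return payload
```

Features such as tool calls, streaming, and multi-part content would need their own mapping, and the response would need the reverse translation.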
Outcome#
- A running gateway service that proxies LLM requests
- Per-consumer usage reports
- Documentation covering setup, configuration, and adding new providers
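A per-consumer usage report could be derived from the request log along these lines. The record fields, the `PRICES_PER_MILLION` table, and its figures are placeholders for illustration only, not real provider prices or the project's log schema.

```python
from collections import defaultdict

# Hypothetical prices in USD per million tokens; not real provider pricing.
PRICES_PER_MILLION = {"example-model": {"input": 1.00, "output": 2.00}}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Compute the cost of one request from its token counts."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def usage_report(records: list) -> dict:
    """Aggregate log records by (consumer, model)."""
    totals = defaultdict(lambda: {
        "requests": 0, "input_tokens": 0, "output_tokens": 0,
        "cost_usd": 0.0, "latency_ms_total": 0.0,
    })
    for rec in records:
        row = totals[(rec["consumer"], rec["model"])]
        row["requests"] += 1
        row["input_tokens"] += rec["input_tokens"]
        row["output_tokens"] += rec["output_tokens"]
        row["cost_usd"] += cost_usd(
            rec["model"], rec["input_tokens"], rec["output_tokens"]
        )
        row["latency_ms_total"] += rec["latency_ms"]

    report = {}
    for key, row in totals.items():
        report[key] = dict(row)
        report[key]["mean_latency_ms"] = row["latency_ms_total"] / row["requests"]
    return report
```

Keying the report by consumer and model matches the tracking goal above, so the same aggregate could also feed budget cap checks.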
Notes#
Capture rough notes, lessons, and things to remember.