LLM Gateway

Purpose

Build a gateway layer that sits in front of one or more large language model APIs, providing a unified interface for routing, authentication, rate limiting, logging, and cost control.

Status

Active

Goals

  • Provide a single endpoint that abstracts multiple LLM providers (OpenAI, Anthropic, etc.)
  • Enforce API key management and per-consumer authentication
  • Track usage, costs, and latency per consumer and model
  • Support request routing rules (e.g. route by model name, fallback on error)
  • Enable rate limiting and budget caps per consumer
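
The routing goal above (route by model name, fall back on error) could be sketched roughly as follows. This is a minimal illustration, not the gateway's actual design: `Route`, `ProviderError`, `resolve`, and `dispatch` are hypothetical names, and each provider is assumed to be a plain callable that takes a request dict and returns a response dict.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Assumed adapter shape: request dict in, response dict out.
ProviderCall = Callable[[dict], dict]


class ProviderError(Exception):
    """Raised by a (hypothetical) provider adapter when the upstream call fails."""


@dataclass
class Route:
    prefix: str           # model-name prefix this rule matches, e.g. "claude-"
    providers: list[str]  # provider names to try, in order


def resolve(routes: list[Route], model: str) -> list[str]:
    """Return the ordered provider list of the first route whose prefix matches."""
    for route in routes:
        if model.startswith(route.prefix):
            return route.providers
    raise LookupError(f"no route for model {model!r}")


def dispatch(routes: list[Route], providers: dict[str, ProviderCall], request: dict) -> dict:
    """Try each provider for the requested model in order, falling back on ProviderError."""
    last_error = None
    for name in resolve(routes, request["model"]):
        try:
            return providers[name](request)
        except ProviderError as exc:
            last_error = exc  # remember the failure and try the next provider
    raise ProviderError(f"all providers failed: {last_error}")
```

First-match-wins keeps rule evaluation predictable; a real configuration might also want per-route timeouts or retry limits, which this sketch leaves out.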

Scope

Included

  • HTTP proxy layer that accepts OpenAI-compatible requests
  • Provider adapters for at least Anthropic and OpenAI
  • Request and response logging with token counts
  • Per-consumer API key issuance and validation
  • Rate limiting and monthly spend caps
  • Admin interface or config file for managing consumers and rules
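
As one possible shape for per-consumer key issuance and validation, the sketch below generates a random key and stores only its SHA-256 digest, so a leaked key table does not expose usable keys. The `KeyStore` class, the `gw_` prefix, and the in-memory dict are assumptions for illustration; a real deployment would presumably persist the digests.

```python
import hashlib
import secrets
from typing import Optional


class KeyStore:
    """In-memory key registry (hypothetical); keeps digests, never plaintext keys."""

    def __init__(self) -> None:
        self._consumers = {}  # sha256 hex digest -> consumer_id

    @staticmethod
    def _digest(key: str) -> str:
        return hashlib.sha256(key.encode("utf-8")).hexdigest()

    def issue(self, consumer_id: str) -> str:
        """Create a new key for the consumer and return it once, in plaintext."""
        key = "gw_" + secrets.token_urlsafe(32)  # "gw_" prefix is an assumed convention
        self._consumers[self._digest(key)] = consumer_id
        return key

    def validate(self, presented_key: str) -> Optional[str]:
        """Return the owning consumer_id, or None if the key is unknown."""
        return self._consumers.get(self._digest(presented_key))

    def revoke(self, presented_key: str) -> bool:
        """Remove a key; returns True if it existed."""
        return self._consumers.pop(self._digest(presented_key), None) is not None
```

Usage: `key = store.issue("team-a")` hands the key to the consumer, and the proxy calls `store.validate(...)` on the bearer token of each incoming request.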

Not included

  • Fine-tuning or model hosting
  • Training data management
  • End-user chat UI
  • Multi-region deployment (deferred)

Tasks

  • Define the API contract (request/response schema)
  • Stand up a basic HTTP proxy that forwards to one provider
  • Add provider adapters (Anthropic, OpenAI)
  • Implement consumer key management
  • Add request logging with token and cost tracking
  • Add rate limiting per consumer
  • Add budget cap enforcement
  • Write admin config or dashboard
  • Document the setup and deployment steps
  • Test with real traffic
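
The rate limiting and budget cap tasks could start from something like the sketch below: a token bucket per consumer for request rate, plus a running monthly spend total checked against a cap. Both classes and all parameter names are hypothetical. The token bucket is one common choice among several (fixed or sliding windows would also work), and the budget check assumes the cost of a request is only known after the provider responds, so the cap is checked before a request and the cost is recorded afterwards.

```python
import time
from decimal import Decimal
from typing import Optional


class TokenBucket:
    """Token-bucket limiter: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; `now` can be injected for testing."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class MonthlyBudget:
    """Tracks spend per (consumer, month) against a single cap in USD."""

    def __init__(self, cap_usd: Decimal) -> None:
        self.cap_usd = cap_usd
        self.spent = {}  # (consumer_id, "YYYY-MM") -> Decimal

    def has_budget(self, consumer_id: str, month: str) -> bool:
        """True while the consumer's spend for the month is below the cap."""
        return self.spent.get((consumer_id, month), Decimal("0")) < self.cap_usd

    def record(self, consumer_id: str, month: str, cost_usd: Decimal) -> None:
        """Add the cost of a completed request to the month's total."""
        key = (consumer_id, month)
        self.spent[key] = self.spent.get(key, Decimal("0")) + cost_usd
```

Because the cap is checked before the request and the cost added after, a consumer can overshoot the cap by at most one request's cost; whether that is acceptable is a design decision this sketch does not settle.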

Decisions

Date | Decision | Reason
2026-05-10 | Start with OpenAI-compatible request schema | Maximises compatibility with existing tooling
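
Accepting an OpenAI-compatible schema means the Anthropic adapter has to translate requests. A rough sketch of that translation is below, under these assumptions: message `content` values are plain strings, the model name is passed through unchanged, and `to_anthropic` and `default_max_tokens` are hypothetical names. It relies on two differences between the formats: Anthropic's Messages API takes the system prompt as a top-level `system` field rather than as a message, and it requires `max_tokens`.

```python
def to_anthropic(openai_req: dict, default_max_tokens: int = 1024) -> dict:
    """Translate an OpenAI-style chat request body into an Anthropic Messages body."""
    system_parts = []
    messages = []
    for msg in openai_req["messages"]:
        if msg["role"] == "system":
            # System prompts move out of the message list.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    body = {
        "model": openai_req["model"],  # assumed pass-through; a real gateway may map names
        "max_tokens": openai_req.get("max_tokens", default_max_tokens),
        "messages": messages,
    }
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    if "temperature" in openai_req:
        body["temperature"] = openai_req["temperature"]
    return body
```

Fields not handled here (streaming, tools, multi-part content) would need their own mapping rules and are deliberately left out of this sketch.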

Outcome

  • A running gateway service that proxies LLM requests
  • Per-consumer usage reports
  • Documentation covering setup, configuration, and adding new providers
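
A per-consumer usage report could be produced by aggregating the request log. The sketch below assumes a hypothetical `LogRecord` shape (the real log schema is not yet defined) and groups totals by consumer and model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LogRecord:
    """One logged request; field names are assumptions for this sketch."""
    consumer_id: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    latency_ms: float


def usage_report(records: Iterable[LogRecord]) -> dict:
    """Aggregate requests, tokens, cost, and mean latency per (consumer, model)."""
    totals = defaultdict(lambda: {
        "requests": 0, "input_tokens": 0, "output_tokens": 0,
        "cost_usd": 0.0, "latency_ms_sum": 0.0,
    })
    for rec in records:
        row = totals[(rec.consumer_id, rec.model)]
        row["requests"] += 1
        row["input_tokens"] += rec.input_tokens
        row["output_tokens"] += rec.output_tokens
        row["cost_usd"] += rec.cost_usd
        row["latency_ms_sum"] += rec.latency_ms

    report = {}
    for key, row in totals.items():
        report[key] = {
            "requests": row["requests"],
            "input_tokens": row["input_tokens"],
            "output_tokens": row["output_tokens"],
            "cost_usd": row["cost_usd"],
            "avg_latency_ms": row["latency_ms_sum"] / row["requests"],
        }
    return report
```

Costs are floats here for brevity; if the report feeds billing or budget enforcement, a decimal type would avoid rounding drift.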

Notes

Capture rough notes, lessons, and things to remember.