
# Models & Providers

## Model ID Format

All model IDs follow the pattern `<provider>/<model-name>`:

```
gemini/gemini-2.5-flash
openrouter/anthropic/claude-3-5-sonnet
kimi/moonshot-v1-8k
minimax/MiniMax-M2.5
local/llama-3
```

The gateway parses the prefix to route requests to the correct provider.
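The split happens at the first slash only, since OpenRouter IDs embed a second slash inside the model name. A minimal sketch of that parsing step (the gateway's actual routing code may differ):

```typescript
// Split a model ID into provider prefix and model name.
// Only the first "/" delimits the provider: OpenRouter IDs such as
// "openrouter/anthropic/claude-3-5-sonnet" keep their inner slash.
function parseModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash === -1) throw new Error(`invalid model ID: ${id}`);
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```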

## Providers

### Gemini (Google)

| Model | Prompt ($/1M) | Completion ($/1M) | Context |
| --- | --- | --- | --- |
| gemini/gemini-2.5-pro | $1.25 | $10.00 | 1M |
| gemini/gemini-2.5-flash | $0.15 | $0.60 | 1M |
| gemini/gemini-2.5-flash-lite | $0.10 | $0.40 | 1M |
| gemini/gemini-2.0-flash | $0.10 | $0.40 | 1M |
| gemini/gemini-1.5-pro | $1.25 | $5.00 | 2M |
| gemini/gemini-1.5-flash | $0.075 | $0.30 | 1M |

### Kimi (Moonshot AI)

| Model | Prompt ($/1M) | Completion ($/1M) | Context |
| --- | --- | --- | --- |
| kimi/kimi-k2.5 | $0.60 | $3.00 | 262k |
| kimi/moonshot-v1-8k | $0.20 | $2.00 | 8k |
| kimi/moonshot-v1-32k | $1.00 | $3.00 | 32k |
| kimi/moonshot-v1-128k | $2.00 | $5.00 | 131k |

> **INFO**
> kimi-k2.5 ignores the `temperature`, `top_p`, and penalty parameters.
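A provider adapter can simply drop these fields before forwarding the request. A sketch, assuming an OpenAI-style flat request object (the gateway's real request shape may differ):

```typescript
// Drop the sampling parameters that kimi-k2.5 ignores, forwarding the rest.
function stripIgnoredParams(
  params: Record<string, unknown>
): Record<string, unknown> {
  const { temperature, top_p, presence_penalty, frequency_penalty, ...rest } =
    params;
  return rest;
}
```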

### MiniMax

| Model | Prompt ($/1M) | Completion ($/1M) | Context |
| --- | --- | --- | --- |
| minimax/MiniMax-M2.7 | $0.30 | $1.20 | 204k |
| minimax/MiniMax-M2.5 | $0.118 | $0.95 | 196k |
| minimax/MiniMax-M2 | $0.255 | $1.00 | 196k |
| minimax/MiniMax-M1 | $0.40 | $1.76 | 1M |
| minimax/MiniMax-Text-01 | $0.20 | $1.10 | 1M |

> **INFO**
> MiniMax ignores the `presence_penalty` and `frequency_penalty` parameters.

### OpenRouter

Provides access to 400+ models. The model list is fetched dynamically from the OpenRouter API, and pricing comes from the same API response.

### Local

For self-hosted models (Ollama, vLLM, or any OpenAI-compatible endpoint). Pricing is free ($0/$0). The context window defaults to 4096 tokens when the model metadata does not specify one.
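That fallback can be expressed as follows (a sketch; the metadata field name is an assumption, not the gateway's actual schema):

```typescript
// Use the model's advertised context length when present, else 4096.
function contextWindow(meta?: { contextLength?: number }): number {
  return meta?.contextLength ?? 4096;
}
```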

## Pricing

Prices shown above are downstream costs (what the gateway pays the provider). The gateway applies a configurable markup (default 20%) on top:

```
effective cost = downstream cost × (1 + markup)
```
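For example, with the default 20% markup (the helper name below is illustrative, not a gateway API):

```typescript
// Effective per-million-token cost after applying the gateway markup.
function effectiveCost(downstreamPerMTok: number, markup = 0.2): number {
  return downstreamPerMTok * (1 + markup);
}

// gemini/gemini-2.5-flash prompt tokens: $0.15/1M downstream becomes
// roughly $0.18/1M effective.
```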

## Adding a New Provider

1. Implement the `LLMProvider` interface in `src/services/llm/`
2. Register the provider in `src/services/llm/index.ts`
3. Add the API key environment variable to `src/config.ts`
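The `LLMProvider` interface lives in the codebase; the shapes below are a hypothetical sketch to illustrate the implement-and-register flow, not the real signatures:

```typescript
// Hypothetical shapes — the real interface in src/services/llm/ may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface CompletionResult {
  text: string;
  promptTokens: number;
  completionTokens: number;
}

interface LLMProvider {
  readonly name: string; // prefix used in model IDs, e.g. "local"
  complete(model: string, messages: ChatMessage[]): Promise<CompletionResult>;
}

// Minimal example provider that echoes the last message back.
class EchoProvider implements LLMProvider {
  readonly name = "echo";
  async complete(
    _model: string,
    messages: ChatMessage[]
  ): Promise<CompletionResult> {
    const last = messages[messages.length - 1];
    return { text: last.content, promptTokens: 0, completionTokens: 0 };
  }
}

// Registration maps the model-ID prefix to a provider instance,
// mirroring step 2 above.
const providers = new Map<string, LLMProvider>([["echo", new EchoProvider()]]);
```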

Released under the MIT License.