# Agent Friendly World

> Credit-based LLM inference gateway with x402 USDC payments on Base. OpenAI-compatible API.

## Base URL

- Staging: https://agent-router.gaib.cloud
- Local: http://localhost:8080

## Authentication

Two methods:
1. API Key: `Authorization: Bearer sk-<64 hex chars>` — for inference and usage queries
2. SIWE (Sign-In with Ethereum): signed message + signature — for API key management
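
A client can reject a malformed key before any request is sent. The sketch below checks the `sk-<64 hex chars>` shape described above and builds the header; the helper name `bearerHeaders` is hypothetical, and accepting uppercase hex digits is an assumption.

```typescript
// Format stated in this document: "sk-" followed by 64 hex characters.
// Accepting both upper- and lowercase hex is an assumption.
const API_KEY_PATTERN = /^sk-[0-9a-fA-F]{64}$/

// Hypothetical helper: returns the Authorization header for a well-formed key.
function bearerHeaders(apiKey: string): Record<string, string> {
  if (!API_KEY_PATTERN.test(apiKey)) {
    throw new Error('API key must look like sk-<64 hex chars>')
  }
  return { Authorization: `Bearer ${apiKey}` }
}
```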

## Endpoints

### GET /health
Returns: `{"status":"ok"}`

### GET /v1/models
No auth. Returns all available models with pricing.
Response: `{"object":"list","data":[{"id":"gemini/gemini-2.5-flash","name":"gemini-2.5-flash","provider":"gemini","contextLength":1048576,"promptPricePer1MTokens":0.15,"completionPricePer1M":0.60}]}`

### POST /v1/topup
x402 USDC payment to fund wallet balance.
Body: `{"amount": 5}` (USD, minimum $1)
The first call returns 402 with the payment requirements. Sign the payment, then retry the same request with the `X-Payment` header.
Response: `{"balance_usdc":5000000,"credited_usdc":5000000}`
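
The two-step flow can be sketched as follows. This is a minimal outline, not the gateway's client: `topUp`, `HttpFn`, and `SignPayment` are hypothetical names, the signing step is left to a callback because the payment payload format is not described here, and treating any status below 400 as success on the retry is an assumption.

```typescript
// Minimal HTTP types so the sketch stays self-contained; the global fetch fits this shape.
type HttpResponse = { status: number; json(): Promise<unknown> }
type HttpFn = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<HttpResponse>

// Hypothetical callback: turns the 402 payment requirements into an X-Payment header value.
type SignPayment = (requirements: unknown) => Promise<string>

async function topUp(
  baseUrl: string,
  amountUsd: number,
  signPayment: SignPayment,
  http: HttpFn
): Promise<unknown> {
  if (amountUsd < 1) throw new Error('minimum top-up is $1')
  const url = `${baseUrl}/v1/topup`
  const body = JSON.stringify({ amount: amountUsd })
  const headers: Record<string, string> = { 'Content-Type': 'application/json' }

  // First attempt carries no payment and is expected to return 402.
  const first = await http(url, { method: 'POST', headers, body })
  if (first.status !== 402) return first.json()

  // Sign the returned requirements, then retry with the X-Payment header.
  const payment = await signPayment(await first.json())
  const second = await http(url, {
    method: 'POST',
    headers: { ...headers, 'X-Payment': payment },
    body,
  })
  if (second.status >= 400) throw new Error(`top-up failed with status ${second.status}`)
  return second.json()
}
```

Passing the HTTP function in as a parameter keeps the flow testable without a network; in a runtime with a global `fetch`, that function can be supplied directly.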

### POST /v1/auth/keys
Create API key. SIWE auth in body.
Body: `{"message":"<siwe>","signature":"0x...","label":"my-app"}`
Response 201: `{"id":1,"key":"sk-...","label":"my-app","created_at":"..."}`
Key is shown ONLY ONCE.

### GET /v1/auth/keys
List API keys. SIWE via query: `?message=<siwe>&signature=<sig>` (URL-encode the message, since SIWE messages span multiple lines).
Response: `{"data":[{"id":1,"label":"my-app","created_at":"...","revoked_at":null}]}`
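
Because the SIWE message contains newlines, spaces, and colons, it has to be percent-encoded before it goes into the query string. A small sketch, with the hypothetical helper name `listKeysUrl`:

```typescript
// Hypothetical helper: builds the GET /v1/auth/keys URL with an encoded SIWE message.
function listKeysUrl(baseUrl: string, siweMessage: string, signature: string): string {
  // encodeURIComponent escapes newlines (%0A), spaces (%20) and colons (%3A).
  const message = encodeURIComponent(siweMessage)
  const sig = encodeURIComponent(signature)
  return `${baseUrl}/v1/auth/keys?message=${message}&signature=${sig}`
}
```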

### DELETE /v1/auth/keys/:key_id
Revoke API key. SIWE auth in body.
Body: `{"message":"<siwe>","signature":"0x..."}`
Response: `{"revoked":true}`

### POST /v1/chat/completions
OpenAI-compatible inference. Requires Bearer API key.
Body: `{"model":"gemini/gemini-2.5-flash","messages":[{"role":"user","content":"Hello"}],"max_tokens":512,"stream":false}`
Response: `{"id":"chatcmpl-...","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","content":"..."},"finish_reason":"stop"}],"usage":{"prompt_tokens":20,"completion_tokens":9,"total_tokens":29}}`
Streaming: set `"stream":true` for SSE.
NOT SUPPORTED: `thinking`, `reasoning_effort` parameters (returns 400).
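
Clients that reuse request bodies written for other OpenAI-compatible services may want to drop the unsupported parameters before sending, to avoid the 400. A minimal sketch; `stripUnsupported` is a hypothetical helper, not part of the gateway or the OpenAI SDK.

```typescript
// Parameters this document lists as rejected with 400.
const UNSUPPORTED_PARAMS = ['thinking', 'reasoning_effort']

// Hypothetical helper: returns a copy of the request body without unsupported keys.
function stripUnsupported(body: Record<string, unknown>): Record<string, unknown> {
  const cleaned: Record<string, unknown> = {}
  for (const key of Object.keys(body)) {
    if (UNSUPPORTED_PARAMS.indexOf(key) === -1) cleaned[key] = body[key]
  }
  return cleaned
}
```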

### GET /v1/usage/:wallet
Balance and per-key usage. API key or SIWE auth.
Response: `{"wallet":"0x...","balance_usdc":4950000,"locked_usdc":0,"available_usdc":4950000,"keys":[{"api_key_id":1,"label":"my-app","request_count":3,"total_prompt_tokens":66,"total_completion_tokens":174,"total_charged_usdc":50000}]}`

## Model ID Format

`<provider>/<model-name>` — e.g. `gemini/gemini-2.5-flash`, `openrouter/anthropic/claude-3-5-sonnet`, `kimi/kimi-k2.5`
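
Since the model name may itself contain slashes (as in the OpenRouter example), a parser should split on the first slash only. A sketch with the hypothetical helper `parseModelId`:

```typescript
// Hypothetical helper: splits "<provider>/<model-name>" at the first slash.
function parseModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf('/')
  // Reject IDs with no slash, an empty provider, or an empty model name.
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`invalid model ID: ${id}`)
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) }
}
```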

## Providers and Pricing (USD per 1M tokens, before 20% markup)

### Gemini
- gemini/gemini-2.5-pro: prompt $1.25, completion $10.00, context 1M
- gemini/gemini-2.5-flash: prompt $0.15, completion $0.60, context 1M
- gemini/gemini-2.5-flash-lite: prompt $0.10, completion $0.40, context 1M
- gemini/gemini-2.0-flash: prompt $0.10, completion $0.40, context 1M
- gemini/gemini-1.5-pro: prompt $1.25, completion $5.00, context 2M
- gemini/gemini-1.5-flash: prompt $0.075, completion $0.30, context 1M

### Kimi (Moonshot AI)
- kimi/kimi-k2.5: prompt $0.60, completion $3.00, context 262k (ignores temperature/top_p)
- kimi/moonshot-v1-8k: prompt $0.20, completion $2.00, context 8k
- kimi/moonshot-v1-32k: prompt $1.00, completion $3.00, context 32k
- kimi/moonshot-v1-128k: prompt $2.00, completion $5.00, context 131k

### MiniMax
- minimax/MiniMax-M2.7: prompt $0.30, completion $1.20, context 204k
- minimax/MiniMax-M2.5: prompt $0.118, completion $0.95, context 196k
- minimax/MiniMax-M2: prompt $0.255, completion $1.00, context 196k
- minimax/MiniMax-M1: prompt $0.40, completion $1.76, context 1M
- minimax/MiniMax-Text-01: prompt $0.20, completion $1.10, context 1M

### OpenRouter
400+ models, pricing fetched dynamically from OpenRouter API.

### Local
Self-hosted (Ollama/vLLM), free ($0/$0).
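
For rough budgeting, a request's cost can be estimated from the listed prices. The sketch below assumes the 20% markup is applied multiplicatively to the token cost; the gateway's actual rounding, minimum charges, and billing granularity are not described here, so treat the result as an estimate only. `estimateCostUsd` and the `Pricing` type are hypothetical.

```typescript
// Listed prices in USD per 1M tokens, before markup.
type Pricing = { promptPer1M: number; completionPer1M: number }

// Assumption: the 20% markup multiplies the base token cost.
const MARKUP = 1.2

// Hypothetical helper: estimated charge in USD for one request.
function estimateCostUsd(pricing: Pricing, promptTokens: number, completionTokens: number): number {
  const base =
    (promptTokens / 1_000_000) * pricing.promptPer1M +
    (completionTokens / 1_000_000) * pricing.completionPer1M
  return base * MARKUP
}
```

With the `gemini/gemini-2.5-flash` prices above, one million prompt tokens plus one million completion tokens comes to roughly $0.90 under this assumption.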

## Units
- `balance_usdc` and the other `*_usdc` fields: micro-USDC (6 decimals). Divide by 1_000_000 for USD.
- `amount` in `/v1/topup`: USD as a decimal number, e.g. `5` for $5.00.
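
The conversion between the two units is a single division or multiplication. A small sketch with hypothetical helper names:

```typescript
const MICRO_PER_USDC = 1_000_000

// micro-USDC (as returned in *_usdc fields) to USD.
function microUsdcToUsd(micro: number): number {
  return micro / MICRO_PER_USDC
}

// USD to micro-USDC, rounded to the nearest whole micro unit.
function usdToMicroUsdc(usd: number): number {
  return Math.round(usd * MICRO_PER_USDC)
}
```

For example, the `balance_usdc` value `4950000` from the usage response corresponds to $4.95.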

## Error Codes
- 400 VALIDATION_ERROR: bad params, unknown model, or unsupported `thinking`/`reasoning_effort` parameters
- 401 UNAUTHORIZED: bad API key or SIWE
- 402 INSUFFICIENT_BALANCE: not enough credit
- 404 NOT_FOUND: resource not found
- 429 RATE_LIMITED: per-wallet rate limit hit (includes retry_after_ms)
- 500 INTERNAL_ERROR: server error
- 502 UPSTREAM_ERROR: LLM provider failed (no charge)
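
A client can use these codes to decide whether to retry. The sketch below waits for `retry_after_ms` on 429 and backs off exponentially on 502 (which is not charged); all other codes are treated as non-retryable. The helper `retryDelayMs`, the assumption that `retry_after_ms` sits at the top level of the error body, the 1-second fallback, and the 30-second cap are illustrative choices, not documented behavior.

```typescript
// Assumption: retry_after_ms is a top-level field of the 429 error body.
type ErrorBody = { retry_after_ms?: number }

// Hypothetical helper: milliseconds to wait before retrying, or null if a retry is not advisable.
function retryDelayMs(status: number, body: ErrorBody, attempt: number): number | null {
  // Rate limited: honor the server's hint, falling back to 1 second.
  if (status === 429) return body.retry_after_ms ?? 1000
  // Upstream failure (no charge): exponential backoff capped at 30 seconds.
  if (status === 502) return Math.min(1000 * 2 ** attempt, 30_000)
  // Validation, auth, balance and other errors will not succeed on retry.
  return null
}
```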

## Quick Example

```typescript
import OpenAI from 'openai'

// Point the OpenAI SDK at the gateway; apiKey is a key created via POST /v1/auth/keys.
const client = new OpenAI({ baseURL: 'https://agent-router.gaib.cloud/v1', apiKey: 'sk-...' })
// Top-level await requires an ES module context.
const res = await client.chat.completions.create({ model: 'gemini/gemini-2.5-flash', messages: [{ role: 'user', content: 'Hello' }] })
console.log(res.choices[0].message.content)
```