# Agent Friendly World

> Credit-based LLM inference gateway with x402 USDC payments on Base. OpenAI-compatible API.

## Base URL

- Staging: https://agent-router.gaib.cloud
- Local: http://localhost:8080

## Authentication

Two methods:

1. API Key: `Authorization: Bearer sk-<64 hex chars>` (for inference and usage queries)
2. SIWE (Sign-In with Ethereum): signed message + signature (for API key management)

## Endpoints

### GET /health

Returns: `{"status":"ok"}`

### GET /v1/models

No auth. Returns all available models with pricing.

Response: `{"object":"list","data":[{"id":"gemini/gemini-2.5-flash","name":"gemini-2.5-flash","provider":"gemini","contextLength":1048576,"promptPricePer1MTokens":0.15,"completionPricePer1M":0.60}]}`

### POST /v1/topup

x402 USDC payment to fund wallet balance.

Body: `{"amount": 5}` (USD, minimum $1)

The first call returns 402 with the payment requirements. Sign the payment, then retry with the X-Payment header.

Response: `{"balance_usdc":5000000,"credited_usdc":5000000}`

### POST /v1/auth/keys

Create API key. SIWE auth in body.

Body: `{"message":"","signature":"0x...","label":"my-app"}`

Response 201: `{"id":1,"key":"sk-...","label":"my-app","created_at":"..."}`

Key is shown ONLY ONCE.

### GET /v1/auth/keys

List API keys. SIWE via query: `?message=&signature=`

Response: `{"data":[{"id":1,"label":"my-app","created_at":"...","revoked_at":null}]}`

### DELETE /v1/auth/keys/:key_id

Revoke API key. SIWE auth in body.

Body: `{"message":"","signature":"0x..."}`

Response: `{"revoked":true}`

### POST /v1/chat/completions

OpenAI-compatible inference. Requires Bearer API key.

Body: `{"model":"gemini/gemini-2.5-flash","messages":[{"role":"user","content":"Hello"}],"max_tokens":512,"stream":false}`

Response: `{"id":"chatcmpl-...","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","content":"..."},"finish_reason":"stop"}],"usage":{"prompt_tokens":20,"completion_tokens":9,"total_tokens":29}}`

Streaming: set `"stream":true` for SSE.

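A streaming client has to pull the text deltas out of the SSE frames. The sketch below assumes the gateway follows OpenAI's streaming wire format (`data: <json>` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`). The source only says the API is OpenAI-compatible, so the chunk shape and the helper name `collectStreamText` are assumptions, not documented behavior.

```typescript
// Assumed chunk shape, modeled on OpenAI's chat.completion.chunk objects.
interface StreamChunk {
  choices: { delta?: { content?: string } }[]
}

// Concatenate the content deltas found in a raw SSE payload.
function collectStreamText(ssePayload: string): string {
  let text = ''
  for (const rawLine of ssePayload.split('\n')) {
    const line = rawLine.trim()
    if (!line.startsWith('data:')) continue // skip blank separator lines
    const data = line.slice('data:'.length).trim()
    if (data === '[DONE]') break // assumed end-of-stream sentinel
    const chunk = JSON.parse(data) as StreamChunk
    text += chunk.choices[0]?.delta?.content ?? ''
  }
  return text
}

// Hand-written sample payload, not captured from the gateway.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: [DONE]',
  ''
].join('\n')

console.log(collectStreamText(sample)) // prints "Hello"
```

A real client would also need to buffer partial lines across network reads, since a frame can be split between two chunks of the HTTP body.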
NOT SUPPORTED: the `thinking` and `reasoning_effort` parameters (the request returns 400).

### GET /v1/usage/:wallet

Balance and per-key usage. API key or SIWE auth.

Response: `{"wallet":"0x...","balance_usdc":4950000,"locked_usdc":0,"available_usdc":4950000,"keys":[{"api_key_id":1,"label":"my-app","request_count":3,"total_prompt_tokens":66,"total_completion_tokens":174,"total_charged_usdc":50000}]}`

## Model ID Format

`<provider>/<model>`, e.g. `gemini/gemini-2.5-flash`, `openrouter/anthropic/claude-3-5-sonnet`, `kimi/kimi-k2.5`

## Providers and Pricing

Prices are in USD per 1M tokens, before the 20% markup.

### Gemini

- gemini/gemini-2.5-pro: prompt $1.25, completion $10.00, context 1M
- gemini/gemini-2.5-flash: prompt $0.15, completion $0.60, context 1M
- gemini/gemini-2.5-flash-lite: prompt $0.10, completion $0.40, context 1M
- gemini/gemini-2.0-flash: prompt $0.10, completion $0.40, context 1M
- gemini/gemini-1.5-pro: prompt $1.25, completion $5.00, context 2M
- gemini/gemini-1.5-flash: prompt $0.075, completion $0.30, context 1M

### Kimi (Moonshot AI)

- kimi/kimi-k2.5: prompt $0.60, completion $3.00, context 262k (ignores temperature/top_p)
- kimi/moonshot-v1-8k: prompt $0.20, completion $2.00, context 8k
- kimi/moonshot-v1-32k: prompt $1.00, completion $3.00, context 32k
- kimi/moonshot-v1-128k: prompt $2.00, completion $5.00, context 131k

### MiniMax

- minimax/MiniMax-M2.7: prompt $0.30, completion $1.20, context 204k
- minimax/MiniMax-M2.5: prompt $0.118, completion $0.95, context 196k
- minimax/MiniMax-M2: prompt $0.255, completion $1.00, context 196k
- minimax/MiniMax-M1: prompt $0.40, completion $1.76, context 1M
- minimax/MiniMax-Text-01: prompt $0.20, completion $1.10, context 1M

### OpenRouter

400+ models, pricing fetched dynamically from OpenRouter API.

### Local

Self-hosted (Ollama/vLLM), free ($0/$0).

## Units

- balance_usdc: micro-USDC (6 decimals). Divide by 1_000_000 for USD.
- amount in topup: USD float.

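The unit rules and the per-1M-token prices can be combined into a small cost estimator. This is a rough sketch: the helper names are hypothetical, and it assumes the 20% markup is applied multiplicatively to the listed price, which the source does not spell out. The gateway's actual rounding may differ.

```typescript
const MICRO_PER_USD = 1_000_000 // balance_usdc uses 6 decimals
const MARKUP = 0.2 // 20% markup; multiplying list price by 1.2 is an assumption

// Convert a micro-USDC integer (e.g. balance_usdc) to USD.
function microUsdcToUsd(microUsdc: number): number {
  return microUsdc / MICRO_PER_USD
}

// Convert a USD amount (e.g. the topup amount) to micro-USDC.
function usdToMicroUsdc(usd: number): number {
  return Math.round(usd * MICRO_PER_USD)
}

// Estimate the charge in USD for one request, given list prices per 1M tokens.
function estimateCostUsd(
  promptTokens: number,
  completionTokens: number,
  promptPricePer1M: number,
  completionPricePer1M: number
): number {
  const base =
    (promptTokens * promptPricePer1M + completionTokens * completionPricePer1M) /
    1_000_000
  return base * (1 + MARKUP)
}

console.log(microUsdcToUsd(4_950_000)) // prints 4.95
console.log(usdToMicroUsdc(5)) // prints 5000000
// 1M prompt + 1M completion tokens on gemini/gemini-2.5-flash list prices
console.log(estimateCostUsd(1_000_000, 1_000_000, 0.15, 0.6))
```

Because the result is a floating-point estimate, compare it against `total_charged_usdc` with a tolerance rather than for exact equality.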
## Error Codes

- 400 VALIDATION_ERROR: bad params, unknown model, thinking/reasoning_effort
- 401 UNAUTHORIZED: bad API key or SIWE
- 402 INSUFFICIENT_BALANCE: not enough credit
- 404 NOT_FOUND: resource not found
- 429 RATE_LIMITED: per-wallet rate limit hit (includes retry_after_ms)
- 500 INTERNAL_ERROR: server error
- 502 UPSTREAM_ERROR: LLM provider failed (no charge)

## Quick Example

```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://agent-router.gaib.cloud/v1',
  apiKey: 'sk-...'
})

const res = await client.chat.completions.create({
  model: 'gemini/gemini-2.5-flash',
  messages: [{ role: 'user', content: 'Hello' }]
})

console.log(res.choices[0].message.content)
```
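The error codes listed earlier suggest a simple retry policy: wait out a 429 using `retry_after_ms`, retry a 502 since it is not charged, and surface everything else to the caller. The sketch below is one possible reading, not the gateway's prescribed behavior. The `GatewayFailure` shape, the `retryDelayMs` name, and the exponential fallback delay are all assumptions, and where `retry_after_ms` appears in the error payload is not specified in the source.

```typescript
// Assumed shape: only the status code and retry_after_ms come from the error list.
interface GatewayFailure {
  status: number
  retry_after_ms?: number
}

// Returns how long to wait before retrying, or null when this sketch does not retry.
function retryDelayMs(failure: GatewayFailure, attempt: number): number | null {
  const backoff = 500 * 2 ** attempt // hypothetical fallback: 500 ms, 1 s, 2 s, ...
  switch (failure.status) {
    case 429:
      // Prefer the server-provided hint when it is present.
      return failure.retry_after_ms ?? backoff
    case 502:
      return backoff // upstream provider failure, documented as not charged
    default:
      return null // 400/401/402/404/500 are not retried in this sketch
  }
}

console.log(retryDelayMs({ status: 429, retry_after_ms: 1200 }, 0)) // prints 1200
console.log(retryDelayMs({ status: 502 }, 2)) // prints 2000
console.log(retryDelayMs({ status: 402 }, 0)) // prints null
```

A 402 is deliberately left out of the retry path: the balance has to be topped up through `POST /v1/topup` before the same request can succeed.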