Provider Setup Guide

Free Tier Providers

Groq (Fastest)

URL: https://console.groq.com
Free Tier: 20 RPM, variable TPM
Models: Llama 3.3 70B, Llama 3.1 8B
Best For: Speed, quick coding tasks
Tip: Create multiple accounts with different phones for load balancing

Mistral (High Volume)

URL: https://console.mistral.ai
Free Tier: 1 billion tokens/month
Models: Mistral Small, Medium
Best For: High-volume processing, chatbots

OpenRouter (Universal Access)

URL: https://openrouter.ai
Free Tier: 50 requests/day
Access: Kimi K2:free, Gemini Flash:free
Best For: Testing, fallback access

Cohere (Embeddings)

URL: https://cohere.com
Free Tier: 1,000 calls/month
Best For: Embeddings, RAG systems

Trial/Cheap Providers

Anthropic Claude (Highest Quality)

URL: https://console.anthropic.com
Trial: $5 free credits (new users)
Student: $500 credits (apply with .edu)
Cost: $3/M input tokens (Sonnet), $0.25/M input tokens (Haiku)
Best For: Complex reasoning, analysis, code review

Moonshot Kimi (Best Value)

URL: https://platform.moonshot.ai
Bonus: $5 signup credit
Cost: $0.60/M input, $2.50/M output
Context: 128K tokens
Best For: Coding, long documents, Chinese content

DeepSeek (Cheapest Reasoning)

URL: https://platform.deepseek.com
Cost: $0.14/M input, $0.28/M output
Best For: Reasoning tasks, math, code
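
To compare the paid providers on a concrete workload, the per-million-token prices listed above can be turned into a rough cost estimate. The following is a minimal sketch: the `PRICES_PER_MILLION` table and the `estimate_cost` helper are hypothetical names used for illustration, and only Kimi and DeepSeek are included because both their input and output prices are given in this guide.

```python
# Prices in USD per million tokens, copied from the provider entries above.
# Format: provider -> (input price, output price).
PRICES_PER_MILLION = {
    "kimi": (0.60, 2.50),
    "deepseek": (0.14, 0.28),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload (hypothetical helper)."""
    in_price, out_price = PRICES_PER_MILLION[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    for name in PRICES_PER_MILLION:
        print(name, round(estimate_cost(name, 2_000_000, 500_000), 2))
```

Actual billing may differ (cache discounts, price changes), so treat the result as an approximation.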

Configuration Priority

The system routes requests in this priority order:

Fast tasks → Groq (free, instant)
High volume → Mistral (1B tokens/month)
Complex coding → Kimi (cheap, 128K context)
Quality critical → Claude (expensive but best)
Fallback → OpenRouter free tier
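
The priority list above can be sketched as a simple lookup. This is an illustration only, not the router's actual code: the task-type labels (`fast`, `high_volume`, and so on), the `ROUTES` table, and the `pick_provider` function are all assumed names.

```python
from typing import Optional, Set

# Task type -> preferred provider, following the priority list above.
# The task-type keys are hypothetical labels chosen for this sketch.
ROUTES = {
    "fast": "groq",
    "high_volume": "mistral",
    "complex_coding": "kimi",
    "quality_critical": "claude",
}
FALLBACK = "openrouter"


def pick_provider(task_type: str, available: Optional[Set[str]] = None) -> str:
    """Pick a provider for a task, using the fallback when needed.

    `available` optionally restricts the choice to providers that are
    currently usable (for example, not rate limited).
    """
    preferred = ROUTES.get(task_type, FALLBACK)
    if available is not None and preferred not in available:
        return FALLBACK
    return preferred


if __name__ == "__main__":
    print(pick_provider("fast"))
    print(pick_provider("quality_critical", available={"groq", "openrouter"}))
```

Unknown task types and unavailable providers both resolve to the OpenRouter free tier, matching the last entry in the list.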

Rate Limit Management

The router automatically:

Tracks RPM/TPM across all providers
Distributes load across multiple Groq accounts
Falls back to the next provider when a limit is approached
Caches responses to reduce API calls
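
The behaviours listed above could look roughly like the sketch below, which covers request-rate tracking, fallback, and caching. It is not the router's real implementation: `RpmTracker`, `choose`, `cached_call`, and the 90% headroom threshold are assumptions made for illustration, and TPM tracking is omitted for brevity.

```python
import hashlib
import time
from collections import deque
from typing import Callable, Dict, List, Optional


class RpmTracker:
    """Sliding-window count of requests sent to one provider (hypothetical)."""

    def __init__(self, rpm_limit: int, window_seconds: float = 60.0,
                 headroom: float = 0.9) -> None:
        self.rpm_limit = rpm_limit
        self.window_seconds = window_seconds
        self.headroom = headroom  # assumed: fall back at 90% of the limit
        self._stamps: deque = deque()

    def near_limit(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        return len(self._stamps) >= self.rpm_limit * self.headroom

    def record(self, now: Optional[float] = None) -> None:
        self._stamps.append(time.monotonic() if now is None else now)


def choose(trackers: Dict[str, RpmTracker], order: List[str],
           now: Optional[float] = None) -> Optional[str]:
    """Return the first provider in `order` that is not near its limit."""
    for name in order:
        if not trackers[name].near_limit(now):
            trackers[name].record(now)
            return name
    return None  # every provider is close to its limit


_cache: Dict[str, str] = {}


def cached_call(provider: str, prompt: str, send: Callable[[str], str]) -> str:
    """Call `send` only if this provider/prompt pair has not been seen."""
    key = hashlib.sha256(f"{provider}\n{prompt}".encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = send(prompt)
    return _cache[key]
```

With Groq's 20 RPM free tier and this headroom, `choose` would start falling back after 18 requests inside one minute. The cache is keyed on exact prompt text, so it only helps with repeated identical requests.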