llm-hub/docs/PROVIDERS.md

# Provider Setup Guide

## Free Tier Providers

### Groq (Fastest)
- **URL**: https://console.groq.com
- **Free Tier**: 20 RPM (requests per minute), variable TPM (tokens per minute)
- **Models**: Llama 3.3 70B, Llama 3.1 8B
- **Best For**: Speed, quick coding tasks
- **Tip**: Create multiple accounts with different phones for load balancing

### Mistral (High Volume)
- **URL**: https://console.mistral.ai
- **Free Tier**: 1 billion tokens/month
- **Models**: Mistral Small, Medium
- **Best For**: High-volume processing, chatbots

### OpenRouter (Universal Access)
- **URL**: https://openrouter.ai
- **Free Tier**: 50 requests/day
- **Access**: Kimi K2:free, Gemini Flash:free
- **Best For**: Testing, fallback access

### Cohere (Embeddings)
- **URL**: https://cohere.com
- **Free Tier**: 1,000 calls/month
- **Best For**: Embeddings, RAG systems

## Trial/Cheap Providers

### Anthropic Claude (Highest Quality)
- **URL**: https://console.anthropic.com
- **Trial**: $5 free credits (new users)
- **Student**: $500 credits (apply with .edu)
- **Cost**: $3/M input tokens (Sonnet), $0.25/M input tokens (Haiku)
- **Best For**: Complex reasoning, analysis, code review

### Moonshot Kimi (Best Value)
- **URL**: https://platform.moonshot.ai
- **Bonus**: $5 signup credit
- **Cost**: $0.60/M input, $2.50/M output
- **Context**: 128K tokens
- **Best For**: Coding, long documents, Chinese content

### DeepSeek (Cheapest Reasoning)
- **URL**: https://platform.deepseek.com
- **Cost**: $0.14/M input, $0.28/M output
- **Best For**: Reasoning tasks, math, code
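To compare the per-million-token prices above on a concrete workload, the arithmetic can be sketched as follows. This is a minimal illustration, not part of the router: the `PRICES_PER_MILLION` table and the `estimate_cost` helper are hypothetical names, and only providers with both input and output prices listed in this guide are included.

```python
# Prices in USD per million tokens, copied from the listings in this guide.
# Claude is left out because no output price is listed here.
PRICES_PER_MILLION = {
    "kimi": {"input": 0.60, "output": 2.50},
    "deepseek": {"input": 0.14, "output": 0.28},
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a request (hypothetical helper)."""
    price = PRICES_PER_MILLION[provider]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Example workload: 100K input tokens and 20K output tokens.
for name in PRICES_PER_MILLION:
    print(f"{name}: ${estimate_cost(name, 100_000, 20_000):.4f}")
```

For this workload the estimate is about $0.11 on Kimi and about $0.02 on DeepSeek, so output-heavy jobs widen the gap between the two.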

## Configuration Priority

The system routes requests in this priority order:

1. **Fast tasks** → Groq (free, instant)
2. **High volume** → Mistral (1B free tokens/month)
3. **Complex coding** → Kimi (cheap, 128K context)
4. **Quality critical** → Claude (expensive but best)
5. **Fallback** → OpenRouter free tier
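The priority list above can be read as a lookup from task type to preferred provider, with OpenRouter as the catch-all. The sketch below shows one way to express that. The `TaskType` enum, the `PREFERRED` table, and `choose_provider` are illustrative names assumed for this example, not the router's actual API.

```python
from enum import Enum


class TaskType(Enum):
    FAST = "fast"
    HIGH_VOLUME = "high_volume"
    COMPLEX_CODING = "complex_coding"
    QUALITY_CRITICAL = "quality_critical"


# Preferred provider per task type, mirroring the priority list.
PREFERRED = {
    TaskType.FAST: "groq",
    TaskType.HIGH_VOLUME: "mistral",
    TaskType.COMPLEX_CODING: "kimi",
    TaskType.QUALITY_CRITICAL: "claude",
}
FALLBACK = "openrouter"


def choose_provider(task: TaskType, available: set) -> str:
    """Pick the preferred provider if it is available, else the fallback."""
    preferred = PREFERRED[task]
    if preferred in available:
        return preferred
    if FALLBACK in available:
        return FALLBACK
    raise RuntimeError("no provider available for this task")


# Groq is unavailable here, so a fast task goes to the fallback.
print(choose_provider(TaskType.FAST, {"mistral", "openrouter"}))
```

Keeping the mapping in a plain dictionary means the priority order can be changed without touching the selection logic.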

## Rate Limit Management

The router automatically:
- Tracks RPM and TPM usage for every provider
- Distributes load across accounts (for example, multiple Groq accounts)
- Falls back to another provider when a limit is approached
- Caches responses to reduce API calls
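As a rough illustration of the tracking and fallback behaviour, the sketch below keeps a 60-second sliding window of request timestamps per provider and skips any provider that is close to its limit. It is a sketch under stated assumptions, not the router's implementation: `RpmTracker`, `pick_with_fallback`, and the 90% headroom threshold are hypothetical, and the OpenRouter per-minute limit is a placeholder because this guide only lists a daily quota for it.

```python
import time
from collections import deque
from typing import Dict, List, Optional


class RpmTracker:
    """Sliding-window count of requests made in the last 60 seconds."""

    def __init__(self, rpm_limit: int, clock=time.monotonic):
        self.rpm_limit = rpm_limit
        self.clock = clock  # injectable so the window can be tested
        self.timestamps = deque()

    def _prune(self) -> None:
        # Drop timestamps that are 60 seconds old or older.
        cutoff = self.clock() - 60.0
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()

    def has_capacity(self, headroom: float = 0.9) -> bool:
        # Treat the provider as full once usage reaches 90% of its limit
        # (assumed threshold for "limit approached").
        self._prune()
        return len(self.timestamps) < self.rpm_limit * headroom

    def record(self) -> None:
        self.timestamps.append(self.clock())


def pick_with_fallback(
    trackers: Dict[str, RpmTracker], order: List[str]
) -> Optional[str]:
    """Return the first provider in `order` with spare capacity, or None."""
    for name in order:
        if trackers[name].has_capacity():
            trackers[name].record()
            return name
    return None


trackers = {
    "groq": RpmTracker(20),        # 20 RPM, as listed for Groq's free tier
    "openrouter": RpmTracker(10),  # placeholder per-minute limit (assumption)
}
print(pick_with_fallback(trackers, ["groq", "openrouter"]))
```

TPM tracking could follow the same pattern by storing token counts alongside each timestamp and summing them over the window.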