docs: rewrite README with comprehensive documentation and examples
parent 2cafb31cb4
commit 23f2e9541e
403 README.md
@@ -1,60 +1,407 @@
|
||||||
|
<div align="center">
|
||||||
|
|
||||||
# 🤖 Agentic LLM Hub
|
# 🤖 Agentic LLM Hub
|
||||||
|
|
||||||
Self-hosted AI agent platform with multi-provider LLM aggregation, reasoning engines (ReAct, Plan-and-Execute, Reflexion), MCP tools, and web IDE.
|
**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and integrated development environment.**
|
||||||
|
|
||||||
|
[Docker](https://www.docker.com/)
|
||||||
|
[License](LICENSE)
|
||||||
|
[Linux](https://www.linux.org/)
|
||||||
|
|
||||||
|
[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📋 Overview
|
||||||
|
|
||||||
|
Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.
|
||||||
|
|
||||||
|
### Why Agentic LLM Hub?
|
||||||
|
|
||||||
|
- **🔌 Multi-Provider Aggregation** — Seamlessly route between 7+ LLM providers with automatic failover
|
||||||
|
- **🧠 Advanced Reasoning** — Choose from multiple reasoning modes based on task complexity
|
||||||
|
- **💰 Cost Optimization** — Free tier providers prioritized, paid providers as fallback
|
||||||
|
- **🖥️ Complete Workspace** — Browser-based VS Code with AI assistant integration
|
||||||
|
- **🔧 Extensible** — MCP tool ecosystem for custom integrations
|
||||||
|
- **📊 Persistent Memory** — Vector-based conversation history and knowledge storage
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## ✨ Features
|
||||||
|
|
||||||
|
| Component | Technology | Purpose |
|
||||||
|
|-----------|------------|---------|
|
||||||
|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
|
||||||
|
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
|
||||||
|
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
|
||||||
|
| **Cache Layer** | Redis | Response caching & session management |
|
||||||
|
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
|
||||||
|
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
|
||||||
|
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |
|
||||||
|
|
||||||
|
### Supported Providers
|
||||||
|
|
||||||
|
| Provider | Free Tier | Best For |
|
||||||
|
|----------|-----------|----------|
|
||||||
|
| **Groq** | 20 RPM | Speed, quick coding |
|
||||||
|
| **Mistral** | 1B tokens/month | High volume processing |
|
||||||
|
| **OpenRouter** | 50 req/day | Universal fallback access |
|
||||||
|
| **Anthropic Claude** | $5 trial | Complex reasoning |
|
||||||
|
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
|
||||||
|
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
|
||||||
|
| **Cohere** | 1K calls/month | Embeddings |
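
The "free tier first, paid as fallback" policy can be illustrated with a small Python sketch. This is not how LiteLLM is configured in this stack; it only shows the ordering idea. The `Provider` class and both function names are hypothetical, and which providers count as free tier is an assumption you would set yourself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Provider:
    name: str
    free_tier: bool  # assumption: True when the provider has a recurring free quota


def failover_order(providers: Iterable[Provider]) -> List[Provider]:
    """Free-tier providers first, paid ones last; input order is kept within each group."""
    # sorted() is stable, and False sorts before True, so free-tier entries come first.
    return sorted(providers, key=lambda p: not p.free_tier)


def complete_with_failover(providers: Iterable[Provider],
                           call: Callable[[Provider], str]) -> str:
    """Try each provider in failover order and return the first successful reply."""
    errors = []
    for provider in failover_order(providers):
        try:
            return call(provider)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Here `call` stands in for whatever function sends the request to a single provider; if it raises, the next provider in the list is tried.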
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## 🚀 Quick Start
|
## 🚀 Quick Start
|
||||||
|
|
||||||
|
### Prerequisites
|
||||||
|
|
||||||
|
- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
|
||||||
|
- **RAM**: 4GB minimum (8GB recommended with IDE)
|
||||||
|
- **Storage**: 20GB free space
|
||||||
|
- **Docker** & Docker Compose
|
||||||
|
|
||||||
|
### Installation
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 1. Clone from your Gitea
|
# 1. Clone the repository
|
||||||
git clone https://gitea.yourdomain.com/youruser/llm-hub.git
|
git clone https://github.com/yourusername/llm-hub.git
|
||||||
cd llm-hub
|
cd llm-hub
|
||||||
|
|
||||||
# 2. Configure
|
# 2. Configure environment
|
||||||
cp .env.example .env
|
cp .env.example .env
|
||||||
nano .env # Add your API keys
|
nano .env # Add your API keys
|
||||||
|
|
||||||
# 3. Deploy
|
# 3. Deploy
|
||||||
./setup.sh && ./start.sh
|
make setup
|
||||||
|
make start
|
||||||
```
|
```
|
||||||
|
|
||||||
## 📡 Access Points
|
### Or use scripts directly:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
chmod +x setup.sh start.sh
|
||||||
|
./setup.sh && ./start.sh full
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🌐 Access Points
|
||||||
|
|
||||||
|
Once running, access the services at:
|
||||||
|
|
||||||
| Service | URL | Description |
|
| Service | URL | Description |
|
||||||
|---------|-----|-------------|
|
|---------|-----|-------------|
|
||||||
| VS Code IDE | `http://your-ip:8443` | Full IDE with Continue.dev |
|
| 📝 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
|
||||||
| Agent API | `http://your-ip:8080/v1` | Main API endpoint |
|
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
|
||||||
| LiteLLM | `http://your-ip:4000` | LLM Gateway |
|
| 🔌 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
|
||||||
| MCP Tools | `http://your-ip:8001/docs` | Tool OpenAPI docs |
|
| ⚡ **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
|
||||||
| ChromaDB | `http://your-ip:8000` | Vector memory |
|
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
|
||||||
| Web UI | `http://your-ip:3000` | Chat interface |
|
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory (HTTP API) |
|
||||||
|
|
||||||
## 🔧 Supported Providers
|
---
|
||||||
|
|
||||||
- **Groq** (Free tier, fast)
|
|
||||||
- **Mistral** (1B tokens/month free)
|
|
||||||
- **Anthropic Claude** (Trial credits)
|
|
||||||
- **Moonshot Kimi** ($5 signup bonus)
|
|
||||||
- **OpenRouter** (Free tier access)
|
|
||||||
- **Cohere** (1K calls/month)
|
|
||||||
- **DeepSeek** (Cheap reasoning)
|
|
||||||
|
|
||||||
## 🧠 Reasoning Modes
|
## 🧠 Reasoning Modes
|
||||||
|
|
||||||
- `react` - Fast iterative reasoning
|
Choose the right reasoning strategy for your task:
|
||||||
- `plan_execute` - Complex multi-step tasks
|
|
||||||
- `reflexion` - Self-correcting with verification
|
| Mode | Description | Speed | Accuracy | Best For |
|
||||||
- `auto` - Automatic selection
|
|------|-------------|-------|----------|----------|
|
||||||
|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
|
||||||
|
| `plan_execute` | Plan first, then execute | 🚀 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
|
||||||
|
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
|
||||||
|
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |
|
||||||
|
|
||||||
|
Set your default mode in `.env`:
|
||||||
|
```bash
|
||||||
|
DEFAULT_REASONING=auto
|
||||||
|
```
|
||||||
|
|
||||||
|
Or specify per-request:
|
||||||
|
```bash
|
||||||
|
curl -X POST http://localhost:8080/v1/chat/completions \
|
||||||
|
-H "Authorization: Bearer sk-agent-xxx" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"message": "Refactor this codebase",
|
||||||
|
"reasoning_mode": "reflexion"
|
||||||
|
}'
|
||||||
|
```
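
When building requests from Python, it can help to validate the mode before sending. The sketch below assumes the request body shape used in the cURL example (`message`, `reasoning_mode`, and the optional `max_iterations` field that appears in the usage examples); `build_chat_payload` is a hypothetical helper, not part of the project.

```python
import json

# Mode names taken from the table above.
REASONING_MODES = {"react", "plan_execute", "reflexion", "auto"}


def build_chat_payload(message, reasoning_mode="auto", max_iterations=None):
    """Return the JSON body for POST /v1/chat/completions, rejecting unknown modes."""
    if reasoning_mode not in REASONING_MODES:
        raise ValueError(f"unknown reasoning_mode: {reasoning_mode!r}")
    payload = {"message": message, "reasoning_mode": reasoning_mode}
    if max_iterations is not None:
        payload["max_iterations"] = max_iterations
    return payload


# Print the body that would be sent for the reflexion example.
print(json.dumps(build_chat_payload("Refactor this codebase", "reflexion")))
```

Rejecting a misspelled mode on the client side avoids spending a request on input the server may refuse or silently treat as a default.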
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🛠️ Configuration
|
||||||
|
|
||||||
|
### 1. API Keys
|
||||||
|
|
||||||
|
Edit `.env` and add at least one provider:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Required: At least one LLM provider
|
||||||
|
GROQ_API_KEY_1=gsk_xxx
|
||||||
|
MISTRAL_API_KEY=your_key
|
||||||
|
|
||||||
|
# Recommended: Multiple providers for redundancy
|
||||||
|
ANTHROPIC_API_KEY=sk-ant-xxx
|
||||||
|
MOONSHOT_API_KEY=sk-xxx
|
||||||
|
OPENROUTER_API_KEY=sk-or-xxx
|
||||||
|
```
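
A quick pre-flight check can confirm that at least one provider key is present before starting the stack. This is a minimal sketch that uses the variable names from the `.env` example above; `configured_providers` and `require_provider` are hypothetical helpers, and the check only tests for non-empty values, so placeholder values such as `gsk_xxx` would still pass.

```python
import os

# Variable names as they appear in the .env example above.
PROVIDER_KEYS = {
    "Groq": "GROQ_API_KEY_1",
    "Mistral": "MISTRAL_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Moonshot": "MOONSHOT_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


def require_provider(env=None):
    """Raise if no provider key is configured; otherwise return the configured names."""
    found = configured_providers(env)
    if not found:
        raise RuntimeError("No LLM provider key set; add at least one to .env")
    return found
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.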
|
||||||
|
|
||||||
|
### 2. Security
|
||||||
|
|
||||||
|
Change default passwords in `.env`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Code-Server access
|
||||||
|
IDE_PASSWORD=your-secure-password
|
||||||
|
IDE_SUDO_PASSWORD=your-admin-password
|
||||||
|
|
||||||
|
# API access
|
||||||
|
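# Generate the random part with `openssl rand -hex 16` and paste it here;
# Docker Compose reads .env values literally, so $(...) would not be executed.
MASTER_KEY=sk-agent-<32-hex-characters>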
|
||||||
|
```
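
If `openssl` is not available, an equivalent key can be produced with Python's standard `secrets` module. The sketch below mirrors `openssl rand -hex 16` (16 random bytes, printed as 32 hex characters); the `generate_master_key` name is hypothetical, and the `sk-agent-` prefix is taken from the example above.

```python
import secrets


def generate_master_key(prefix="sk-agent-", nbytes=16):
    """Return prefix plus 2*nbytes random hex characters from the OS CSPRNG."""
    return prefix + secrets.token_hex(nbytes)


# Print a line that can be pasted into .env.
print(f"MASTER_KEY={generate_master_key()}")
```

Each run prints a different value, so generate the key once and store the result in `.env`.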
|
||||||
|
|
||||||
|
### 3. Advanced Settings
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Enable self-reflection
|
||||||
|
ENABLE_REFLECTION=true
|
||||||
|
|
||||||
|
# Maximum iterations per request
|
||||||
|
MAX_ITERATIONS=10
|
||||||
|
|
||||||
|
# Knowledge graph (requires more RAM)
|
||||||
|
ENABLE_KNOWLEDGE_GRAPH=false
|
||||||
|
```
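
Values in `.env` arrive as strings, so a service that reads them has to convert them to booleans and integers. The following sketch shows one way that parsing could look; it is not the project's actual loader. `AgentSettings` and `load_settings` are hypothetical names, and the defaults simply mirror the example values above.

```python
import os
from dataclasses import dataclass

# Strings treated as "enabled"; anything else counts as disabled.
_TRUE = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class AgentSettings:
    enable_reflection: bool
    max_iterations: int
    enable_knowledge_graph: bool


def load_settings(env=None):
    """Parse the advanced settings from a mapping of environment variables."""
    env = os.environ if env is None else env

    def flag(name, default):
        return env.get(name, default).strip().lower() in _TRUE

    max_iterations = int(env.get("MAX_ITERATIONS", "10"))
    if max_iterations < 1:
        raise ValueError("MAX_ITERATIONS must be at least 1")
    return AgentSettings(
        enable_reflection=flag("ENABLE_REFLECTION", "true"),
        max_iterations=max_iterations,
        enable_knowledge_graph=flag("ENABLE_KNOWLEDGE_GRAPH", "false"),
    )
```

Validating `MAX_ITERATIONS` at load time surfaces a bad value at startup instead of in the middle of a request.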
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 💻 Usage Examples
|
||||||
|
|
||||||
|
### Python SDK
|
||||||
|
|
||||||
|
```python
|
||||||
|
import requests
|
||||||
|
|
||||||
|
API_URL = "http://localhost:8080/v1"
|
||||||
|
API_KEY = "sk-agent-xxx"
|
||||||
|
|
||||||
|
response = requests.post(
|
||||||
|
f"{API_URL}/chat/completions",
|
||||||
|
headers={"Authorization": f"Bearer {API_KEY}"},
|
||||||
|
json={
|
||||||
|
"message": "Create a Python script to fetch weather data",
|
||||||
|
"reasoning_mode": "plan_execute",
|
||||||
|
"session_id": "my-session-001"
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
response.raise_for_status()  # Stop here on HTTP errors instead of parsing an error body
result = response.json()
|
||||||
|
print(result["response"])
|
||||||
|
print(f"Steps taken: {len(result['steps'])}")
|
||||||
|
```
|
||||||
|
|
||||||
|
### cURL
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Simple query
|
||||||
|
curl -X POST http://localhost:8080/v1/chat/completions \
|
||||||
|
-H "Authorization: Bearer sk-agent-xxx" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"message": "Explain quantum computing",
|
||||||
|
"reasoning_mode": "react"
|
||||||
|
}'
|
||||||
|
|
||||||
|
# Complex task with history
|
||||||
|
curl -X POST http://localhost:8080/v1/chat/completions \
|
||||||
|
-H "Authorization: Bearer sk-agent-xxx" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"message": "Build a REST API with FastAPI",
|
||||||
|
"reasoning_mode": "plan_execute",
|
||||||
|
"max_iterations": 15
|
||||||
|
}'
|
||||||
|
```
|
||||||
|
|
||||||
|
### OpenAI-Compatible API
|
||||||
|
|
||||||
|
```python
|
||||||
|
from openai import OpenAI
|
||||||
|
|
||||||
|
client = OpenAI(
|
||||||
|
base_url="http://localhost:8080/v1",
|
||||||
|
api_key="sk-agent-xxx"
|
||||||
|
)
|
||||||
|
|
||||||
|
response = client.chat.completions.create(
|
||||||
|
model="agent/orchestrator",
|
||||||
|
messages=[{"role": "user", "content": "Hello!"}],
|
||||||
|
extra_body={"reasoning_mode": "auto"}
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📊 Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────┐
|
||||||
|
│ Agentic LLM Hub │
|
||||||
|
├─────────────────────────────────────────────────────────────┤
|
||||||
|
│ │
|
||||||
|
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||||
|
│ │ Web UI │ │ VS Code IDE │ │ API │ │
|
||||||
|
│ │ (:3000) │ │ (:8443) │ │ (:8080) │ │
|
||||||
|
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
||||||
|
│ │ │ │ │
|
||||||
|
│ └─────────────────┼─────────────────┘ │
|
||||||
|
│ │ │
|
||||||
|
│ ┌────────────┴────────────┐ │
|
||||||
|
│ │ Agent Core │ │
|
||||||
|
│ │ (Reasoning Engines) │ │
|
||||||
|
│ └────────────┬────────────┘ │
|
||||||
|
│ │ │
|
||||||
|
│ ┌───────────────────┼───────────────────┐ │
|
||||||
|
│ │ │ │ │
|
||||||
|
│ ┌────┴────┐ ┌─────┴─────┐ ┌──────┴──────┐ │
|
||||||
|
│ │ Redis │ │ LiteLLM │ │ ChromaDB │ │
|
||||||
|
│ │ (Cache) │ │ (Gateway) │ │ (Memory) │ │
|
||||||
|
│ │(:6379) │ │ (:4000) │ │ (:8000) │ │
|
||||||
|
│ └─────────┘ └─────┬─────┘ └─────────────┘ │
|
||||||
|
│ │ │
|
||||||
|
│ ┌─────────────┼─────────────┐ │
|
||||||
|
│ │ │ │ │
|
||||||
|
│ ┌─────┴──┐ ┌─────┴──┐ ┌─────┴──┐ │
|
||||||
|
│ │ Groq │ │ Claude │ │ Mistral│ │
|
||||||
|
│ └───┬────┘ └───┬────┘ └───┬────┘ │
|
||||||
|
│ │ │ │ │
|
||||||
|
│ ┌───┴──┐ ┌────┴───┐ ┌────┴───┐ │
|
||||||
|
│ │ Kimi │ │DeepSeek│ │ ... │ │
|
||||||
|
│ └──────┘ └────────┘ └────────┘ │
|
||||||
|
│ │
|
||||||
|
└─────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🧰 Management Commands
|
||||||
|
|
||||||
|
Use the Makefile for common operations:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
make setup # Initial setup
|
||||||
|
make start # Start all services (full profile)
|
||||||
|
make start-ide # Start with IDE only
|
||||||
|
make stop # Stop all services
|
||||||
|
make logs # View logs
|
||||||
|
make status # Check service status
|
||||||
|
make update # Pull latest and update images
|
||||||
|
make backup # Backup data directories
|
||||||
|
make clean # Remove containers (data preserved)
|
||||||
|
```
|
||||||
|
|
||||||
|
Or use Docker Compose profiles:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Core services only
|
||||||
|
docker-compose up -d
|
||||||
|
|
||||||
|
# Full stack with IDE and UI
|
||||||
|
docker-compose --profile ide --profile ui up -d
|
||||||
|
|
||||||
|
# With MCP tools
|
||||||
|
docker-compose --profile mcp up -d
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## 📚 Documentation
|
## 📚 Documentation
|
||||||
|
|
||||||
- [Setup Guide](docs/SETUP.md)
|
- **[Setup Guide](docs/SETUP.md)** — Detailed installation and configuration
|
||||||
- [API Reference](docs/API.md)
|
- **[API Reference](docs/API.md)** — Complete API documentation with examples
|
||||||
- [Provider Guide](docs/PROVIDERS.md)
|
- **[Provider Guide](docs/PROVIDERS.md)** — Provider setup and rate limits
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## 🔄 Updates
|
## 🔄 Updates
|
||||||
|
|
||||||
|
Update to the latest version:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
# Automatic update
|
||||||
|
make update
|
||||||
|
|
||||||
|
# Or manual
|
||||||
git pull origin main
|
git pull origin main
|
||||||
docker-compose pull
|
docker-compose pull
|
||||||
docker-compose up -d
|
docker-compose up -d
|
||||||
```
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🐛 Troubleshooting
|
||||||
|
|
||||||
|
### Common Issues
|
||||||
|
|
||||||
|
| Issue | Solution |
|
||||||
|
|-------|----------|
|
||||||
|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
|
||||||
|
| Port conflicts | Edit `docker-compose.yml` port mappings |
|
||||||
|
| Permission denied | Run `chown -R 1000:1000 workspace/` |
|
||||||
|
| API not responding | Check `docker-compose logs agent-core` |
|
||||||
|
| Out of memory | Increase swap or reduce to core services only |
|
||||||
|
|
||||||
|
### Health Checks
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Check API health
|
||||||
|
curl http://localhost:8080/health
|
||||||
|
|
||||||
|
# Check LiteLLM
|
||||||
|
curl http://localhost:4000/health
|
||||||
|
|
||||||
|
# View all logs
|
||||||
|
make logs
|
||||||
|
|
||||||
|
# Check container status
|
||||||
|
docker-compose ps
|
||||||
|
```
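
The two `curl` health checks can also be scripted, for example to poll the services after a restart. This sketch uses only the Python standard library and the two health URLs shown above; the service labels and function names are hypothetical. Depending on configuration, an endpoint may require an `Authorization` header, in which case it would be reported as unhealthy here.

```python
import urllib.error
import urllib.request

# Health endpoints from the curl commands above; the labels are illustrative.
HEALTH_URLS = {
    "agent-api": "http://localhost:8080/health",
    "litellm": "http://localhost:4000/health",
}


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None


def check_health(urls=HEALTH_URLS, fetch=http_status):
    """Map each service label to True when its health endpoint returns HTTP 200."""
    return {name: fetch(url) == 200 for name, url in urls.items()}
```

Calling `check_health()` returns a dictionary with one boolean per service. The `fetch` parameter exists so the function can be exercised with a stub instead of real network calls.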
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🛡️ Security Considerations
|
||||||
|
|
||||||
|
1. **Change default passwords** in `.env` before deploying
|
||||||
|
2. **Use HTTPS** in production (reverse proxy recommended)
|
||||||
|
3. **Restrict network access** to admin ports (8080, 8443)
|
||||||
|
4. **Rotate API keys** regularly
|
||||||
|
5. **Review provider rate limits** to prevent unexpected costs
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🤝 Contributing
|
||||||
|
|
||||||
|
Contributions are welcome! Please:
|
||||||
|
|
||||||
|
1. Fork the repository
|
||||||
|
2. Create a feature branch
|
||||||
|
3. Submit a pull request
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📄 License
|
||||||
|
|
||||||
|
This project is licensed under the MIT License.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
<div align="center">
|
||||||
|
|
||||||
|
**[⬆ Back to Top](#-agentic-llm-hub)**
|
||||||
|
|
||||||
|
Built with ❤️ for the self-hosting community
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
|
||||||