docs: rewrite README with comprehensive documentation and examples

ImpulsiveFPS 2026-02-01 15:34:21 +01:00
parent 2cafb31cb4
commit 23f2e9541e
1 changed file with 375 additions and 28 deletions

README.md
<div align="center">
# 🤖 Agentic LLM Hub
**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and an integrated development environment.**
[![Docker](https://img.shields.io/badge/docker-ready-blue?logo=docker)](https://www.docker.com/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Platform](https://img.shields.io/badge/platform-linux-lightgrey?logo=linux)](https://www.linux.org/)
[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)
</div>
---
## 📋 Overview
Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.
### Why Agentic LLM Hub?
- **🔌 Multi-Provider Aggregation** — Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** — Choose from multiple reasoning modes based on task complexity
- **💰 Cost Optimization** — Free tier providers prioritized, paid providers as fallback
- **🖥️ Complete Workspace** — Browser-based VS Code with AI assistant integration
- **🔧 Extensible** — MCP tool ecosystem for custom integrations
- **📊 Persistent Memory** — Vector-based conversation history and knowledge storage
---
## ✨ Features
| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |
### Supported Providers
| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 RPM | Speed, quick coding |
| **Mistral** | 1B tokens/month | High volume processing |
| **OpenRouter** | 50 req/day | Universal fallback access |
| **Anthropic Claude** | $5 trial | Complex reasoning |
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |
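All of these sit behind the LiteLLM gateway on port 4000, which speaks the OpenAI wire format, so a single provider can be exercised directly; the model alias below is a placeholder for whatever your LiteLLM config defines (the configured aliases are usually listable via `GET http://localhost:4000/v1/models`):
```python
from openai import OpenAI

# Call one provider through the LiteLLM gateway (OpenAI-compatible).
# "groq/llama-3.1-8b-instant" is a hypothetical alias; substitute one
# that your LiteLLM config actually defines.
client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-agent-xxx")
response = client.chat.completions.create(
    model="groq/llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "One-line summary of ReAct?"}],
)
print(response.choices[0].message.content)
```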
---
## 🚀 Quick Start
### Prerequisites
- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
- **RAM**: 4GB minimum (8GB recommended with IDE)
- **Storage**: 20GB free space
- **Docker** & Docker Compose
### Installation
```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub
# 2. Configure environment
cp .env.example .env
nano .env  # Add your API keys
# 3. Deploy
make setup
make start
```
### Or use the scripts directly:
```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```
---
## 🌐 Access Points
Once running, access the services at:
| Service | URL | Description |
|---------|-----|-------------|
| 📝 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| 🔌 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory dashboard |
---
## 🧠 Reasoning Modes
Choose the right reasoning strategy for your task:
| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | 🚀 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |
Set your default mode in `.env`:
```bash
DEFAULT_REASONING=auto
```
Or specify per-request:
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Refactor this codebase",
"reasoning_mode": "reflexion"
}'
```
---
## 🛠️ Configuration
### 1. API Keys
Edit `.env` and add at least one provider:
```bash
# Required: At least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key
# Recommended: Multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```
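To sanity-check that the keys actually made it into `.env` before starting the stack, a small sketch (assumes `python-dotenv` is installed; variable names as above):
```python
import os
from dotenv import load_dotenv

# Load .env and report which of the documented provider keys are set.
load_dotenv(".env")
keys = ["GROQ_API_KEY_1", "MISTRAL_API_KEY", "ANTHROPIC_API_KEY",
        "MOONSHOT_API_KEY", "OPENROUTER_API_KEY"]
configured = [k for k in keys if os.getenv(k)]
print("Configured providers:", ", ".join(configured) or "none (add at least one)")
```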
### 2. Security
Change default passwords in `.env`:
```bash
# Code-Server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password
# API access
MASTER_KEY=sk-agent-$(openssl rand -hex 16)
```
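If `openssl` isn't on hand, Python's standard library generates an equivalent key (the `sk-agent-` prefix only mirrors the convention used above):
```python
import secrets

# Same entropy as `openssl rand -hex 16`: 16 random bytes, hex-encoded.
print(f"sk-agent-{secrets.token_hex(16)}")
```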
### 3. Advanced Settings
```bash
# Enable self-reflection
ENABLE_REFLECTION=true
# Maximum iterations per request
MAX_ITERATIONS=10
# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```
---
## 💻 Usage Examples
### Python SDK
```python
import requests
API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"
response = requests.post(
f"{API_URL}/chat/completions",
headers={"Authorization": f"Bearer {API_KEY}"},
json={
"message": "Create a Python script to fetch weather data",
"reasoning_mode": "plan_execute",
"session_id": "my-session-001"
}
)
result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```
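Because the request carries a `session_id`, a later call with the same id should continue the same conversation out of vector memory; a sketch of a follow-up turn (same field names as above):
```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

# Reusing session_id "my-session-001" lets the agent draw on the
# history stored for the first request.
followup = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Now add retry logic to that weather script",
        "session_id": "my-session-001",
    },
    timeout=120,
)
followup.raise_for_status()
print(followup.json()["response"])
```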
### cURL
```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Explain quantum computing",
"reasoning_mode": "react"
}'
# Complex task with history
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Build a REST API with FastAPI",
"reasoning_mode": "plan_execute",
"max_iterations": 15
}'
```
### OpenAI-Compatible API
```python
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="sk-agent-xxx"
)
response = client.chat.completions.create(
model="agent/orchestrator",
messages=[{"role": "user", "content": "Hello!"}],
extra_body={"reasoning_mode": "auto"}
)
```
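Whether the agent endpoint forwards OpenAI-style streaming is not documented here, but if it does, token-by-token output would look like this:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-agent-xxx")
# stream=True assumes the gateway relays OpenAI-style server-sent events.
stream = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Explain the ReAct loop briefly."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```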
---
## 📊 Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Agentic LLM Hub │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Web UI │ │ VS Code IDE │ │ API │ │
│ │ (:3000) │ │ (:8443) │ │ (:8080) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ └─────────────────┼─────────────────┘ │
│ │ │
│ ┌────────────┴────────────┐ │
│ │ Agent Core │ │
│ │ (Reasoning Engines) │ │
│ └────────────┬────────────┘ │
│ │ │
│ ┌───────────────────┼───────────────────┐ │
│ │ │ │ │
│ ┌────┴────┐ ┌─────┴─────┐ ┌──────┴──────┐ │
│ │ Redis │ │ LiteLLM │ │ ChromaDB │ │
│ │ (Cache) │ │ (Gateway) │ │ (Memory) │ │
│ │(:6379) │ │ (:4000) │ │ (:8000) │ │
│ └─────────┘ └─────┬─────┘ └─────────────┘ │
│ │ │
│ ┌─────────────┼─────────────┐ │
│ │ │ │ │
│ ┌─────┴──┐ ┌─────┴──┐ ┌─────┴──┐ │
│ │ Groq │ │ Claude │ │ Mistral│ │
│ └───┬────┘ └───┬────┘ └───┬────┘ │
│ │ │ │ │
│ ┌───┴──┐ ┌────┴───┐ ┌────┴───┐ │
│ │ Kimi │ │DeepSeek│ │ ... │ │
│ └──────┘ └────────┘ └────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
---
## 🧰 Management Commands
Use the Makefile for common operations:
```bash
make setup # Initial setup
make start # Start all services (full profile)
make start-ide # Start with IDE only
make stop # Stop all services
make logs # View logs
make status # Check service status
make update # Pull latest and update images
make backup # Backup data directories
make clean # Remove containers (data preserved)
```
Or use Docker Compose profiles:
```bash
# Core services only
docker-compose up -d
# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d
# With MCP tools
docker-compose --profile mcp up -d
```
---
## 📚 Documentation
- **[Setup Guide](docs/SETUP.md)** — Detailed installation and configuration
- **[API Reference](docs/API.md)** — Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** — Provider setup and rate limits
---
## 🔄 Updates
Update to the latest version:
```bash
# Automatic update
make update
# Or manually
git pull origin main
docker-compose pull
docker-compose up -d
```
---
## 🐛 Troubleshooting
### Common Issues
| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit `docker-compose.yml` port mappings |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or reduce to core services only |
### Health Checks
```bash
# Check API health
curl http://localhost:8080/health
# Check LiteLLM
curl http://localhost:4000/health
# View all logs
make logs
# Check container status
docker-compose ps
```
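To poll several services in one go, a small script over the documented endpoints (only the two `/health` routes above are confirmed; extend the dict for the rest of your stack):
```python
import requests

# Health endpoints taken from the curl commands above.
CHECKS = {
    "agent-core": "http://localhost:8080/health",
    "litellm": "http://localhost:4000/health",
}
for name, url in CHECKS.items():
    try:
        code = requests.get(url, timeout=5).status_code
        print(f"{name:12s} {'OK' if code == 200 else f'HTTP {code}'}")
    except requests.RequestException as exc:
        print(f"{name:12s} DOWN ({type(exc).__name__})")
```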
---
## 🛡️ Security Considerations
1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (reverse proxy recommended)
3. **Restrict network access** to admin ports (8080, 8443)
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
---
## 🤝 Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Submit a pull request
---
## 📄 License
This project is licensed under the MIT License.
---
<div align="center">
**[⬆ Back to Top](#-agentic-llm-hub)**
Built with ❤️ for the self-hosting community
</div>