<div align="center">

# 🤖 Agentic LLM Hub

**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and integrated development environment.**

[](https://www.docker.com/)
[](LICENSE)
[](https://www.linux.org/)

[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)

</div>

---

## 📋 Overview

Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.

### Why Agentic LLM Hub?

- **🔌 Multi-Provider Aggregation** — Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** — Choose from multiple reasoning modes based on task complexity
- **💰 Cost Optimization** — Free-tier providers prioritized, paid providers as fallback
- **🖥️ Complete Workspace** — Browser-based VS Code with AI assistant integration
- **🔧 Extensible** — MCP tool ecosystem for custom integrations
- **📊 Persistent Memory** — Vector-based conversation history and knowledge storage

---

## ✨ Features

| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |

### Supported Providers

| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 RPM | Speed, quick coding |
| **Mistral** | 1B tokens/month | High-volume processing |
| **OpenRouter** | 50 req/day | Universal fallback access |
| **Anthropic Claude** | $5 trial | Complex reasoning |
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |
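
The free-tier-first failover described above happens server-side in the LiteLLM gateway. As an illustrative sketch only (the priority order and stub clients below are assumptions, not the hub's actual implementation), the policy looks roughly like this in Python:

```python
from typing import Callable, Dict

# Cheapest (free-tier) providers first, paid fallbacks last.
# Order is illustrative; the real routing lives in the gateway config.
PROVIDER_PRIORITY = ["groq", "mistral", "openrouter", "deepseek", "anthropic"]

def call_with_failover(providers: Dict[str, Callable[[str], str]], prompt: str) -> str:
    """Try each configured provider in priority order; raise if all fail."""
    errors: Dict[str, str] = {}
    for name in PROVIDER_PRIORITY:
        if name not in providers:
            continue
        try:
            return providers[name](prompt)
        except Exception as exc:  # rate limit, timeout, outage, ...
            errors[name] = str(exc)
    raise RuntimeError(f"All providers failed: {errors}")

# Demo with stubs standing in for real provider clients.
def flaky_groq(prompt: str) -> str:
    raise TimeoutError("rate limited")

stubs = {"groq": flaky_groq, "mistral": lambda p: f"mistral: {p}"}
print(call_with_failover(stubs, "hello"))  # falls through from groq to mistral
```

Here the `groq` stub fails, so the call falls through to `mistral`; the gateway applies the same idea across live provider APIs.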

---

## 🚀 Quick Start

### Prerequisites

- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
- **RAM**: 4GB minimum (8GB recommended with IDE)
- **Storage**: 20GB free space
- **Docker** & Docker Compose

### Installation

```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub

# 2. Configure environment
cp .env.example .env
nano .env  # Add your API keys

# 3. Deploy
make setup
make start
```

### Or use scripts directly:

```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```

---

## 🌐 Access Points

Once running, access the services at:

| Service | URL | Description |
|---------|-----|-------------|
| 📝 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| 🔌 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| ⚡ **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory dashboard |

---

## 🧠 Reasoning Modes

Choose the right reasoning strategy for your task:

| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | 🚀 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |

Set your default mode in `.env`:

```bash
DEFAULT_REASONING=auto
```

Or specify per-request:

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Refactor this codebase",
    "reasoning_mode": "reflexion"
  }'
```

---

## 🛠️ Configuration

### 1. API Keys

Edit `.env` and add at least one provider:

```bash
# Required: At least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key

# Recommended: Multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```

### 2. Security

Change the default passwords in `.env`:

```bash
# Code-Server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password

# API access
MASTER_KEY=sk-agent-$(openssl rand -hex 16)
```

### 3. Advanced Settings

```bash
# Enable self-reflection
ENABLE_REFLECTION=true

# Maximum iterations per request
MAX_ITERATIONS=10

# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```

---

## 💻 Usage Examples

### Python SDK

```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

response = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Create a Python script to fetch weather data",
        "reasoning_mode": "plan_execute",
        "session_id": "my-session-001"
    }
)

result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```

### cURL

```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain quantum computing",
    "reasoning_mode": "react"
  }'

# Complex task with history
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Build a REST API with FastAPI",
    "reasoning_mode": "plan_execute",
    "max_iterations": 15
  }'
```

### OpenAI-Compatible API

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-agent-xxx"
)

response = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"reasoning_mode": "auto"}
)
```

---

## 📊 Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                       Agentic LLM Hub                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │    Web UI    │   │ VS Code IDE  │   │     API      │     │
│  │   (:3000)    │   │   (:8443)    │   │   (:8080)    │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
│         │                  │                  │             │
│         └──────────────────┼──────────────────┘             │
│                            │                                │
│               ┌────────────┴────────────┐                   │
│               │       Agent Core        │                   │
│               │   (Reasoning Engines)   │                   │
│               └────────────┬────────────┘                   │
│                            │                                │
│        ┌───────────────────┼───────────────────┐            │
│        │                   │                   │            │
│   ┌────┴────┐        ┌─────┴─────┐      ┌──────┴──────┐     │
│   │  Redis  │        │  LiteLLM  │      │  ChromaDB   │     │
│   │ (Cache) │        │ (Gateway) │      │  (Memory)   │     │
│   │ (:6379) │        │  (:4000)  │      │   (:8000)   │     │
│   └─────────┘        └─────┬─────┘      └─────────────┘     │
│                            │                                │
│              ┌─────────────┼─────────────┐                  │
│              │             │             │                  │
│         ┌────┴───┐    ┌────┴───┐    ┌────┴───┐              │
│         │  Groq  │    │ Claude │    │ Mistral│              │
│         └───┬────┘    └───┬────┘    └───┬────┘              │
│             │             │             │                   │
│         ┌───┴──┐     ┌────┴───┐    ┌────┴───┐               │
│         │ Kimi │     │DeepSeek│    │  ...   │               │
│         └──────┘     └────────┘    └────────┘               │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

---

## 🧰 Management Commands

Use the Makefile for common operations:

```bash
make setup      # Initial setup
make start      # Start all services (full profile)
make start-ide  # Start with IDE only
make stop       # Stop all services
make logs       # View logs
make status     # Check service status
make update     # Pull latest and update images
make backup     # Backup data directories
make clean      # Remove containers (data preserved)
```

Or use Docker Compose profiles:

```bash
# Core services only
docker-compose up -d

# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d

# With MCP tools
docker-compose --profile mcp up -d
```

---

## 📚 Documentation

- **[Setup Guide](docs/SETUP.md)** — Detailed installation and configuration
- **[API Reference](docs/API.md)** — Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** — Provider setup and rate limits

---

## 🔄 Updates

Update to the latest version:

```bash
# Automatic update
make update

# Or manually
git pull origin main
docker-compose pull
docker-compose up -d
```

---

## 🐛 Troubleshooting

### Common Issues

| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit `docker-compose.yml` port mappings |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or reduce to core services only |

### Health Checks

```bash
# Check API health
curl http://localhost:8080/health

# Check LiteLLM
curl http://localhost:4000/health

# View all logs
make logs

# Check container status
docker-compose ps
```

---

## 🛡️ Security Considerations

1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (reverse proxy recommended)
3. **Restrict network access** to admin ports (8080, 8443)
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
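
For point 4, a minimal key-rotation sketch (assuming `openssl` is on the PATH; writing the new key into `.env` and restarting the stack are left as manual steps):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Generate a fresh master key: "sk-agent-" prefix plus 32 hex chars.
NEW_KEY="sk-agent-$(openssl rand -hex 16)"
echo "${NEW_KEY}"

# Then set MASTER_KEY=${NEW_KEY} in .env and restart: make stop && make start
```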

---

## 🤝 Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Submit a pull request

---

## 📄 License

This project is licensed under the MIT License.

---

<div align="center">

**[⬆ Back to Top](#-agentic-llm-hub)**

Built with ❤️ for the self-hosting community

</div>