docs: rewrite README with comprehensive documentation and examples

ImpulsiveFPS 2026-02-01 15:34:21 +01:00
parent 2cafb31cb4
commit 23f2e9541e
1 changed file with 375 additions and 28 deletions

README.md
<div align="center">
# 🤖 Agentic LLM Hub
**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and an integrated development environment.**
[![Docker](https://img.shields.io/badge/docker-ready-blue?logo=docker)](https://www.docker.com/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Platform](https://img.shields.io/badge/platform-linux-lightgrey?logo=linux)](https://www.linux.org/)
[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)
</div>
---
## 📋 Overview
Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.
### Why Agentic LLM Hub?
- **🔌 Multi-Provider Aggregation** — Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** — Choose from multiple reasoning modes based on task complexity
- **💰 Cost Optimization** — Free tier providers prioritized, paid providers as fallback
- **🖥️ Complete Workspace** — Browser-based VS Code with AI assistant integration
- **🔧 Extensible** — MCP tool ecosystem for custom integrations
- **📊 Persistent Memory** — Vector-based conversation history and knowledge storage
---
## ✨ Features
| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |
### Supported Providers
| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 RPM | Speed, quick coding |
| **Mistral** | 1B tokens/month | High volume processing |
| **OpenRouter** | 50 req/day | Universal fallback access |
| **Anthropic Claude** | $5 trial | Complex reasoning |
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |
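All of these sit behind the LiteLLM gateway on port 4000, which speaks the OpenAI wire format, so a single provider can be exercised directly; the model alias below is a placeholder for whatever your LiteLLM config defines (the configured aliases are usually listable via `GET http://localhost:4000/v1/models`):
```python
from openai import OpenAI

# Call one provider through the LiteLLM gateway (OpenAI-compatible).
# "groq/llama-3.1-8b-instant" is a hypothetical alias; substitute one
# that your LiteLLM config actually defines.
client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-agent-xxx")
response = client.chat.completions.create(
    model="groq/llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "One-line summary of ReAct?"}],
)
print(response.choices[0].message.content)
```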
---
## 🚀 Quick Start
### Prerequisites
- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
- **RAM**: 4GB minimum (8GB recommended with IDE)
- **Storage**: 20GB free space
- **Docker** & Docker Compose
### Installation
```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub
# 2. Configure environment
cp .env.example .env
nano .env  # Add your API keys
# 3. Deploy
make setup
make start
```
### Or use the scripts directly:
```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```
---
## 🌐 Access Points
Once running, access the services at:
| Service | URL | Description |
|---------|-----|-------------|
| 📝 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| 🔌 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory dashboard |
---
## 🧠 Reasoning Modes
Choose the right reasoning strategy for your task:
| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | 🚀 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |
Set your default mode in `.env`:
```bash
DEFAULT_REASONING=auto
```
Or specify per-request:
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Refactor this codebase",
"reasoning_mode": "reflexion"
}'
```
---
## 🛠️ Configuration
### 1. API Keys
Edit `.env` and add at least one provider:
```bash
# Required: At least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key
# Recommended: Multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```
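To sanity-check that the keys actually made it into `.env` before starting the stack, a small sketch (assumes `python-dotenv` is installed; variable names as above):
```python
import os
from dotenv import load_dotenv

# Load .env and report which of the documented provider keys are set.
load_dotenv(".env")
keys = ["GROQ_API_KEY_1", "MISTRAL_API_KEY", "ANTHROPIC_API_KEY",
        "MOONSHOT_API_KEY", "OPENROUTER_API_KEY"]
configured = [k for k in keys if os.getenv(k)]
print("Configured providers:", ", ".join(configured) or "none (add at least one)")
```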
### 2. Security
Change default passwords in `.env`:
```bash
# Code-Server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password
# API access
MASTER_KEY=sk-agent-$(openssl rand -hex 16)
```
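If `openssl` isn't on hand, Python's standard library generates an equivalent key (the `sk-agent-` prefix only mirrors the convention used above):
```python
import secrets

# Same entropy as `openssl rand -hex 16`: 16 random bytes, hex-encoded.
print(f"sk-agent-{secrets.token_hex(16)}")
```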
### 3. Advanced Settings
```bash
# Enable self-reflection
ENABLE_REFLECTION=true
# Maximum iterations per request
MAX_ITERATIONS=10
# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```
---
## 💻 Usage Examples
### Python SDK
```python
import requests
API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"
response = requests.post(
f"{API_URL}/chat/completions",
headers={"Authorization": f"Bearer {API_KEY}"},
json={
"message": "Create a Python script to fetch weather data",
"reasoning_mode": "plan_execute",
"session_id": "my-session-001"
}
)
result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```
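Because the request carries a `session_id`, a later call with the same id should continue the same conversation out of vector memory; a sketch of a follow-up turn (same field names as above):
```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

# Reusing session_id "my-session-001" lets the agent draw on the
# history stored for the first request.
followup = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Now add retry logic to that weather script",
        "session_id": "my-session-001",
    },
    timeout=120,
)
followup.raise_for_status()
print(followup.json()["response"])
```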
### cURL
```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Explain quantum computing",
"reasoning_mode": "react"
}'
# Complex task with history
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Build a REST API with FastAPI",
"reasoning_mode": "plan_execute",
"max_iterations": 15
}'
```
### OpenAI-Compatible API
```python
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="sk-agent-xxx"
)
response = client.chat.completions.create(
model="agent/orchestrator",
messages=[{"role": "user", "content": "Hello!"}],
extra_body={"reasoning_mode": "auto"}
)
```
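Whether the agent endpoint forwards OpenAI-style streaming is not documented here, but if it does, token-by-token output would look like this:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-agent-xxx")
# stream=True assumes the gateway relays OpenAI-style server-sent events.
stream = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Explain the ReAct loop briefly."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```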
---
## 📊 Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Agentic LLM Hub │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Web UI │ │ VS Code IDE │ │ API │ │
│ │ (:3000) │ │ (:8443) │ │ (:8080) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ └─────────────────┼─────────────────┘ │
│ │ │
│ ┌────────────┴────────────┐ │
│ │ Agent Core │ │
│ │ (Reasoning Engines) │ │
│ └────────────┬────────────┘ │
│ │ │
│ ┌───────────────────┼───────────────────┐ │
│ │ │ │ │
│ ┌────┴────┐ ┌─────┴─────┐ ┌──────┴──────┐ │
│ │ Redis │ │ LiteLLM │ │ ChromaDB │ │
│ │ (Cache) │ │ (Gateway) │ │ (Memory) │ │
│ │(:6379) │ │ (:4000) │ │ (:8000) │ │
│ └─────────┘ └─────┬─────┘ └─────────────┘ │
│ │ │
│ ┌─────────────┼─────────────┐ │
│ │ │ │ │
│ ┌─────┴──┐ ┌─────┴──┐ ┌─────┴──┐ │
│ │ Groq │ │ Claude │ │ Mistral│ │
│ └───┬────┘ └───┬────┘ └───┬────┘ │
│ │ │ │ │
│ ┌───┴──┐ ┌────┴───┐ ┌────┴───┐ │
│ │ Kimi │ │DeepSeek│ │ ... │ │
│ └──────┘ └────────┘ └────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
---
## 🧰 Management Commands
Use the Makefile for common operations:
```bash
make setup # Initial setup
make start # Start all services (full profile)
make start-ide # Start with IDE only
make stop # Stop all services
make logs # View logs
make status # Check service status
make update # Pull latest and update images
make backup # Backup data directories
make clean # Remove containers (data preserved)
```
Or use Docker Compose profiles:
```bash
# Core services only
docker-compose up -d
# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d
# With MCP tools
docker-compose --profile mcp up -d
```
---
## 📚 Documentation
- **[Setup Guide](docs/SETUP.md)** — Detailed installation and configuration
- **[API Reference](docs/API.md)** — Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** — Provider setup and rate limits
---
## 🔄 Updates
Update to the latest version:
```bash
# Automatic update
make update
# Or manually
git pull origin main
docker-compose pull
docker-compose up -d
```
---
## 🐛 Troubleshooting
### Common Issues
| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit `docker-compose.yml` port mappings |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or reduce to core services only |
### Health Checks
```bash
# Check API health
curl http://localhost:8080/health
# Check LiteLLM
curl http://localhost:4000/health
# View all logs
make logs
# Check container status
docker-compose ps
```
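To poll several services in one go, a small script over the documented endpoints (only the two `/health` routes above are confirmed; extend the dict for the rest of your stack):
```python
import requests

# Health endpoints taken from the curl commands above.
CHECKS = {
    "agent-core": "http://localhost:8080/health",
    "litellm": "http://localhost:4000/health",
}
for name, url in CHECKS.items():
    try:
        code = requests.get(url, timeout=5).status_code
        print(f"{name:12s} {'OK' if code == 200 else f'HTTP {code}'}")
    except requests.RequestException as exc:
        print(f"{name:12s} DOWN ({type(exc).__name__})")
```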
---
## 🛡️ Security Considerations
1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (reverse proxy recommended)
3. **Restrict network access** to admin ports (8080, 8443)
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
---
## 🤝 Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Submit a pull request
---
## 📄 License
This project is licensed under the MIT License.
---
<div align="center">
**[⬆ Back to Top](#-agentic-llm-hub)**
Built with ❤️ for the self-hosting community
</div>