From 23f2e9541e61f84d5b9db2fbd6ed9bc8c5f929e0 Mon Sep 17 00:00:00 2001 From: ImpulsiveFPS Date: Sun, 1 Feb 2026 15:34:21 +0100 Subject: [PATCH] docs: rewrite README with comprehensive documentation and examples --- README.md | 403 ++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 375 insertions(+), 28 deletions(-) diff --git a/README.md b/README.md index 763056b..f413b8b 100644 --- a/README.md +++ b/README.md @@ -1,60 +1,407 @@ +
+
 # πŸ€– Agentic LLM Hub
-Self-hosted AI agent platform with multi-provider LLM aggregation, reasoning engines (ReAct, Plan-and-Execute, Reflexion), MCP tools, and web IDE.
+**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and an integrated development environment.**
+
+[![Docker](https://img.shields.io/badge/docker-ready-blue?logo=docker)](https://www.docker.com/)
+[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
+[![Platform](https://img.shields.io/badge/platform-linux-lightgrey?logo=linux)](https://www.linux.org/)
+
+[Quick Start](#-quick-start) β€’ [Features](#-features) β€’ [Documentation](docs/) β€’ [API Reference](docs/API.md)
+
+ +--- + +## πŸ“‹ Overview + +Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance. + +### Why Agentic LLM Hub? + +- **πŸ”Œ Multi-Provider Aggregation** β€” Seamlessly route between 7+ LLM providers with automatic failover +- **🧠 Advanced Reasoning** β€” Choose from multiple reasoning modes based on task complexity +- **πŸ’° Cost Optimization** β€” Free tier providers prioritized, paid providers as fallback +- **πŸ–₯️ Complete Workspace** β€” Browser-based VS Code with AI assistant integration +- **πŸ”§ Extensible** β€” MCP tool ecosystem for custom integrations +- **πŸ“Š Persistent Memory** β€” Vector-based conversation history and knowledge storage + +--- + +## ✨ Features + +| Component | Technology | Purpose | +|-----------|------------|---------| +| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing | +| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes | +| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history | +| **Cache Layer** | Redis | Response caching & session management | +| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev | +| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI | +| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem | + +### Supported Providers + +| Provider | Free Tier | Best For | +|----------|-----------|----------| +| **Groq** | 20 RPM | Speed, quick coding | +| **Mistral** | 1B tokens/month | High volume processing | +| **OpenRouter** | 50 req/day | Universal fallback access | +| **Anthropic Claude** | $5 
trial | Complex reasoning |
+| **Moonshot Kimi** | $5 signup | Coding, 128K context |
+| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
+| **Cohere** | 1K calls/month | Embeddings |
+
+---
 
 ## πŸš€ Quick Start
 
+### Prerequisites
+
+- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
+- **RAM**: 4GB minimum (8GB recommended with IDE)
+- **Storage**: 20GB free space
+- **Docker** & Docker Compose
+
+### Installation
+
 ```bash
-# 1. Clone from your Gitea
-git clone https://gitea.yourdomain.com/youruser/llm-hub.git
+# 1. Clone the repository
+git clone https://github.com/yourusername/llm-hub.git
 cd llm-hub
 
-# 2. Configure
+# 2. Configure environment
 cp .env.example .env
 nano .env  # Add your API keys
 
 # 3. Deploy
-./setup.sh && ./start.sh
+make setup
+make start
 ```
 
-## πŸ“‘ Access Points
+### Alternative: run the scripts directly
+
+```bash
+chmod +x setup.sh start.sh
+./setup.sh && ./start.sh full
+```
+
+---
+
+## 🌐 Access Points
+
+Once running, access the services at:
 
 | Service | URL | Description |
 |---------|-----|-------------|
-| VS Code IDE | `http://your-ip:8443` | Full IDE with Continue.dev |
-| Agent API | `http://your-ip:8080/v1` | Main API endpoint |
-| LiteLLM | `http://your-ip:4000` | LLM Gateway |
-| MCP Tools | `http://your-ip:8001/docs` | Tool OpenAPI docs |
-| ChromaDB | `http://your-ip:8000` | Vector memory |
-| Web UI | `http://your-ip:3000` | Chat interface |
+| πŸ“ **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
+| πŸ’» **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
+| πŸ”Œ **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
+| ⚑ **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
+| πŸ”§ **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
+| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory dashboard |
 
-## πŸ”§ Supported Providers
-
-- **Groq** (Free tier, fast)
-- **Mistral** (1B tokens/month free)
-- **Anthropic Claude** (Trial credits)
-- **Moonshot Kimi** ($5 signup bonus)
-- **OpenRouter** (Free tier access)
-- **Cohere** (1K calls/month)
-- **DeepSeek** (Cheap reasoning)
+---
 
 ## 🧠 Reasoning Modes
 
-- `react` - Fast iterative reasoning
-- `plan_execute` - Complex multi-step tasks
-- `reflexion` - Self-correcting with verification
-- `auto` - Automatic selection
+Choose the right reasoning strategy for your task:
+
+| Mode | Description | Speed | Accuracy | Best For |
+|------|-------------|-------|----------|----------|
+| `react` | Iterative thought-action loops | ⚑ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
+| `plan_execute` | Plan first, then execute | πŸš€ Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
+| `reflexion` | Self-correcting with verification | 🐒 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
+| `auto` | Automatic mode selection | ⚑ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |
+
+Set your default mode in `.env`:
+
+```bash
+DEFAULT_REASONING=auto
+```
+
+Or specify per-request:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Authorization: Bearer sk-agent-xxx" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "message": "Refactor this codebase",
+    "reasoning_mode": "reflexion"
+  }'
+```
+
+---
+
+## πŸ› οΈ Configuration
+
+### 1. API Keys
+
+Edit `.env` and add at least one provider:
+
+```bash
+# Required: At least one LLM provider
+GROQ_API_KEY_1=gsk_xxx
+MISTRAL_API_KEY=your_key
+
+# Recommended: Multiple providers for redundancy
+ANTHROPIC_API_KEY=sk-ant-xxx
+MOONSHOT_API_KEY=sk-xxx
+OPENROUTER_API_KEY=sk-or-xxx
+```
+
+### 2. Security
+
+Change the default passwords in `.env`:
+
+```bash
+# Code-Server access
+IDE_PASSWORD=your-secure-password
+IDE_SUDO_PASSWORD=your-admin-password
+
+# API access
+MASTER_KEY=sk-agent-$(openssl rand -hex 16)
+```
+
+### 3. Advanced Settings
+
+```bash
+# Enable self-reflection
+ENABLE_REFLECTION=true
+
+# Maximum iterations per request
+MAX_ITERATIONS=10
+
+# Knowledge graph (requires more RAM)
+ENABLE_KNOWLEDGE_GRAPH=false
+```
+
+---
+
+## πŸ’» Usage Examples
+
+### Python SDK
+
+```python
+import requests
+
+API_URL = "http://localhost:8080/v1"
+API_KEY = "sk-agent-xxx"
+
+response = requests.post(
+    f"{API_URL}/chat/completions",
+    headers={"Authorization": f"Bearer {API_KEY}"},
+    json={
+        "message": "Create a Python script to fetch weather data",
+        "reasoning_mode": "plan_execute",
+        "session_id": "my-session-001"
+    }
+)
+
+result = response.json()
+print(result["response"])
+print(f"Steps taken: {len(result['steps'])}")
+```
+
+### cURL
+
+```bash
+# Simple query
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Authorization: Bearer sk-agent-xxx" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "message": "Explain quantum computing",
+    "reasoning_mode": "react"
+  }'
+
+# Complex task with a larger iteration budget
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Authorization: Bearer sk-agent-xxx" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "message": "Build a REST API with FastAPI",
+    "reasoning_mode": "plan_execute",
+    "max_iterations": 15
+  }'
+```
+
+### OpenAI-Compatible API
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="http://localhost:8080/v1",
+    api_key="sk-agent-xxx"
+)
+
+response = client.chat.completions.create(
+    model="agent/orchestrator",
+    messages=[{"role": "user", "content": "Hello!"}],
+    extra_body={"reasoning_mode": "auto"}
+)
+```
+
+---
+
+## πŸ“Š Architecture
+
+```
+β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+β”‚                       Agentic LLM Hub                        β”‚
+β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
+β”‚                                                              β”‚
+β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
+β”‚   β”‚    Web UI    β”‚    β”‚ VS Code IDE  β”‚    β”‚     API      β”‚   β”‚
+β”‚   β”‚   (:3000)    β”‚    β”‚   (:8443)    β”‚    β”‚   (:8080)    β”‚   β”‚
+β”‚   β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
+β”‚          β”‚                   β”‚                   β”‚           β”‚
+β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜           β”‚
+β”‚                              β”‚                               β”‚
+β”‚                 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                  β”‚
+β”‚                 β”‚       Agent Core        β”‚                  β”‚
+β”‚                 β”‚   (Reasoning Engines)   β”‚                  β”‚
+β”‚                 β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                  β”‚
+β”‚                              β”‚                               β”‚
+β”‚          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚
+β”‚          β”‚                   β”‚                   β”‚           β”‚
+β”‚     β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”    β”‚
+β”‚     β”‚  Redis  β”‚        β”‚  LiteLLM  β”‚      β”‚  ChromaDB   β”‚    β”‚
+β”‚     β”‚ (Cache) β”‚        β”‚ (Gateway) β”‚      β”‚  (Memory)   β”‚    β”‚
+β”‚     β”‚ (:6379) β”‚        β”‚  (:4000)  β”‚      β”‚   (:8000)   β”‚    β”‚
+β”‚     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
+β”‚                              β”‚                               β”‚
+β”‚                β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                β”‚
+β”‚                β”‚             β”‚             β”‚                β”‚
+β”‚            β”Œβ”€β”€β”€β”΄β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”            β”‚
+β”‚            β”‚  Groq  β”‚   β”‚ Claude  β”‚   β”‚ Mistral β”‚            β”‚
+β”‚            β””β”€β”€β”€β”¬β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜            β”‚
+β”‚                β”‚             β”‚             β”‚                β”‚
+β”‚            β”Œβ”€β”€β”€β”΄β”€β”€β”     β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”             β”‚
+β”‚            β”‚ Kimi β”‚     β”‚DeepSeekβ”‚    β”‚  ...   β”‚             β”‚
+β”‚            β””β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚
+β”‚                                                              β”‚
+β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+```
+
+---
+
+## 🧰 Management Commands
+
+Use the Makefile for common operations:
+
+```bash
+make setup       # Initial setup
+make start       # Start all services (full profile)
+make start-ide   # Start with IDE only
+make stop        # Stop all services
+make logs        # View logs
+make status      # Check service status
+make update      # Pull latest changes and update images
+make backup      # Back up data directories
+make clean       # Remove containers (data preserved)
+```
+
+Or use Docker Compose profiles:
+
+```bash
+# Core services only
+docker-compose up -d
+
+# Full stack with IDE and UI
+docker-compose --profile ide --profile ui up -d
+
+# With MCP tools
+docker-compose --profile mcp up -d
+```
+
+---
 
 ## πŸ“š Documentation
 
-- [Setup Guide](docs/SETUP.md)
-- [API Reference](docs/API.md)
-- [Provider Guide](docs/PROVIDERS.md)
+- **[Setup Guide](docs/SETUP.md)** β€” Detailed installation and configuration
+- **[API Reference](docs/API.md)** β€” Complete API documentation with examples
+- **[Provider Guide](docs/PROVIDERS.md)** β€” Provider setup and rate limits
+
+---
 
 ## πŸ”„ Updates
 
+Update to the latest version:
+
 ```bash
+# Automatic update
+make update
+
+# Or manually:
 git pull origin main
 docker-compose pull
 docker-compose up -d
 ```
+
+---
+
+## πŸ› Troubleshooting
+
+### Common Issues
+
+| Issue | Solution |
+|-------|----------|
+| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
+| Port conflicts | Edit the port mappings in `docker-compose.yml` |
+| Permission denied | Run `chown -R 1000:1000 workspace/` |
+| API not responding | Check `docker-compose logs agent-core` |
+| Out of memory | Increase swap or run only the core services |
+
+### Health Checks
+
+```bash
+# Check API health
+curl http://localhost:8080/health
+
+# Check LiteLLM
+curl http://localhost:4000/health
+
+# View all logs
+make logs
+
+# Check container status
+docker-compose ps
+```
+
+---
+
+## πŸ›‘οΈ Security Considerations
+
+1. **Change default passwords** in `.env` before deploying
+2. **Use HTTPS** in production (reverse proxy recommended)
+3. **Restrict network access** to the admin ports (8080, 8443)
+4. **Rotate API keys** regularly
+5. **Review provider rate limits** to prevent unexpected costs
+
+---
+
+## 🀝 Contributing
+
+Contributions are welcome! Please:
+
+1. Fork the repository
+2. Create a feature branch
+3. Submit a pull request
+
+---
+
+## πŸ“„ License
+
+This project is licensed under the MIT License.
+
+---
+
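+## πŸ”’ Appendix: HTTPS Reverse Proxy Example
+
+The security considerations above recommend terminating TLS with a reverse proxy in production. As a minimal sketch (assuming [Caddy](https://caddyserver.com/) and a hypothetical `agent.example.com` domain; substitute your own hostname and proxy of choice):
+
+```
+agent.example.com {
+    # Caddy obtains and renews TLS certificates automatically
+    reverse_proxy 127.0.0.1:8080
+}
+```
+
+With this in front, only ports 80/443 need to be exposed publicly; keep the admin ports (8080, 8443) bound to localhost or your LAN.
+
+---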
+ +**[⬆ Back to Top](#-agentic-llm-hub)** + +Built with ❀️ for the self-hosting community + +