# πŸ€– Agentic LLM Hub

**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and integrated development environment.**

[![Docker](https://img.shields.io/badge/docker-ready-blue?logo=docker)](https://www.docker.com/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Platform](https://img.shields.io/badge/platform-linux-lightgrey?logo=linux)](https://www.linux.org/)

[Quick Start](#-quick-start) β€’ [Features](#-features) β€’ [Documentation](docs/) β€’ [API Reference](docs/API.md)
---

## πŸ“‹ Overview

Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.

### Why Agentic LLM Hub?

- **πŸ”Œ Multi-Provider Aggregation** β€” Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** β€” Choose from multiple reasoning modes based on task complexity
- **πŸ’° Cost Optimization** β€” Free-tier providers are prioritized, with paid providers as fallback
- **πŸ–₯️ Complete Workspace** β€” Browser-based VS Code with AI assistant integration
- **πŸ”§ Extensible** β€” MCP tool ecosystem for custom integrations
- **πŸ“Š Persistent Memory** β€” Vector-based conversation history and knowledge storage

---

## ✨ Features

| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in the browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |

### Supported Providers

| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 requests/min | Speed, quick coding |
| **Mistral** | 1B tokens/month | High-volume processing |
| **OpenRouter** | 50 requests/day | Universal fallback access |
| **Anthropic Claude** | $5 trial credit | Complex reasoning |
| **Moonshot Kimi** | $5 signup credit | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |

---

## πŸš€ Quick Start

### Prerequisites

- **OS**: Debian 12, Ubuntu 22.04+, or a Proxmox LXC
- **RAM**: 4 GB minimum (8 GB recommended with the IDE)
- **Storage**: 20 GB free space
- **Docker** & Docker Compose

### Installation

```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub

# 2. Configure environment
cp .env.example .env
nano .env  # Add your API keys

# 3. Deploy
make setup
make start
```

### Or use the scripts directly

```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```
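Once the stack is up, a quick sanity check is to probe the two health endpoints (the same ones used in [Troubleshooting](#-troubleshooting) below):

```bash
# Confirm all containers are running
docker-compose ps

# Probe the agent API and the LiteLLM gateway
curl -fsS http://localhost:8080/health && echo "agent-core: OK"
curl -fsS http://localhost:4000/health && echo "litellm: OK"
```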
---

## 🌐 Access Points

Once running, access the services at:

| Service | URL | Description |
|---------|-----|-------------|
| πŸ“ **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| πŸ’» **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| πŸ”Œ **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| ⚑ **LiteLLM** | `http://localhost:4000` | LLM gateway & model management |
| πŸ”§ **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory dashboard |

---

## 🧠 Reasoning Modes

Choose the right reasoning strategy for your task:

| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚑ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | πŸš€ Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐒 Slow | ⭐⭐⭐⭐⭐ Very high | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚑ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |

Set your default mode in `.env`:

```bash
DEFAULT_REASONING=auto
```

Or specify it per request:

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Refactor this codebase",
    "reasoning_mode": "reflexion"
  }'
```

---

## πŸ› οΈ Configuration

### 1. API Keys

Edit `.env` and add at least one provider:

```bash
# Required: at least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key

# Recommended: multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```

### 2. Security

Change the default passwords in `.env`:

```bash
# code-server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password

# API access
MASTER_KEY=sk-agent-$(openssl rand -hex 16)
```
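To avoid leaving placeholder values in place, one possible approach is to patch freshly generated secrets straight into the file — a sketch that assumes the `IDE_PASSWORD` and `MASTER_KEY` lines shown above already exist in your `.env`:

```bash
# Replace the placeholder values in .env with random secrets
# (assumes the variable names shown above; adjust if yours differ)
sed -i "s|^IDE_PASSWORD=.*|IDE_PASSWORD=$(openssl rand -base64 18)|" .env
sed -i "s|^MASTER_KEY=.*|MASTER_KEY=sk-agent-$(openssl rand -hex 16)|" .env

# Confirm both keys were set without printing the secrets themselves
grep -cE '^(IDE_PASSWORD|MASTER_KEY)=' .env   # expect: 2
```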
### 3. Advanced Settings

```bash
# Enable self-reflection
ENABLE_REFLECTION=true

# Maximum iterations per request
MAX_ITERATIONS=10

# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```

---

## πŸ’» Usage Examples

### Python SDK

```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

response = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Create a Python script to fetch weather data",
        "reasoning_mode": "plan_execute",
        "session_id": "my-session-001"
    }
)

result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```

### cURL

```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain quantum computing",
    "reasoning_mode": "react"
  }'

# Complex multi-step task with a higher iteration budget
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Build a REST API with FastAPI",
    "reasoning_mode": "plan_execute",
    "max_iterations": 15
  }'
```

### OpenAI-Compatible API

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-agent-xxx"
)

response = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"reasoning_mode": "auto"}
)

print(response.choices[0].message.content)
```

---

## πŸ“Š Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                     Agentic LLM Hub                      β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€€
β”‚                                                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
β”‚  β”‚    Web UI    β”‚  β”‚ VS Code IDE  β”‚  β”‚     API      β”‚    β”‚
β”‚  β”‚   (:3000)    β”‚  β”‚   (:8443)    β”‚  β”‚   (:8080)    β”‚    β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜            β”‚
β”‚                           β”‚                              β”‚
β”‚              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                β”‚
β”‚              β”‚        Agent Core        β”‚                β”‚
β”‚              β”‚   (Reasoning Engines)    β”‚                β”‚
β”‚              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                β”‚
β”‚                           β”‚                              β”‚
β”‚        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚
β”‚        β”‚                  β”‚                  β”‚           β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”     β”‚
β”‚  β”‚   Redis   β”‚      β”‚  LiteLLM  β”‚      β”‚ ChromaDB  β”‚     β”‚
β”‚  β”‚  (Cache)  β”‚      β”‚ (Gateway) β”‚      β”‚ (Memory)  β”‚     β”‚
β”‚  β”‚  (:6379)  β”‚      β”‚  (:4000)  β”‚      β”‚  (:8000)  β”‚     β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β”‚
β”‚                           β”‚                              β”‚
β”‚        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚
β”‚        β”‚                  β”‚                  β”‚           β”‚
β”‚   β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”      β”‚
β”‚   β”‚  Groq   β”‚        β”‚ Claude  β”‚        β”‚ Mistral β”‚      β”‚
β”‚   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜      β”‚
β”‚        β”‚                  β”‚                  β”‚           β”‚
β”‚   β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”      β”‚
β”‚   β”‚  Kimi   β”‚        β”‚DeepSeek β”‚        β”‚   ...   β”‚      β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚
β”‚                                                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
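To exercise the memory path in the diagram end to end (API β†’ Agent Core β†’ Redis/ChromaDB), here is a short sketch that reuses the `session_id` field from the Python SDK example above; the exact recall behavior depends on your memory configuration:

```bash
# First turn: give the agent something to remember under a session id
curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{"message": "Remember: my project is called llm-hub", "session_id": "demo-42"}'

# Second turn: the same session id lets the agent recall the earlier turn
curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is my project called?", "session_id": "demo-42"}'
```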
---

## 🧰 Management Commands

Use the Makefile for common operations:

```bash
make setup      # Initial setup
make start      # Start all services (full profile)
make start-ide  # Start with IDE only
make stop       # Stop all services
make logs       # View logs
make status     # Check service status
make update     # Pull latest and update images
make backup     # Back up data directories
make clean      # Remove containers (data preserved)
```

Or use Docker Compose profiles:

```bash
# Core services only
docker-compose up -d

# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d

# With MCP tools
docker-compose --profile mcp up -d
```

---

## πŸ“š Documentation

- **[Setup Guide](docs/SETUP.md)** β€” Detailed installation and configuration
- **[API Reference](docs/API.md)** β€” Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** β€” Provider setup and rate limits

---

## πŸ”„ Updates

Update to the latest version:

```bash
# Automatic update
make update

# Or manually
git pull origin main
docker-compose pull
docker-compose up -d
```

---

## πŸ› Troubleshooting

### Common Issues

| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit the port mappings in `docker-compose.yml` |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or run only the core services |

### Health Checks

```bash
# Check API health
curl http://localhost:8080/health

# Check LiteLLM
curl http://localhost:4000/health

# View all logs
make logs

# Check container status
docker-compose ps
```

---

## πŸ›‘οΈ Security Considerations

1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (a reverse proxy is recommended)
3. **Restrict network access** to the admin ports (8080, 8443) β€” see the sketch below
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
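As one way to implement point 3, a minimal host-firewall sketch using `ufw` (the `192.168.1.0/24` subnet is a placeholder; substitute your trusted network):

```bash
# Expose the chat UI broadly, but restrict the API and IDE ports
# to a trusted subnet (placeholder range; adjust to your network)
sudo ufw allow 3000/tcp
sudo ufw allow from 192.168.1.0/24 to any port 8080 proto tcp
sudo ufw allow from 192.168.1.0/24 to any port 8443 proto tcp
sudo ufw enable
```

Note that ports published by Docker are managed through iptables and can bypass UFW rules; binding sensitive services to `127.0.0.1` in `docker-compose.yml` or adding rules to the `DOCKER-USER` chain closes that gap.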
---

## 🀝 Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Submit a pull request

---

## πŸ“„ License

This project is licensed under the MIT License.

---

**[⬆ Back to Top](#-agentic-llm-hub)**

Built with ❀️ for the self-hosting community