# 🤖 Agentic LLM Hub
**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and integrated development environment.**
[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)
---
## 📋 Overview
Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.
### Why Agentic LLM Hub?
- **🔀 Multi-Provider Aggregation** – Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** – Choose from multiple reasoning modes based on task complexity
- **💰 Cost Optimization** – Free-tier providers prioritized, paid providers as fallback
- **🖥️ Complete Workspace** – Browser-based VS Code with AI assistant integration
- **🔧 Extensible** – MCP tool ecosystem for custom integrations
- **📚 Persistent Memory** – Vector-based conversation history and knowledge storage
---
## ✨ Features
| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |
### Supported Providers
| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 RPM | Speed, quick coding |
| **Mistral** | 1B tokens/month | High volume processing |
| **OpenRouter** | 50 req/day | Universal fallback access |
| **Anthropic Claude** | $5 trial | Complex reasoning |
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |
---
## 🚀 Quick Start
### Prerequisites
- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
- **RAM**: 4GB minimum (8GB recommended with IDE)
- **Storage**: 20GB free space
- **Docker** & Docker Compose
### Installation
```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub
# 2. Configure environment
cp .env.example .env
nano .env # Add your API keys
# 3. Deploy
make setup
make start
```
Alternatively, run the scripts directly:
```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```
---
## 🌐 Access Points
Once running, access the services at:
| Service | URL | Description |
|---------|-----|-------------|
| 🌐 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| 🚀 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| ⚡ **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory API |
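
To verify everything is reachable, you can probe the main endpoints from a script. A minimal sketch in Python, assuming the `/health` routes used in the Troubleshooting section below (adjust hosts and ports to your deployment):

```python
import requests

# Endpoints taken from the Access Points table; extend as needed
SERVICES = {
    "Agent API": "http://localhost:8080/health",
    "LiteLLM": "http://localhost:4000/health",
}

for name, url in SERVICES.items():
    try:
        status = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {status}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```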
---
## 🧠 Reasoning Modes
Choose the right reasoning strategy for your task:
| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | 🏃 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |
Set your default mode in `.env`:
```bash
DEFAULT_REASONING=auto
```
Or specify per-request:
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Refactor this codebase",
"reasoning_mode": "reflexion"
}'
```
---
## 🛠️ Configuration
### 1. API Keys
Edit `.env` and add at least one provider:
```bash
# Required: At least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key
# Recommended: Multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```
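
Once keys are in place and the stack is running, you can confirm the gateway registered your providers by listing its models. A quick sketch, assuming LiteLLM's OpenAI-compatible `/v1/models` route and the `MASTER_KEY` from your `.env`:

```python
import requests

# Assumes LiteLLM is on :4000 and sk-agent-xxx matches MASTER_KEY in .env
resp = requests.get(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer sk-agent-xxx"},
    timeout=10,
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```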
### 2. Security
Change default passwords in `.env`:
```bash
# Code-Server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password
# API access: generate a value once with `openssl rand -hex 16`
# and paste it here literally (.env files do not expand $(...))
MASTER_KEY=sk-agent-your-random-hex
```
### 3. Advanced Settings
```bash
# Enable self-reflection
ENABLE_REFLECTION=true
# Maximum iterations per request
MAX_ITERATIONS=10
# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```
---
## 💻 Usage Examples
### Python (requests)
```python
import requests
API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"
response = requests.post(
f"{API_URL}/chat/completions",
headers={"Authorization": f"Bearer {API_KEY}"},
json={
"message": "Create a Python script to fetch weather data",
"reasoning_mode": "plan_execute",
"session_id": "my-session-001"
}
)
result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```
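
Since the hub stores conversation history per session, reusing the same `session_id` lets a follow-up request build on earlier context. A sketch of such a follow-up, assuming session-scoped memory works as described above:

```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

# Same session_id as the first request, so the agent can refer back to
# the weather script it just produced (assumes session-scoped memory)
follow_up = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Now add retry logic to that script",
        "session_id": "my-session-001",
    },
)
print(follow_up.json()["response"])
```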
### cURL
```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Explain quantum computing",
"reasoning_mode": "react"
}'
# Complex task with history
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Build a REST API with FastAPI",
"reasoning_mode": "plan_execute",
"max_iterations": 15
}'
```
### OpenAI-Compatible API
```python
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="sk-agent-xxx"
)
response = client.chat.completions.create(
model="agent/orchestrator",
messages=[{"role": "user", "content": "Hello!"}],
extra_body={"reasoning_mode": "auto"}
)
```
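
If the endpoint also implements OpenAI-style streaming (an assumption; verify against your deployment), the same client can print tokens as they arrive:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-agent-xxx")

# Assumes the endpoint supports OpenAI-style stream=True chunked responses
stream = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Summarize this repository"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # the final chunk carries no content
        print(delta, end="", flush=True)
```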
---
## 📐 Architecture
```
┌───────────────────────────────────────────────────────────────┐
│                        Agentic LLM Hub                        │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌────────────┐   ┌────────────┐   ┌────────────┐             │
│  │   Web UI   │   │ VS Code IDE│   │    API     │             │
│  │  (:3000)   │   │  (:8443)   │   │  (:8080)   │             │
│  └──────┬─────┘   └──────┬─────┘   └──────┬─────┘             │
│         │                │                │                   │
│         └────────────────┼────────────────┘                   │
│                          │                                    │
│             ┌────────────┴────────────┐                       │
│             │       Agent Core        │                       │
│             │  (Reasoning Engines)    │                       │
│             └────────────┬────────────┘                       │
│                          │                                    │
│         ┌────────────────┼────────────────┐                   │
│         │                │                │                   │
│    ┌────┴────┐     ┌─────┴─────┐    ┌─────┴─────┐             │
│    │  Redis  │     │  LiteLLM  │    │ ChromaDB  │             │
│    │ (Cache) │     │ (Gateway) │    │ (Memory)  │             │
│    │ (:6379) │     │  (:4000)  │    │  (:8000)  │             │
│    └─────────┘     └─────┬─────┘    └───────────┘             │
│                          │                                    │
│             ┌────────────┼────────────┐                       │
│         ┌───┴───┐    ┌───┴───┐    ┌───┴───┐                   │
│         │ Groq  │    │Claude │    │Mistral│                   │
│         └───┬───┘    └───┬───┘    └───┬───┘                   │
│         ┌───┴───┐   ┌────┴───┐    ┌───┴───┐                   │
│         │ Kimi  │   │DeepSeek│    │  ...  │                   │
│         └───────┘   └────────┘    └───────┘                   │
│                                                               │
└───────────────────────────────────────────────────────────────┘
```
---
## 🧰 Management Commands
Use the Makefile for common operations:
```bash
make setup # Initial setup
make start # Start all services (full profile)
make start-ide # Start with IDE only
make stop # Stop all services
make logs # View logs
make status # Check service status
make update # Pull latest and update images
make backup # Backup data directories
make clean # Remove containers (data preserved)
```
Or use Docker Compose profiles:
```bash
# Core services only
docker-compose up -d
# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d
# With MCP tools
docker-compose --profile mcp up -d
```
---
## 📚 Documentation
- **[Setup Guide](docs/SETUP.md)** – Detailed installation and configuration
- **[API Reference](docs/API.md)** – Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** – Provider setup and rate limits
---
## 🔄 Updates
Update to the latest version:
```bash
# Automatic update
make update
# Or manual
git pull origin main
docker-compose pull
docker-compose up -d
```
---
## 🔍 Troubleshooting
### Common Issues
| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit `docker-compose.yml` port mappings |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or reduce to core services only |
### Health Checks
```bash
# Check API health
curl http://localhost:8080/health
# Check LiteLLM
curl http://localhost:4000/health
# View all logs
make logs
# Check container status
docker-compose ps
```
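
For scripted deployments it can help to block until the API reports healthy. A small poller, assuming the `/health` route above returns 200 once the service is ready:

```python
import time

import requests

def wait_for_api(url: str = "http://localhost:8080/health",
                 timeout: float = 120.0) -> bool:
    """Poll the health endpoint until it returns 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if requests.get(url, timeout=5).status_code == 200:
                return True
        except requests.RequestException:
            pass  # service not up yet; keep polling
        time.sleep(3)
    return False

if __name__ == "__main__":
    print("API ready" if wait_for_api() else "API did not become ready in time")
```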
---
## 🛡️ Security Considerations
1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (reverse proxy recommended)
3. **Restrict network access** to admin ports (8080, 8443)
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
---
## 🤝 Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Submit a pull request
---
## 📄 License
This project is licensed under the MIT License.
---
**[⬆ Back to Top](#-agentic-llm-hub)**
Built with ❤️ for the self-hosting community