# 🤖 Agentic LLM Hub
**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and integrated development environment.**
[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)
---
## 📋 Overview
Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.
### Why Agentic LLM Hub?
- **🔀 Multi-Provider Aggregation** – Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** – Choose from multiple reasoning modes based on task complexity
- **💰 Cost Optimization** – Free-tier providers prioritized, paid providers as fallback
- **🖥️ Complete Workspace** – Browser-based VS Code with AI assistant integration
- **🔧 Extensible** – MCP tool ecosystem for custom integrations
- **📚 Persistent Memory** – Vector-based conversation history and knowledge storage
---
## ✨ Features
| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |
### Supported Providers
| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 RPM | Speed, quick coding |
| **Mistral** | 1B tokens/month | High volume processing |
| **OpenRouter** | 50 req/day | Universal fallback access |
| **Anthropic Claude** | $5 trial | Complex reasoning |
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |
---
## 🚀 Quick Start
### Prerequisites
- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
- **RAM**: 4GB minimum (8GB recommended with IDE)
- **Storage**: 20GB free space
- **Docker** & Docker Compose
### Installation
```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub
# 2. Configure environment
cp .env.example .env
nano .env # Add your API keys
# 3. Deploy
make setup
make start
```
Alternatively, run the scripts directly:
```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```
---
## 🌐 Access Points
Once running, access the services at:
| Service | URL | Description |
|---------|-----|-------------|
| 🌐 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| 🚀 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| ⚡ **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory API |
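
To verify everything is reachable, you can probe the main endpoints from a script. A minimal sketch in Python, assuming the `/health` routes used in the Troubleshooting section below (adjust hosts and ports to your deployment):

```python
import requests

# Endpoints taken from the Access Points table; extend as needed
SERVICES = {
    "Agent API": "http://localhost:8080/health",
    "LiteLLM": "http://localhost:4000/health",
}

for name, url in SERVICES.items():
    try:
        status = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {status}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```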
---
## 🧠 Reasoning Modes
Choose the right reasoning strategy for your task:
| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | 🏃 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |
Set your default mode in `.env`:
```bash
DEFAULT_REASONING=auto
```
Or specify per-request:
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Refactor this codebase",
"reasoning_mode": "reflexion"
}'
```
---
## 🛠️ Configuration
### 1. API Keys
Edit `.env` and add at least one provider:
```bash
# Required: At least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key
# Recommended: Multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```
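
Once keys are in place and the stack is running, you can confirm the gateway registered your providers by listing its models. A quick sketch, assuming LiteLLM's OpenAI-compatible `/v1/models` route and the `MASTER_KEY` from your `.env`:

```python
import requests

# Assumes LiteLLM is on :4000 and sk-agent-xxx matches MASTER_KEY in .env
resp = requests.get(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer sk-agent-xxx"},
    timeout=10,
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```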
### 2. Security
Change default passwords in `.env`:
```bash
# Code-Server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password
# API access: generate a value once with `openssl rand -hex 16`
# and paste it here literally (.env files do not expand $(...))
MASTER_KEY=sk-agent-your-random-hex
```
### 3. Advanced Settings
```bash
# Enable self-reflection
ENABLE_REFLECTION=true
# Maximum iterations per request
MAX_ITERATIONS=10
# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```
---
## 💻 Usage Examples
### Python (requests)
```python
import requests
API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"
response = requests.post(
f"{API_URL}/chat/completions",
headers={"Authorization": f"Bearer {API_KEY}"},
json={
"message": "Create a Python script to fetch weather data",
"reasoning_mode": "plan_execute",
"session_id": "my-session-001"
}
)
result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```
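
Since the hub stores conversation history per session, reusing the same `session_id` lets a follow-up request build on earlier context. A sketch of such a follow-up, assuming session-scoped memory works as described above:

```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

# Same session_id as the first request, so the agent can refer back to
# the weather script it just produced (assumes session-scoped memory)
follow_up = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Now add retry logic to that script",
        "session_id": "my-session-001",
    },
)
print(follow_up.json()["response"])
```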
### cURL
```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Explain quantum computing",
"reasoning_mode": "react"
}'
# Complex task with history
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer sk-agent-xxx" \
-H "Content-Type: application/json" \
-d '{
"message": "Build a REST API with FastAPI",
"reasoning_mode": "plan_execute",
"max_iterations": 15
}'
```
### OpenAI-Compatible API
```python
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="sk-agent-xxx"
)
response = client.chat.completions.create(
model="agent/orchestrator",
messages=[{"role": "user", "content": "Hello!"}],
extra_body={"reasoning_mode": "auto"}
)
```
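
If the endpoint also implements OpenAI-style streaming (an assumption; verify against your deployment), the same client can print tokens as they arrive:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-agent-xxx")

# Assumes the endpoint supports OpenAI-style stream=True chunked responses
stream = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Summarize this repository"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # the final chunk carries no content
        print(delta, end="", flush=True)
```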
---
## 📐 Architecture
```
┌───────────────────────────────────────────────────────────────┐
│                        Agentic LLM Hub                        │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌────────────┐   ┌────────────┐   ┌────────────┐             │
│  │   Web UI   │   │ VS Code IDE│   │    API     │             │
│  │  (:3000)   │   │  (:8443)   │   │  (:8080)   │             │
│  └──────┬─────┘   └──────┬─────┘   └──────┬─────┘             │
│         │                │                │                   │
│         └────────────────┼────────────────┘                   │
│                          │                                    │
│             ┌────────────┴────────────┐                       │
│             │       Agent Core        │                       │
│             │  (Reasoning Engines)    │                       │
│             └────────────┬────────────┘                       │
│                          │                                    │
│         ┌────────────────┼────────────────┐                   │
│         │                │                │                   │
│    ┌────┴────┐     ┌─────┴─────┐    ┌─────┴─────┐             │
│    │  Redis  │     │  LiteLLM  │    │ ChromaDB  │             │
│    │ (Cache) │     │ (Gateway) │    │ (Memory)  │             │
│    │ (:6379) │     │  (:4000)  │    │  (:8000)  │             │
│    └─────────┘     └─────┬─────┘    └───────────┘             │
│                          │                                    │
│             ┌────────────┼────────────┐                       │
│         ┌───┴───┐    ┌───┴───┐    ┌───┴───┐                   │
│         │ Groq  │    │Claude │    │Mistral│                   │
│         └───┬───┘    └───┬───┘    └───┬───┘                   │
│         ┌───┴───┐   ┌────┴───┐    ┌───┴───┐                   │
│         │ Kimi  │   │DeepSeek│    │  ...  │                   │
│         └───────┘   └────────┘    └───────┘                   │
│                                                               │
└───────────────────────────────────────────────────────────────┘
```
---
## 🧰 Management Commands
Use the Makefile for common operations:
```bash
make setup # Initial setup
make start # Start all services (full profile)
make start-ide # Start with IDE only
make stop # Stop all services
make logs # View logs
make status # Check service status
make update # Pull latest and update images
make backup # Backup data directories
make clean # Remove containers (data preserved)
```
Or use Docker Compose profiles:
```bash
# Core services only
docker-compose up -d
# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d
# With MCP tools
docker-compose --profile mcp up -d
```
---
## 📚 Documentation
- **[Setup Guide](docs/SETUP.md)** – Detailed installation and configuration
- **[API Reference](docs/API.md)** – Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** – Provider setup and rate limits
---
## 🔄 Updates
Update to the latest version:
```bash
# Automatic update
make update
# Or manual
git pull origin main
docker-compose pull
docker-compose up -d
```
---
## 🔍 Troubleshooting
### Common Issues
| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit `docker-compose.yml` port mappings |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or reduce to core services only |
### Health Checks
```bash
# Check API health
curl http://localhost:8080/health
# Check LiteLLM
curl http://localhost:4000/health
# View all logs
make logs
# Check container status
docker-compose ps
```
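
For scripted deployments it can help to block until the API reports healthy. A small poller, assuming the `/health` route above returns 200 once the service is ready:

```python
import time

import requests

def wait_for_api(url: str = "http://localhost:8080/health",
                 timeout: float = 120.0) -> bool:
    """Poll the health endpoint until it returns 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if requests.get(url, timeout=5).status_code == 200:
                return True
        except requests.RequestException:
            pass  # service not up yet; keep polling
        time.sleep(3)
    return False

if __name__ == "__main__":
    print("API ready" if wait_for_api() else "API did not become ready in time")
```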
---
## 🛡️ Security Considerations
1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (reverse proxy recommended)
3. **Restrict network access** to admin ports (8080, 8443)
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
---
## 🤝 Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Submit a pull request
---
## 📄 License
This project is licensed under the MIT License.
---
**[⬆ Back to Top](#-agentic-llm-hub)**
Built with ❤️ for the self-hosting community