<div align="center">

# 🤖 Agentic LLM Hub

**Self-hosted AI agent platform with multi-provider LLM aggregation, advanced reasoning engines, and integrated development environment.**

[](https://www.docker.com/)
[](LICENSE)
[](https://www.linux.org/)

[Quick Start](#-quick-start) • [Features](#-features) • [Documentation](docs/) • [API Reference](docs/API.md)

</div>

---

## 📋 Overview

Agentic LLM Hub is a production-ready, self-hosted platform that unifies multiple LLM providers behind a single API gateway. It features intelligent request routing, multiple reasoning engines (ReAct, Plan-and-Execute, Reflexion), persistent memory via vector storage, and a complete web-based IDE with AI assistance.

### Why Agentic LLM Hub?

- **🔌 Multi-Provider Aggregation** — Seamlessly route between 7+ LLM providers with automatic failover
- **🧠 Advanced Reasoning** — Choose from multiple reasoning modes based on task complexity
- **💰 Cost Optimization** — Free-tier providers prioritized, paid providers as fallback
- **🖥️ Complete Workspace** — Browser-based VS Code with AI assistant integration
- **🔧 Extensible** — MCP tool ecosystem for custom integrations
- **📊 Persistent Memory** — Vector-based conversation history and knowledge storage

---

## ✨ Features

| Component | Technology | Purpose |
|-----------|------------|---------|
| **LLM Gateway** | [LiteLLM](https://github.com/BerriAI/litellm) | Unified API for 7+ providers with load balancing |
| **Reasoning Engine** | Custom implementation | ReAct, Plan-and-Execute, Reflexion modes |
| **Vector Memory** | [ChromaDB](https://www.trychroma.com/) | Persistent embeddings & conversation history |
| **Cache Layer** | Redis | Response caching & session management |
| **Web IDE** | [code-server](https://github.com/coder/code-server) | VS Code in browser with Continue.dev |
| **Chat Interface** | [Open WebUI](https://github.com/open-webui/open-webui) | Modern conversational UI |
| **Tool Gateway** | MCP Gateway | Extensible tool ecosystem |

### Supported Providers

| Provider | Free Tier | Best For |
|----------|-----------|----------|
| **Groq** | 20 RPM | Speed, quick coding |
| **Mistral** | 1B tokens/month | High-volume processing |
| **OpenRouter** | 50 req/day | Universal fallback access |
| **Anthropic Claude** | $5 trial | Complex reasoning |
| **Moonshot Kimi** | $5 signup | Coding, 128K context |
| **DeepSeek** | Pay-as-you-go | Cheapest reasoning |
| **Cohere** | 1K calls/month | Embeddings |
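
The free-tier-first failover described above happens server-side in the LiteLLM gateway. As an illustrative sketch only (the priority order and stub clients below are assumptions, not the hub's actual implementation), the policy looks roughly like this in Python:

```python
from typing import Callable, Dict

# Cheapest (free-tier) providers first, paid fallbacks last.
# Order is illustrative; the real routing lives in the gateway config.
PROVIDER_PRIORITY = ["groq", "mistral", "openrouter", "deepseek", "anthropic"]

def call_with_failover(providers: Dict[str, Callable[[str], str]], prompt: str) -> str:
    """Try each configured provider in priority order; raise if all fail."""
    errors: Dict[str, str] = {}
    for name in PROVIDER_PRIORITY:
        if name not in providers:
            continue
        try:
            return providers[name](prompt)
        except Exception as exc:  # rate limit, timeout, outage, ...
            errors[name] = str(exc)
    raise RuntimeError(f"All providers failed: {errors}")

# Demo with stubs standing in for real provider clients.
def flaky_groq(prompt: str) -> str:
    raise TimeoutError("rate limited")

stubs = {"groq": flaky_groq, "mistral": lambda p: f"mistral: {p}"}
print(call_with_failover(stubs, "hello"))  # falls through from groq to mistral
```

Here the `groq` stub fails, so the call falls through to `mistral`; the gateway applies the same idea across live provider APIs.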

---

## 🚀 Quick Start

### Prerequisites

- **OS**: Debian 12, Ubuntu 22.04+, or Proxmox LXC
- **RAM**: 4GB minimum (8GB recommended with IDE)
- **Storage**: 20GB free space
- **Docker** & Docker Compose

### Installation

```bash
# 1. Clone the repository
git clone https://github.com/yourusername/llm-hub.git
cd llm-hub

# 2. Configure environment
cp .env.example .env
nano .env  # Add your API keys

# 3. Deploy
make setup
make start
```

### Or use scripts directly:

```bash
chmod +x setup.sh start.sh
./setup.sh && ./start.sh full
```

---

## 🌐 Access Points

Once running, access the services at:

| Service | URL | Description |
|---------|-----|-------------|
| 📝 **Web UI** | `http://localhost:3000` | Chat interface with Open WebUI |
| 💻 **VS Code IDE** | `http://localhost:8443` | Full IDE with AI assistant |
| 🔌 **Agent API** | `http://localhost:8080/v1` | Main API endpoint |
| ⚡ **LiteLLM** | `http://localhost:4000` | LLM Gateway & model management |
| 🔧 **MCP Tools** | `http://localhost:8001/docs` | Tool OpenAPI documentation |
| 🧠 **ChromaDB** | `http://localhost:8000` | Vector memory dashboard |

---

## 🧠 Reasoning Modes

Choose the right reasoning strategy for your task:

| Mode | Description | Speed | Accuracy | Best For |
|------|-------------|-------|----------|----------|
| `react` | Iterative thought-action loops | ⚡ Fast | ⭐⭐⭐ Medium | Simple Q&A, debugging |
| `plan_execute` | Plan first, then execute | 🚀 Medium | ⭐⭐⭐⭐ High | Multi-step tasks, automation |
| `reflexion` | Self-correcting with verification | 🐢 Slow | ⭐⭐⭐⭐⭐ Very High | Code review, critical analysis |
| `auto` | Automatic mode selection | ⚡ Variable | ⭐⭐⭐⭐ Adaptive | General purpose |

Set your default mode in `.env`:

```bash
DEFAULT_REASONING=auto
```

Or specify per-request:

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Refactor this codebase",
    "reasoning_mode": "reflexion"
  }'
```

---

## 🛠️ Configuration

### 1. API Keys

Edit `.env` and add at least one provider:

```bash
# Required: At least one LLM provider
GROQ_API_KEY_1=gsk_xxx
MISTRAL_API_KEY=your_key

# Recommended: Multiple providers for redundancy
ANTHROPIC_API_KEY=sk-ant-xxx
MOONSHOT_API_KEY=sk-xxx
OPENROUTER_API_KEY=sk-or-xxx
```

### 2. Security

Change the default passwords in `.env`:

```bash
# Code-Server access
IDE_PASSWORD=your-secure-password
IDE_SUDO_PASSWORD=your-admin-password

# API access
MASTER_KEY=sk-agent-$(openssl rand -hex 16)
```

### 3. Advanced Settings

```bash
# Enable self-reflection
ENABLE_REFLECTION=true

# Maximum iterations per request
MAX_ITERATIONS=10

# Knowledge graph (requires more RAM)
ENABLE_KNOWLEDGE_GRAPH=false
```

---

## 💻 Usage Examples

### Python SDK

```python
import requests

API_URL = "http://localhost:8080/v1"
API_KEY = "sk-agent-xxx"

response = requests.post(
    f"{API_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "Create a Python script to fetch weather data",
        "reasoning_mode": "plan_execute",
        "session_id": "my-session-001"
    }
)

result = response.json()
print(result["response"])
print(f"Steps taken: {len(result['steps'])}")
```

### cURL

```bash
# Simple query
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain quantum computing",
    "reasoning_mode": "react"
  }'

# Complex task with history
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Build a REST API with FastAPI",
    "reasoning_mode": "plan_execute",
    "max_iterations": 15
  }'
```

### OpenAI-Compatible API

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-agent-xxx"
)

response = client.chat.completions.create(
    model="agent/orchestrator",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"reasoning_mode": "auto"}
)
```

---

## 📊 Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                       Agentic LLM Hub                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │    Web UI    │   │ VS Code IDE  │   │     API      │     │
│  │   (:3000)    │   │   (:8443)    │   │   (:8080)    │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
│         │                  │                  │             │
│         └──────────────────┼──────────────────┘             │
│                            │                                │
│               ┌────────────┴────────────┐                   │
│               │       Agent Core        │                   │
│               │   (Reasoning Engines)   │                   │
│               └────────────┬────────────┘                   │
│                            │                                │
│        ┌───────────────────┼───────────────────┐            │
│        │                   │                   │            │
│   ┌────┴────┐        ┌─────┴─────┐      ┌──────┴──────┐     │
│   │  Redis  │        │  LiteLLM  │      │  ChromaDB   │     │
│   │ (Cache) │        │ (Gateway) │      │  (Memory)   │     │
│   │ (:6379) │        │  (:4000)  │      │   (:8000)   │     │
│   └─────────┘        └─────┬─────┘      └─────────────┘     │
│                            │                                │
│              ┌─────────────┼─────────────┐                  │
│              │             │             │                  │
│         ┌────┴───┐    ┌────┴───┐    ┌────┴───┐              │
│         │  Groq  │    │ Claude │    │ Mistral│              │
│         └───┬────┘    └───┬────┘    └───┬────┘              │
│             │             │             │                   │
│         ┌───┴──┐     ┌────┴───┐    ┌────┴───┐               │
│         │ Kimi │     │DeepSeek│    │  ...   │               │
│         └──────┘     └────────┘    └────────┘               │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

---

## 🧰 Management Commands

Use the Makefile for common operations:

```bash
make setup      # Initial setup
make start      # Start all services (full profile)
make start-ide  # Start with IDE only
make stop       # Stop all services
make logs       # View logs
make status     # Check service status
make update     # Pull latest and update images
make backup     # Backup data directories
make clean      # Remove containers (data preserved)
```

Or use Docker Compose profiles:

```bash
# Core services only
docker-compose up -d

# Full stack with IDE and UI
docker-compose --profile ide --profile ui up -d

# With MCP tools
docker-compose --profile mcp up -d
```

---

## 📚 Documentation

- **[Setup Guide](docs/SETUP.md)** — Detailed installation and configuration
- **[API Reference](docs/API.md)** — Complete API documentation with examples
- **[Provider Guide](docs/PROVIDERS.md)** — Provider setup and rate limits

---

## 🔄 Updates

Update to the latest version:

```bash
# Automatic update
make update

# Or manually
git pull origin main
docker-compose pull
docker-compose up -d
```

---

## 🐛 Troubleshooting

### Common Issues

| Issue | Solution |
|-------|----------|
| Docker fails in LXC | Enable nesting: `features: nesting=1,keyctl=1` |
| Port conflicts | Edit `docker-compose.yml` port mappings |
| Permission denied | Run `chown -R 1000:1000 workspace/` |
| API not responding | Check `docker-compose logs agent-core` |
| Out of memory | Increase swap or reduce to core services only |

### Health Checks

```bash
# Check API health
curl http://localhost:8080/health

# Check LiteLLM
curl http://localhost:4000/health

# View all logs
make logs

# Check container status
docker-compose ps
```

---

## 🛡️ Security Considerations

1. **Change default passwords** in `.env` before deploying
2. **Use HTTPS** in production (reverse proxy recommended)
3. **Restrict network access** to admin ports (8080, 8443)
4. **Rotate API keys** regularly
5. **Review provider rate limits** to prevent unexpected costs
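
For point 4, a minimal key-rotation sketch (assuming `openssl` is on the PATH; writing the new key into `.env` and restarting the stack are left as manual steps):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Generate a fresh master key: "sk-agent-" prefix plus 32 hex chars.
NEW_KEY="sk-agent-$(openssl rand -hex 16)"
echo "${NEW_KEY}"

# Then set MASTER_KEY=${NEW_KEY} in .env and restart: make stop && make start
```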

---

## 🤝 Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Submit a pull request

---

## 📄 License

This project is licensed under the MIT License.

---

<div align="center">

**[⬆ Back to Top](#-agentic-llm-hub)**

Built with ❤️ for the self-hosting community

</div>