llm-hub/docs/API.md

# API Reference

## Base URL
```
http://your-server-ip:8080/v1
```

## Authentication
All requests require Bearer token:
```
Authorization: Bearer sk-agent-your-key
```

## Endpoints

### POST /chat/completions
Main agent endpoint.

**Request:**
```json
{
  "message": "Create a Python script to fetch weather data",
  "reasoning_mode": "plan_execute",
  "session_id": "unique-session-id",
  "max_iterations": 10
}
```

**Response:**
```json
{
  "response": "Here\'s the Python script...",
  "reasoning_mode": "plan_execute",
  "session_id": "unique-session-id",
  "steps": [
    {"step_number": 1, "type": "plan", "content": "..."},
    {"step_number": 2, "type": "action", "content": "..."}
  ],
  "metadata": {
    "model_used": "volume-tier",
    "auto_selected": true,
    "timestamp": "2024-..."
  }
}
```

### Reasoning Modes

| Mode | Use Case | Speed | Accuracy |
|------|----------|-------|----------|
| `react` | Simple Q&A, debugging | Fast | Medium |
| `plan_execute` | Complex multi-step tasks | Medium | High |
| `reflexion` | Code review, critical tasks | Slow | Very High |
| `auto` | Let system decide | Variable | Adaptive |

### GET /models
List available models.

### GET /health
Check system status.

### GET /sessions/{id}/history
Retrieve conversation history.

## Examples

### Python
```python
import requests

response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={"Authorization": "Bearer sk-agent-xxx"},
    json={
        "message": "Refactor this code",
        "reasoning_mode": "reflexion"
    }
)
print(response.json()["response"])
```

### cURL
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{"message":"Hello","reasoning_mode":"auto"}'
```