API Reference

Base URL

http://your-server-ip:8080/v1

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer sk-agent-your-key
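A minimal sketch of building this header in Python, assuming the key is kept out of source code in an environment variable. The helper name `auth_headers` and the variable name `LLM_HUB_API_KEY` are hypothetical and chosen for this example; they are not part of the API.

```python
import os


def auth_headers(api_key=None):
    """Return the Authorization header for a request.

    Falls back to the LLM_HUB_API_KEY environment variable (a hypothetical
    name used only in this example) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("LLM_HUB_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the `headers` argument of an HTTP client call.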

Endpoints

POST /chat/completions

Main agent endpoint. Sends a message to the agent and returns its response together with the reasoning steps it took.

Request:

{
  "message": "Create a Python script to fetch weather data",
  "reasoning_mode": "plan_execute",
  "session_id": "unique-session-id",
  "max_iterations": 10
}

Response:

{
  "response": "Here's the Python script...",
  "reasoning_mode": "plan_execute",
  "session_id": "unique-session-id",
  "steps": [
    {"step_number": 1, "type": "plan", "content": "..."},
    {"step_number": 2, "type": "action", "content": "..."}
  ],
  "metadata": {
    "model_used": "volume-tier",
    "auto_selected": true,
    "timestamp": "2024-..."
  }
}
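As an illustration of reading this structure, the sketch below turns the `steps` array into one line per step. The helper `summarize_steps` is hypothetical, and the sample data only mirrors the response shown above. Whether `steps` is always present is not stated here, so the code treats it as optional.

```python
def summarize_steps(body):
    """Return one formatted line per reasoning step in a response body."""
    lines = []
    # Treat "steps" as optional, since the reference does not say it is always returned.
    for step in body.get("steps", []):
        lines.append(f'{step["step_number"]}. [{step["type"]}] {step["content"]}')
    return lines


# Sample shaped like the response above; the content values are placeholders.
sample = {
    "response": "Here's the Python script...",
    "steps": [
        {"step_number": 1, "type": "plan", "content": "..."},
        {"step_number": 2, "type": "action", "content": "..."},
    ],
    "metadata": {"model_used": "volume-tier", "auto_selected": True},
}

for line in summarize_steps(sample):
    print(line)
```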

Reasoning Modes

| Mode | Use Case | Speed | Accuracy |
|------|----------|-------|----------|
| react | Simple Q&A, debugging | Fast | Medium |
| plan_execute | Complex multi-step tasks | Medium | High |
| reflexion | Code review, critical tasks | Slow | Very High |
| auto | Let system decide | Variable | Adaptive |
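One way to avoid sending a misspelled mode is to validate it on the client before building the request body. This is a sketch only: `build_payload` is a hypothetical helper, defaulting to `auto` is a choice made for the example, and treating `session_id` and `max_iterations` as optional is an assumption based on the examples in this document omitting them.

```python
# The four modes listed in the table above.
REASONING_MODES = {"react", "plan_execute", "reflexion", "auto"}


def build_payload(message, reasoning_mode="auto", session_id=None, max_iterations=None):
    """Build a /chat/completions request body, rejecting unknown modes."""
    if reasoning_mode not in REASONING_MODES:
        raise ValueError(f"unknown reasoning_mode: {reasoning_mode!r}")
    payload = {"message": message, "reasoning_mode": reasoning_mode}
    # Only include the optional fields when the caller supplies them.
    if session_id is not None:
        payload["session_id"] = session_id
    if max_iterations is not None:
        payload["max_iterations"] = max_iterations
    return payload
```

For example, `build_payload("Review this function", "reflexion", max_iterations=10)` yields a body suitable for the `json` argument of an HTTP client call.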

GET /models

List available models.

GET /health

Check system status.

GET /sessions/{id}/history

Retrieve the conversation history for the session with the given ID.
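The three GET endpoints can be called with the same pattern. The sketch below uses only the Python standard library. The helper names are hypothetical, the base URL is the local one used in the examples in this document, and it assumes these endpoints return JSON, which this reference does not state.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080/v1"


def endpoint_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path with exactly one slash."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


def history_url(session_id, base_url=BASE_URL):
    """Build the URL for GET /sessions/{id}/history."""
    # Percent-encode the ID so characters such as "/" cannot alter the path.
    encoded = urllib.parse.quote(session_id, safe="")
    return endpoint_url(f"sessions/{encoded}/history", base_url)


def get_json(url, api_key, timeout=10):
    """Send an authenticated GET request and decode the body as JSON (assumed format)."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running, `get_json(endpoint_url("health"), "sk-agent-xxx")` would check system status, and `get_json(history_url("unique-session-id"), "sk-agent-xxx")` would fetch a session's history.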

Examples

Python

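import requests

response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={"Authorization": "Bearer sk-agent-xxx"},
    json={
        "message": "Refactor this code",
        "reasoning_mode": "reflexion"
    },
    # requests has no default timeout; 120 seconds is an arbitrary value
    # chosen because reflexion is the slowest mode.
    timeout=120,
)
# Raise an exception on 4xx/5xx instead of parsing an error body.
response.raise_for_status()
print(response.json()["response"])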

cURL

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-agent-xxx" \
  -H "Content-Type: application/json" \
  -d '{"message":"Hello","reasoning_mode":"auto"}'