Proxmox Production VM Setup - Best Practices

Overview

Create an isolated production environment with security hardening, guaranteed resources, and automated backups.

Architecture

Proxmox Host
├── VM-200: compute-node-01 (production compute)
├── VM-201: production servers
│
├── VM-202: DevMatrix-Prod (NEW - Mission Control Production)
│   ├── Mission Control (production)
│   ├── Reverse Proxy (Traefik)
│   ├── Monitoring (Prometheus/Grafana)
│   └── Public-facing services
│
├── VM-300: OpenClaw-DevMatrix (Development)
├── VM-301: Windows-LTSC-Test
└── VM-302: Android-Emulator

VM Specifications

Production VM (VM-202)

  • Name: DevMatrix-Prod
  • VM ID: 202
  • OS: Ubuntu 22.04 LTS (minimal server)
  • CPU: 4 cores (dedicated, not shared)
  • RAM: 8GB (reserved, not ballooning)
  • Disk: 100GB SSD (thin provisioned)
  • Network: vmbr0 (same LAN, separate IP)
  • IP: 192.168.5.211 (static)

Resource Allocation Strategy

  • CPU Units: 2048 (higher priority than dev)
  • CPU Limit: 4 (hard limit)
  • Memory: 8GB (no ballooning)
  • Swap: Disabled (prevents performance issues)
  • Disk I/O: SSD optimized
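
The "Swap: Disabled" item is not covered by the setup script later in this document. The following is a minimal sketch of one way to do it inside the guest, assuming swap is declared in /etc/fstab; `comment_out_swap` is a hypothetical helper name, and on many cloud images there is no swap entry, in which case it changes nothing.

```shell
#!/bin/bash
# Sketch: keep swap disabled across reboots inside the guest.
# comment_out_swap is a hypothetical helper, not part of the original setup script.

# Comment out every active entry whose filesystem type (third field) is "swap".
comment_out_swap() {
    local fstab="$1"
    sed -i -E \
        '/^[[:space:]]*[^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap([[:space:]]|$)/ s/^/# /' \
        "$fstab"
}

# On the VM, as root:
#   swapoff -a                     # turn swap off for the running system
#   comment_out_swap /etc/fstab    # keep it off after the next reboot
```

Already-commented lines and non-swap mounts are left untouched, so the helper can be run more than once.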

Security Hardening

1. Network Isolation

  • Separate VLAN for production (optional)
  • Firewall: allow only ports 80 and 443, plus port 22 from the management network
  • No direct internet access (via proxy)

2. Access Control

  • SSH key only (no passwords)
  • Fail2ban enabled
  • Root login disabled
  • Sudo with password for admin tasks
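
The setup script later in this document installs fail2ban but does not change the SSH daemon settings. As a hedged sketch, key-only login and disabled root login could be enforced with an sshd drop-in file, assuming the guest's sshd_config includes /etc/ssh/sshd_config.d/*.conf (as Ubuntu 22.04 does). The function name `write_sshd_hardening` and the file name `00-devmatrix-hardening.conf` are illustrative choices, not part of the original setup.

```shell
#!/bin/bash
# Sketch: enforce key-only SSH and no root login through an sshd drop-in file.
# write_sshd_hardening and 00-devmatrix-hardening.conf are hypothetical names.

write_sshd_hardening() {
    local dropin_dir="${1:-/etc/ssh/sshd_config.d}"
    # sshd keeps the first value it reads for each keyword, so a low-numbered
    # file takes precedence over drop-ins that sort after it.
    cat > "${dropin_dir}/00-devmatrix-hardening.conf" << 'EOF'
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
EOF
}

# On the VM, as root, after a working key is in place:
#   write_sshd_hardening && sshd -t && systemctl reload ssh
```

Running `sshd -t` before the reload catches syntax errors, and keeping an existing session open while testing a new login avoids locking yourself out.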

3. VM-level Security

  • QEMU Guest Agent enabled
  • Secure Boot (optional)
  • TPM 2.0 (optional, for secrets)

Proxmox Configuration Steps

Step 1: Create Cloud-Init Template

# On Proxmox host
# Download Ubuntu 22.04 cloud image
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img

# Create template VM
qm create 9000 --name ubuntu-22.04-template --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk 9000 jammy-server-cloudimg-amd64.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
qm set 9000 --ide2 local-lvm:cloudinit
qm set 9000 --boot order=scsi0
qm set 9000 --serial0 socket --vga serial0
qm set 9000 --agent enabled=1

# Convert to template
qm template 9000

Step 2: Create Production VM from Template

# Clone template (VM ID 202, matching the specification above)
qm clone 9000 202 --name DevMatrix-Prod --full

# Configure resources
qm set 202 --memory 8192 --balloon 0  # 8GB, no ballooning
qm set 202 --cores 4 --cpulimit 4 --cpuunits 2048  # 4-core hard limit, high priority
qm set 202 --scsihw virtio-scsi-single

# Resize disk
qm disk resize 202 scsi0 100G

# Configure network (static IP)
qm set 202 --ipconfig0 ip=192.168.5.211/24,gw=192.168.5.1

# Start VM
qm start 202
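
Once the VM has booted and qemu-guest-agent is installed (the setup script below installs it), it can be useful to wait until the agent answers before running further automation. The following is a small sketch of a generic retry helper; `retry` is a hypothetical function, and the usage line assumes the production VM's ID is 202 as listed in the specification.

```shell
#!/bin/bash
# Sketch: retry a command a fixed number of times with a pause between attempts.
# retry is a hypothetical helper; usage: retry <attempts> <delay_seconds> <command...>

retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        # Succeed as soon as the command succeeds; its output is discarded.
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# On the Proxmox host:
#   retry 30 2 qm agent 202 ping && echo "guest agent is responding"
```

`qm agent <vmid> ping` only succeeds when the agent is both enabled in the VM config and running in the guest, so a timeout here usually points at one of those two.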

Step 3: VM-Level Backups (Proxmox)

# Create backup job in Proxmox
# Datacenter → Backup → Add
# Schedule: Daily 01:00
# Mode: Snapshot (for running VMs)
# Compression: ZSTD
# Storage: NAS/Backup storage
# Retention: Keep 7 daily, 4 weekly, 12 monthly
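
The same settings can also be exercised from the command line, for example to run a one-off backup before a risky change. This sketch only prints the `vzdump` invocation so it can be reviewed first; `build_backup_cmd` is a hypothetical helper and `nas-backups` is a placeholder storage ID that must be replaced with a storage defined on your host.

```shell
#!/bin/bash
# Sketch: print a vzdump command matching the backup job settings above.
# build_backup_cmd is a hypothetical helper; "nas-backups" is a placeholder storage ID.

build_backup_cmd() {
    local vmid="$1" storage="$2"
    printf 'vzdump %s --mode snapshot --compress zstd --storage %s --prune-backups keep-daily=7,keep-weekly=4,keep-monthly=12\n' \
        "$vmid" "$storage"
}

# Review the command, then run it on the Proxmox host if it looks right.
build_backup_cmd 202 nas-backups
```

Printing rather than executing keeps the example safe to run anywhere; on the host, the printed line can be pasted into a shell as-is.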

Step 4: Firewall Rules (Proxmox Host Level)

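# /etc/pve/firewall/202.fw
# Production VM firewall rules
# These rules take effect only if the firewall is also enabled at the
# datacenter level and on the VM's network device (firewall=1 on net0)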

[OPTIONS]
enable: 1

[RULES]
# Allow HTTP/HTTPS
IN ACCEPT -p tcp -dport 80
IN ACCEPT -p tcp -dport 443

# Allow SSH from management network only
IN ACCEPT -p tcp -dport 22 -source 192.168.5.0/24

# Block everything else
IN DROP
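
To confirm the rules behave as intended, the ports can be probed from another machine. This is a minimal sketch using bash's built-in /dev/tcp redirection, so no extra tools are needed; `port_open` and `report_ports` are hypothetical helper names. Dropped packets produce a timeout rather than a refusal, which is why each probe is wrapped in `timeout`.

```shell
#!/bin/bash
# Sketch: probe TCP ports to verify firewall rules from a client machine.
# port_open and report_ports are hypothetical helpers.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
    local host="$1" port="$2"
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2> /dev/null
}

# Print one line per port with its observed state.
report_ports() {
    local host="$1" port
    shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "${port} open"
        else
            echo "${port} closed"
        fi
    done
}

# From a client machine:
#   report_ports 192.168.5.211 22 80 443 3000
```

From inside 192.168.5.0/24, ports 22, 80, and 443 should be reachable once the services are running, and any other port should not be; a port with no listening service also shows as closed.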

Post-VM Setup Script

Create this script to run on the new production VM after creation:

#!/bin/bash
# Production VM Setup Script
# Run as root on new VM

set -e

echo "🚀 Setting up DevMatrix Production VM"

# 1. System updates
echo "Updating system..."
apt-get update && apt-get upgrade -y

# 2. Install essentials
echo "Installing packages..."
apt-get install -y \
    curl wget git htop ncdu \
    fail2ban ufw unattended-upgrades \
    qemu-guest-agent \
    nfs-common cifs-utils

# 3. Configure automatic updates
echo "Configuring auto-updates..."
cat > /etc/apt/apt.conf.d/50unattended-upgrades << 'EOF'
Unattended-Upgrade::Allowed-Origins {
    "${distro_id}:${distro_codename}-security";
};
Unattended-Upgrade::AutoFixInterruptedDpkg "true";
Unattended-Upgrade::MinimalSteps "true";
Unattended-Upgrade::InstallOnShutdown "false";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
Unattended-Upgrade::Remove-New-Unused-Dependencies "true";
Unattended-Upgrade::Automatic-Reboot "true";
Unattended-Upgrade::Automatic-Reboot-Time "03:00";
EOF

# 4. Configure firewall
echo "Configuring firewall..."
ufw default deny incoming
ufw default allow outgoing
ufw allow from 192.168.5.0/24 to any port 22 comment 'SSH from LAN'
ufw allow 80 comment 'HTTP'
ufw allow 443 comment 'HTTPS'
ufw --force enable

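# 5. Configure fail2ban
echo "Setting up fail2ban..."
# Overwrite rather than append so re-running the script does not duplicate sections
cat > /etc/fail2ban/jail.local << 'EOF'
[DEFAULT]
bantime = 3600
findtime = 600
maxretry = 3

[sshd]
enabled = true
port = 22
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
EOF

systemctl enable fail2ban
# The package starts the service on install, so restart to load jail.local
systemctl restart fail2ban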

# 6. Mount NAS storage
echo "Setting up NAS mounts..."
mkdir -p /mnt/nas/backups /mnt/nas/shared

cat >> /etc/fstab << 'EOF'
# NAS Mounts
192.168.5.195:/mnt/NAS2/devmatrix/backups /mnt/nas/backups nfs defaults,_netdev,noatime 0 0
192.168.5.195:/mnt/NAS2/devmatrix/shared /mnt/nas/shared nfs defaults,_netdev,noatime 0 0
EOF

mount -a

# 7. Create devmatrix user
echo "Creating devmatrix user..."
useradd -m -s /bin/bash -G sudo devmatrix
# ~/.ssh must belong to the user with mode 700, or key-based login will not work
install -d -m 700 -o devmatrix -g devmatrix /home/devmatrix/.ssh

# Copy SSH keys from dev VM (manual step or use ssh-copy-id)
echo "⚠️  Remember to copy SSH keys from dev VM"

# 8. Set hostname
echo "Setting hostname..."
hostnamectl set-hostname devmatrix-prod

echo "✅ Production VM setup complete!"
echo ""
echo "Next steps:"
echo "1. Copy SSH keys: ssh-copy-id devmatrix@192.168.5.211"
echo "2. Clone Mission Control repo"
echo "3. Run production deployment"
echo "4. Configure monitoring"
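
The script above appends to /etc/fstab with `>>`, so running it a second time would duplicate the NAS entries. One possible guard, sketched here under the assumption that entries are compared as exact whole lines, is a small helper that appends a line only when it is not already present; `append_once` is a hypothetical name.

```shell
#!/bin/bash
# Sketch: append a line to a file only if that exact line is not already there.
# append_once is a hypothetical helper, not part of the original script.

append_once() {
    local line="$1" file="$2"
    # -x matches whole lines, -F treats the line as a fixed string, not a regex.
    grep -qxF -- "$line" "$file" 2> /dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, as root on the VM:
#   append_once '192.168.5.195:/mnt/NAS2/devmatrix/backups /mnt/nas/backups nfs defaults,_netdev,noatime 0 0' /etc/fstab
```

Because the comparison is exact, changing a mount option produces a new line rather than updating the old one, so edits to existing entries still need to be made by hand.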

High Availability Setup (Future)

For even higher availability, consider:

Option 1: Proxmox HA Cluster

  • 3-node Proxmox cluster
  • Shared storage (Ceph or NFS)
  • Automatic failover
  • VM migration between nodes

Option 2: Load Balancer + Multiple VMs

Internet
    ↓
HAProxy (VM-203) - Load balancer
    ↓
├─ Mission Control (VM-202)
└─ Mission Control (VM-204) - Replica

Option 3: Container Orchestration (Advanced)

Proxmox
└─ Kubernetes Cluster (5+ VMs)
    ├─ Control-plane nodes (3 VMs, an odd number so etcd keeps quorum)
    └─ Worker nodes (2+ VMs)
        └─ Mission Control (container)

Migration Strategy (Dev → Prod)

Phase 1: Setup Production VM

  1. Create VM in Proxmox
  2. Run setup script
  3. Install Node.js, PM2, etc.

Phase 2: Deploy to Production

  1. Clone Mission Control repo on prod VM
  2. Copy database from dev to prod
  3. Run production deployment
  4. Test all functionality

Phase 3: DNS Switchover

  1. Point domain to production IP
  2. Keep dev running for rollback
  3. Monitor for 24 hours

Phase 4: Decommission Dev (Optional)

  1. Wait until prod has been stable through the monitoring period
  2. Repurpose dev for staging/testing

Monitoring Proxmox Itself

Don't forget to monitor the Proxmox host:

# Install Zabbix Agent or Prometheus Node Exporter
# Monitor:
# - Host CPU/RAM/Disk
# - VM status
# - Network throughput
# - Storage health (SMART)
# - UPS status (if applicable)
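
For the "VM status" item, a lightweight check can be scripted on the host before a full monitoring stack is in place. This sketch assumes `qm status <vmid>` prints a line of the form `status: running`; `vm_state` and `vm_is_running` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: extract the VM state from `qm status` output.
# vm_state and vm_is_running are hypothetical helpers.

# Read `qm status` output on stdin and print the state word (e.g. "running").
vm_state() {
    awk '/^status:/ { print $2 }'
}

# Return 0 only when the text on stdin reports a running VM.
vm_is_running() {
    [ "$(vm_state)" = "running" ]
}

# On the Proxmox host, e.g. from cron:
#   qm status 202 | vm_is_running || echo "VM 202 is not running"
```

Reading from stdin keeps the parsing separate from the `qm` call, so the helpers can be tried out on any machine with sample output.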

Checklist

Pre-VM Creation

  • Sufficient disk space on Proxmox
  • Network configured (vmbr0)
  • NAS accessible from new VM IP range
  • Static IP available (192.168.5.211)

VM Creation

  • Create from template
  • Configure resources (4 CPU, 8GB RAM)
  • Set static IP
  • Enable QEMU Guest Agent
  • Configure backups in Proxmox

Post-Setup

  • System updates
  • Firewall configured
  • Fail2ban enabled
  • NAS mounted
  • SSH keys copied
  • User created
  • Mission Control deployed
  • Health monitoring active

Validation

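  • Mission Control accessible at 192.168.5.211 through the reverse proxy (port 3000 is not opened by the firewall rules above, so test it locally on the VM)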
  • Health endpoint responding
  • Backups working
  • Monitoring alerts working
  • Can SSH from dev VM

Cost Analysis

Current: 1 VM running everything

  • Pros: Simple, less overhead
  • Cons: Dev affects prod, no isolation

Proposed: 2 VMs

  • Dev VM: 2 CPU, 4GB RAM (reduced since just dev)
  • Prod VM: 4 CPU, 8GB RAM (dedicated)
  • Pros: Isolation, security, reliability
  • Cons: Slightly more resource usage

Recommendation

Start with the two-VM approach: separating dev from prod is standard practice because it keeps development changes from affecting production. You can add HA later if needed.

Ready to create the production VM?