# Proxmox Production VM Setup - Best Practices

## Overview

Create an isolated production environment with security hardening, resource guarantees, and automated backups.

## Architecture

```
Proxmox Host
├── VM-200: compute-node-01 (production compute)
├── VM-201: production servers
│
├── VM-202: DevMatrix-Prod (NEW - Mission Control Production)
│   ├── Mission Control (production)
│   ├── Reverse Proxy (Traefik)
│   ├── Monitoring (Prometheus/Grafana)
│   └── Public-facing services
│
├── VM-300: OpenClaw-DevMatrix (Development)
├── VM-301: Windows-LTSC-Test
└── VM-302: Android-Emulator
```

## VM Specifications

### Production VM (VM-202)
- **Name:** DevMatrix-Prod
- **VM ID:** 202
- **OS:** Ubuntu 22.04 LTS (minimal server)
- **CPU:** 4 cores (prioritized over the dev VMs through CPU units)
- **RAM:** 8GB (fixed allocation, ballooning disabled)
- **Disk:** 100GB SSD (thin provisioned)
- **Network:** vmbr0 (same LAN, separate IP)
- **IP:** 192.168.5.211 (static)

### Resource Allocation Strategy
- **CPU Units:** 2048 (higher priority than dev)
- **CPU Limit:** 4 (hard limit)
- **Memory:** 8GB (no ballooning)
- **Swap:** Disabled (avoids latency spikes from paging)
- **Disk I/O:** SSD optimized

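
To make the CPU units figure concrete: `cpuunits` is a relative weight, so under contention CPU time is divided among running VMs roughly in proportion to their weights. The sketch below computes the production VM's approximate share when it competes with other VMs. The dev weight of 1024 and the helper name `cpu_share_percent` are assumptions for illustration, not values taken from this setup.

```bash
#!/bin/bash
# Hypothetical helper: approximate CPU share (in percent) of one VM
# when all listed VMs compete for CPU at the same time.
# Usage: cpu_share_percent <this_vm_units> <other_vm_units>...
cpu_share_percent() {
    local mine="$1"
    shift
    local total="$mine"
    local w
    for w in "$@"; do
        total=$(( total + w ))
    done
    # Integer arithmetic, rounded down
    echo $(( 100 * mine / total ))
}

# Example (assumed dev weight of 1024):
#   cpu_share_percent 2048 1024    # prints 66
```

The weight only matters when the host is saturated. The upper bound on what the VM can use still comes from its core count and CPU limit.
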
## Security Hardening

### 1. Network Isolation
- Separate VLAN for production (optional)
- Firewall: only ports 80, 443, and 22 open (SSH restricted to the management network)
- No direct internet access (outbound traffic goes through a proxy)

### 2. Access Control
- SSH key authentication only (no passwords)
- Fail2ban enabled
- Root login disabled
- Sudo with password for admin tasks

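
The key-only and no-root policies have to be applied in the SSH daemon configuration. One way to do that on Ubuntu 22.04 is a drop-in file under `/etc/ssh/sshd_config.d/`. This is a minimal sketch: the function name `write_sshd_hardening` and the file name `00-hardening.conf` are assumptions, and the `00-` prefix is chosen because sshd keeps the first value it reads for each option.

```bash
#!/bin/bash
# Hypothetical helper: write an sshd drop-in that enforces key-only logins
# and disables root login. The target directory is a parameter so the
# function can be tried against a scratch directory first.
write_sshd_hardening() {
    local conf_dir="${1:-/etc/ssh/sshd_config.d}"
    mkdir -p "$conf_dir"
    cat > "$conf_dir/00-hardening.conf" << 'EOF'
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
EOF
}

# On the VM (as root), after confirming that key-based login works:
#   write_sshd_hardening
#   sshd -t && systemctl reload ssh
```

Running `sshd -t` before the reload validates the configuration, which helps avoid locking yourself out with a broken file.
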
### 3. VM-level Security
- QEMU Guest Agent enabled
- Secure Boot (optional)
- TPM 2.0 (optional, for secrets)

## Proxmox Configuration Steps

### Step 1: Create VM Template (Optional but Recommended)
```bash
# On Proxmox host
# Download Ubuntu 22.04 cloud image
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img

# Create template VM
qm create 9000 --name ubuntu-22.04-template --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
# Import the cloud image as a disk on local-lvm
qm importdisk 9000 jammy-server-cloudimg-amd64.img local-lvm
# Attach the imported disk as scsi0
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
# Add the cloud-init drive
qm set 9000 --ide2 local-lvm:cloudinit
# Boot from the imported disk
qm set 9000 --boot order=scsi0
# Serial console, which cloud images expect
qm set 9000 --serial0 socket --vga serial0
# Enable QEMU Guest Agent
qm set 9000 --agent enabled=1

# Convert to template
qm template 9000
```

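
Before importing the image, it can be worth verifying it against the checksum list Ubuntu publishes alongside it (`SHA256SUMS` in the same directory as the image). A minimal sketch, assuming both files were downloaded into the same directory; the function name `verify_cloud_image` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: verify downloaded files against a SHA256SUMS list.
# $1 = directory that contains both the image and the SHA256SUMS file.
# --ignore-missing skips list entries for files that were not downloaded;
# sha256sum still fails if no listed file could be verified.
verify_cloud_image() {
    local dir="$1"
    ( cd "$dir" && sha256sum --check --ignore-missing --quiet SHA256SUMS )
}

# Example on the Proxmox host:
#   wget https://cloud-images.ubuntu.com/jammy/current/SHA256SUMS
#   verify_cloud_image . && echo "image OK"
```
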
### Step 2: Create Production VM from Template
```bash
# Clone template as VM 202
qm clone 9000 202 --name DevMatrix-Prod --full

# Configure resources
qm set 202 --memory 8192 --balloon 0                 # 8GB, no ballooning
qm set 202 --cores 4 --cpulimit 4 --cpuunits 2048    # 4 cores, hard limit, high priority
qm set 202 --scsihw virtio-scsi-single

# Resize disk
qm disk resize 202 scsi0 100G

# Configure network (static IP)
qm set 202 --ipconfig0 ip=192.168.5.211/24,gw=192.168.5.1

# Start VM
qm start 202
```

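
After the `qm set` calls, the result can be checked by reading back `qm config`, which prints one `key: value` pair per line. The sketch below is a hypothetical checker (`check_vm_config` is not a Proxmox command) that reads that output on stdin and compares a few keys against the values from the specification.

```bash
#!/bin/bash
# Hypothetical helper: read `qm config <vmid>` output on stdin and verify
# that selected keys have the expected values.
# Usage: qm config 202 | check_vm_config memory=8192 balloon=0 cores=4
check_vm_config() {
    local config pair key want got rc=0
    config="$(cat)"
    for pair in "$@"; do
        key="${pair%%=*}"
        want="${pair#*=}"
        # Extract the value of the first line that starts with "<key>: "
        got="$(printf '%s\n' "$config" | sed -n "s/^${key}: //p" | head -n 1)"
        if [ "$got" != "$want" ]; then
            echo "MISMATCH ${key}: expected '${want}', got '${got}'"
            rc=1
        fi
    done
    return "$rc"
}
```

The function prints one line per mismatch and returns non-zero if any key differs, so it can gate the `qm start` step in a wrapper script.
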
### Step 3: VM-Level Backups (Proxmox)
```bash
# Create backup job in Proxmox
# Datacenter → Backup → Add
# Schedule: Daily 01:00
# Mode: Snapshot (for running VMs)
# Compression: ZSTD
# Storage: NAS/Backup storage
# Retention: Keep 7 daily, 4 weekly, 12 monthly
```

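
The same settings can also be expressed on the command line with `vzdump`, which is useful for a one-off backup before risky changes. In the sketch below, the storage ID `nas-backup` is a placeholder for whatever backup storage is defined in the datacenter, and `backup_cmd` is a hypothetical wrapper that only prints the command so it can be reviewed before being run.

```bash
#!/bin/bash
# Hypothetical wrapper: print a vzdump command matching the backup job above.
# $1 = VM ID, $2 = Proxmox storage ID (placeholder default: nas-backup)
backup_cmd() {
    local vmid="$1"
    local storage="${2:-nas-backup}"
    echo "vzdump ${vmid} --mode snapshot --compress zstd" \
         "--storage ${storage}" \
         "--prune-backups keep-daily=7,keep-weekly=4,keep-monthly=12"
}

# Print the command for the production VM, review it, then run the
# printed line on the Proxmox host:
#   backup_cmd 202
```
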
### Step 4: Firewall Rules (Proxmox Host Level)
```bash
# /etc/pve/firewall/202.fw
# Production VM firewall rules

[OPTIONS]
enable: 1

[RULES]
# Allow HTTP/HTTPS
IN ACCEPT -p tcp -dport 80
IN ACCEPT -p tcp -dport 443

# Allow SSH from management network only
IN ACCEPT -p tcp -dport 22 -source 192.168.5.0/24

# Block everything else
IN DROP
```

## Post-VM Setup Script

Save this script and run it on the new production VM after creation:

```bash
#!/bin/bash
# Production VM Setup Script
# Run as root on new VM

set -e

echo "🚀 Setting up DevMatrix Production VM"

# 1. System updates
echo "Updating system..."
apt-get update && apt-get upgrade -y

# 2. Install essentials
echo "Installing packages..."
apt-get install -y \
    curl wget git htop ncdu \
    fail2ban ufw unattended-upgrades \
    qemu-guest-agent \
    nfs-common cifs-utils

# 3. Configure automatic updates
echo "Configuring auto-updates..."
cat > /etc/apt/apt.conf.d/50unattended-upgrades << 'EOF'
Unattended-Upgrade::Allowed-Origins {
    "${distro_id}:${distro_codename}-security";
};
Unattended-Upgrade::AutoFixInterruptedDpkg "true";
Unattended-Upgrade::MinimalSteps "true";
Unattended-Upgrade::InstallOnShutdown "false";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
Unattended-Upgrade::Remove-New-Unused-Dependencies "true";
Unattended-Upgrade::Automatic-Reboot "true";
Unattended-Upgrade::Automatic-Reboot-Time "03:00";
EOF

# 4. Configure firewall
echo "Configuring firewall..."
ufw default deny incoming
ufw default allow outgoing
ufw allow from 192.168.5.0/24 to any port 22 proto tcp comment 'SSH from LAN'
ufw allow 80/tcp comment 'HTTP'
ufw allow 443/tcp comment 'HTTPS'
ufw --force enable

# 5. Configure fail2ban
echo "Setting up fail2ban..."
cat >> /etc/fail2ban/jail.local << 'EOF'
[DEFAULT]
bantime = 3600
findtime = 600
maxretry = 3

[sshd]
enabled = true
port = 22
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
EOF

systemctl enable fail2ban
systemctl start fail2ban

# 6. Mount NAS storage
echo "Setting up NAS mounts..."
mkdir -p /mnt/nas/backups /mnt/nas/shared

cat >> /etc/fstab << 'EOF'
# NAS Mounts
192.168.5.195:/mnt/NAS2/devmatrix/backups /mnt/nas/backups nfs defaults,_netdev,noatime 0 0
192.168.5.195:/mnt/NAS2/devmatrix/shared /mnt/nas/shared nfs defaults,_netdev,noatime 0 0
EOF

mount -a

# 7. Create devmatrix user
echo "Creating devmatrix user..."
useradd -m -s /bin/bash -G sudo devmatrix
# Create .ssh owned by the new user with mode 700; sshd ignores keys
# in a directory with the wrong owner or loose permissions
install -d -m 700 -o devmatrix -g devmatrix /home/devmatrix/.ssh

# Copy SSH keys from dev VM (manual step or use ssh-copy-id)
echo "⚠️ Remember to copy SSH keys from dev VM"

# 8. Set hostname
echo "Setting hostname..."
hostnamectl set-hostname devmatrix-prod

echo "✅ Production VM setup complete!"
echo ""
echo "Next steps:"
echo "1. Copy SSH keys: ssh-copy-id devmatrix@192.168.5.211"
echo "2. Clone Mission Control repo"
echo "3. Run production deployment"
echo "4. Configure monitoring"
```

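
The script appends to `/etc/fstab` with `cat >>`, so running it a second time would duplicate the NAS entries. A common way around that is to append a line only when it is not already present. The helper below is a sketch of that pattern; `append_once` is a hypothetical name and is not part of the setup script.

```bash
#!/bin/bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already there, so repeated runs leave the file unchanged.
# $1 = file, $2 = line
append_once() {
    local file="$1"
    local line="$2"
    touch "$file"
    # -x: match whole lines, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example for the first NAS mount:
#   append_once /etc/fstab \
#     "192.168.5.195:/mnt/NAS2/devmatrix/backups /mnt/nas/backups nfs defaults,_netdev,noatime 0 0"
```
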
## High Availability Setup (Future)

For even higher availability, consider:

### Option 1: Proxmox HA Cluster
- 3-node Proxmox cluster
- Shared storage (Ceph or NFS)
- Automatic failover
- VM migration between nodes

### Option 2: Load Balancer + Multiple VMs
```
Internet
   ↓
HAProxy (VM-102) - Load balancer
   ↓
   ├─ Mission Control (VM-202)
   └─ Mission Control (VM-103) - Replica
```

### Option 3: Container Orchestration (Advanced)
```
Proxmox
└─ Kubernetes Cluster (4+ VMs)
   ├─ Master nodes (2 VMs)
   └─ Worker nodes (2+ VMs)
      └─ Mission Control (container)
```

## Migration Strategy (Dev → Prod)

### Phase 1: Set Up Production VM
1. Create VM in Proxmox
2. Run setup script
3. Install Node.js, PM2, etc.

### Phase 2: Deploy to Production
1. Clone Mission Control repo on prod VM
2. Copy database from dev to prod
3. Run production deployment
4. Test all functionality

### Phase 3: DNS Switchover
1. Point domain to production IP
2. Keep dev running for rollback
3. Monitor for 24 hours

### Phase 4: Decommission Dev (Optional)
1. Confirm prod is stable
2. Repurpose dev for staging/testing

## Monitoring Proxmox Itself

Don't forget to monitor the Proxmox host:

```bash
# Install Zabbix Agent or Prometheus Node Exporter
# Monitor:
# - Host CPU/RAM/Disk
# - VM status
# - Network throughput
# - Storage health (SMART)
# - UPS status (if applicable)
```

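
For the VM status item, a quick check can be scripted around `qm list`, which prints a header row followed by one row per VM with the status in the third column. The function below is a sketch (its name, `stopped_vms`, is an assumption) that reads that output on stdin and reports every VM that is not running. A cron job on the host could call it and send an alert on a non-zero exit.

```bash
#!/bin/bash
# Hypothetical helper: read `qm list` output on stdin and print the ID and
# name of every VM whose status is not "running".
# Returns non-zero if at least one such VM was found.
stopped_vms() {
    awk 'NR > 1 && NF >= 3 && $3 != "running" { print $1, $2; found = 1 }
         END { exit (found ? 1 : 0) }'
}

# On the Proxmox host:
#   qm list | stopped_vms || echo "some VMs are not running"
```
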
## Checklist

### Pre-VM Creation
- [ ] Sufficient disk space on Proxmox
- [ ] Network configured (vmbr0)
- [ ] NAS accessible from new VM IP range
- [ ] Static IP available (192.168.5.211)

### VM Creation
- [ ] Create from template
- [ ] Configure resources (4 CPU, 8GB RAM)
- [ ] Set static IP
- [ ] Enable QEMU Guest Agent
- [ ] Configure backups in Proxmox

### Post-Setup
- [ ] System updates
- [ ] Firewall configured
- [ ] Fail2ban enabled
- [ ] NAS mounted
- [ ] SSH keys copied
- [ ] User created
- [ ] Mission Control deployed
- [ ] Health monitoring active

### Validation
- [ ] Mission Control responding on port 3000 locally on the VM and reachable from the LAN at 192.168.5.211 through the reverse proxy (the firewall rules only open ports 80, 443, and 22)
- [ ] Health endpoint responding
- [ ] Backups working
- [ ] Monitoring alerts working
- [ ] Can SSH from dev VM

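
Part of this validation can be scripted from the dev VM or another LAN host. The sketch below uses bash's built-in `/dev/tcp` redirection to test whether a TCP port accepts connections; `port_open` is a hypothetical helper name and the two-second timeout is an arbitrary choice.

```bash
#!/bin/bash
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within two seconds, fail otherwise.
# $1 = host, $2 = port
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example checks against the production VM:
#   port_open 192.168.5.211 443 && echo "HTTPS reachable"
#   port_open 192.168.5.211 22  && echo "SSH reachable"
```

A successful connection only shows that something is listening and the firewall lets the traffic through; it does not confirm that the application behind the port is healthy.
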
## Cost Analysis

Current: 1 VM running everything
- Pros: Simple, less overhead
- Cons: Dev affects prod, no isolation

Proposed: 2 VMs
- Dev VM: 2 CPU, 4GB RAM (reduced, since it only runs dev workloads)
- Prod VM: 4 CPU, 8GB RAM (dedicated)
- Pros: Isolation, security, reliability
- Cons: Slightly more resource usage

## Recommendation

**Start with the 2-VM approach.** Separating dev from prod is standard practice because it keeps development work from affecting the production service. You can always scale up later with HA if needed.

Ready to create the production VM?