Add production infrastructure documentation

- SSH_ACCESS_GUIDE.md: How to provide SSH access for production setup
- PROXMOX_PRODUCTION_SETUP.md: Complete VM architecture and setup guide

Includes security best practices, access levels, and deployment workflows.

parent 805bed5aa0
commit e0bd3df158

@@ -0,0 +1,345 @@
# Proxmox Production VM Setup - Best Practices

## Overview
Create an isolated production environment with security hardening, resource guarantees, and automated backups.

## Architecture

```
Proxmox Host
├── VM-100: DevMatrix-Dev (Current - Development/Staging)
│   ├── Mission Control (dev)
│   ├── Gitea (dev repos)
│   └── Testing/Experiments
│
└── VM-101: DevMatrix-Prod (NEW - Production Only)
    ├── Mission Control (production)
    ├── Reverse Proxy (Traefik)
    ├── Monitoring (Prometheus/Grafana)
    └── Public-facing services
```

## VM Specifications

### Production VM (VM-101)
- **Name:** DevMatrix-Prod
- **VM ID:** 101
- **OS:** Ubuntu 22.04 LTS (minimal server)
- **CPU:** 4 cores (dedicated, not shared)
- **RAM:** 8GB (reserved, not ballooning)
- **Disk:** 100GB SSD (thin provisioned)
- **Network:** vmbr0 (same LAN, separate IP)
- **IP:** 192.168.5.211 (static)

### Resource Allocation Strategy
- **CPU Units:** 2048 (higher priority than dev)
- **CPU Limit:** 4 (hard limit)
- **Memory:** 8GB (no ballooning)
- **Swap:** Disabled (prevents performance issues)
- **Disk I/O:** SSD optimized
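
One way to confirm that the allocation above actually landed on the VM is to compare the `key: value` lines printed by `qm config` with the expected numbers. The sketch below is illustrative only: the function name `check_vm_config` is hypothetical, and it checks just the four settings that map directly to the values listed above.

```bash
#!/usr/bin/env bash
# Sketch: compare "key: value" lines (the format printed by `qm config`)
# against the values this guide expects for VM 101.
check_vm_config() {
  local config expected pair key want got status=0
  config="$(cat)"   # read the config text from stdin
  expected="memory=8192 balloon=0 cores=4 cpuunits=2048"
  for pair in $expected; do
    key="${pair%%=*}"
    want="${pair#*=}"
    # pick the value of the line whose key matches exactly
    got="$(printf '%s\n' "$config" | awk -F': ' -v k="$key" '$1 == k {print $2}')"
    if [ "$got" = "$want" ]; then
      echo "OK   $key=$got"
    else
      echo "FAIL $key: expected $want, got ${got:-<unset>}"
      status=1
    fi
  done
  return "$status"
}

# Usage on the Proxmox host:
#   qm config 101 | check_vm_config
```

The function returns non-zero if any value differs, so it can gate later steps in a provisioning script.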

## Security Hardening

### 1. Network Isolation
- Separate VLAN for production (optional)
- Firewall: only ports 80, 443, and 22 (SSH restricted by source IP)
- No direct internet access (via proxy)

### 2. Access Control
- SSH key only (no passwords)
- Fail2ban enabled
- Root login disabled
- Sudo with password for admin tasks
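
The setup script in this guide installs fail2ban but does not itself change the SSH daemon settings, so the "SSH key only" and "Root login disabled" items need a separate step. A minimal sketch, assuming the stock Ubuntu 22.04 `sshd_config` that includes `/etc/ssh/sshd_config.d/*.conf`; the function name and the drop-in file name are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: write an sshd drop-in enforcing key-only logins and no root login.
# The file name and the target-directory parameter are illustrative choices.
write_sshd_hardening() {
  local dir="${1:-/etc/ssh/sshd_config.d}"
  # sshd keeps the first value it reads for each keyword, so a low-numbered
  # file is used here to sort ahead of other drop-ins in the same directory.
  cat > "$dir/00-devmatrix-hardening.conf" << 'EOF'
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
EOF
}

# Usage on the VM (as root), validating the config before reloading:
#   write_sshd_hardening && sshd -t && systemctl reload ssh
```

Keep an existing SSH session open while testing a new login, so a mistake here does not lock you out.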

### 3. VM-level Security
- QEMU Guest Agent enabled
- Secure Boot (optional)
- TPM 2.0 (optional, for secrets)

## Proxmox Configuration Steps

### Step 1: Create VM Template (Optional but Recommended)
```bash
# On Proxmox host
# Download Ubuntu 22.04 cloud image
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img

# Create template VM
qm create 9000 --name ubuntu-22.04-template --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk 9000 jammy-server-cloudimg-amd64.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
qm set 9000 --ide2 local-lvm:cloudinit
qm set 9000 --boot order=scsi0
qm set 9000 --serial0 socket --vga serial0
qm set 9000 --agent enabled=1

# Convert to template
qm template 9000
```
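
Before importing the downloaded image it may be worth checking it against the checksum list published next to it. The sketch below assumes a `SHA256SUMS` file has been fetched from the same directory as the image (an assumption about the mirror layout, not something this guide states) and that GNU `sha256sum` is available; `verify_image` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: verify downloaded files against a SHA256SUMS list.
verify_image() {
  local sums_file="${1:-SHA256SUMS}"
  # --ignore-missing skips list entries for files that are not present locally;
  # sha256sum still fails if no listed file could be verified at all.
  sha256sum --check --ignore-missing "$sums_file"
}

# Usage on the Proxmox host, in the directory holding the image:
#   wget https://cloud-images.ubuntu.com/jammy/current/SHA256SUMS
#   verify_image SHA256SUMS && echo "image verified"
```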

### Step 2: Create Production VM from Template
```bash
# Clone template
qm clone 9000 101 --name DevMatrix-Prod --full

# Configure resources
qm set 101 --memory 8192 --balloon 0  # 8GB, no ballooning
qm set 101 --cores 4 --cpulimit 4 --cpuunits 2048  # 4 cores, hard limit, high priority
qm set 101 --scsihw virtio-scsi-single

# Resize disk (sets the total size to 100G)
qm disk resize 101 scsi0 100G

# Configure network (static IP)
qm set 101 --ipconfig0 ip=192.168.5.211/24,gw=192.168.5.1

# Start VM
qm start 101
```
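
Ubuntu cloud images ship without a usable password login, so the clone usually needs a cloud-init user and SSH key before its first boot, and whatever runs next has to wait until the VM is reachable. The sketch below is one possible way to do both; `wait_until` is a hypothetical helper, and the key path and the `devmatrix` user name in the usage lines are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempt budget runs out.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0          # the command succeeded
    fi
    sleep "$delay"      # pause before the next attempt
  done
  return 1              # never succeeded within the budget
}

# Usage on the Proxmox host (run the `qm set` line before `qm start 101`):
#   qm set 101 --ciuser devmatrix --sshkeys ~/.ssh/id_ed25519.pub
#   qm start 101
#   wait_until 30 5 ssh -o BatchMode=yes -o ConnectTimeout=3 \
#       -o StrictHostKeyChecking=accept-new devmatrix@192.168.5.211 true \
#     && echo "VM is reachable over SSH"
```

Waiting on SSH rather than on the guest agent is deliberate: the agent package is only installed later by the setup script.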

### Step 3: VM-Level Backups (Proxmox)
```bash
# Create backup job in Proxmox
# Datacenter → Backup → Add
# Schedule: Daily 01:00
# Mode: Snapshot (for running VMs)
# Compression: ZSTD
# Storage: NAS/Backup storage
# Retention: Keep 7 daily, 4 weekly, 12 monthly
```
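
The same settings can also be exercised from the shell with `vzdump`, which is handy for a one-off backup before risky changes. This is a sketch rather than a replacement for the scheduled job: `backup_vm` and the `DRY_RUN` switch are hypothetical, and the storage name `nas-backup` in the usage line is a placeholder for whatever backup storage is defined on the host.

```bash
#!/usr/bin/env bash
# Sketch: build (and optionally run) a vzdump command matching the job above.
backup_vm() {
  local vmid="$1" storage="$2"
  local cmd=(vzdump "$vmid"
    --mode snapshot
    --compress zstd
    --storage "$storage"
    --prune-backups keep-daily=7,keep-weekly=4,keep-monthly=12)
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"   # only show what would be executed
  else
    "${cmd[@]}"
  fi
}

# Usage on the Proxmox host:
#   DRY_RUN=1 backup_vm 101 nas-backup   # print the command
#   backup_vm 101 nas-backup             # run it
```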

### Step 4: Firewall Rules (Proxmox Host Level)
```bash
# /etc/pve/firewall/101.fw
# Production VM firewall rules
# Takes effect only if the firewall is also enabled at Datacenter level
# and on the VM's network device (firewall=1)

[OPTIONS]
enable: 1

[RULES]
# Allow HTTP/HTTPS
IN ACCEPT -p tcp -dport 80
IN ACCEPT -p tcp -dport 443

# Allow SSH from management network only
IN ACCEPT -p tcp -dport 22 -source 192.168.5.0/24

# Block everything else
IN DROP
```
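
A quick way to check the rules from the outside is to probe the ports from another machine and compare the result with what the rule set is supposed to allow. The helper below is a sketch that relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command; `probe_port` is a hypothetical name, and it cannot tell a closed port from a filtered one.

```bash
#!/usr/bin/env bash
# Sketch: report whether a TCP connection to HOST:PORT can be opened.
probe_port() {
  local host="$1" port="$2"
  # bash opens a TCP connection when a redirection targets /dev/tcp/HOST/PORT;
  # timeout bounds the wait when packets are silently dropped.
  if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed-or-filtered"
  fi
}

# Usage from a machine on the LAN, then again from outside 192.168.5.0/24:
#   for p in 22 80 443 3000; do
#     echo "$p $(probe_port 192.168.5.211 "$p")"
#   done
```

With the rules above, 80 and 443 should report `open` once a service is listening, and 22 should only be reachable from the management network.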

## Post-VM Setup Script

Create this script to run on the new production VM after creation:

```bash
#!/bin/bash
# Production VM Setup Script
# Run as root on new VM

set -euo pipefail
export DEBIAN_FRONTEND=noninteractive  # keep apt from prompting

echo "🚀 Setting up DevMatrix Production VM"

# 1. System updates
echo "Updating system..."
apt-get update && apt-get upgrade -y

# 2. Install essentials
echo "Installing packages..."
apt-get install -y \
    curl wget git htop ncdu \
    fail2ban ufw unattended-upgrades \
    qemu-guest-agent \
    nfs-common cifs-utils

# 3. Configure automatic updates
echo "Configuring auto-updates..."
cat > /etc/apt/apt.conf.d/50unattended-upgrades << 'EOF'
Unattended-Upgrade::Allowed-Origins {
    "${distro_id}:${distro_codename}-security";
};
Unattended-Upgrade::AutoFixInterruptedDpkg "true";
Unattended-Upgrade::MinimalSteps "true";
Unattended-Upgrade::InstallOnShutdown "false";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
Unattended-Upgrade::Remove-New-Unused-Dependencies "true";
Unattended-Upgrade::Automatic-Reboot "true";
Unattended-Upgrade::Automatic-Reboot-Time "03:00";
EOF

# 4. Configure firewall
echo "Configuring firewall..."
ufw default deny incoming
ufw default allow outgoing
ufw allow from 192.168.5.0/24 to any port 22 comment 'SSH from LAN'
ufw allow 80 comment 'HTTP'
ufw allow 443 comment 'HTTPS'
ufw --force enable

# 5. Configure fail2ban
echo "Setting up fail2ban..."
# Overwrite rather than append so re-running the script does not duplicate sections
cat > /etc/fail2ban/jail.local << 'EOF'
[DEFAULT]
bantime = 3600
findtime = 600
maxretry = 3

[sshd]
enabled = true
port = 22
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
EOF

systemctl enable fail2ban
systemctl restart fail2ban  # restart so the new jail.local is loaded

# 6. Mount NAS storage
echo "Setting up NAS mounts..."
mkdir -p /mnt/nas/backups /mnt/nas/shared

cat >> /etc/fstab << 'EOF'
# NAS Mounts
192.168.5.195:/mnt/NAS2/devmatrix/backups /mnt/nas/backups nfs defaults,_netdev,noatime 0 0
192.168.5.195:/mnt/NAS2/devmatrix/shared /mnt/nas/shared nfs defaults,_netdev,noatime 0 0
EOF

mount -a

# 7. Create devmatrix user
echo "Creating devmatrix user..."
id devmatrix >/dev/null 2>&1 || useradd -m -s /bin/bash -G sudo devmatrix
mkdir -p /home/devmatrix/.ssh
chmod 700 /home/devmatrix/.ssh
# Let the user manage its own keys (ssh-copy-id writes here as devmatrix)
chown -R devmatrix:devmatrix /home/devmatrix/.ssh

# Copy SSH keys from dev VM (manual step or use ssh-copy-id)
echo "⚠️  Remember to copy SSH keys from dev VM"

# 8. Set hostname
echo "Setting hostname..."
hostnamectl set-hostname devmatrix-prod

echo "✅ Production VM setup complete!"
echo ""
echo "Next steps:"
echo "1. Copy SSH keys: ssh-copy-id devmatrix@192.168.5.211"
echo "2. Clone Mission Control repo"
echo "3. Run production deployment"
echo "4. Configure monitoring"
```
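
Because the script appends to `/etc/fstab`, running it a second time would add the NAS entries twice. A small guard like the one below avoids that; it is a sketch, and `append_once` is a hypothetical helper rather than part of the script above.

```bash
#!/usr/bin/env bash
# Sketch: append a line to a file only if that exact line is not already there.
append_once() {
  local line="$1" file="$2"
  # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage inside the setup script, replacing the fstab heredoc:
#   append_once "192.168.5.195:/mnt/NAS2/devmatrix/backups /mnt/nas/backups nfs defaults,_netdev,noatime 0 0" /etc/fstab
#   append_once "192.168.5.195:/mnt/NAS2/devmatrix/shared /mnt/nas/shared nfs defaults,_netdev,noatime 0 0" /etc/fstab
```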

## High Availability Setup (Future)

For even higher availability, consider:

### Option 1: Proxmox HA Cluster
- 3-node Proxmox cluster
- Shared storage (Ceph or NFS)
- Automatic failover
- VM migration between nodes

### Option 2: Load Balancer + Multiple VMs
```
Internet
    ↓
HAProxy (VM-102) - Load balancer
    ↓
    ├─ Mission Control (VM-101)
    └─ Mission Control (VM-103) - Replica
```

### Option 3: Container Orchestration (Advanced)
```
Proxmox
└─ Kubernetes Cluster (4+ VMs)
    ├─ Master nodes (2 VMs)
    └─ Worker nodes (2+ VMs)
        └─ Mission Control (container)
```

## Migration Strategy (Dev → Prod)

### Phase 1: Set Up Production VM
1. Create VM in Proxmox
2. Run setup script
3. Install Node.js, PM2, etc.

### Phase 2: Deploy to Production
1. Clone Mission Control repo on prod VM
2. Copy database from dev to prod
3. Run production deployment
4. Test all functionality

### Phase 3: DNS Switchover
1. Point domain to production IP
2. Keep dev running for rollback
3. Monitor for 24 hours

### Phase 4: Decommission Dev (Optional)
1. Wait until prod has proven stable
2. Repurpose dev for staging/testing

## Monitoring Proxmox Itself

Don't forget to monitor the Proxmox host:

```bash
# Install Zabbix Agent or Prometheus Node Exporter
# Monitor:
# - Host CPU/RAM/Disk
# - VM status
# - Network throughput
# - Storage health (SMART)
# - UPS status (if applicable)
```
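
Until a full agent is in place, a few of these checks can be approximated with a small script run from cron. The sketch below covers only disk usage; `disk_usage_pct` and `check_disk` are hypothetical names, and the 85% threshold in the usage line is an arbitrary example.

```bash
#!/usr/bin/env bash
# Sketch: warn when a filesystem crosses a usage threshold.
disk_usage_pct() {
  # df -P prints one POSIX-format record per filesystem; column 5 is "Capacity"
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

check_disk() {
  local path="$1" limit="$2" used
  used="$(disk_usage_pct "$path")"
  if [ "$used" -ge "$limit" ]; then
    echo "WARN $path at ${used}% (limit ${limit}%)"
    return 1
  fi
  echo "OK $path at ${used}%"
}

# Usage on the Proxmox host, e.g. from cron:
#   check_disk / 85
#   qm status 101    # prints the VM state, e.g. "status: running"
```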

## Checklist

### Pre-VM Creation
- [ ] Sufficient disk space on Proxmox
- [ ] Network configured (vmbr0)
- [ ] NAS accessible from new VM IP range
- [ ] Static IP available (192.168.5.211)

### VM Creation
- [ ] Create from template
- [ ] Configure resources (4 CPU, 8GB RAM)
- [ ] Set static IP
- [ ] Enable QEMU Guest Agent
- [ ] Configure backups in Proxmox

### Post-Setup
- [ ] System updates
- [ ] Firewall configured
- [ ] Fail2ban enabled
- [ ] NAS mounted
- [ ] SSH keys copied
- [ ] User created
- [ ] Mission Control deployed
- [ ] Health monitoring active

### Validation
- [ ] Mission Control accessible at 192.168.5.211:3000 (from the VM itself or through the reverse proxy; the firewall rules above do not open port 3000)
- [ ] Health endpoint responding
- [ ] Backups working
- [ ] Monitoring alerts working
- [ ] Can SSH from dev VM

## Cost Analysis

Current: 1 VM running everything
- Pros: Simple, less overhead
- Cons: Dev affects prod, no isolation

Proposed: 2 VMs
- Dev VM: 2 CPU, 4GB RAM (reduced since just dev)
- Prod VM: 4 CPU, 8GB RAM (dedicated)
- Pros: Isolation, security, reliability
- Cons: Slightly more resource usage

## Recommendation

**Start with the two-VM approach** - separating dev from prod is standard practice because it keeps experiments and failures away from live services. You can always scale up later with HA if needed.

Ready to create the production VM?

@@ -0,0 +1,214 @@
# SSH Access for Production Setup

This document outlines how to provide SSH access for DevMatrix AI to help set up and manage the production environment.

## 🔐 Security Model

**Principle:** Minimal access, maximum security

- SSH key-based authentication only (no passwords)
- Dedicated user account with limited permissions
- Access logged and auditable
- Can be revoked instantly
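
The "dedicated user account" item could be realised with a separate `devmatrix-ai` account (the name used by the sudoers entries in this guide) whose key is additionally restricted in `authorized_keys`. The following is a sketch under those assumptions; `restricted_key_line` is a hypothetical helper, and the default source network simply mirrors the LAN range used elsewhere in this guide.

```bash
#!/usr/bin/env bash
# Sketch: prefix a public key with authorized_keys options that limit
# where it may connect from and disable forwarding features.
restricted_key_line() {
  local pubkey="$1" allowed_from="${2:-192.168.5.0/24}"
  printf 'from="%s",no-agent-forwarding,no-port-forwarding,no-X11-forwarding %s\n' \
    "$allowed_from" "$pubkey"
}

# Usage on the production VM, as root:
#   useradd -m -s /bin/bash devmatrix-ai
#   install -d -m 700 -o devmatrix-ai -g devmatrix-ai /home/devmatrix-ai/.ssh
#   restricted_key_line "ssh-ed25519 AAAAC3NzaC... devmatrix-ai@production" \
#     >> /home/devmatrix-ai/.ssh/authorized_keys
#   chown devmatrix-ai:devmatrix-ai /home/devmatrix-ai/.ssh/authorized_keys
#   chmod 600 /home/devmatrix-ai/.ssh/authorized_keys
```

With a separate account, revoking access and killing its sessions affects only that account and leaves the `devmatrix` admin user untouched.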

## 📋 Setup Steps

### 1. Create Production VM

On your Proxmox host, run:

```bash
# Download the VM creation script, review it, then run it
curl -fsSL https://git.lemonlink.eu/devmatrix/devmatrix-scripts/raw/branch/main/proxmox/create-production-vm.sh -o create-production-vm.sh
less create-production-vm.sh
sudo bash create-production-vm.sh
```

This creates VM-101 (DevMatrix-Prod) with:
- IP: 192.168.5.211
- 4 CPU cores, 8GB RAM, 100GB disk
- Ubuntu 22.04 LTS

### 2. Get DevMatrix AI SSH Public Key

Ask me for the SSH public key when you're ready. I'll provide:

```
ssh-ed25519 AAAAC3NzaC... devmatrix-ai@production
```

### 3. Add SSH Key to Production VM

On the new production VM (192.168.5.211):

```bash
# SSH into the new VM
ssh devmatrix@192.168.5.211

# Create ~/.ssh if it does not exist
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# Add my public key
echo "ssh-ed25519 AAAAC3NzaC... devmatrix-ai@production" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# Verify: print the fingerprint of each installed key and compare it
# with the fingerprint I send you
ssh-keygen -lf ~/.ssh/authorized_keys
```

### 4. Grant Sudo Access (Limited)

For production setup, I need sudo access to a fixed list of commands. Treat the list as a convenience rather than a hard security boundary: tools such as `apt-get` and `systemctl` can run arbitrary code as root.

```bash
# On production VM, as root or with sudo
sudo visudo

# Add this line at the end
devmatrix-ai ALL=(ALL) NOPASSWD: /usr/bin/apt-get, /usr/bin/systemctl, /usr/bin/pm2, /home/devmatrix/devmatrix-scripts/infrastructure/*.sh, /home/devmatrix/devmatrix-scripts/proxmox/*.sh
```

Or create a dedicated sudoers file:

```bash
echo "devmatrix-ai ALL=(ALL) NOPASSWD: /usr/bin/apt-get, /usr/bin/apt, /usr/bin/systemctl, /usr/bin/pm2, /usr/sbin/ufw, /bin/mkdir, /bin/chown" | sudo tee /etc/sudoers.d/devmatrix-ai
sudo chmod 440 /etc/sudoers.d/devmatrix-ai
sudo visudo -cf /etc/sudoers.d/devmatrix-ai  # check the syntax before relying on it
```
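
If the command list is likely to change, generating the sudoers line and validating it before installation reduces the chance of a typo breaking `sudo`. The sketch below is one way to do that; `render_sudoers` is a hypothetical helper, and the two commands in the usage line are only an example subset of the list above.

```bash
#!/usr/bin/env bash
# Sketch: render a NOPASSWD sudoers entry from a list of absolute command paths.
render_sudoers() {
  local user="$1"
  shift
  [ "$#" -gt 0 ] || { echo "no commands given" >&2; return 1; }
  local cmd list=""
  for cmd in "$@"; do
    case "$cmd" in
      /*) list="${list:+$list, }$cmd" ;;   # join entries with ", "
      *)  echo "refusing non-absolute path: $cmd" >&2; return 1 ;;
    esac
  done
  printf '%s ALL=(ALL) NOPASSWD: %s\n' "$user" "$list"
}

# Usage on the production VM:
#   render_sudoers devmatrix-ai /usr/bin/systemctl /usr/sbin/ufw > /tmp/devmatrix-ai.sudoers
#   sudo visudo -cf /tmp/devmatrix-ai.sudoers \
#     && sudo install -m 0440 -o root -g root /tmp/devmatrix-ai.sudoers /etc/sudoers.d/devmatrix-ai
```

Rejecting relative paths matters because sudoers command entries are expected to be fully qualified.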

### 5. Test SSH Access

Once you've added my key, I'll verify access and run the setup script:

```bash
ssh devmatrix@192.168.5.211
curl -fsSL https://git.lemonlink.eu/devmatrix/devmatrix-scripts/raw/branch/main/proxmox/setup-production-vm.sh -o setup-production-vm.sh
sudo bash setup-production-vm.sh
```

## 🔒 Security Measures

### IP Restriction (Recommended)

Restrict SSH to your internal network only:

```bash
# On production VM
sudo ufw allow from 192.168.5.0/24 to any port 22
sudo ufw deny 22
sudo ufw reload
```

### Fail2ban

Already configured in setup script:
- 3 failed attempts within 10 minutes = 1 hour ban
- Monitors SSH only (the `sshd` jail); application ports need their own jails

### Audit Logging

Sudo commands and SSH logins are recorded in `/var/log/auth.log`. Shell history is a convenience only, since the account owner can edit it:

```bash
# View sudo logs
sudo grep "devmatrix-ai" /var/log/auth.log

# View command history
sudo cat /home/devmatrix/.bash_history
```
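
For a quick review it can help to reduce the sudo records to just the commands that were run. The filter below is a sketch that assumes the usual sudo log layout, in which each record ends with `COMMAND=...`; `sudo_commands` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch: list the commands a given user ran through sudo, reading log text on stdin.
sudo_commands() {
  local user="$1"
  # keep sudo records for this user, then print whatever follows COMMAND=
  grep -F "sudo:" | grep -F " $user : " | sed -n 's/.*COMMAND=//p'
}

# Usage on the production VM:
#   sudo cat /var/log/auth.log | sudo_commands devmatrix-ai
```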

## 🚀 Deployment Workflow

### Automated Deployment (Approved)

After initial setup, I can deploy updates with your approval:

1. **You request:** "Deploy latest Mission Control to production"
2. **I verify:** Check git status, run tests
3. **I backup:** Database backup before deploy
4. **I deploy:** Zero-downtime deployment
5. **I verify:** Health checks pass
6. **I report:** Deployment status

### Manual Approval Mode

For sensitive operations, you can require manual approval:

```bash
# Create approval flag
touch /home/devmatrix/.deployment-approved

# I'll check for this before deploying
if [ -f /home/devmatrix/.deployment-approved ]; then
    rm /home/devmatrix/.deployment-approved
    mc-deploy
fi
```
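
One possible refinement is to let an approval expire, so a flag created days ago and then forgotten cannot authorise a later deployment. The sketch below uses `find -mmin` for the age check; `consume_approval` and the 30-minute default are illustrative assumptions, not part of the workflow above.

```bash
#!/usr/bin/env bash
# Sketch: accept an approval flag only if it is recent, and use it exactly once.
consume_approval() {
  local flag="$1" max_age_min="${2:-30}"
  # find prints the path only if the file exists and was modified
  # less than max_age_min minutes ago
  if [ -n "$(find "$flag" -maxdepth 0 -mmin "-$max_age_min" 2>/dev/null)" ]; then
    rm -f -- "$flag"   # one approval authorises one deployment
    return 0
  fi
  return 1
}

# Usage:
#   consume_approval /home/devmatrix/.deployment-approved 30 && mc-deploy
```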

## 📊 Access Levels

| Operation | Access Level | Requires Approval |
|-----------|--------------|-------------------|
| View logs | ✅ Automatic | No |
| Check status | ✅ Automatic | No |
| Restart service | ✅ Automatic | No |
| Deploy updates | ⚠️ Conditional | Yes (configurable) |
| System updates | ⚠️ Conditional | Yes |
| Database changes | ❌ Manual only | Yes |
| SSH key changes | ❌ Manual only | Yes |

## 🔄 Revoking Access

To revoke access instantly:

```bash
# Remove SSH key
sed -i '/devmatrix-ai/d' ~/.ssh/authorized_keys

# Remove sudo access
sudo rm /etc/sudoers.d/devmatrix-ai

# Kill any active sessions
sudo pkill -u devmatrix-ai
```
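
A slightly more defensive variant of the key removal keeps a copy of the old file and confirms that no matching key is left afterwards. This is a sketch; `revoke_key` is a hypothetical helper, and it matches on the key comment exactly as the `sed` command above does.

```bash
#!/usr/bin/env bash
# Sketch: remove every authorized_keys line containing MARKER, keeping a backup.
revoke_key() {
  local file="$1" marker="$2" tmp
  tmp="$(mktemp)" || return 1
  cp -p -- "$file" "$file.bak"                     # keep a copy for the audit trail
  grep -vF -- "$marker" "$file" > "$tmp" || true   # grep exits 1 when nothing is left
  cat "$tmp" > "$file"                             # rewrite in place, keeping owner and mode
  rm -f -- "$tmp"
  ! grep -qF -- "$marker" "$file"                  # succeed only if no matching key remains
}

# Usage, as the account that holds the key:
#   revoke_key ~/.ssh/authorized_keys devmatrix-ai@production && echo "key removed"
```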

## 📞 Communication

For production operations:

1. **Telegram notifications** - Real-time alerts
2. **Git commit logs** - Audit trail of all changes
3. **System logs** - /var/log/mission-control/

## ✅ Checklist

Before giving SSH access:

- [ ] Production VM created (VM-101)
- [ ] Basic OS installed
- [ ] Network configured (192.168.5.211)
- [ ] You have admin/root access
- [ ] SSH key generated for me
- [ ] Firewall rules configured
- [ ] Backup NAS accessible
- [ ] You understand how to revoke access

After giving SSH access:

- [ ] I confirm SSH connection works
- [ ] Run production setup script
- [ ] Deploy Mission Control
- [ ] Verify health checks pass
- [ ] Test backup/restore
- [ ] Document any custom configs

## 🆘 Emergency Contacts

If something goes wrong:

1. Revoke SSH access immediately (see above)
2. Restart services: `mc-restart`
3. Check logs: `mc-logs`
4. Restore from backup if needed
5. Contact me with details

---

**Ready to proceed?** Create the VM and give me the SSH key when you're ready!