May 10, 2026

Personal Homelab: Production-Grade Self-Hosted Infrastructure

Designed and operate a full-rack homelab serving ~80 concurrent users with high availability, half a petabyte of storage, 10G networking, and zero-trust security, running continuously for 10+ years.

Overview

Designed, built, and operate a production-grade homelab running 24/7 in a 42U full rack, serving approximately 80 concurrent users across self-hosted applications and game servers. The environment prioritizes high availability, data integrity, and operational resilience, with enough redundancy to pull any server for maintenance or upgrades without impacting services.

What Was Done

  • Deployed a 3-node Proxmox HA cluster alongside a dedicated Proxmox Backup Server that runs daily, weekly, and monthly backups of all LXCs and VMs, plus a separate test bench server (5 servers total)
  • Sourced and built a 45Drives Q30 Storinator chassis with a Supermicro motherboard, Intel Xeon CPU, and 126 GB ECC RAM; configured TrueNAS with 30 drives in five 6-wide RAIDZ2 vdevs, yielding ~500 TB raw and ~250 TB of usable fault-tolerant storage after parity overhead
  • Architected three tiered NVMe pools: one for HA VM storage, one scratch pool for ephemeral workloads (transcodes, downloads), and a dedicated app data pool, so LXCs and VMs stay minimal compute containers while metadata and app state are stored separately and backed up
  • Configured a 10G backbone with jumbo frames for all storage traffic, reserving 1G links on each server for management, SSH, and WAN uplink to eliminate bandwidth contention under load
  • Deployed Wazuh as a SIEM and SigNoz for full-stack observability across all services and active users
  • Automated infrastructure provisioning and configuration management with Ansible and Terraform to codify the environment and protect against configuration drift
  • Implemented zero-trust networking with Tailscale and VLAN segmentation, limiting public exposure to only what is necessary and isolating sensitive infrastructure from standard network users
  • Configured a UPS with automated graceful shutdown sequencing so services wind down in order before the battery is depleted if power does not return
  • Hosted game servers accessible through a web portal, allowing users to spin up servers with custom mods and game modes on demand
  • Deployed a distributed intrusion prevention system (CrowdSec) across all public-facing services, with a centralized decision engine enforcing blocks at the reverse proxy layer and lightweight agents on each application host feeding behavioral signals back to a central Local API (LAPI); integrated community threat intelligence to proactively block known malicious IPs before they reach any service
  • Identified and remediated active bot-driven registration attacks against public-facing services, implementing application-level controls and layered network enforcement to stop abuse without impacting legitimate users
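
As a rough check on the storage bullet above, the RAIDZ2 parity arithmetic can be sketched as follows. Only the drive count (30), the vdev width (6), and the ~500 TB raw figure come from the description; the function name is hypothetical. The result is a parity-only upper bound. The gap down to the stated ~250 TB usable is not broken down here; TB/TiB conversion, ZFS metadata and padding, and reserved free space are typical contributors, listed as an assumption rather than a measured breakdown.

```python
def raidz_parity_estimate(drives: int, vdev_width: int, parity: int, raw_tb: float) -> dict:
    """Estimate capacity left after parity for a pool of equal-width RAIDZ vdevs.

    This is an upper bound: it ignores ZFS metadata, allocation padding,
    reserved free space, and TB-vs-TiB reporting differences.
    """
    if drives % vdev_width != 0:
        raise ValueError("drive count must be a multiple of the vdev width")
    if not 0 < parity < vdev_width:
        raise ValueError("parity must be between 1 and vdev_width - 1")

    vdevs = drives // vdev_width
    data_fraction = (vdev_width - parity) / vdev_width
    return {
        "vdevs": vdevs,
        "data_drives": vdevs * (vdev_width - parity),
        "parity_drives": vdevs * parity,
        "parity_upper_bound_tb": raw_tb * data_fraction,
    }


# Figures from the description: 30 drives, 6-wide RAIDZ2 (2 parity), ~500 TB raw.
estimate = raidz_parity_estimate(drives=30, vdev_width=6, parity=2, raw_tb=500.0)
print(estimate)
```

Each 6-wide RAIDZ2 vdev tolerates two drive failures, so this layout spends 10 of the 30 drives on parity.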

Outcome

The homelab has run continuously for over 10 years, supporting ~80 concurrent users with near-zero unplanned downtime. High availability across the Proxmox cluster means any node can be removed for maintenance or upgrades without service interruption, with workloads migrating to available nodes automatically and rebalancing once the node is returned.
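
The claim that any node can be pulled without interruption depends on the remaining nodes having enough headroom to absorb its workloads. A minimal sketch of that check is below, assuming memory is the limiting resource. The node names and memory figures are hypothetical and not taken from the actual cluster, and the check compares totals only, so it does not model how individual guests are placed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    ram_gb: int   # total memory on the node (hypothetical value)
    used_gb: int  # memory committed to guests on the node (hypothetical value)


def tolerates_single_node_loss(nodes: list[Node]) -> bool:
    """Return True if every guest still fits after removing any one node.

    Simplification: compares total committed memory against total remaining
    capacity, ignoring per-guest placement and hypervisor overhead.
    """
    total_used = sum(n.used_gb for n in nodes)
    for removed in nodes:
        remaining_capacity = sum(n.ram_gb for n in nodes if n is not removed)
        if total_used > remaining_capacity:
            return False
    return True


# Hypothetical 3-node cluster with headroom for one node to be offline.
cluster = [Node("pve1", 128, 60), Node("pve2", 128, 70), Node("pve3", 128, 50)]
# The same cluster loaded too heavily to lose a node.
overloaded = [Node("pve1", 128, 100), Node("pve2", 128, 100), Node("pve3", 128, 100)]

print(tolerates_single_node_loss(cluster))
print(tolerates_single_node_loss(overloaded))
```

Running a check like this before adding workloads is one way to keep the maintenance guarantee from eroding as the cluster fills up.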

Stack

Proxmox VE · Proxmox Backup Server · TrueNAS SCALE · ZFS · 45Drives Q30 · Docker · Kubernetes · LXC · Ansible · Terraform · Wazuh · SigNoz · Tailscale · CrowdSec · 10GbE