Odinyg/homelab

Homelab

K3s cluster managed using Kustomize for GitOps deployment of self-hosted applications

K3s Kustomize SOPS FluxCD

Welcome to my homelab! This repository contains the complete Infrastructure-as-Code (IaC) for my Kubernetes homelab running on K3s. The cluster hosts various self-hosted applications for media streaming, productivity, monitoring, and AI workloads, all managed through GitOps principles using Kustomize overlays.

🚀 Features

  • K3s Kubernetes cluster with NVIDIA GPU support for AI inference
  • GitOps deployment using Kustomize and FluxCD for automated continuous delivery
  • Automated dependency management with Renovate creating pull requests for updates
  • Secret management with SOPS and age encryption
  • External access via Cloudflare Tunnels and Tailscale
  • Monitoring with Prometheus and Grafana
  • Multi-storage support including local, NFS, and Longhorn distributed storage
  • SSL/TLS termination with Traefik ingress controller

🍇 Cluster

Infrastructure Automation

The homelab uses a GitOps approach with FluxCD and Kustomize for automated deployment and configuration management. FluxCD continuously monitors the Git repository and automatically applies changes to the cluster, ensuring the desired state is always maintained.

GitOps

  • FluxCD - Automated GitOps continuous delivery and reconciliation that watches the repository for changes and automatically deploys updates
  • Renovate - Automated dependency updates via pull requests for container images, Helm charts, etc.
  • Kustomize overlays - Environment-specific configurations with base/staging structure
  • SOPS encryption - Secure secret management with age keys integrated into GitOps workflows
  • Automated reconciliation - Ensures cluster state matches Git repository at all times
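The reconciliation loop described above can also be triggered by hand, for example right after merging a Renovate pull request. The following is a minimal sketch, assuming the `flux` CLI is installed and that a Flux Kustomization named `apps` exists; that name is hypothetical, so substitute whatever `flux get kustomizations` reports for this cluster.

```shell
# Hypothetical Kustomization name; replace it with one listed by `flux get kustomizations`.
KUSTOMIZATION="${KUSTOMIZATION:-apps}"

# Show the current sync state, then ask Flux to fetch the Git source and reconcile now.
reconcile_now() {
  if ! command -v flux >/dev/null 2>&1; then
    echo "flux CLI not found" >&2
    return 1
  fi
  flux get kustomizations
  flux reconcile kustomization "$1" --with-source
}
```

Calling `reconcile_now "$KUSTOMIZATION"` skips the wait for the next polling interval; without it, Flux applies the same change on its own schedule.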

Directories

This Git repository contains the following top level directories:

📁 apps/                    # Applications deployed into the cluster
├─📁 base/                  # Base application configurations
└─📁 staging/               # Environment-specific overlays
📁 infrastructure/          # Infrastructure components and controllers
├─📁 controllers/           # Cluster infrastructure (monitoring, storage, etc.)
└─📁 configs/               # Configuration overlays and secrets
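
To illustrate how a base and a staging overlay fit together, here is a small sketch that builds the same layout in a scratch directory. The application name `demo-app`, its image, and the replica count are hypothetical and do not come from this repository; the script applies nothing to a cluster.

```shell
set -eu

# Scratch directory mirroring the apps/base and apps/staging layout.
workdir="$(mktemp -d)"
mkdir -p "$workdir/apps/base/demo-app" "$workdir/apps/staging/demo-app"

# Base: environment-agnostic manifests.
cat > "$workdir/apps/base/demo-app/deployment.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: nginx:1.27
EOF

cat > "$workdir/apps/base/demo-app/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
EOF

# Overlay: points at the base and sets staging-specific values.
cat > "$workdir/apps/staging/demo-app/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: demo-app
resources:
  - ../../base/demo-app
replicas:
  - name: demo-app
    count: 2
EOF

# Render the overlay locally if kubectl is installed; nothing is applied.
if command -v kubectl >/dev/null 2>&1; then
  kubectl kustomize "$workdir/apps/staging/demo-app"
else
  echo "kubectl not found; overlay written to $workdir/apps/staging/demo-app"
fi
```

Keeping shared manifests in the base and only the differences in the overlay is what would let a future production overlay reuse the same base.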

🖥️ Tech Stack

Infrastructure

| Name | Description |
| --- | --- |
| K3s | Lightweight Kubernetes distribution |
| Kustomize | Kubernetes native configuration management |
| SOPS | Secrets management with age encryption |
| Traefik | Modern HTTP reverse proxy and load balancer |
| Longhorn | Cloud native distributed block storage |
| Prometheus | Systems monitoring and alerting toolkit |
| Grafana | Operational dashboards and visualization |
| cert-manager | Cloud native certificate management |
| MetalLB | Bare metal load balancer for HA services |
| Helm | The package manager for Kubernetes |
| Proxmox VE | 3-node HA cluster with Ceph storage |
| Ceph | Distributed storage across Proxmox cluster |
| TrueNAS Scale | N100 NAS with ZFS storage and application hosting |
| UniFi Network | Enterprise networking with UDM Ultra, 16-port switch, and APs |

Applications (by namespace)

Media & Entertainment

| Icon | Application | Category | Description | Status |
| --- | --- | --- | --- | --- |
| 🎬 | Jellyfin | Media Server | Self-hosted media streaming with GPU transcoding | ✅ Deployed |
| 📚 | Audiobookshelf | Audio Books | Self-hosted audiobook and podcast server | ✅ Deployed |

Productivity & Tools

| Icon | Application | Category | Description | Status |
| --- | --- | --- | --- | --- |
| 🔖 | Linkding | Bookmark Manager | Minimal bookmark management | ✅ Deployed |
| 🤖 | Ollama + Open WebUI | AI/LLM | Local large language model deployment | ✅ Deployed |

Gaming & Streaming

| Icon | Application | Category | Description | Status |
| --- | --- | --- | --- | --- |
| 🎮 | Steam Headless | Cloud Gaming | Steam with Sunshine streaming server | 📦 Archived |

Infrastructure & Monitoring

| Icon | Application | Category | Description | Status |
| --- | --- | --- | --- | --- |
| 📊 | Grafana | Dashboard | Operational dashboards and monitoring | ✅ Deployed |
| 🔐 | Vault | Secrets Management | HashiCorp Vault for secret management | 🚧 Testing |
| 💾 | Longhorn | Storage Management | Distributed storage management UI | ✅ Deployed |
| 🔄 | Renovate | Automation | Automated dependency updates | ✅ Deployed |

🔧 Hardware Requirements

Current Hardware

  • Proxmox Cluster: 3-node HA cluster with Ceph distributed storage
    • Main Station PC: Primary node with NVIDIA RTX 3090 for GPU workloads
    • XPS 15: Laptop node with 5Gb WizDPI networking
    • Razer 15: Laptop node with 5Gb WizDPI networking
  • TrueNAS Scale: N100-based NAS with 4x5Gb networking
    • Services: Jellyfin (main instance), PostgreSQL, Redis
    • Planned: S3 object storage for backups and application data
  • Docker Host: N100 mini PC running various containerized services (always-on)
  • Primary K3s Node: mainkube - VM on main PC with GPU passthrough for testing/development and AI
  • Network Infrastructure:
    • UniFi Ultra: Core router/firewall/controller
    • UniFi Enterprise 16-Port PoE: Managed switching with PoE+
    • UniFi Access Points: WiFi coverage
    • 5Gb Backbone: WizDPI networking for high-speed inter-cluster communication
  • Storage: Ceph cluster + TrueNAS ZFS + SMB/NFS shares

Planned Hardware (K3s HA Expansion)

  • Worker Nodes: Raspberry Pi cluster running Talos OS
  • Control Plane Nodes: K3s VMs across all Proxmox cluster nodes for HA
    • Main PC VM: Primary control plane with GPU passthrough
    • XPS 15 VM: Secondary control plane node
    • Razer 15 VM: Third control plane node
  • Load Balancing: MetalLB for service distribution across K3s nodes
  • High Availability: Multi-master K3s cluster architecture
  • Storage Integration: Longhorn + TrueNAS S3 backend

🎮 NVIDIA GPU Support

Prerequisites

  1. NVIDIA drivers installed on host
  2. NVIDIA Container Toolkit configured
  3. A compatible GPU; driver versions are available at https://download.nvidia.com/XFree86/Linux-x86_64/

Configuration

# Register the NVIDIA runtime in the containerd config used by K3s
sudo nvidia-ctk runtime configure \
  --runtime=containerd \
  --config=/var/lib/rancher/k3s/agent/etc/containerd/config.toml

# Restart K3s so containerd picks up the change
sudo systemctl restart k3s

GPU-Enabled Applications

  • Jellyfin: Hardware transcoding for 4K media
  • Ollama: Accelerated LLM inference
  • Steam Headless: GPU-accelerated game streaming

🔒 Security & Access

Secret Management

All sensitive data is encrypted using SOPS with age encryption:

  • SMB/CIFS credentials for media storage
  • Cloudflare tunnel certificates
  • Application secrets and API keys
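
As a rough sketch of how a secret could be prepared before it is committed, the helper below encrypts only the `data` and `stringData` fields of a Kubernetes Secret manifest, so the rest of the file stays readable in Git. It assumes the `sops` CLI is installed and that an age key pair already exists (for example one created with `age-keygen`); the function name and its arguments are hypothetical, not part of this repository.

```shell
# Encrypt the data/stringData fields of a Secret manifest in place.
# $1: path to the manifest, $2: age public key (starts with "age1").
encrypt_secret() {
  manifest="$1"
  recipient="$2"
  case "$recipient" in
    age1*) ;;
    *) echo "expected an age public key (age1...)" >&2; return 2 ;;
  esac
  sops --encrypt --age "$recipient" \
    --encrypted-regex '^(data|stringData)$' \
    --in-place "$manifest"
}
```

The matching private key stays out of the repository; only the holder of that key, such as the cluster's GitOps controller, can decrypt the committed file.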

External Access

  • Cloudflare Tunnels provide secure external access to select services
  • Tailscale provides secure VPN access to the entire homelab network for personal use
  • Traefik handles internal routing and SSL termination
  • Network isolation via Kubernetes namespaces

💾 Storage Strategy

Storage Classes

  • local-path: Fast local storage for testing purposes and stateful apps
  • longhorn: Replicated distributed storage for critical data
  • nfs: Network storage for large media files
  • smb: Windows SMB/CIFS shares for existing media libraries
  • s3: Object storage (not implemented yet) - planned for backups, archive storage, and application data
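
An application picks one of these classes through the `storageClassName` of its volume claim. The sketch below writes an example claim against the `longhorn` class to a temporary file; the claim name `demo-data`, the `default` namespace, and the 5Gi size are hypothetical values chosen for illustration.

```shell
# Write an example PersistentVolumeClaim to a scratch file; nothing is applied.
pvc_manifest="$(mktemp)"
cat > "$pvc_manifest" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 5Gi
EOF

echo "Claim written to $pvc_manifest"
echo "List available classes: kubectl get storageclass"
echo "Apply the claim:        kubectl apply -f $pvc_manifest"
```

Swapping `longhorn` for `local-path` or `nfs` would change where the data lives without touching the rest of the manifest, provided a class with that name exists in the cluster.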

Backup Strategy

  • Proxmox VM snapshots for complete node backup and disaster recovery
  • Longhorn snapshots for critical application data with automated scheduling
  • Git repository contains all configuration as code with encrypted secrets
  • Automated backup pipeline to S3 storage (planned)

🐛 Troubleshooting

GPU Issues

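# Verify the GPU is advertised as a resource on the node
kubectl describe nodes | grep nvidia.com/gpu

# Check that the NVIDIA runtime appears in the K3s containerd config
sudo grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml

# If the runtime is missing, re-apply the NVIDIA runtime configuration and restart K3s
sudo nvidia-ctk runtime configure --runtime=containerd --config=/var/lib/rancher/k3s/agent/etc/containerd/config.toml
sudo systemctl restart k3s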
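
If the node reports a GPU but workloads still fail, an end-to-end check is to schedule a throwaway pod that requests one. The sketch below only writes the manifest and prints the follow-up commands. It assumes the NVIDIA device plugin is running in the cluster and that the containerd runtime handler is named `nvidia`; the pod name `gpu-smoke-test` is hypothetical, and the sample image is NVIDIA's CUDA n-body benchmark.

```shell
# Write a RuntimeClass plus a one-shot GPU pod to a scratch file; nothing is applied.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  runtimeClassName: nvidia
  containers:
    - name: cuda
      image: nvcr.io/nvidia/k8s/cuda-sample:nbody
      args: ["nbody", "-gpu", "-benchmark"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

echo "Manifest written to $manifest"
echo "Apply:       kubectl apply -f $manifest"
echo "Read output: kubectl logs gpu-smoke-test"
echo "Clean up:    kubectl delete pod gpu-smoke-test"
```

A pod stuck in Pending usually points at the device plugin or the resource request, while a container that starts and then cannot see the GPU points at the runtime configuration.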

🔮 Roadmap

  • Add Raspberry Pi workers - Deploy Talos OS on RPi cluster for HA
  • Add laptop control plane nodes - Configure XPS 15 and Razer 15 as K3s masters/workers for HA control plane
  • MetalLB implementation - Load balancer for service distribution across K3s nodes
  • Deploy Vault - Centralized secret management across cluster
  • Docker container migration - Move services from N100 mini PC to K3s cluster
  • S3 storage backend - Implement object storage on TrueNAS Scale
  • Tailscale operator - Native Kubernetes integration for VPN access
  • Production environment - Create production overlay configurations
