Homelab Journey -- Quarantine Updates
This is a follow-up to this post from a few years ago:
TLDR: Over the years I went from self-hosting & deploying code on a Raspberry Pi cluster with Docker & Docker Compose, then Docker Swarm, then to a proper x86 server running the Hashistack, then briefly the open-source platform as a service (PAAS) CapRover, before finally arriving at the lightweight Kubernetes distribution k3s. I learned a bunch, had fun, and ended up with a setup that serves my self-hosting desires.
Note: Running Kubernetes, or any clustered container orchestrator for that matter, for a home server is an act of hubris. Much of this is admittedly overkill, though I was driven to learn and discover more about the options out there, so this is where I've arrived, and I'm very grateful that I did. With that said, a simple docker-compose or CapRover setup would most likely be a beautiful solution for most homelabs out there, but I digress.
Setting up a reliable and productive home server
has been an interest of mine for years now. Through quarantine, I found myself with more time on my hands. With that extra time, and in much need of a distraction, it seemed like the perfect opportunity to revisit my homelab project.
If you read through the linked post you'll see that I started a few years ago with a Raspberry Pi cluster running Docker Swarm. I had a lot of fun with this setup, though I found myself dissatisfied with a number of aspects of maintaining and using the server.
First, I found hosting on the ARM architecture a constant source of friction. This has gotten a bit better over the few years since, and I think that ultimately reduced instruction set architectures will be the future of hosting. For a homelab, however, it's just not worth the uphill battle at this time unless you have a very good reason: cost, power usage, or a relatively simple use case are a few that I can think of.
I discovered the r/homelab subreddit
sometime after that setup. I knew I appreciated the small form factor of the Raspberry Pi cluster I'd been using, but I wanted to find something comparably manageable with an x86 CPU. At the time I consistently saw posts about the HP MicroServer series. After some time reading about them, I ended up picking up a used HP MicroServer Gen8 for a great deal at $150. It had a measly dual-core Celeron & 16GB of RAM, but that was a huge step up from the Raspberry Pis.
I had originally wanted to set up my Raspberry Pi cluster with Kubernetes, though I settled on Docker Swarm after I was unable to get Kubernetes compiled for ARM (it was much more challenging at the time to run k8s on ARM). This time I ended up reading an article from a respected peer about their success going from Docker Swarm to the Hashistack. The Hashistack comprises the "workload" orchestrator Nomad, the service discovery tool Consul, & Vault for secret management. These were all mostly new terms to me at the time. Docker Swarm really is excellent for how comparatively simple it is to use... I was aware of secret management, though I had taken for granted that internal DNS & load balancing are handled out of the box for services with Docker Swarm. In the Hashistack, the internal DNS & load balancing for services & their replicas are handled by Consul.
I set up Ubuntu on my new-to-me HP MicroServer Gen8
and began setting up the Hashistack and its dependencies. This ended up being a painful process compared to what I had experienced with Docker Swarm. First, there were multiple pieces of software to install & configure to work together cooperatively. I did kind of appreciate this, as it's very in line with the Unix philosophy of composing solutions from small, dedicated processes. However, I found it frustrating that I had to write & set up systemd service configuration files for each of the processes. HashiCorp's documentation was detailed, though it was challenging to get everything up and running with it as the only substantive resource. It also didn't help that the majority of my searches online turned up resources talking about Kubernetes...
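For reference, here's a minimal sketch of the kind of unit file each process needed; the binary and config paths are assumptions, and the real files had more hardening options:

```ini
# /etc/systemd/system/nomad.service -- a minimal sketch, not my exact file
[Unit]
Description=Nomad agent
# start after networking (and Consul, which Nomad relies on) is up
Wants=network-online.target
After=network-online.target consul.service

[Service]
ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Multiply that by Consul & Vault, each with its own config directory, and the surface area adds up quickly.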
I spent a few months figuring out and running the Hashistack setup. There was a lot I liked about it and I certainly learned a bunch. Consul, Vault & Nomad all had CLIs and web interfaces that were very nice to interact with, right out of the box. The developer experience of taking a given Docker container & writing a Nomad "job" file was smooth enough. I especially liked the "template" stanza and how easy it made it to inject values into a configuration file that I'd then mount inside the containers. I even found a VSCode extension to help with the HCL configuration file syntax. In that time I discovered the r/selfhosted subreddit and deployed over a dozen open-source services to my Hashistack cluster, including a wiki, Home Assistant, a Docker container registry, and many others. After about 6 months of running this setup, I had a lot of things figured out that I really liked.
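To illustrate that "template" stanza, here's a hedged sketch of a Nomad job; the image, the Consul KV key, and the "postgres" service name are hypothetical stand-ins:

```hcl
job "wiki" {
  datacenters = ["dc1"]

  group "app" {
    task "server" {
      driver = "docker"

      config {
        image = "example/wiki:latest" # hypothetical image
        # mount the rendered file from the task directory into the container
        volumes = ["local/app.conf:/etc/wiki/app.conf"]
      }

      # rendered before the task starts; values are pulled from Consul
      template {
        destination = "local/app.conf"
        data        = <<EOT
site_title = "{{ key "homelab/wiki/title" }}"
{{ range service "postgres" -}}
db_host = "{{ .Address }}:{{ .Port }}"
{{ end -}}
EOT
      }
    }
  }
}
```

Being able to pull live service addresses and KV values into an arbitrary config file, then mount it into the container, covered most of the configuration plumbing I needed.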
I routinely browsed the r/selfhosted subreddit
and listened through all of the Self-Hosted podcast episodes. I'd even gotten a WireGuard VPN set up and had figured out how to mount a USB Zigbee stick into the Home Assistant container to control my Tradfri & Hue lights. However, managing and formalizing my Hashistack setup still felt like a constant uphill battle. For one, I never got the Vault secret management fully integrated into Nomad, so I wasn't able to properly inject secrets into my services, which is pretty important.
However, I was creating a snowflake server...
The further along I got with my Hashistack homelab setup, the more precious it began to feel. The awareness that it would be exceedingly difficult to reproduce the server setup nagged at me, and that left me reluctant to improve it further. So I began pursuing an infrastructure as code (IAC) solution to reliably reproduce & version my changes. I reached out to my peer to talk about my experience and how they'd handled it with their Hashistack setup. I learned that they too had resolved to write their own IAC to provision and manage their infrastructure.
I set out to put together an IAC solution to manage my setup and decided to invest my time in learning Ansible. I'd been meaning to learn the technology for some time, and this seemed like the perfect opportunity. Ansible felt like a clear choice due to its claim of not needing any agent installed on the server, along with its excellent community. I dreamed that this would lead to a reliably reproducible setup that would encourage me to keep improving it.
The first roadblock I ran into was discovering that Ansible's claim of not needing an agent installed on the server is only partially true: you do in fact need a specific version of Python installed for full support. As I continued, I also discovered incompatibilities with the Ubuntu 20.04 LTS I was using for my server OS. That ended up being a recurring theme. For another example, the documented approach to setting up local mDNS reflection for Consul's local service DNS didn't seem to work due to changes in 20.04. None of the resources I found seemed to do the trick, or at least I wasn't able to sort it out.
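The Python requirement at least has a simple fix once you know about it. Here's a sketch of an inventory entry; the host name and address are hypothetical:

```yaml
# inventory.yml -- a sketch; host name, address, and path are assumptions
all:
  hosts:
    microserver:
      ansible_host: 192.168.1.10
      # "agentless" still means Python must exist on the target;
      # pinning the interpreter avoids discovery surprises on Ubuntu 20.04
      ansible_python_interpreter: /usr/bin/python3
```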
I ended up discovering some Hashistack Ansible playbooks on Ansible Galaxy, which is like GitHub but specifically for community-maintained Ansible roles & playbooks. I was initially excited, having seen accounts of people successfully provisioning and maintaining their Hashistack setups with them. But as I dove in, I once again ran into compatibility issues with Ubuntu 20.04 LTS. I also learned that the Hashistack Ansible playbooks were no longer maintained...
I came to a period of reflection here
and started to ask myself if this was a hill I wanted to continue to climb. The Hashistack was serving many of my needs, but the friction of making improvements left me wondering if this was a dead end for me. After all, this was a homelab and a side project. Did I really want to be out in the deep, on the bleeding edge? No, I wanted to focus on using the services I wanted to host and on creating a convenient infrastructure to develop on & deploy my side projects to, without a recurring expense.
Frustrated, I started keeping an eye out for other solutions. I took note any time I saw mention of what people were running on their homelabs on the r/homelab or r/selfhosted subreddits. I saw k3s mentioned a few times and thought it seemed appealing. It certainly seemed like a solution to the issues I was running into with the Hashistack, touting itself as a single binary with no other dependencies needed. I dismissed it though, as I figured there was no way it could really be that simple. I'd been driven to the Hashistack partially because I'd heard claims that it was simpler to work with than Kubernetes.
So I set out to see if I could find something simpler.
Something more self-contained than my Hashistack setup had become, though with more features than Docker Swarm alone. And ultimately, a solution with an easier, more reproducible deployment process. So back to the r/selfhosted subreddit I went. I searched for phrases like "open source Heroku" to see if I could find an open-source platform as a service (PAAS). I'd recently had a nice experience using DigitalOcean's App Platform PAAS offering and thought it would be nice to have the same or a similar experience on my own hardware.
I found a number of interesting open-source PAAS solutions, though CapRover stood out as a recommendation on the r/selfhosted subreddit. After trying out CapRover for about a month, there was a lot to like. The web interface is pretty slick, and there are one-click deployments for a number of open-source projects. Arguably most important for a PAAS, you can hook up a Git repo and let CapRover handle building and deploying your project automatically on pushed commits. It's overall a pretty excellent project, and I would recommend it for many simpler homelab setups, right alongside Docker Compose.
CapRover is built on Docker Swarm, so it supports clustered deployments, though Docker Swarm has fallen out of mainstream use. Unfortunately, the use of Docker Swarm makes it a no-go for my homelab, as there's no way to mount external USB devices into containers, which has ended up being a pretty critical part of my setup. I rely on this for my Home Assistant Zigbee USB device & also for a Coral USB TPU for performant, low-power computer vision. I had a lot of fun with this project and really wanted it to work for me, but for those reasons I continued looking.
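To make the limitation concrete: a plain Docker Compose file can pass a device through, but Swarm-mode deployments (which CapRover uses under the hood) skip the option. The device path below is an assumption and varies by adapter:

```yaml
# works with `docker compose up`, but `docker stack deploy` (Swarm mode)
# ignores the `devices` option entirely
services:
  homeassistant:
    image: ghcr.io/home-assistant/home-assistant:stable
    devices:
      # pass the Zigbee USB stick through to the container
      - /dev/ttyUSB0:/dev/ttyUSB0
```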
At this point, it was time to bite the bullet and revisit Kubernetes options. It was the mainstream option after all, and throughout this whole multi-year experience the majority of my search results had pointed to Kubernetes resources. Since my last endeavor with Kubernetes a few years back, the concept of Kubernetes "distributions" had become mainstream. I learned that a Kubernetes distribution is much like a Linux distribution in that it packages all of the pieces of the solution into a single consumable product. Kubernetes is rather complex, and this goes a long way toward reducing the barrier to entry.
So I dove in and revisited the lightweight k3s Kubernetes distribution I'd been reading about on the r/selfhosted subreddit. I intended to test the claims of easy deployment with no external dependency requirements. And to my surprise, they ended up being the whole truth. I was able to deploy a working k3s cluster on my Ubuntu 20.04 LTS install on the HP MicroServer without a hitch. I started porting my Hashistack jobs over to Kubernetes and decided this was how I was going to proceed.
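For anyone curious just how low the barrier is, the documented quick-start really is about this short:

```sh
# install k3s (a single binary plus a systemd service) per the quick-start docs
curl -sfL https://get.k3s.io | sh -

# a minute or so later, the single-node cluster is up
sudo k3s kubectl get nodes
```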
I learned about the k3OS Linux distribution
which is dedicated to the lightweight Kubernetes distribution k3s, and reflashed my machine with it. From there I began getting the setup up to speed with where I had been with the Hashistack. One of the first challenges was learning how DNS is handled with k3s. I wanted to avoid modifying anything from the out-of-the-box k3s distro if possible; k3s comes with a number of services preconfigured via Helm charts on startup, and modifications don't stick without significant effort. The out-of-the-box DNS solution is CoreDNS, which is also what I'd settled on with my Hashistack, as it's the most recommended, modern, & container-native solution at this time. I ended up configuring a wildcard record off of my domain at my DNS provider, Cloudflare, pointing at the LAN IP address of my machine, and from there I was off to the races for local access.
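With the wildcard record in place, exposing a service locally is just an Ingress that k3s's bundled Traefik picks up. A sketch, where the hostname, service name, and port are all hypothetical:

```yaml
# wiki-ingress.yml -- a sketch; domain, service name, and port are assumptions
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wiki
spec:
  rules:
    # *.home.example.com resolves via the Cloudflare wildcard record
    # to the server's LAN IP, where Traefik routes requests by host
    - host: wiki.home.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: wiki
                port:
                  number: 8080
```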
k3s & k3OS clearly lived up to their claims and solved the snowflake server issues I'd had with my last solution. It took a couple of months until I had reached and surpassed where I was with my Hashistack infra in terms of the number of services and the reproducibility of my server. My infrastructure no longer felt precious.
I assembled a fancy home networking wall.
This was another trend that I'd discovered through the r/homelab subreddit. It was composed of a UniFi Dream Machine with a 3D-printed wall mount, a new modem, a UniFi Cloud Key for some WiFi security cameras, & a UPS power strip, all mounted on a metal pegboard. The UPS really comes in handy: usually the internet doesn't go out when the power does, so with the backup power, internet activity can resume during momentary power grid failures. Running a home server has brought me a newfound awareness of how frequently these happen throughout the year.
Around this point, my setup was working well enough that I was beginning to exhaust the resources of the HP MicroServer Gen8. I was running every cool open-source service I could find on the r/selfhosted subreddit and having a lot of fun with it. So I decided to invest in a somewhat more powerful setup to give myself more headroom for the future. After a lot of research I discovered the HPE MicroServer Gen10 Plus could handle a fair amount of upgrades, so I purchased a new base model with the intention of swapping out the RAM & CPU and adding the iLO enablement kit. I maxed out the machine with an Intel Xeon E-2246G & 64GB of ECC RAM, installed the iLO kit, swapped the hard drive over, and after a bit of headache got the new machine up and running. This is the setup that I'm still running at the time of writing.
At this point, I could mostly just enjoy the setup.
Home Assistant smart home automation had become an integral part of the household, and I had tried pretty much every major project talked about on the r/selfhosted subreddit. I had one of the very popular start page setups, like the posts I'd seen on Reddit, full of links to services running on my own machine, and I felt that this project was finally nearing a conclusion after years of working toward that point.
I then found myself with one last significant issue that needed to be addressed: backups... This became the remaining blocker giving me pause about making more use of my server. I waffled around on this one and ended up going with Velero for backups. I'm still trialing and refining this, though I think it will most likely be what I stick with. I may add full system snapshots into the mix as well at some point. I'm also considering trying out a service like Backblaze for my remote S3-compatible backups. The r/homelab subreddit has also pretty well convinced me that I may want a local backup solution too; I could use my old Synology NAS for that. Hard drives sure are expensive at this point in time though...
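As a taste of what that looks like, assuming an object-storage backend is already configured, scheduling recurring backups with Velero's CLI is roughly this (the schedule name and cron expression are my own placeholders):

```sh
# create a recurring backup that runs at 3am every night
velero schedule create nightly --schedule "0 3 * * *"

# restore cluster state from the most recent backup of that schedule
velero restore create --from-schedule nightly
```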
So will I stop here?
Yes, most likely, as this setup satisfies most of my goals and has been both reproducible & maintainable. Now that I'm using Kubernetes I'm able to tap into the wealth of community-maintained resources and tools, so I'll probably be sticking with k3s or a similar Kubernetes distribution for a long time.
However, to be totally honest, I'd still like to find a solid way to combine the flexibility of Kubernetes with the features of a PAAS. With that said, I'll be keeping my eyes on any projects that provide a PAAS on top of Kubernetes, which sounds like the best of both worlds to me.