Managing the Small Family Farm

The devops mantra is that infrastructure is livestock, not pets. Personal infrastructure will never be truly anonymous, but we can use the tooling to make managing it a lot easier.

Posted by Tejus Parikh on July 25, 2023

A large part of why I haven’t blogged in a while is that the blog’s been busted. The latest version of Ubuntu LTS broke the versions of ruby powering everything. Recovery might have been possible, but this seemed like a good time to move on from infrastructure as a pet to infrastructure as livestock. Well, at least, livestock on a small family farm.

The Setup

This website is powered by Jekyll, a static site generator written in Ruby. I have a few key plugins, like Jekyll Picture Tag to handle images and Jekyll Site Map for generating an SEO-friendly sitemap. This site started on Jekyll 0.11.2, with the first commit in March 2013 (more than 10 years before this post!). Jekyll was early at the time, the plugins were early, and there were many hacks put in to get it all working.

Once the blog “compiles,” I use another plugin, S3 Website, to push it to an S3 bucket fronted by a Cloudfront distribution. My days of worrying about Wordpress getting overloaded were over! The blog itself lived on a Digital Ocean droplet running an Ubuntu LTS that had been manually configured to run all of these things.
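As a sketch of how that push is configured, the s3_website gem reads an s3_website.yml file at the project root. The bucket and distribution values below are placeholders, not my real configuration:

```yaml
# s3_website.yml — all values here are illustrative placeholders
s3_id: <%= ENV['S3_ID'] %>
s3_secret: <%= ENV['S3_SECRET'] %>
s3_bucket: example-blog-bucket
cloudfront_distribution_id: EXAMPLE123
```

With that in place, s3_website push uploads the compiled site and invalidates the CloudFront cache.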

Ten years is an eternity in computer time and, no surprise, a lot of bit rot had set in. The last Ubuntu LTS update changed glibc, which broke the ancient Ruby running on the box. Not helping was that I had completely forgotten how everything had been put together. I didn’t even remember what S3 bucket things went to or how the SSL redirects were configured.

It was time to make use of the modern tools and rebuild the core infrastructure.

Where it’s going

The reality is that I don’t have a full infrastructure farm. This is my personal setup, so cost, ease of use, and how much I like using it are huge concerns.

What I decided to do was the following:

  • Consolidate on Amazon Web Services
    • Move the server from a Digital Ocean droplet to an EC2 instance
    • Put both of my domains onto Route53
  • Configure all the infrastructure with Terraform
  • Use Ansible for server automation
  • Upgrade all things Jekyll
  • Everything will still have a name, and sshing to it is a primary use case

Problem: I don’t own a computer

I didn’t bother getting a new personal laptop after my Windows debacle, a minor problem when you need a launch platform for running all of the new IaC tools that were planned. The iPad is a wonderful device but falls very short on running developer tools locally.

The first order of business was to create a very bare-bones launch setup script that could be run on anything. This script is on GitHub so I can easily access it on a brand new machine. An editor is also useful to have, so I put my Vim setup into a similar structure for easy access.

With these two installed on a brand new EC2 instance, it was time to get my keys and set up the rest.

Terraform to Manage the Landscape

With the launch box out of the way, the first real work was getting the infrastructure codified. I decided to move everything into AWS since I know it, I know how to interpret the documentation, and it does everything. There might be cheaper alternatives out there, but the differences are negligible at personal scale.

Another question is “Why Terraform?” Similar to AWS, I had some experience with it, and that experience was generally positive. I like that it is cloud agnostic, I like the code reuse model, and I like the fact that it is a DSL. Some things are just better when they are domain specific. The preview-and-confirm step is really nice and kept me from making a mistake more than a few times when bleary-eyed. The early mornings and late nights of side-project time aren’t the high point of mental acuity.

resource "aws_instance" "dev_box" {
  # the data source name below is an assumption; it queries the latest Ubuntu LTS AMI
  ami               = data.aws_ami.ubuntu.id
  instance_type     = var.ec2_instance_type
  availability_zone = var.aws_az
  key_name          = var.aws_key_name

  tags = {
    Name = "devbox"
    type = "devbox"
  }

  root_block_device {
    volume_size = "20"
  }

  lifecycle {
    ignore_changes = [ami]
  }
}

The above snippet of HashiCorp Configuration Language (HCL) creates the server in EC2.

While this post is not intended as a full HCL tutorial, there are a few concepts I want to highlight.

resource "aws_instance" "dev_box" {

This line is how you define a resource. The first string, "aws_instance", is the type of resource, with "dev_box" being an identifier that can be used later. I use this later in the configuration to attach an EBS volume to this instance.
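As a sketch of what that later reference looks like (the volume size and device name here are illustrative assumptions, not my actual values), the attachment refers back to the instance through its identifier:

```hcl
# Hypothetical example: size and device name are placeholders
resource "aws_ebs_volume" "data" {
  availability_zone = var.aws_az
  size              = 50
}

resource "aws_volume_attachment" "data_attachment" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.data.id
  instance_id = aws_instance.dev_box.id # the "dev_box" identifier from above
}
```

Because the attachment references both resources, Terraform knows to create the volume and the instance before wiring them together.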

data and var are two mechanisms to avoid hard-coding configuration that might change over time. In this example, the ami value is the result of a data query for the most recent Ubuntu LTS AMI. This ensures that a new server is fully up to date. var values can be set either on the command line when applying the configuration or in a variable directive. I have my vars defined in a file. I use the vars to allow for runtime configuration of what might change between different AWS regions.
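A rough sketch of those two mechanisms follows; the filter values and variable defaults are illustrative assumptions, not my exact configuration:

```hcl
# Hypothetical data query for the most recent Ubuntu LTS AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# Variables with defaults that can be overridden on the command line
variable "ec2_instance_type" {
  default = "t3.micro"
}

variable "aws_az" {
  default = "us-east-1a"
}
```

Running terraform apply -var="ec2_instance_type=t3.small" would override the default at apply time.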

A very important concept to keep in mind is that terraform apply updates the entire infrastructure definition, with each resource changing in accordance with its defined lifecycle. For EC2, an AMI change will create a new server. However, AMIs have minor changes all the time, and I don’t want a whole new server every time I update a security group. The lifecycle directive states that this is a change I would like to ignore.

Everything in a directory containing .tf files is part of the infrastructure definition that will be applied on a terraform apply. I use this to keep each resource type in its own file.

A feature I really like about Terraform is how friendly the CLI is. The commands tell you what is going to happen and, if something is non-standard, what you need to do next. The most common commands are terraform plan and terraform apply. Terraform writes its state and lock files on application. You can check these in to share the ability to manipulate the infrastructure across multiple hosts.

Ansible to shepherd the flock

Now that the infrastructure is set up, it’s time to make the created box useful. I decided to use Ansible for this type of configuration management since its core contains all the major functions, like installing packages and running shell scripts.

The two functions of the dev server are to be a login host for building the blog and a VPN server for untrusted networks. Setting this up is straightforward, but a little tedious. Ansible turns the tedium into one command.

Ansible is conceptually similar to Terraform, to the extent that Ansible’s features now overlap with Terraform’s core functions. Configuration is defined in files with a specific syntax, which is then used by an interpreter to manage the infrastructure. The similarity does not carry over to the implementation or key concepts.

Terraform is directory based. The core component of Ansible, in contrast, is a “playbook”: a configuration file written in YAML that describes a collection of tasks. Ansible executes the commands listed in the playbook against an inventory defined in an ini format. There are likely ways to manage the inventory automatically, but I handle this manually. My livestock does not turn over that frequently, and it all has names.
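As a sketch, an inventory for this setup might look like the following; the hostname and user are placeholders, not my real values:

```ini
# inventory.ini — hostname and user are illustrative placeholders
[devboxes]
devbox.example.com ansible_user=ubuntu
```

The [devboxes] group name is what a playbook's hosts directive matches against.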

- name: Setup EBS Storage
  hosts: devboxes
  become: true
  become_user: root

  tasks:
    # module names below were restored; adjust to your installed collections
    - name: Create /mnt/data
      ansible.builtin.file:
        path: /mnt/data
        state: directory
        mode: '0755'

    - name: Create an ext4 filesystem on /dev/nvme1n1
      community.general.filesystem:
        fstype: ext4
        state: present
        dev: /dev/nvme1n1

    - name: Mount data filesystem
      ansible.posix.mount:
        path: /mnt/data
        src: /dev/nvme1n1
        fstype: ext4
        state: mounted

This Ansible snippet configures the attached EBS volume with an ext4 file system and mounts it at /mnt/data. The state directive describes what happens when you run it again: in this case, it will not re-create the filesystem if one already exists, nor mount the volume if it is already mounted.

A flaw in my approach to Ansible is the amount of hard-coding needed to get things to work. In the above snippet, the attached block device is set explicitly instead of discovered. My countermeasure against diving too deep into the discovery rabbit hole is to use import_playbook to break the config into easily understandable chunks. This gives me enough contextual markers to figure out what the right answer was supposed to be should it start failing in the future.

# main.yaml
- import_playbook: playbooks/core-playbook.yaml
- import_playbook: playbooks/user-tejus-playbook.yaml
- import_playbook: playbooks/infra-tools-playbook.yaml
- import_playbook: playbooks/wireguard-playbook.yaml
- import_playbook: playbooks/blog-playbook.yaml

When I need to update the box configuration, I can run either the main playbook or any of the sub-playbooks. Not having to run everything all at once felt like the one advantage of Ansible over Terraform, though I can also see the consistency problems Terraform would have if it allowed the same.

Self Managed Livestock

I didn’t want to pay for a bunch of extra machines or additional cloud services to keep everything running. Both products have significant enterprise offerings used by big companies that cost a lot of money, but their free tiers are perfect for this use case. I can just run the terraforms from my little dev server. Ditto for Ansible, though it took a bit to get into the habit of updating and running the playbook instead of just doing apt-get install whatever.

Putting this process together didn’t add a lot of overhead once I figured my way around the commands, and it paid for itself almost instantly. In my first EC2 config, I went with the default setting of 8GB for the root device, which was far from sufficient. All I had to do was set volume_size = "20" and then:

$ terraform apply # this will get me a new server
# Confirm output, then check Hacker News
$ vim ansible/inventory.ini # set inventory to the new domain name
$ ansible-playbook -i ansible/inventory.ini main.yaml # setup the new server correctly
# Check ESPN
$ ssh <newserver>

It was nice to skip all the steps on how to set it up and just get back to the actual problem I was trying to solve.

Final thoughts

Perfect is the enemy of good, and that especially applies in this case. The goal was not to have a great dev-ops setup, but to have enough dev-ops that I could start writing posts and tinkering on code. Sure, there are gaps, a little hard-coding, and a few rough edges that could be more seamless, but that’s now 5% of the effort instead of 100%.

Another benefit of IaC is the self-documentation. I had spent a lot of time trying to figure out which packages needed to be installed for which dependencies. Now I have them in appropriately named Ansible blocks. The same goes for all the other random configurations.

The biggest thing is that it didn’t actually take more time. Looking through my git history, I could see that my initial migration took the same amount of time as the rebuild. The automation investment has already paid off.

Huge value and very little cost; a perfect combo for any sized operation.

Original image is CC-licensed [original source]

Tejus Parikh

I'm a software engineer that writes occasionally about building software, software culture, and tech adjacent hobbies. If you want to get in touch, send me an email at [my_first_name]