Virtualization with Arch Linux and Qemu/KVM - part 2

To make life a bit easier I’ll now use Ansible to automate a few tasks. I’ve already written about Ansible in my blog post Kubernetes the not so hard way with Ansible. I haven’t automated everything as it’s not worth the effort for my private stuff, so I might do some tasks manually.

To be able to work with the new hosts make sure that DNS entries are in place (or /etc/hosts entries might also work). So if you own example.com the hostnames would be k8s-010100.example.com, k8s-010200.example.com and k8s-010300.example.com.

Throughout this blog post series I’ll use the network 192.168.10.0/23. Physical Hosts will get IPs starting from 192.168.10.2 and Virtual Machines will get IPs starting from 192.168.11.2. So while Physical Hosts and Virtual Machines both get a /23 network mask in the network configuration, the Physical Hosts will get IPs from 192.168.10.0/24 and the Virtual Machines from 192.168.11.0/24. That makes handling network related stuff easier, e.g. setting up firewall rules that block access from Virtual Machines to the Physical Hosts.
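
The address split can be sketched as a small shell helper. This is only an illustration of the convention described above; the function name ip_role is made up for this example and is not part of any tool.

```bash
#!/bin/sh
# Classify an address from 192.168.10.0/23 by the convention used in this post:
#   192.168.10.x -> Physical Host, 192.168.11.x -> Virtual Machine.
# ip_role is a hypothetical helper name chosen for this sketch.
ip_role() {
  case "$1" in
    192.168.10.*) echo "physical" ;;
    192.168.11.*) echo "vm" ;;
    *)            echo "outside" ;;
  esac
}

ip_role 192.168.10.2   # first Physical Host address
ip_role 192.168.11.2   # first Virtual Machine address
```

Because the two halves of the /23 line up with two /24 networks, a firewall rule only needs to match on 192.168.11.0/24 to address all Virtual Machines.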

In order to be able to log in without a password, ssh-copy-id can be used to copy locally available keys to authorize logins on a remote machine. So let’s do this by running ssh-copy-id k8s-010100.example.com (and for the other hosts too of course).

To avoid messing with the system’s Python installation, Python’s venv module is pretty handy. It creates an isolated environment. Installing Ansible pulls in quite a few other modules, and having them only where they’re needed makes a lot of sense. This avoids version conflicts between Python modules, for example. It also allows you to try the latest Ansible versions before using them in production.

In my case I’ll put everything Ansible related into the /opt/scripts/ansible directory. We all know naming is the most difficult thing in IT 😄 In general it’s a good idea to avoid giving resources “real” names (like pets) and to use a schema that makes automation easier instead. My Kubernetes (K8s) cluster will be called k8s-01. If I decide to install another K8s cluster it will be k8s-02 and so on. As I have three Physical Hosts, these hosts will be called k8s-010100, k8s-010200 and k8s-010300. The Virtual Machines (VM) will be called k8s-010101, k8s-010102, k8s-010103, k8s-010201, and so on. So the hierarchy will look like this:

plain

k8s-01              # K8s cluster name
  |-> k8s-010100    # Physical host #1
    |-> k8s-010101  # VM #1 running etcd
    |-> k8s-010102  # VM #2 running K8s control plane
    |-> k8s-010103  # VM #3 running K8s worker
  |-> k8s-010200    # Physical host #2
    |-> k8s-010201  # VM #1 running etcd
    |-> k8s-010202  # VM #2 running K8s control plane
    |-> k8s-010203  # VM #3 running K8s worker
  |-> k8s-010300    # Physical host #3
    |-> k8s-010301  # VM #1 running etcd
    |-> k8s-010302  # VM #2 running K8s control plane
    |-> k8s-010303  # VM #3 running K8s worker
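
The naming schema is easy to generate in a script. The following is a minimal sketch, assuming that each name consists of a two-digit cluster number, a two-digit host number and a two-digit VM number (with 00 meaning the Physical Host itself). The helper name node_name is made up for this example.

```bash
#!/bin/sh
# Build a node name from cluster, host and VM number (two digits each).
# VM number 0 denotes the Physical Host itself.
# node_name is a hypothetical helper, not part of any tool.
node_name() {
  printf 'k8s-%02d%02d%02d\n' "$1" "$2" "$3"
}

# Print the hierarchy of cluster 1: three hosts with three VMs each.
for host in 1 2 3; do
  node_name 1 "$host" 0
  for vm in 1 2 3; do
    printf '  %s\n' "$(node_name 1 "$host" "$vm")"
  done
done
```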

This setup offers some redundancy in case one host is down. So on my Ansible controller node (the laptop/host where I run the ansible and ansible-playbook commands on) I change to the directory /opt/scripts/ansible as mentioned above and create a Python Virtual Environment venv:

bash

python3 -m venv k8s-01_phy

That creates a directory called k8s-01_phy. k8s-01 is for “Kubernetes Cluster 01” and phy for “Physical Hosts”. While this is not relevant for this tutorial, I’ll also create a Virtual Environment called k8s_01_vms. So everything related to the Physical Hosts (which means Arch Linux in my case) will be managed by k8s-01_phy and everything related to the Virtual Machines, which will later run on these Physical Hosts, will go into k8s_01_vms. So let’s enter this directory now. Normally you have to activate that venv with

bash

source ./bin/activate

But to make things a bit easier I’ve installed a tool called direnv. It should be available in the package repositories of basically every Linux distribution. So just install it with apt, yum, pacman or whatever package manager you use. You also need a hook, which is just one line in $HOME/.bashrc. E.g. for Bash:

bash

eval "$(direnv hook bash)"

Please read direnv hook on how to set it up (also for other shells). In my venv there is a file called .envrc and it looks like this:

bash

export ANSIBLE_HOME="$(pwd)"

source ./bin/activate

With direnv allow I approve .envrc so that direnv is permitted to load it. So every time I enter this directory now, direnv will set the ANSIBLE_HOME environment variable and load the venv. ANSIBLE_HOME is by default the “root” directory for other settings like roles_path, so Ansible Roles will be installed in $ANSIBLE_HOME/roles. If I leave that directory, direnv unloads the environment and the venv settings are gone. To check if the venv is loaded you can run which python3 or which python. It should point to bin/python3 or bin/python within your venv directory.
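
Another way to check is the VIRTUAL_ENV environment variable, which the venv’s activate script exports. A minimal sketch (the function name venv_status is made up for this example):

```bash
#!/bin/sh
# Report whether a Python venv is active in the current shell.
# The venv's activate script exports VIRTUAL_ENV; it is unset otherwise.
# venv_status is a hypothetical helper name.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "venv active: $VIRTUAL_ENV"
  else
    echo "no venv active"
  fi
}

venv_status
```

Inside the venv directory this should print the path of the venv, and after leaving the directory it should report that no venv is active.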

Next let’s upgrade the Python package manager pip to the latest version (this only happens within the venv and doesn’t touch your system Python):

bash

python3 -m pip install --upgrade pip

Now let’s install Ansible. At the time of writing this blog post the latest version is 9.1.0 (just remove the version number if you don’t want to pin it):

bash

python3 -m pip install ansible==9.1.0

In the venv’s bin directory there should be a few Ansible binaries now:

bash

ls -1A bin/ansible*
bin/ansible
bin/ansible-community
bin/ansible-config
bin/ansible-connection
bin/ansible-console
bin/ansible-doc
bin/ansible-galaxy
bin/ansible-inventory
bin/ansible-playbook
bin/ansible-pull
bin/ansible-test
bin/ansible-vault

A very helpful tool is ansible-lint. Ansible Lint is a command-line tool for linting playbooks, roles and collections aimed toward any Ansible users. Its main goal is to promote proven practices, patterns and behaviors while avoiding common pitfalls that can easily lead to bugs or make code harder to maintain. It can be installed with python3 -m pip install ansible-lint. Just run ansible-lint in the venv directory and the tool will give you hints on how to improve your code if needed.

For Ansible to be able to manage these hosts, a hosts file is needed in the venv directory. The first line is the group name. In my case it will be k8s_01 (for K8s cluster 01). Only the Physical Hosts will be specified in this group, so the following lines list these hosts (below the hosts key). E.g. (YAML format, but INI format is also possible):

yaml

k8s_01:
  hosts:
    k8s-01[01:03]00.example.com:
      ansible_port: "22"
      ansible_user: "deploy"
      ansible_become: true
      ansible_become_method: "sudo"
      ansible_python_interpreter: "/usr/bin/python3"

This basically tells Ansible that the hosts k8s-010100 to k8s-010300 belong to this group. Also a special variable called ansible_python_interpreter is specified here. It specifies the path to Python on the target host. As Arch Linux can have more than one Python version installed, this tells Ansible which Python to use. For more information see How to build your inventory.
I also specify that Ansible should connect to port 22 as user deploy on the remote hosts. In order to enable Ansible to execute tasks that require “root” permissions I specify ansible_become: true and ansible_become_method: "sudo".
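
To illustrate what the range k8s-01[01:03]00.example.com stands for, here is a small shell sketch that expands it by hand. It only mimics the zero-padded numeric range for this particular pattern; the function name expand_hosts is made up for this example.

```bash
#!/bin/sh
# Expand k8s-01[01:03]00.example.com by hand: a zero-padded counter
# from $1 to $2 replaces the bracket expression.
# expand_hosts is a hypothetical helper that only covers this one pattern.
expand_hosts() {
  i=$1
  while [ "$i" -le "$2" ]; do
    printf 'k8s-01%02d00.example.com\n' "$i"
    i=$((i + 1))
  done
}

expand_hosts 1 3
```

To see how Ansible itself resolves the inventory, ansible-inventory --graph can be run in the venv directory.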

To make life a little bit easier let’s create an ansible.cfg file in the venv directory. For now it will only have one entry. E.g.:

ini

[defaults]
inventory = hosts

This way I don’t need to specify -i hosts all the time when running ansible or ansible-playbook (see below). To generate an example ansible.cfg one can use ansible-config init --disabled > /tmp/ansible.cfg. Afterwards inspect /tmp/ansible.cfg for other options you might want to set. Also see Ansible Configuration Settings.

Ansible needs Python installed on the remote hosts. In case Python isn’t installed there yet, Ansible can help here too. Create a directory playbooks in the venv directory and create a file called bootstrap_python.yml with the following content:

yaml

---
- name: Bootstrap Python
  hosts: k8s_01
  gather_facts: false

  tasks:
    - name: Updating pacman cache
      ansible.builtin.raw: "pacman -Sy"
      changed_when: false
      failed_when: false
      become: true

    - name: Install Python
      ansible.builtin.raw: |
        pacman -S --noconfirm python
      args:
        executable: "/bin/bash"
      changed_when: false
      become: true

Then run ansible-playbook playbooks/bootstrap_python.yml. To verify that Python is installed run ansible -m command -a "python -V" k8s_01. Now for every host you should see an output like this:

bash

k8s-010100.example.com | CHANGED | rc=0 >>
Python 3.11.6

With ansible -m ping k8s_01 you should now be able to “ping” all hosts in the group with Ansible’s ping module e.g.:

bash

k8s-010100.example.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
...

With Ansible in place and configured let’s further configure the Kubernetes hosts.

To make the Kubernetes hosts a little bit more secure I’ll use my Ansible role githubixx.harden_linux. So let’s install it:

bash

ansible-galaxy role install githubixx.harden_linux

It will end up in the $ANSIBLE_HOME/roles/ directory. To customize the role a few variables are needed. Since the variable values will be valid for all hosts in the k8s_01 group I’ll create a file k8s_01.yml in the group_vars directory (just create the directory if it’s not there yet). The content will look like this (read the comments to see what the settings are about, and the githubixx.harden_linux README for all possible settings):

yaml

---
# Configure and use "systemd-timesyncd" as time server
harden_linux_ntp: "systemd-timesyncd"

# Create a user "deploy" on the remote hosts. This user is meant to be used
# by Ansible. Just use the same user that was created in first part of this
# blog post or use a different one. The encrypted password can be created
# with the following command (on Linux):
#
# mkpasswd --method=sha-512
#
# The $HOME directory of that user will be "/home/deploy". And the public SSH
# key stored in "/home/deploy/.ssh/id_rsa.pub" on your Ansible controller
# host will be copied to the remote hosts into
# "/home/deploy/.ssh/authorized_keys".
harden_linux_deploy_group: "deploy"
harden_linux_deploy_group_gid: "1000"
harden_linux_deploy_user: "deploy"
harden_linux_deploy_user_password: "<output of mkpasswd --method=sha-512>"
harden_linux_deploy_user_uid: "1000"
harden_linux_deploy_user_home: "/home/deploy"
harden_linux_deploy_user_shell: "/bin/bash"
harden_linux_deploy_user_public_keys:
  - /home/deploy/.ssh/id_rsa.pub

# Enable IP forwarding. This is needed to forward network packets between
# VMs e.g.
harden_linux_sysctl_settings_user:
  "net.ipv4.ip_forward": 1
  "net.ipv6.conf.default.forwarding": 1
  "net.ipv6.conf.all.forwarding": 1

# Change default SSH port to 22222. This is of course optional and may
# prevent some SSH attacks on default SSH port 22. But this is of course more
# security by obscurity ;-)
harden_linux_sshd_settings_user:
  "^Port ": "Port 22222"

# Setup "Uncomplicated FireWall (UFW)". Allow incoming network traffic on
# port 22222 and delete a previously added rule that allowed port 22
# ("delete: true"). Also no restrictions for network traffic originating
# from hosts that belong to "192.168.10.0/24" network. Also enable UFW
# logging and set the default FORWARD policy to "accept". This policy is
# needed so that traffic between the VMs can flow later. Otherwise network
# packets are not forwarded between the VMs network interfaces.
harden_linux_ufw_rules:
  - rule: "allow"
    to_port: "22"
    protocol: "tcp"
    delete: true
  - rule: "allow"
    to_port: "22222"
    protocol: "tcp"
harden_linux_ufw_allow_networks:
  - "192.168.10.0/24"
harden_linux_ufw_logging: 'on'
harden_linux_ufw_defaults_user:
  "^DEFAULT_FORWARD_POLICY": 'DEFAULT_FORWARD_POLICY="ACCEPT"'

# "sshguard" should not block failed login attempts from these hosts.
harden_linux_sshguard_whitelist:
  - "127.0.0.0/8"
  - "::1/128"
  - "192.168.10.0/24"

# Configure "systemd-resolved". While it looks strange it's actually correct ;-)
# To override already existing default values they first need to be set to an
# empty string and in the next line the new value will be set.
harden_linux_systemd_resolved_settings:
  - DNS=
  - DNS=9.9.9.9 1.1.1.1 2606:4700:4700::1111 2620:fe::fe
  - FallbackDNS=
  - FallbackDNS=149.112.112.112 1.0.0.1 2620:fe::9 2606:4700:4700::1001
  - DNSOverTLS=
  - DNSOverTLS=opportunistic
  - Domains=
  - Domains=example.com

Now I create an Ansible playbook file called k8s.yml. E.g.:

yaml

---
-
  hosts: k8s_01
  roles:
    -
      role: githubixx.harden_linux
      tags: role-harden-linux

So the Ansible role githubixx.harden_linux will be applied to all hosts in the host group k8s_01. Additionally a tag called role-harden-linux is assigned to this role. The advantage of using this format is that when more roles are added to the play later, one can apply just one role to the hosts by using its tag and doesn’t have to apply all roles specified.

Finally apply the role:

bash

ansible-playbook --tags=role-harden-linux k8s.yml

Since I changed the SSH port to 22222 I now need to adjust that setting in Ansible’s hosts file too. Otherwise Ansible can’t connect to the remote hosts anymore:

yaml

ansible_port: "22222"

The next step will install libvirt, QEMU, KVM and everything that is needed to make it work. libvirt is a collection of software that provides a convenient way to manage virtual machines and other virtualization functionality, such as storage and network interface management. The most important parts for my purpose are most probably libvirtd and virsh. libvirtd allows me to connect remotely from my laptop to all hosts and manage the VMs there. virsh is basically a CLI tool to manage everything VM related on the hosts. On my laptop I’ve virt-manager installed that I’ll use later to manage my VMs. QEMU is a generic and open source machine emulator and virtualizer. QEMU can use other hypervisors like KVM to use CPU extensions (HVM) for virtualization. When used as a virtualizer, QEMU achieves near native performance by executing the guest code directly on the host CPU.

To set up everything libvirt related I again put an Ansible playbook file, called libvirt.yml, into the playbooks directory. It has this content:

yaml

---
- name: Setup libvirtd
  hosts: k8s_01
  gather_facts: false

  tasks:
    - name: Install iptables-nft
      ansible.builtin.raw: |
        yes | pacman -S iptables-nft
      args:
        executable: "/bin/bash"
      changed_when: false
      become: true

    - name: Install libvirtd and packages for virtualization
      ansible.builtin.package:
        pkg: "{{ packages }}"
        state: "present"
      retries: 2
      delay: 5
      vars:
        packages:
          - edk2-ovmf
          - dnsmasq
          - openbsd-netcat
          - dmidecode
          - libvirt
          - qemu-base
          - util-linux
          - bash-completion

    - name: Add deploy user to libvirt group
      ansible.builtin.user:
        name: "deploy"
        groups: "libvirt"
        append: true

    - name: Restart libvirtd
      ansible.builtin.service:
        name: "libvirtd"
        enabled: true
        state: "restarted"

    - name: Restart UFW
      ansible.builtin.service:
        name: "ufw"
        enabled: true
        state: "restarted"

Let’s quickly discuss what happens here. First I need to replace the iptables package with iptables-nft. iptables gets installed as a dependency of the iproute2 package. iptables and nftables are both firewall frameworks in Linux, with iptables being the traditional framework and nftables being the newer and more streamlined one. While iptables offers backward compatibility and wide support, nftables provides better performance, scalability, and a simpler syntax for managing firewall rules. UFW, the firewall frontend used in my githubixx.harden_linux role, also works with nftables. Since iptables and iptables-nft conflict with each other and the former is a dependency of iproute2, I need to use the Ansible raw module to explicitly confirm that I want to replace iptables with iptables-nft. In most cases it’s a drop-in replacement for the end user.
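
To verify which variant is in use after the replacement, the version string of iptables can be inspected. It normally ends in (nf_tables) or (legacy). The following is a minimal sketch that parses such a line; the helper name iptables_backend and the version number in the sample line are just examples.

```bash
#!/bin/sh
# Extract the backend from an "iptables --version" line,
# e.g. "iptables v1.8.10 (nf_tables)" or "iptables v1.8.10 (legacy)".
# iptables_backend is a hypothetical helper name.
iptables_backend() {
  case "$1" in
    *"(nf_tables)"*) echo "nf_tables" ;;
    *"(legacy)"*)    echo "legacy" ;;
    *)               echo "unknown" ;;
  esac
}

# On a host this would be called as: iptables_backend "$(iptables --version)"
iptables_backend "iptables v1.8.10 (nf_tables)"
```

To get the version string from all hosts at once, ansible -m command -a "iptables --version" k8s_01 can be used.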

Then a few packages will be installed. edk2-ovmf is a TianoCore project to enable UEFI support for Virtual Machines. dnsmasq is needed for the default DHCP networking, but as I use bridged networking I most probably don’t need it. openbsd-netcat is for remote management over SSH, so it’s used by virsh and virt-manager later. dmidecode is for DMI system info support. If it’s not installed, libvirtd will complain that it can’t find the dmidecode binary in the path. I’ve already written about libvirt above. And qemu-base contains the headless version of QEMU without any GUI components (which are not needed on a server).

If you plan to also run Windows 11 guests consider installing the swtpm package too. QEMU can emulate a Trusted Platform Module (TPM), which is required by some systems such as Windows 11. The swtpm package provides a software TPM implementation.

Then the playbook will add the deploy user (change that one accordingly) to the libvirt group. This is needed to access libvirtd. Members of the libvirt group have passwordless access to the RW daemon socket by default. So I can easily connect as user deploy (or whatever user you use) via a secure SSH connection to the remote hosts with virsh or virt-manager and everything works magically already 😄

Finally libvirtd and UFW will be restarted to catch up with all the changes (e.g. the UEFI support).

So let’s run the playbook:

bash

ansible-playbook playbooks/libvirt.yml

If everything worked as intended the following command should now work from your laptop (provided that you have the libvirt package installed there, and of course replace deploy with your user and k8s-010100 with your hostname):

bash

virsh --connect qemu+ssh://deploy@k8s-010100/system sysinfo

This should print some XML output about your remote host’s BIOS, processor, RAM, and so on.
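
Since the SSH port was changed to 22222 earlier, the connection URI may need the port too (unless it’s already configured in $HOME/.ssh/config). libvirt remote URIs accept an optional :port after the hostname. Here is a minimal sketch that builds such URIs for all three hosts; the helper name libvirt_uri is made up for this example.

```bash
#!/bin/sh
# Build a libvirt remote URI of the form qemu+ssh://user@host[:port]/system.
# libvirt_uri is a hypothetical helper; the third argument (port) is optional.
libvirt_uri() {
  uri_user=$1
  uri_host=$2
  uri_port=${3:-}
  if [ -n "$uri_port" ]; then
    printf 'qemu+ssh://%s@%s:%s/system\n' "$uri_user" "$uri_host" "$uri_port"
  else
    printf 'qemu+ssh://%s@%s/system\n' "$uri_user" "$uri_host"
  fi
}

# Print one virsh command per Physical Host, using SSH port 22222.
for host in k8s-010100 k8s-010200 k8s-010300; do
  echo "virsh --connect $(libvirt_uri deploy "$host" 22222) list --all"
done
```

The loop only prints the commands, so they can be reviewed first and then run one by one.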

So at this point it’d already be possible to create Virtual Machines. But for performance reasons I don’t want to store the VM images as qcow2 files in the default directory /var/lib/libvirt/images. If you don’t care that much about disk performance and can accept the performance penalty you can skip this section. When it comes to flexibility, qcow2 images are still the easiest way to handle Virtual Machine images locally on a host.

To manage the storage for the VMs I’ll use the Linux Logical Volume Manager (LVM). For this I’ll create a Volume Group (VG). A VG is normally built on top of one or more disks (NVMe drives, SSDs or HDDs). Later I can create volumes for my VMs that will be Logical Volumes in LVM. This is very flexible as one can create, change or delete these volumes as needed.

In my hosts there is one NVMe (partly used by the host OS already) and one SSD drive. To figure out what disks/partitions are available, lsblk command can be used. It looks like this:

bash

ansible -m command -a "lsblk" k8s-010100
k8s-010100.example.com | CHANGED | rc=0 >>
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0 931.5G  0 disk 
└─sda1        8:1    0 931.5G  0 part 
zram0       254:0    0     4G  0 disk [SWAP]
nvme0n1     259:0    0   1.8T  0 disk 
├─nvme0n1p1 259:1    0   512M  0 part /boot
├─nvme0n1p2 259:2    0   100G  0 part /
└─nvme0n1p3 259:3    0   1.7T  0 part

So nvme0n1 (the NVMe disk) has around 1.7 TByte free and sda (the SSD disk) basically only has an empty partition with 931 GByte free. For the root (/) file system of the VMs an SSD is good enough. I reserve the NVMe partition nvme0n1p3 for services that run in my Kubernetes cluster and need more speed, like databases and object storage. Creating partitions can be done with tools like cfdisk.

I’ll create two LVM Volume Groups. The first one will be called ssd01 and the second one nvme01. That makes it easy to distinguish later which is the faster and which is the slower storage. It also makes it easy to add more drives/logical volumes later by just counting up.

I’ll use my Ansible role githubixx.lvm to set up LVM and the Volume Groups. So let’s install this role:

bash

ansible-galaxy role install githubixx.lvm

Next I extended the Ansible playbook file k8s.yml. It looks like this now:

yaml

---
-
  hosts: k8s_01
  roles:
    -
      role: githubixx.harden_linux
      tags: role-harden-linux
    -
      role: githubixx.lvm
      tags: role-lvm

To enable the role to create the Volume Groups and Logical Volumes I create a variable file for each host in host_vars accordingly. Since the Logical Volume names are different for every host they can’t be defined in group_vars. So for the first host the file host_vars/k8s-010100.example.com will look like this:

yaml

lvm_vgs:
  - vgname: ssd01
    pvs: /dev/sda1
    state: present
    lvm_lvs:
      - lvname: k8s-010101
        size: 25G
        state: present
      - lvname: k8s-010102
        size: 25G
        state: present
      - lvname: k8s-010103
        size: 100G
        state: present
  - vgname: nvme01
    pvs: /dev/nvme0n1p3
    state: present

For the hosts k8s-010200.example.com and k8s-010300.example.com it will look similar, but the lvname values differ. For host k8s-010200, for example, they will be k8s-010201, k8s-010202 and k8s-010203.
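
Instead of copying and editing the file by hand, the host_vars content can also be generated. The following is a minimal sketch, assuming all three hosts have the same disk layout (/dev/sda1 and /dev/nvme0n1p3) and the same volume sizes as in the example above. The helper name render_host_vars is made up for this example.

```bash
#!/bin/sh
# Print the lvm_vgs definition for Physical Host number $1 (1, 2 or 3) of
# cluster 01. render_host_vars is a hypothetical helper; devices and sizes
# are copied from the example above and assumed to be identical on all hosts.
render_host_vars() {
  prefix=$(printf 'k8s-01%02d' "$1")
  cat <<EOF
lvm_vgs:
  - vgname: ssd01
    pvs: /dev/sda1
    state: present
    lvm_lvs:
      - lvname: ${prefix}01
        size: 25G
        state: present
      - lvname: ${prefix}02
        size: 25G
        state: present
      - lvname: ${prefix}03
        size: 100G
        state: present
  - vgname: nvme01
    pvs: /dev/nvme0n1p3
    state: present
EOF
}

render_host_vars 2
```

For the second host the output could be redirected with render_host_vars 2 > host_vars/k8s-010200.example.com.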

The configuration starts with the top level object lvm_vgs which contains the definition of one or more Volume Groups. Then there are two Volume Groups as already mentioned above: ssd01 with the one partition /dev/sda1 on the SSD drive and nvme01, also with one partition, /dev/nvme0n1p3 on the NVMe drive. While I don’t add any Logical Volumes for Volume Group nvme01 for now, I’ll add three for ssd01. As mentioned above I’ll put all root (/) partitions of my VMs on the ssd01 Volume Group. The first one, k8s-010101, will be for the etcd Virtual Machine. The second, k8s-010102, will be for the first Kubernetes controller node and the third one, k8s-010103, will be for the Kubernetes worker node. Every Physical Host will have one of each of these workloads, so there will be three times etcd, Kubernetes controller and worker. 25 GByte for etcd and the Kubernetes controller should be sufficient. For the Kubernetes worker it makes sense to have more space available as that one will also need to store the container images. And you might also need some host volumes mounted into your Pods.

Now let’s set up the Volume Groups:

bash

ansible-playbook -t role-lvm k8s.yml

If everything ran smoothly there should be now two Volume Groups on each host. This can be checked with the vgs command:

bash

ansible -m command -a "/usr/bin/vgs" k8s_01
k8s-010100.example.com | CHANGED | rc=0 >>
  VG     #PV #LV #SN Attr   VSize    VFree   
  nvme01   1   0   0 wz--n-    1.72t    1.72t
  ssd01    1   0   0 wz--n- <931.51g <931.51g
k8s-010200.example.com | CHANGED | rc=0 >>
  VG     #PV #LV #SN Attr   VSize    VFree   
  nvme01   1   0   0 wz--n-    1.72t    1.72t
  ssd01    1   0   0 wz--n- <931.51g <931.51g
k8s-010300.example.com | CHANGED | rc=0 >>
  VG     #PV #LV #SN Attr   VSize    VFree   
  nvme01   1   0   0 wz--n-    1.72t    1.72t
  ssd01    1   0   0 wz--n- <931.51g <931.51g

And there should be three Logical Volumes on each host. This can be checked with lvs command:

bash

ansible -m command -a "lvs" k8s_01
k8s-010200.example.com | CHANGED | rc=0 >>
  LV         VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  k8s-010201 ssd01 -wi-a-----  25.00g                                                    
  k8s-010202 ssd01 -wi-a-----  25.00g                                                    
  k8s-010203 ssd01 -wi-a----- 100.00g                                                    
k8s-010300.example.com | CHANGED | rc=0 >>
  LV         VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  k8s-010301 ssd01 -wi-a-----  25.00g                                                    
  k8s-010302 ssd01 -wi-a-----  25.00g                                                    
  k8s-010303 ssd01 -wi-a----- 100.00g                                                    
k8s-010100.example.com | CHANGED | rc=0 >>
  LV         VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  k8s-010101 ssd01 -wi-a-----  25.00g                                                    
  k8s-010102 ssd01 -wi-a-----  25.00g                                                    
  k8s-010103 ssd01 -wi-a----- 100.00g

In the next blog post I’ll do some preparation needed for the Virtual Machines.