Virtualization with Arch Linux and Qemu/KVM - part 2
Introduction
To make life a bit easier I'll now use Ansible to automate a few tasks. I've already written about Ansible in my blog post Kubernetes the not so hard way with Ansible. I haven't automated everything, as it's not worth the effort for my private setup, so some tasks I'll still do manually.
DNS entries
To be able to work with the new hosts make sure that DNS entries are in place (`/etc/hosts` entries might also work). So if you own `example.com` the hostnames would be `k8s-010100.example.com`, `k8s-010200.example.com` and `k8s-010300.example.com`.
Throughout this blog post series I'll use the network `192.168.10.0/23`. Physical Hosts will get IPs starting from `192.168.10.2` and Virtual Machines will get IPs starting from `192.168.11.2`. So while both Physical Hosts and Virtual Machines will get a `/23` network mask, the Physical Hosts will get IPs from `192.168.10.0/24` and the Virtual Machines from `192.168.11.0/24`. That makes handling network related stuff easier, e.g. setting up firewall rules that block access from Virtual Machines to the Physical Hosts.
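For example, a rule like the following could later block traffic from the VM range to the Physical Hosts (a sketch; UFW is the firewall frontend used later in this series, and the command is only printed here so nothing gets changed accidentally):

```shell
# Print the UFW rule that would deny traffic from the VM /24 to the
# physical host /24. Remove the leading "echo" (and run as root on a host
# with UFW installed) to actually apply it.
echo "ufw deny from 192.168.11.0/24 to 192.168.10.0/24"
```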
Setup SSH keys
In order to be able to login without a password, `ssh-copy-id` can be used to copy locally available keys to authorize logins on a remote machine. So lets do this by running `ssh-copy-id k8s-010100.example.com` (and for the other hosts too, of course).
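With the naming scheme used in this post, copying the key to all three physical hosts can be scripted (a sketch; the hostnames are the example names from above, adjust them to your domain):

```shell
# Copy the local SSH public key to every physical host.
# The loop only prints the commands here; drop the "echo" to actually
# run ssh-copy-id against your hosts.
for i in 1 2 3; do
  echo "ssh-copy-id k8s-010${i}00.example.com"
done
```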
Prepare Python virtual environments for Ansible
To not mess up the Python system installation, Python's `venv` module is pretty handy. It creates an isolated environment. An Ansible installation pulls in quite a few other Python modules, and having them only where they're needed makes a lot of sense: it avoids version conflicts between Python modules, and you can also try the latest Ansible version before using it in production.
In my case everything Ansible related goes into the `/opt/scripts/ansible` directory. We all know naming is the most difficult thing in IT 😄 In general it's a good idea to avoid giving resources "real" names (like pets) but use some schema that makes automation easier. My Kubernetes (K8s) cluster will be called `k8s-01`. If I decide to install another K8s cluster it will be `k8s-02` and so on. As I've three Physical Hosts, these hosts will be called `k8s-010100`, `k8s-010200` and `k8s-010300`. The Virtual Machines (VM) will be called `k8s-010101`, `k8s-010102`, `k8s-010103`, `k8s-010201`, and so on. So the hierarchy will look like this:
```
k8s-01                # K8s cluster name
|-> k8s-010100        # Physical host #1
    |-> k8s-010101    # VM #1 running etcd
    |-> k8s-010102    # VM #2 running K8s control plane
    |-> k8s-010103    # VM #3 running K8s worker
|-> k8s-010200        # Physical host #2
    |-> k8s-010201    # VM #1 running etcd
    |-> k8s-010202    # VM #2 running K8s control plane
    |-> k8s-010203    # VM #3 running K8s worker
|-> k8s-010300        # Physical host #3
    |-> k8s-010301    # VM #1 running etcd
    |-> k8s-010302    # VM #2 running K8s control plane
    |-> k8s-010303    # VM #3 running K8s worker
```
This setup offers some redundancy in case one host is down. So on my Ansible controller node (the laptop/host where I run the `ansible` and `ansible-playbook` commands) I change to the directory `/opt/scripts/ansible` as mentioned above and create a Python Virtual Environment (`venv`):
```shell
python3 -m venv k8s-01_phy
```

That creates a directory called `k8s-01_phy`. `k8s-01` is for "Kubernetes Cluster 01" and `phy` for "Physical Hosts". While it's not relevant for this tutorial, I'll also create a Virtual Environment called `k8s_01_vms`. So everything related to the Physical Hosts (which means Arch Linux in my case) will be managed via `k8s-01_phy`, and everything related to the Virtual Machines which will later run on these Physical Hosts will go into `k8s_01_vms`. So lets enter the `k8s-01_phy` directory now. Normally you have to activate that `venv` with

```shell
source ./bin/activate
```
But to make things a bit easier I've installed a tool called direnv. It should be available in basically every Linux distribution's package manager, so just install it with `apt`, `yum`, `pacman` or whatever package manager you use. You also need a hook, which is just one line in `$HOME/.bashrc`, e.g. for Bash:

```shell
eval "$(direnv hook bash)"
```
Please read direnv hook on how to set it up (also for other shells). In my `venv` there is a file called `.envrc` and it looks like this:

```shell
export ANSIBLE_HOME="$(pwd)"
source ./bin/activate
```
With `direnv allow` I activate `.envrc`. So every time I enter this directory, direnv will set the `ANSIBLE_HOME` environment variable and load the `venv`. `ANSIBLE_HOME` is by default the "root" directory of other variables, e.g. `roles_path`. So Ansible roles will be installed in `$ANSIBLE_HOME/roles`, for example. If I leave the directory, the loaded environment gets unloaded and the `venv` settings are gone. To check if the `venv` is loaded you can run `which python3` or `which python`. It should point to `bin/python3` or `bin/python` within your `venv` directory.
Next lets upgrade the Python package manager to the latest version (this only happens within the `venv` and doesn't touch your system Python):

```shell
python3 -m pip install --upgrade pip
```
Setup Ansible
Now lets install Ansible. At the time of writing this blog post the latest version is `9.1.0` (just remove the version number if you don't want to pin it):

```shell
python3 -m pip install ansible==9.1.0
```
In the `venv`'s `bin` directory there should be a few Ansible binaries now:

```shell
ls -1A bin/ansible*

bin/ansible
bin/ansible-community
bin/ansible-config
bin/ansible-connection
bin/ansible-console
bin/ansible-doc
bin/ansible-galaxy
bin/ansible-inventory
bin/ansible-playbook
bin/ansible-pull
bin/ansible-test
bin/ansible-vault
```
A very helpful tool is ansible-lint, a command-line tool for linting playbooks, roles and collections. Its main goal is to promote proven practices, patterns and behaviors while avoiding common pitfalls that can easily lead to bugs or make code harder to maintain. It can be installed with `python3 -m pip install ansible-lint`. Just run `ansible-lint` in the `venv` directory and the tool will give you hints on how to improve your code if needed.
Setup Ansible’s host file
For Ansible to be able to manage these hosts, a `hosts` file is needed in the `venv` directory. The first line is the group name, in my case `k8s_01` (for K8s cluster `01`). Only the Physical Hosts will be specified in this group. The next line specifies these hosts (below `hosts`), e.g. (YAML format, but INI format is also possible):
```yaml
k8s_01:
  hosts:
    k8s-01[01:03]00.example.com:
      ansible_port: "22"
      ansible_user: "deploy"
      ansible_become: true
      ansible_become_method: "sudo"
      ansible_python_interpreter: "/usr/bin/python3"
```
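The pattern `k8s-01[01:03]00.example.com` uses Ansible's inventory range syntax: `[01:03]` expands to `01`, `02` and `03`. You can confirm what Ansible resolved with `ansible-inventory --graph` inside the venv. The expansion itself can be sketched in plain shell:

```shell
# Expand the inventory range [01:03] by hand to show which hosts the
# pattern k8s-01[01:03]00.example.com covers.
for n in 01 02 03; do
  echo "k8s-01${n}00.example.com"
done
```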
This basically tells Ansible that hosts `k8s-010100` to `k8s-010300` belong to this group. Also a special variable called `ansible_python_interpreter` is specified here. It sets the Python path on the target host. As Arch Linux can have multiple Python versions installed, we tell Ansible which Python to use. For more information see How to build your inventory. I also specify that Ansible should connect to port `22` as user `deploy` on the remote hosts. To enable Ansible to execute tasks that require "root" permissions I specify `ansible_become: true` and `ansible_become_method: "sudo"`.
Setup ansible.cfg
To make life a little bit easier lets create an `ansible.cfg` file in your `venv`. For now it will only have one entry:

```ini
[defaults]
inventory = hosts
```

This avoids having to specify `-i hosts` every time when running `ansible` or `ansible-playbook` (see below). To generate an example `ansible.cfg` one can use `ansible-config init --disabled > /tmp/ansible.cfg`. Afterwards inspect `/tmp/ansible.cfg` for other options you might want to set. Also see Ansible Configuration Settings.
Install Python on remote hosts (if needed)
Ansible needs Python installed on the remote hosts. In case Python isn't installed yet, Ansible helps here too. Create a directory `playbooks` in the `venv` directory and create a file called `bootstrap_python.yml` with the following content:

```yaml
---
- name: Bootstrap Python
  hosts: k8s_01
  gather_facts: false

  tasks:
    - name: Updating pacman cache
      ansible.builtin.raw: "pacman -Sy"
      changed_when: false
      failed_when: false
      become: true

    - name: Install Python
      ansible.builtin.raw: |
        pacman -S --noconfirm python
      args:
        executable: "/bin/bash"
      changed_when: false
      become: true
```
Then run `ansible-playbook playbooks/bootstrap_python.yml`. To verify that Python is installed, run `ansible -m command -a "python -V" k8s_01`. For every host you should see an output like this:

```
k8s-010100.example.com | CHANGED | rc=0 >>
Python 3.11.6
```

With `ansible -m ping k8s_01` you should now be able to "ping" all hosts in the group with Ansible's `ping` module, e.g.:

```
k8s-010100.example.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
    ...
```
With Ansible in place and configured, lets continue setting up the Kubernetes hosts.
Harden the instances and enable some needed sysctl settings
To make the Kubernetes hosts a little bit more secure I'll use my Ansible role githubixx.harden_linux. So lets install it:

```shell
ansible-galaxy role install githubixx.harden_linux
```

It will end up in the `$ANSIBLE_HOME/roles/` directory. To customize the role a few variables are needed. Since the variable values will be valid for all hosts in the `k8s_01` group, I'll create a file `k8s_01.yml` in the `group_vars` directory (just create the directory if it doesn't exist yet). The content will look like this (the comments explain what the settings are about; see the githubixx.harden_linux README for all possible settings):
```yaml
---
# Configure and use "systemd-timesyncd" as time server
harden_linux_ntp: "systemd-timesyncd"

# Create a user "deploy" on the remote hosts. This user is meant to be used
# by Ansible. Just use the same user that was created in the first part of
# this blog post or use a different one. The encrypted password can be
# created with the following command (on Linux):
#
#   mkpasswd --method=sha-512
#
# The $HOME directory of that user will be "/home/deploy". And the public SSH
# key stored in "/home/deploy/.ssh/id_rsa.pub" on your Ansible controller
# host will be copied to the remote hosts into
# "/home/deploy/.ssh/authorized_keys".
harden_linux_deploy_group: "deploy"
harden_linux_deploy_group_gid: "1000"
harden_linux_deploy_user: "deploy"
harden_linux_deploy_user_password: "<output of mkpasswd --method=sha-512>"
harden_linux_deploy_user_uid: "1000"
harden_linux_deploy_user_home: "/home/deploy"
harden_linux_deploy_user_shell: "/bin/bash"
harden_linux_deploy_user_public_keys:
  - /home/deploy/.ssh/id_rsa.pub

# Enable IP forwarding. This is needed to forward network packets between
# VMs e.g.
harden_linux_sysctl_settings_user:
  "net.ipv4.ip_forward": 1
  "net.ipv6.conf.default.forwarding": 1
  "net.ipv6.conf.all.forwarding": 1

# Change default SSH port to 22222. This is of course optional and may
# prevent some SSH attacks on default SSH port 22. But this is of course more
# security by obscurity ;-)
harden_linux_sshd_settings_user:
  "^Port ": "Port 22222"

# Setup "Uncomplicated FireWall (UFW)". Allow incoming network traffic on
# port 22 and 22222. Also no restrictions for network traffic originating
# from hosts that belong to the "192.168.10.0/24" network. Also enable UFW
# logging and set the default FORWARD policy to "accept". This policy is
# needed so that traffic between the VMs can flow later. Otherwise network
# packets are not forwarded between the VMs network interfaces.
harden_linux_ufw_rules:
  - rule: "allow"
    to_port: "22"
    protocol: "tcp"
    delete: true
  - rule: "allow"
    to_port: "22222"
    protocol: "tcp"
harden_linux_ufw_allow_networks:
  - "192.168.10.0/24"
harden_linux_ufw_logging: 'on'
harden_linux_ufw_defaults_user:
  "^DEFAULT_FORWARD_POLICY": 'DEFAULT_FORWARD_POLICY="ACCEPT"'

# "sshguard" should not block failed login attempts from these hosts.
harden_linux_sshguard_whitelist:
  - "127.0.0.0/8"
  - "::1/128"
  - "192.168.10.0/24"

# Configure "systemd-resolved". While it looks strange it's actually correct ;-)
# To override already existing default values they first need to be set to an
# empty string and in the next line the new value will be set.
harden_linux_systemd_resolved_settings:
  - DNS=
  - DNS=9.9.9.9 1.1.1.1 2606:4700:4700::1111 2620:fe::fe
  - FallbackDNS=
  - FallbackDNS=149.112.112.112 1.0.0.1 2620:fe::9 2606:4700:4700::1001
  - DNSOverTLS=
  - DNSOverTLS=opportunistic
  - Domains=
  - Domains=example.com
```
Now I create an Ansible playbook file called `k8s.yml`, e.g.:

```yaml
---
-
  hosts: k8s_01
  roles:
    -
      role: githubixx.harden_linux
      tags: role-harden-linux
```
So for all hosts in the host group `k8s_01` the Ansible role `githubixx.harden_linux` will be applied. Additionally a tag called `role-harden-linux` is attached to this role. The advantage of this format is that when the host group gets more roles later, one can apply just a single role to the hosts by using its tag, instead of having to apply all specified roles.
Finally apply the role:

```shell
ansible-playbook --tags=role-harden-linux k8s.yml
```

Since I changed the SSH port to `22222`, I now need to adjust that setting in Ansible's `hosts` file too. Otherwise Ansible can't connect to the remote hosts anymore:

```yaml
ansible_port: "22222"
```
Install libvirt, QEMU and KVM
The next step installs libvirt, QEMU, KVM and everything needed to make them work. libvirt is a collection of software that provides a convenient way to manage virtual machines and other virtualization functionality, such as storage and network interface management. The most important parts for my purpose are most probably `libvirtd` and `virsh`. `libvirtd` allows me to connect remotely from my laptop to all hosts and manage the VMs there. `virsh` is basically a CLI tool to manage everything VM related on the hosts. On my laptop I've virt-manager installed, which I'll use later to manage my VMs. QEMU is a generic and open source machine emulator and virtualizer. QEMU can use other hypervisors like KVM to use CPU extensions (HVM) for virtualization. When used as a virtualizer, QEMU achieves near native performance by executing the guest code directly on the host CPU.
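Before installing anything it can be worth verifying that a physical host actually supports hardware virtualization (a quick check; run this on the physical host, not on the laptop):

```shell
# Check for Intel VT-x (vmx) or AMD-V (svm) CPU flags and for the KVM
# device node. Both should be present on a host that will run KVM guests.
if grep -q -E 'vmx|svm' /proc/cpuinfo; then
  echo "CPU virtualization extensions: present"
else
  echo "CPU virtualization extensions: not found"
fi

if [ -e /dev/kvm ]; then
  echo "/dev/kvm: available"
else
  echo "/dev/kvm: missing (is the kvm kernel module loaded?)"
fi
```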
To set up everything libvirt related I again put an Ansible playbook file, called `libvirt.yml`, into the `playbooks` directory. It has this content:

```yaml
---
- name: Setup libvirtd
  hosts: k8s_01
  gather_facts: false

  tasks:
    - name: Install iptables-nft
      ansible.builtin.raw: |
        yes | pacman -S iptables-nft
      args:
        executable: "/bin/bash"
      changed_when: false
      become: true

    - name: Install libvirtd and packages for virtualization
      ansible.builtin.package:
        pkg: "{{ packages }}"
        state: "present"
      retries: 2
      delay: 5
      vars:
        packages:
          - edk2-ovmf
          - dnsmasq
          - openbsd-netcat
          - dmidecode
          - libvirt
          - qemu-base
          - util-linux
          - bash-completion

    - name: Add deploy user to libvirt group
      ansible.builtin.user:
        name: "deploy"
        groups: "libvirt"
        append: "yes"

    - name: Restart libvirtd
      ansible.builtin.service:
        name: "libvirtd"
        enabled: true
        state: "restarted"

    - name: Restart UFW
      ansible.builtin.service:
        name: "ufw"
        enabled: true
        state: "restarted"
```
Lets quickly discuss what happens here. First I need to replace the `iptables` package with `iptables-nft`. `iptables` comes in as a dependency of the `iproute2` package. iptables and nftables are both firewall frameworks in Linux, with iptables being the traditional framework and nftables the newer and more streamlined one. While iptables offers backward compatibility and wide support, nftables provides better performance, scalability, and a simpler syntax for managing firewall rules. UFW, the firewall frontend used in my `githubixx.harden_linux` role, also works with nftables. Since `iptables` and `iptables-nft` conflict, and the former is a dependency of `iproute2`, I need to use the Ansible `raw` module to explicitly confirm that I want to replace `iptables` with `iptables-nft`. In most cases it's a drop-in replacement for the end user.
Then a few packages will be installed. edk2-ovmf is a TianoCore project that enables UEFI support for Virtual Machines. dnsmasq is used for the default DHCP networking; but as I use bridged networking I most probably won't need it. openbsd-netcat is for remote management over SSH, used by `virsh` and `virt-manager` later. dmidecode is for DMI system info support; if it's not installed, `libvirtd` will complain that it can't find the `dmidecode` binary in the path. About libvirt I've already written above. And qemu-base contains the headless version of QEMU without any GUI components (which are not needed on a server).
If you plan to also run Windows 11 guests, consider installing the swtpm package too. QEMU can emulate a Trusted Platform Module, which is required by some systems such as Windows 11. The `swtpm` package provides a software TPM implementation.
Then the playbook adds the `deploy` user (change that one accordingly) to the `libvirt` group. This is needed to access `libvirtd`. Members of the `libvirt` group have passwordless access to the RW daemon socket by default. So I can easily connect as user `deploy` (or whatever user you use) via a secure SSH connection to the remote hosts with `virsh` or `virt-manager` and everything magically works already 😄
Finally `libvirtd` and UFW will be restarted to pick up all the changes (e.g. the UEFI support). So lets run the playbook:

```shell
ansible-playbook playbooks/libvirt.yml
```

If everything worked as intended, the following command should now work from your laptop (provided you have the `libvirt` package installed there; of course replace `deploy` with your user and `k8s-010100` with your hostname):

```shell
virsh --connect qemu+ssh://deploy@k8s-010100/system sysinfo
```

This should print some XML output about your remote host's BIOS, processor, RAM, and so on.
Install LVM for VM storage pool
It would currently already be possible to create Virtual Machines. But for performance reasons I don't want to store the VM images as `qcow2` files in the default directory `/var/lib/libvirt/images`. If you don't care that much about disk performance and accept the penalty, you can skip this section. When it comes to flexibility, `qcow2` images are still the easiest way to handle Virtual Machine images locally on a host.
To manage the storage for the VMs I'll use the Linux Logical Volume Manager (LVM). For this I'll create a Volume Group (VG). A VG is normally built on top of one or more disks (NVMe drives, SSDs or HDDs). Later I can create volumes for my VMs, which will be Logical Volumes in LVM. This is very flexible, as one can create, change or delete these volumes as needed.
In my hosts there is one NVMe drive (partly used by the host OS already) and one SSD. To figure out which disks/partitions are available, the `lsblk` command can be used:

```shell
ansible -m command -a "lsblk" k8s-010100

k8s-010100.example.com | CHANGED | rc=0 >>
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0 931.5G  0 disk
└─sda1        8:1    0 931.5G  0 part
zram0       254:0    0     4G  0 disk [SWAP]
nvme0n1     259:0    0   1.8T  0 disk
├─nvme0n1p1 259:1    0   512M  0 part /boot
├─nvme0n1p2 259:2    0   100G  0 part /
└─nvme0n1p3 259:3    0   1.7T  0 part
```
So `nvme0n1` (the NVMe disk) has around 1.7 TByte free and `sda` (the SSD) basically only has an empty partition with 931 GByte free. For the root (`/`) file system of the VMs the SSD is good enough. The NVMe partition `nvme0n1p3` I reserve for services running in my Kubernetes cluster that need more speed, like databases and object storage. Creating partitions can be done with tools like cfdisk.
I’ll create two LVM Volume Groups. The first one will be called ssd01
and the second one nvme01
. That makes it easy to distinguish later what the faster and what’s the slower storage. It also makes it easy to add more drives/logical volumes later by just counting up.
I’ll use my Ansbile role githubixx.lvm to setup LVM and the Volume Groups. So lets install this role:
ansible-galaxy role install githubixx.lvm
Next I extend the Ansible playbook file `k8s.yml`. It looks like this now:

```yaml
---
-
  hosts: k8s_01
  roles:
    -
      role: githubixx.harden_linux
      tags: role-harden-linux
    -
      role: githubixx.lvm
      tags: role-lvm
```
To enable the role to create the Volume Groups and Logical Volumes I create a variable file for each host in `host_vars` accordingly. Since the Logical Volume names are different for every host, they can't be defined in `group_vars`. So for the first host the file `host_vars/k8s-010100.example.com` will look like this:

```yaml
lvm_vgs:
  - vgname: ssd01
    pvs: /dev/sda1
    state: present
    lvm_lvs:
      - lvname: k8s-010101
        size: 25G
        state: present
      - lvname: k8s-010102
        size: 25G
        state: present
      - lvname: k8s-010103
        size: 100G
        state: present
  - vgname: nvme01
    pvs: /dev/nvme0n1p3
    state: present
```
For the hosts `k8s-010200.example.com` and `k8s-010300.example.com` it will look similar, but the `lvname`s will be `k8s-010201`, `k8s-010202` and `k8s-010203` for host `k8s-010200`, for example.
The configuration starts with the top level key `lvm_vgs` which contains the definition of one or more Volume Groups. There are two Volume Groups as already mentioned above: `ssd01` with the one partition `/dev/sda1` on the SSD and `nvme01` with the one partition `/dev/nvme0n1p3` on the NVMe drive. While I don't add any Logical Volumes to Volume Group `nvme01` for now, I add three to `ssd01`. As mentioned above, I'll put all root (`/`) partitions of my VMs on the `ssd01` Volume Group. The first one, `k8s-010101`, will be for the `etcd` Virtual Machine. The second, `k8s-010102`, will be for the first Kubernetes controller node, and the third one, `k8s-010103`, for the Kubernetes worker node. Every physical host will run one of each of these workloads, so three times `etcd`, Kubernetes controller and worker in total. 25 GByte for `etcd` and the Kubernetes controller should be sufficient. For the Kubernetes worker it makes sense to have more space available, as it also needs to store the container images, and you might also need some host volumes mounted into your Pods.
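For reference, what the role configures here corresponds roughly to the following LVM commands (a sketch with the example device names and sizes from this post; the `run` helper only prints the commands, replace it with direct execution as root on the physical host):

```shell
# Print the LVM commands that correspond to the host_vars above for
# k8s-010100. "run" echoes instead of executing, so this is safe to try.
run() { echo "$@"; }

run pvcreate /dev/sda1                     # SSD partition becomes a physical volume
run vgcreate ssd01 /dev/sda1               # volume group for the VM root disks
run lvcreate -L 25G -n k8s-010101 ssd01    # etcd VM
run lvcreate -L 25G -n k8s-010102 ssd01    # K8s control plane VM
run lvcreate -L 100G -n k8s-010103 ssd01   # K8s worker VM (container images need space)

run pvcreate /dev/nvme0n1p3                # NVMe partition as physical volume
run vgcreate nvme01 /dev/nvme0n1p3         # faster storage, no logical volumes yet
```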
Now lets set up the Volume Groups and Logical Volumes:

```shell
ansible-playbook -t role-lvm k8s.yml
```
If everything ran smoothly there should now be two Volume Groups on each host. This can be checked with the vgs command:

```shell
ansible -m command -a "/usr/bin/vgs" k8s_01

k8s-010100.example.com | CHANGED | rc=0 >>
  VG     #PV #LV #SN Attr   VSize    VFree
  nvme01   1   0   0 wz--n-    1.72t    1.72t
  ssd01    1   0   0 wz--n- <931.51g <931.51g

k8s-010200.example.com | CHANGED | rc=0 >>
  VG     #PV #LV #SN Attr   VSize    VFree
  nvme01   1   0   0 wz--n-    1.72t    1.72t
  ssd01    1   0   0 wz--n- <931.51g <931.51g

k8s-010300.example.com | CHANGED | rc=0 >>
  VG     #PV #LV #SN Attr   VSize    VFree
  nvme01   1   0   0 wz--n-    1.72t    1.72t
  ssd01    1   0   0 wz--n- <931.51g <931.51g
```
And there should be three Logical Volumes on each host. This can be checked with the lvs command:

```shell
ansible -m command -a "lvs" k8s_01

k8s-010200.example.com | CHANGED | rc=0 >>
  LV         VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  k8s-010201 ssd01 -wi-a-----  25.00g
  k8s-010202 ssd01 -wi-a-----  25.00g
  k8s-010203 ssd01 -wi-a----- 100.00g

k8s-010300.example.com | CHANGED | rc=0 >>
  LV         VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  k8s-010301 ssd01 -wi-a-----  25.00g
  k8s-010302 ssd01 -wi-a-----  25.00g
  k8s-010303 ssd01 -wi-a----- 100.00g

k8s-010100.example.com | CHANGED | rc=0 >>
  LV         VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  k8s-010101 ssd01 -wi-a-----  25.00g
  k8s-010102 ssd01 -wi-a-----  25.00g
  k8s-010103 ssd01 -wi-a----- 100.00g
```
In the next blog post I’ll do some preparation needed for the Virtual Machines.