Kubernetes the not so hard way with Ansible - Control plane - (K8s v1.28)

This post is based on Kelsey Hightower’s Bootstrapping the Kubernetes Control Plane.

This time I’ll install the Kubernetes Control Plane services (that’s kube-apiserver, kube-scheduler and kube-controller-manager). All these components will run on every controller node. In the Kubernetes certificate authority blog post I installed the PKI (public key infrastructure) in order to secure the communication between the Kubernetes components. As with the etcd cluster I’ll use a certificate authority and the generated certificates, but this time for kube-apiserver, for which I generated a separate CA and certificates. If you used the default values in the other playbooks so far, you most likely don’t need to change too many of the default variable settings.

As this role needs to connect to the kube-apiserver at the very end to install ClusterRoles and ClusterRoleBindings, it uses Ansible’s kubernetes.core.k8s module. That in turn needs a few Python modules installed, so I’ll install them in my k8s-01_vms Python venv, e.g.:

bash

python3 -m pip install kubernetes pyyaml jsonpatch

As you might remember I’ve prepared three Virtual Machines to install the Ansible kubernetes_controller role on them. That means three times kube-apiserver, kube-scheduler and kube-controller-manager. The kube-apiserver daemons are used by basically all other kube-* services. To make them highly available I’ll install HAProxy on every node. HAProxy is a free, very fast and reliable reverse-proxy offering high availability, load balancing, and proxying for TCP and HTTP-based applications. I’ll use the TCP load balancing feature. Every Kubernetes service will connect to its local HAProxy on port 16443 (the default). HAProxy will then forward the requests to one of the three available kube-apiservers. In case one of the kube-apiservers is down (e.g. for maintenance), HAProxy will automatically detect this and forward the requests to one of the remaining kube-apiservers. In case HAProxy itself dies, systemd tries to restart it. In the very rare case that HAProxy still has issues, only the one K8s Worker node it runs on fails.

You can of course implement a way more sophisticated setup, e.g. as described in Achieving High Availability with HAProxy and Keepalived: Building a Redundant Load Balancer. This would allow you to have a single VIP address (virtual IP address) that will be “moved” around in case an instance fails. But this is out of scope of this tutorial. Or if you have a hardware loadbalancer or another type of loadbalancer like Google Loadbalancer, you can of course use that too.
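To make the failover behaviour a bit more tangible, here is a minimal Python sketch of the idea: try the backends in round-robin order and skip the ones whose health check fails. This is not how HAProxy is implemented; the function names and the backend IP addresses are hypothetical and only serve as an illustration.

```python
import socket
from typing import Callable, List, Optional, Tuple

Backend = Tuple[str, int]


def tcp_check(backend: Backend, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the backend can be opened."""
    try:
        with socket.create_connection(backend, timeout=timeout):
            return True
    except OSError:
        return False


def next_backend(
    backends: List[Backend],
    last_index: int,
    is_up: Callable[[Backend], bool] = tcp_check,
) -> Optional[Tuple[int, Backend]]:
    """Return (index, backend) of the next healthy backend after last_index.

    Returns None if no backend passes the health check.
    """
    count = len(backends)
    for offset in range(1, count + 1):
        index = (last_index + offset) % count
        if is_up(backends[index]):
            return index, backends[index]
    return None


# Hypothetical kube-apiserver addresses (assumption, adjust to your hosts).
backends = [("192.168.11.3", 6443), ("192.168.11.4", 6443), ("192.168.11.5", 6443)]

# Simulate that the second kube-apiserver is down for maintenance.
down = {backends[1]}
simulated_check = lambda backend: backend not in down

first = next_backend(backends, 0, simulated_check)    # skips the backend that is down
second = next_backend(backends, 2, simulated_check)   # wraps around to the first one
```

With the simulated check the second backend is never selected, which is the behaviour I rely on when one kube-apiserver is taken down for maintenance.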

But for my use case HAProxy as described above is good enough. The Ansible haproxy role has just three default variables:

yaml

# Address which haproxy service is listening for incoming requests
haproxy_frontend_bind_address: "127.0.0.1"

# Port which haproxy is listening for incoming requests
haproxy_frontend_port: "16443"

# Port where kube-apiserver is listening
haproxy_k8s_api_endpoint_port: "6443"

This role is really just meant to be used for this single use case: install HAProxy and configure it so that it listens on 127.0.0.1:16443 and forwards all requests to one of the three kube-apiservers on the Kubernetes Control Plane nodes on port 6443. As you can see above these are the defaults and they should just work, but you can of course adjust them. Since I keep the defaults I don’t need to change any variables and can just continue by installing the role first:

bash

ansible-galaxy role install githubixx.haproxy

Then I’ll extend the playbook k8s.yml accordingly. I’ll add the role to the k8s_worker group. This group includes the Control Plane and Worker nodes as you might remember. So in the end I’ll have HAProxy running on every Kubernetes node besides the etcd nodes. So let’s add the role to the playbook file (harden_linux I already added in one of the previous blog posts):

yaml

-
  hosts: k8s_worker
  roles:
    -
      role: githubixx.harden_linux
      tags: role-harden-linux
    -
      role: githubixx.haproxy
      tags: role-haproxy

Next I’ll deploy the role:

bash

ansible-playbook --tags=role-haproxy k8s.yml

If you check the HAProxy logs they’ll tell you that no backend server is available. That’s of course true currently as kube-apiserver isn’t there yet.

Before I change a few default variables of the kubernetes_controller role, a few firewall changes are needed again. Since I only filter inbound traffic and allow outbound traffic by default, I only need to take care of the former. Since I want to allow external HTTP, HTTPS and SMTP traffic later I’ll open ports 80, 443 and 25 accordingly. So in my case the final firewall rules will now look like this (I’ll put them into group_vars/k8s_worker.yml which includes Control Plane and Worker nodes - remember that the Control Plane nodes are also running certain important workloads like Cilium):

yaml

harden_linux_ufw_rules:
  - rule: "allow"
    to_port: "22222"
    protocol: "tcp"
  - rule: "allow"
    to_port: "51820"
    protocol: "udp"
  - rule: "allow"
    to_port: "80"
    protocol: "tcp"
  - rule: "allow"
    to_port: "443"
    protocol: "tcp"
  - rule: "allow"
    to_port: "25"
    protocol: "tcp"

Just as a reminder: In one of the previous blog posts Harden the instances I also added

yaml

harden_linux_ufw_allow_networks:
  - "10.0.0.0/8"
  - "192.168.11.0/24"

and the IP of my Ansible Controller node. It can be challenging to understand exactly how the Kubernetes Networking Model is expected to work, and kube-proxy and Cilium (and other K8s network implementations) do lots of “funny” stuff behind the curtain 😉 E.g. the Kubernetes worker nodes need to be able to communicate with each other and in my case that’s the IP range 192.168.11.0/24. Then you’ve Cilium. Its IPAM will allocate /24 network blocks out of the 10.0.0.0/8 range (see Cilium’s cluster-pool-ipv4-cidr and cluster-pool-ipv4-mask-size settings). Those /24 IP blocks will be assigned to every node and the Pods running there will get an IP out of that /24 range. So Pod to Pod communication also needs to work. Then you’ve Pod to Service communication. In my case (and the default of my Ansible K8s roles) that’s the IP range 10.32.0.0/16. Then you’ve External to Service communication to make Kubernetes Services available to the “outside” world. There are various possibilities to do so, e.g. Ingress, the newer Gateway API or a Cloud provider Loadbalancer implementation like Google loadbalancer. And so on 😉 So with 10.0.0.0/8 and 192.168.11.0/24 in harden_linux_ufw_allow_networks I’ll cover all those communication requirements. If you think that’s too much you can play around with the firewall rules, but it’s really easy to shoot yourself in the foot 😄 Maybe the GKE cluster firewall rules give you a good starting point. Nevertheless it’s still possible to control the traffic flow later, e.g. with Kubernetes Network Policies. And CiliumNetworkPolicy goes even further. The Network Policy Editor might also be of great help to design such Network Policies.

With that said, to apply the changes I’ll run:
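If you want to convince yourself that the two networks in harden_linux_ufw_allow_networks really cover the ranges mentioned above, a few lines of Python with the standard ipaddress module are enough. This is only a sketch: the function name is_allowed is made up, and the Pod block 10.0.3.0/24 is a hypothetical example of a block Cilium might hand out.

```python
import ipaddress

# The networks from "harden_linux_ufw_allow_networks".
allowed_networks = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.11.0/24"),
]


def is_allowed(cidr_or_ip: str) -> bool:
    """Return True if the given IP or network lies inside one of the allowed networks."""
    candidate = ipaddress.ip_network(cidr_or_ip, strict=False)
    return any(candidate.subnet_of(network) for network in allowed_networks)


service_range_ok = is_allowed("10.32.0.0/16")   # Pod to Service communication
pod_block_ok = is_allowed("10.0.3.0/24")        # hypothetical /24 Pod block of one node
node_ok = is_allowed("192.168.11.3")            # a Kubernetes node
other_ok = is_allowed("172.16.0.1")             # some unrelated address
```

Everything except the last address is covered by the two allowed networks.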

bash

ansible-playbook --tags=role-harden-linux k8s.yml

Next let’s start working on the relevant kubernetes_controller role. Below are the default variables for this role (I’ll discuss what values I changed afterwards):

yaml

# The base directory for Kubernetes configuration and certificate files for
# everything control plane related. After the playbook is done this directory
# contains various sub-folders.
k8s_ctl_conf_dir: "/etc/kubernetes/controller"

# All certificate files (Public Key Infrastructure related) specified in
# "k8s_ctl_certificates" and "k8s_ctl_etcd_certificates" (see "vars/main.yml")
# will be stored here. Owner of this new directory will be "root". Group will
# be the group specified in "k8s_run_as_group". The files in this directory
# will be owned by "root" and group as specified in "k8s_run_as_group". File
# permissions will be "0640".
k8s_ctl_pki_dir: "{{ k8s_ctl_conf_dir }}/pki"

# The directory to store the Kubernetes binaries (see "k8s_ctl_binaries"
# variable in "vars/main.yml"). Owner and group of this new directory
# will be "root" in both cases. Permissions for this directory will be "0755".
#
# NOTE: The default directory "/usr/local/bin" normally already exists on every
# Linux installation with the owner, group and permissions mentioned above. If
# your current settings are different consider a different directory. But make sure
# that the new directory is included in your "$PATH" variable value.
k8s_ctl_bin_dir: "/usr/local/bin"

# The Kubernetes release.
k8s_ctl_release: "1.28.5"

# The interface on which the Kubernetes services should listen on. As all cluster
# communication should use a VPN interface the interface name is
# normally "wg0" (WireGuard),"peervpn0" (PeerVPN) or "tap0".
#
# The network interface on which the Kubernetes control plane services should
# listen on. That is:
#
# - kube-apiserver
# - kube-scheduler
# - kube-controller-manager
#
k8s_interface: "eth0"

# Run Kubernetes control plane service (kube-apiserver, kube-scheduler,
# kube-controller-manager) as this user.
#
# If you want to use a "secure-port" < 1024 for "kube-apiserver" you most
# probably need to run "kube-apiserver" as user "root" (not recommended).
#
# If the user specified in "k8s_run_as_user" does not exist then the role
# will create it. If the user already exists the role will not create it
# but it will adjust its UID/GID and shell if specified (see settings below).
# So make sure that UID, GID and shell match the existing user if you don't
# want that user to be changed.
#
# Additionally if "k8s_run_as_user" is "root" then this role won't touch the user
# at all.
k8s_run_as_user: "k8s"

# UID of user specified in "k8s_run_as_user". If not specified the next available
# UID from "/etc/login.defs" will be taken (see "SYS_UID_MAX" setting in that file).
# k8s_run_as_user_uid: "999"

# Shell for specified user in "k8s_run_as_user". For increased security keep
# the default.
k8s_run_as_user_shell: "/bin/false"

# Specifies if the user specified in "k8s_run_as_user" will be a system user (default)
# or not. If "true" the "k8s_run_as_user_home" setting will be ignored. In general
# it makes sense to keep the default as there should be no need to login as
# the user that runs kube-apiserver, kube-scheduler or kube-controller-manager.
k8s_run_as_user_system: true

# Home directory of user specified in "k8s_run_as_user". Will be ignored if
# "k8s_run_as_user_system" is set to "true". In this case no home directory will
# be created. Normally not needed.
# k8s_run_as_user_home: "/home/k8s"

# Run Kubernetes daemons (kube-apiserver, kube-scheduler, kube-controller-manager)
# as this group.
#
# Note: If the group specified in "k8s_run_as_group" does not exist then the role
# will create it. Only if the group already exists the role will not create it
# but will adjust GID if specified in "k8s_run_as_group_gid" (see setting below).
k8s_run_as_group: "k8s"

# GID of group specified in "k8s_run_as_group". If not specified the next available
# GID from "/etc/login.defs" will be taken (see "SYS_GID_MAX" setting in that file).
# k8s_run_as_group_gid: "999"

# Specifies if the group specified in "k8s_run_as_group" will be a system group (default)
# or not.
k8s_run_as_group_system: true

# By default all tasks that need to communicate with the Kubernetes
# cluster are executed on local host (127.0.0.1). But if that one
# doesn't have a direct connection to the K8s cluster or the tasks should be
# executed elsewhere this variable can be changed accordingly.
k8s_ctl_delegate_to: "127.0.0.1"

# The IP address or hostname of the Kubernetes API endpoint. This variable
# is used by "kube-scheduler" and "kube-controller-manager" to connect
# to the "kube-apiserver" (Kubernetes API server).
#
# By default the first host in the Ansible group "k8s_controller" is
# specified here. NOTE: This setting is not fault tolerant! That means
# if the first host in the Ansible group "k8s_controller" is down
# the worker node and its workload continue working but the worker
# node doesn't receive any updates from Kubernetes API server.
#
# If you have a loadbalancer that distributes traffic between all
# Kubernetes API servers it should be specified here (either its IP
# address or the DNS name). But you need to make sure that the IP
# address or the DNS name you want to use here is included in the
# Kubernetes API server TLS certificate (see "k8s_apiserver_cert_hosts"
# variable of https://github.com/githubixx/ansible-role-kubernetes-ca
# role). If it's not specified you'll get certificate errors in the
# logs of the services mentioned above.
k8s_ctl_api_endpoint_host: "{% set controller_host = groups['k8s_controller'][0] %}{{ hostvars[controller_host]['ansible_' + hostvars[controller_host]['k8s_interface']].ipv4.address }}"

# As above just for the port. It specifies on which port the
# Kubernetes API servers are listening. Again if there is a loadbalancer
# in place that distributes the requests to the Kubernetes API servers
# put the port of the loadbalancer here.
k8s_ctl_api_endpoint_port: "6443"

# Normally  "kube-apiserver", "kube-controller-manager" and "kube-scheduler" log
# to "journald". But there are exceptions like the audit log. For this kind of
# log files this directory will be used as a base path. The owner and group
# of this directory will be the one specified in "k8s_run_as_user" and "k8s_run_as_group"
# as these services run as this user and need permissions to create log files
# in this directory.
k8s_ctl_log_base_dir: "/var/log/kubernetes"

# Permissions for directory specified in "k8s_ctl_log_base_dir"
k8s_ctl_log_base_dir_mode: "0770"

# The port the control plane components should connect to etcd cluster
k8s_ctl_etcd_client_port: "2379"

# The interface the etcd cluster is listening on
k8s_ctl_etcd_interface: "eth0"

# The location of the directory where the Kubernetes certificates are stored.
# These certificates were generated by the "kubernetes_ca" Ansible role if you
# haven't used a different method to generate these certificates. So this
# directory is located on the Ansible controller host. That's normally the
# host where "ansible-playbook" gets executed. "k8s_ca_conf_directory" is used
# by the "kubernetes_ca" Ansible role to store the certificates. So it's
# assumed that this variable is already set.
k8s_ctl_ca_conf_directory: "{{ k8s_ca_conf_directory }}"

# Directory where "admin.kubeconfig" (the credentials file) for the "admin" user
# is stored. By default this directory (and its "kubeconfig" file) will
# be stored on the host specified in "k8s_ctl_delegate_to". By default this
# is "127.0.0.1". So if you run "ansible-playbook" locally e.g. the directory
# and file will be created there.
#
# By default the value of this variable will expand to the user's local $HOME
# plus "/k8s/configs". That means if the user's $HOME directory is e.g.
# "/home/da_user" then "k8s_admin_conf_dir" will have a value of
# "/home/da_user/k8s/configs".
k8s_admin_conf_dir: "{{ '~/k8s/configs' | expanduser }}"

# Permissions for the directory specified in "k8s_admin_conf_dir"
k8s_admin_conf_dir_perm: "0700"

# Owner of the directory specified in "k8s_admin_conf_dir" and for
# "admin.kubeconfig" stored in this directory.
k8s_admin_conf_owner: "root"

# Group of the directory specified in "k8s_admin_conf_dir" and for
# "admin.kubeconfig" stored in this directory.
k8s_admin_conf_group: "root"

# Host where the "admin" user connects to administer the K8s cluster. 
# This setting is written into "admin.kubeconfig". This allows to use
# a different host/loadbalancer as the K8s services which might use an internal
# loadbalancer while the "admin" user connects to a different host/loadbalancer
# that distributes traffic to the "kube-apiserver" e.g.
#
# Besides that basically the same comments as for "k8s_ctl_api_endpoint_host"
# variable apply.
k8s_admin_api_endpoint_host: "{% set controller_host = groups['k8s_controller'][0] %}{{ hostvars[controller_host]['ansible_' + hostvars[controller_host]['k8s_interface']].ipv4.address }}"

# As above just for the port.
k8s_admin_api_endpoint_port: "6443"

# Directory to store "kube-apiserver" audit logs (if enabled). The owner and
# group of this directory will be the one specified in "k8s_run_as_user"
# and "k8s_run_as_group".
k8s_apiserver_audit_log_dir: "{{ k8s_ctl_log_base_dir }}/kube-apiserver"

# The directory to store "kube-apiserver" configuration.
k8s_apiserver_conf_dir: "{{ k8s_ctl_conf_dir }}/kube-apiserver"

# "kube-apiserver" daemon settings (can be overridden or additional added by defining
# "k8s_apiserver_settings_user")
k8s_apiserver_settings:
  "advertise-address": "{{ hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address }}"
  "bind-address": "{{ hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address }}"
  "secure-port": "6443"
  "enable-admission-plugins": "NodeRestriction,NamespaceLifecycle,LimitRanger,ServiceAccount,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,PersistentVolumeClaimResize,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ValidatingAdmissionPolicy,ResourceQuota,PodSecurity,Priority,StorageObjectInUseProtection,RuntimeClass,CertificateApproval,CertificateSigning,ClusterTrustBundleAttest,CertificateSubjectRestriction,DefaultIngressClass"
  "allow-privileged": "true"
  "authorization-mode": "Node,RBAC"
  "audit-log-maxage": "30"
  "audit-log-maxbackup": "3"
  "audit-log-maxsize": "100"
  "audit-log-path": "{{ k8s_apiserver_audit_log_dir }}/audit.log"
  "event-ttl": "1h"
  "kubelet-preferred-address-types": "InternalIP,Hostname,ExternalIP"  # "--kubelet-preferred-address-types" defaults to:
                                                                       # "Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP"
                                                                       # Needs to be changed to make "kubectl logs" and "kubectl exec" work.
  "runtime-config": "api/all=true"
  "service-cluster-ip-range": "10.32.0.0/16"
  "service-node-port-range": "30000-32767"
  "client-ca-file": "{{ k8s_ctl_pki_dir }}/ca-k8s-apiserver.pem"
  "etcd-cafile": "{{ k8s_ctl_pki_dir }}/ca-etcd.pem"
  "etcd-certfile": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver-etcd.pem"
  "etcd-keyfile": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver-etcd-key.pem"
  "encryption-provider-config": "{{ k8s_apiserver_conf_dir }}/encryption-config.yaml"
  "encryption-provider-config-automatic-reload": "true"
  "kubelet-certificate-authority": "{{ k8s_ctl_pki_dir }}/ca-k8s-apiserver.pem"
  "kubelet-client-certificate": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver.pem"
  "kubelet-client-key": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver-key.pem"
  "service-account-key-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-controller-manager-sa.pem"
  "service-account-signing-key-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-controller-manager-sa-key.pem"
  "service-account-issuer": "https://{{ groups.k8s_controller | first }}:6443"
  "tls-cert-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver.pem"
  "tls-private-key-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver-key.pem"

# This is the content of "encryption-config.yaml". Used by "kube-apiserver"
# (see "encryption-provider-config" option in "k8s_apiserver_settings").
# "kube-apiserver" will use this configuration to encrypt data before storing
# it in etcd (encrypt data at-rest).
#
# The configuration below is a usable example but might not fit your needs.
# So please review carefully! E.g. you might want to replace "aescbc" provider
# with a different one like "secretbox". As you can see this configuration only
# encrypts "secrets" at-rest. But it's also possible to encrypt other K8s
# resources. NOTE: "identity" provider doesn't encrypt anything! That means
# plain text. In the configuration below it's used as fallback.
#
# If you keep the default defined below please make sure to specify the
# variable "k8s_encryption_config_key" somewhere (e.g. "group_vars/all.yml" or
# even better use "ansible-vault" to store these kind of secrets).
# This needs to be a base64 encoded value. To create such a value on Linux
# run the following command:
#
# head -c 32 /dev/urandom | base64
#
# For a detailed description please visit:
# https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/
#
# How to rotate the encryption key or to implement encryption at-rest in
# an existing K8s cluster please visit:
# https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/#rotating-a-decryption-key
k8s_apiserver_encryption_provider_config: |
  ---
  kind: EncryptionConfiguration
  apiVersion: apiserver.config.k8s.io/v1
  resources:
    - resources:
        - secrets
      providers:
        - aescbc:
            keys:
              - name: key1
                secret: {{ k8s_encryption_config_key }}
        - identity: {}  

# The directory to store controller manager configuration.
k8s_controller_manager_conf_dir: "{{ k8s_ctl_conf_dir }}/kube-controller-manager"

# K8s controller manager settings (can be overridden or additional added by defining
# "k8s_controller_manager_settings_user")
k8s_controller_manager_settings:
  "bind-address": "{{ hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address }}"
  "secure-port": "10257"
  "cluster-cidr": "10.200.0.0/16"
  "allocate-node-cidrs": "true"
  "cluster-name": "kubernetes"
  "authentication-kubeconfig": "{{ k8s_controller_manager_conf_dir }}/kubeconfig"
  "authorization-kubeconfig": "{{ k8s_controller_manager_conf_dir }}/kubeconfig"
  "kubeconfig": "{{ k8s_controller_manager_conf_dir }}/kubeconfig"
  "leader-elect": "true"
  "service-cluster-ip-range": "10.32.0.0/16"
  "cluster-signing-cert-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver.pem"
  "cluster-signing-key-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-apiserver-key.pem"
  "root-ca-file": "{{ k8s_ctl_pki_dir }}/ca-k8s-apiserver.pem"
  "requestheader-client-ca-file": "{{ k8s_ctl_pki_dir }}/ca-k8s-apiserver.pem"
  "service-account-private-key-file": "{{ k8s_ctl_pki_dir }}/cert-k8s-controller-manager-sa-key.pem"
  "use-service-account-credentials": "true"

# The directory to store scheduler configuration.
k8s_scheduler_conf_dir: "{{ k8s_ctl_conf_dir }}/kube-scheduler"

# kube-scheduler settings
k8s_scheduler_settings:
  "bind-address": "{{ hostvars[inventory_hostname]['ansible_' + k8s_interface].ipv4.address }}"
  "config": "{{ k8s_scheduler_conf_dir }}/kube-scheduler.yaml"
  "authentication-kubeconfig": "{{ k8s_scheduler_conf_dir }}/kubeconfig"
  "authorization-kubeconfig": "{{ k8s_scheduler_conf_dir }}/kubeconfig"
  "requestheader-client-ca-file": "{{ k8s_ctl_pki_dir }}/ca-k8s-apiserver.pem"

# These security/sandbox related settings will be used for
# "kube-apiserver", "kube-scheduler" and "kube-controller-manager"
# systemd units. These options will be placed in the "[Service]" section.
# The default settings should be just fine for increased security of the
# mentioned services. So it makes sense to keep them if possible.
#
# For more information see:
# https://www.freedesktop.org/software/systemd/man/systemd.service.html#Options
#
# The options below "RestartSec=5" are mostly security/sandbox related settings
# and limit the exposure of the system towards the unit's processes. You can add
# or remove options as needed of course. For more information see:
# https://www.freedesktop.org/software/systemd/man/systemd.exec.html
k8s_ctl_service_options:
  - User={{ k8s_run_as_user }}
  - Group={{ k8s_run_as_group }}
  - Restart=on-failure
  - RestartSec=5
  - NoNewPrivileges=true
  - ProtectHome=true
  - PrivateTmp=true
  - PrivateUsers=true
  - ProtectSystem=full
  - ProtectClock=true
  - ProtectKernelModules=true
  - ProtectKernelTunables=true
  - ProtectKernelLogs=true
  - ProtectControlGroups=true
  - ProtectHostname=true
  - ProtectControlGroups=true
  - RestrictNamespaces=true
  - RestrictRealtime=true
  - RestrictSUIDSGID=true
  - CapabilityBoundingSet=~CAP_SYS_PTRACE
  - CapabilityBoundingSet=~CAP_KILL
  - CapabilityBoundingSet=~CAP_MKNOD
  - CapabilityBoundingSet=~CAP_SYS_CHROOT
  - CapabilityBoundingSet=~CAP_SYS_ADMIN
  - CapabilityBoundingSet=~CAP_SETUID
  - CapabilityBoundingSet=~CAP_SETGID
  - CapabilityBoundingSet=~CAP_SETPCAP
  - CapabilityBoundingSet=~CAP_CHOWN
  - SystemCallFilter=@system-service
  - ReadWritePaths=-/usr/libexec/kubernetes

I’ll put the variables I want to change into group_vars/k8s_controller.yml with a few exceptions. k8s_interface is already defined in group_vars/all.yml, so I don’t need to redefine this variable anywhere else. As I use WireGuard the value of this variable will be wg0 so that all Kubernetes Control Plane services bind to this interface. k8s_ca_conf_directory is also already defined in group_vars/all.yml and I’ll keep that as is. The role will search for the certificates I created in the certificate authority blog post in the directory specified in k8s_ca_conf_directory on the Ansible Controller node. The variable k8s_apiserver_csr_cn is defined in group_vars/all.yml too. This variable is needed when the certificates are created (so that’s the kubernetes_ca role) and when a ClusterRoleBinding called system:kube-apiserver is created (that happens in the kubernetes_controller role). As the kube-apiserver needs to be able to communicate with the kubelet on the K8s worker nodes, e.g. to fetch the logs of Pods, I need to make sure that the “common name” (CN) used by kube-apiserver matches the “user name” in the ClusterRoleBinding.

I’ll also change the cluster name which is kubernetes by default. As this variable is used by the kubernetes_controller and kubernetes_worker roles I’ll put it in group_vars/all.yml. E.g.:

yaml

k8s_config_cluster_name: "k8s-01"

Above I’ve installed the HAProxy load balancer. It runs on all Kubernetes Controller and Worker nodes, listens on the localhost interface on port 16443 and forwards requests to one of the three kube-apiservers. To make kube-scheduler and kube-controller-manager use HAProxy I’ll override two variables:

yaml

k8s_ctl_api_endpoint_host: "127.0.0.1"
k8s_ctl_api_endpoint_port: "16443"

In general it’s a good idea to pin the Kubernetes version in case the default changes. Since etcd is using the same WireGuard network as all the Kubernetes services I set k8s_ctl_etcd_interface to the value of k8s_interface:

yaml

k8s_ctl_release: "1.28.5"
k8s_ctl_etcd_interface: "{{ k8s_interface }}"

kube-apiserver, kube-scheduler and kube-controller-manager will by default run as user/group k8s (see k8s_run_as_user and k8s_run_as_group variables). I’ll keep the default here but set a UID and GID for this user and group:

yaml

k8s_run_as_user_uid: "888"
k8s_run_as_group_gid: "888"

The role will also create certificates and a kubeconfig file for the admin user. This is the very first Kubernetes user so to say. Store the files in a safe place. In my case the file admin.kubeconfig will end up in the directory kubeconfig in my Python venv on my laptop, which is my Ansible Controller node. The kubeconfig file also contains the connection settings that tell kubectl which kube-apiserver to connect to. In my case I tell kubectl to connect to the first Kubernetes Controller node on port 6443. As I’m not part of the WireGuard VPN it’s not the internal (i) host name but the public (p) one (so the one you can reach from the network where the Ansible Controller is located). So I’ll set the following variables:

yaml

k8s_admin_conf_dir: "/opt/scripts/ansible/k8s-01_vms/kubeconfig"
k8s_admin_conf_dir_perm: "0770"
k8s_admin_conf_owner: "..."
k8s_admin_conf_group: "..."
k8s_admin_api_endpoint_host: "k8s-010102.p.example.com"
k8s_admin_api_endpoint_port: "6443"

The kube-apiserver settings defined in k8s_apiserver_settings can be overridden by defining a variable called k8s_apiserver_settings_user. You can also add additional settings for the kube-apiserver daemon by using this variable. I’ll change only two settings:

yaml

k8s_apiserver_settings_user:
  "bind-address": "0.0.0.0"
  "service-account-issuer": "https://{{ groups.k8s_controller | first }}:6443"

Setting bind-address to 0.0.0.0 makes kube-apiserver listen on all interfaces and therefore makes it available to the outside world. If you leave the default it will bind to the WireGuard interface wg0 or whatever interface you specified in k8s_interface. The service-account-issuer flag might get interesting if you use Kubernetes as an OIDC provider, e.g. for Vault. Whatever service wants to use Kubernetes as an OIDC provider must be able to communicate with the URL specified in service-account-issuer. By default it’s the first hostname in the Ansible k8s_controller group. If you have a loadbalancer for kube-apiserver then it might make sense to use a hostname that points to the loadbalancer IP and have a DNS entry for that IP like api.k8s-01.example.com. Just make sure that you included this hostname in the certificate for the kube-apiserver.
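The override mechanism can be pictured as a simple dictionary merge in which the user values win. The following Python sketch only illustrates that idea; I’m assuming a plain key-by-key merge here, the function names are made up, and the default bind-address 10.0.11.3 is a hypothetical WireGuard IP.

```python
from typing import Dict, List


def merge_settings(defaults: Dict[str, str], user: Dict[str, str]) -> Dict[str, str]:
    """Return a new dict with the defaults overridden/extended by the user settings."""
    merged = dict(defaults)
    merged.update(user)
    return merged


def to_flags(settings: Dict[str, str]) -> List[str]:
    """Render the settings as command line flags (sorted for a stable result)."""
    return [f"--{name}={value}" for name, value in sorted(settings.items())]


# A small excerpt of default settings (the bind-address value is hypothetical).
defaults = {
    "bind-address": "10.0.11.3",
    "secure-port": "6443",
    "authorization-mode": "Node,RBAC",
}

# What would go into "k8s_apiserver_settings_user".
user = {"bind-address": "0.0.0.0"}

merged = merge_settings(defaults, user)
flags = to_flags(merged)
```

Only bind-address changes in the result; all other defaults stay as they are, and the defaults dictionary itself is not modified.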

The same is true for the kube-controller-manager by adding entries to k8s_controller_manager_settings_user variable. There I’ll only change one setting:

yaml

k8s_controller_manager_settings_user:
  "cluster-name": "{{ k8s_config_cluster_name }}"

For kube-scheduler add entries to the k8s_scheduler_settings_user variable to override settings in the k8s_scheduler_settings dictionary or to add new ones.

You definitely need to set k8s_encryption_config_key. It is needed for k8s_apiserver_encryption_provider_config, which will look like this in my case:

yaml

k8s_encryption_config_key: "..."

k8s_apiserver_encryption_provider_config: |
  ---
  kind: EncryptionConfiguration
  apiVersion: apiserver.config.k8s.io/v1
  resources:
    - resources:
        - secrets
        - configmaps
      providers:
        - secretbox:
            keys:
              - name: key1
                secret: {{ k8s_encryption_config_key }}
        - identity: {}  

To create a key for k8s_encryption_config_key use the command head -c 32 /dev/urandom | base64. Think about storing this value in ansible-vault as it is pretty important. Now what is this all about? As you know Kubernetes stores its state and resources in etcd. Two of these resources are e.g. Secrets and ConfigMaps. But by default these Secrets will be stored in plain text in etcd which is not really what you want 😉 And that’s what EncryptionConfiguration is for. In resources you specify which Kubernetes resources should be encrypted “at rest”, which means they’re encrypted before they’re transferred to and stored in etcd. In my case that’s Kubernetes resources of type secrets and configmaps. And in providers I specify which encryption provider I want to use, which is secretbox. If you have an external KMS provider (e.g. Google KMS) you should consider using kms v2. But I don’t have that so secretbox is the best choice for me. And finally there is the k8s_encryption_config_key again. It’s used to encrypt the data specified in the resources list. So whoever has this key and access to etcd would be able to decrypt the data. That’s why it’s important to keep the k8s_encryption_config_key secure. For more information see Encrypting Confidential Data at Rest.

Setting k8s_apiserver_encryption_provider_config correctly right from the start makes life easier as you don’t have to change it again later. So please read carefully! In general the settings above should be a good starting point. Nevertheless, if you want to change the setting later that’s also doable. Please read Rotate a decryption key.
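If head or base64 are not available on your system, a key of the same shape (32 random bytes, base64 encoded) can also be created with Python’s standard library. This is just an alternative sketch to the command above; the function name is made up.

```python
import base64
import secrets


def generate_encryption_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure random data, base64 encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


key = generate_encryption_key()
```

The resulting string can be used as the value of k8s_encryption_config_key. As with the shell command, treat the output as a secret and don’t commit it in plain text.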

Now I’ll add an entry for the controller hosts into Ansible’s hosts file e.g. (of course you need to change k8s-01[01:03]02.i.example.com to your own hostnames):

yaml

k8s_controller:
  hosts:
    k8s-01[01:03]02.i.example.com:
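In case the [01:03] range syntax in the inventory is new to you: it expands to three hostnames. The following Python sketch mimics that expansion for simple numeric ranges. It’s a simplified re-implementation for illustration only (no step values, no alphabetic ranges), not Ansible’s own code, and the function name is made up.

```python
import re
from typing import List

# Matches a numeric range like "[01:03]".
RANGE_PATTERN = re.compile(r"\[(\d+):(\d+)\]")


def expand_host_pattern(pattern: str) -> List[str]:
    """Expand numeric "[start:end]" ranges (inclusive) in an inventory host pattern."""
    match = RANGE_PATTERN.search(pattern)
    if match is None:
        return [pattern]
    start, end = match.group(1), match.group(2)
    # Keep leading zeros if the start value has them (e.g. "01" -> width 2).
    width = len(start) if start.startswith("0") else 0
    hosts: List[str] = []
    for number in range(int(start), int(end) + 1):
        replaced = pattern[: match.start()] + str(number).zfill(width) + pattern[match.end():]
        # Recurse in case the pattern contains more than one range.
        hosts.extend(expand_host_pattern(replaced))
    return hosts


controller_hosts = expand_host_pattern("k8s-01[01:03]02.i.example.com")
```

For the pattern above this yields k8s-010102.i.example.com, k8s-010202.i.example.com and k8s-010302.i.example.com. To see what Ansible itself makes of your inventory you can run ansible-inventory --list.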

Install the role via

bash

ansible-galaxy install githubixx.kubernetes_controller

Next add the role githubixx.kubernetes_controller to the k8s.yml playbook file e.g.:

yaml

-
  hosts: k8s_controller
  roles:
    -
      role: githubixx.kubernetes_controller
      tags: role-kubernetes-controller

Apply the role via

bash

ansible-playbook --tags=role-kubernetes-controller k8s.yml

I already installed the kubectl binary locally in the harden the instances part of my tutorial. I need it now to check whether I already have a “rudimentary” K8s cluster running. The kubernetes_controller role created an admin.kubeconfig file. The location of that file is the value of the variable k8s_admin_conf_dir (which I set to my Python venv: /opt/scripts/ansible/k8s-01_vms/kubeconfig). admin.kubeconfig contains the authentication information of the very first “user” of your Kubernetes cluster (so to say). This “user” is the most powerful “user” and can change everything in your cluster. So please keep admin.kubeconfig in a secure place! After everything is set up you should create other users with fewer privileges. “Users” is actually not correct as there is no such thing in K8s. Any user that presents a valid certificate signed by the cluster’s certificate authority (CA) is considered authenticated (but that doesn’t mean that it can do much 😉). For more information on this topic see:

admin.kubeconfig also tells kubectl how to connect to kube-apiserver. That was specified in k8s_admin_api_endpoint_host and k8s_admin_api_endpoint_port. In my case there is an entry in admin.kubeconfig that looks like this: server: https://192.168.11.3:6443. As you might remember I’ve configured kube-apiserver to listen on all interfaces. 192.168.11.3 is the IP of the first K8s controller node (k8s-010102). Since my laptop is part of the 192.168.10.0/23 network (which includes 192.168.11.0/24) I’m able to connect to this IP (provided that the firewall of that host allows the connection). So make sure that your workstation, or whatever host you run kubectl on, is allowed to send HTTPS requests to this endpoint.
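If you’re unsure whether an endpoint IP really falls into your workstation’s network (like 192.168.11.3 in 192.168.10.0/23 above), the membership test can be sketched with plain shell arithmetic. This is only an illustration for IPv4; ip_to_int and ip_in_cidr are hypothetical helper names and not part of any tool used in this tutorial:

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell "( ... )" so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1   # split the address into its four octets
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Hypothetical helper: succeed if IP $1 is inside the CIDR network $2.
ip_in_cidr() (
  net="${2%/*}"    # network part, e.g. 192.168.10.0
  bits="${2#*/}"   # prefix length, e.g. 23
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)

if ip_in_cidr 192.168.11.3 192.168.10.0/23; then
  echo "192.168.11.3 is part of 192.168.10.0/23"
fi
```

Being in the same network only means the IP is addressable from there. The firewall on the controller node still has to allow connections to port 6443.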

So with kubectl and kubeconfig ready I can check a few things already. E.g. there should currently be one Service running:

bash

kubectl --kubeconfig=kubeconfig/admin.kubeconfig get services -A

NAMESPACE   NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     kubernetes   ClusterIP   10.32.0.1    <none>        443/TCP   3d19h

Here you see the IP 10.32.0.1 again, as mentioned in one of the previous blog posts. Once it’s possible to deploy pods (I currently have no K8s worker nodes) this is the internal name/IP to communicate with the kube-apiserver. Actually this IP is a loadbalancer IP:

bash

kubectl --kubeconfig=kubeconfig/admin.kubeconfig get endpoints -A

NAMESPACE   NAME         ENDPOINTS                                      AGE
default     kubernetes   10.0.11.3:6443,10.0.11.6:6443,10.0.11.9:6443   3d19h

10.0.11.(3|6|9) are the IPs of the WireGuard interfaces on my K8s controller nodes k8s-01(01|02|03)02.i.example.com, and kube-apiserver is listening on port 6443.

As you can see above I always had to specify the kubeconfig file using the --kubeconfig=kubeconfig/admin.kubeconfig flag. That’s a little bit annoying. By default kubectl uses $HOME/.kube/config. You most probably stored admin.kubeconfig in a different location. One way to tell kubectl to use a different kubeconfig file is to set an environment variable called KUBECONFIG. As mentioned in one of the previous blog posts I use the direnv utility to automatically set environment variables or execute commands when I enter the Python venv directory. So I’ll extend the .envrc file (read by direnv when you enter a directory) a bit. E.g.:

bash

export ANSIBLE_HOME="$(pwd)"
export KUBECONFIG="$(pwd)/kubeconfig/admin.kubeconfig"

source ./bin/activate

If you now run direnv allow it should load the updated .envrc file and the KUBECONFIG variable should be set. E.g.:

bash

env | grep KUBECONFIG
KUBECONFIG=/opt/scripts/ansible/k8s-01_vms/kubeconfig/admin.kubeconfig

Or you can create a softlink to admin.kubeconfig. E.g.:

bash

cd "$HOME"
mkdir -p .kube
cd .kube
ln -s "<path_to_admin_kubeconfig_file>/admin.kubeconfig" config

Another possibility would be to create an alias. E.g.:

bash

alias kubectl='kubectl --kubeconfig="<path_to_admin_kubeconfig_file>/admin.kubeconfig"'
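Whichever variant you choose, it helps to know which file kubectl ends up reading. Without a --kubeconfig flag, kubectl uses KUBECONFIG if it is set and falls back to $HOME/.kube/config otherwise. A minimal sketch of that lookup, assuming KUBECONFIG holds a single path (it can also be a list of paths, which this sketch doesn’t handle); effective_kubeconfig is a hypothetical helper name:

```shell
# Hypothetical helper: print the kubeconfig path kubectl would pick up
# from the environment. It ignores aliases and an explicit --kubeconfig flag.
effective_kubeconfig() {
  if [ -n "${KUBECONFIG:-}" ]; then
    printf '%s\n' "$KUBECONFIG"
  else
    printf '%s\n' "$HOME/.kube/config"
  fi
}

cfg="$(effective_kubeconfig)"
if [ -r "$cfg" ]; then
  echo "kubectl will read: $cfg"
else
  echo "no readable kubeconfig at: $cfg"
fi
```

If the second message shows up, either the KUBECONFIG path is wrong or the softlink in $HOME/.kube doesn’t point to an existing file.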

If you now run kubectl cluster-info you should see something like this:

bash

kubectl cluster-info

Kubernetes control plane is running at https://192.168.11.3:6443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Now it’s time to set up the Kubernetes worker.