Kubernetes upgrade notes: 1.27.x to 1.28.x
Introduction
If you used my Kubernetes the Not So Hard Way With Ansible blog posts to set up a Kubernetes (K8s) cluster, these notes might be helpful for you (and maybe for others too who manage a K8s cluster on their own). I'll only mention changes that might be relevant, either because they're interesting for most K8s administrators anyway (even if they run a fully managed Kubernetes deployment) or because they matter if you manage your own bare-metal/VM based on-prem Kubernetes deployment. I normally skip changes that are only relevant for GKE, AWS EKS, Azure or other cloud providers.
I have a general upgrade guide, Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes, that has worked quite well for me for the past K8s upgrades. So please read that guide if you want to know HOW the components are updated. This post is specifically about the 1.27.x to 1.28.x upgrade and WHAT was interesting for me.
As usual, I don't update a production system before the .2 release of a new major version is out. In my experience the .0 and .1 releases are just too buggy. Nevertheless it's important to test new releases (and even betas or release candidates if possible) in development environments early and to report bugs!
Important upgrade notes for my Ansible Kubernetes roles
With version 22.0.0+1.27.8 of my kubernetes_controller role and version 24.0.0+1.27.8 of my kubernetes_worker role quite some refactoring took place. So please read the kubernetes_controller CHANGELOG and the kubernetes_worker CHANGELOG carefully!
This refactoring was needed to make it possible to deploy the githubixx.kubernetes_controller and githubixx.kubernetes_worker roles on the same host, for example. There were some intersections between the two roles that had to be fixed. Also, security for kube-apiserver, kube-scheduler and kube-controller-manager was hardened by using systemd options that limit the exposure of the system to the unit's processes.
Basically, if you keep the new defaults of k8s_ctl_conf_dir and k8s_worker_conf_dir, you can delete the following directories after you have upgraded a node to the new role version (a small example playbook sketch follows after the list):
On the controller nodes:
- /var/lib/kube-controller-manager
- /var/lib/kube-scheduler
On the worker nodes:
- /var/lib/kube-proxy
On both types of nodes:
- /var/lib/kubernetes
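If you want to let Ansible do that cleanup, a minimal sketch could look like this (the group names k8s_controller and k8s_worker are just placeholders for whatever groups your inventory uses):

```yaml
# cleanup-old-k8s-dirs.yml - hypothetical playbook, adjust hosts to your inventory
- hosts: k8s_controller
  become: true
  tasks:
    - name: Remove obsolete directories on the controller nodes
      ansible.builtin.file:
        path: "{{ item }}"
        state: absent
      loop:
        - /var/lib/kube-controller-manager
        - /var/lib/kube-scheduler
        - /var/lib/kubernetes

- hosts: k8s_worker
  become: true
  tasks:
    - name: Remove obsolete directories on the worker nodes
      ansible.builtin.file:
        path: "{{ item }}"
        state: absent
      loop:
        - /var/lib/kube-proxy
        - /var/lib/kubernetes
```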
Before this role version there was only k8s_conf_dir: /var/lib/kubernetes, which was valid for both node types. This variable is gone. The new defaults are k8s_ctl_conf_dir: /etc/kubernetes/controller (for the kubernetes_controller role) and k8s_worker_conf_dir: /etc/kubernetes/worker (for the kubernetes_worker role).
Basically all kubernetes_controller related variables now start with k8s_ctl_ and all kubernetes_worker related variables with k8s_worker_.
The kubernetes_worker role contains a Molecule scenario that sets up a fully functional Kubernetes cluster. You don't need to deploy all the VMs, but the Molecule configuration files might give you a good hint about which variables might need to be adjusted for your own deployment.
Also my containerd role recently had quite some changes with version 0.11.0+1.7.8. So please consult the CHANGELOG of this role too. Especially note that runc and the CNI plugins are no longer installed by this role. Please use the runc role and the cni role accordingly.
And finally my etcd role had quite some changes too with version 13.0.0+3.5.9. So please read that CHANGELOG as well.
Update to latest current release
I only upgrade from the latest version of the former major release. At the time of writing this blog post, 1.27.8 was the latest 1.27.x release. After reading the 1.27 CHANGELOG to figure out if any important changes were made between my current 1.27.x version and the latest 1.27.8 release, I didn't see anything that prevented me from updating, and I didn't need to change anything.
So I did the 1.27.8 update first. If you use my Ansible roles, that basically only means changing the k8s_ctl_release variable from 1.27.x to 1.27.8 (for the controller nodes) and doing the same for k8s_worker_release (for the worker nodes); a small example follows below. Then deploy the changes for the control plane and worker nodes as described in my upgrade guide.
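For illustration, the change itself is just something like this in the group variables (the file names are only examples, use whatever structure your inventory has):

```yaml
# group_vars/k8s_controller.yml (example path)
k8s_ctl_release: "1.27.8"

# group_vars/k8s_worker.yml (example path)
k8s_worker_release: "1.27.8"
```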
After that everything still worked as expected so I continued with the next step.
Upgrading kubectl
As it's normally no problem to have a newer kubectl utility that is only one (major) version ahead of the server version, I updated kubectl from 1.27.x to the latest 1.28.x using my kubectl Ansible role.
Release notes
Since K8s 1.14 there are also searchable release notes available. You can specify the K8s version and a K8s area/component (e.g. kubelet, apiserver, …) and immediately get an overview of what changed in that regard. Quite nice! 😉
Urgent Upgrade Notes
As always, read the Urgent Upgrade Notes before a major upgrade! If you used my Ansible roles to install Kubernetes and kept most of the default settings, then there should be no need to adjust any settings. For the K8s 1.28 release I actually couldn't find any urgent notes that were relevant for my Ansible roles or my own on-prem setup.
If you develop or use a custom scheduler plugin, or use the CEPH or RBD volume plugins, please read the urgent upgrade notes.
What’s New (Major Themes)
- Non-Graceful Node Shutdown Moves to GA: In case a node goes down because of a power outage or hardware problems, the kubelet's Shutdown Manager has no chance to recognize this situation. For stateless apps that's normally not a problem, but it is for stateful apps. In case of a non-graceful node shutdown one can now add a taint, e.g.: kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute. This taint triggers pods on the node to be forcefully deleted if there are no matching tolerations on the pods. Persistent volumes attached to the shutdown node will be detached, and new pods will be created successfully on a different running node.
- Retroactive Default StorageClass moves to GA: The PersistentVolume (PV) controller has been modified to automatically assign a default StorageClass to any unbound PersistentVolumeClaim with storageClassName not set. Additionally, the PersistentVolumeClaim admission validation mechanism within the API server has been adjusted to allow changing values from an unset state to an actual StorageClass name.
- Improved failure handling for Jobs (Alpha): This change adds a Pod replacement policy, e.g. to avoid having two Pods of a Job running at the same time. It makes it possible that a replacement Pod is not created immediately but only after the old Pod has entered the failed phase. See the Job sketch after this list.
- Node podresources API Graduates to GA
- Beta support for using swap on Linux: Support for swap on Linux nodes has graduated to Beta, along with many new improvements. This requires cgroup v2. A minimal kubelet configuration sketch follows after this list.
- Introducing native sidecar containers: Enables restartable init containers and is available in alpha in Kubernetes 1.28. Kubernetes 1.28 adds a new restartPolicy field to init containers that is available when the SidecarContainers feature gate is enabled. This is interesting e.g. for batch or AI/ML workloads, network proxies (Istio), log collection containers, and Jobs. See the Pod sketch after this list.
- A New (alpha) Mechanism For Safer Cluster Upgrades: This introduces the mixed version proxy, a new alpha feature in Kubernetes 1.28. The mixed version proxy enables an HTTP request for a resource to be served by the correct API server in cases where there are multiple API servers at varied versions in a cluster. For example, this is useful during a cluster upgrade, or when you're rolling out the runtime configuration of the cluster's control plane.
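For the improved Job failure handling, here is a minimal sketch of what such a Job could look like (the Job name, image and command are made up; the podReplacementPolicy field requires the JobPodReplacementPolicy feature gate, which is alpha in 1.28):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job              # hypothetical name
spec:
  # Only create a replacement Pod once the previous Pod has fully reached
  # the Failed phase, instead of as soon as it starts terminating.
  podReplacementPolicy: Failed
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox:1.36    # placeholder image
          command: ["sh", "-c", "echo doing work && exit 1"]
```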
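For the swap support, a minimal kubelet configuration sketch, assuming a node that runs cgroup v2 and has swap enabled (in 1.28 the NodeSwap feature gate still has to be switched on explicitly, and failSwapOn has to be false so the kubelet starts at all on a node with active swap):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  NodeSwap: true            # Beta in 1.28, but not enabled by default
failSwapOn: false           # otherwise the kubelet refuses to start with swap enabled
memorySwap:
  swapBehavior: LimitedSwap # or UnlimitedSwap
```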
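And for native sidecar containers, a sketch of a Pod with a restartable init container (names, images and commands are made up; restartPolicy: Always on an init container only works with the SidecarContainers feature gate enabled, which is alpha in 1.28):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar         # hypothetical name
spec:
  volumes:
    - name: logs
      emptyDir: {}
  initContainers:
    - name: log-shipper
      image: busybox:1.36        # placeholder for a real log collector
      command: ["sh", "-c", "touch /var/log/app.log && tail -F /var/log/app.log"]
      # restartPolicy: Always turns this init container into a sidecar that keeps
      # running next to the regular containers for the whole Pod lifetime.
      restartPolicy: Always
      volumeMounts:
        - name: logs
          mountPath: /var/log
  containers:
    - name: app
      image: busybox:1.36        # placeholder application
      command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /var/log
```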
Deprecation
- kubectl version default output changed to be identical to what kubectl version --short printed, and the --short flag was removed entirely.
- kube-controller-manager: deprecated --volume-host-cidr-denylist and --volume-host-allow-local-loopback.
- KMSv1 is deprecated and will only receive security updates going forward. Use KMSv2 instead.
API changes
- When an Indexed Job has a number of completions higher than 10^5 and parallelism higher than 10^4, and a big number of Indexes fail, Kubernetes might not be able to track the termination of the Job. Kubernetes now emits a warning, at Job creation, when the Job manifest exceeds both of these limits.
- Added a warning that TLS 1.3 ciphers are not configurable.
- If using cgroups v2, then the cgroup aware OOM killer will be enabled for container cgroups via memory.oom.group. This causes processes within the cgroup to be treated as a unit and killed simultaneously in the event of an OOM kill on any process in the cgroup.
- Indexed Job pods now have the pod completion index set as a pod label.
- kube-proxy: added --logging-format flag to support structured logging.
- Pods which set hostNetwork: true and declare ports get the hostPort field set automatically. Previously this would happen in the PodTemplate of a Deployment, DaemonSet or other workload API.
- StatefulSet pods now have the pod index set as a pod label statefulset.kubernetes.io/pod-index. See the Service sketch after this list.
- The IPTablesOwnershipCleanup feature (KEP-3178) is now GA; kubelet no longer creates the KUBE-MARK-DROP chain (which has been unused for several releases) or the KUBE-MARK-MASQ chain (which is now only created by kube-proxy).
- kube-scheduler component config (KubeSchedulerConfiguration) kubescheduler.config.k8s.io/v1beta2 is removed in v1.28. Migrate kube-scheduler configuration files to kubescheduler.config.k8s.io/v1. A minimal example follows after this list.
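The new pod index label can be handy if you want to address a single StatefulSet pod, for example with a Service selector. A small sketch (the Service, the app label and the ports are made up; only the statefulset.kubernetes.io/pod-index label is taken from the release notes):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-0-only                             # hypothetical Service that only targets pod index 0
spec:
  selector:
    app: web                                   # assumed label of the StatefulSet pods
    statefulset.kubernetes.io/pod-index: "0"
  ports:
    - port: 80
      targetPort: 8080                         # assumed container port
```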
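If you still have a kube-scheduler configuration file on kubescheduler.config.k8s.io/v1beta2, in many cases it is mostly a matter of changing the apiVersion, but check the upstream migration notes since some plugin names changed between the versions. A minimal sketch (the kubeconfig path is only an example and depends on your deployment):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler/kubeconfig   # example path, adjust to your setup
```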
Features
- Added --concurrent-cron-job-syncs flag for kube-controller-manager to set the number of workers for the cron job controller.
- Added a new command line argument --interactive to kubectl. The new command line argument lets a user confirm deletion requests per resource interactively.
- Added full cgroup v2 swap support for both Limited and Unlimited swap. Support for cgroup v1 is removed.
- Added the implementation for PodRecreationPolicy to wait for the creation of pods once the existing ones are fully terminated.
- Allow to monitor client-go DNS resolver latencies via the rest_client_dns_resolution_duration_seconds Prometheus metric.
- Implemented alpha support for a drop-in kubelet configuration directory. See the sketch after this list.
- kube-proxy: Implemented connection draining for terminating nodes.
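The idea of the drop-in kubelet configuration directory is that you can split the kubelet configuration into several fragments instead of one big file. A hypothetical fragment could look like this (the directory and file name are made up; the feature is alpha in 1.28, so check the kubelet documentation for the exact --config-dir flag handling and file naming rules):

```yaml
# e.g. /etc/kubernetes/worker/kubelet.conf.d/90-swap.conf (hypothetical path)
# Fields here are merged on top of the main kubelet configuration file.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
memorySwap:
  swapBehavior: LimitedSwap
```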
Other
- etcd: Updated to v3.5.9
CSI
If you use CSI then also check the CSI Sidecar Containers documentation. Every sidecar container has a matrix that shows which version you need at a minimum and maximum, and which version is recommended to use with whatever K8s version.
Nevertheless, if your K8s update to v1.28 worked fine, I would recommend also updating the CSI sidecar containers sooner or later.
Additional resources
- Kubernetes v1.28: Planternetes
- Kubernetes 1.28: the security perspective
- Kubernetes 1.28 Accommodates the Service Mesh, Sudden Outages
Upgrade Kubernetes
Now I finally upgraded the K8s controller and worker nodes to version 1.28.x
as described in Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes.
That’s it for today! Happy upgrading! 😉