Kubernetes upgrade notes: 1.30.x to 1.31.x
Introduction
If you used my Kubernetes the Not So Hard Way With Ansible blog posts to set up a Kubernetes (K8s) cluster, these notes might be helpful for you (and maybe for others who manage a K8s cluster on their own). I'll only mention changes that are either interesting for most K8s administrators anyway (even if they run a fully managed Kubernetes deployment) or relevant if you manage your own bare-metal/VM-based on-prem Kubernetes deployment. I normally skip changes that are only relevant for GKE, AWS EKS, Azure or other cloud providers.
I have a general upgrade guide, Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes, that has worked quite well for me for the past few K8s upgrades. So please read that guide if you want to know HOW the components are updated. This post is specifically about the 1.30.x to 1.31.x upgrade and WHAT was interesting for me.
As usual, I don't update a production system before the .2 release of a new major version is out. In my experience the .0 and .1 releases are just too buggy. Nevertheless, it's important to test new releases (and even betas or release candidates if possible) early in development environments and to report bugs!
Update to latest current release
I only upgrade from the latest version of the former major release. At the time of writing this blog post, 1.30.9 was the latest 1.30.x release. After reading the 1.30 CHANGELOG to figure out if any important changes were made between my current 1.30.x version and the latest 1.30.9 release, I didn't see anything that prevented me from updating, and I didn't need to change anything.
So I did the 1.30.9 update first. If you use my Ansible roles, that basically only means changing the k8s_ctl_release variable from 1.30.x to 1.30.9 (for the controller nodes) and doing the same for k8s_worker_release (for the worker nodes). Deploy the changes for the control plane and worker nodes as described in my upgrade guide. Hint: To save some time, IMHO it should be good enough to only update the controller nodes to the latest 1.30.x release, as it's mostly the kube-apiserver that stores the state of the Kubernetes cluster in etcd, and that state is quite important. That's what I normally do. Upgrading to the next major release can then be done for all nodes as usual. But if you want to be absolutely sure, just upgrade the whole cluster to the latest 1.30.x release first.
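After deploying the changes I usually do a quick check that the nodes really report the new patch release:

```bash
# Verify that the nodes report the expected patch release before moving on to 1.31.
kubectl get nodes -o wide
# The VERSION column should now show v1.30.9 - at least for the controller nodes
# if you only upgraded those.
```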
After that everything still worked as expected, I continued with the next step.
Upgrading kubectl
As it's normally no problem (and actually the supported method) to have a kubectl utility that is one minor version ahead of the server version, I updated kubectl from 1.30.x to the latest 1.31.x using my kubectl Ansible role.
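A quick kubectl version afterwards shows if the version skew is what I expect at this point:

```bash
# Client may be one minor version ahead of the server while the cluster is still on 1.30.x.
kubectl version
# Expected at this point: Client Version roughly v1.31.x, Server Version v1.30.9
```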
Upgrading various components
This time I did a pretty big update of various components together with the K8s 1.31.x
release. As this changes quite a lot on the worker nodes, it makes sense to Safely Drain a Node so that the workload gets migrated to other nodes. Then the software can be safely upgraded on the drained node. Maybe also have a look at my blog post Upgrading Kubernetes for further information.
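For reference, a drain/uncordon cycle for a worker node looks roughly like this (the node name is of course just a placeholder):

```bash
# Move the workload away from the node and mark it unschedulable.
kubectl drain worker01 --ignore-daemonsets --delete-emptydir-data
# ... upgrade containerd, runc, CNI plugins, kubelet, etc. on that node ...
# Make the node schedulable again once everything is back up.
kubectl uncordon worker01
```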
Upgrading etcd
While my roles don't use kubeadm to manage the K8s cluster, it's still recommended to have at least etcd 3.5.11 running. I updated my etcd role to the current 3.5.17 and updated my etcd deployment accordingly. See Upgrading Kubernetes - etcd for more information on how to upgrade etcd. Also see Kubernetes v1.31: Accelerating Cluster Performance with Consistent Reads from Cache.
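After the upgrade a quick health check doesn't hurt. The following is just a sketch: the endpoint and the certificate paths are assumptions based on a typical TLS-secured etcd setup and need to be adjusted to your environment:

```bash
# Show etcd version, leader status and DB size for the local member.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca-etcd.pem \
  --cert=/etc/etcd/cert-etcd-server.pem \
  --key=/etc/etcd/cert-etcd-server-key.pem \
  endpoint status --write-out=table
```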
Upgrading containerd
containerd was updated from 1.7.22 to 2.0.2. That's a major release upgrade. I updated my containerd role accordingly. Please read the CHANGELOG for potential breaking changes. In my experience the upgrade "just works" if you haven't customized the containerd_config Ansible variable, which contains the containerd configuration. Otherwise you need to check your configuration, as the config format has changed to version 3.
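To verify what's actually in use after the upgrade (the grep just picks the config schema version out of the effective configuration):

```bash
# Binary/daemon version - should now report containerd 2.0.x
containerd --version
# The effective configuration of a containerd 2.x daemon should contain "version = 3".
containerd config dump | grep '^version'
```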
Upgrading runc
runc was upgraded from 1.1.14 to 1.2.4. I've updated my runc role accordingly. The release notes for runc are here, but they shouldn't be that interesting. Sadly, still no User Namespaces support yet.
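Verifying the new version is as simple as:

```bash
# Should report runc 1.2.4 plus the commit and the OCI runtime spec version it implements.
runc --version
```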
Upgrading CNI
And finally the CNI plugins were updated to 1.6.2. Again, I updated my CNI role accordingly. The release notes for CNI 1.6.0 might be worth a read, but only if you want to go deeper 😉
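To see what ended up on a node (the path /opt/cni/bin is the common default and an assumption here):

```bash
# List the installed CNI plugin binaries.
ls /opt/cni/bin/
# Invoked without the CNI environment variables the plugins usually print their
# version, e.g. "CNI bridge plugin v1.6.2".
/opt/cni/bin/bridge 2>&1 | head -n 1
```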
Release notes
Since K8s 1.14 there are also searchable release notes available. You can specify the K8s version and a K8s area/component (e.g. kubelet, apiserver, …) and immediately get an overview of what changed in that regard. Quite nice! 😉
Urgent Upgrade Notes
I guess most users won't be affected by any of the Urgent Upgrade Notes.
- Kubelet flag --keep-terminated-pod-volumes was removed. This flag was deprecated in 2017. (#122082, @carlory) [SIG Apps, Node, Storage and Testing]
- Action required for custom scheduler plugin developers: Plugins have to implement a QueueingHint for the Pod/Update event if the rejection from them could be resolved by updating unscheduled Pods themselves.
- ACTION REQUIRED: If you are using the RecoverVolumeExpansionFailure alpha feature gate, then after upgrading to this release you need to update some objects.
What’s New (Major Themes)
All important stuff is listed in the Kubernetes v1.31: Elli release announcement.
The following listing of changes and features only contains stuff that I found useful and interesting. See the full Kubernetes v1.31 Changelog for all changes.
Graduated to stable
- AppArmor support is now stable: To learn more read the AppArmor tutorial.
- Improved ingress connectivity reliability for kube-proxy: This feature implements a mechanism in kube-proxy for load balancers to do connection draining for terminating Nodes exposed by services of type: LoadBalancer and externalTrafficPolicy: Cluster, and establishes some best practices for cloud providers and Kubernetes load balancer implementations.
- Persistent Volume last phase transition time: This allows you to measure the time between when a PersistentVolume moves from Pending to Bound. This can also be useful for providing metrics and SLOs. Also see Kubernetes v1.31: PersistentVolume Last Phase Transition Time Moves to GA.
Graduated to beta
- nftables backend for kube-proxy: Requires kernel 5.13 or later. May not be compatible with all network plugins yet.
- Changes to reclaim policy for PersistentVolumes: With the introduction of this feature, Kubernetes now guarantees that the “Delete” reclaim policy will be enforced, ensuring the deletion of the underlying storage object from the backend infrastructure, regardless of the deletion sequence of the PV and PVC.
- Multiple Service CIDRs: That's most probably an interesting feature for some cluster admins. Service IP ranges are defined during cluster creation as a hardcoded flag of the kube-apiserver. This new feature allows users and cluster admins to dynamically modify Service CIDR ranges with zero downtime (see Virtual IPs and Service Proxies, and the short sketch after this list).
- Kubernetes VolumeAttributesClass ModifyVolume: The VolumeAttributesClass provides a generic, Kubernetes-native API for dynamically modifying volume parameters like provisioned IO. Also see Kubernetes 1.31: VolumeAttributesClass for Volume Modification Beta.
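To get an idea what this looks like in practice, here is a minimal sketch. It assumes the beta MultiCIDRServiceAllocator feature gate and the networking.k8s.io/v1beta1 API are enabled in your cluster; the name and CIDR are made up:

```bash
# Show the ServiceCIDR objects the cluster currently knows about.
kubectl get servicecidrs
# Add an additional Service IP range without touching the kube-apiserver flags.
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1beta1
kind: ServiceCIDR
metadata:
  name: extra-service-cidr
spec:
  cidrs:
    - 10.97.0.0/24
EOF
```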
Alpha features
- New DRA APIs for better accelerators and other hardware management
- Support for image volumes: The Kubernetes community is moving towards fulfilling more Artificial Intelligence (AI) and Machine Learning (ML) use cases in the future. This release adds a new alpha feature that allows using an OCI image as a volume in a Pod: users can specify an image reference as a volume in the Pod spec and mount it as a volume within containers (see the sketch after this list). Also see Kubernetes 1.31: Read Only Volumes Based On OCI Artifacts (alpha).
- Exposing device health information through Pod status: By enabling this feature, the field allocatedResourcesStatus will be added to each container status, within the .status of each Pod. The allocatedResourcesStatus field reports health information for each device assigned to the container.
- Finer-grained authorization based on selectors
- Restrictions on anonymous API access
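To make the image volume feature a bit more tangible, here is a minimal Pod sketch. It assumes the alpha ImageVolume feature gate is enabled on kube-apiserver and kubelet and that the container runtime supports it; the image reference and names are made up:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: image-volume-example
spec:
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: artifact
          mountPath: /data
          readOnly: true
  volumes:
    - name: artifact
      # The new alpha volume source: mount the content of an OCI image/artifact read-only.
      image:
        reference: registry.example.com/my-artifact:latest
        pullPolicy: IfNotPresent
EOF
```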
Further reading
- Kubernetes 1.31 – What’s new?
- Kubernetes 1.31 Arrives with New Support for AI/ML, Networking
- Kubernetes 1.31: MatchLabelKeys in PodAffinity graduates to beta
- Kubernetes 1.31: Prevent PersistentVolume Leaks When Deleting out of Order
- Kubernetes 1.31: Pod Failure Policy for Jobs Goes GA
- Kubernetes 1.31: Streaming Transitions from SPDY to WebSockets: In Kubernetes 1.31, by default kubectl now uses the WebSocket protocol instead of SPDY for streaming.
- Kubernetes 1.31: Autoconfiguration For Node Cgroup Driver (beta)
- Kubernetes 1.31: Custom Profiling in Kubectl Debug Graduates to Beta
- Kubernetes 1.31: Fine-grained SupplementalGroups control
- Kubernetes v1.31: New Kubernetes CPUManager Static Policy: Distribute CPUs Across Cores
Deprecations
- Cgroup v1 enters the maintenance mode: It is recommended that you start switching to cgroup v2 as soon as possible. This transition depends on your architecture, including ensuring the underlying operating systems and container runtimes support cgroup v2 and testing that workloads and applications function correctly with cgroup v2 (a quick way to check what a node currently runs is shown after this list). E.g. if you still use OpenJDK 8 you should read OpenJDK 8u372 to feature cgroup v2 support. Also see Kubernetes 1.31: Moving cgroup v1 Support into Maintenance Mode.
- Deprecation of status.nodeInfo.kubeProxyVersion field for Nodes
- Removal of CephFS volume plugin: The CephFS volume plugin was removed in this release and the cephfs volume type became non-functional.
- Removal of Ceph RBD volume plugin
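As mentioned in the cgroup item above, a quick way to check which cgroup version a node is actually running (taken from the Kubernetes documentation):

```bash
# "cgroup2fs" means the node runs cgroup v2, "tmpfs" means cgroup v1.
stat -fc %T /sys/fs/cgroup/
```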
API changes
- kube-apiserver: the --encryption-provider-config file is now loaded with strict deserialization, which fails if the config file contains duplicate or unknown fields. This protects against accidentally running with config files that are malformed, mis-indented, or have typos in field names, and getting unexpected behavior. When --encryption-provider-config-automatic-reload is used, new encryption config files that contain typos after the kube-apiserver is running are treated as invalid and the last valid config is used. (#124912, @enj) [SIG API Machinery and Auth]
- Removed deprecated command flags --volume-host-cidr-denylist and --volume-host-allow-local-loopback from kube-controller-manager. (#124017, @carlory) [SIG API Machinery, Apps, Cloud Provider and Storage]
- ACTION REQUIRED for custom scheduler plugin developers: EventsToRegister in the EnqueueExtensions interface gets ctx in the parameters and error in the return values. Please change your plugins' implementation accordingly. (#126113, @googs1025) [SIG Node, Scheduling, Storage and Testing]
- Added --for=create option to kubectl wait (see the small example after this list). (#125868, @soltysh) [SIG CLI and Testing]
- Added --keep-* flags to kubectl debug, which enable control over the removal of probes, labels, annotations and initContainers from the copied Pod. (#123149, @mochizuki875) [SIG CLI and Testing]
- Added a flag to kubectl logs called --all-pods to get the logs of all Pods from an object that uses a pod selector. (#124732, @cmwylie19) [SIG CLI and Testing]
- Added ports autocompletion for the kubectl port-forward command. (#124683, @TessaIO) [SIG CLI]
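As promised above, a small sketch of two of the new kubectl options, based on the changelog entries (the Deployment name is just an example):

```bash
# Block until the Deployment object exists at all (new --for=create condition).
kubectl wait --for=create deployment/my-app --timeout=60s
# Fetch the logs of all Pods selected by the Deployment in one go (new --all-pods flag).
kubectl logs deployment/my-app --all-pods=true
```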
Features
- Changed Linux swap handling to restrict access to swap for containers in high priority Pods. New Pods that have a node- or cluster-critical priority are prohibited from accessing swap on Linux, even if your cluster and node configuration could otherwise allow this. (#125277, @iholder101) [SIG Node and Testing]
- Enabled feature gates for PortForward (kubectl port-forward) over WebSockets by default (beta).
  - Server-side feature gate: PortForwardWebsocket
  - Client-side (kubectl) feature gate: PORT_FORWARD_WEBSOCKETS environment variable
  - To turn off PortForward over WebSockets for kubectl, the environment variable feature gate must be explicitly set: PORT_FORWARD_WEBSOCKETS=false
  (#125528, @seans3) [SIG API Machinery and CLI]
- Kube-proxy's nftables mode (--proxy-mode=nftables) is now beta and available by default. (#124383, @danwinship) [SIG Cloud Provider and Network]
- Added kubectl support for:
  - kubectl create secret docker-registry --from-file=<path/to/.docker/config.json>
  - kubectl create secret docker-registry --from-file=.dockerconfigjson=<path/to/.docker/config.json>
  (#119589, @carlory) (see the short example after this list)
- Drop support for the deprecated and unsupported kubectl run flags:
  - filename
  - force
  - grace-period
  - kustomize
  - recursive
  - timeout
  - wait
- Drop support for the deprecated --delete-local-data flag of kubectl drain; users should use --delete-emptydir-data instead.
- Kube-apiserver: the --enable-logs-handler flag and the log-serving functionality, which were already deprecated, are now switched off by default and scheduled to be removed in v1.33. (#125787, @dims) [SIG API Machinery, Network and Testing]
- Removed the Kubelet flags --iptables-masquerade-bit and --iptables-drop-bit, which were deprecated in v1.28 and have now been removed entirely. (#122363, @carlory) [SIG Network and Node]
- kubectl describe service now shows the internal traffic policy and the IP mode of the load balancer IP. (#125117, @tnqn) [SIG CLI and Network]
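The new kubectl create secret docker-registry variant mentioned above looks like this in practice (the secret name and the path are just examples):

```bash
# Create a registry pull secret directly from an existing Docker config file.
kubectl create secret docker-registry regcred \
  --from-file="$HOME/.docker/config.json"
```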
Other
- etcd: Updated to v3.5.14
CSI
If you use CSI then also check the CSI Sidecar Containers documentation. For every sidecar container there is a matrix showing which version you need at a minimum and maximum, and which version is recommended for a given K8s version.
Nevertheless, if your K8s update to v1.31 worked fine, I would recommend also updating the CSI sidecar containers sooner or later.
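A quick way to get an inventory of the installed CSI drivers and their per-node registrations before and after such an update:

```bash
# Cluster-wide view of the installed CSI drivers.
kubectl get csidrivers
# Which CSI plugins are registered on which node (useful to spot nodes that lag behind).
kubectl get csinodes
```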
Upgrade Kubernetes
Now I finally upgraded the K8s controller and worker nodes to version 1.31.x
as described in Kubernetes the Not So Hard Way With Ansible - Upgrading Kubernetes.
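And once all controller and worker nodes are done, a final sanity check:

```bash
# Every node should now report v1.31.x in the VERSION column.
kubectl get nodes
# kube-apiserver health checks (verbose output of the readyz endpoint).
kubectl get --raw='/readyz?verbose'
# Any Pods that are not in phase "Running"? (Succeeded/Completed ones will show up too.)
kubectl get pods --all-namespaces --field-selector=status.phase!=Running
```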
That’s it for today! Happy upgrading! 😉