Kubernetes the not so hard way with Ansible - Persistent storage - Part 1 - (K8s v1.18)
CHANGELOG
2020-08-12
- updated for Kubernetes v1.18
- update hcloud_csi_node_driver_registrar to v1.3.0
- update hcloud_csi_attacher to v2.2.0
- update hcloud_csi_provisioner to v1.6.0
- update hcloud_csi_driver to v1.4.0
- added hcloud_csi_resizer: "0.5.0"
- added hcloud_csi_livenessprobe: "1.1.0"
- removed hcloud_csi_cluster_driver_registrar variable (no longer needed, see CSI cluster-driver-registrar)
- added allowVolumeExpansion: true to StorageClass
While stateless workloads on Kubernetes are quite common by now, stateful workloads like databases are becoming more common as well since the Container Storage Interface (CSI) was introduced. The Container Storage Interface (CSI) defines a standard interface for container orchestration systems (like Kubernetes) to expose arbitrary storage systems to their container workloads. CSI support was introduced as alpha in Kubernetes v1.9, moved to beta in Kubernetes v1.10, and is GA since Kubernetes v1.13.
Once a CSI compatible volume driver is deployed on a Kubernetes cluster, users may use the CSI volume type to attach, mount, etc. the volumes exposed by the CSI driver. See Drivers section at Kubernetes CSI documentation for a list of available CSI drivers.
I currently have four block storage workloads that I want to migrate from host storage (which basically means mounting a Kubernetes worker node's local storage into the pod via hostPath) to CSI based storage: PostgreSQL, an old CMS that has no object storage support, Redis and my Postfix mailserver.
If you can’t find a storage driver on the CSI Drivers list or if you are on-premise you can also use storage solutions like Rook or OpenEBS among others. Rook is basically an operator for Ceph which not only provides block storage but also object and file storage. I’ll cover this in the next part. OpenEBS is an open-source project for container-attached and container-native storage on Kubernetes. OpenEBS adopts Container Attached Storage (CAS) approach, where each workload is provided with a dedicated storage controller.
Luckily there exists a CSI driver for Hetzner Cloud. As I'm currently running Kubernetes 1.18, all needed feature gates for kube-apiserver and kubelet are already in beta, which means they're enabled by default. The driver needs at least Kubernetes 1.13.
I've created an Ansible playbook to install all resources needed for the Hetzner CSI driver.
The README of the playbook contains all information on how to install and use the playbook. It basically should just work and there should be no need for further changes. If you don’t care about further details you can basically stop reading here and just run the playbook and enjoy K8s persistence ;-)
Personally I often try to get to the bottom of things. So if you’re interested in further details read on ;-) I was curious how everything fits together, what all the pods are responsible for and so on. So this is what I found out and the information I collected about CSI and the Hetzner CSI driver in general so far. If you find any errors please let me know.
For the following K8s resources we assume that you've set these variable values in group_vars/all.yml (for more information see further down the text and the README of my Ansible Hetzner CSI playbook):
hcloud_namespace: "kube-system"
hcloud_resource_prefix: "hcloud"
hcloud_is_default_class: "true"
hcloud_volume_binding_mode: "WaitForFirstConsumer"
k8s_worker_kubelet_conf_dir: "/var/lib/kubelet"
# DaemonSet:
hcloud_csi_node_driver_registrar: "1.3.0"
# StatefulSet:
hcloud_csi_attacher: "2.2.0"
hcloud_csi_provisioner: "1.6.0"
hcloud_csi_resizer: "0.5.0"
hcloud_csi_livenessprobe: "1.1.0"
# Hetzner CSI driver
hcloud_csi_driver: "1.4.0"
So let’s see what resources the Ansible playbook will install:
---
apiVersion: v1
kind: Secret
metadata:
  name: {{ hcloud_resource_prefix }}-csi
  namespace: {{ hcloud_namespace }}
stringData:
  token: {{ hcloud_csi_token }}
The first resource is a Secret. The secret contains the token you created in the Hetzner Cloud Console. It's needed for the driver to actually have the authority to interact with the Hetzner API. The secret is called hcloud-csi by default (depending on how you set the hcloud_resource_prefix variable of course) and will be placed into the kube-system namespace by default. All {{ ... }} placeholders are of course variables that Ansible will replace during execution. So make sure to set them accordingly as mentioned in the Ansible playbook for the CSI driver.
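By the way, if you ever want to create this secret without the playbook, a plain kubectl command does the same (shown here with the default prefix and namespace from above; replace the token placeholder with your own API token):

kubectl -n kube-system create secret generic hcloud-csi --from-literal=token=<your-hcloud-api-token>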
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  namespace: {{ hcloud_namespace }}
  name: {{ hcloud_resource_prefix }}-volumes
  annotations:
    storageclass.kubernetes.io/is-default-class: "{{ hcloud_is_default_class }}"
provisioner: csi.hetzner.cloud
volumeBindingMode: {{ hcloud_volume_binding_mode }}
allowVolumeExpansion: true
A StorageClass provides a way for administrators to describe the classes of storage they offer. The name and the parameters are significant as they can't be changed later. The name should reflect what the user can expect from this storage class. As Hetzner offers only one storage type, a generic name like hcloud-volumes can be used. Storage classes for other providers, e.g. AWS EBS or Google GCP persistent disks, offer additional parameters. E.g. type: pd-ssd for a GCP persistent disk would allocate a fast SSD disk instead of a standard disk. The storage class name will be used later when you create a PersistentVolumeClaim, where you need to provide a storageClassName. The annotation storageclass.kubernetes.io/is-default-class: "true" makes this storage class the default if no storage class is specified. The volumeBindingMode field controls when volume binding and dynamic provisioning should occur. The default value is Immediate. The Immediate mode indicates that volume binding and dynamic provisioning occur as soon as the PersistentVolumeClaim is created. The WaitForFirstConsumer mode delays the binding and provisioning of a PersistentVolume until a Pod using the PersistentVolumeClaim is created. PersistentVolumes will then be selected or provisioned conforming to the topology that is specified by the Pod's scheduling constraints. These include, but are not limited to, resource requirements, node selectors, pod affinity and anti-affinity, and taints and tolerations.
You can also add a few parameters in the StorageClass manifest that are later handled by the external-provisioner (see further down). E.g. you can additionally specify:
parameters:
  csi.storage.k8s.io/fstype: ext4
If the PVC VolumeMode
is set to Filesystem
, and the value of csi.storage.k8s.io/fstype
is specified, it is used to populate the FsType
in CreateVolumeRequest.VolumeCapabilities[x].AccessType
and the AccessType
is set to Mount
.
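Rendered with the variable values from above and with the optional fstype parameter added, the StorageClass would look roughly like this (the parameters block is optional and only shown as an example):

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  namespace: kube-system
  name: hcloud-volumes
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.hetzner.cloud
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  csi.storage.k8s.io/fstype: ext4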
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ hcloud_resource_prefix }}-csi
  namespace: {{ hcloud_namespace }}
A ServiceAccount provides an identity for processes that run in a Pod. As you'll see below there will be a few pods running to make CSI work. These pods contain one or more containers which run the various CSI processes needed. As some of these processes need to retrieve various information from the Kubernetes API server, they'll use the service account we defined above.
But the service account also needs permissions or roles assigned that define what resources it is allowed to access on the API server. For this we need a ClusterRole:
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: {{ hcloud_resource_prefix }}-csi
rules:
  # attacher
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["csi.storage.k8s.io"]
    resources: ["csinodeinfos"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["csinodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["volumeattachments"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # provisioner
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete", "patch"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims", "persistentvolumeclaims/status"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents"]
    verbs: ["get", "list"]
  # node
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
Here you can see what permissions (the verbs) the various processes need to be able to retrieve the needed information from the APIs. The permissions (verbs) should be self-explanatory and the resources define what information can be accessed. Since it is a ClusterRole, which is not namespaced, it also allows getting information about nodes e.g. apiGroups: [""] indicates the core API group (also see API groups).
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: {{ hcloud_resource_prefix }}-csi
subjects:
  - kind: ServiceAccount
    name: {{ hcloud_resource_prefix }}-csi
    namespace: "{{ hcloud_namespace }}"
roleRef:
  kind: ClusterRole
  name: {{ hcloud_resource_prefix }}-csi
  apiGroup: rbac.authorization.k8s.io
The ClusterRoleBinding is basically the "glue" between the ServiceAccount and the ClusterRole. Here we map the ClusterRole (and therefore its permissions) to the ServiceAccount we created above.
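To quickly check that the RBAC setup behaves as intended you can impersonate the service account with kubectl auth can-i (assuming the default hcloud prefix and the kube-system namespace from above); the answer should be yes:

kubectl auth can-i list volumeattachments.storage.k8s.io --as=system:serviceaccount:kube-system:hcloud-csi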
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: {{ hcloud_resource_prefix }}-csi-controller
  namespace: {{ hcloud_namespace }}
spec:
  selector:
    matchLabels:
      app: {{ hcloud_resource_prefix }}-controller
  serviceName: {{ hcloud_resource_prefix }}-controller
  replicas: 1
  template:
    metadata:
      labels:
        app: {{ hcloud_resource_prefix }}-controller
    spec:
      serviceAccount: {{ hcloud_resource_prefix }}-csi
      containers:
        - name: csi-attacher
          image: quay.io/k8scsi/csi-attacher:v{{ hcloud_csi_attacher }}
          args:
            - --csi-address=/var/lib/csi/sockets/pluginproxy/csi.sock
            - --v=5
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
        - name: csi-resizer
          image: quay.io/k8scsi/csi-resizer:v{{ hcloud_csi_resizer }}
          args:
            - --csi-address=/var/lib/csi/sockets/pluginproxy/csi.sock
            - --v=5
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
        - name: csi-provisioner
          image: quay.io/k8scsi/csi-provisioner:v{{ hcloud_csi_provisioner }}
          args:
            - --provisioner=csi.hetzner.cloud
            - --csi-address=/var/lib/csi/sockets/pluginproxy/csi.sock
            - --feature-gates=Topology=true
            - --v=5
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
        - name: hcloud-csi-driver
          image: hetznercloud/hcloud-csi-driver:{{ hcloud_csi_driver }}
          imagePullPolicy: Always
          env:
            - name: CSI_ENDPOINT
              value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
            - name: METRICS_ENDPOINT
              value: 0.0.0.0:9189
            - name: HCLOUD_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hcloud-csi
                  key: token
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
          ports:
            - containerPort: 9189
              name: metrics
            - name: healthz
              containerPort: 9808
              protocol: TCP
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 2
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
        - name: liveness-probe
          imagePullPolicy: Always
          image: quay.io/k8scsi/livenessprobe:v{{ hcloud_csi_livenessprobe }}
          args:
            - --csi-address=/var/lib/csi/sockets/pluginproxy/csi.sock
          volumeMounts:
            - mountPath: /var/lib/csi/sockets/pluginproxy/
              name: socket-dir
      volumes:
        - name: socket-dir
          emptyDir: {}
Next we've a StatefulSet called hcloud-csi-controller. Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling. In the specification above the pod consists of five containers, and since we've only one replica the StatefulSet will be scheduled on only one of the worker nodes. This is normally sufficient for smaller K8s clusters.
So if you deploy the manifest above you'll later see something like this (I deployed everything to the kube-system namespace, so you may want to change to that namespace first or specify one via the -n flag):
kubectl get statefulsets -o wide
NAME READY AGE CONTAINERS IMAGES
hcloud-csi-controller 1/1 2m4s csi-attacher,csi-resizer,csi-provisioner,hcloud-csi-driver,liveness-probe quay.io/k8scsi/csi-attacher:v2.2.0,quay.io/k8scsi/csi-resizer:v0.5.0,quay.io/k8scsi/csi-provisioner:v1.6.0,hetznercloud/hcloud-csi-driver:1.4.0,quay.io/k8scsi/livenessprobe:v1.1.0
As you can see the StatefulSet
consists of a pod with five containers: csi-attacher, csi-provisioner, csi-resizer, hcloud-csi-driver, livenessprobe
.
As we specified one replica for the StatefulSet we should now see exactly one pod with the five containers running (in fact you'll also see other pods that start with the hcloud prefix but we don't care about them yet):
kubectl get pods -o wide | grep hcloud-csi-controller
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hcloud-csi-controller-0 5/5 Running 0 4m27s 10.200.0.100 worker02 <none> <none>
As expected there is now a pod called hcloud-csi-controller-0 running on worker02. StatefulSet pods have a unique identity that is comprised of an ordinal, a stable network identity, and stable storage. The identity sticks to the Pod, regardless of which node it's (re)scheduled on. So in our case the pod's name consists of the StatefulSet's name plus an integer ordinal which starts at 0 for the first pod. If we have a look inside the pod we'll again see that it consists of the five containers that were specified:
kubectl get pod hcloud-csi-controller-0 -o custom-columns='CONTAINER:.spec.containers[*].name'
CONTAINER
csi-attacher,csi-resizer,csi-provisioner,hcloud-csi-driver,liveness-probe
So let’s have some fun and delete pod hcloud-csi-controller-0
:
kubectl delete pod hcloud-csi-controller-0
pod "hcloud-csi-controller-0" deleted
And after a while the pod gets recreated:
kubectl get pods -o wide | grep hcloud-csi-controller
hcloud-csi-controller-0 5/5 Running 0 1m27s 10.200.0.100 worker02 <none> <none>
Unlike normal pods the name doesn't change even if you delete the pod. For some applications, e.g. databases, that's quite useful as this also means that the DNS entry won't change.
Before we figure out what the containers are doing, let's have a look at the last part of the whole CSI deployment: a DaemonSet (the Ansible playbook installed this one too of course). A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created. So here is the specification:
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: {{ hcloud_resource_prefix }}-csi-node
  namespace: {{ hcloud_namespace }}
  labels:
    app: {{ hcloud_resource_prefix }}-csi
spec:
  selector:
    matchLabels:
      app: {{ hcloud_resource_prefix }}-csi
  template:
    metadata:
      labels:
        app: {{ hcloud_resource_prefix }}-csi
    spec:
      tolerations:
        - effect: NoExecute
          operator: Exists
        - effect: NoSchedule
          operator: Exists
        - key: CriticalAddonsOnly
          operator: Exists
      serviceAccount: {{ hcloud_resource_prefix }}-csi
      containers:
        - name: csi-node-driver-registrar
          image: quay.io/k8scsi/csi-node-driver-registrar:v{{ hcloud_csi_node_driver_registrar }}
          args:
            - --v=5
            - --csi-address=/csi/csi.sock
            - --kubelet-registration-path={{ k8s_worker_kubelet_conf_dir }}/plugins/csi.hetzner.cloud/csi.sock
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
          securityContext:
            privileged: true
        - name: hcloud-csi-driver
          image: hetznercloud/hcloud-csi-driver:{{ hcloud_csi_driver }}
          imagePullPolicy: Always
          env:
            - name: CSI_ENDPOINT
              value: unix:///csi/csi.sock
            - name: METRICS_ENDPOINT
              value: 0.0.0.0:9189
            - name: HCLOUD_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hcloud-csi
                  key: token
          volumeMounts:
            - name: kubelet-dir
              mountPath: {{ k8s_worker_kubelet_conf_dir }}
              mountPropagation: "Bidirectional"
            - name: plugin-dir
              mountPath: /csi
            - name: device-dir
              mountPath: /dev
          securityContext:
            privileged: true
          ports:
            - containerPort: 9189
              name: metrics
            - name: healthz
              containerPort: 9808
              protocol: TCP
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 2
        - name: liveness-probe
          imagePullPolicy: Always
          image: quay.io/k8scsi/livenessprobe:v{{ hcloud_csi_livenessprobe }}
          args:
            - --csi-address=/csi/csi.sock
          volumeMounts:
            - mountPath: /csi
              name: plugin-dir
      volumes:
        - name: kubelet-dir
          hostPath:
            path: {{ k8s_worker_kubelet_conf_dir }}
            type: Directory
        - name: plugin-dir
          hostPath:
            path: {{ k8s_worker_kubelet_conf_dir }}/plugins/csi.hetzner.cloud/
            type: DirectoryOrCreate
        - name: registration-dir
          hostPath:
            path: {{ k8s_worker_kubelet_conf_dir }}/plugins_registry/
            type: Directory
        - name: device-dir
          hostPath:
            path: /dev
            type: Directory
So let's see how this looks when the DaemonSet is deployed:
kubectl get daemonset -o wide | grep hcloud
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
hcloud-csi-node 1 1 1 1 1 kubernetes.io/hostname=worker02 7m38s csi-node-driver-registrar,hcloud-csi-driver,liveness-probe quay.io/k8scsi/csi-node-driver-registrar:v1.3.0,hetznercloud/hcloud-csi-driver:1.4.0,quay.io/k8scsi/livenessprobe:v1.1.0 app=hcloud-csi
Let's see what pods we have:
kubectl get pods -o wide | grep hcloud-csi-node
hcloud-csi-node-zl6z7 3/3 Running 0 8m58s 10.200.0.249 worker02 <none> <none>
As you can see there is one pod with three containers running on every node now.
We can also get a list of CSI drivers we’ve installed (which is of course the Hetzner CSI driver at the moment if you don’t have other CSI drivers installed):
kubectl get CSIDriver
NAME ATTACHREQUIRED PODINFOONMOUNT MODES AGE
csi.hetzner.cloud true true Persistent 12m
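If you want a bit more detail about the registered driver object you can also describe it (the exact output depends on the driver and Kubernetes version):

kubectl describe csidriver csi.hetzner.cloud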
CSI drivers generate node specific information. Instead of storing this in the Kubernetes Node API object (which we can query with kubectl describe node), a new CSI specific Kubernetes CSINode object was created. With kubectl describe csinodes (or fully qualified kubectl describe csinodes.storage.k8s.io) we can get a little bit more information about our CSINodes:
kubectl describe csinodes
Name: worker02
Labels: <none>
Annotations: <none>
CreationTimestamp: Sat, 08 Aug 2020 23:58:45 +0200
Spec:
Drivers:
csi.hetzner.cloud:
Allocatables:
Count: 16
Node ID: 6956103
Topology Keys: [csi.hetzner.cloud/location]
Events: <none>
Now let’s try creating a PersistentVolumeClaim. A PersistentVolumeClaim (PVC)
is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV (persistent volume) resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
While PersistentVolumeClaims
allow a user to consume abstract storage resources, it is common that users need PersistentVolumes
with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes
that differ in more ways than just size and access modes, without exposing users to the details of how those volumes are implemented. For these needs there is the StorageClass
resource. But again at Hetzner currently only one StorageClass
exists.
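You can list the storage classes your cluster knows about with the command below; with the setup from above it should show hcloud-volumes marked as (default) with csi.hetzner.cloud as provisioner:

kubectl get storageclass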
Let’s have a look at the log output of the containers in the StatefulSet
if we create a PersistentVolumeClaim. You can either watch all logs at once or each container individually:
# Watch all container logs
kubectl logs -f --since=1s hcloud-csi-controller-0 --all-containers
# OR every container log separately:
kubectl logs -f --since=1s hcloud-csi-controller-0 csi-attacher
kubectl logs -f --since=1s hcloud-csi-controller-0 csi-resizer
kubectl logs -f --since=1s hcloud-csi-controller-0 csi-provisioner
kubectl logs -f --since=1s hcloud-csi-controller-0 hcloud-csi-driver
kubectl logs -f --since=1s hcloud-csi-controller-0 liveness-probe
Here is the manifest for the PersistentVolumeClaim
we want to create (10Gi is the smallest size possible):
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: hcloud-volumes
Save the content above to a file called e.g. pvc.yml. The name of the PVC will be csi-pvc. You need this name in the next step when we create a pod that consumes the volume. Make sure that the storageClassName matches the name of the StorageClass defined above. You could actually also remove it here, as we specified the annotation storageclass.kubernetes.io/is-default-class: true above and therefore the hcloud-volumes StorageClass would have been used anyway. The accessMode ReadWriteOnce means the volume can be mounted as read-write by a single node.
Now create the PVC with kubectl create -f pvc.yml. If you now have a look at the logs you will see that only a few lines of output were created:
kubectl logs -f --since=50s hcloud-csi-controller-0 --all-containers
I0812 21:57:35.068229 1 reflector.go:432] k8s.io/client-go/informers/factory.go:135: Watch close - *v1.PersistentVolume total 0 items received
I0812 21:57:54.111686 1 reflector.go:432] k8s.io/client-go/informers/factory.go:135: Watch close - *v1beta1.CSINode total 0 items received
I0812 21:57:37.131925 1 controller.go:225] Started PVC processing "kube-system/csi-pvc"
I0812 21:57:37.131968 1 controller.go:244] No need to resize PVC "kube-system/csi-pvc"
So not really much happened ;-) You'll also see in your Hetzner Cloud Console that no volume was created yet. The reason is that we specified volumeBindingMode: WaitForFirstConsumer for the StorageClass. So as long as no pod requests the volume it won't be created. If you want your pods to start up more quickly you may reconsider this setting or simply create an additional StorageClass with a different setting.
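Such an additional StorageClass could look roughly like this (just a sketch, the name is arbitrary and it's intentionally not marked as default). As long as you stick with WaitForFirstConsumer, kubectl get pvc csi-pvc will simply show the claim as Pending until a pod uses it:

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: hcloud-volumes-immediate
provisioner: csi.hetzner.cloud
volumeBindingMode: Immediate
allowVolumeExpansion: true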
To finally create the volume we need a consumer which means a pod in our case. So let’s create one:
---
kind: Pod
apiVersion: v1
metadata:
  name: my-csi-app
spec:
  containers:
    - name: my-frontend
      image: busybox
      volumeMounts:
        - mountPath: "/data"
          name: my-csi-volume
      command: [ "sleep", "1000000" ]
  volumes:
    - name: my-csi-volume
      persistentVolumeClaim:
        claimName: csi-pvc
Save the content above in a file called e.g. pod.yml. This specification will create a pod called my-csi-app
. It will launch a container called my-frontend
using the busybox
container image. This container image is quite small so the pod should become ready quite quickly. You also see that we want that container to have a volumeMount
at /data
. So our PVC
will be available at /data
later. The name of the volume is my-csi-volume
. So far we haven’t specified the PVC
anywhere. For this we need volumes
(you can of course mount different volumes into a container). Now the two name
directives in the container and volumes specification need to match. And finally we specify the persistentVolumeClaim with claimName: csi-pvc, which of course matches the name of the PVC we created above.
Now we can create the pod:
kubectl create -f pod.yml
A few seconds later we should see that the pod is ready. Logging into a container is normally bad practice, but since we don't care about that here, let's log into the new container:
kubectl exec -it my-csi-app sh
df -h /data/
Filesystem Size Used Available Use% Mounted on
/dev/disk/by-id/scsi-0HC_Volume_2816975
9.8G 36.0M 9.7G 0% /data
And there it is, the new volume! :-) You can now create a file on that new volume, log out, delete the pod and re-create it. If you log in again you'll see that the file you just created is still there. You'll now also see the volume in the Hetzner Cloud Console. There you'll also see that the volume was attached to the node where the pod was scheduled.
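A quick sketch of that persistence test (the file name is arbitrary); the last command should print hello again:

kubectl exec -it my-csi-app -- sh -c 'echo hello > /data/test.txt'
kubectl delete pod my-csi-app
kubectl create -f pod.yml
kubectl exec -it my-csi-app -- cat /data/test.txt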
You can even see at the node systemd journal that the volume was attached to the node e.g.:
ansible -m command -a 'journalctl --since=-1h | grep "kernel: s"' worker02
Jun 30 23:00:38 worker01 kernel: scsi 2:0:0:1: Direct-Access HC Volume 2.5+ PQ: 0 ANSI: 5
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: Power-on or device reset occurred
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: Attached scsi generic sg2 type 0
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: [sdb] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: [sdb] Write Protect is off
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: [sdb] Mode Sense: 63 00 00 08
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 30 23:00:38 worker01 kernel: sd 2:0:0:1: [sdb] Attached SCSI disk
Next we’ve the logs for the csi-attacher:
I0630 21:00:35.702906 1 controller.go:175] Started VA processing "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:35.702998 1 csi_handler.go:87] CSIHandler: processing VA "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:35.703021 1 csi_handler.go:114] Attaching "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:35.703040 1 csi_handler.go:253] Starting attach operation for "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:35.703266 1 csi_handler.go:214] Adding finalizer to PV "pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189"
I0630 21:00:35.716703 1 csi_handler.go:222] PV finalizer added to "pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189"
I0630 21:00:35.716776 1 csi_handler.go:509] Found NodeID 123456 in CSINode worker01
I0630 21:00:35.716957 1 csi_handler.go:175] VA finalizer added to "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:35.716980 1 csi_handler.go:189] NodeID annotation added to "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:35.718444 1 controller.go:205] Started PV processing "pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189"
I0630 21:00:35.718493 1 csi_handler.go:412] CSIHandler: processing PV "pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189"
I0630 21:00:35.718511 1 csi_handler.go:416] CSIHandler: processing PV "pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189": no deletion timestamp, ignoring
I0630 21:00:35.725351 1 csi_handler.go:199] VolumeAttachment "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed" updated with finalizer and/or NodeID annotation
I0630 21:00:35.725409 1 connection.go:180] GRPC call: /csi.v1.Controller/ControllerPublishVolume
I0630 21:00:35.725417 1 connection.go:181] GRPC request: {"node_id":"123456","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"storage.kubernetes.io/csiProvisionerIdentity":"1561841327391-8081-csi.hetzner.cloud"},"volume_id":"2816975"}
I0630 21:00:37.666022 1 reflector.go:370] k8s.io/client-go/informers/factory.go:133: Watch close - *v1beta1.VolumeAttachment total 2 items received
I0630 21:00:39.121391 1 connection.go:183] GRPC response: {}
I0630 21:00:39.122173 1 connection.go:184] GRPC error: <nil>
I0630 21:00:39.122183 1 csi_handler.go:127] Attached "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.122193 1 util.go:32] Marking as attached "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.152060 1 util.go:42] Marked as attached "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.152098 1 csi_handler.go:133] Fully attached "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.152140 1 csi_handler.go:103] CSIHandler: finished processing "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.152186 1 controller.go:175] Started VA processing "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.152225 1 csi_handler.go:87] CSIHandler: processing VA "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
I0630 21:00:39.152252 1 csi_handler.go:109] "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed" is already attached
I0630 21:00:39.152265 1 csi_handler.go:103] CSIHandler: finished processing "csi-c5ad589c45b49b4a688fdb2235f92497f853d14ff1de408ff3cdc27fc505b6ed"
The external-attacher is an external controller that monitors VolumeAttachment
objects created by controller-manager
and attaches/detaches
volumes to/from nodes (i.e. calls ControllerPublish/ControllerUnpublish
). And indeed we see messages like Started VA processing ...
(VA
-> VolumeAttachment
) or GRPC call: /csi.v1.Controller/ControllerPublishVolume
. If you have a closer look you'll also see {"fs_type":"ext4"}. So the volume will be formatted with an ext4 filesystem by default.
If you’re wondering why this is called external-attacher
: The external-attacher
is a sidecar container that attaches volumes to nodes by calling ControllerPublish
and ControllerUnpublish functions of CSI drivers. It is necessary because the internal Attach/Detach
controller running in Kubernetes controller-manager
does not have any direct interfaces to CSI drivers.
BTW: There is a matrix showing which version of the container is recommended for which Kubernetes version. It can be found here.
Next the logs for csi-provisioner:
I0630 21:00:32.655824 1 controller.go:1196] provision "kube-system/csi-pvc" class "hcloud-volumes": started
I0630 21:00:32.682110 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"csi-pvc", UID:"0cf5cabc-9b7a-11e9-b3f4-9600000d4189", APIVersion:"v1", ResourceVersion:"42222536", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "kube-system/csi-pvc"
I0630 21:00:32.705535 1 controller.go:442] CreateVolumeRequest {Name:pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 CapacityRange:required_bytes:10737418240 VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"csi.hetzner.cloud/location" value:"fsn1" > > preferred:<segments:<key:"csi.hetzner.cloud/location" value:"fsn1" > > XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0630 21:00:32.705801 1 connection.go:180] GRPC call: /csi.v1.Controller/CreateVolume
I0630 21:00:32.705817 1 connection.go:181] GRPC request: {"accessibility_requirements":{"preferred":[{"segments":{"csi.hetzner.cloud/location":"fsn1"}}],"requisite":[{"segments":{"csi.hetzner.cloud/location":"fsn1"}}]},"capacity_range":{"required_bytes":10737418240},"name":"pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I0630 21:00:35.405281 1 connection.go:183] GRPC response: {"volume":{"accessible_topology":[{"segments":{"csi.hetzner.cloud/location":"fsn1"}}],"capacity_bytes":10737418240,"volume_id":"2816975"}}
I0630 21:00:35.416118 1 connection.go:184] GRPC error: <nil>
I0630 21:00:35.416148 1 controller.go:486] create volume rep: {CapacityBytes:10737418240 VolumeId:2816975 VolumeContext:map[] ContentSource:<nil> AccessibleTopology:[segments:<key:"csi.hetzner.cloud/location" value:"fsn1" > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0630 21:00:35.416238 1 controller.go:558] successfully created PV {GCEPersistentDisk:nil AWSElasticBlockStore:nil HostPath:nil Glusterfs:nil NFS:nil RBD:nil ISCSI:nil Cinder:nil CephFS:nil FC:nil Flocker:nil FlexVolume:nil AzureFile:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil PortworxVolume:nil ScaleIO:nil Local:nil StorageOS:nil CSI:&CSIPersistentVolumeSource{Driver:csi.hetzner.cloud,VolumeHandle:2816975,ReadOnly:false,FSType:ext4,VolumeAttributes:map[string]string{storage.kubernetes.io/csiProvisionerIdentity: 1561841327391-8081-csi.hetzner.cloud,},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,}}
I0630 21:00:35.416349 1 controller.go:1278] provision "kube-system/csi-pvc" class "hcloud-volumes": volume "pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189" provisioned
I0630 21:00:35.416379 1 controller.go:1295] provision "kube-system/csi-pvc" class "hcloud-volumes": succeeded
I0630 21:00:35.416389 1 volume_store.go:147] Saving volume pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189
I0630 21:00:35.437775 1 volume_store.go:150] Volume pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 saved
The external-provisioner
is an external controller that monitors PersistentVolumeClaim
objects created by user and creates/deletes volumes for them. The external-provisioner
is a sidecar container that dynamically provisions volumes by calling ControllerCreateVolume
and ControllerDeleteVolume
functions of CSI drivers.
In the beginning of the logs we see an Event
(Event(v1.ObjectReference{Kind:"PersistentVolumeClaim"...
) which triggers a CreateVolumeRequest
which causes GRPC call: /csi.v1.Controller/CreateVolume
to be called. You'll also see parameters like "csi.hetzner.cloud/location":"fsn1". This is needed for Topology awareness. The value fsn1 in this case means the Hetzner data center located in Falkenstein. As Hetzner has different data centers it of course makes sense to have the storage in the same data center as the pods ;-).
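If you want to verify in which location a PersistentVolume was provisioned you can inspect its node affinity (the PV name will of course differ in your cluster):

kubectl get pv pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 -o jsonpath='{.spec.nodeAffinity}'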
And finally the logs of the last container which is hcloud-csi-driver
which is part of the StatefulSet
:
level=debug ts=2019-06-30T21:00:32.720427572Z component=grpc-server msg="handling request" req="name:\"pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189\" capacity_range:<required_bytes:10737418240 > volume_capabilities:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > > accessibility_requirements:<requisite:<segments:<key:\"csi.hetzner.cloud/location\" value:\"fsn1\" > > preferred:<segments:<key:\"csi.hetzner.cloud/location\" value:\"fsn1\" > > > "
level=info ts=2019-06-30T21:00:32.72228972Z component=idempotent-volume-service msg="creating volume" name=pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 min-size=10 max-size=0 location=fsn1
level=info ts=2019-06-30T21:00:32.722340541Z component=api-volume-service msg="creating volume" volume-name=pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 volume-size=10 volume-location=fsn1
level=info ts=2019-06-30T21:00:35.401583756Z component=idempotent-volume-service msg="volume created" volume-id=2816975
level=info ts=2019-06-30T21:00:35.402348851Z component=driver-controller-service msg="created volume" volume-id=2816975 volume-name=pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189
level=debug ts=2019-06-30T21:00:35.403048521Z component=grpc-server msg="finished handling request"
level=debug ts=2019-06-30T21:00:35.728688237Z component=grpc-server msg="handling request" req="volume_id:\"2816975\" node_id:\"898689\" volume_capability:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1561841327391-8081-csi.hetzner.cloud\" > "
level=info ts=2019-06-30T21:00:35.728781078Z component=api-volume-service msg="attaching volume" volume-id=2816975 server-id=898689
level=debug ts=2019-06-30T21:00:39.120803804Z component=grpc-server msg="finished handling request"
That one is basically coordinating the volume creation process. Those were the containers that are part of the StatefulSet.
Now let's see what the DaemonSet has to offer. In our case the DaemonSet resource deploys a pod with three containers on every worker node.
We can fetch the logs like this:
kubectl logs hcloud-csi-node-h824m csi-node-driver-registrar
and
kubectl logs hcloud-csi-node-h824m hcloud-csi-driver
So let's have a look at the csi-node-driver-registrar logs:
I0630 20:57:40.840045 1 main.go:110] Version: v1.1.0-0-g80a94421
I0630 20:57:40.840124 1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0630 20:57:40.840143 1 connection.go:151] Connecting to unix:///csi/csi.sock
I0630 20:57:45.125805 1 main.go:127] Calling CSI driver to discover driver name
I0630 20:57:45.125841 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0630 20:57:45.125850 1 connection.go:181] GRPC request: {}
I0630 20:57:45.129031 1 connection.go:183] GRPC response: {"name":"csi.hetzner.cloud","vendor_version":"1.1.4"}
I0630 20:57:45.130010 1 connection.go:184] GRPC error: <nil>
I0630 20:57:45.130023 1 main.go:137] CSI driver name: "csi.hetzner.cloud"
I0630 20:57:45.130410 1 node_register.go:54] Starting Registration Server at: /registration/csi.hetzner.cloud-reg.sock
I0630 20:57:45.130521 1 node_register.go:61] Registration Server started at: /registration/csi.hetzner.cloud-reg.sock
I0630 20:57:45.132094 1 main.go:77] Received GetInfo call: &InfoRequest{}
I0630 20:57:45.169465 1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
The CSI node-driver-registrar
is a sidecar container that fetches driver information (using NodeGetInfo
) from a CSI endpoint and registers it with the kubelet
on that node using the kubelet plugin registration mechanism
. This is necessary because kubelet
is responsible for issuing CSI NodeGetInfo
, NodeStageVolume
, NodePublishVolume
calls. The node-driver-registrar
registers your CSI driver with kubelet
so that it knows which Unix domain socket to issue the CSI calls on.
And finally we’ve the logs from hcloud-csi-driver
:
level=debug ts=2019-06-30T20:57:44.383529379Z msg="getting instance id from metadata service"
level=debug ts=2019-06-30T20:57:44.389617131Z msg="fetching server"
level=info ts=2019-06-30T20:57:44.762089045Z msg="fetched server" server-name=worker01
level=debug ts=2019-06-30T20:57:45.128497624Z component=grpc-server msg="handling request" req=
level=debug ts=2019-06-30T20:57:45.128599355Z component=grpc-server msg="finished handling request"
level=debug ts=2019-06-30T20:57:45.137641968Z component=grpc-server msg="handling request" req=
level=debug ts=2019-06-30T20:57:45.137719407Z component=grpc-server msg="finished handling request"
level=debug ts=2019-06-30T21:00:43.431530684Z component=grpc-server msg="handling request" req=
level=debug ts=2019-06-30T21:00:43.431721975Z component=grpc-server msg="finished handling request"
level=debug ts=2019-06-30T21:00:43.449264791Z component=grpc-server msg="handling request" req="volume_id:\"2816975\" staging_target_path:\"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/globalmount\" volume_capability:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1561841327391-8081-csi.hetzner.cloud\" > "
level=debug ts=2019-06-30T21:00:43.536347081Z component=linux-mount-service msg="staging volume" volume-name=pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 staging-target-path=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/globalmount fs-type=ext4
E0630 21:00:43.540339 1 mount_linux.go:151] Mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/by-id/scsi-0HC_Volume_2816975 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/globalmount
Output: mount: mounting /dev/disk/by-id/scsi-0HC_Volume_2816975 on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/globalmount failed: Invalid argument
level=debug ts=2019-06-30T21:00:44.344371915Z component=grpc-server msg="finished handling request"
level=debug ts=2019-06-30T21:00:44.349042447Z component=grpc-server msg="handling request" req=
level=debug ts=2019-06-30T21:00:44.349115338Z component=grpc-server msg="finished handling request"
level=debug ts=2019-06-30T21:00:44.372982401Z component=grpc-server msg="handling request" req="volume_id:\"2816975\" staging_target_path:\"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/globalmount\" target_path:\"/var/lib/kubelet/pods/196815b9-9b7a-11e9-b3f4-9600000d4189/volumes/kubernetes.io~csi/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/mount\" volume_capability:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1561841327391-8081-csi.hetzner.cloud\" > "
level=debug ts=2019-06-30T21:00:44.431227055Z component=linux-mount-service msg="publishing volume" volume-name=pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189 target-path=/var/lib/kubelet/pods/196815b9-9b7a-11e9-b3f4-9600000d4189/volumes/kubernetes.io~csi/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/mount staging-target-path=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0cf5cabc-9b7a-11e9-b3f4-9600000d4189/globalmount fs-type=ext4 readonly=false additional-mount-options="unsupported value type"
level=debug ts=2019-06-30T21:00:44.43677855Z component=grpc-server msg="finished handling request"
We already used the hcloud-csi-driver container image in the StatefulSet and now it's used again in the DaemonSet. But here different gRPC entrypoints of the driver are used. The Kubernetes kubelet
runs on every node and is responsible for making the CSI Node service calls. These calls mount
and unmount
the storage volume from the storage system, making it available to the pod to consume. kubelet
makes calls to the CSI driver through a UNIX domain socket shared on the host via a HostPath
volume. There is also a second UNIX domain socket that the node-driver-registrar
uses to register the CSI driver to kubelet
.
If you have a look at your worker nodes you’ll find /var/lib/kubelet/plugins/csi.hetzner.cloud/csi.sock
which is the socket mentioned above which kubelet
and CSI driver are sharing (granted that your kubelet
config directory is /var/lib/kubelet/
of course). csi.hetzner.cloud
is the CSI plugin name in this case.
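You can verify that the socket exists with a quick Ansible ad-hoc command against one of your worker nodes (adjust the host name and the kubelet directory to your setup):

ansible -m command -a 'ls -l /var/lib/kubelet/plugins/csi.hetzner.cloud/' worker02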
Now finally to an actual use case I had. I have a blog server called Apache Roller. It's packaged as a .war file and runs with good old Apache Tomcat. As I needed persistent storage where Roller can store assets like uploaded images, I used hostPath. A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. That's of course nothing you normally should do, but it was the only option for me at that time. So the containers spec in the Deployment looked like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  replicas: 1
  selector:
    ...
  template:
    metadata:
      ...
    spec:
      volumes:
        - hostPath:
            path: /opt/roller/logs
          name: logs
        - hostPath:
            path: /opt/roller/rolleridx
          name: rolleridx
        - hostPath:
            path: /opt/roller/docs/resources
          name: resources
      containers:
        - name: roller
          image: your-docker-registry:5000/roller:5.2.3
          ...
          volumeMounts:
            - mountPath: /usr/local/tomcat/logs
              name: logs
            - mountPath: /usr/local/tomcat/rolleridx
              name: rolleridx
            - mountPath: /usr/local/tomcat/resources
              name: resources
As you can see this config will mount three host directories (volumes) "into" the pod (volumeMounts). Now I needed to migrate the data over to the new CSI volume. There are a few ways to do this but I decided to do it this way: First create a PersistentVolume with the help of CSI and mount it temporarily at /data in the pod. Then log into the container manually, create the needed directories once, and copy the data into those directories.
So for the Deployment
I needed to adjust spec.template.spec
. I added this configuration:
volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc
Also in spec.template.spec.containers[].volumeMounts I needed to mount the new volume:
volumeMounts:
  - mountPath: /data
    name: data
The change needs to be applied with kubectl apply -f deployment.yml
.
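For reference: the claimName: pvc used above points to a PersistentVolumeClaim that has to exist before the Deployment is applied. A minimal sketch (the name pvc comes from the claimName above; the size shown is just an example, adjust as needed):

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: hcloud-volumes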
Luckily, in my case I used a Tomcat container that is based on Debian. So I had at least a few common commands like cp, mkdir, chown and bash already installed. So I logged into the pod via
kubectl exec -it tomcat-xxxxxxxx-xxxxx bash
Then I realized that my shiny new /data mount was owned by user root and group root:
ls -al /
...
drwxr-xr-x 6 root root 4096 Jul 17 20:27 data
...
That was bad, as Tomcat was running with the permissions of the user www-data (and the same group). The www-data group has the id 33. What's needed in this case is a securityContext for the Deployment, which looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  replicas: 1
  selector:
    ...
  template:
    metadata:
      ...
    spec:
      securityContext:
        fsGroup: 33
      ...
As already mentioned, 33 is the group id of the group www-data, which is also used by the Tomcat process. Now if I deploy the new setting it looks like this:
ls -al /
...
drwxrwsr-x 6 root www-data 4096 Jul 17 20:27 data
...
Much better :-) So now I was able to create directories like /data/resources and copy the data from the old to the new mount, e.g. cd data; cp --archive /usr/local/tomcat/resources .. The --archive option implies --preserve, which preserves the specified attributes (default: mode, ownership, timestamps) and, if possible, additional attributes: context, links, xattr, all (a little bit like rsync -av ...).
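Inside the pod the whole copy step looked roughly like this (a sketch using the paths from the volumeMounts above):

mkdir -p /data/logs /data/rolleridx /data/resources
cp --archive /usr/local/tomcat/logs/. /data/logs/
cp --archive /usr/local/tomcat/rolleridx/. /data/rolleridx/
cp --archive /usr/local/tomcat/resources/. /data/resources/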
Now that everything is copied to the new directories, I changed the Deployment manifest accordingly so that the CSI volumes are used and removed the hostPath entries. Now the pods are no longer "tied" to a specific Kubernetes worker node as was the case with hostPath.
So this blog post got quite long and if you're still reading: Congratulations! ;-)