Lab 6 - Kubernetes Storage
Welcome to lab 6. In this session, the following topics are covered:
- Kubernetes storage basics
- Persistent Volumes
- Setting up persistent databases in Kubernetes
- Setting up and configuring Longhorn persistent storage
Kubernetes storage basics¶
By default, Kubernetes only includes temporary storage options through volumes. If we want to use more permanent storage, we need to configure persistent Storage Classes, which often require deploying third-party provisioners.
In this lab, we will take a look at some of the storage building blocks of Kubernetes.
Generic Kubernetes Volumes¶
Currently, our Ghostfolio deployment uses a Postgresql Pod as a database, which uses the ephemeral emptyDir volume type for data storage.
Let's investigate what happens if Pods get replaced.
Complete
Log into the Ghostfolio application (use any of the node IPs and the port configured in ghostfolio Nodeport service).
Use the "Get Started" button and save the security Token that is provided. This token will be used for loging in to Ghostfolio.
Also, make a new account. The account information is stored in the Postgres DB.
Now, delete the postgresql Pod. As we have configured postgres as a StatefulSet, Kubernetes will notice that the number of Pods is smaller than required (1) and will recreate the Pod automatically.
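For example, since the StatefulSet names its replicas with an index, the Pod should be called postgresql-0:
kubectl delete pod postgresql-0
kubectl get pods -w
The second command watches the Pods, so you can see the StatefulSet controller recreate the replica.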
Verify
Log into Ghostfolio again, and check what happened to the previous account you have created.
PS! You may also need to delete the Ghostfolio Pod to be able to recreate the Security Token.
As we can see, neither the DB nor Ghostfolio itself is resilient to Pod failures.
We must set up persistent data storage for Postgresql.
Info
In the fourth lab, we set up Postgresql as a StatefulSet and defined the data Volume to be of type emptyDir:
volumeMounts:
- name: postgresql-data
  mountPath: /bitnami/postgresql
volumes:
- name: postgresql-data
  emptyDir: {}
Inside the container, the postgres data is stored in the path /bitnami/postgresql. This path should not be changed, but we can modify what type of Volume is used.
Let's now create a Persistent Volume for our Postgresql Pods.
Persistent Volumes¶
Before we start creating persistent volumes, we need to define which Storage Classes are available in our Kubernetes cluster.
Complete
Create a new Storage Class template named local-storage with the following content:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
Some explanations of the values used:
- provisioner: kubernetes.io/no-provisioner means that we do not use an automated (or dynamic) provisioner, so we will have to create the Persistent Volumes manually.
- volumeBindingMode: WaitForFirstConsumer means that the PV is not bound until a Pod claiming it is provisioned.
- reclaimPolicy: Retain means that the PV content should be kept after the claim is released. This setting has no real effect for us, as we are not using a provisioner.
Apply the template to Kubernetes.
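For example, assuming you saved the template as storageclass-local.yaml (the file name is up to you):
kubectl apply -f storageclass-local.yaml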
Verify
You can check that the storage class was created and is now available with the following command:
kubectl get storageclass
Complete
Create a "/mnt/data" folder in both Kubernetes nodes. We will store persistent volumes there.
Let's now create a new Persistent Volume postgresql-0-pv. Add the following content to a new pv-postgre.yaml file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgresql-0-pv
  labels:
    app: postgresql
    ver: postgresql-0-pv-v1
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data/postgresql-0-pv
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - NODE-HOSTNAME
NODE-HOSTNAME must be replaced with the actual full hostname of the node (e.g., pelle-worker.cloud.ut.ee).
Apply the pv-postgre.yaml template using kubectl.
Also, create a folder for the new volume at /mnt/data/postgresql-0-pv on your first/main machine, and set its permissions to 755.
Depending on how the persistent volume will be accessed inside the Pod, you may need to modify its owner or permissions. Keep this in mind if you later notice that a Pod fails to use the mounted volume properly.
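A minimal sketch of these steps, assuming the file and folder names used above (run the kubectl command where your KUBECONFIG is set up, and the folder commands on the first/main machine):
kubectl apply -f pv-postgre.yaml
sudo mkdir -p /mnt/data/postgresql-0-pv
sudo chmod 755 /mnt/data/postgresql-0-pv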
Verify
Check the list of Persistent Volumes:
kubectl get pv
Complete
Let's now create a new Persistent Volume Claim (PVC) postgres-0-pv-claim and reconfigure our postgresql database StatefulSet to use it.
Add the following content to a new pv-claim-postgre.yaml
file:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-0-pv-claim
spec:
  storageClassName: local-storage
  selector:
    matchLabels: # Select a volume with these labels
      app: postgresql
      ver: postgresql-0-pv-v1
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
Apply the pv-claim-postgre.yaml template using kubectl.
Modify the postgresql StatefulSet we created in lab 4 (statefulset-postgre.yaml).
Replace the lines:
volumes:
- name: postgresql-data
  emptyDir: {}
with:
volumes:
- name: postgresql-data
  persistentVolumeClaim:
    claimName: postgres-0-pv-claim
And apply the modified postgresql StatefulSet template.
Verify
Check that the claim was created:
kubectl get pvc
Also, check the content of the configured folder on the cluster node (the one you specified under nodeAffinity):
ls /mnt/data -al
Once Postgresql is reconfigured, you should see that a new folder named postgresql-data has been created there.
Info
If you run into issues with creating or testing PVs and PVCs, here are some debugging tips:
- Use kubectl describe on pv, pvc, and pods to get more information.
- Use kubectl logs postgresql-0 to check whether the postgresql container starts properly.
Complete
First, make sure you have created a user inside the Ghostfolio application. Then, delete the postgresql-0 Pod and check that the database stays intact afterwards.
Verify
Verify that the user is not deleted.
If the user is still there afterwards, this means that the Persistent Volume was created and is being used properly.
You may also need to delete the Ghostfolio Pod, if the DB deletion caused your previous Security Token to stop working.
Info
As you can see, quite a few manual steps had to be performed, like manual folder preparation and permission management.
Such steps can be automated with third-party local-storage provisioners, which take care of dynamically preparing folders and their permissions, but there are also other disadvantages to using local volumes.
For instance, Pods can only be deployed on the nodes where the required PVs are located, which can lead to unbalanced node load.
Let's next take a look at a more powerful Storage Class: Longhorn.
Preparing the nodes for Longhorn¶
You'll need to configure BOTH of the nodes for compatibility with Longhorn.
Complete
You'll be completing this section based on the Longhorn documentation.
There are a few packages you'll need to install using the package manager:
- nfs-utils
- iscsi-initiator-utils
- jq
Make sure to also load the iscsi_tcp kernel module, and make it persistent across reboots. You can achieve this by loading the module with the modprobe command and adding the kernel module name to a file in /etc/modules-load.d/<file>.conf.
Finally, start and enable the iscsid systemd service.
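A combined sketch of these steps, assuming a dnf-based distribution and a hypothetical /etc/modules-load.d/iscsi_tcp.conf file name (run on both nodes):
sudo dnf install -y nfs-utils iscsi-initiator-utils jq
sudo modprobe iscsi_tcp
echo "iscsi_tcp" | sudo tee /etc/modules-load.d/iscsi_tcp.conf
sudo systemctl enable --now iscsid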
Verify
Usually verifying things like this is not easy when Kubernetes is involved, but thankfully the Longhorn developers have published a tool that does this for you.
On your control plane machine, with KUBECONFIG properly exported, run this command:
curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.5.1/scripts/environment_check.sh | bash
This tool will run for a while and let you know if there are any issues.
Danger
Never run scripts in this fashion if you don't understand them. This pattern is sadly very popular, but from a security standpoint, you should not download a file and then instantly execute it.
Installing Longhorn¶
In this task we will install Longhorn inside our cluster as a set of Kubernetes entities.
Complete
Do this part only if the verification of the previous step did not bring up any issues. Perform these tasks on the main server only.
There are several ways to install Longhorn, but we'll use the simplest and quickest one, which uses kubectl
.
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.5.1/deploy/longhorn.yaml
This command downloads the manifest from the URL and applies it to the cluster.
Verify
You can verify the installation by checking the Pods in the longhorn-system namespace. They should all change into the Running state in a minute or two.
You can also use the kubectl get storageclass command to see whether a default storageclass got created. The storageclass defines which storage system your deployments use when you have multiple in the cluster; the default one is used when no storageclass is specified.
First Longhorn workload¶
Let's now create a Pod that uses Longhorn volumes.
Complete
Create a namespace called storage, and apply the following manifest:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-pvc
  namespace: storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: writer-pod
  namespace: storage
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh", "-c", "echo 'Hello, Persistent Storage!' >> /data/hello.txt; sleep 3600"]
    volumeMounts:
    - name: longhorn-volume
      mountPath: /data
  volumes:
  - name: longhorn-volume
    persistentVolumeClaim:
      claimName: longhorn-pvc
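For example, assuming you saved the manifest as writer-pod.yaml (the file name is an assumption):
kubectl create namespace storage
kubectl apply -f writer-pod.yaml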
Verify
You'll be able to see from kubectl -n storage get persistentvolumeclaim (short form: kubectl -n storage get pvc) and kubectl get persistentvolume (short form: kubectl get pv) that a persistent volume has been created for your Pod.
When you exec into your Pod, you'll see that the container has created the file /data/hello.txt. Every time the Pod runs, a new line is added to the file.
You can delete and recreate the Pod until it gets scheduled on another node, and you'll still be able to see the file and its content there. You can also attach the PVC to another container, and you'll still see the file there.
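For example, you can read the file with:
kubectl -n storage exec writer-pod -- cat /data/hello.txt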
NB! The pod is not automatically recreated as we did not define a Deployment or a StatefulSet for it.
Migrating the database to use Longhorn volumes¶
The last task will be to migrate the Postgresql
database from local persistent volumes to Longhorn volumes.
Complete
Create a new Longhorn type PersistentVolumeClaim for the Postgresql
StatefulSet.
NB! Name the persistent volume claim postgresql-lh-pvc
Reconfigure Postgresql to claim the Longhorn Persistent Volume instead of the local volume type.
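A minimal sketch of such a claim, mirroring the longhorn-pvc example above (the namespace, access mode, and size are assumptions; adjust them to match your Postgresql deployment):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-lh-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi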
Because Postgresql does not run as a root user inside the container, we need to make sure that the directory permissions are correct in the mounted volume. At the start of the lab, we did this manually for the local volume. Now we will automate it using a Kubernetes init container. Add a new init container to the Postgresql StatefulSet which changes the owner of the mounted volume before the Postgresql container is started.
Add the following inside the spec: block of the Postgresql StatefulSet template:
initContainers:
- name: permission-init
  image: alpine:latest
  command:
  - sh
  - -c
  - (chown 1001:1001 /bitnami/postgresql)
  volumeMounts:
  - name: postgresql-data
    mountPath: /bitnami/postgresql
Verify
Check that the Postgresql Pod stays in the Running status.
Also, make sure the Ghostfolio app works properly after the change.
NB! You may also need to delete the Ghostfolio Pod to be able to recreate the Security Token, as the database has been reinitialized again. Without also reinitializing Ghostfolio, it may not offer the ability to create a new token. Once the database is set up persistently, the token will not disappear again.