
Lab 6 - Kubernetes Storage

Welcome to lab 6. In this session, the following topics are covered:

  • Kubernetes storage basics
  • Persistent Volumes
  • Setting up persistent databases in Kubernetes
  • Setting up and configuring Longhorn persistent storage

Kubernetes storage basics

By default, Kubernetes only includes temporary storage options through volumes. If we want to use more permanent storage, we need to configure persistent Storage Classes, which often require deploying third-party provisioners.

In this lab, we will take a look at some of the Storage building blocks of Kubernetes:

Kubernetes storage objects (Source: link)

Generic Kubernetes Volumes

Currently, our Ghostfolio deployment uses a Postgresql Pod as a database, which stores its data on an ephemeral emptyDir volume (the default).

Let's investigate what happens if Pods get replaced.

Complete

Log into the Ghostfolio application (use any of the node IPs and the port configured in the ghostfolio NodePort service).

Use the "Get Started" button and save the security Token that is provided. This token will be used for loging in to Ghostfolio.

Also, make a new account. The account information is stored in the Postgres DB.

Now, delete the postgresql Pod. As we have configured Postgresql as a StatefulSet, Kubernetes will notice that the number of Pods is smaller than required (1) and will recreate the Pod automatically.
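For example, assuming the StatefulSet is named postgresql, its single Pod will be postgresql-0:

kubectl delete pod postgresql-0
kubectl get pods -w   # watch Kubernetes recreate the Pod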

Verify

Log into Ghostfolio again, and check what happened to the previous account you have created.

PS! You may also need to delete the Ghostfolio Pod to be able to recreate the Security Token.

As we can see, neither the DB nor Ghostfolio itself is resilient to Pod failures.

We must set up persistent data storage for Postgresql.

Info

In the fourth lab, we set up Postgresql as a StatefulSet and defined the data Volume to be of type emptyDir:

  volumeMounts:
  - name: postgresql-data
    mountPath: /bitnami/postgresql
volumes:
  - name: postgresql-data
    emptyDir: {}

Inside the container, the Postgres data is stored at the path /bitnami/postgresql. This path should not be changed, but we can modify what type of Volume is used.

Let's now create a Persistent Volume for our Postgresql Pods.

Persistent Volumes

Before we start creating persistent volumes, we need to define which Storage Classes are available in our Kubernetes cluster.

Complete

Create a new Storage Class template named "local-storage" with the following content:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer 

Some explanations of the values used:

  • provisioner: kubernetes.io/no-provisioner means that we do not use an automated (or dynamic) provisioner, so we will have to create the Persistent Volumes manually.
  • volumeBindingMode: WaitForFirstConsumer means that the PV is not bound until a Pod claiming it is scheduled.
  • reclaimPolicy: Retain means that the PV content should be kept after the claim is released. This setting has no real effect for us, as we are not using a provisioner.

Apply the template to Kubernetes.
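For example, assuming you saved the template as local-storage.yaml (the filename is your choice):

kubectl apply -f local-storage.yaml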

Verify

You can check that the storage class was created and is now available with the following command:

kubectl get storageclass

Complete

Create a "/mnt/data" folder in both Kubernetes nodes. We will store persistent volumes there.

Let's now create a new Persistent Volume postgresql-0-pv. Add the following content to a new pv-postgre.yaml file:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgresql-0-pv
  labels: 
    app: postgresql
    ver: postgresql-0-pv-v1
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data/postgresql-0-pv
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - NODE-HOSTNAME

NODE-HOSTNAME must be replaced with the actual full hostname (e.g., pelle-worker.cloud.ut.ee).

Apply the pv-postgre.yaml template using kubectl.

Also, create a folder for the new volume at /mnt/data/postgresql-0-pv on your first/main machine.

Set its folder permissions to 755.
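For example, on the node you referenced in the PV's node affinity:

sudo mkdir -p /mnt/data/postgresql-0-pv
sudo chmod 755 /mnt/data/postgresql-0-pv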

Depending on how the persistent volume will be accessed inside the pod, you may need to modify its owner or permissions. Keep this in mind, if you later notice that a Pod fails to use the mounted volume properly.

Verify

Check the list of Persistent Volumes:

kubectl get pv

You should now see that a new Persistent Volume is available to be used.

Complete

Let's now create a new Persistent Volume Claim (PVC) postgres-0-pv-claim and reconfigure our postgresql database StatefulSet to use it.

Add the following content to a new pv-claim-postgre.yaml file:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-0-pv-claim
spec:
  storageClassName: local-storage
  selector:
    matchLabels:  # Select a volume with these labels
      app: postgresql
      ver: postgresql-0-pv-v1
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Apply the pv-claim-postgre.yaml template using kubectl.

Modify the postgresql StatefulSet we created in lab 4 (statefulset-postgre.yaml).

Replace the lines:

  volumes:
    - name: postgresql-data
      emptyDir: {}

with:

  volumes:
    - name: postgresql-data
      persistentVolumeClaim:
        claimName: postgres-0-pv-claim

Then apply the modified postgresql StatefulSet template.

Verify

Check that the claim was created:

kubectl get pvc

Also check the content of the configured folder on the cluster node (the one you specified under Node Affinity):

ls /mnt/data -al

Once Postgresql is reconfigured, you should see that a new folder named postgresql-data has been created there.

Info

If you run into issues with creating or testing PVs and PVCs, here are some debugging tips:

  • Use kubectl describe on pv, pvc, and pods to get more information.
  • Use kubectl logs postgresql-0 to check whether the postgresql container starts properly.

Complete

First, make sure you have created a user inside the Ghostfolio Application. Secondly, delete the postgresql-0 Pod and check that the database stays intact afterwards.

Verify

Verify that the user is not deleted.

If the user is still there afterwards, this means that the Persistent Volume was created and is being used properly.

You may also need to delete the Ghostfolio Pod, if the DB deletion caused your previous Security Token to stop working.

Info

As you can see, there were quite a few manual steps that had to be performed, like manual folder preparation and permission management.

Such steps can be automated with third-party local-storage provisioners, which take care of dynamically preparing folders and their permissions, but there are also other disadvantages to using local volumes.

For instance, Pods can only be deployed on the nodes where the required PVs are located, which can lead to unbalanced node load.

Let's next take a look at a more powerful Storage Class: Longhorn.

Preparing the nodes for Longhorn

You'll need to configure BOTH of the nodes for compatibility with Longhorn.

Complete

You'll be completing this section based on the Longhorn documentation.

There are a few packages you'll need to install using the package manager:

  • nfs-utils
  • iscsi-initiator-utils
  • jq

Make sure to also load the iscsi_tcp kernel module and make it persistent across reboots. You can achieve this by loading the module with the modprobe command and adding the kernel module name to a file in /etc/modules-load.d/<file>.conf.

Finally, start and enable the iscsid systemd service.
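A possible sequence on a RHEL-family node (the package manager and the modules-load.d file name are assumptions, adjust them to your distribution):

sudo dnf install -y nfs-utils iscsi-initiator-utils jq
sudo modprobe iscsi_tcp
echo "iscsi_tcp" | sudo tee /etc/modules-load.d/iscsi_tcp.conf
sudo systemctl enable --now iscsid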

Verify

Verifying things like this is usually not easy when Kubernetes is involved, but thankfully the Longhorn developers have published a tool that does this for you.

On your control plane machine, with KUBECONFIG properly exported, run this command:

curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.5.1/scripts/environment_check.sh | bash

This tool will run for a while and let you know if there are any issues.

Danger

Never run scripts in this fashion if you don't understand them. This pattern is sadly very popular, but from a security standpoint, you should not download a file and then instantly execute it.

Installing Longhorn

In this task we will install Longhorn inside our cluster as a set of Kubernetes entities.

Complete

Do this part only if the verification of the previous step did not bring up any issues. Perform these tasks on the main server only.

There are several ways to install Longhorn, but we'll use the simplest and quickest one, which uses kubectl.

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.5.1/deploy/longhorn.yaml

This command downloads the manifest from the URL and applies it to the cluster.

Verify

You can verify the installation by checking the pods in the longhorn-system namespace. They should all reach the Running state in a minute or two.
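For example:

kubectl -n longhorn-system get pods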

You can also run the kubectl get storageclass command to see if a default StorageClass was created. The StorageClass defines which storage system your deployments use, in case you have multiple in the cluster. The default one is used when no StorageClass is specified in a claim.

First Longhorn workload

Let's now create a Pod that uses Longhorn volumes.

Complete

Create a namespace called storage and apply the following manifest (example commands are shown after it):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-pvc
  namespace: storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: writer-pod
  namespace: storage
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh", "-c", "echo 'Hello, Persistent Storage!' >> /data/hello.txt; sleep 3600"]
    volumeMounts:
    - name: longhorn-volume
      mountPath: /data
  volumes:
  - name: longhorn-volume
    persistentVolumeClaim:
      claimName: longhorn-pvc
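
A minimal way to run this, assuming you saved the manifest as longhorn-test.yaml (the filename is your choice):

kubectl create namespace storage
kubectl apply -f longhorn-test.yaml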

Verify

You'll be able to see from kubectl -n storage get persistentvolumeclaim (short: kubectl -n storage get pvc) and kubectl get persistentvolume (short: kubectl get pv) that a persistent volume has been created for your pod.

When you exec into your pod, you'll see that the container has created a file at /data/hello.txt. Each time the pod starts, a new line is appended to the file.
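For example:

kubectl -n storage exec -it writer-pod -- cat /data/hello.txt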

You can delete and recreate the pod until it goes to another node, and you'll still be able to see the file and its content there. You can also attach the PVC to another container, and you'll still see the file there.

NB! The pod is not automatically recreated as we did not define a Deployment or a StatefulSet for it.

Migrating the database to use Longhorn volumes

The last task will be to migrate the Postgresql database from local persistent volumes to Longhorn volumes.

Complete

Create a new Longhorn type PersistentVolumeClaim for the Postgresql StatefulSet.

NB! Name the persistent volume claim postgresql-lh-pvc

Reconfigure Postgresql to claim the Longhorn Persistent Volume instead of the local volume type.
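A minimal sketch of such a claim, assuming 1Gi of capacity and that it lives in the same namespace as the Postgresql StatefulSet (adjust both to your setup):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-lh-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi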

Because Postgresql does not run as a root user inside the container, we need to make sure that the directory permissions are correct in the mounted volume. At the start of the lab, we did this manually for the local volume. Now we will automate it using a Kubernetes init container. Add a new init container to the Postgresql StatefulSet which changes the owner of the mounted volume before the Postgresql container is started.

Add the following inside the Pod template's spec: block (spec.template.spec) of the Postgresql StatefulSet template:

      initContainers:
      - name: permission-init
        image: alpine:latest
        command:
        - sh
        - -c
        # The bitnami Postgresql image runs as UID/GID 1001, so hand the volume over to that user
        - chown 1001:1001 /bitnami/postgresql
        volumeMounts:
          - name: postgresql-data
            mountPath: /bitnami/postgresql
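
After adding the init container (and switching the volume to the postgresql-lh-pvc claim), re-apply the StatefulSet template and watch the Pod restart:

kubectl apply -f statefulset-postgre.yaml
kubectl get pods -w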

Verify

Check that the Postgresql Pod stays in the Running status.

Also, make sure the ghostfolio app works properly after the change.

NB! You may also need to delete the Ghostfolio Pod to be able to recreate the Security Token, as the database has been reinitialized again. Without also reinitializing Ghostfolio, it may not offer the ability to create a new token. Once the database is set up persistently, the token will not disappear again.