Lab 6 - Kubernetes Storage
Welcome to lab 6. In this session, the following topics are covered:
- Kubernetes storage basics
- Persistent Volumes
- Setting up persistent databases in Kubernetes
- Setting up and configuring Longhorn persistent storage
Kubernetes storage basics¶
By default, Kubernetes only includes temporary storage options through volumes. If we want to use more permanent storage, we need to configure persistent Storage Classes. Often these require deploying third-party provisioners.
In this lab, we will take a look at some of the storage building blocks of Kubernetes.
Generic Kubernetes Volumes¶
Our History data server deployment uses a MinIO Pod as a database, which currently uses the default emptyDir ephemeral volume type for data storage.
Let's investigate what happens if MinIO Pods get replaced.
Complete
Make sure some data is stored in the History data server and send a GET request to fetch the data (use the history-server service IP).
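For example, a hedged sketch of the request, where the service IP, port and API path are placeholders you should replace with the values used in the earlier labs:
curl http://<history-server-service-IP>:<port>/<data-endpoint>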
Now, delete the MinIO Pod. As we have configured it as a StatefulSet, Kubernetes will notice that the number of Pods is smaller than required (1) and will recreate the Pod automatically.
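For example, assuming the MinIO Pod is named minio-0 and runs in the production namespace (as set up in lab 4):
kubectl delete pod minio-0 -n production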
Verify
Send a GET request to the history data server again, and check what happened to the previously stored data.
As we can see, the MinIO database is not currently resilient to Pod failures.
We must set up persistent data storage for MinIO.
Info
In the fourth lab, we set up MinIO as a StatefulSet and defined the data Volume to be of type emptyDir:
volumeMounts:
- name: minio-storage
  mountPath: "/storage"
volumes:
- name: minio-storage
  emptyDir: {}
Inside the container, the MinIO data is stored at the path /storage. This path should not be changed, but we can modify what type of Volume is used.
Let's now create a Persistent Volume for our MinIO Pods.
Persistent Volumes¶
Before we start creating persistent volumes, we need to define which Storage Classes are available in our Kubernetes cluster.
Complete
Create a new Storage Class template named "local-storage" with the following content:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
Some explanations of the values used:
- provisioner: kubernetes.io/no-provisioner means that we do not use an automated (dynamic) provisioner, so we will have to create the Persistent Volumes manually.
- volumeBindingMode: WaitForFirstConsumer means that the PV is not bound until a Pod claiming it is scheduled.
- reclaimPolicy: Retain means that the PV content should be kept after the claim is released. This setting has no real effect for us, as we are not using a provisioner.
Apply the template to Kubernetes.
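For example, assuming you saved the template as storageclass-local.yaml (the filename is an assumption):
kubectl apply -f storageclass-local.yaml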
Verify
You can check that the storage class was created and is now available with the following command:
kubectl get storageclass
Complete
Create a "/mnt/data" folder in all Kubernetes nodes. We will store persistent volumes there.
Let's now create a new Persistent Volume minio-pv-0. Add the following content to a new pv-minio.yaml file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-pv-0
  labels:
    app: minio
    ver: minio-pv-v1-0
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data/minio-pv-0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - NODE-HOSTNAME
NODE-HOSTNAME must be replaced with the actual full hostname of one of your nodes (e.g., pelle-worker-a.cloud.ut.ee). Apply the pv-minio.yaml template using kubectl.
Also, create a folder for the new volume at /mnt/data/minio-pv-0 on that same node.
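For example (run the mkdir on the node you selected under nodeAffinity, and kubectl on the Controller node):
sudo mkdir -p /mnt/data/minio-pv-0
kubectl apply -f pv-minio.yaml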
Verify
Check the list of Persistent Volumes:
kubectl get pv
You should now see that a new Persistent Volume is available to be used.
Complete
Let's now create a new Persistent Volume Claim (PVC) minio-pvc-0 and reconfigure our MinIO database StatefulSet to use it.
Add the following content to a new pvc-minio.yaml file:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-pvc-0
spec:
  storageClassName: local-storage
  selector:
    matchLabels: # Select a volume with these labels
      app: minio
      ver: minio-pv-v1-0
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply the pvc-minio.yaml template using kubectl. Note that the claim must be created in the same namespace as the MinIO StatefulSet (production), so apply it with the -n production flag (or add namespace: production to the metadata).
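For example:
kubectl apply -f pvc-minio.yaml -n production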
Modify the MinIO StatefulSet we created in lab 4 (statefulset-minio.yaml).
Replace the lines:
volumes:
- name: minio-storage
  emptyDir: {}
with:
volumes:
- name: minio-storage
  persistentVolumeClaim:
    claimName: minio-pvc-0
And apply the modified MinIO StatefulSet template (remember to use the production namespace!).
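For example:
kubectl apply -f statefulset-minio.yaml -n production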
Verify
Check that the claim was created:
kubectl get pvc -n production
Also check the content of the configured folder on the correct cluster node (the one you specified under nodeAffinity):
ls /mnt/data/minio-pv-0 -al
Once MinIO is reconfigured and the history server has created the MinIO bucket, you should see that a new folder named price-data has been created there.
Info
If you run into issues with creating or testing PVs and PVCs, here are some debugging tips:
- Use kubectl describe on pv, pvc, and pods to get more information.
- Use kubectl logs minio-0 to check whether the MinIO container starts properly.
Complete
Make sure some data is stored in the History data server and send a GET request to fetch the data (use the history-server service IP).
Delete the minio-0 Pod and check that the database stays intact afterwards.
Verify
Send a GET request to the history data server again, and check what happened to the previously stored data.
If the price data is still there afterwards, this means that the Persistent Volume was created and is being used properly.
Info
As you can see, there were quite a few manual steps that had to be performed, like manual folder preparation and permission management.
Such steps can be automated with third-party local-storage provisioners, which take care of dynamically preparing folders and their permissions, but there are also other disadvantages to using local volumes.
For instance, Pods can only be scheduled on nodes where the required PVs are located, which can lead to unbalanced node load.
Let's next take a look at a more powerful Storage Class: Longhorn.
Preparing the nodes for Longhorn¶
In this task, we will prepare the nodes for the installation of the Longhorn storage controller, which will orchestrate the provisioning of reliable, replicated storage volumes in our Kubernetes cluster.
Complete
You'll be completing this section based on the Longhorn documentation.
Usually, installing Longhorn requires you to prepare the machines with additional packages and kernel modules, but thankfully the Longhorn developers have published a tool that does this for you.
Switch to the root user and export the KUBECONFIG variable:
export KUBECONFIG=/etc/kubernetes/admin.conf
Download the Longhorn command line client:
curl -sSfL -o longhornctl https://github.com/longhorn/cli/releases/download/v1.7.1/longhornctl-linux-amd64
And set execution permissions: chmod +x longhornctl
After that, run the following command on the Controller node to install everything necessary on all the Kubernetes nodes:
./longhornctl install preflight
This tool will run for a while. If you are interested in what is happening, you can log into the Controller node in a different terminal. You should notice that this command has deployed Pods with elevated permissions on all the Kubernetes nodes; these install the required libraries and kernel modules and configure the nodes as needed for running Longhorn.
Verify
Usually, verifying things like this is not easy when Kubernetes is involved, but the same longhornctl command also contains a step for verifying that everything was set up properly.
Run the checking command to verify everything has been installed properly:
./longhornctl check preflight
This tool will run for a while and let you know if there are any issues.
Danger
Never run scripts in this fashion if you don't understand them. This pattern is sadly very popular, but from a security standpoint, you should not download a file and then instantly execute it.
Installing Longhorn¶
In this task, we will install Longhorn inside our cluster as a set of Kubernetes resources.
Complete
Do this part only if the verification of the previous step did not bring up any issues. Perform these tasks on the main server only.
There are several ways to install Longhorn, but we'll use the simplest and quickest one, which uses kubectl.
Download the Longhorn version 1.7.1 manifest:
wget https://raw.githubusercontent.com/longhorn/longhorn/v1.7.1/deploy/longhorn.yaml
Modify the longhorn.yaml file and change the numberOfReplicas parameter of the Longhorn StorageClass from 3 to 2. Otherwise, Longhorn volumes will use up too much of our cluster's storage space. It should look something like:
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
Apply the modified manifest to install Longhorn in your Kubernetes cluster:
kubectl apply -f longhorn.yaml
Verify
You can verify the installation by checking the pods in the longhorn-system namespace. They should all change into the Running state in a minute or two.
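For example:
kubectl get pods -n longhorn-system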
You can also use the kubectl get storageclass command to see if a default storageclass got created. The storageclass defines which storage system you use for your deployments, for when you have multiple in the cluster. The default one is used when a storageclass is not specified. It should show that the Longhorn storage class number of replicas is set to 2.
First Longhorn workload¶
Let's now create a Pod that uses Longhorn volumes.
Complete
Create a namespace called storage, and apply the following manifest:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-pvc
  namespace: storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: writer-pod
  namespace: storage
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh", "-c", "echo 'Hello, Persistent Storage!' >> /data/hello.txt; sleep 3600"]
    volumeMounts:
    - name: longhorn-volume
      mountPath: /data
  volumes:
  - name: longhorn-volume
    persistentVolumeClaim:
      claimName: longhorn-pvc
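For example, assuming you saved the manifest as longhorn-test.yaml (the filename is an assumption):
kubectl create namespace storage
kubectl apply -f longhorn-test.yaml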
Verify
You'll be able to see from kubectl -n storage get persistentvolumeclaim (short form: kubectl -n storage get pvc) and kubectl -n storage get persistentvolume (short form: kubectl get pv) that a persistent volume has been created for your pod.
When you exec into your pod, you'll see that the container has created a file /data/hello.txt. Every time the pod runs, a new line is added to the file.
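For example, you can read the file with:
kubectl -n storage exec writer-pod -- cat /data/hello.txt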
You can delete and recreate the pod until it lands on another node, and you'll still be able to see the file and its content there. You can also attach the PVC to another container, and you'll still see the file there.
NB! The pod is not automatically recreated as we did not define a Deployment or a StatefulSet for it.
Accessing the Longhorn user interface¶
In this task, we will check out the Longhorn user interface, which can be used to manage created volumes and their replicas, change the number of replicas for a volume, or manually clean up volumes after they are no longer needed.
We will not open the Longhorn user interface port to the outside world, because it does not have user authentication. Instead, we will set up Kubernetes port forwarding, which will temporarily route traffic between our local computer and a Kubernetes service.
Complete
Download the Kubernetes administrator config file (located at /etc/kubernetes/admin.conf) to your laptop or PC. You can use the scp command for downloading files from the Virtual Machine. As the file is initially only accessible to the root user, you may need to first copy it into the centos user's home folder and change its permissions before it can be moved with scp.
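A minimal sketch of this, assuming the centos user and a placeholder VM address (replace <vm-address> with your machine's name or IP):
# On the Virtual Machine (as root):
cp /etc/kubernetes/admin.conf /home/centos/admin.conf
chown centos:centos /home/centos/admin.conf
# On your laptop or PC:
scp centos@<vm-address>:admin.conf .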
Install the kubectl tool on your laptop or PC.
Follow the kubectl documentation to make the configuration file that you downloaded accessible for the kubectl tool.
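One common option (an assumption; the kubectl documentation also describes alternatives such as merging the file into ~/.kube/config) is to point the KUBECONFIG environment variable at the downloaded file:
export KUBECONFIG=~/admin.conf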
Verify
Use the typical kubectl commands to check that it is working properly. For example, list all the pods.
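For example:
kubectl get pods --all-namespaces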
Once kubectl is configured and working, you can continue using it directly from your computer, and you no longer need to log into the Virtual Machine to use Kubernetes commands.
Complete
Look up the name of the Longhorn user interface service in the longhorn-system namespace.
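For example:
kubectl get services -n longhorn-system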
Set up port forwarding between a local port on your computer and the Longhorn user interface service in the Kubernetes cluster:
kubectl --namespace longhorn-system port-forward --address 127.0.0.1 service/<name_of_the_service> 5080:80
Verify
Access the configured port from a browser on your computer: http://localhost:5080/#/dashboard
In the future, you can similarly access Kubernetes services from your computer without having to explicitly make them available to everyone.
Migrating the use case database to use Longhorn volumes¶
The last task will be to migrate the MinIO database from local Persistent Volumes to Longhorn volumes. We will not create Persistent Volume Claims manually; instead, we will specify a volume claim template in the StatefulSet manifest. Kubernetes controllers will automatically create a volume claim for every new Pod that is created for the StatefulSet.
Complete
Update the MinIO StatefulSet: remove the volumes: block and instead add a new volumeClaimTemplates: block under the StatefulSet spec: block (NB! Not under the container block!). You will find an example of how to define volume claim templates in the Longhorn GitHub repository.
Name the VolumeClaimTemplate minio-data. Do not use a selector: block. Set the storage request to 1Gi and use longhorn as the storage class (see the sketch below).
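A minimal sketch of what the volumeClaimTemplates: block could look like under the StatefulSet spec:, assuming the values above (the indentation must match the rest of your manifest):
volumeClaimTemplates:
- metadata:
    name: minio-data
  spec:
    accessModes:
      - ReadWriteOnce
    storageClassName: longhorn
    resources:
      requests:
        storage: 1Gi
Remember that the container's volumeMounts: entry must reference the same name (minio-data) so that the claimed volume is mounted at /storage.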
Apply the new StatefulSet manifest. You may need to delete the previous StatefulSet first.
Verify
Check that the MinIO Pod stays in the Running status.
Also, make sure the history data server works properly after the change.
NB! It would also be good to delete the history data server Pod to verify that the data is still there after the Pod is recreated.