Lab 4 - Kubernetes Workloads
Introduction
Welcome to lab 4. In this session, the following topics are covered:
- Getting familiar with basic Kubernetes resources;
- Deployment of NGINX server on Kubernetes;
- Setup of different Kubernetes workloads for price-fetcher-service;
- Setup of a Kubernetes workload for the MinIO server.
Workload basics
In terms of Kubernetes, a manifest is a .yaml file describing a Kubernetes-managed resource. Typical resources are Pod (container wrapper), Deployment (management unit for stateless Pods, for example web services), ConfigMap (set of configuration data), Secret (set of base64-encoded key-value pairs for sensitive data), and StatefulSet (management unit for stateful Pods, like database services).
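If you want a quick reminder of which resource kinds your cluster actually supports, kubectl can list them; a minimal check might look like this (the exact output depends on your cluster version):
# List all resource kinds known to the cluster, with their short names and API versions
kubectl api-resources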
Pod workload
A Pod is the minimal unit of workload in Kubernetes, which represents a set of containers with common storage and network resources.
An example manifest file for a Pod with an NGINX server container looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  containers:
  - name: nginx
    image: registry.hpc.ut.ee/mirror/library/nginx:1.27.1
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "200m"
    ports:
    - containerPort: 80
      name: http
The major sections are:
- apiVersion - the version of the Kubernetes API;
- kind - the kind of Kubernetes resource, Pod in this case;
- metadata - describes the Pod name and labels used for filtering;
- spec - describes the specification for containers, volumes, etc.;
- containers - describes settings for the containers within the Pod, including container images, exposed ports and computing resource limits.
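If any of these fields are unclear, kubectl explain prints the built-in documentation for a field straight from the API schema, for example:
# Show the schema documentation for the containers field of a Pod spec
kubectl explain pod.spec.containers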
Complete
First of all, create a separate namespace test for deploying and testing the example resources.
# Set up the kubeconfig access file for the centos user if you haven't done it in the previous lab
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Create a namespace test
kubectl create namespace test
To deploy the Pod on the Kubernetes cluster, put this manifest into a file (for example pod-nginx.yaml) and run:
kubectl apply -f pod-nginx.yaml --namespace test
# pod/nginx created
You can get the Pod info:
kubectl get pod nginx -n test
# NAME READY STATUS RESTARTS AGE
# nginx 1/1 Running 0 2m43s
Feel free to log in to the Pod's container and explore it:
kubectl exec -it -n test nginx -- /bin/bash
# root@nginx:/#
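Besides an interactive shell, you can also check the container's logs; for NGINX these should include the entrypoint startup messages and access log lines:
kubectl logs nginx -n test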
Use this command to get detailed info about the Pod.
kubectl describe pod nginx -n test
# Name: nginx
# Namespace:   test
# ...
# Status: Running
# IP: 10.0.1.244
# ...
It does the same as nerdctl inspect for a container, but also includes Kubernetes-related metadata.
Validate
You can access the server welcome page using the Pod's IP:
curl 10.0.1.244
#<a href="http://nginx.com/">nginx.com</a>.</p>
#
#<p><em>Thank you for using nginx.</em></p>
#...
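The Pod IP on your cluster will differ from the one above; one way to look it up without scanning the kubectl describe output is a jsonpath query:
kubectl get pod nginx -n test -o jsonpath='{.status.podIP}'
# 10.0.1.244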
Although accessing the Pod via a browser isn't possible yet, the next lab explains how to do it.
Deployment workload
One of the most powerful features of Kubernetes is Pod lifecycle management. It covers many cases, for example restarting a Pod when an application container crashes with an error. To handle these situations, Kubernetes has a higher-level resource called Deployment, which manages stateless Pods.
As you can check, when the NGINX Pod is removed, Kubernetes won't recreate it automatically:
kubectl delete pod nginx -n test
# pod "nginx" deleted
kubectl get pod nginx -n test
# Error from server (NotFound): pods "nginx" not found
Complete
Now, explore the possibilities of a Deployment resource. The Deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: registry.hpc.ut.ee/mirror/library/nginx:1.27.1
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        ports:
        - containerPort: 80
          name: http
Put the content into a deployment-nginx.yaml file and create the Kubernetes resource:
kubectl apply -f deployment-nginx.yaml -n test
# deployment.apps/nginx created
Verify
After creation, you can find both the Deployment and the linked Pod in the same namespace:
kubectl get deployment nginx -n test
# NAME READY UP-TO-DATE AVAILABLE AGE
# nginx 1/1 1 1 35s
kubectl get pod -l app=nginx -n test
# NAME READY STATUS RESTARTS AGE
# nginx-74547bd6d7-smgmk 1/1 Running 0 51s
kubectl describe deployment nginx -n test
As you can see from the last command, the Deployment doesn't show container-specific runtime details; rather, it holds the Pod template and availability info.
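To see the lifecycle management in action, delete the Pod managed by the Deployment and watch Kubernetes recreate it (the Pod name suffixes on your cluster will differ):
# Delete the Pod via its label
kubectl delete pod -l app=nginx -n test
# pod "nginx-74547bd6d7-smgmk" deleted
# A replacement Pod appears almost immediately with a new random suffix
kubectl get pod -l app=nginx -n test
# NAME                     READY   STATUS    RESTARTS   AGE
# nginx-74547bd6d7-q2xkz   1/1     Running   0          6s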
Use-case
Price Fetcher
Deployment
Now you can create a Kubernetes Deployment for the price-fetcher-service you created during previous labs.
Bug
The container image for price-fetcher-service is in a registry available to your control-plane VM only. To deploy a container on Kubernetes, the image must be accessible to all the VMs. One option is to deploy a private registry in Kubernetes and use it to pull images, but we won't cover that in this lab. To simplify the process, we will use Dockerhub.
Complete
Let's build the image and push it to Dockerhub.
Firstly, create an account at Dockerhub and construct an auth file for the kaniko builder:
# For security reasons, it is better to execute this script on your PC (the following lines work on Linux)
# Paste username and password of your user into these 2 variables
export DOCKER_USER="change_me"
export DOCKER_PASSWORD="change_me"
# Encode your docker username and password using base64 and the format "username:password"
token=$(printf "%s:%s" "${DOCKER_USER}" "${DOCKER_PASSWORD}" | base64 | tr -d '\n')
# Print and copy the secret
echo $token
Copy the printed value and create a docker-login.json file on your VM:
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "value_you_copied_at_the_previous_step"
    }
  }
}
Save the file as /home/centos/docker-login.json. Secondly, build and push the image:
# Go to the directory with your price-fetcher-service code
cd ...
# Build your image and push it to Dockerhub
export DOCKER_USER="change_me"
sudo nerdctl run --network host -v /home/centos/docker-login.json:/kaniko/.docker/config.json -v ${PWD}:/workspace gcr.io/kaniko-project/executor:latest --destination=docker.io/${DOCKER_USER}/price-fetcher-service:latest
Validate
Go to Dockerhub, log in and view your image in the list.
Complete
Now, we are ready to create a Kubernetes Deployment for the price-fetcher-service. Use the following content for deployment-price-fetcher.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: price-fetcher
spec:
  selector:
    matchLabels:
      app: price-fetcher
  template:
    metadata:
      labels:
        app: price-fetcher
    spec:
      containers:
      - name: price-fetcher
        image: registry.hpc.ut.ee/mirror/CHANGE_ME/price-fetcher-service:latest
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
NB: don't forget to replace the CHANGE_ME part in the image name with your Dockerhub username.
We use registry.hpc.ut.ee/mirror/ as a mirror to avoid the Dockerhub pull limit.
Now, start the Deployment in the test namespace.
kubectl apply -f deployment-price-fetcher.yaml -n test
# deployment.apps/price-fetcher created
Validate
You can check the running Deployment in the test namespace:
kubectl get pods -n test -l app=price-fetcher
# NAME READY STATUS RESTARTS AGE
# price-fetcher-77675578cf-9js87 1/1 Running 0 4m44s
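If the Pod does not reach the Running state, or you simply want to see what the container prints, its logs are a good first check (the output depends on how your image is built):
kubectl logs -l app=price-fetcher -n test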
ConfigMap
Now that the app is running, let's make it configurable. For example, we can add an option to modify the cron schedule via a ConfigMap. A ConfigMap is a set of key-value records, which can be used as environment variables or mounted into containers as files.
Danger
In the following examples, we assume that the price fetcher script is inside the container in the /app folder and that the name of the script is price_fetcher.py. This is slightly different from the example in the second lab.
The script location and name are likely to be different inside the container that you have prepared, so you will have to modify these details accordingly in the following manifests.
Complete
Create a ConfigMap manifest file configmap-cron-schedule.yaml with this content:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cron-schedule
  namespace: test
  labels:
    app: price-fetcher
data:
  cronjobs: |
    0 12 * * * python3 /app/price_fetcher.py
Create the ConfigMap object in the test namespace:
kubectl apply -f configmap-cron-schedule.yaml
# Validate it exists in the namespace
kubectl get configmap -n test -l app='price-fetcher'
# NAME DATA AGE
# cron-schedule 1 70s
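You can also print the stored schedule back to verify the data survived the round trip:
kubectl get configmap cron-schedule -n test -o jsonpath='{.data.cronjobs}'
# 0 12 * * * python3 /app/price_fetcher.py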
After this, modify the deployment-price-fetcher.yaml file in the following way:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: price-fetcher
spec:
  # ...
  template:
    metadata:
      labels:
        app: price-fetcher
    spec:
      containers:
      - name: price-fetcher
        # ...
        volumeMounts: #(4)
        - name: cronjobs-file #(5)
          mountPath: /etc/crontabs/root #(6)
          subPath: cronjobs #(7)
      volumes: #(1)
      - name: cronjobs-file #(2)
        configMap:
          name: cron-schedule #(3)
- List of volumes used by the Deployment
- Volume name used as a reference within the Deployment
- The name of the ConfigMap being used
- List of volumes mounted into the container
- Volume name (must match one in the volumes list)
- The path where the volume is mounted
- The name of the key from the mounted volume. This allows us to use only a single record from it, instead of mounting the entire ConfigMap as a directory.
Info
NB: when you use the subPath field, you need to restart the Deployment every time the content of the respective record is updated. Kubernetes doesn't keep the Deployment up-to-date with ConfigMaps mounted this way, so we have to do it manually. To restart a Deployment, use:
kubectl rollout restart deployment example-deployment -n example-namespace
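For the Deployment used in this lab, that would be:
kubectl rollout restart deployment price-fetcher -n test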
Complete
After modifying the deployment-price-fetcher.yaml file, apply it to update the price-fetcher Deployment:
kubectl apply -f deployment-price-fetcher.yaml -n test
Validate
After a cron job is triggered, you can check whether the prices.csv file was fetched:
$ kubectl exec -it deployment/price-fetcher -n test -- sh
# ...
/app $ cat /tmp/price-data/prices.csv
# "Ajatempel (UTC)";"KuupƤev (Eesti aeg)";"NPS Eesti"
# ...
CronJob
Deployment is a useful Kubernetes resource for stateless apps, but there is a resource that fits the needs of running cron jobs better. That resource is called CronJob, and it describes a Kubernetes Job that runs on a schedule.
Complete
The current price-fetcher-service image already includes crond as the start command for a container. Let's change the Dockerfile, as we don't need crond in the containers anymore:
FROM registry.hpc.ut.ee/mirror/library/alpine
...
# Changes start here:
# Start the python script
#(1)
CMD [ "python3", "/app/price_fetcher.py" ]
- Replace crond with direct execution of the script. A container with this image runs until the script finishes.
We need to push this image to Dockerhub under a different name, since the image does the same job but in a different way.
export DOCKER_USER="CHANGE_ME"
sudo nerdctl run --network host -v /home/centos/docker-login.json:/kaniko/.docker/config.json -v ${PWD}:/workspace gcr.io/kaniko-project/executor:latest --destination=docker.io/${DOCKER_USER}/price-fetcher-script:1.0.0 #(1)
- Note the different name: price-fetcher-script instead of price-fetcher-service, as well as a numeric tag instead of latest.
Using the latest image tag in Kubernetes workloads is not a good practice: it is unreliable because the tag is dynamic in time. For the price-fetcher-script image, you should use a fixed numeric version, for example 1.0.0.
Info
This script is a complete component of the use-case system, which should run in a separate namespace. For this, we need to create a production namespace first and then deploy the production-ready components in it. The Kubernetes resources within this namespace will be checked by the scoring system. These checks will indicate completion of future labs.
Complete
Create a production namespace:
kubectl create namespace production
Prepare a CronJob manifest cronjob-price-fetcher.yaml for the price-fetcher-script:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: price-fetcher
  labels:
    app: electricity-calculator
    microservice: price-fetcher
spec:
  timeZone: "Etc/UTC" #(1)
  schedule: "0 12 * * *" #(2)
  jobTemplate: #(3)
    spec:
      template:
        spec:
          containers:
          - name: price-fetcher
            image: registry.hpc.ut.ee/mirror/CHANGE_ME/price-fetcher-script:1.0.0 #(4)
            resources:
              requests:
                memory: "128Mi"
                cpu: "100m"
              limits:
                memory: "256Mi"
                cpu: "200m"
          restartPolicy: OnFailure
- You can set a custom timezone for the CronJob
- Cron-styled schedule
- A job template
- Note the new image name; also replace CHANGE_ME with your username
The CronJob doesn't need a ConfigMap with a schedule file, because the schedule is a part of the manifest.
Now, deploy this CronJob into the production namespace:
kubectl apply -f cronjob-price-fetcher.yaml -n production
# cronjob.batch/price-fetcher created
kubectl get cronjob -l app=electricity-calculator -l microservice=price-fetcher -n production
# NAME SCHEDULE TIMEZONE SUSPEND ACTIVE LAST SCHEDULE AGE
# price-fetcher 0 12 * * * Etc/UTC False 0 <none> 24s
Validate
Trigger the CronJob manually to check if it works correctly:
# Trigger the CronJob manually
kubectl create job --from=cronjob/price-fetcher price-fetcher-cronjob-test -n production
# Check if a Job is created
kubectl get job price-fetcher-cronjob-test -n production
# Check if a pod is created
kubectl get pod -l job-name=price-fetcher-cronjob-test -n production
# View the Pod's logs
kubectl logs -l job-name=price-fetcher-cronjob-test -n production
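Once you have checked the logs, you can remove the manually created Job so it doesn't linger next to the scheduled runs (deleting the Job also removes its Pod):
kubectl delete job price-fetcher-cronjob-test -n production
# job.batch "price-fetcher-cronjob-test" deleted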
For now, this CronJob doesn't save the prices.csv file anywhere. In future labs, we will make the price-fetcher script send this file to the history-server service.
MinIO
MinIO is an S3-compatible object storage. In this course, it serves the persistent storage needs of the history-server. That server is not implemented yet, but in this lab you will deploy MinIO into your Kubernetes cluster and make use of it in further labs.
Secret
In Kubernetes, the proper way to store static passwords, tokens, etc. is the Secret resource. In this lab, you will store MinIO admin credentials in a Secret. These credentials are shared: the MinIO server uses them for the initial admin setup, and the history-server uses them to access the MinIO instance.
Info
A Secret represents a set of key-value pairs, where a key is an alias and a value is a base64-encoded secret. This resource is namespace-scoped, therefore a Secret can't be shared between namespaces.
Complete
Create a file secret-minio.yaml with this content:
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
type: Opaque
data:
  MINIO_ROOT_USER: "YWRtaW51c2Vy" #(1)
  MINIO_ROOT_PASSWORD: "MjA5NWY0MTQ3ZTA2OTFjZGY2ZmFhNGMyNTNhOGZhOWEzOGQzMmE2Mg==" #(2)
- Base64-encoded "adminuser" username
- Base64-encoded random string used as password
The Opaque type means the data in the Secret is generic and doesn't relate to the Kubernetes cluster. The values in the data section must be encoded by the user, otherwise the cluster raises the error: error decoding from json: illegal base64 data.
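If you prefer to generate your own credentials instead of reusing the values above, one way to encode them looks like this (printf avoids a trailing newline sneaking into the encoded value; the openssl call is just one option for producing a random password):
# Encode the username
printf "%s" "adminuser" | base64
# YWRtaW51c2Vy
# Generate a random password and encode it
printf "%s" "$(openssl rand -hex 20)" | base64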
Apply the file to the Kubernetes cluster:
kubectl apply -f secret-minio.yaml -n production
# secret/minio-secret created
Validate
Make sure the value of the MINIO_ROOT_USER key in the Secret matches the adminuser string:
kubectl get secret minio-secret -n production -o jsonpath='{.data.MINIO_ROOT_USER}' | base64 -d
# adminuser
StatefulSet
The last step of this lab is the creation of a StatefulSet for the MinIO server. A StatefulSet is a management resource similar to a Deployment, but it controls stateful applications. The key differences between the two:
- StatefulSet creates and removes pods in a strict order;
- Deployment Pods are identical and can be interchanged;
- Deployment assigns random name suffix for Pods, StatefulSet uses sequential numbers;
- Pods managed by a Deployment share the same persistent storage, while a StatefulSet creates a volume for each replica.
Storage setup for Kubernetes is not covered in this lab, so instead of persistent volumes you will use ephemeral storage.
Complete
Use this manifest to create the MinIO StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: minio
  labels:
    app: minio
spec:
  selector:
    matchLabels:
      app: minio
  serviceName: minio
  replicas: 1
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - name: minio
        image: registry.hpc.ut.ee/mirror/minio/minio:RELEASE.2024-09-13T20-26-02Z #(1)
        args: #(2)
        - server
        - /storage
        env: #(3)
        - name: MINIO_ROOT_USER
          valueFrom:
            secretKeyRef:
              name: minio-secret
              key: MINIO_ROOT_USER
        - name: MINIO_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: minio-secret
              key: MINIO_ROOT_PASSWORD
        ports:
        - containerPort: 9000
          hostPort: 9000
        volumeMounts: #(4)
        - name: minio-storage
          mountPath: "/storage"
      volumes: #(5)
      - name: minio-storage
        emptyDir: {} #(6)
- This tag is fixed, so after every Pod recreation the image content stays the same
- Custom args for the container
- Environment variables populated from the Secret data
- List of volume mounts
- List of volumes
- For now, we use emptyDir - ephemeral storage without persistence. Its lifecycle is bound to the Pod's.
As you can notice, the minio StatefulSet references the minio-secret in the manifest. This approach is helpful when an application needs to share credentials with a storage client service like the history-server.
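Assuming you saved the manifest as statefulset-minio.yaml (the filename is up to you), apply it to the production namespace and wait for the Pod to come up:
kubectl apply -f statefulset-minio.yaml -n production
# statefulset.apps/minio created
kubectl get pod minio-0 -n production
# NAME      READY   STATUS    RESTARTS   AGE
# minio-0   1/1     Running   0          30s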
Validate
Now you can log in to the minio-0 Pod, create a storage alias and create a bucket in it.
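A minimal way to get a shell inside the Pod (the mc client used below is already available in the MinIO container, and the MINIO_ROOT_* variables come from the Secret):
kubectl exec -it minio-0 -n production -- sh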
# Create an alias for the current session
# Using this alias we can manage the object storage
mc alias set default http://0.0.0.0:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
# Create a "price-data" bucket in the storage
mc mb default/price-data
# Bucket created successfully `default/price-data`.
# Check if the bucket was created
mc ls default
#[... UTC] 0B price-data/