Lab 4 - Kubernetes Workloads

Introduction

Welcome to lab 4. In this session, the following topics are covered:

  • Getting familiar with basic Kubernetes resources;
  • Deployment of NGINX server on Kubernetes;
  • Setup of different Kubernetes workloads for price-fetcher-service;
  • Setup a Kubernetes workload for MinIO server.

Workload basics

In Kubernetes terms, a manifest is a .yaml file describing a Kubernetes-managed resource. Typical resources are Pod (container wrapper), Deployment (management unit for stateless Pods, for example web services), ConfigMap (set of configuration data), Secret (set of base64-encoded key-value pairs) and StatefulSet (management unit for stateful Pods, like database services).

Pod workload

A Pod is a minimal unit of workload in Kubernetes, which represents a set of containers with common storage and network resources.

An example manifest file for a Pod with an NGINX server container looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  containers:
  - name: nginx
    image: registry.hpc.ut.ee/mirror/library/nginx:1.27.1
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "200m"
    ports:
    - containerPort: 80
      name: http

The major sections are:

  1. apiVersion - a version of Kubernetes API;
  2. kind - a kind of Kubernetes resource, Pod in this case;
  3. metadata describes Pod name and labels for filtering;
  4. spec describes a specification for containers, volumes, etc.;
  5. containers describes settings for containers within a Pod including container images, exposed ports and computing resource limits.

Complete

First of all, create a separate namespace named test for deploying and testing the example resources.

# Set up the kubeconfig access file for the centos user if you haven't done it in the previous lab
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Create a namespace test
kubectl create namespace test

To deploy the Pod on the Kubernetes cluster, put this manifest into a file (for example pod-nginx.yaml) and run:

kubectl apply -f pod-nginx.yaml --namespace test
# pod/nginx created

You can get the Pod info:

kubectl get pod nginx -n test
# NAME    READY   STATUS    RESTARTS   AGE
# nginx   1/1     Running   0          2m43s

Feel free to login into the Pod's container and explore it:

kubectl exec -it -n test nginx -- /bin/bash
# root@nginx:/#

Use this command to get detailed info about the Pod.

kubectl describe pod nginx -n test
# Name:             nginx
# Namespace:        test
# ...
# Status:           Running
# IP:               10.0.1.244
# ...

This does the same as nerdctl container inspect, but also includes Kubernetes-related metadata.

Validate

You can access the server welcome page using the Pod's IP (shown in the output above):

curl 10.0.1.244
#<a href="http://nginx.com/">nginx.com</a>.</p>
#
#<p><em>Thank you for using nginx.</em></p>
#...

Accessing the Pod from a browser isn't possible yet; the next lab explains how to do it.

Deployment workload

One of the most powerful features of Kubernetes is Pod lifecycle management. It covers many cases, for example restarting a Pod when its application container crashes with an error. To handle these situations, Kubernetes has a higher-level resource called Deployment that manages stateless Pods.

As you can verify, when the NGINX Pod is removed, Kubernetes won't recreate it automatically:

kubectl delete pod nginx -n test
# pod "nginx" deleted
kubectl get pod nginx -n test
# Error from server (NotFound): pods "nginx" not found

Complete

Now, explore the possibilities of a Deployment resource. The Deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: registry.hpc.ut.ee/mirror/library/nginx:1.27.1
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        ports:
        - containerPort: 80
          name: http

Put the content into deployment-nginx.yaml file and create a Kubernetes resource:

kubectl apply -f deployment-nginx.yaml -n test
# deployment.apps/nginx created

Verify

After creation, you can find both the Deployment and the linked Pod in the same namespace:

kubectl get deployment nginx -n test
# NAME    READY   UP-TO-DATE   AVAILABLE   AGE
# nginx   1/1     1            1           35s

kubectl get pod -l app=nginx -n test
# NAME                     READY   STATUS    RESTARTS   AGE
# nginx-74547bd6d7-smgmk   1/1     Running   0          51s

kubectl describe deployment nginx -n test

As you can see from the last command, the Deployment doesn't contain container-specific runtime information; instead it holds the Pod template and availability info.
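
To see the lifecycle management in action, you can delete the Deployment-managed Pod and watch a replacement appear (an optional check; the Pod name suffix will differ on your cluster):

# Delete the Pod created by the Deployment
kubectl delete pod -l app=nginx -n test

# A few seconds later a new Pod with a different name suffix is running
kubectl get pod -l app=nginx -n test
# NAME                     READY   STATUS    RESTARTS   AGE
# nginx-74547bd6d7-xxxxx   1/1     Running   0          10s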

Use-case

Price Fetcher

Deployment

Now you can create a Kubernetes Deployment for the price-fetcher-service you created during previous labs.

Bug

The container image for price-fetcher-service is in a registry available to your control-plane VM only. To deploy a container on Kubernetes, the image must be accessible to all the VMs. One option is to deploy a private registry in Kubernetes and pull images from it, but we won't cover that in this lab. To simplify the process, we will use Dockerhub.

Complete

Let's build the image and push it to Dockerhub.

First, create an account on Dockerhub and construct an auth file for the kaniko builder:

# For security reasons, it's better to execute this script on your PC (the following lines work for Linux)
# Paste username and password of your user into these 2 variables
export DOCKER_USER="change_me"
export DOCKER_PASSWORD="change_me"
# Encode your docker username and password using base64 and the format "username:password"
token=$(printf "%s:%s" "${DOCKER_USER}" "${DOCKER_PASSWORD}" | base64 | tr -d '\n')
# Print and copy the secret
echo $token

Copy the printed value and create a docker-login.json file on your VM:

{
    "auths": {
        "https://index.docker.io/v1/": {
            "auth": "value_you_copied_at_the_previous_step"
        }
    }
}
Put this file into /home/centos/docker-login.json.

Second, build and push the image:

# Go to the directory with your price-fetching-service code
cd ...
# Build your image and push it to Dockerhub
export DOCKER_USER="change_me"
sudo nerdctl run --network host -v /home/centos/docker-login.json:/kaniko/.docker/config.json -v ${PWD}:/workspace gcr.io/kaniko-project/executor:latest --destination=docker.io/${DOCKER_USER}/price-fetcher-service:latest

Validate

Go to Dockerhub, log in, and find your image in the list.

Complete

Now, we are ready to create a Kubernetes Deployment for the price-fetcher-service. Use the following content for the deployment-price-fetcher.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: price-fetcher
spec:
  selector:
    matchLabels:
      app: price-fetcher
  template:
    metadata:
      labels:
        app: price-fetcher
    spec:
      containers:
      - name: price-fetcher
        image: registry.hpc.ut.ee/mirror/CHANGE_ME/price-fetcher-service:latest
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"

NB: don't forget to replace the CHANGE_ME part in the image name.

We use registry.hpc.ut.ee/mirror/ as a mirror to avoid the Dockerhub pull limit.

Now, start the Deployment in the test namespace:

kubectl apply -f deployment-price-fetcher.yaml -n test
# deployment.apps/price-fetcher created

Validate

You can check the Deployment's Pods in the test namespace:

kubectl get pods -n test -l app=price-fetcher
# NAME                             READY   STATUS    RESTARTS   AGE
# price-fetcher-77675578cf-9js87   1/1     Running   0          4m44s

ConfigMap

Now that we have the app running, let's make it configurable. For example, we can add an option to modify the cron schedule via a ConfigMap. A ConfigMap is a set of key-value records that can be used as environment variables or mounted into containers as files.
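
As a minimal sketch of the environment variable option (the names fetch-config and FETCH_INTERVAL are hypothetical and not used elsewhere in this lab), a ConfigMap key can be injected into a container like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fetch-config        # hypothetical name, for illustration only
data:
  FETCH_INTERVAL: "3600"

# In the container spec, the key is then referenced with:
#   env:
#   - name: FETCH_INTERVAL
#     valueFrom:
#       configMapKeyRef:
#         name: fetch-config
#         key: FETCH_INTERVAL

In this lab, however, we will mount the ConfigMap into the container as a file.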

Danger

In the following examples, we assume that the price fetcher script is inside the container in the /app folder and that the name of the script is price_fetcher.py. This is slightly different from the example in the second lab.

The script location and name are likely to be different inside the container that you have prepared, so you will have to modify these details accordingly in the following manifests.

Complete

Create a ConfigMap manifest file configmap-cron-schedule.yaml with this content:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cron-schedule
  namespace: test
  labels:
    app: price-fetcher
data:
  cronjobs: |
    0 12 * * * python3 /app/price_fetcher.py

Create a ConfigMap object in the test namespace:

kubectl apply -f configmap-cron-schedule.yaml
# Validate it exists in the namespace
kubectl get configmap -n test -l app='price-fetcher'
# NAME            DATA   AGE
# cron-schedule   1      70s

After this, modify the deployment-price-fetcher.yaml file the following way:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: price-fetcher
spec:
  # ...
  template:
    metadata:
      labels:
        app: price-fetcher
    spec:
      containers:
      - name: price-fetcher
        # ...
        volumeMounts: #(4)
        - name: cronjobs-file #(5)
          mountPath: /etc/crontabs/root #(6)
          subPath: cronjobs #(7)
      volumes: #(1)
      - name: cronjobs-file #(2)
        configMap:
          name: cron-schedule #(3)
  1. List of volumes used by the Deployment
  2. Volume name used as a reference within the Deployment
  3. The name of the ConfigMap being used (it must match the ConfigMap created above)
  4. List of volumes mounted into the container
  5. Volume name (must match one in the volumes list)
  6. The path where the volume is mounted
  7. The name of the record from the ConfigMap to mount as a file. This allows us to use only one record from it, instead of mounting the entire ConfigMap as a directory.

Info

NB: when you use the subPath field, you need to restart the Deployment every time the content of the corresponding record is updated. Kubernetes doesn't keep a Deployment up to date with ConfigMaps mounted this way, so we have to do it manually. To restart a Deployment use:

kubectl rollout restart deployment example-deployment -n example-namespace
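
For the price-fetcher Deployment from this lab that would be:

kubectl rollout restart deployment price-fetcher -n test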

Complete

After modifying the deployment-price-fetcher.yaml file, apply it to update the price-fetcher Deployment:

kubectl apply -f deployment-price-fetcher.yaml -n test

Validate

After a cronjob is triggered, you can check if the prices.csv file was fetched:

$ kubectl exec -it deployment/price-fetcher -n test -- sh
# ...
/app $ cat /tmp/price-data/prices.csv
# "Ajatempel (UTC)";"KuupƤev (Eesti aeg)";"NPS Eesti"
# ...

CronJob

Deployment is a useful Kubernetes resource for stateless apps, but there is another resource that fits cron-style jobs better. It is called CronJob and it describes a Kubernetes Job that runs on a schedule.

Complete

The current price-fetcher-service image uses crond as the start command for the container. Let's change the Dockerfile, as we don't need crond in the containers anymore:

FROM registry.hpc.ut.ee/mirror/library/alpine

...
# Changes start here:

# Start the python script
#(1)
CMD [ "python3", "/app/price_fetcher.py" ]
  1. Replace crond with a direct execution of the script.

A container with this image runs until the script is finished.

We need to push this image to Dockerhub under a different name, since it does the same job in a different way.

export DOCKER_USER="CHANGE_ME"
sudo nerdctl run --network host -v /home/centos/docker-login.json:/kaniko/.docker/config.json -v ${PWD}:/workspace gcr.io/kaniko-project/executor:latest --destination=docker.io/${DOCKER_USER}/price-fetcher-script:1.0.0 #(1)
  1. Note a different name: price-fetcher-script instead of price-fetcher-service as well as a numeric tag instead of latest.

Having a latest image tag in Kubernetes workloads is not a good practice: it is unreliable because the tag's content changes over time. For the price-fetcher-script image, it is better to use a fixed numeric version, for example 1.0.0.

Info

This script is a complete component of the use-case system, which should run in a separate namespace. For this, we need to create a production namespace first and then deploy the production-ready components in it. The Kubernetes resources within this namespace will be checked by the scoring system. These checks will indicate completion of future labs.

Complete

Create a production namespace:

kubectl create namespace production

Prepare a CronJob manifest cronjob-price-fetcher.yaml for the price-fetcher-script:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: price-fetcher
  labels:
    app: electricity-calculator
    microservice: price-fetcher
spec:
  timeZone: "Etc/UTC" #(1)
  schedule: "0 12 * * *" #(2)
  jobTemplate: #(3)
    spec:
      template:
        spec:
          containers:
          - name: price-fetcher
            image: registry.hpc.ut.ee/mirror/CHANGE_ME/price-fetcher-script:1.0.0 #(4)
            resources:
              requests:
                memory: "128Mi"
                cpu: "100m"
              limits:
                memory: "256Mi"
                cpu: "200m"
          restartPolicy: OnFailure
  1. You can set a custom timezone for the CronJob
  2. Cron-styled schedule
  3. A job template
  4. Note the new image name, also replace CHANGE_ME with your username

The CronJob doesn't need a ConfigMap with a schedule file, because the schedule is a part of the manifest.

Now, deploy this CronJob into the production namespace:

kubectl apply -f cronjob-price-fetcher.yaml -n production
# cronjob.batch/price-fetcher created
kubectl get cronjob -l app=electricity-calculator -l microservice=price-fetcher -n production
# NAME            SCHEDULE     TIMEZONE   SUSPEND   ACTIVE   LAST SCHEDULE   AGE
# price-fetcher   0 12 * * *   Etc/UTC     False     0        <none>          24s

Validate

Trigger the CronJob manually to check if it works correctly:

# Trigger the CronJob manually
kubectl create job --from=cronjob/price-fetcher price-fetcher-cronjob-test -n production
# Check if a Job is created
kubectl get job price-fetcher-cronjob-test -n production
# Check if a pod is created
kubectl get pod -l job-name=price-fetcher-cronjob-test -n production
# View the Pod's logs
kubectl logs -l job-name=price-fetcher-cronjob-test -n production

For now, this CronJob doesn't save the prices.csv file anywhere. In the future labs we will make the price-fetcher script send this file to the history-server service.

MinIO

MinIO is an S3-compatible object storage. In this course, it serves the persistent storage needs of the history-server. That server is not implemented yet, but in this lab you will deploy MinIO into your Kubernetes cluster and make use of it in further labs.

Secret

In Kubernetes, the proper way to store static passwords, tokens, etc. is the Secret resource. In this lab, you will store the MinIO admin credentials in a Secret. These credentials are shared: the MinIO server uses them for the initial admin setup, and the history-server uses them to access the MinIO instance.

Info

A Secret represents a set of key-value pairs, where the key is an alias and the value is a base64-encoded secret. This resource is namespace-scoped, therefore a Secret can't be shared between namespaces.

Complete

Create a file secret-minio.yaml with this content:

apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
type: Opaque
data:
  MINIO_ROOT_USER: "YWRtaW51c2Vy" #(1)
  MINIO_ROOT_PASSWORD: "MjA5NWY0MTQ3ZTA2OTFjZGY2ZmFhNGMyNTNhOGZhOWEzOGQzMmE2Mg==" #(2)
  1. Base64-encoded "adminuser" username
  2. Base64-encoded random string used as password

The Opaque type means the data in the Secret is generic and doesn't relate to the Kubernetes cluster itself. The values in the data section must be base64-encoded by the user, otherwise the cluster raises the error: error decoding from json: illegal base64 data.
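
If you prefer to generate your own values instead of reusing the ones above (optional; assumes base64 and openssl are available on your machine), you can produce them like this:

# Encode the username
echo -n "adminuser" | base64
# YWRtaW51c2Vy

# Generate a random password and encode it
openssl rand -hex 20 | tr -d '\n' | base64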

Apply the file to the Kubernetes cluster:

kubectl apply -f secret-minio.yaml -n production
# secret/minio-secret created

Validate

Make sure the value of the MINIO_ROOT_USER key in the Secret matches the adminuser string:

kubectl get secret minio-secret -n production -o jsonpath='{.data.MINIO_ROOT_USER}' | base64 -d
# adminuser

StatefulSet

The last step of this lab is the creation of a StatefulSet for the MinIO server. A StatefulSet is a management resource similar to a Deployment, but it controls stateful applications. The key differences between the two:

  • StatefulSet creates and removes pods in a strict order;
  • Deployment Pods are identical and can be interchanged;
  • Deployment assigns random name suffix for Pods, StatefulSet uses sequential numbers;
  • Pods managed by a Deployment share the same persistent storage while StatefulSet creates a volume for each replica.
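
You can see both naming schemes in this lab once the price-fetcher Deployment and the minio StatefulSet (created below) are deployed:

# Deployment Pods get a random name suffix
kubectl get pod -l app=price-fetcher -n test
# price-fetcher-77675578cf-9js87 ...

# StatefulSet Pods are numbered sequentially
kubectl get pod -l app=minio -n production
# minio-0 ...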

Persistent storage setup for Kubernetes is not covered by this lab, so instead of persistent volumes you will use ephemeral storage.

Complete

Use this manifest to create the MinIO StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: minio
  labels:
    app: minio
spec:
  selector:
    matchLabels:
      app: minio
  serviceName: minio
  replicas: 1
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - name: minio
        image: registry.hpc.ut.ee/mirror/minio/minio:RELEASE.2024-09-13T20-26-02Z #(1)
        args: #(2)
        - server
        - /storage
        env: #(3)
        - name: MINIO_ROOT_USER
          valueFrom:
            secretKeyRef:
              name: minio-secret
              key: MINIO_ROOT_USER
        - name: MINIO_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: minio-secret
              key: MINIO_ROOT_PASSWORD
        ports:
        - containerPort: 9000
          hostPort: 9000
        volumeMounts: #(4)
        - name: minio-storage
          mountPath: "/storage"
      volumes: # (5)
        - name: minio-storage
          emptyDir: {} #(6)
  1. This tag is fixed and after every Pod recreation the image content stays the same
  2. Custom args for the container
  3. Environment with secret data
  4. List of volume mounts
  5. List of volumes
  6. For now, we use emptyDir - ephemeral storage without persistence. Its lifecycle is bound to the Pod's.

As you may notice, the minio StatefulSet references the minio-secret in its manifest. This approach is helpful when an application needs to share credentials with a client service like the history-server.
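
Save the manifest to a file (for example statefulset-minio.yaml; any filename works) and apply it to the production namespace:

kubectl apply -f statefulset-minio.yaml -n production
# statefulset.apps/minio created

kubectl get pod minio-0 -n production
# NAME      READY   STATUS    RESTARTS   AGE
# minio-0   1/1     Running   0          ...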

Validate

Now you can log in to the minio-0 Pod, create a storage alias, and create a bucket in it.

# Log in to the minio-0 Pod first
kubectl exec -it minio-0 -n production -- sh

# Create an alias for the current session
# Using this alias we can manage the object storage
mc alias set default http://0.0.0.0:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD

# Create a "price-data" bucket in the storage
mc mb default/price-data
# Bucket created successfully `default/price-data`.

# Check if the bucket was created
mc ls default
#[... UTC]     0B price-data/