Lab 4 - Kubernetes Workloads
Introduction
Welcome to lab 4. In this session, the following topics are covered:
- Getting familiar with basic Kubernetes resources;
- Deployment of NGINX server on Kubernetes;
- Setup of different Kubernetes workloads for price-fetcher-service;
- Setup of a Kubernetes workload for the MinIO server.
Workload basics
In terms of Kubernetes, a manifest is a .yaml file describing a Kubernetes-managed resource. Typical resources are Pod (container wrapper), Deployment (management unit for stateless Pods, for example web services), ConfigMap (set of configuration data), Secret (set of base64-encoded key-value pairs for sensitive data), and StatefulSet (management unit for stateful Pods, like database services).
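If you want a quick reminder of which resource kinds your cluster actually supports, kubectl can list them; a minimal check might look like this (the exact output depends on your cluster version):
# List all resource kinds known to the cluster, with their short names and API versions
kubectl api-resources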
Pod workload
A Pod is the minimal unit of workload in Kubernetes, which represents a set of containers with common storage and network resources.
An example manifest file for a Pod with an NGINX server container looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  containers:
  - name: nginx
    image: registry.hpc.ut.ee/mirror/library/nginx:1.27.1
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "200m"
    ports:
    - containerPort: 80
      name: http
The major sections are:
- apiVersion - the version of the Kubernetes API;
- kind - the kind of Kubernetes resource, Pod in this case;
- metadata - describes the Pod name and labels used for filtering;
- spec - describes the specification for containers, volumes, etc.;
- containers - describes settings for the containers within the Pod, including container images, exposed ports and computing resource limits.
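If any of these fields are unclear, kubectl explain prints the built-in documentation for a field straight from the API schema, for example:
# Show the schema documentation for the containers field of a Pod spec
kubectl explain pod.spec.containers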
Complete
First of all, create a separate namespace test for deploying and testing the example resources.
# Set up the kubeconfig access file for the centos user if you haven't done it in the previous lab
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Create a namespace test
kubectl create namespace test
To deploy the Pod on the Kubernetes cluster, put this manifest into a file (for example pod-nginx.yaml) and run:
kubectl apply -f pod-nginx.yaml --namespace test
# pod/nginx created
You can get the Pod info:
kubectl get pod nginx -n test
# NAME READY STATUS RESTARTS AGE
# nginx 1/1 Running 0 2m43s
Feel free to log in to the Pod's container and explore it:
kubectl exec -it -n test nginx -- /bin/bash
# root@nginx:/#
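Besides an interactive shell, you can also check the container's logs; for NGINX these should include the entrypoint startup messages and access log lines:
kubectl logs nginx -n test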
Use this command to get detailed info about the Pod.
kubectl describe pod nginx -n test
# Name: nginx
# Namespace:   test
# ...
# Status: Running
# IP: 10.0.1.244
# ...
It does the same as nerdctl inspect for a container, but also includes Kubernetes-related metadata.
Validate
You can access the server welcome page using the Pod's IP:
curl 10.0.1.244
#<a href="http://nginx.com/">nginx.com</a>.</p>
#
#<p><em>Thank you for using nginx.</em></p>
#...
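The Pod IP on your cluster will differ from the one above; one way to look it up without scanning the kubectl describe output is a jsonpath query:
kubectl get pod nginx -n test -o jsonpath='{.status.podIP}'
# 10.0.1.244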
Although accessing the Pod via a browser isn't possible yet, the next lab explains how to do it.
Deployment workload
One of the most powerful features of Kubernetes is Pod lifecycle management. It covers many cases, for example restarting a Pod when an application container crashes with an error. To handle these situations, Kubernetes has a higher-level resource called Deployment, which manages stateless Pods.
As you can check, when the NGINX Pod is removed, Kubernetes won't recreate it automatically:
kubectl delete pod nginx -n test
# pod "nginx" deleted
kubectl get pod nginx -n test
# Error from server (NotFound): pods "nginx" not found
Complete
Now, explore the possibilities of a Deployment resource. The Deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: registry.hpc.ut.ee/mirror/library/nginx:1.27.1
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        ports:
        - containerPort: 80
          name: http
Put the content into a deployment-nginx.yaml file and create the Kubernetes resource:
kubectl apply -f deployment-nginx.yaml -n test
# deployment.apps/nginx created
Verify
After creation, you can find both the Deployment and the linked Pod in the same namespace:
kubectl get deployment nginx -n test
# NAME READY UP-TO-DATE AVAILABLE AGE
# nginx 1/1 1 1 35s
kubectl get pod -l app=nginx -n test
# NAME READY STATUS RESTARTS AGE
# nginx-74547bd6d7-smgmk 1/1 Running 0 51s
kubectl describe deployment nginx -n test
As you can see from the last command, the Deployment doesn't show container-specific runtime details; rather, it holds the Pod template and availability info.
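To see the lifecycle management in action, delete the Pod managed by the Deployment and watch Kubernetes recreate it (the Pod name suffixes on your cluster will differ):
# Delete the Pod via its label
kubectl delete pod -l app=nginx -n test
# pod "nginx-74547bd6d7-smgmk" deleted
# A replacement Pod appears almost immediately with a new random suffix
kubectl get pod -l app=nginx -n test
# NAME                     READY   STATUS    RESTARTS   AGE
# nginx-74547bd6d7-q2xkz   1/1     Running   0          6s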
Use-case
Price Fetcher
Deployment
Now you can create a Kubernetes Deployment for the price-fetcher-service you created during previous labs.
Bug
The container image for price-fetcher-service is in a registry available to your control-plane VM only. To deploy a container on Kubernetes, the image must be accessible to all the VMs. One option is to deploy a private registry in Kubernetes and use it to pull images, but we won't cover that in this lab. To simplify the process, we will use Dockerhub.
Complete
Let's build the image and push it to Dockerhub.
Firstly, create an account at Dockerhub and construct an auth file for the kaniko builder:
# For security reasons, it is better to execute this script on your PC (the following lines work on Linux)
# Paste username and password of your user into these 2 variables
export DOCKER_USER="change_me"
export DOCKER_PASSWORD="change_me"
# Encode your docker username and password using base64 and the format "username:password"
token=$(printf "%s:%s" "${DOCKER_USER}" "${DOCKER_PASSWORD}" | base64 | tr -d '\n')
# Print and copy the secret
echo $token
Copy the printed value and create a docker-login.json file on your VM:
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "value_you_copied_at_the_previous_step"
    }
  }
}
Save the file as /home/centos/docker-login.json. Secondly, build and push the image:
# Go to the directory with your price-fetcher-service code
cd ...
# Build your image and push it to Dockerhub
export DOCKER_USER="change_me"
sudo nerdctl run --network host -v /home/centos/docker-login.json:/kaniko/.docker/config.json -v ${PWD}:/workspace gcr.io/kaniko-project/executor:latest --destination=docker.io/${DOCKER_USER}/price-fetcher-service:latest
Validate
Go to Dockerhub, log in and view your image in the list.
Complete
Now, we are ready to create a Kubernetes Deployment for the price-fetcher-service. Use the following content for deployment-price-fetcher.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: price-fetcher
spec:
  selector:
    matchLabels:
      app: price-fetcher
  template:
    metadata:
      labels:
        app: price-fetcher
    spec:
      containers:
      - name: price-fetcher
        image: registry.hpc.ut.ee/mirror/CHANGE_ME/price-fetcher-service:latest
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
NB: don't forget to replace the CHANGE_ME part in the image name with your Dockerhub username.
We use registry.hpc.ut.ee/mirror/ as a mirror to avoid the Dockerhub pull limit.
Now, start the Deployment in the test namespace.
kubectl apply -f deployment-price-fetcher.yaml -n test
# deployment.apps/price-fetcher created
Validate
You can check the running Deployment in the test namespace:
kubectl get pods -n test -l app=price-fetcher
# NAME READY STATUS RESTARTS AGE
# price-fetcher-77675578cf-9js87 1/1 Running 0 4m44s
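If the Pod does not reach the Running state, or you simply want to see what the container prints, its logs are a good first check (the output depends on how your image is built):
kubectl logs -l app=price-fetcher -n test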
ConfigMap
Now that the app is running, let's make it configurable. For example, we can add an option to modify the cron schedule via a ConfigMap. A ConfigMap is a set of key-value records, which can be used as environment variables or mounted into containers as files.
Danger
In the following examples, we assume that the price fetcher script is inside the container in the /app folder and that the name of the script is price_fetcher.py. This is slightly different from the example in the second lab.
The script location and name are likely to be different inside the container that you have prepared, so you will have to modify these details accordingly in the following manifests.
Complete
Create a ConfigMap manifest file configmap-cron-schedule.yaml with this content:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cron-schedule
  namespace: test
  labels:
    app: price-fetcher
data:
  cronjobs: |
    0 12 * * * python3 /app/price_fetcher.py
Create the ConfigMap object in the test namespace:
kubectl apply -f configmap-cron-schedule.yaml
# Validate it exists in the namespace
kubectl get configmap -n test -l app='price-fetcher'
# NAME DATA AGE
# cron-schedule 1 70s
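You can also print the stored schedule back to verify the data survived the round trip:
kubectl get configmap cron-schedule -n test -o jsonpath='{.data.cronjobs}'
# 0 12 * * * python3 /app/price_fetcher.py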
After this, modify the deployment-price-fetcher.yaml file in the following way:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: price-fetcher
spec:
  # ...
  template:
    metadata:
      labels:
        app: price-fetcher
    spec:
      containers:
      - name: price-fetcher
        # ...
        volumeMounts: #(4)
        - name: cronjobs-file #(5)
          mountPath: /etc/crontabs/root #(6)
          subPath: cronjobs #(7)
      volumes: #(1)
      - name: cronjobs-file #(2)
        configMap:
          name: cron-schedule #(3)
- List of volumes used by the Deployment
- Volume name used as a reference within the Deployment
- The name of the ConfigMap being used
- List of volumes mounted into the container
- Volume name (must match one in the volumes list)
- The path where the volume is mounted
- The name of the key from the mounted volume. This allows us to use only a single record from it, instead of mounting the entire ConfigMap as a directory.
Info
NB: when you use the subPath field, you need to restart the Deployment every time the content of the respective record is updated. Kubernetes doesn't keep the Deployment up-to-date with ConfigMaps mounted this way, so we have to do it manually. To restart a Deployment, use:
kubectl rollout restart deployment example-deployment -n example-namespace
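For the Deployment used in this lab, that would be:
kubectl rollout restart deployment price-fetcher -n test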
Complete
After modifying the deployment-price-fetcher.yaml file, apply it to update the price-fetcher Deployment:
kubectl apply -f deployment-price-fetcher.yaml -n test
Validate
After a cron job is triggered, you can check whether the prices.csv file was fetched:
$ kubectl exec -it deployment/price-fetcher -n test -- sh
# ...
/app $ cat /tmp/price-data/prices.csv
# "Ajatempel (UTC)";"KuupƤev (Eesti aeg)";"NPS Eesti"
# ...
CronJob
Deployment is a useful Kubernetes resource for stateless apps, but there is a resource that fits the needs of running cron jobs better. That resource is called CronJob, and it describes a Kubernetes Job that runs on a schedule.
Complete
The current price-fetcher-service image already includes crond as the start command for a container. Let's change the Dockerfile, as we don't need crond in the containers anymore:
FROM registry.hpc.ut.ee/mirror/library/alpine
...
# Changes start here:
# Start the python script
#(1)
CMD [ "python3", "/app/price_fetcher.py" ]
- Replace crond with direct execution of the script. A container with this image runs until the script finishes.
We need to push this image to Dockerhub under a different name, since the image does the same job but in a different way.
export DOCKER_USER="CHANGE_ME"
sudo nerdctl run --network host -v /home/centos/docker-login.json:/kaniko/.docker/config.json -v ${PWD}:/workspace gcr.io/kaniko-project/executor:latest --destination=docker.io/${DOCKER_USER}/price-fetcher-script:1.0.0 #(1)
- Note the different name: price-fetcher-script instead of price-fetcher-service, as well as a numeric tag instead of latest.
Using the latest image tag in Kubernetes workloads is not a good practice: it is unreliable because the tag is dynamic in time. For the price-fetcher-script image, you should use a fixed numeric version, for example 1.0.0.
Info
This script is a complete component of the use-case system, which should run in a separate namespace. For this, we need to create a production namespace first and then deploy the production-ready components in it. The Kubernetes resources within this namespace will be checked by the scoring system. These checks will indicate completion of future labs.
Complete
Create a production namespace:
kubectl create namespace production
Prepare a CronJob manifest cronjob-price-fetcher.yaml for the price-fetcher-script:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: price-fetcher
  labels:
    app: electricity-calculator
    microservice: price-fetcher
spec:
  timeZone: "Etc/UTC" #(1)
  schedule: "0 12 * * *" #(2)
  jobTemplate: #(3)
    spec:
      template:
        spec:
          containers:
          - name: price-fetcher
            image: registry.hpc.ut.ee/mirror/CHANGE_ME/price-fetcher-script:1.0.0 #(4)
            resources:
              requests:
                memory: "128Mi"
                cpu: "100m"
              limits:
                memory: "256Mi"
                cpu: "200m"
          restartPolicy: OnFailure
- You can set a custom timezone for the CronJob
- Cron-styled schedule
- A job template
- Note the new image name; also replace CHANGE_ME with your username
The CronJob doesn't need a ConfigMap with a schedule file, because the schedule is a part of the manifest.
Now, deploy this CronJob into the production namespace:
kubectl apply -f cronjob-price-fetcher.yaml -n production
# cronjob.batch/price-fetcher created
kubectl get cronjob -l app=electricity-calculator -l microservice=price-fetcher -n production
# NAME SCHEDULE TIMEZONE SUSPEND ACTIVE LAST SCHEDULE AGE
# price-fetcher 0 12 * * * Etc/UTC False 0 <none> 24s
Validate
Trigger the CronJob manually to check if it works correctly:
# Trigger the CronJob manually
kubectl create job --from=cronjob/price-fetcher price-fetcher-cronjob-test -n production
# Check if a Job is created
kubectl get job price-fetcher-cronjob-test -n production
# Check if a pod is created
kubectl get pod -l job-name=price-fetcher-cronjob-test -n production
# View the Pod's logs
kubectl logs -l job-name=price-fetcher-cronjob-test -n production
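Once you have checked the logs, you can remove the manually created Job so it doesn't linger next to the scheduled runs (deleting the Job also removes its Pod):
kubectl delete job price-fetcher-cronjob-test -n production
# job.batch "price-fetcher-cronjob-test" deleted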
For now, this CronJob doesn't save the prices.csv file anywhere. In future labs, we will make the price-fetcher script send this file to the history-server service.
MinIO
MinIO is an S3-compatible object storage. In this course, it serves the persistent storage needs of the history-server. That server is not implemented yet, but in this lab you will deploy MinIO into your Kubernetes cluster and make use of it in further labs.
Secret
In Kubernetes, the proper way to store static passwords, tokens, etc. is the Secret resource. In this lab, you will store MinIO admin credentials in a Secret. These credentials are shared: the MinIO server uses them for the initial admin setup, and the history-server uses them to access the MinIO instance.
Info
A Secret represents a set of key-value pairs, where a key is an alias and a value is a base64-encoded secret. This resource is namespace-scoped, therefore a Secret can't be shared between namespaces.
Complete
Create a file secret-minio.yaml with this content:
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
type: Opaque
data:
  MINIO_ROOT_USER: "YWRtaW51c2Vy" #(1)
  MINIO_ROOT_PASSWORD: "MjA5NWY0MTQ3ZTA2OTFjZGY2ZmFhNGMyNTNhOGZhOWEzOGQzMmE2Mg==" #(2)
- Base64-encoded "adminuser" username
- Base64-encoded random string used as password
The Opaque type means the data in the Secret is generic and doesn't relate to the Kubernetes cluster. The values in the data section must be encoded by the user, otherwise the cluster raises the error: error decoding from json: illegal base64 data.
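If you prefer to generate your own credentials instead of reusing the values above, one way to encode them looks like this (printf avoids a trailing newline sneaking into the encoded value; the openssl call is just one option for producing a random password):
# Encode the username
printf "%s" "adminuser" | base64
# YWRtaW51c2Vy
# Generate a random password and encode it
printf "%s" "$(openssl rand -hex 20)" | base64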
Apply the file to the Kubernetes cluster:
kubectl apply -f secret-minio.yaml -n production
# secret/minio-secret created
Validate
Make sure the value of the MINIO_ROOT_USER key in the Secret matches the adminuser string:
kubectl get secret minio-secret -n production -o jsonpath='{.data.MINIO_ROOT_USER}' | base64 -d
# adminuser
StatefulSet
The last step of this lab is the creation of a StatefulSet for the MinIO server. A StatefulSet is a management resource similar to a Deployment, but it controls stateful applications. The key differences between the two:
- StatefulSet creates and removes pods in a strict order;
- Deployment Pods are identical and can be interchanged;
- Deployment assigns random name suffix for Pods, StatefulSet uses sequential numbers;
- Pods managed by a Deployment share the same persistent storage, while a StatefulSet creates a volume for each replica.
Storage setup for Kubernetes is not covered in this lab, so instead of persistent volumes you will use ephemeral storage.
Complete
Use this manifest to create the MinIO StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: minio
  labels:
    app: minio
spec:
  selector:
    matchLabels:
      app: minio
  serviceName: minio
  replicas: 1
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - name: minio
        image: registry.hpc.ut.ee/mirror/minio/minio:RELEASE.2024-09-13T20-26-02Z #(1)
        args: #(2)
        - server
        - /storage
        env: #(3)
        - name: MINIO_ROOT_USER
          valueFrom:
            secretKeyRef:
              name: minio-secret
              key: MINIO_ROOT_USER
        - name: MINIO_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: minio-secret
              key: MINIO_ROOT_PASSWORD
        ports:
        - containerPort: 9000
          hostPort: 9000
        volumeMounts: #(4)
        - name: minio-storage
          mountPath: "/storage"
      volumes: #(5)
      - name: minio-storage
        emptyDir: {} #(6)
- This tag is fixed, so after every Pod recreation the image content stays the same
- Custom args for the container
- Environment variables populated from the Secret data
- List of volume mounts
- List of volumes
- For now, we use emptyDir - ephemeral storage without persistence. Its lifecycle is bound to the Pod's.
As you can notice, the minio StatefulSet references the minio-secret in the manifest. This approach is helpful when an application needs to share credentials with a storage client service like the history-server.
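Assuming you saved the manifest as statefulset-minio.yaml (the filename is up to you), apply it to the production namespace and wait for the Pod to come up:
kubectl apply -f statefulset-minio.yaml -n production
# statefulset.apps/minio created
kubectl get pod minio-0 -n production
# NAME      READY   STATUS    RESTARTS   AGE
# minio-0   1/1     Running   0          30s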
Validate
Now you can log in to the minio-0 Pod, create a storage alias and create a bucket in it.
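A minimal way to get a shell inside the Pod (the mc client used below is already available in the MinIO container, and the MINIO_ROOT_* variables come from the Secret):
kubectl exec -it minio-0 -n production -- sh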
# Create an alias for the current session
# Using this alias we can manage the object storage
mc alias set default http://0.0.0.0:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
# Create a "price-data" bucket in the storage
mc mb default/price-data
# Bucket created successfully `default/price-data`.
# Check if the bucket was created
mc ls default
#[... UTC] 0B price-data/