
Lab 11 - Cluster administration

Overview

  • Tearing down existing monitoring
  • Installing Helm versions of Prometheus, Grafana and Loki
  • Kubernetes Audit logging
  • Upgrading the cluster
  • Using Kubeadm patches

Introduction

In this Cluster administration lab, you'll delve into key aspects of monitoring, logging, and cluster management. You'll destroy the manually installed monitoring layer, and leverage the power of newer, fancier tools like operators and Helm to set up new, better versions. You'll be doing this because proper visibility into your cluster and applications makes up a large portion of the complexity in microservices environments, and without it, it's impossible to understand your infrastructure.

This lab will also guide you through the process of upgrading a Kubernetes cluster, relying on best practices and real-world experience. While updates are generally important, they're especially so in Kubernetes, due to the frequency of new releases and the way the whole Kubernetes ecosystem moves together with the versions.

Tearing down existing monitoring

As you are going to reinstall your monitoring stack in this lab, you can delete the old ones to make some space.

Complete

Delete the existing grafana, prometheus, loki, node exporter, kube-state-metrics and promtail deployments using kubectl delete.

Verify

Make sure everything has been deleted by checking whether the PVCs have been removed, and whether the whole lab 9 turns red in scoring.

Installing Prometheus and Grafana with Helm

Initial setup of the Prometheus stack using Helm is fairly straightforward. You'll be installing a version called kube-prometheus-stack, which is a fairly complex system of Prometheus, Grafana, Alertmanager, node-exporter, kube-state-metrics and an adapter for the Kubernetes Metrics APIs, the last of which you'll need for the Horizontal Pod Autoscaler, or HPA, in the last lab of the course.

The repository for the stack is located here: https://github.com/prometheus-operator/kube-prometheus

You can also take a look at the Helm repository: https://prometheus-community.github.io/helm-charts

Complete

Set up the Helm repository as shown on the Helm repository site, using the helm repo add command.

You should then be able to find a list of charts with the helm search repo prometheus-community command. You'll be using the kube-prometheus-stack chart.
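A minimal sketch of these commands, using the repository URL listed above:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm search repo prometheus-community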

First, make sure to read the Helm values to a file, so you can edit and view the settings:

helm show values prometheus-community/kube-prometheus-stack > prometheus.yaml

Make sure to change the following settings (see the sketch after this list):

  • The Prometheus volume setting prometheus.prometheusSpec.storageSpec:
  • Make sure to enable the PersistentVolumeClaim usage by uncommenting the relevant part, and removing the {} from behind the storageSpec.
  • Change the storageClassName to longhorn, which is what you use in the cluster.
  • Change the resources.requests.storage to 10Gi.
  • Set the grafana admin password with grafana.adminPassword setting.
  • Set the podMonitorSelectorNilUsesHelmValues and serviceMonitorSelectorNilUsesHelmValues to false. This is necessary for this Prometheus to pick up all the Pod and Service monitoring across the cluster.
  • Set up an ingress for Grafana by configuring the grafana.ingress setting. You can use the hostname grafana.<NODE_IP>.nip.io.
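For reference, the relevant parts of the edited prometheus.yaml could look roughly like this. This is a sketch based on the chart's commented examples - double-check the structure against the values file you downloaded, and replace the placeholder password and hostname:

prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: longhorn
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
grafana:
  adminPassword: <your_password>
  ingress:
    enabled: true
    hosts:
      - grafana.<NODE_IP>.nip.io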

Then install the kube-prometheus-stack with helm install. You can pass the changed values file with the --values prometheus.yaml option. Make sure you install it into the prometheus namespace, and use the name prometheus for the Helm release (otherwise the scoring server might not find your resources).
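Put together, the install command might look like this (the --create-namespace flag simply creates the prometheus namespace if it does not exist yet):

helm install prometheus prometheus-community/kube-prometheus-stack --namespace prometheus --create-namespace --values prometheus.yaml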

Verify

You can verify the initial installation with the Helm output; it should say something like:

NAME: prometheus
LAST DEPLOYED: Mon Nov 20 09:12:12 2023
NAMESPACE: prometheus
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace prometheus get pods -l "release=prometheus"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.

You can try accessing Prometheus and Grafana via port forwarding, or Grafana via the ingress. There's no need for a Grafana NodePort service this time around, as you'll be using ingress.

Taking a look inside

When you now take a look inside your new Grafana instance, you'll see that Helm automatically set up quite a few rules, dashboards and alerts.

This is the base monitoring capability a Kubernetes cluster should have nowadays, given the simplicity of setting it up.

It should be useful for answering any questions related to Kubernetes and its hosts.

As an exercise, you can try optimizing the workloads running in your cluster, to reduce CPU and memory consumption, by comparing actual resource consumption to the requests and limits.

Replicating previous lab's configuration with new stack

In the previous lab, you set up Pod scrape monitoring with Prometheus to be able to detect and view how your application updates. As deleting the stack also deleted this configuration, you'll replicate it with the new system.

In the new system, you could just take the Prometheus scrape configuration given to you in the previous lab, and shove it into the Helm values file. While this would work, the new Prometheus stack provides interfaces for doing this without writing any scrape configuration.

The kube-prometheus-stack introduces two new CRDs for this purpose - ServiceMonitor and PodMonitor. These are automatically configured scrape definitions that target either Services or Pods. In the previous lab, you configured Prometheus to scrape Pods directly, so that is what you need to configure now.

There's a middle step as well - because the kube-prometheus-stack is slightly opinionated about how things are supposed to work, you need to specify the scrape port with a port name, and move the scrape annotation to labels. In the previous lab, you did not configure a name for the scrape port (an oversight on the teacher's part), so you'll need to do it now.

Configure Prometheus to scrape your application, by creating the PodMonitor object.

Complete

Set up the new PodMonitor manifest, and apply it to the cluster.

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: port-9101
spec:
  selector:
    matchLabels:
      "prometheus.io/scrape": "true"
  podMetricsEndpoints:
  - port: metrics
    interval: 10s
    path: /metrics
    relabelings:
    - action: replace
      sourceLabels:
        - __meta_kubernetes_pod_label_app
      targetLabel: app
    - action: replace
      sourceLabels:
        - __meta_kubernetes_pod_label_version
      targetLabel: version
  namespaceSelector:
    any: true

We add two relabel configurations to add the app and version labels to the time series, so that the Grafana dashboard given in the previous lab, which uses these labels for aggregation, keeps working.

As you can see, this configuration expects to find Pods with the label prometheus.io/scrape: true, so make sure to add it to your workloads. This will cause Prometheus to start collecting metrics from the /metrics endpoint.

On top of this, add a port named metrics to your workloads. In the case of last week's workloads, you'll need to add this part to spec.containers.ports:

- name: metrics
  containerPort: 9101

With this in place, any Pod that has the correct label and a port named metrics defined will be scraped automatically.

Verify

You can verify whether last week's dashboards work, and whether Prometheus sees the targets under its service discovery and targets lists.

Install Loki with Helm

Loki is also much more straightforward to set up using Helm. The chart has been automated to install in a sensible fashion, and takes a lot of the difficult configuration out of the equation. From the Loki Helm chart documentation https://grafana.com/docs/loki/latest/setup/install/helm/install-monolithic/:

Info

If you set the singleBinary.replicas value to 1, this chart configures Loki to run the all target in a monolithic mode, designed to work with a filesystem storage. It will also configure meta-monitoring of metrics and logs. If you set the singleBinary.replicas value to 2 or more, this chart configures Loki to run a single binary in a replicated, highly available mode. When running replicas of a single binary, you must configure object storage.

Complete

Set up the Helm repository as shown on the Helm documentation site, using the helm repo add command.

You should then be able to find a list of charts with the helm search repo grafana command. You'll be using the loki chart.
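Assuming the standard Grafana chart repository URL (correct at the time of writing), the commands could look like this:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm search repo grafana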

Configure it with the following options:

  • loki.auth_enabled: false - this means Grafana won't need to authenticate to Loki to query logs. We will use Kubernetes security in the next lab, instead.
  • loki.persistence.enabled: true, loki.persistence.storageClassName: longhorn, loki.persistence.size: 10Gi - these make Loki use a PVC and persistent storage.
  • deploymentMode: SingleBinary - this tells Loki to use SingleBinary deployment mode
  • singleBinary.replicas: 1 - this makes Loki be installed in a single-binary, non-distributed mode, which can use filesystem storage for backing storage. Otherwise we would require S3 storage.
  • loki.storage.type: filesystem - make it use the filesystem storage.
  • loki.commonConfig.replication_factor: 1 - as we are installing it in one instance, on one filesystem, we won't need to replicate it, and waste space on replication.
  • test.enabled: false, serviceMonitor.enabled: false and monitoring.selfMonitoring.enabled: false - these settings disable the test pod and the self-monitoring of the Loki instance. Self-monitoring would be important and useful in production, but in your case - where you don't have any alerting capability - it's just going to eat up resources, so you can disable it.
  • write.replicas: 0, read.replicas: 0 and backend.replicas: 0 - these disable the components used by the distributed deployment modes, which are not needed when running in SingleBinary mode.
  • chunksCache.allocatedMemory: 1500 - Initial size of 8000 is too large for our cluster. Adjust this size larger if you run into issues with not enough memory in Loki pods.

You will also need to set up a schema config at loki.schemaConfig, which tells Loki what format to use for storing log files:

  schemaConfig:
    configs:
      - from: 2024-01-01
        object_store: filesystem
        store: tsdb
        schema: v13
        index:
          prefix: index_
          period: 24h

A useful fact about Helm values files: you do not need to download the full values file, change the specific parts, and then pass the whole 1600-line file back as the values file.

You can provide only the important bits, and Helm will automatically fall back to the defaults for everything else. So, for example, a correct values file for the above settings would be:

loki:
  persistence:
    enabled: true
    storageClassName: longhorn
    size: 10Gi
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: 'filesystem'
  schemaConfig:
    configs:
      - from: 2024-01-01
        object_store: filesystem
        store: tsdb
        schema: v13
        index:
          prefix: index_
          period: 24h
deploymentMode: SingleBinary
singleBinary:
  replicas: 1
monitoring:
  selfMonitoring:
    enabled: false
test:
  enabled: false
serviceMonitor:
  enabled: false
write:
  # -- Number of replicas for the write
  replicas: 0
read:
  # -- Number of replicas for the read
  replicas: 0
backend:
  # -- Number of replicas for the backend
  replicas: 0
chunksCache:
  allocatedMemory: 1500

You can now install the chart to the loki namespace using the appropriate values.
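As a sketch, assuming you saved the values above into a file called loki.yaml and name the release loki, the install command could look like this:

helm install loki grafana/loki --namespace loki --create-namespace --values loki.yaml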

Verify

You should see a bunch of containers starting up out of the gate:

  • loki-0: This is the core Loki pod. This pod is responsible for storing and querying your log data.

  • loki-canary-*: These are instances of Loki Canary, a tool used for monitoring and alerting on the logging pipeline's integrity. Loki Canary writes a log to Loki and then ensures it can query back those logs within a certain time frame, alerting if this is not the case.

  • loki-gateway-*: This is a gateway for the Loki service, handling incoming HTTP requests. It is responsible for distributing the queries to the Loki server backend, especially if there are multiple Loki instances.

  • loki-grafana-agent-operator-*: This is the Grafana Agent Operator, a component that automates the deployment and management of Grafana Agents. Grafana Agent is a telemetry collector sending metrics, logs, and trace data to a Grafana Cloud or other monitoring systems. We won't be using this operator.

As with the first time you installed Loki, there are two things left to do - set up Promtail, and configure Grafana to talk to Loki. Thankfully, Helm makes both of these easier as well.

Installing Promtail

Installing Promtail is made simple with Helm, as well. The chart is in the same repository as Loki's, so it'll be very easy to set up.

Complete

Install the chart grafana/promtail to the loki namespace. This time around, you won't need to specify any extra values, as the default ones currently work nicely for our purpose.
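A minimal sketch, assuming you name the release promtail:

helm install promtail grafana/promtail --namespace loki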

Verify

After installation, you should have two Promtail pods that instantly start scraping logs. They might complain with something like:

level=error ts=2023-11-20T14:34:29.141649402Z caller=client.go:430 component=client host=loki-gateway msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): 12 errors like: entry for stream '{app=\"teacher-test.cloud.ut.ee\", component=\"kube-apiserver\", container=\"kube-apiserver\", filename=\"/var/log/pods/kube-system_kube-apiserver-teacher-test.cloud.ut.ee_ff01467ad8a05d6aad9e55f6149157dc/kube-apiserver/14.log\", job=\"kube-system/teacher-test.cloud.ut.ee\", namespace=\"kube-system\", node_name=\"teacher-test.cloud.ut.ee\", pod=\"kube-apiserver-teacher-test.cloud.ut.ee\", stream=\"stderr\"}' has timestamp too old: 2023-09-19T05:44:10Z, oldest acceptable timestamp is: 2023-11-13T14:34:29Z; 9 errors like: entry for stream '{app=\"teacher-test.cloud.ut.ee\", component=\"kube-apiserver\", container=\"kube-apiserver\", filename=\"/var/log/pods/kube-system_kube-apiserver-teacher-test.cloud.ut.ee_ff01467ad8a05d6aad9e55f6149157dc/kube-apiserver/14.log\", job=\"kube-system/teacher-test.cloud.ut.ee\", namespace=\"kube-system\", node_name=\"teacher-test.cloud.ut.ee\", pod=\"kube-apiserver-teacher-test.cloud.ut.ee\", stream=\"stderr\"}' has timestamp too old: 2023-09-19T00:44:52Z, oldest acceptable timesta"

This is normal and fine - Loki, by default, only accepts data from up to a week ago. Your machines have much older logs on the filesystem, but logs that old aren't that useful anyway.

You can also check the UI of promtail, and see how much configuration gets automatically done for you.

Configuring Grafana to use Loki data source

Out of the gate, Grafana does not know how to connect to Loki, to query the logs. Thankfully, the developers of the kube-prometheus-stack Helm chart have thought about this problem, and have exposed a Helm values file setting called grafana.additionalDataSources, where you can configure other data sources Grafana should connect to.

Complete

Add a new element to the grafana.additionalDataSources (a sketch follows this list):

  • name: loki
  • access: proxy
  • isDefault: false
  • orgId: 1
  • url: DNS and port of the loki service in loki namespace.
  • version: 1
  • type: loki
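The corresponding addition to your prometheus.yaml values file could look roughly like this. The service name and port in the URL are assumptions - check the actual service with kubectl get svc -n loki:

grafana:
  additionalDataSources:
    - name: loki
      type: loki
      access: proxy
      orgId: 1
      isDefault: false
      version: 1
      url: http://loki.loki.svc.cluster.local:3100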

This configures all the necessary settings for Grafana to connect to Loki automatically. Upgrade your Prometheus Helm release with the new values file.
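The upgrade reuses the same chart and values file as the install:

helm upgrade prometheus prometheus-community/kube-prometheus-stack --namespace prometheus --values prometheus.yaml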

Verify

There are two things you can verify.

  1. Check whether the Loki data source works by going to the data sources configuration in the Grafana UI. The Loki data source options should be greyed out, but you still have a test button at the bottom. This test should succeed.
  2. Check for Loki logs in the Explore window. The label browser in the Explore window should show a lot of sensible labels, like namespace, and they should be filled with values, which you can easily use for queries. On top of this, all the log lines should have proper timestamps attached to them.

If everything is working, you now have a fully working logging and monitoring stack, which is much easier to configure than a custom-made kubectl manifest bundle.

Setting up audit logs

Kubernetes makes some things very easy with its REST API based architecture, because the HTTP protocol is inherently transactional. Transactionality makes it, theoretically, also very auditable, as you can always tell who did what kind of operation in the cluster. This kind of tracking is called auditing or audit logging, and is very useful both for debugging issues for cluster users (for example, why can't I access this?) and for security (for example, which user brought down the production database pod).

Audit logging needs to be configured manually - Kubernetes does not default to any kind of audit logging. This is because Kubernetes API (which is what writes the audit logs) gets a lot of requests, and logging all of them is necessary only in the most secure of installations, and is going to consume a lot of storage and IO.

Instead, Kubernetes allows you to write a policy, defining what and how to audit. In this section, you'll be setting up audit logging with the Google Kubernetes Engine audit policy, which is shown here: GitHub.

Now, this policy is definitely not perfect, but it is what Google has settled on through their countless hours of running Kubernetes clusters.

Enabling audit logging requires you to re-configure the cluster partially, as you'll need to feed some settings to the Kubernetes API server. This needs to be done only on control plane nodes.

Complete

First, take the audit policy and write it into a YAML file in /etc/kubernetes, for example /etc/kubernetes/audit.yaml. Make sure it is formatted as a proper YAML file. Also delete the two elements that use variable substitution (the YAML blocks containing variables starting with $) - we're fine with logging those at the metadata level.

The second part is editing the ClusterConfiguration, which, by Kubernetes' own documentation, is edited like this: kubernetes.io.

You'll be doing something similar. First, take the currently used cluster configuration from the appropriate configmap:

kubectl get cm -n kube-system kubeadm-config

Write it into a file, keeping only the ClusterConfiguration document (without the ConfigMap metadata). For example, like this: kubectl get cm -n kube-system kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > /root/clusterconfiguration.yaml

Now, edit it. Set the following options:

  • apiServer.extraArgs.audit-log-path: /var/log/kubernetes/audit.log - tells the API where to write the audit logs.
  • apiServer.extraArgs.audit-log-maxage: '1' - tells the API server to keep the logs for 1 day. You won't need more, as you'll be sending them off to your Loki as soon as possible.
  • apiServer.extraArgs.audit-log-maxbackup: '1' - tells the API server to keep 1 rotation of the logs, if they get full.
  • apiServer.extraArgs.audit-log-maxsize: '1000' - tells the API server to rotate the logs if the file becomes larger than 1000MB.
  • apiServer.extraArgs.audit-policy-file: <audit_policy_location> - tells the API server where to find the audit policy file.

Now, while the API server has the correct configuration options, if you saved this file and uploaded it to the cluster, the API server would start failing. This is because the Kubernetes API container does not have the log path and the policy file location mounted, so it cannot find the files and folders specified. Remedy this by also specifying an apiServer.extraVolumes option, which is a YAML list.

You need to define two elements in this list (a combined sketch follows the list):

  1. An element for the policy file. Give it a name, set the hostPath to the path on the OS (where you added the policy file), and set mountPath to the same path. It's also a good idea to set readOnly: true, as API does not need to write to this file, and as this is a single file, set pathType: File.
  2. An element for the log directory. Give it a name, set the hostPath to the logs path on the OS (the directory, for example: /var/log/kubernetes), and mountPath to the same. Do not set readOnly as API needs to write logs to this directory. As you're dealing with a directory, which is the default, you won't need to set the pathType either.
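Putting both parts together, the apiServer section of the ClusterConfiguration could look roughly like this. This is a sketch assuming the policy file lives at /etc/kubernetes/audit.yaml; the volume names are just examples, and you should add these keys alongside whatever already exists under apiServer:

apiServer:
  extraArgs:
    audit-log-path: /var/log/kubernetes/audit.log
    audit-log-maxage: "1"
    audit-log-maxbackup: "1"
    audit-log-maxsize: "1000"
    audit-policy-file: /etc/kubernetes/audit.yaml
  extraVolumes:
    - name: audit-policy
      hostPath: /etc/kubernetes/audit.yaml
      mountPath: /etc/kubernetes/audit.yaml
      readOnly: true
      pathType: File
    - name: audit-logs
      hostPath: /var/log/kubernetes
      mountPath: /var/log/kubernetes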

Make sure that the log folder /var/log/kubernetes/ actually exists in the controller node.

Now that this has been done, you can update your control plane configuration. If you had multiple control planes, you would have to configure all of them one-by-one. Example:

kubeadm init phase control-plane apiserver --config /root/clusterconfiguration.yaml

Make sure to also load this configuration back up to the cluster, as the ConfigMap values are what are used for future upgrades.

kubectl create cm kubeadm-config --from-file=ClusterConfiguration=/root/clusterconfiguration.yaml -n kube-system --dry-run=client -o yaml | kubectl apply -f -

This command renders the ClusterConfiguration file as a ConfigMap, and then applies it to the cluster, overwriting the old version.

Verify

If the format is correct, you'll get the following line: [control-plane] Creating static Pod manifest for "kube-apiserver". You might also lose connectivity to the API for a minute or two, as the API server will be restarting. You can check the logs from Loki, or from /var/log/pods.

If the API does not come up, check the logs to see why. Edit the configuration and apply it again until it works. During that time your cluster is effectively down for changes, but workloads will continue running. You can also check the running containers with nerdctl if kubectl is not working, for example: sudo nerdctl -n k8s.io ps

If you did everything correctly, you should see a lot of logs in /var/log/kubernetes/audit.log in JSON format. You'll see how much log volume even a fairly empty and small Kubernetes cluster creates, which demonstrates why this functionality is not enabled by default. Each noteworthy event gets its own log line, which is easiest to query using jq on the command line.
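For example, a quick jq query that lists who deleted what (the field names come from the audit Event format; adjust the filter to whatever you are interested in):

jq 'select(.verb == "delete") | {user: .user.username, resource: .objectRef.resource, namespace: .objectRef.namespace}' /var/log/kubernetes/audit.log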

Sending audit logs to Loki

These kinds of logs are only mildly useful when they stay on the host that emitted them. This is both for security reasons - if your node gets compromised, the attacker can delete the logs - and because these kinds of logs usually need to be stored for years.

In a real-world environment, you'd also not run the log aggregation system on the same system which produces the logs, for the same reasons as listed.

Configure Promtail to send the Kubernetes audit logs to Loki.

Complete

You installed Promtail without any specific configuration before. Now you'll need to extend it, so that it can access and forward the audit logs.

The first step is to allow Promtail access to the audit logs, by configuring extra volumes and mounts for the Promtail containers. Create a new Helm values file for Promtail, and fill in extraVolumes and extraVolumeMounts to mount the path /var/log/kubernetes into the Promtail containers.

You can use the defaultVolumes and defaultVolumeMounts as reference from the default values file: Promtail Helm values
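A sketch of what this could look like, following the same structure as the chart's defaultVolumes and defaultVolumeMounts (the volume name audit-logs is just an example):

extraVolumes:
  - name: audit-logs
    hostPath:
      path: /var/log/kubernetes

extraVolumeMounts:
  - name: audit-logs
    mountPath: /var/log/kubernetes
    readOnly: true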

The second step is to configure Promtail to actually scrape the files. You can extend the scrape configuration with the config.snippets.extraScrapeConfigs setting.

Set the config.snippets.extraScrapeConfigs to be the following:

config:
  snippets:
    extraScrapeConfigs: |
      - job_name: kubernetes-audit
        pipeline_stages:
        - json: 
            expressions:
                user: user
                verb: verb
                objectRef:
                annotations:
                timestamp: requestReceivedTimestamp
        - timestamp:
            source: timestamp
            format: RFC3339
        - json:
            expressions: 
                resource:
                namespace:
                apiGroup:
            source: objectRef
        - json:
            expressions:
                decision: '"authorization.k8s.io/decision"'
                reason: '"authorization.k8s.io/reason"'
            source: annotations
        - json:
            expressions:
                username:
                groups: 
            source: user
        - labels:
            username:
            groups: 
            verb: 
            resource: 
            namespace: 
            apiGroup: 
            decision: 
            reason: 
        static_configs:
        - targets:
            - localhost
          labels:
            job: auditlog
            __path__: /var/log/kubernetes/audit.log

A lot of work in this configuration goes into parsing the JSON values into labels for better readability on the Loki side. This allows you to understand the logs better later on, and to categorize or visualize the data. You can check which other values are available by reading the Kubernetes audit logs with jq.

Upgrade the Promtail Helm release with these values.
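Assuming the release is named promtail and the values are saved in promtail.yaml, the upgrade could look like this:

helm upgrade promtail grafana/promtail --namespace loki --values promtail.yaml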

Verify

You can check in the Promtail UI whether the proper files are being scraped.

You can also check the Grafana UI, in Loki explorer. The audit logs should have a label job=auditlog. The logs, when opened up, should also have the different labels like verb or username available.

Tracing (optional)

Important

This part of the lab is optional. You can do it to learn new technologies and concepts, but there won't be any scoring checks to see if you've completed it.

The point of this is to evaluate whether it makes this lab too long or difficult.

Tracing is a method of monitoring and observing the flow of requests as they traverse through a distributed system. It records details of individual operations (called spans) that collectively form a trace, showing the entire lifecycle of a request across multiple services.

So, to reiterate:

  • Span: A single unit of work in a system (for example, an HTTP request or a database query).
  • Trace: A collection of spans that together represent the full lifecycle of a request.

Tracing is usually done in the context of an application, but nowadays more and more infrastructure-level software also provides interfaces to trace its activities. Utilizing tracing requires, similarly to other monitoring, something that sends or publishes the data, and a server that receives the data and allows it to be queried.

There are a lot of systems that provide this functionality, but the one you'll be setting up is Jaeger, an open source, distributed tracing platform. The benefit of Jaeger for us is that it is meant for tracing only, not a million other things on top of it, which keeps it simple and understandable.

Complete

Jaeger has multiple ways of running it, some a bit more complex than others. You only need to run it in the AllInOne mode, where it provides a fully functioning Jaeger without replication or saving the traces into other systems, which would be important in production-grade deployments.

Installing Jaeger into Kubernetes is simplest done via the Jaeger operator, which handles the running and lifecycle management of the Jaeger server for you: https://www.jaegertracing.io/docs/1.63/operator/

Also, keep in mind the Prerequisite section - Jaeger requires cert-manager to operate, so you will need to install that first.

Do the following tasks:

  • Install cert-manager. You can use the Default static install method.
  • Install jaeger-operator. You will need to install it in the cluster wide mode into the observability namespace.

Once this is done, you should have a single pod running inside the observability namespace. If it's green, it means everything is working.

Then, it's time to tell the operator to deploy Jaeger in your application namespace. To do this, deploy the CRD into the namespace where your application server runs:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simplest

This deploys the simplest Jaeger instance, which can then be used to collect traces from your applications.

Verify

You can verify whether Jaeger started up by checking if a pod named simplest was created, and is running in your namespace.

The container also has a plethora of ports, for which the operator created services. You might want to create your own services for easier access, or use port forwarding to access them.

  • Port 16686 is the UI of Jaeger, where you will be able to see traces.
  • Port 4317 or 4318 are the ports where you will need to send your traces.

Once you have a running Jaeger server, the other step is to configure your applications to send data to Jaeger. This is called instrumenting your application, which is usually done via some libraries, depending on the programming language used: https://www.jaegertracing.io/docs/1.63/client-features/

Jaeger subscribes to the OpenTelemetry telemetry data model, which is a set of APIs, SDKs, and tools to improve interoperability between different telemetry systems, including tracing. This means you need to use OpenTelemetry libraries to send traces to Jaeger.

OpenTelemetry supports two main instrumentation methods - manual and automatic:

  • Manual instrumentation is used to instrument your application and export custom spans with custom data. It is very useful in production environments where you want a lot of visibility into the context of an application, as you can control it very precisely.
  • Automatic instrumentation is an easier way to get a baseline amount of traces out of the system. It's easier to set up and get running, and provides a lot of information out of the gate, but providing custom visibility is a bit more difficult.

You will need to see whether automatic instrumentation is provided for your language, and decide whether to use that. You will need to import the correct libraries, and set the environment variables to export your traces. For example, in Python and Flask:

With automatic instrumentation, you require the following libraries:

opentelemetry-instrumentation
opentelemetry-distro
opentelemetry-exporter-otlp
opentelemetry-instrumentation-flask
opentelemetry-exporter-jaeger

And need to configure at least these environment variables:

  • OTEL_SERVICE_NAME - Name of your service when viewed from Jaeger.
  • OTEL_EXPORTER_OTLP_TRACES_ENDPOINT - Where to send the traces. Do keep in mind there's one port for HTTP and one for gRPC.
  • OTEL_TRACES_EXPORTER - Configure this as console,otlp so that traces are written to both console, and sent off via OTLP.

And finally, you will need to execute your Python program via opentelemetry-instrument python myapp.py, instead of calling the app directly.
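In Kubernetes, these environment variables can be set on the application Deployment. A sketch, assuming your Jaeger instance is named simplest and runs in the same namespace as the application (the operator then creates a simplest-collector service; 4318 is the OTLP HTTP port):

env:
  - name: OTEL_SERVICE_NAME
    value: myapp
  - name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
    value: http://simplest-collector:4318
  - name: OTEL_TRACES_EXPORTER
    value: console,otlp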

Manual instrumentation is more complex, and depends on how precisely you're trying to instrument the application. There are a lot of examples in the opentelemetry-python repository, which solve it in different ways.

Our recommendation is to keep it simple at first by using the opentelemetry-api and opentelemetry-sdk libraries in Python, and get it running with simplest trace configuration.

Complete

Your task is to instrument your application, and send traces off to Jaeger.

Verify

You should be able to see the traces of your application in Jaeger.

Upgrading Kubernetes

Kubernetes follows a roughly 4-month release schedule, which means that if you don't upgrade Kubernetes constantly, you'll fall behind very quickly. As new functionality and updates to existing software (for example Ingress controllers) are written to match the latest Kubernetes version, you can very quickly find yourself in a situation where you can't install new software into the cluster because you're a version or two behind.

This was especially true when going, for example, from version 1.24 to 1.25, as a massive and important security feature in Kubernetes, called PodSecurityPolicy, was fully removed. Old software that tried to deploy to a new cluster couldn't, because it tried to deploy manifests to an API that did not exist any more - and the same applied to new software deploying to old clusters. This is why, in several Helm charts, you can still see references to PSP, which is short for PodSecurityPolicy.

This is a general theme with Kubernetes - each version might deprecate or remove specific resource definitions and API versions, and it's up to the cluster admins and software developers to manage this.

Generally, it's easiest to handle this by staying on the latest Kubernetes version, but you need to be careful and make sure all your software is capable of running on the latest version. Some software moves slower, either due to complexity or because the developers do not have enough man-power. One such example is Longhorn, which usually doesn't have a new version out by the time each new Kubernetes minor version is released. Upgrading the cluster in that case can, potentially irrevocably (without backups), corrupt the data.

Kubernetes follows semantic versioning, where upgrades inside the same minor version are straightforward, compatible, and do not break the API. Your cluster is currently on version 1.30.X. You'll be upgrading it to the newer minor version 1.31.Y (the latest version currently), to go through and learn the process.

To see whether an API gets deprecated, carefully read through the Kubernetes v1.31: Elli release notes.

The documentation for upgrading in Kubernetes documentation is here: kubernetes.io

You're recommended to try to upgrade using this document, instead of the lab materials, as it gives a better real-world experience. Also, it might explain some concepts better.

The upgrade workflow at high level is the following:

  1. Upgrade a primary control plane node.
  2. Upgrade additional control plane nodes.
  3. Upgrade worker nodes.

You only need to do steps 1 and 3, as you have a single control plane node.

Complete

First, convert your machine's yum to search for the newer minor version, by changing the repository file at /etc/yum.repos.d/kubernetes.repo. Change the references to v1.30 to v1.31, and trigger a cache rebuild with yum clean all && yum makecache.
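A sketch of doing this in one go (check the repository file afterwards to make sure the substitution did what you expect):

sed -i 's/v1\.30/v1.31/g' /etc/yum.repos.d/kubernetes.repo
yum clean all && yum makecache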

Starting with the control plane node, find the version you'll be upgrading to:

yum list --showduplicates kubeadm --disableexcludes=kubernetes

From this list, find the latest Kubernetes version. At the time of writing, the latest patch version was 1.31.2. This will be referred to as <version> from now on.

Upgrade the node utilities to the new version:

yum install -y kubeadm-'<version>-*' --disableexcludes=kubernetes

Check whether kubeadm got updated with kubeadm version, and start planning the upgrade with kubeadm upgrade plan.

Make sure kubeadm doesn't tell you that you need to manually upgrade something (except the kubelet), and choose the version to upgrade to with the kubeadm upgrade apply v<version> command.

Verify

It will process for a few minutes, but in the end, you should receive a success message, similar to this:

[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Because your cluster has only one control plane node, your cluster will have downtime. In a multi-control-plane situation, the kubeadm tool would upgrade only the current node, and you'd have to run kubeadm upgrade node on the other control plane nodes, one by one.

These were the necessary steps to upgrade the control plane components. You'd run this on all the control plane nodes, before continuing forward, but this did nothing for the nodes themselves - your kubectl and kubelet are still not upgraded on the nodes.

Upgrading the node-level components in a way where workloads don't experience downtime requires all workloads and system-level components to have at least 2 replicas. On top of this, you'd need multiple control planes, and enough free capacity on the other workers to be able to migrate all the workloads from one node to the others.

The steps are usually the following:

  1. Drain the node with kubectl drain <nodename> --ignore-daemonsets. This is the so-called migration step, which just kills the pods on the node and starts them up on another one. This is why workloads need at least 2 replicas, running on different nodes.
  2. Update the binaries, restart the kubelet. Verify everything works.
  3. Uncordon (resume) the node with kubectl uncordon <nodename>.

You can play this through on the worker node, but in your case, upgrading the control plane node is going to make the cluster experience downtime.

Complete

Start with the control plane node and install the new versions of kubelet and kubectl:

yum install -y kubelet-'<version>-*' kubectl-'<version>-*' --disableexcludes=kubernetes

Then, reload the kubelet on the control plane node. This will restart all the pods on the node.

systemctl daemon-reload
systemctl restart kubelet

Verify

Wait for the cluster to return to a working state, and then play this through properly on the worker node. You can check the version of the control plane node with kubectl get nodes.

Make sure all the control plane components are running before continuing.

To check the current versions of the control plane components, you can check the container image versions of the pods in the kube-system namespace: kubectl describe pods -n kube-system | grep "Image:" | grep "kube-".

Now, upgrade the worker node properly.

Complete

First, make sure the worker node has new repository version configured.

Drain the node with kubectl drain <worker-node-name> --ignore-daemonsets --delete-emptydir-data. This might take time. As you can probably tell, it will destroy all the pods on the node, along with their emptyDir data. Experience shows that some very stateful applications might have difficulties getting deleted with commands like this, in which case you can cancel the command at some point - most of the important workloads will have left already, and you can continue with the upgrade.

Install the new kubelet and kubectl versions:

yum install -y kubelet-'<version>-*' kubectl-'<version>-*' --disableexcludes=kubernetes

And reload the kubelet the same way as you did with control plane:

systemctl daemon-reload
systemctl restart kubelet

Something you should notice here is that you can still use the cluster during this time. This might also be a good time to upgrade the OS-level packages, if needed, and reboot the host.

Verify

If everything worked out, the kubectl get nodes command should show the worker node on the new version. If that is the case, you can continue and uncordon the worker node.

kubectl uncordon <worker-node-name>

Kubernetes should start migrating workloads back to the worker node fairly quickly, especially Longhorn.

Also update the second worker node the same way.

The Kubernetes cluster is now on a newer and better version. This would be a good time to have a look at the workloads and logs, to make sure everything is working as intended.