Lab 11 - Cluster administration
Overview¶
- Tearing down existing monitoring
- Installing Helm versions of Prometheus, Grafana and Loki
- Kubernetes Audit logging
- Upgrading the cluster
- Using Kubeadm patches
Introduction¶
In this Cluster administration lab, you'll delve into key aspects of monitoring, logging, and cluster management. You'll destroy the manually installed monitoring layer and leverage newer tooling, namely operators and Helm, to set up better versions of the same stack. You'll be doing this because proper visibility into your cluster and applications is a large part of the complexity in microservices environments, and without it, it's impossible to understand your infrastructure.
This lab also guides you through the process of upgrading a Kubernetes cluster, relying on best practices and real-world experience. While updates are generally important, they're especially so in Kubernetes, due to the frequency of new releases and the way the whole Kubernetes ecosystem moves together with the versions.
Tearing down existing monitoring¶
As you are going to reinstall your monitoring stack in this lab, you can delete the old components to make some space.
Complete
Delete the existing grafana, prometheus, loki, node exporter, kube-state-metrics and promtail deployments using kubectl delete.
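If you deployed everything into a single monitoring namespace in lab 9, the cleanup could look roughly like the sketch below. The namespace, resource kinds and names here are assumptions; adjust them to match what you actually created.
# Hypothetical example: adjust namespace, names and resource kinds to your lab 9 setup.
kubectl get deployments,daemonsets,statefulsets,pvc -n monitoring   # see what is actually there
kubectl delete deployment grafana prometheus kube-state-metrics -n monitoring
kubectl delete daemonset node-exporter promtail -n monitoring
kubectl delete statefulset loki -n monitoring
kubectl delete pvc --all -n monitoring   # removes the old persistent volume claims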
Verify
Make sure everything has been deleted by checking whether the PVCs have been deleted, and whether the whole lab 9 turns red in scoring.
Installing Prometheus and Grafana with Helm¶
Initial setup of the Prometheus stack using Helm is fairly straightforward. You'll be installing a chart called kube-prometheus-stack, which is a fairly complex system of Prometheus, Grafana, Alertmanager, node-exporter, kube-state-metrics and an adapter for the Kubernetes Metrics APIs, the last of which you'll need for the Horizontal Pod Autoscaler, or HPA, in the last lab of the course.
The repository for the stack is located here: https://github.com/prometheus-operator/kube-prometheus
You can also take a look at the Helm repository: https://prometheus-community.github.io/helm-charts
Complete
Set up the Helm repository as shown on the Helm repository site, using the helm repo add command.
You should then be able to find a list of charts with the helm search repo prometheus-community command. You'll be using the kube-prometheus-stack chart.
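For reference, a minimal sketch of these commands, using the repository URL linked above and the conventional repository name prometheus-community:
# add the Prometheus community chart repository and refresh the local index
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# list the charts available in the repository
helm search repo prometheus-community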
First, make sure to read the Helm values to a file, so you can edit and view the settings:
helm show values prometheus-community/kube-prometheus-stack > prometheus.yaml
Make sure to change the following settings (a sketch of the resulting values file excerpt is shown after this list):

- The Prometheus volume setting prometheus.prometheusSpec.storageSpec:
    - Enable PersistentVolumeClaim usage by uncommenting the relevant part, and removing the {} from behind storageSpec.
    - Change the storageClassName to longhorn, which is what you use in the cluster.
    - Change the resources.requests.storage to 10Gi.
- Set the Grafana admin password with the grafana.adminPassword setting.
- Set podMonitorSelectorNilUsesHelmValues and serviceMonitorSelectorNilUsesHelmValues to false. This is necessary for this Prometheus to pick up all the Pod and Service monitoring across the cluster.
- Set up an ingress for Grafana by configuring the grafana.ingress setting. You can use the hostname grafana.<NODE_IP>.nip.io.
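As a reference, the changed parts of prometheus.yaml could look roughly like the following sketch (it assumes the current chart layout; keep the rest of the file as-is and replace the placeholder password and hostname):
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: longhorn
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
grafana:
  adminPassword: <your-admin-password>
  ingress:
    enabled: true
    hosts:
      - grafana.<NODE_IP>.nip.io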
And then make sure to install the kube-prometheus-stack chart with helm install. You can pass the changed values file with the --values prometheus.yaml option. Make sure you install it into the prometheus namespace, and use the name prometheus for the Helm release (otherwise the scoring server might not find your resources).
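The install command could then look like this (the --create-namespace flag is only needed if the prometheus namespace does not exist yet):
# install the chart with the release name and namespace "prometheus", using the edited values
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace prometheus --create-namespace \
  --values prometheus.yaml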
Verify
You can verify the initial installation from the Helm output; it should say something like:
NAME: prometheus
LAST DEPLOYED: Mon Nov 20 09:12:12 2023
NAMESPACE: prometheus
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace prometheus get pods -l "release=prometheus"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
You can try accessing Prometheus and Grafana via port forwarding, or Grafana via the ingress. There's no need for a Grafana NodePort
service this time around, as you'll be using ingress.
Taking a look inside¶
When you now take a look inside your new Grafana instance, you'll see that Helm automatically set up quite a few rules, dashboards and alerts.
This is the base monitoring capacity a Kubernetes cluster should have nowadays, given how simple it is to set up.
It should be useful for answering any questions related to Kubernetes and its related hosts.
As an exercise, you can try optimizing the workloads running in your cluster, to reduce CPU and memory consumption, by comparing actual resource consumption to the requests and limits.
Replicating previous lab's configuration with new stack¶
In the previous lab, you set up Pod scrape monitoring with Prometheus to be able to detect and view how your application updates. As deleting the stack also deleted this configuration, replicate it with the new system.
In the new system, you could just take the Prometheus scrape configuration given to you in the previous lab, and shove it into the Helm values file. While this would work, the new Prometheus stack provides interfaces for doing this without writing any scrape configuration.
The kube-prometheus-stack introduces two new CRDs for this purpose: ServiceMonitor and PodMonitor. These are automatically configured probes that scrape either Services or Pods. In the previous lab, you configured Prometheus to scrape Pods directly, so that is what you need to configure now.
There's a middle step as well: because the kube-prometheus-stack is slightly opinionated about how things are supposed to work, you need to specify the scrape port with a port name, and move the scrape annotation to labels. In the previous lab, you did not configure a name for the scrape port (an oversight on the teacher's part), so you'll need to do that now.
Configure Prometheus to scrape your application by creating a PodMonitor object.
Complete
Set up the new PodMonitor manifest, and apply it to the cluster.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: port-9101
spec:
  selector:
    matchLabels:
      "prometheus.io/scrape": "true"
  podMetricsEndpoints:
    - port: metrics
      interval: 10s
      path: /metrics
      relabelings:
        - action: replace
          sourceLabels:
            - __meta_kubernetes_pod_label_app
          targetLabel: app
        - action: replace
          sourceLabels:
            - __meta_kubernetes_pod_label_version
          targetLabel: version
  namespaceSelector:
    any: true
We add two relabel configurations to add the version and app labels to the time series, so that the given Grafana dashboard, which uses these labels for aggregation, keeps working.
As you can see, this configuration expects to find Pods with the label prometheus.io/scrape: true, so make sure to add it to your workloads. This will cause Prometheus to start collecting metrics from the /metrics endpoint.
On top of this, add a port named metrics to your workloads. In the case of last week's workloads, you'll need to add this part to spec.containers.ports:
- name: metrics
  containerPort: 9101
Now any pod with the correct label and a port named metrics defined will be scraped automatically.
Verify
You can verify whether last week's dashboards work, and whether Prometheus sees the targets under its service discovery and targets list.
Install Loki with Helm¶
Loki is also much more straightforward to set up using Helm. It has been automated to install in a sensible fashion, and takes a lot of the difficult configuration out of the equation. From the Loki Helm chart documentation https://grafana.com/docs/loki/latest/setup/install/helm/install-monolithic/:
Info
If you set the singleBinary.replicas value to 1, this chart configures Loki to run the all target in a monolithic mode, designed to work with a filesystem storage. It will also configure meta-monitoring of metrics and logs. If you set the singleBinary.replicas value to 2 or more, this chart configures Loki to run a single binary in a replicated, highly available mode. When running replicas of a single binary, you must configure object storage.
Complete
Set up the Helm repository as shown on the Helm documentation site, using the helm repo add command.
You should then be able to find a list of charts with the helm search repo grafana command. You'll be using the loki chart.
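A sketch of the repository setup, assuming the standard Grafana charts repository URL:
# add the Grafana chart repository (hosts both loki and promtail charts) and refresh the index
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm search repo grafana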
Configure it with the following options:

- loki.auth_enabled: false - this means Grafana won't need to authenticate to Loki to query logs. We will use Kubernetes security in the next lab instead.
- loki.persistence.enabled: true, loki.persistence.storageClassName: longhorn, loki.persistence.size: 10Gi - these make Loki use a PVC and persistent storage.
- deploymentMode: SingleBinary - this tells Loki to use the SingleBinary deployment mode.
- singleBinary.replicas: 1 - this makes Loki be installed in a single-binary, non-distributed mode, which can use filesystem storage for backing storage. Otherwise we would require S3 storage.
- loki.storage.type: filesystem - make it use the filesystem storage.
- loki.commonConfig.replication_factor: 1 - as we are installing a single instance, on one filesystem, we won't need to replicate it and waste space on replication.
- test.enabled: false, serviceMonitor.enabled: false and monitoring.selfMonitoring.enabled: false - these settings turn off self-monitoring of the Loki instance. Self-monitoring would be important and useful in production, but in your case, where you don't have any alerting capability, it's just going to eat up resources, so you can disable it.
- write.replicas: 0, read.replicas: 0, backend.replicas: 0 - these components belong to the distributed deployment modes, and are not needed when running in SingleBinary mode.
- chunksCache.allocatedMemory: 1500 - the initial size of 8000 is too large for our cluster. Adjust this size upwards if you run into issues with not enough memory in Loki pods.
You will also need to set up a schema config at loki.schemaConfig, which tells Loki what format to use for storing log files:
schemaConfig:
  configs:
    - from: 2024-01-01
      object_store: filesystem
      store: tsdb
      schema: v13
      index:
        prefix: index_
        period: 24h
A useful fact about Helm values files is that you do not need to take the full values file, change the specific parts, and then pass the entire 1600-line file back as your values file.
You can also just set the important bits, and everything else will be defaulted by Helm automatically. So, for example, a correct values file for the above settings would be:
loki:
  persistence:
    enabled: true
    storageClassName: longhorn
    size: 10Gi
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: 'filesystem'
  schemaConfig:
    configs:
      - from: 2024-01-01
        object_store: filesystem
        store: tsdb
        schema: v13
        index:
          prefix: index_
          period: 24h
deploymentMode: SingleBinary
singleBinary:
  replicas: 1
monitoring:
  selfMonitoring:
    enabled: false
test:
  enabled: false
serviceMonitor:
  enabled: false
write:
  # -- Number of replicas for the write
  replicas: 0
read:
  # -- Number of replicas for the read
  replicas: 0
backend:
  # -- Number of replicas for the backend
  replicas: 0
chunksCache:
  allocatedMemory: 1500
You can now install the chart to the loki namespace using the appropriate values.
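Assuming you saved the values above into a file called loki.yaml and want to call the release loki (the release name is your choice, as scoring does not mandate one here), the install could look like this:
# install the Loki chart into the loki namespace with the reduced values file
helm install loki grafana/loki \
  --namespace loki --create-namespace \
  --values loki.yaml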
Verify
You should see a bunch of containers starting up out of the gate:

- loki-0: This is the core Loki pod. This pod is responsible for storing and querying your log data.
- loki-canary-*: These are instances of Loki Canary, a tool used for monitoring and alerting on the logging pipeline's integrity. Loki Canary writes a log to Loki and then ensures it can query back those logs within a certain time frame, alerting if this is not the case.
- loki-gateway-*: This is a gateway for the Loki service, handling incoming HTTP requests. It is responsible for distributing the queries to the Loki server backend, especially if there are multiple Loki instances.
- loki-grafana-agent-operator-*: This is the Grafana Agent Operator, a component that automates the deployment and management of Grafana Agents. Grafana Agent is a telemetry collector sending metrics, logs, and trace data to Grafana Cloud or other monitoring systems. We won't be using this operator.
As with the first time you installed Loki, there are two things you need to do: set up promtail and Grafana to talk with Loki. Thankfully, this is made easier by Helm as well.
Installing Promtail¶
Installing Promtail is made simple with Helm as well. The chart is in the same repository as Loki's, so it'll be very easy to set up.
Complete
Install the chart grafana/promtail to the loki namespace. This time around, you won't need to specify any extra values, as the defaults currently work nicely for our purpose.
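A minimal sketch of the install, assuming you call the release promtail:
# install Promtail with default values into the loki namespace
helm install promtail grafana/promtail --namespace loki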
Verify
After installation, you should have two promtail pods that instantly start scraping logs. They might complain something like:
level=error ts=2023-11-20T14:34:29.141649402Z caller=client.go:430 component=client host=loki-gateway msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): 12 errors like: entry for stream '{app=\"teacher-test.cloud.ut.ee\", component=\"kube-apiserver\", container=\"kube-apiserver\", filename=\"/var/log/pods/kube-system_kube-apiserver-teacher-test.cloud.ut.ee_ff01467ad8a05d6aad9e55f6149157dc/kube-apiserver/14.log\", job=\"kube-system/teacher-test.cloud.ut.ee\", namespace=\"kube-system\", node_name=\"teacher-test.cloud.ut.ee\", pod=\"kube-apiserver-teacher-test.cloud.ut.ee\", stream=\"stderr\"}' has timestamp too old: 2023-09-19T05:44:10Z, oldest acceptable timestamp is: 2023-11-13T14:34:29Z; 9 errors like: entry for stream '{app=\"teacher-test.cloud.ut.ee\", component=\"kube-apiserver\", container=\"kube-apiserver\", filename=\"/var/log/pods/kube-system_kube-apiserver-teacher-test.cloud.ut.ee_ff01467ad8a05d6aad9e55f6149157dc/kube-apiserver/14.log\", job=\"kube-system/teacher-test.cloud.ut.ee\", namespace=\"kube-system\", node_name=\"teacher-test.cloud.ut.ee\", pod=\"kube-apiserver-teacher-test.cloud.ut.ee\", stream=\"stderr\"}' has timestamp too old: 2023-09-19T00:44:52Z, oldest acceptable timesta"
This is normal and fine: Loki, by default, only accepts data from up to a week ago. Your machines have much older logs on their filesystems, but logs that old aren't very useful anyway.
You can also check the UI of promtail, and see how much configuration gets automatically done for you.
Configuring Grafana to use Loki data source¶
Out of the gate, Grafana does not know how to connect to Loki to query the logs. Thankfully, the developers of the kube-prometheus-stack Helm chart have thought about this problem, and have exposed a Helm values setting called grafana.additionalDataSources, where you can configure other data sources Grafana should connect to.
Complete
Add a new element to the grafana.additionalDataSources list, with the following fields:

- name: loki
- access: proxy
- isDefault: false
- orgId: 1
- url: DNS and port of the loki service in the loki namespace.
- version: 1
- type: loki
This configures all the necessary settings for Grafana to connect to Loki, automatically. Upgrade your Prometheus Helm release with the new values file.
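As a sketch, the relevant part of the Prometheus values file could look like the following. The URL assumes the Loki chart created a service named loki in the loki namespace, listening on its usual HTTP port 3100; verify the actual service name and port with kubectl get svc -n loki.
grafana:
  additionalDataSources:
    - name: loki
      access: proxy
      isDefault: false
      orgId: 1
      # assumed default service name and HTTP port of the Loki chart
      url: http://loki.loki.svc.cluster.local:3100
      version: 1
      type: loki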
Verify
There are two things you can verify.
- Check if the Loki data source works by going to the data sources configuration in the Grafana UI. The Loki data source options should be greyed out, but there is still a test button at the bottom. This test should succeed.
- Check for Loki logs in the Explore window. The label browser in the Explore window should show a lot of sensible labels, like namespace, and they should be filled out with values, which you can easily use for queries. On top of this, all the log lines should have proper timestamps attached to them.
If everything is working, you now have a fully working logging and monitoring stack, which is much easier to configure than a custom-made kubectl manifest bundle.
Setting up audit logs¶
Kubernetes makes some things very easy with its REST API based architecture, because the HTTP protocol is inherently transactional. This transactionality also makes it, theoretically, very auditable, as you can always tell who did what kind of operation in the cluster. This kind of tracking is called auditing or audit logging, and it is very useful both for debugging issues for cluster users (for example, why can't I access this?) and for security (for example, which user brought down the production database pod).
Audit logging needs to be configured manually: Kubernetes does not enable any kind of audit logging by default. This is because the Kubernetes API server (which is what writes the audit logs) gets a lot of requests, and logging all of them is necessary only in the most secure installations, as it is going to consume a lot of storage and IO.
Instead, Kubernetes allows you to write a policy defining what and how to audit. In this section, you'll be setting up audit logging with the Google Kubernetes Engine audit policy, which is shown here: GitHub.
Now, this policy is definitely not perfect, but it is what Google has settled on through countless hours of running Kubernetes clusters.
Enabling audit logging requires you to re-configure the cluster partially, as you'll need to feed some settings to the Kubernetes API server. This needs to be done only on control plane nodes.
Complete
First, take the audit policy and write it into a YAML file in /etc/kubernetes, for example /etc/kubernetes/audit.yaml. Make sure it is formatted as a proper YAML file. Also delete the two elements that use variable substitution (the YAML blocks containing variables starting with $); we're fine with logging those on the metadata level.
The second part is editing the ClusterConfiguration, which, by Kubernetes' own documentation, is edited like this: kubernetes.io.
You'll be doing something similar. First, take the currently used cluster configuration from the appropriate ConfigMap:
kubectl get cm -n kube-system kubeadm-config
Write it into a file and remove the metadata part. For example, like this: kubectl get cm -n kube-system kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > /root/clusterconfiguration.yaml
Now, edit it. Set the following options (a sketch of the resulting apiServer section is shown after the volumes list below):

- apiServer.extraArgs.audit-log-path: /var/log/kubernetes/audit.log - tells the API server where to write the audit logs.
- apiServer.extraArgs.audit-log-maxage: '1' - tells the API server to keep the logs for 1 day. You won't need more, as you'll be sending them off to your Loki as soon as possible.
- apiServer.extraArgs.audit-log-maxbackup: '1' - tells the API server to keep 1 rotation of the logs, if they get full.
- apiServer.extraArgs.audit-log-maxsize: '1000' - tells the API server to rotate the logs if the file becomes larger than 1000MB.
- apiServer.extraArgs.audit-policy-file: <audit_policy_location> - tells the API server where to find the audit policy file.
Now, while the API server has the correct configuration options, if you saved this file and uploaded it to the cluster, the API server would start failing. This is because the Kubernetes API container does not have the log path and policy file location mounted, so it cannot find the files and folders specified. Remedy this by also specifying an apiServer.extraVolumes option, which is a YAML list.
You need to define two elements in this list:

- An element for the policy file. Give it a name, set the hostPath to the path on the OS (where you added the policy file), and set mountPath to the same path. It's also a good idea to set readOnly: true, as the API server does not need to write to this file, and as this is a single file, set pathType: File.
- An element for the log directory. Give it a name, set the hostPath to the logs path on the OS (the directory, for example /var/log/kubernetes), and mountPath to the same. Do not set readOnly, as the API server needs to write logs to this directory. As you're dealing with a directory, which is the default, you won't need to set the pathType either.
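Putting the pieces together, the apiServer section of /root/clusterconfiguration.yaml could end up looking roughly like this sketch. The volume names and the policy file path are just examples, and the layout assumes the ClusterConfiguration format your cluster currently uses, where extraArgs is a map.
apiServer:
  extraArgs:
    audit-log-path: /var/log/kubernetes/audit.log
    audit-log-maxage: "1"
    audit-log-maxbackup: "1"
    audit-log-maxsize: "1000"
    audit-policy-file: /etc/kubernetes/audit.yaml
  extraVolumes:
    # the policy file, mounted read-only into the API server container
    - name: audit-policy
      hostPath: /etc/kubernetes/audit.yaml
      mountPath: /etc/kubernetes/audit.yaml
      readOnly: true
      pathType: File
    # the directory the API server writes the audit log into
    - name: audit-logs
      hostPath: /var/log/kubernetes
      mountPath: /var/log/kubernetes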
Make sure that the log folder /var/log/kubernetes/ actually exists on the control plane node.
Now that this has been done, you can update your control plane configuration. If you had multiple control planes, you would have to configure all of them one-by-one. Example:
kubeadm init phase control-plane apiserver --config /root/clusterconfiguration.yaml
Make sure to also load this configuration back up to the cluster, as the ConfigMap
values are what are used for future upgrades.
kubectl create cm kubeadm-config --from-file=ClusterConfiguration=/root/clusterconfiguration.yaml -n kube-system --dry-run=client -o yaml | kubectl apply -f -
This command renders the ClusterConfiguration
file as a ConfigMap
, and then applies it to the cluster, overwriting the old version.
Verify
If the format is correct, you'll get the following line: [control-plane] Creating static Pod manifest for "kube-apiserver". You might also, for a minute or two, lose connectivity to the API, as the API server will be rebooting. You can check the logs from Loki, or from /var/log/pods.
If the API does not come up, check the logs to see why. Edit the configuration and upload it again, until it works. During that time your cluster is effectively down for changes, but workloads will continue running. You can also check the running containers with nerdctl if kubectl is not working, for example: sudo nerdctl -n k8s.io ps
If you did everything correctly, you should see a lot of logs in /var/log/kubernetes/audit.log in JSON format. You'll see how many logs even a fairly empty and small Kubernetes cluster creates, which demonstrates why this functionality is not enabled by default. Each noteworthy event gets its own log line, which is easiest to query using jq on the command line.
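For example, a quick query like the one below shows who deleted what; the field names come from the standard Kubernetes audit event format.
# show the user, resource and object name of recent delete requests
sudo tail -n 1000 /var/log/kubernetes/audit.log \
  | jq 'select(.verb == "delete") | {user: .user.username, resource: .objectRef.resource, name: .objectRef.name}'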
Sending audit logs to Loki¶
These kinds of logs are only mildly useful when they stay on the host that emitted them. This is both because of security reasons (if your node gets compromised, the attacker can delete the logs) and because these kinds of logs usually need to be stored for years.
In a real-world environment, you'd also not run the log aggregation system on the same system that produces the logs, for the same reasons.
Configure Promtail to send the Kubernetes audit logs to Loki.
Complete
You installed Promtail without any specific configuration before. Now you'll need to extend it so that it can access and forward the audit logs.
The first step is to allow Promtail access to the audit logs, by configuring extra volumes and mounts for the Promtail containers. Create a new Helm values file for Promtail, and fill in extraVolumes and extraVolumeMounts to mount the path /var/log/kubernetes into the Promtail containers.
You can use the defaultVolumes and defaultVolumeMounts as a reference from the default values file: Promtail Helm values. A minimal sketch is shown below.
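A sketch of the volume part of the Promtail values file, assuming a volume name of audit-logs and mirroring how the chart's defaultVolumes and defaultVolumeMounts are structured:
# mount the host's audit log directory into the Promtail pods, read-only
extraVolumes:
  - name: audit-logs
    hostPath:
      path: /var/log/kubernetes
extraVolumeMounts:
  - name: audit-logs
    mountPath: /var/log/kubernetes
    readOnly: true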
The second step is to configure Promtail to actually scrape the files. You can extend the scrape configuration with the config.snippets.extraScrapeConfigs setting.
Set config.snippets.extraScrapeConfigs to the following:
config:
  snippets:
    extraScrapeConfigs: |
      - job_name: kubernetes-audit
        pipeline_stages:
          - json:
              expressions:
                user: user
                verb: verb
                objectRef:
                annotations:
                timestamp: requestReceivedTimestamp
          - timestamp:
              source: timestamp
              format: RFC3339
          - json:
              expressions:
                resource:
                namespace:
                apiGroup:
              source: objectRef
          - json:
              expressions:
                decision: '"authorization.k8s.io/decision"'
                reason: '"authorization.k8s.io/reason"'
              source: annotations
          - json:
              expressions:
                username:
                groups:
              source: user
          - labels:
              username:
              groups:
              verb:
              resource:
              namespace:
              apiGroup:
              decision:
              reason:
        static_configs:
          - targets:
              - localhost
            labels:
              job: auditlog
              __path__: /var/log/kubernetes/audit.log
A lot of the work in this configuration goes into parsing the JSON values into labels for better readability on the Loki side. This makes it easier to understand the logs later on, and to categorize or visualize the data. You can check which other values are available by reading the Kubernetes audit logs with jq.
Upgrade the Promtail release with these Helm values.
Verify
You can check in the Promtail UI whether the proper files are being scraped.
You can also check the Grafana UI, in the Loki explorer. The audit logs should have the label job=auditlog. The logs, when opened up, should also have the different labels like verb or username available.
Tracing (optional)¶
Important
This part of the lab is optional. You can do it to learn new technologies and concepts, but there won't be any scoring checks to see if you've completed it.
The point of this is to evaluate whether it makes this lab too long or difficult.
Tracing is a method of monitoring and observing the flow of requests as they traverse through a distributed system. It records details of individual operations (called spans) that collectively form a trace, showing the entire lifecycle of a request across multiple services.
So, to reiterate:
- Span: A single unit of work in a system (For example an HTTP request, database query).
- Trace: A collection of spans that together represent the full lifecycle of a request.
Tracing is usually done in the context of an application, but nowadays more and more infrastructure-level software also provides interfaces to trace its activities. Utilizing tracing requires, similarly to other monitoring, something that sends or publishes the data, and a server that receives the data and allows querying it.
There are a lot of systems that provide this functionality, but the one you'll be setting up is Jaeger, an open source, distributed tracing platform. The benefit of Jaeger for us is that it is meant to do tracing only, not a million other things on top of it, which keeps it simple and understandable.
Complete
Jaeger can be run in multiple ways, some a bit more complex than others. You only need to run it in the AllInOne mode, where it provides a fully functioning Jaeger without replication or saving the traces into other systems, which would be important in production grade deployments.
The simplest way to install Jaeger into Kubernetes is via the Jaeger operator, which handles the running and lifecycle management of the Jaeger server for you: https://www.jaegertracing.io/docs/1.63/operator/
Also, keep in mind the Prerequisite section: Jaeger requires cert-manager to operate, so you will need to install that first.
Do the following tasks:

- Install cert-manager. You can use the Default static install method.
- Install jaeger-operator. You will need to install it in the cluster-wide mode into the observability namespace.
Once this is done, you should have a single pod running inside the observability namespace. If it's green, it means everything is working.
Then it's time to tell the operator to deploy Jaeger in your application namespace. To do this, deploy the following custom resource into the namespace where your application server runs:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simplest
This deploys the simplest Jaeger instance, which can then be used to collect traces from your applications.
Verify
You can verify whether Jaeger started up by checking if a pod named simplest was created and is running in your namespace.
The container also has a plethora of ports, which the operator created services for. You might want to create your own services for easier access, or use port forwarding to access them.
- Port 16686 is the UI of Jaeger, where you will be able to see traces.
- Port 4317 or 4318 are the ports where you will need to send your traces.
Once you have a running Jaeger server, the other step is to configure your applications to send data to Jaeger. This is called instrumenting your application, which is usually done via some libraries, depending on the programming language used: https://www.jaegertracing.io/docs/1.63/client-features/
Jaeger subscribes to the OpenTelemetry telemetry data model, which is a set of APIs, SDKs, and tools to improve interoperability between different telemetry systems, including tracing. This means you need to use OpenTelemetry libraries to send traces to Jaeger.
OpenTelemetry supports two main instrumentation methods - manual and automatic:
- Manual is used to instrument your application, and export custom spans with custom data. Very useful in production environments where you want to have a lot of visibility into the context of an application, as you can control it very well.
- Automatic is an easier way to get the baseline amount of traces out of the system. It's easier to setup and get running, and provides a lot of information out of the gate, but providing custom visibility is a bit more difficult.
You will need to see whether automatic instrumentation is provided for your language, and decide whether to use that. You will need to import the correct libraries, and set the environment variables to export your traces. For example, in Python and Flask:
With automatic instrumentation, you require the following libraries:
opentelemetry-instrumentation
opentelemetry-distro
opentelemetry-exporter-otlp
opentelemetry-instrumentation-flask
opentelemetry-exporter-jaeger
And you need to configure at least these environment variables:

- OTEL_SERVICE_NAME - the name of your service as viewed from Jaeger.
- OTEL_EXPORTER_OTLP_TRACES_ENDPOINT - where to send the traces. Do keep in mind there's one port for HTTP and one for gRPC.
- OTEL_TRACES_EXPORTER - configure this as console,otlp so that traces are both written to the console and sent off via OTLP.
And finally, you will need to execute your Python program via opentelemetry-instrument python myapp.py, instead of calling the app directly.
Manual instrumentation is more complex, and depends on how precisely you're trying to instrument the application. There are a lot of examples in the opentelemetry-python repository, which solve it in different ways.
Our recommendation is to keep it simple at first by using the opentelemetry-api and opentelemetry-sdk libraries in Python, and get it running with the simplest trace configuration.
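As a starting point, a minimal manual-instrumentation sketch in Python could look like the one below. The service name and the collector endpoint are assumptions (the operator typically creates a simplest-collector service for a Jaeger instance named simplest); adjust them to your environment.

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Name the service as it should show up in Jaeger (assumed name).
provider = TracerProvider(resource=Resource.create({"service.name": "my-flask-app"}))
# Send spans to the Jaeger collector over OTLP/HTTP (assumed service name, port 4318),
# and also print them to the console for debugging.
provider.add_span_processor(BatchSpanProcessor(
    OTLPSpanExporter(endpoint="http://simplest-collector:4318/v1/traces")))
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("handle-request"):
    # your application logic goes here
    pass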
Complete
Your task is to instrument your application, and send traces off to Jaeger.
Verify
You should be able to see the traces of your application in Jaeger.
Upgrading Kubernetes¶
Kubernetes follows a 4-month release schedule, which means that if you don't upgrade constantly, you'll fall behind very quickly. As new functionality and existing software updates (for example Ingress controllers) are written to match the latest Kubernetes version, you can very quickly find yourself in a situation where you can't install new software to the cluster because you're a version or two behind.
This was especially true when going from version 1.24 to 1.25, as a massive and important security feature in Kubernetes, called PodSecurityPolicy, was removed entirely: old software that tried to deploy to a new cluster couldn't, because it tried to deploy manifests to an API that no longer existed, and the same applied to new software deploying to old clusters. This is why, in several Helm charts, you can still see references to PSP, which is short for PodSecurityPolicy.
This is a general theme with Kubernetes - each version might deprecate or remove specific resource definitions and API versions, and it's up to the cluster admins and software developers to manage this.
Generally, it's easiest to handle this by staying on the latest Kubernetes version, but you need to be careful and make sure all your software is capable of running on the latest version. Some software moves slower, either due to complexity or because the developers do not have enough man-power. One such example is Longhorn, which usually doesn't have a compatible release out by the time of every new Kubernetes minor version. Installing a new Kubernetes version in that case can corrupt the data, potentially irrevocably (without backups).
Kubernetes follows semantic versioning, where upgrades inside the same minor version are straightforward and compatible, and do not break the API. Your cluster is currently on version 1.30.X. You'll be upgrading it to the newer minor version 1.31.Y (the latest version currently), to go through and learn the process.
To see whether an API gets deprecated, carefully read through the Kubernetes v1.31 release notes.
The documentation for upgrading a cluster is in the Kubernetes documentation here: kubernetes.io
You're recommended to try to upgrade using that document instead of the lab materials, as it gives a better real-world experience. It might also explain some concepts better.
The upgrade workflow at high level is the following:
- Upgrade a primary control plane node.
- Upgrade additional control plane nodes.
- Upgrade worker nodes.
You only need to do steps 1 and 3, as you have a single control plane node.
Complete
First, point your machine's yum at the newer minor version by changing the repository file at /etc/yum.repos.d/kubernetes.repo. Change the references to v1.30 to v1.31, and trigger a cache rebuild with yum clean all && yum makecache.
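One way to do this, assuming the repository file only references the minor version in its URLs and GPG key paths:
# replace every v1.30 reference with v1.31 in the Kubernetes repo file, then rebuild the cache
sudo sed -i 's/v1\.30/v1.31/g' /etc/yum.repos.d/kubernetes.repo
sudo yum clean all && sudo yum makecache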
Starting with the control plane node, find the version you'll be upgrading to:
yum list --showduplicates kubeadm --disableexcludes=kubernetes
From this list, find the latest Kubernetes version. At the time of writing, the latest patch version was 1.31.2. This will be referred to as <version> from now on.
Upgrade the node utilities to the new version:
yum install -y kubeadm-'<version>-*' --disableexcludes=kubernetes
Check whether kubeadm got updated: kubeadm version
Then start planning the upgrade: kubeadm upgrade plan.
Make sure kubeadm doesn't tell you that you need to manually upgrade something (except the kubelet), and choose the version to upgrade to with the kubeadm upgrade apply v<version> command.
Verify
It will process for a few minutes, but in the end, you should receive a success message, similar to this:
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Because your cluster has only one control plane node, it will have downtime. In a multi-control-plane situation, the kubeadm tool would upgrade only the current node, and you'd have to do a kubeadm upgrade apply on the other control plane nodes, one by one.
These were the necessary steps to upgrade the control plane components. You'd run this on all the control plane nodes before continuing, but it did nothing for the nodes themselves: your kubectl and kubelet are still not upgraded on the nodes.
Upgrading the node-level components in a way where workloads don't experience downtime requires all workloads and system-level components to have at least 2 replicas. On top of this, you'd need multiple control planes, and enough free capacity on the other workers to be able to completely migrate the workloads off one node onto the others.
The steps are usually the following:

- Drain the node with kubectl drain <nodename> --ignore-daemonsets. This is the so-called migration step, which just kills the pods on the node and starts them up on another. This is why workloads need at least 2 replicas, which are on different nodes.
- Update the binaries, restart the kubelet. Verify everything works.
- Uncordon (resume) the node with kubectl uncordon <nodename>.
You can play this through on the worker node, but in your case, upgrading the control plane node is going to make the cluster experience downtime.
Complete
Start with the control plane node and install the new versions of kubelet and kubectl:
yum install -y kubelet-'<version>-*' kubectl-'<version>-*' --disableexcludes=kubernetes
Then, reload the kubelet
on the control plane node. This will restart all the pods on the node.
systemctl daemon-reload
systemctl restart kubelet
Verify
Wait for the cluster to resume to a working state, and then play this through properly on the worker node. You can check the version of the control plane node with kubectl get nodes.
Make sure all the control plane components are running before continuing.
To check the current versions of the control plane components, you can check the container image versions of the pods in the kube-system namespace: kubectl describe pods -n kube-system | grep "Image:" | grep "kube-".
Now, upgrade the worker node properly.
Complete
First, make sure the worker node has the new repository version configured.
Drain the node with kubectl drain <worker-node-name> --ignore-daemonsets --delete-emptydir-data. This might take time. As you can probably tell, it will destroy all the pods on the node, and their emptyDir data. Experience shows that some very stateful applications might have difficulties getting deleted with commands like this, in which case you can, at some point, cancel the command: the most important workloads have left already, and you can continue with the upgrade.
Install the new kubelet and kubectl versions:
yum install -y kubelet-'<version>-*' kubectl-'<version>-*' --disableexcludes=kubernetes
And reload the kubelet the same way as you did on the control plane:
systemctl daemon-reload
systemctl restart kubelet
Something you should notice here is that you can still use the cluster during this time. This might also be a good time to upgrade the OS-level packages, if needed, and reboot the host.
Verify
If everything worked out, the kubectl get nodes
command should show the worker node on the new version. If that is the case, you can continue and uncordon the worker node.
kubectl uncordon <worker-node-name>
Kubernetes should start migrating workloads back to the worker node fairly quickly, especially Longhorn.
Also update the second worker node the same way.
The Kubernetes cluster is now on a newer and better version. This would be a good time to have a look at the workloads and logs, to make sure everything is working as intended.