Kubernetes Monitoring

Kubernetes monitoring is crucial for maintaining the health of your cluster and its services. Kubernetes is an open source platform for managing containerized applications. Out of the box, it checks application containers with liveness, readiness, and startup probes that the kubelet runs on each node.

Monitoring Kubernetes itself as well as its underlying components (such as services, nodes, and pods) requires using additional tools.

In this article, we list the top 10 open source Kubernetes monitoring tools to monitor your Kubernetes cluster and its subcomponents effectively.

List of Open Source Kubernetes Monitoring Tools

Prometheus

Prometheus is an open source systems monitoring and alerting toolkit and one of the most widely used monitoring tools for Kubernetes.

It is highly scalable, and its architecture is pluggable. You can write custom exporters and integrations to monitor metrics from any application or component.

To monitor Kubernetes clusters, you can write custom exporters or use existing metric sources such as kube-state-metrics, cAdvisor, and node_exporter. Several Kubernetes components, including the API server and the kubelet, also expose their own metrics endpoints in the Prometheus text format, so Prometheus can scrape them directly.
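As a rough illustration of what a custom exporter does, the sketch below serves a few samples in the Prometheus text exposition format using only the Python standard library. The metric names (myapp_queue_depth, myapp_up), the collect_samples helper, and the port are hypothetical; a real exporter would more likely use the official prometheus_client library and read values from the application.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render (name, labels, value) samples in the Prometheus text format."""
    lines = []
    for name, labels, value in samples:
        if labels:
            label_text = ",".join(
                f'{key}="{val}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect_samples():
    # Hypothetical values; a real exporter would query the application here.
    return [
        ("myapp_queue_depth", {"queue": "orders"}, 7),
        ("myapp_up", {}, 1),
    ]


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes the /metrics path by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect_samples()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; Prometheus would then be pointed at it with a scrape target for that host and port.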

Kubelet

The kubelet is the primary node agent in Kubernetes. It runs on each node and is responsible for starting the containers of the pods scheduled to that node and reporting their status. Because it sits between the control plane and the node, it is an excellent starting point for monitoring.

The kubelet can be monitored through its native metrics endpoint. The metrics are collected from components such as the container runtime, cgroups, and pod stats. They are published over HTTP and can be retrieved by sending a GET request to the /metrics path.

Some of the key metrics to monitor are:

Pod CPU, memory, and network usage – to ensure that the containers are not using too much memory, CPU, or network bandwidth.
Container running and termination events – to ensure that containers are running as expected.
Events related to the lifecycle of pods – to ensure that pods are running as expected.
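The response from a metrics endpoint is plain text, one sample per line. The sketch below parses that text into (name, labels, value) tuples so that individual metrics can be inspected. The sample text is illustrative only and was not captured from a real kubelet, and the parser is deliberately simplified: it ignores timestamps and does not unescape label values.

```python
import re

# metric_name{label="value",...} 12.5
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text format into a list of (name, labels, value)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_PAIR.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative input, assumed to resemble kubelet output.
EXAMPLE = """\
# TYPE kubelet_running_pods gauge
kubelet_running_pods 12
kubelet_running_containers{container_state="running"} 30
kubelet_running_containers{container_state="exited"} 4
"""

for name, labels, value in parse_metrics(EXAMPLE):
    print(name, labels, value)
```

On a real cluster the kubelet endpoint normally requires authentication, so the GET request would need a bearer token or client certificate; that part is left out here.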

Grafana

Grafana is an open source, feature-rich visualization and dashboarding tool for monitoring systems. It is designed for time series data and integrates closely with Prometheus to visualize metrics and create alerts. Grafana provides a user-friendly, web-based interface to view your graphs, charts, and statistics.

To monitor Kubernetes clusters, you connect Grafana to a data source such as Prometheus that already collects the cluster-level metrics. Grafana queries that data source rather than scraping Kubernetes itself. Prebuilt community dashboards for Kubernetes can be used as is or customized to fit your requirements.

Grafana works equally well with other monitoring tools such as Zabbix and Graphite. It allows you to create custom graphs and customize the dashboards to monitor Kubernetes components such as pods, services, and nodes.

Grafana can be used to visualize the following Kubernetes cluster-level metrics:

Pods – The average, maximum, and minimum pod-level metrics such as CPU, memory, and disk usage can be visualized.
Service – Visualize the number of requests, average service response time, and average error rate.
Node – This includes metrics such as the CPU usage and memory consumption.
Kubernetes – This includes metrics such as the number of pods created and deleted, the number of replicas, and the number of nodes.
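When Prometheus is the data source, a dashboard panel is driven by a PromQL query. The sketch below shows the same kind of query sent to the Prometheus HTTP API and the response reduced to a per-pod value, which is roughly the data a pod CPU panel would plot. The server address and the sample response are made up for illustration.

```python
import json
from urllib.parse import urlencode

# Per-pod CPU usage in cores, averaged over five minutes (cAdvisor metric).
POD_CPU_QUERY = "sum by (pod) (rate(container_cpu_usage_seconds_total[5m]))"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def value_by_pod(response_text):
    """Map pod name -> value from an instant-query JSON response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    return {
        item["metric"].get("pod", ""): float(item["value"][1])
        for item in payload["data"]["result"]
    }


# Hypothetical response body in the shape the API returns for a vector.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "web-1"}, "value": [1700000000, "0.25"]},
            {"metric": {"pod": "web-2"}, "value": [1700000000, "0.5"]},
        ],
    },
})

print(build_query_url("http://prometheus.example:9090", POD_CPU_QUERY))
print(value_by_pod(SAMPLE_RESPONSE))
```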

Container Advisor (cAdvisor)

cAdvisor (Container Advisor) is an open source container resource monitoring tool from Google, and it is also built into the kubelet. It collects resource usage such as CPU, memory, network, and disk usage of the containers running on each node.

The cAdvisor metrics are published over HTTP and can be retrieved using a cURL request. You can use these metrics to get insights into the overall resource usage in the cluster.
You can also use the cAdvisor metrics to create alerts for when a particular resource crosses a threshold.
cAdvisor only collects and exposes the data, so the thresholds and the notifications (for example, email) are configured in the system that consumes the metrics, such as Prometheus with Alertmanager or Grafana.
You can also set alerts when the total number of containers crosses a particular threshold.

When cAdvisor runs as a standalone service, its web UI shows usage sections such as CPU, memory, network, and filesystem for each container.
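CPU usage is reported as a cumulative counter of CPU seconds, so a threshold check first has to turn two readings into a rate. The sketch below shows that arithmetic and a simple threshold comparison. The container names, readings, and the threshold are hypothetical; in practice this logic usually lives in a Prometheus alerting rule rather than in custom code.

```python
def cpu_cores_used(earlier, later):
    """Average CPU cores used between two (unix_seconds, cpu_seconds) readings."""
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("readings must be in increasing time order")
    if v1 < v0:
        # The counter went backwards, which means it was reset (for example,
        # the container restarted). Count only what accumulated since then.
        v0 = 0.0
    return (v1 - v0) / (t1 - t0)


def over_threshold(usage_by_container, threshold_cores):
    """Return the sorted names of containers above the threshold."""
    return sorted(
        name
        for name, cores in usage_by_container.items()
        if cores > threshold_cores
    )


# Hypothetical readings taken 60 seconds apart.
usage = {
    "api": cpu_cores_used((100.0, 50.0), (160.0, 80.0)),
    "worker": cpu_cores_used((100.0, 10.0), (160.0, 130.0)),
}
print(over_threshold(usage, threshold_cores=1.0))
```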

Kube-state-metrics

Kube-state-metrics is a simple Go service that listens to the Kubernetes API server and generates metrics about the state of objects such as deployments, nodes, pods, and namespaces. It does not measure resource consumption such as CPU, memory, or disk usage; those metrics come from the kubelet and cAdvisor.

Kube-state-metrics can be used to monitor the following Kubernetes object-level metrics:

Pods – The phase of each pod (for example, Pending, Running, or Failed), container restart counts, and the resource requests and limits declared for the containers.
Services – The services defined in the cluster and their types.
Namespaces – The number of pods and other objects in each namespace, derived from the namespace label on the metrics.
Kubernetes – This includes metrics such as the desired and available number of replicas for each deployment and the number of nodes and their conditions.
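As one illustration of consuming kube-state-metrics output, the sketch below counts pods per phase from kube_pod_status_phase samples, where the sample for a pod's current phase has the value 1 and the others have 0. The samples are written out by hand in (name, labels, value) form and the pod names are hypothetical.

```python
from collections import Counter


def pods_by_phase(samples):
    """Count pods per phase from kube_pod_status_phase samples."""
    counts = Counter()
    for name, labels, value in samples:
        # Only the sample matching the pod's current phase is set to 1.
        if name == "kube_pod_status_phase" and value == 1:
            counts[labels["phase"]] += 1
    return dict(counts)


# Hypothetical samples for three pods.
SAMPLES = [
    ("kube_pod_status_phase", {"namespace": "shop", "pod": "web-1", "phase": "Running"}, 1),
    ("kube_pod_status_phase", {"namespace": "shop", "pod": "web-1", "phase": "Pending"}, 0),
    ("kube_pod_status_phase", {"namespace": "shop", "pod": "web-2", "phase": "Running"}, 1),
    ("kube_pod_status_phase", {"namespace": "shop", "pod": "job-1", "phase": "Failed"}, 1),
    ("kube_node_info", {"node": "node-a"}, 1),
]

print(pods_by_phase(SAMPLES))
```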

Kubewatch

Kubewatch is an open source Kubernetes watcher that tracks changes to resources such as pods, nodes, and other components in the cluster and publishes notifications about them.

It gives you a running view of what is changing in the cluster. Kubewatch can send a notification to a channel such as Slack or a webhook when a watched resource is created, updated, or deleted.

To monitor Kubernetes, you configure which resource types Kubewatch watches through the Kubernetes API server, such as pods, services, and nodes.
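The general idea of turning cluster changes into notifications can be sketched as follows. This is not Kubewatch's own code: it only assumes the shape of Kubernetes watch events (a type such as ADDED, MODIFIED, or DELETED plus the affected object), and the function names and sample events are hypothetical.

```python
def describe_event(event):
    """Format one watch event as a short notification line."""
    obj = event["object"]
    meta = obj["metadata"]
    namespace = meta.get("namespace", "")
    # Cluster-scoped objects such as nodes have no namespace.
    location = f"{namespace}/{meta['name']}" if namespace else meta["name"]
    return f"{event['type']} {obj['kind']} {location}"


def notifications(events, watched_kinds):
    """Keep only events for the watched kinds and format them."""
    return [
        describe_event(event)
        for event in events
        if event["object"]["kind"] in watched_kinds
    ]


# Hypothetical events in the shape of Kubernetes watch events.
EVENTS = [
    {"type": "ADDED", "object": {"kind": "Pod", "metadata": {"name": "web-1", "namespace": "shop"}}},
    {"type": "MODIFIED", "object": {"kind": "ConfigMap", "metadata": {"name": "settings", "namespace": "shop"}}},
    {"type": "DELETED", "object": {"kind": "Node", "metadata": {"name": "node-a"}}},
]

for line in notifications(EVENTS, watched_kinds={"Pod", "Node"}):
    print(line)
```

Each resulting line could then be posted to a chat channel or webhook of your choice.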

The ELK Stack

The ELK stack (Elasticsearch, Logstash, and Kibana) is a popular open source log-analytics platform that can also be used for Kubernetes monitoring. You can use the ELK stack to collect logs and metrics from multiple Kubernetes components such as kubelet, api-server, and kube-controller-manager.

The collected logs and metrics are typically shipped through Logstash, optionally buffered in a Kafka server, indexed in Elasticsearch, and visualized in Kibana. You can use this centralized store to collect logs from different clusters and applications, and use them for analysis.

The ELK stack can be used to visualize the following Kubernetes cluster-level metrics:

Pods – Visualize the number of pods created and deleted as well as the average CPU and memory usage.
Services – Visualize the number of requests, average service response time, and average error rate.
Kubernetes – This includes metrics such as the number of pods created and deleted, the number of replicas, and the number of nodes.
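Before logs can be searched and charted, each raw line has to become a structured document. The sketch below parses a container log line in the CRI format (timestamp, stream, tag, message) and builds a newline-delimited body in the shape the Elasticsearch bulk API accepts. The index name, pod name, and log lines are hypothetical, and in a real pipeline a shipper such as Logstash would usually do this work.

```python
import json


def parse_cri_line(line, pod, container):
    """Turn one CRI-format log line into a document for indexing."""
    parts = line.rstrip("\n").split(" ", 3)
    timestamp, stream, tag = parts[0], parts[1], parts[2]
    message = parts[3] if len(parts) > 3 else ""
    return {
        "@timestamp": timestamp,
        "stream": stream,
        # The tag "P" marks a partial line; "F" marks a full line.
        "partial": tag == "P",
        "message": message,
        "kubernetes": {"pod": pod, "container": container},
    }


def bulk_body(documents, index):
    """Build an NDJSON body: one action line followed by one document line."""
    lines = []
    for document in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(document))
    return "\n".join(lines) + "\n"


# Hypothetical log lines from one container.
RAW = [
    "2024-05-01T12:00:00.123456789Z stdout F server started",
    "2024-05-01T12:00:05.000000000Z stderr F connection refused",
]
docs = [parse_cri_line(line, pod="web-1", container="app") for line in RAW]
print(bulk_body(docs, index="k8s-logs"))
```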

Jaeger

Jaeger is an open source distributed tracing system that can be used to monitor and troubleshoot applications running in Kubernetes.

It follows a request end to end across services and records the time spent in each step as a span.
You can use Jaeger to visualize the application flow and understand where the latency is happening.
Alerts on latency or error rates are usually defined in a metrics system such as Prometheus rather than in Jaeger itself.
To use Jaeger with Kubernetes, you instrument your applications (for example, with OpenTelemetry) so that they send trace data to a Jaeger collector running in the cluster.
Metrics from components such as kubelet and kube-controller-manager are outside Jaeger's scope and are better collected with a metrics tool.
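Finding where latency comes from usually means looking at how much time each step spent on its own work rather than waiting on the steps it called. The sketch below computes that "self time" for a small trace. The span records use a simplified, hypothetical shape rather than Jaeger's actual data model, and the calculation assumes that child spans of the same parent do not overlap in time.

```python
def self_times(spans):
    """Map span_id -> duration minus the total duration of its direct children."""
    child_total = {}
    for span in spans:
        parent = span.get("parent_id")
        if parent is not None:
            child_total[parent] = child_total.get(parent, 0) + span["duration_us"]
    return {
        span["span_id"]: span["duration_us"] - child_total.get(span["span_id"], 0)
        for span in spans
    }


def slowest_operation(spans):
    """Return the operation name of the span with the largest self time."""
    times = self_times(spans)
    operations = {span["span_id"]: span["operation"] for span in spans}
    slowest_id = max(times, key=times.get)
    return operations[slowest_id]


# Hypothetical trace: one request that calls a cache and a database.
TRACE = [
    {"span_id": "a", "parent_id": None, "operation": "GET /orders", "duration_us": 1000},
    {"span_id": "b", "parent_id": "a", "operation": "cache-lookup", "duration_us": 100},
    {"span_id": "c", "parent_id": "a", "operation": "db-query", "duration_us": 700},
]

print(self_times(TRACE))
print(slowest_operation(TRACE))
```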

Kubernetes Dashboard

Kubernetes Dashboard is a web-based UI add-on for Kubernetes clusters. It offers a variety of features for managing workloads, service discovery and load balancing, configuration, storage, and cluster monitoring. It is ideal for small clusters and for people who are just getting started with Kubernetes.

Kubernetes Dashboard provides several different views for monitoring CPU and memory utilization across all nodes. The health status of workloads (pods, deployments, replica sets, cron jobs, etc.) can also be monitored. You can utilize ready-to-go YAML files to install Kubernetes Dashboard.

Weave Scope

Weave Scope is a zero-config tool that generates a map of processes, containers, and hosts in a Kubernetes cluster. This helps you understand your Docker containers in real time. Weave Scope provides a clean user interface and lets you manage containers and run diagnostic commands on them from within this interface.

It’s a good tool to analyse the application, the infrastructure, and the connections among the cluster nodes.

