Kubernetes Monitoring

Learn more about the kube-monitoring plugin. Use it to activate Kubernetes monitoring for your Greenhouse cluster.

The main terms used in this document are defined in core-concepts.

Overview

Observability is often required for operation and automation of service offerings. To get the insights provided by an application and the container runtime environment, you need telemetry data in the form of metrics or logs sent to backends such as Prometheus or OpenSearch. With the kube-monitoring Plugin, you will be able to cover the metrics part of the observability stack.

This Plugin includes a pre-configured package of components that makes getting started easy and efficient. At its core, an automated and managed Prometheus installation is provided using the prometheus-operator. This is complemented by Prometheus target configuration for the most common Kubernetes components that provide metrics by default. In addition, Prometheus alerting rules and Plutono dashboards curated by cloud operators are included to provide a comprehensive monitoring solution out of the box.

Components included in this Plugin:

Disclaimer

This Plugin is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the plugin according to your needs.

The Plugin is a deeply configured kube-prometheus-stack Helm chart which helps to keep track of versions and community updates.

It is intended as a platform that can be extended by following the guide.

Contributions are highly appreciated. If you discover bugs or want to add functionality to the plugin, pull requests are always welcome.

Quick start

This guide provides a quick and straightforward way to use kube-monitoring as a Greenhouse Plugin on your Kubernetes cluster.

Prerequisites

  • A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.

Step 1:

You can install the kube-monitoring package in your cluster manually with Helm, or let the Greenhouse platform manage its lifecycle for you automatically. For the latter, you can either:

  1. Go to the Greenhouse dashboard and select the Kubernetes Monitoring plugin from the catalog. Specify the cluster and the required option values.
  2. Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.

Step 2:

After installation, Greenhouse will provide a generated link to the Prometheus user interface. This is done via the label greenhouse.sap/expose: "true" on the Prometheus Service resource, which the examples below set through kubeMonitoring.prometheus.service.labels.

Step 3:

Greenhouse regularly performs integration tests that are bundled with kube-monitoring. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.

Values

Alertmanager options

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| alerts.alertmanagers.hosts | list | [] | List of Alertmanager hosts to send alerts to |
| alerts.alertmanagers.tlsConfig.cert | string | "" | TLS certificate for communication with Alertmanager |
| alerts.alertmanagers.tlsConfig.key | string | "" | TLS key for communication with Alertmanager |
| alerts.enabled | bool | false | Whether to send alerts to Alertmanager |

Global options

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| global.commonLabels | object | {} | Labels to apply to all resources. This can be used to add a support_group or service label to all resources and alerting rules. |
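
As an illustration of the global.commonLabels option, the following fragment of a Plugin spec is a minimal sketch that follows the optionValues format used in the Examples section. The label values example-team and example-service are hypothetical placeholders, not defaults of the plugin.

```yaml
# Sketch only: fragment of a Plugin spec, not a complete resource.
optionValues:
  - name: global.commonLabels
    value:
      # Hypothetical values; replace with your own support group and service
      support_group: example-team
      service: example-service
```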

Kubernetes component scraper options

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| kubeMonitoring.coreDns.enabled | bool | true | Component scraping coreDns. Use either this or kubeDns |
| kubeMonitoring.kubeApiServer.enabled | bool | true | Component scraping the kube API server |
| kubeMonitoring.kubeControllerManager.enabled | bool | false | Component scraping the kube controller manager |
| kubeMonitoring.kubeDns.enabled | bool | false | Component scraping kubeDns. Use either this or coreDns |
| kubeMonitoring.kubeEtcd.enabled | bool | true | Component scraping etcd |
| kubeMonitoring.kubeProxy.enabled | bool | false | Component scraping kube proxy |
| kubeMonitoring.kubeScheduler.enabled | bool | false | Component scraping kube scheduler |
| kubeMonitoring.kubeStateMetrics.enabled | bool | true | Component scraping kube state metrics |
| kubeMonitoring.kubelet.enabled | bool | true | Component scraping the kubelet and kubelet-hosted cAdvisor |
| kubeMonitoring.kubernetesServiceMonitors.enabled | bool | true | Flag to disable all the Kubernetes component scrapers |
| kubeMonitoring.nodeExporter.enabled | bool | true | Deploy node exporter as a daemonset to all nodes |
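
Because coreDns and kubeDns are mutually exclusive, a cluster that runs kube-dns would need to flip both switches. The fragment below is a sketch of how that could look in a Plugin spec, assuming the same optionValues format as in the Examples section; whether your cluster actually runs kube-dns is an assumption you need to verify.

```yaml
# Sketch only: fragment of a Plugin spec for a cluster assumed to run kube-dns.
optionValues:
  # Turn off the default coreDns scraper ...
  - name: kubeMonitoring.coreDns.enabled
    value: false
  # ... and scrape kubeDns instead
  - name: kubeMonitoring.kubeDns.enabled
    value: true
```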

Prometheus options

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| kubeMonitoring.prometheus.annotations | object | {} | Annotations for Prometheus |
| kubeMonitoring.prometheus.enabled | bool | true | Deploy a Prometheus instance |
| kubeMonitoring.prometheus.ingress.enabled | bool | false | Deploy Prometheus Ingress |
| kubeMonitoring.prometheus.ingress.hosts | list | [] | Must be provided if Ingress is enabled |
| kubeMonitoring.prometheus.ingress.ingressClassname | string | "nginx" | Specifies the ingress-controller |
| kubeMonitoring.prometheus.prometheusSpec.additionalArgs | list | [] | Allows setting additional arguments for the Prometheus container |
| kubeMonitoring.prometheus.prometheusSpec.additionalScrapeConfigs | string | "" | In addition to the ScrapeConfig CRD, you can use additionalScrapeConfigs to specify additional Prometheus scrape configurations |
| kubeMonitoring.prometheus.prometheusSpec.evaluationInterval | string | "" | Interval between consecutive evaluations |
| kubeMonitoring.prometheus.prometheusSpec.externalLabels | object | {} | External labels to add to any time series or alerts when communicating with external systems like Alertmanager |
| kubeMonitoring.prometheus.prometheusSpec.logLevel | string | "" | Log level to be configured for Prometheus |
| kubeMonitoring.prometheus.prometheusSpec.podMonitorSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | PodMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } |
| kubeMonitoring.prometheus.prometheusSpec.probeSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | Probes to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } |
| kubeMonitoring.prometheus.prometheusSpec.retention | string | "" | How long to retain metrics |
| kubeMonitoring.prometheus.prometheusSpec.ruleSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | PrometheusRules to be selected for target discovery. If {}, select all PrometheusRules. Defaults to { matchLabels: { plugin: <metadata.name> } } |
| kubeMonitoring.prometheus.prometheusSpec.scrapeConfigSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | ScrapeConfigs to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } |
| kubeMonitoring.prometheus.prometheusSpec.scrapeInterval | string | "" | Interval between consecutive scrapes. Defaults to 30s |
| kubeMonitoring.prometheus.prometheusSpec.scrapeTimeout | string | "" | Number of seconds to wait for target to respond before erroring |
| kubeMonitoring.prometheus.prometheusSpec.serviceMonitorSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | ServiceMonitors to be selected for target discovery. If {}, select all ServiceMonitors. Defaults to { matchLabels: { plugin: <metadata.name> } } |
| kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources | object | {"requests":{"storage":"50Gi"}} | How large the persistent volume should be to house the Prometheus database. Default 50Gi. |
| kubeMonitoring.prometheus.tlsConfig.caCert | string | "Secret" | CA certificate to verify technical clients at Prometheus Ingress |
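
To show how several of these options combine, here is a hedged sketch of a Plugin spec fragment that enables the Prometheus Ingress and adjusts the scrape timing. The host prometheus.example.com and the chosen durations are illustrative assumptions, not recommendations.

```yaml
# Sketch only: fragment of a Plugin spec, values are illustrative.
optionValues:
  - name: kubeMonitoring.prometheus.ingress.enabled
    value: true
  # hosts must be provided once the Ingress is enabled
  - name: kubeMonitoring.prometheus.ingress.hosts
    value:
      - prometheus.example.com   # hypothetical hostname
  - name: kubeMonitoring.prometheus.prometheusSpec.scrapeInterval
    value: 60s
  # Keep the timeout below the scrape interval
  - name: kubeMonitoring.prometheus.prometheusSpec.scrapeTimeout
    value: 30s
```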

Prometheus-operator options

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| kubeMonitoring.prometheusOperator.alertmanagerConfigNamespaces | list | [] | Filter namespaces to look for prometheus-operator AlertmanagerConfig resources |
| kubeMonitoring.prometheusOperator.alertmanagerInstanceNamespaces | list | [] | Filter namespaces to look for prometheus-operator Alertmanager resources |
| kubeMonitoring.prometheusOperator.enabled | bool | true | Manages Prometheus and Alertmanager components |
| kubeMonitoring.prometheusOperator.prometheusInstanceNamespaces | list | [] | Filter namespaces to look for prometheus-operator Prometheus resources |

Service Discovery

The kube-monitoring Plugin provides a PodMonitor to automatically discover the Prometheus metrics of the Kubernetes Pods in any Namespace. The PodMonitor is configured to detect the metrics endpoint of the Pods if the following annotations are set:

metadata:
  annotations:
    greenhouse/scrape: "true"
    greenhouse/target: <kube-monitoring plugin name>

Note: The annotations need to be added manually for the Pod to be scraped, and the port name needs to match the one the PodMonitor is configured for.
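
For illustration, a Pod carrying these annotations could look like the sketch below. Everything except the two annotation keys is hypothetical: the Pod name, namespace, image, the Plugin name kube-monitoring, and in particular the port name metrics, which stands in for whatever port name the Plugin's PodMonitor expects.

```yaml
# Sketch only: hypothetical Pod annotated for discovery by kube-monitoring.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  namespace: example-namespace
  annotations:
    greenhouse/scrape: "true"
    # Assumes the Plugin resource is named kube-monitoring
    greenhouse/target: kube-monitoring
spec:
  containers:
    - name: example-app
      image: example.com/example-app:1.0.0   # placeholder image
      ports:
        # Assumption: replace "metrics" with the port name the PodMonitor matches
        - name: metrics
          containerPort: 8080
```

For workloads managed by a Deployment or StatefulSet, the annotations belong in the Pod template (spec.template.metadata.annotations) so that every replica carries them.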

Examples

Deploy kube-monitoring into a remote cluster

apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth

Deploy Prometheus only

Example Plugin to deploy Prometheus with the kube-monitoring Plugin.

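NOTE: If you are using kube-monitoring for the first time in your cluster, it is necessary to set kubeMonitoring.prometheusOperator.enabled to true. The example below sets it to false, which assumes that a prometheus-operator is already running in the cluster.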

apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-prometheus-name
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.defaultRules.create
      value: false
    - name: kubeMonitoring.kubernetesServiceMonitors.enabled
      value: false
    - name: kubeMonitoring.prometheusOperator.enabled
      value: false
    - name: kubeMonitoring.kubeStateMetrics.enabled
      value: false
    - name: kubeMonitoring.nodeExporter.enabled
      value: false
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth

Extension of the plugin

kube-monitoring can be extended with your own Prometheus alerting rules and target configurations via the Custom Resource Definitions (CRDs) of the Prometheus operator. User-defined resources are picked up through label selectors.

The CRD PrometheusRule enables the definition of alerting and recording rules that can be used by Prometheus or Thanos Rule instances. Alerts and recording rules are reconciled and dynamically loaded by the operator without having to restart Prometheus or Thanos Rule.

The kube-monitoring Prometheus will automatically discover and load the rules that carry the label plugin: <plugin-name>.

Example:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule
  labels:
    plugin: <metadata.name>
    ## e.g. plugin: kube-monitoring
spec:
  groups:
    - name: example-group
      rules:
      ...
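
To make the elided rules section more concrete, the following is a minimal sketch of a complete PrometheusRule with a single alerting rule. The alert name, the job label example-app, the severity value, and the Plugin name kube-monitoring are all hypothetical; only the resource structure follows the prometheus-operator CRD.

```yaml
# Sketch only: hypothetical alerting rule picked up via the plugin label.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-alerts
  labels:
    # Assumes the Plugin resource is named kube-monitoring
    plugin: kube-monitoring
spec:
  groups:
    - name: example-app.alerts
      rules:
        - alert: ExampleAppTargetDown
          # Fires when Prometheus cannot scrape the (hypothetical) example-app job
          expr: up{job="example-app"} == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Example app target is down
            description: "Prometheus could not scrape {{ $labels.instance }} for 10 minutes."
```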

The CRDs PodMonitor, ServiceMonitor, Probe and ScrapeConfig allow the definition of a set of target endpoints to be scraped by Prometheus. The operator will automatically discover and load the configurations that carry the label plugin: <plugin-name>.

Example:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor
  labels:
    plugin: <metadata.name>
    ## e.g. plugin: kube-monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  podMetricsEndpoints:
    - port: http
  ...
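
The same labelling approach applies to the other target CRDs. As a hedged sketch, a ServiceMonitor for the same hypothetical application might look as follows; the names, the namespace, the port name http, and the Plugin name kube-monitoring are assumptions for illustration.

```yaml
# Sketch only: hypothetical ServiceMonitor picked up via the plugin label.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-service-monitor
  labels:
    # Assumes the Plugin resource is named kube-monitoring
    plugin: kube-monitoring
spec:
  # Selects Services (not Pods) labelled app: example-app
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  endpoints:
    # "http" must be the name of a port on the selected Service
    - port: http
      path: /metrics
      interval: 30s
```

A ServiceMonitor is usually the better fit when the application is already exposed through a Service, while a PodMonitor targets Pods directly.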