Kubernetes Monitoring

Learn more about the kube-monitoring plugin. Use it to activate Kubernetes monitoring for your Greenhouse cluster.

The main terms used in this document are explained in core-concepts.

Overview

Observability is often required for operation and automation of service offerings. To get the insights provided by an application and the container runtime environment, you need telemetry data in the form of metrics or logs sent to backends such as Prometheus or OpenSearch. With the kube-monitoring Plugin, you will be able to cover the metrics part of the observability stack.

This Plugin includes a pre-configured package of components that makes getting started easy and efficient. At its core, it provides an automated and managed Prometheus installation using the prometheus-operator. This is complemented by Prometheus target configuration for the most common Kubernetes components that provide metrics by default. In addition, Prometheus alerting rules and Plutono dashboards curated by cloud operators are included to provide a comprehensive monitoring solution out of the box.

Components included in this Plugin:

Disclaimer

This Plugin is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.

The Plugin is a heavily customized configuration of the kube-prometheus-stack Helm chart, which helps to keep track of versions and community updates.

It is intended as a platform that can be extended by following the guide.

Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.

Quick start

This guide provides a quick and straightforward way to use kube-monitoring as a Greenhouse Plugin on your Kubernetes cluster.

Prerequisites

  • A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.

Step 1:

You can install the kube-monitoring package in your cluster either manually with Helm or by letting the Greenhouse platform manage its lifecycle for you automatically. For the latter, you can either:

  1. Go to the Greenhouse dashboard and select the Kubernetes Monitoring plugin from the catalog. Specify the cluster and the required option values.
  2. Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.

Step 2:

After installation, Greenhouse will provide a generated link to the Prometheus user interface. This is done via the label greenhouse.sap/expose: "true" on the Prometheus Service resource.
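The examples later in this document set this marker through the kubeMonitoring.prometheus.service.labels option. A minimal sketch of that fragment, shown in isolation and assuming it is placed under spec of the Plugin resource:

```yaml
# Fragment of a Plugin resource; it belongs under spec.
optionValues:
  # Marks the Prometheus Service so that Greenhouse generates a link to the UI.
  - name: kubeMonitoring.prometheus.service.labels
    value:
      greenhouse.sap/expose: "true"
```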

Step 3:

Greenhouse regularly performs integration tests that are bundled with kube-monitoring. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.

Configuration

Global options

| Name | Description | Value |
|---|---|---|
| global.commonLabels | Labels to add to all resources. This can be used to add a support_group or service label to all resources and alerting rules. | true |
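As an illustration, the fragment below sketches how a support_group and a service label might be set. It assumes that the option accepts a map of label keys to values, as the description suggests; both label values are hypothetical placeholders.

```yaml
# Fragment of a Plugin resource; it belongs under spec.
optionValues:
  - name: global.commonLabels
    value:
      support_group: example-support-group  # hypothetical value
      service: example-service              # hypothetical value
```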

Prometheus-operator options

| Name | Description | Value |
|---|---|---|
| kubeMonitoring.prometheusOperator.enabled | Manages Prometheus and Alertmanager components | true |
| kubeMonitoring.prometheusOperator.alertmanagerInstanceNamespaces | Filter namespaces to look for prometheus-operator Alertmanager resources | [] |
| kubeMonitoring.prometheusOperator.alertmanagerConfigNamespaces | Filter namespaces to look for prometheus-operator AlertmanagerConfig resources | [] |
| kubeMonitoring.prometheusOperator.prometheusInstanceNamespaces | Filter namespaces to look for prometheus-operator Prometheus resources | [] |
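A possible way to restrict the operator to a single namespace is sketched below. It assumes that the filter options take a list of namespace names; the namespace name itself is a hypothetical placeholder.

```yaml
# Fragment of a Plugin resource; it belongs under spec.
optionValues:
  # Only look for Prometheus resources in the listed namespace (assumed semantics).
  - name: kubeMonitoring.prometheusOperator.prometheusInstanceNamespaces
    value:
      - example-monitoring-namespace  # hypothetical namespace
```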

Kubernetes component scraper options

| Name | Description | Value |
|---|---|---|
| kubeMonitoring.kubernetesServiceMonitors.enabled | Set to false to disable all the Kubernetes component scrapers | true |
| kubeMonitoring.kubeApiServer.enabled | Component scraping the kube API server | true |
| kubeMonitoring.kubelet.enabled | Component scraping the kubelet and kubelet-hosted cAdvisor | true |
| kubeMonitoring.coreDns.enabled | Component scraping coreDns. Use either this or kubeDns | true |
| kubeMonitoring.kubeEtcd.enabled | Component scraping etcd | true |
| kubeMonitoring.kubeStateMetrics.enabled | Component scraping kube state metrics | true |
| kubeMonitoring.nodeExporter.enabled | Deploy node exporter as a daemonset to all nodes | true |
| kubeMonitoring.kubeControllerManager.enabled | Component scraping the kube controller manager | false |
| kubeMonitoring.kubeScheduler.enabled | Component scraping kube scheduler | false |
| kubeMonitoring.kubeProxy.enabled | Component scraping kube proxy | false |
| kubeMonitoring.kubeDns.enabled | Component scraping kubeDns. Use either this or coreDns | false |
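The following fragment is a sketch of how the defaults above could be changed, for example on a cluster that runs kubeDns instead of coreDns and whose control plane metrics are reachable. Whether the controller manager and scheduler endpoints can actually be scraped depends on the cluster setup, which is an assumption here.

```yaml
# Fragment of a Plugin resource; it belongs under spec.
optionValues:
  # Enable scrapers that are disabled by default.
  - name: kubeMonitoring.kubeControllerManager.enabled
    value: true
  - name: kubeMonitoring.kubeScheduler.enabled
    value: true
  # Use kubeDns instead of coreDns; only one of the two should be enabled.
  - name: kubeMonitoring.coreDns.enabled
    value: false
  - name: kubeMonitoring.kubeDns.enabled
    value: true
```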

Prometheus options

| Name | Description | Value |
|---|---|---|
| kubeMonitoring.prometheus.enabled | Deploy a Prometheus instance | true |
| kubeMonitoring.prometheus.annotations | Annotations for Prometheus | {} |
| kubeMonitoring.prometheus.tlsConfig.caCert | CA certificate to verify technical clients at Prometheus Ingress | Secret |
| kubeMonitoring.prometheus.ingress.enabled | Deploy Prometheus Ingress | true |
| kubeMonitoring.prometheus.ingress.hosts | Must be provided if Ingress is enabled. | [] |
| kubeMonitoring.prometheus.ingress.ingressClassname | Specifies the ingress-controller | nginx |
| kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage | How large the persistent volume should be to house the prometheus database. Default 50Gi. | "" |
| kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName | The storage class to use for the persistent volume. | "" |
| kubeMonitoring.prometheus.prometheusSpec.scrapeInterval | Interval between consecutive scrapes. Defaults to 30s | "" |
| kubeMonitoring.prometheus.prometheusSpec.scrapeTimeout | Number of seconds to wait for target to respond before erroring | "" |
| kubeMonitoring.prometheus.prometheusSpec.evaluationInterval | Interval between consecutive evaluations | "" |
| kubeMonitoring.prometheus.prometheusSpec.externalLabels | External labels to add to any time series or alerts when communicating with external systems like Alertmanager | {} |
| kubeMonitoring.prometheus.prometheusSpec.ruleSelector | PrometheusRules to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
| kubeMonitoring.prometheus.prometheusSpec.serviceMonitorSelector | ServiceMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
| kubeMonitoring.prometheus.prometheusSpec.podMonitorSelector | PodMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
| kubeMonitoring.prometheus.prometheusSpec.probeSelector | Probes to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
| kubeMonitoring.prometheus.prometheusSpec.scrapeConfigSelector | scrapeConfigs to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
| kubeMonitoring.prometheus.prometheusSpec.retention | How long to retain metrics | "" |
| kubeMonitoring.prometheus.prometheusSpec.logLevel | Log level to be configured for Prometheus | "" |
| kubeMonitoring.prometheus.prometheusSpec.additionalScrapeConfigs | Next to ScrapeConfig CRD, you can use AdditionalScrapeConfigs, which allows specifying additional Prometheus scrape configurations | "" |
| kubeMonitoring.prometheus.prometheusSpec.additionalArgs | Allows setting additional arguments for the Prometheus container | [] |
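Because the Ingress is enabled by default and requires at least one host, a first Plugin will usually set kubeMonitoring.prometheus.ingress.hosts. The resource below is a minimal sketch under a few assumptions: the hosts option is taken to be a list of hostnames, the hostname is a hypothetical placeholder, and the interval and log level values are chosen only for illustration.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    # Required while the Prometheus Ingress is enabled (the default).
    - name: kubeMonitoring.prometheus.ingress.hosts
      value:
        - prometheus.example.com  # hypothetical hostname
    # Scrape and evaluate once per minute instead of the 30s scrape default.
    - name: kubeMonitoring.prometheus.prometheusSpec.scrapeInterval
      value: 60s
    - name: kubeMonitoring.prometheus.prometheusSpec.evaluationInterval
      value: 60s
    # Illustrative log level; Prometheus accepts debug, info, warn and error.
    - name: kubeMonitoring.prometheus.prometheusSpec.logLevel
      value: info
```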

Alertmanager options

| Name | Description | Value |
|---|---|---|
| alerts.enabled | To send alerts to Alertmanager | false |
| alerts.alertmanager.hosts | List of Alertmanager hosts Prometheus can send alerts to | [] |
| alerts.alertmanager.tlsConfig.cert | TLS certificate for communication with Alertmanager | Secret |
| alerts.alertmanager.tlsConfig.key | TLS key for communication with Alertmanager | Secret |

Examples

Deploy kube-monitoring into a remote cluster

apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth

Deploy Prometheus only

Example Plugin to deploy Prometheus with the kube-monitoring Plugin.

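NOTE: If you are using kube-monitoring for the first time in your cluster, you must set kubeMonitoring.prometheusOperator.enabled to true. The example below sets it to false, which assumes that a prometheus-operator is already running in the cluster.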

apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-prometheus-name
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.defaultRules.create
      value: false
    - name: kubeMonitoring.kubernetesServiceMonitors.enabled
      value: false
    - name: kubeMonitoring.prometheusOperator.enabled
      value: false
    - name: kubeMonitoring.kubeStateMetrics.enabled
      value: false
    - name: kubeMonitoring.nodeExporter.enabled
      value: false
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth

Extension of the plugin

kube-monitoring can be extended with your own Prometheus alerting rules and target configurations via the Custom Resource Definitions (CRDs) of the Prometheus operator. Which user-defined resources are picked up is determined by label selectors.

The CRD PrometheusRule enables the definition of alerting and recording rules that can be used by Prometheus or Thanos Rule instances. Alerts and recording rules are reconciled and dynamically loaded by the operator without having to restart Prometheus or Thanos Rule.

The kube-monitoring Prometheus will automatically discover and load the rules that carry the label plugin: <plugin-name>.

Example:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule
  labels:
    plugin: <metadata.name>
    ## e.g. plugin: kube-monitoring
spec:
  groups:
    - name: example-group
      rules:
      ...
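To make the shape of a rule entry concrete, here is a separate, self-contained sketch of a PrometheusRule with one alerting rule. The alert name, the job label value, the duration and the severity are hypothetical and not part of the Plugin; the label assumes a Plugin named kube-monitoring.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-target-down
  labels:
    # Must match the name of the Plugin so that Prometheus selects the rule.
    plugin: kube-monitoring
spec:
  groups:
    - name: example-availability
      rules:
        # Fires when a scrape target of the hypothetical job has been down for 10 minutes.
        - alert: ExampleTargetDown
          expr: up{job="example-app"} == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Target {{ $labels.instance }} of job {{ $labels.job }} is down"
```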

The CRDs PodMonitor, ServiceMonitor, Probe and ScrapeConfig allow the definition of a set of target endpoints to be scraped by Prometheus. The operator will automatically discover and load the configurations that carry the label plugin: <plugin-name>.

Example:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor
  labels:
    plugin: <metadata.name>
    ## e.g. plugin: kube-monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  podMetricsEndpoints:
    - port: http
  ...
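A ServiceMonitor follows the same pattern but selects Services instead of Pods. The sketch below assumes a Plugin named kube-monitoring and a Service labeled app: example-app that exposes a port named metrics; all of these names, as well as the path and interval, are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-service-monitor
  labels:
    # Must match the name of the Plugin so that the operator selects it.
    plugin: kube-monitoring
spec:
  # Services to scrape, selected by label.
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  endpoints:
    # Name of the Service port that serves the metrics.
    - port: metrics
      path: /metrics
      interval: 30s
```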