Greenhouse documentation
- 1: Getting started
- 1.1: Core Concepts
- 1.1.1: Organizations
- 1.1.2: Teams
- 1.1.3: Clusters
- 1.1.4: PluginDefinitions, Plugins and PluginPresets
- 1.2: Overview
- 1.3: Quick start guide
- 1.4: Installation
- 2: User guides
- 2.1: Organization management
- 2.1.1: SAP ID Service
- 2.1.2: Creating an organization
- 2.2: Cluster management
- 2.2.1: Cluster onboarding
- 2.2.2: Cluster offboarding
- 2.3: Plugin management
- 2.3.1: Local Plugin Development
- 2.3.2: Testing a Plugin
- 2.3.3: Plugin deployment
- 2.3.4: Managing Plugins for multiple clusters
- 2.3.5: Plugin Catalog
- 2.4: Team management
- 2.4.1: Role-based access control
- 2.4.2: Team creation
- 3: Architecture
- 4: Reference
- 4.1: API
- 4.2: Plugin Catalog
- 4.2.1: Alerts
- 4.2.2: Cert-manager
- 4.2.3: Decentralized Observer of Policies (Violations)
- 4.2.4: Designate Ingress CNAME operator (DISCO)
- 4.2.5: DigiCert issuer
- 4.2.6: External DNS
- 4.2.7: Github Guard
- 4.2.8: Ingress NGINX
- 4.2.9: Kubernetes Monitoring
- 4.2.10: Logshipper
- 4.2.11: OpenTelemetry
- 4.2.12: Perses
- 4.2.13: Plutono
- 4.2.14: Service exposure test
- 4.2.15: Teams2Slack
- 4.2.16: Thanos
- 5: Contribute
1 - Getting started
1.1 - Core Concepts
Feature | Description | API | UI | Comments |
---|---|---|---|---|
Organizations | Organizations are the top-level entities in Greenhouse. | 🟢 | 🟢 | |
Teams | Teams are used to manage access and ownership of resources in Greenhouse. | 🟢 | 🟡 | Read-only access to Teams via the UI |
Clusters | Clusters represent a Kubernetes cluster that is managed by Greenhouse. | 🟡 | 🟡 | Limited modification of Clusters via UI, CLI for KubeConfig registry planned. |
Plugin Definitions & Plugins | Plugins are software components that extend and integrate with Greenhouse. | 🟡 | 🟡 | Read-only access via UI, a native Plugin Catalog is planned. |
1.1.1 - Organizations
What are Organizations?
Organizations are the top-level entities in Greenhouse. Each Organization gets a dedicated Namespace that contains all resources bound to the Organization. Greenhouse expects an Organization to provide its own Identity Provider and currently supports OIDC Identity Providers. Greenhouse also supports SCIM for syncing users and groups from an Identity Provider.
See creating an Organization for more details.
Organization Namespace and Permissions
The Organization’s Namespace in the Greenhouse cluster contains all resources bound to the Organization. This Namespace is automatically provisioned when a new Organization is created and shares the Organization’s name. Once the Namespace is created, Greenhouse will automatically seed RBAC Roles and ClusterRoles for the Organization. These are used to grant permissions for the Organization’s resources to Teams.
- The Administrators of an Organization are specified via an identity provider (IDP) group during the creation of the Organization.
- The Administrators for Plugins and Clusters need to be defined by the Organization Admins via RoleBindings for the seeded Roles role:<org-name>:cluster-admin and role:<org-name>:plugin-admin.
- All authenticated users are considered members of the Organization and are granted the organization:<org-name> Role.
The following roles are seeded for each Organization:
Name | Description | ApiGroups | Resources | Verbs | Cluster scoped |
---|---|---|---|---|---|
role:<org-name>:admin | An admin of a Greenhouse Organization. This entails the permissions of role:<org-name>:cluster-admin and role:<org-name>:plugin-admin | greenhouse.sap/v1alpha1 | * | * | - |
role:<org-name>:admin | | v1 | secrets | * | - |
role:<org-name>:admin | | "" | pods, replicasets, deployments, statefulsets, daemonsets, cronjobs, jobs, configmaps | get, list, watch | - |
role:<org-name>:admin | | monitoring.coreos.com | alertmanagers, alertmanagerconfigs | get, list, watch | - |
role:<org-name>:cluster-admin | An admin of Greenhouse Clusters within an Organization | greenhouse.sap/v1alpha1 | clusters, teamrolebindings | * | - |
role:<org-name>:cluster-admin | | v1 | secrets | create, update, patch | - |
role:<org-name>:plugin-admin | An admin of Greenhouse Plugins within an Organization | greenhouse.sap/v1alpha1 | plugins, pluginpresets | * | - |
role:<org-name>:plugin-admin | | v1 | secrets | create, update, patch | - |
organization:<org-name> | A member of a Greenhouse Organization | greenhouse.sap/v1alpha1 | * | get, list, watch | - |
organization:<org-name> | A member of a Greenhouse Organization | greenhouse.sap/v1alpha1 | organizations, plugindefinitions | get, list, watch | x |
OIDC
Each Organization must specify the OIDC configuration for the Organization’s IDP. This configuration is used together with DEXIDP to authenticate users in the Organization.
SCIM
Each Organization can specify SCIM credentials which are used to synchronize users and groups from an Identity Provider. This makes it possible to view the members of a Team in the Greenhouse dashboard.
1.1.2 - Teams
What are Teams?
Teams are used to manage access to resources in Greenhouse and managed Kubernetes clusters. Each Team must be backed by a group in the identity provider (IdP) of the Organization. Teams are used to structure members of your Organization and assign fine-grained access and permission levels. The Greenhouse Dashboard shows the members of a Team.
Team RBAC
TeamRoles and TeamRoleBindings provide a mechanism to control the permissions of Teams to onboarded Clusters of an Organization.
Team role-based access control (RBAC) wraps the concept of Kubernetes RBAC in TeamRoles and TeamRoleBindings. TeamRoles are used to define a set of RBAC permissions. These permissions can be granted to Teams with TeamRoleBindings. A TeamRoleBinding refers to a Team, a TeamRole, Cluster(s) and optional Namespaces. Depending on the latter, Greenhouse will create the appropriate rbacv1 resources on the targeted cluster(s) in either Cluster or Namespace scope.
More information about how this can be configured is available in the Role-based access control user guide.
Example of a TeamRoleBinding for an observability-admin which grants the cluster-admin role on the observability cluster in the logs and metrics namespaces. The TeamRoleBinding contains a list of namespaces and a label selector to select the cluster(s) to target. If no Namespaces are provided, then Greenhouse will create a ClusterRoleBinding instead of a RoleBinding.
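The scoping rule described above can be illustrated with a minimal Python sketch. It is not Greenhouse's implementation: the `TeamRoleBindingSketch` class and its field names are hypothetical stand-ins, and only the rule itself (Namespaces given means RoleBindings, no Namespaces means a ClusterRoleBinding) is taken from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TeamRoleBindingSketch:
    """Simplified, hypothetical stand-in for a TeamRoleBinding."""
    name: str
    team_role: str
    clusters: list[str]
    namespaces: list[str] = field(default_factory=list)


def planned_rbac_objects(trb: TeamRoleBindingSketch) -> list[tuple[str, str, str | None]]:
    """Return (cluster, kind, namespace) for each RBAC object the rule implies."""
    planned = []
    for cluster in trb.clusters:
        if not trb.namespaces:
            # No Namespaces given: the binding is cluster-scoped.
            planned.append((cluster, "ClusterRoleBinding", None))
            continue
        for namespace in trb.namespaces:
            # Namespaces given: one namespaced binding per Namespace.
            planned.append((cluster, "RoleBinding", namespace))
    return planned


namespaced = TeamRoleBindingSketch(
    name="observability-admin",
    team_role="cluster-admin",
    clusters=["observability"],
    namespaces=["logs", "metrics"],
)
cluster_wide = TeamRoleBindingSketch(
    name="observability-admin",
    team_role="cluster-admin",
    clusters=["observability"],
)

print(planned_rbac_objects(namespaced))
print(planned_rbac_objects(cluster_wide))
```

With the values from the example, the first call yields two RoleBindings (one per namespace) and the second a single ClusterRoleBinding.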
1.1.3 - Clusters
What are Clusters?
In the context of Greenhouse a Cluster represents a Kubernetes cluster that is onboarded to Greenhouse. Onboarded in this context means that Greenhouse can handle the management of role-based access control (RBAC) and the provisioning of operating tools (e.g. logging, monitoring, ingress etc.). The Greenhouse dashboard provides an overview of all onboarded clusters. Throughout Greenhouse the reference to a Cluster is used to target it for configuration and deployments.
Cluster access
During the initial onboarding of a cluster, Greenhouse will create a dedicated ServiceAccount inside the onboarded cluster. This ServiceAccount’s token is rotated automatically by Greenhouse.
Cluster registry (coming soon)
Once a Cluster is onboarded to Greenhouse, a ClusterKubeConfig is generated for the Cluster based on the OIDC configuration of the Organization. This enables members of an Organization to access the fleet of onboarded Clusters via the common Identity Provider. Permissions on the respective Clusters can be managed via Greenhouse Team RBAC.
To make it convenient to use these ClusterKubeConfigs and to switch easily between multiple contexts locally, Greenhouse will provide a CLI.
1.1.4 - PluginDefinitions, Plugins and PluginPresets
What are PluginDefinitions and Plugins?
PluginDefinitions and Plugins are the Greenhouse way to extend the core functionality with domain-specific features. PluginDefinitions, as the name suggests, are the definition of a Plugin, whereas a Plugin is a concrete instance of a PluginDefinition that is deployed to a Cluster.
The PluginDefinitions are shared between all Organizations in Greenhouse. A PluginDefinition can include a frontend, that is displayed in the Greenhouse dashboard and/or a backend component. The frontend is expected to be a standalone microfrontend created with the Juno framework. The backend components of a PluginDefinition are packaged as a Helm Chart and provide sane and opinionated default values. This allows Greenhouse to package and distribute tools such as Prometheus with a sensible default configuration, as well as giving the user a list of configurable values.
A Plugin is used to deploy the Helm Chart referenced in a PluginDefinition to a Cluster. The Plugin can be considered an instance of a PluginDefinition: it specifies the PluginDefinition, the desired Cluster and additional values to set. Depending on the PluginDefinition, it may be necessary to specify required values (e.g. credentials, endpoints, etc.), but in general the PluginDefinition provides well-established default values.
[!NOTE] In this example the Plugin ‘openTelemetry-cluster-a’ is used to deploy the PluginDefinition ‘openTelemetry’ to the cluster ‘cluster-a’.
PluginPresets
PluginPresets are a mechanism to configure Plugins for multiple clusters at once. They are used to define a common configuration for a PluginDefinition that can be applied to multiple clusters, while allowing the configuration to be overridden for individual clusters.
[!NOTE] In this example the PluginPreset ‘example-obs’ references the PluginDefinition ‘example’ and contains a clusterSelector that matches the clusters ‘cluster-a’ and ‘cluster-b’. The PluginPreset creates two Plugins ‘example-obs-cluster-a’ and ‘example-obs-cluster-b’ for the respective clusters.
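To make the fan-out concrete, here is a small Python sketch of how a preset could expand into one Plugin per matching cluster, with per-cluster overrides applied on top of the common values. It only illustrates the behaviour described above and is not the controller's code: the cluster labels, the `logLevel` value, and the function names are hypothetical; only the preset, cluster and resulting Plugin names come from the note.

```python
from __future__ import annotations


def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """True when every key/value pair of the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def expand_preset(
    preset_name: str,
    selector: dict[str, str],
    common_values: dict,
    overrides: dict[str, dict],
    clusters: dict[str, dict[str, str]],
) -> dict[str, dict]:
    """Build one Plugin per matching cluster, named '<preset>-<cluster>'."""
    plugins = {}
    for cluster, labels in sorted(clusters.items()):
        if not matches(selector, labels):
            continue
        # Per-cluster overrides win over the common configuration.
        values = {**common_values, **overrides.get(cluster, {})}
        plugins[f"{preset_name}-{cluster}"] = {"cluster": cluster, "values": values}
    return plugins


# Hypothetical labels and values, chosen only for illustration.
clusters = {
    "cluster-a": {"env": "demo"},
    "cluster-b": {"env": "demo"},
    "cluster-c": {"env": "other"},
}
plugins = expand_preset(
    preset_name="example-obs",
    selector={"env": "demo"},
    common_values={"logLevel": "info"},
    overrides={"cluster-b": {"logLevel": "debug"}},
    clusters=clusters,
)

for name, plugin in plugins.items():
    print(name, plugin["values"])
```

In this sketch 'cluster-c' does not match the selector and therefore gets no Plugin, while 'cluster-b' keeps the common configuration except for the overridden value.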
1.2 - Overview
What is Greenhouse?
Greenhouse is a cloud operations platform designed to streamline and simplify the management of a large-scale, distributed infrastructure.
It offers a unified interface for organizations to manage various operational aspects efficiently and transparently and operate their cloud infrastructure in compliance with industry standards.
The platform addresses common challenges such as the fragmentation of tools, visibility of application-specific permission concepts and the management of organizational groups.
It also emphasizes the harmonization and standardization of authorization concepts to enhance security and scalability.
With its operator-friendly dashboard, features and extensive automation capabilities, Greenhouse empowers organizations to optimize their cloud operations, reduce manual efforts, and achieve greater operational efficiency.
Value Propositions
Roadmap
The Roadmap Kanban board provides an overview of ongoing and planned efforts.
Architecture & Design
The Greenhouse design and architecture document describes the various use-cases and user stories.
1.3 - Quick start guide
This section provides a step-by-step walkthrough for new users to navigate the initial stages of the Greenhouse platform.
Sign up
You’re an administrator and organization lead and want to register for Greenhouse?
We got you covered!
During phase 1 and 2 of the roadmap Greenhouse is only open to selected early adopters.
Please reach out to the Greenhouse team to register and create your organization via Slack or DL Greenhouse.
Prerequisites:
- CAM Profile: A CAM profile is required to configure the administrators of the organization. Please include the name of the profile in the message to the Greenhouse team when signing up.
- SAP ID service: The authentication for the users belonging to your organization is based on the OpenID Connect (OIDC) standard. For SAP, we recommend using a SAP ID service (IDS) tenant. Please include the parameters for your tenant in the message to the Greenhouse team when signing up. If you don’t have a SAP ID Service tenant yet, please refer to the SAP ID Service section for more information.
Already a member
You’re a member of an existing organization and want to manage your teams, clusters or plugins?
Please refer to the user guides to find out more.
1.4 - Installation
This section provides a step-by-step guide to install Greenhouse on a Gardener shoot cluster.
Prerequisites
Before you start the installation, make sure you have the following prerequisites:
- Helm & Kubernetes CLI
- OAuth2/OpenID provider (see Authentik)
- Gardener Shoot Cluster configured to use the OIDC provider
- nginx-ingress deployed in the cluster
Installation
To install Greenhouse on your Gardener shoot cluster, follow these steps:
Create a values file called values.yaml with the following content:
global:
  dnsDomain: tld.domain # Shoot.spec.dns.domain
  kubeAPISubDomain: myapi # api is already used by Gardener
  oidc:
    enabled: true
    issuer: <issuer-url>
    clientID: <client-ID>
    clientSecret: <top-secret>
organization:
  enabled: false # disable, because the greenhouse webhook is not running yet
teams:
  admin:
    mappedIdPGroup: greenhouse-admins
# gardener specifics
dashboard:
  ingress:
    annotations:
      dns.gardener.cloud/dnsnames: "*"
      dns.gardener.cloud/ttl: "600"
      dns.gardener.cloud/class: garden
      cert.gardener.cloud/purpose: managed
idproxy:
  enabled: false # disable because no organization is created yet
  ingress:
    annotations:
      dns.gardener.cloud/dnsnames: "*"
      dns.gardener.cloud/ttl: "600"
      dns.gardener.cloud/class: garden
      cert.gardener.cloud/purpose: managed
cors-proxy:
  ingress:
    annotations:
      dns.gardener.cloud/dnsnames: "*"
      dns.gardener.cloud/ttl: "600"
      dns.gardener.cloud/class: garden
      cert.gardener.cloud/purpose: managed
# disable Plugins for the greenhouse organization, PluginDefinitions are missing
plugins:
  enabled: false
# disable, Prometheus CRDs are missing
manager:
  alerts:
    enabled: false
Install the Greenhouse Helm chart:
helm install greenhouse oci://ghcr.io/cloudoperators/greenhouse/charts/greenhouse --version <greenhouse-release-version> -f values.yaml
Enable Greenhouse OIDC
Now set organization.enabled and idproxy.enabled to true in the values.yaml file and upgrade the Helm release:
helm upgrade greenhouse oci://ghcr.io/cloudoperators/greenhouse/charts/greenhouse --version <greenhouse-release-version> -f values.yaml
This will create the initial Greenhouse Organization and the Greenhouse Admin Team. This Organization will receive the greenhouse namespace, which is used to manage the Greenhouse installation and to administer other organizations. Enabling the idproxy will deploy the idproxy service, which handles the OIDC authentication.
2 - User guides
2.1 - Organization management
This section provides guides for the management of your organization in Greenhouse.
2.1.1 - SAP ID Service
This section provides a step-by-step walkthrough for new users to request an SAP ID Service (IDS) tenant.
NOTE: This document is only available on the SAP-internal documentation page.
2.1.2 - Creating an organization
Before you begin
This guide describes how to create an organization in Greenhouse.
During phase 1 and 2 of the roadmap Greenhouse is only open to selected early adopters.
Please reach out to the Greenhouse team to register and create your organization via Slack or DL Greenhouse.
Creating an organization
An organization within the Greenhouse cloud operations platform is a separate unit with its own configuration, teams, and resources tailored to their requirements.
These organizations can represent different teams, departments, or projects within an enterprise, and they operate independently within the Greenhouse platform.
They allow for the isolation and management of resources and configurations specific to their needs.
While Greenhouse is built on the idea of a self-service API and an automation-driven platform, the workflow to onboard an organization to Greenhouse currently involves reaching out to the Greenhouse administrators until the official go-live.
This ensures all pre-requisites are met, the organization is configured correctly and the administrators understand the platform capabilities.
NOTE: The name of an organization is immutable.
Steps
- CAM Profile: A CAM profile is required to configure the administrators of the organization. Please include the name of the profile in the message to the Greenhouse team when signing up.
- SAP ID service: The authentication for the users belonging to your organization is based on the OpenID Connect (OIDC) standard. For SAP, we recommend using a SAP ID service (IDS) tenant. Please include the parameters for your tenant in the message to the Greenhouse team when signing up. If you don’t have a SAP ID Service tenant yet, please refer to the SAP ID Service section for more information.
- Greenhouse organization: A Greenhouse administrator applies the following configuration to the central Greenhouse cluster. Bear in mind that the name of the organization is immutable and will be part of all URLs.
apiVersion: v1
kind: Namespace
metadata:
  name: my-organization
---
apiVersion: v1
kind: Secret
metadata:
  name: oidc-config
  namespace: my-organization
type: Opaque
data:
  clientID: ...
  clientSecret: ...
---
apiVersion: greenhouse.sap/v1alpha1
kind: Organization
metadata:
  name: my-organization
spec:
  authentication:
    oidc:
      clientIDReference:
        key: clientID
        name: oidc-config
      clientSecretReference:
        key: clientSecret
        name: oidc-config
      issuer: https://...
    scim:
      baseURL: URL to the SCIM server.
      basicAuthUser:
        secret:
          name: Name of the secret in the same namespace.
          key: Key in the secret holding the user value.
      basicAuthPw:
        secret:
          name: Name of the secret in the same namespace.
          key: Key in the secret holding the password value.
  description: My new organization
  displayName: Short name of the organization
  mappedOrgAdminIdPGroup: Name of the group in the IDP that should be mapped to the organization admin role.
Setting up Team Membership synchronization with Greenhouse
Team Membership synchronization with Greenhouse requires access to the SCIM API.
For Team Memberships to be created, the Organization needs to be configured with the URL and credentials of the SCIM API. The SCIM API is used to get the members of the teams in the organization based on the IDP groups set for the teams.
The IDP group for the organization admin team must be set in the mappedOrgAdminIdPGroup field in the Organization configuration. It is required for the synchronization to work. IDP groups for the remaining teams in the organization should be set in their respective configurations.
2.2 - Cluster management
Greenhouse enables organizations to register their Kubernetes clusters within the platform, providing a centralized interface for managing and monitoring these clusters.
Once registered, users can perform tasks related to cluster management, such as deploying applications, scaling resources, and configuring access control, all within the Greenhouse platform.
This section provides guides for the management of Kubernetes clusters within Greenhouse.
2.2.1 - Cluster onboarding
Content Overview
This guide describes how to onboard an existing Kubernetes cluster to your Greenhouse organization.
If you don’t have an organization yet please reach out to the Greenhouse administrators.
While all members of an organization can see existing clusters, their management requires org-admin or cluster-admin privileges.
NOTE: The UI is currently in development. For now this guide describes the onboarding workflow via command line.
Preparation
Download the latest greenhousectl binary from here.
Onboarding a Cluster to Greenhouse will require you to authenticate to two different Kubernetes clusters via respective kubeconfig files:
- greenhouse: The cluster your Greenhouse installation is running on. You need organization-admin or cluster-admin privileges.
- bootstrap: The cluster you want to onboard. You need system:masters privileges.
For consistency we will refer to those two clusters by their names from now on.
You need to have the kubeconfig files for both the greenhouse and the bootstrap cluster at hand. The kubeconfig file for the greenhouse cluster can be downloaded via the Greenhouse dashboard:
Organization > Clusters > Access Greenhouse cluster.
Onboard
For accessing the bootstrap cluster, greenhousectl expects your default Kubernetes kubeconfig file and context to be set to bootstrap. This can be achieved by passing the --kubeconfig flag or by setting the KUBECONFIG env var.
The location of the kubeconfig file to the greenhouse cluster is passed via the --greenhouse-kubeconfig flag.
greenhousectl cluster bootstrap --kubeconfig=<path/to/bootstrap-kubeconfig-file> --greenhouse-kubeconfig <path/to/greenhouse-kubeconfig-file> --org <greenhouse-organization-name> --cluster-name <name>
Since Greenhouse generates URLs which contain the cluster name, we highly recommend choosing a short cluster name.
In particular for Gardener Clusters, setting a short name is mandatory, because Gardener has very long cluster names, e.g. garden-greenhouse--monitoring-external.
A typical output when you run the command looks like:
2024-02-01T09:34:55.522+0100 INFO setup Loaded kubeconfig {"context": "default", "host": "https://api.greenhouse-qa.eu-nl-1.cloud.sap"}
2024-02-01T09:34:55.523+0100 INFO setup Loaded client kubeconfig {"host": "https://api.monitoring.greenhouse.shoot.canary.k8s-hana.ondemand.com"}
2024-02-01T09:34:56.579+0100 INFO setup Bootstraping cluster {"clusterName": "monitoring", "orgName": "ccloud"}
2024-02-01T09:34:56.639+0100 INFO setup created namespace {"name": "ccloud"}
2024-02-01T09:34:56.696+0100 INFO setup created serviceAccount {"name": "greenhouse"}
2024-02-01T09:34:56.810+0100 INFO setup created clusterRoleBinding {"name": "greenhouse"}
2024-02-01T09:34:57.189+0100 INFO setup created clusterSecret {"name": "monitoring"}
2024-02-01T09:34:58.309+0100 INFO setup Bootstraping cluster finished {"clusterName": "monitoring", "orgName": "ccloud"}
After onboarding
- List all clusters in your Greenhouse organization:
kubectl --namespace=<greenhouse-organization-name> get clusters
- Show the details of a cluster:
kubectl --namespace=<greenhouse-organization-name> get cluster <name> -o yaml
Example:
apiVersion: greenhouse.sap/v1alpha1
kind: Cluster
metadata:
  creationTimestamp: "2024-02-07T10:23:23Z"
  finalizers:
    - greenhouse.sap/cleanup
  generation: 1
  name: monitoring
  namespace: ccloud
  resourceVersion: "282792586"
  uid: 0db6e464-ec36-459e-8a05-4ad668b57f42
spec:
  accessMode: direct
  maxTokenValidity: 72h
status:
  bearerTokenExpirationTimestamp: "2024-02-09T06:28:57Z"
  kubernetesVersion: v1.27.8
  statusConditions:
    conditions:
      - lastTransitionTime: "2024-02-09T06:28:57Z"
        status: "True"
        type: Ready
When the status.kubernetesVersion field shows the correct version of the Kubernetes cluster, the cluster was successfully bootstrapped in Greenhouse.
Then status.statusConditions.conditions will contain a Condition with type=Ready and status="True".
In the remote cluster, a new namespace is created and contains some resources managed by Greenhouse. The namespace has the same name as your organization in Greenhouse.
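The readiness check described above can also be scripted. The following Python sketch assumes the Cluster resource has been fetched as JSON (for example with kubectl's `-o json` output instead of `-o yaml`) and mirrors the structure of the example resource; the helper name `is_ready` is an illustration, not part of Greenhouse or greenhousectl.

```python
import json


def is_ready(cluster: dict) -> bool:
    """Return True if the Cluster carries a Ready condition with status "True"."""
    conditions = (
        cluster.get("status", {})
        .get("statusConditions", {})
        .get("conditions", [])
    )
    return any(
        condition.get("type") == "Ready" and condition.get("status") == "True"
        for condition in conditions
    )


# Trimmed-down copy of the example Cluster shown above, as JSON.
cluster = json.loads("""
{
  "apiVersion": "greenhouse.sap/v1alpha1",
  "kind": "Cluster",
  "metadata": {"name": "monitoring", "namespace": "ccloud"},
  "status": {
    "kubernetesVersion": "v1.27.8",
    "statusConditions": {
      "conditions": [
        {"lastTransitionTime": "2024-02-09T06:28:57Z", "status": "True", "type": "Ready"}
      ]
    }
  }
}
""")

print(cluster["status"]["kubernetesVersion"], is_ready(cluster))
```

A Cluster without any status (for instance directly after creation) is reported as not ready by this helper.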
Troubleshooting
If the bootstrapping failed, you can find details about why it failed in the Cluster.statusConditions. More precisely, there will be a condition of type=KubeConfigValid which might have hints in the message field. This is also displayed in the UI on the Cluster details view.
Rerunning the onboarding command with an updated kubeconfig file should fix these issues.
2.2.2 - Cluster offboarding
Content Overview
This guide describes how to offboard an existing Kubernetes cluster in your Greenhouse organization.
While all members of an organization can see existing clusters, their management requires org-admin or cluster-admin privileges.
NOTE: The UI is currently in development. For now this guide describes the offboarding workflow via command line.
Pre-requisites
Offboarding a Cluster in Greenhouse requires authenticating to the greenhouse cluster via a kubeconfig file:
- greenhouse: The cluster where the Greenhouse installation is running. organization-admin or cluster-admin privileges are needed for deleting a Cluster resource.
Schedule Deletion
By default, Cluster resource deletion is blocked by a ValidatingWebhookConfiguration in Greenhouse.
This is done to prevent accidental deletion of cluster resources.
List the clusters in your Greenhouse organization:
kubectl --namespace=<greenhouse-organization-name> get clusters
A typical output when you run the command looks like
NAME AGE ACCESSMODE READY
mycluster-1 15d direct True
mycluster-2 35d direct True
mycluster-3 108d direct True
Delete a Cluster resource by annotating it with greenhouse.sap/delete-cluster: "true".
Example:
kubectl annotate cluster mycluster-1 greenhouse.sap/delete-cluster=true --namespace=my-org
Once the Cluster resource is annotated, the Cluster will be scheduled for deletion in 48 hours (UTC time).
This is reflected in the Cluster resource annotations and in the status conditions.
View the deletion schedule by inspecting the Cluster resource:
kubectl get cluster mycluster-1 --namespace=my-org -o yaml
A typical output when you run the command looks like
apiVersion: greenhouse.sap/v1alpha1
kind: Cluster
metadata:
  annotations:
    greenhouse.sap/delete-cluster: "true"
    greenhouse.sap/deletion-schedule: "2025-01-17 11:16:40"
  finalizers:
    - greenhouse.sap/cleanup
  name: mycluster-1
  namespace: my-org
spec:
  accessMode: direct
  kubeConfig:
    maxTokenValidity: 72
status:
  ...
  statusConditions:
    conditions:
      ...
      - lastTransitionTime: "2025-01-15T11:16:40Z"
        message: deletion scheduled at 2025-01-17 11:16:40
        reason: ScheduledDeletion
        status: "False"
        type: Delete
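The relation between the time of annotation and the schedule in the output above is a plain 48-hour offset in UTC. As a minimal sketch (the function name is hypothetical and this is not how Greenhouse computes it internally), the expected value of the greenhouse.sap/deletion-schedule annotation can be derived like this:

```python
from datetime import datetime, timedelta, timezone

DELETION_DELAY = timedelta(hours=48)
# Same layout as the schedule annotation: YYYY-MM-DD HH:MM:SS
SCHEDULE_FORMAT = "%Y-%m-%d %H:%M:%S"


def deletion_schedule(annotated_at: datetime) -> str:
    """Return the schedule string for a timezone-aware annotation time."""
    scheduled = annotated_at.astimezone(timezone.utc) + DELETION_DELAY
    return scheduled.strftime(SCHEDULE_FORMAT)


# Transition time taken from the example output above.
annotated_at = datetime(2025, 1, 15, 11, 16, 40, tzinfo=timezone.utc)
print(deletion_schedule(annotated_at))
```

For the example resource, annotated on 2025-01-15 at 11:16:40 UTC, this gives 2025-01-17 11:16:40, matching the annotation and the condition message.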
In order to cancel the deletion, you can remove the greenhouse.sap/delete-cluster annotation:
kubectl annotate cluster mycluster-1 greenhouse.sap/delete-cluster- --namespace=my-org
The trailing - at the end of the annotation name is used to remove the annotation.
Impact
When a Cluster resource is scheduled for deletion, all Plugin resources associated with the Cluster resource will skip the reconciliation process.
When the deletion schedule is reached, the Cluster resource will be deleted and all associated Plugin resources will be deleted as well.
Immediate Deletion
In order to delete a Cluster resource immediately:
- annotate the Cluster resource with greenhouse.sap/delete-cluster (see Schedule Deletion).
- update the greenhouse.sap/deletion-schedule annotation to the current date and time.
You can also annotate the Cluster resource with greenhouse.sap/delete-cluster and greenhouse.sap/deletion-schedule at the same time and set the current date and time for deletion.
The time and date should be in YYYY-MM-DD HH:MM:SS format (Go’s time.DateTime format). The time should be in the UTC timezone.
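As a sketch of the combined variant, the following Python snippet formats the current UTC time in the expected layout and assembles a kubectl command that sets both annotations at once. The helper name is hypothetical, and the use of kubectl's --overwrite flag is an assumption made here so that an already existing schedule annotation gets replaced.

```python
import shlex
from datetime import datetime, timezone
from typing import Optional

SCHEDULE_FORMAT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD HH:MM:SS


def immediate_deletion_command(
    cluster: str, namespace: str, now: Optional[datetime] = None
) -> str:
    """Build a kubectl command that schedules the Cluster for deletion right now."""
    now = now or datetime.now(timezone.utc)
    schedule = now.astimezone(timezone.utc).strftime(SCHEDULE_FORMAT)
    argv = [
        "kubectl", "annotate", "cluster", cluster,
        "greenhouse.sap/delete-cluster=true",
        f"greenhouse.sap/deletion-schedule={schedule}",
        "--overwrite",  # assumption: replace an existing schedule annotation
        f"--namespace={namespace}",
    ]
    # shlex.join quotes the schedule argument because it contains a space.
    return shlex.join(argv)


fixed_now = datetime(2025, 1, 15, 11, 16, 40, tzinfo=timezone.utc)
print(immediate_deletion_command("mycluster-1", "my-org", now=fixed_now))
```

Called without `now`, the helper uses the current UTC time; the fixed timestamp is only there to keep the example reproducible.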
Troubleshooting
If the cluster deletion has failed, you can troubleshoot the issue by inspecting:
- the Cluster resource status conditions, specifically the KubeConfigValid condition.
- the status conditions of the Plugin resources associated with the Cluster resource. There will be a clear indication of the issue in the HelmReconcileFailed condition.
2.3 - Plugin management
Plugins extend the capabilities of the Greenhouse cloud operations platform, adding specific features or functionalities to tailor and enhance the platform for specific organizational needs.
These plugins are integral to Greenhouse’s extensibility, allowing users to customize their cloud operations environment and address unique requirements while operating within the Greenhouse ecosystem.
This section provides guides for the management of plugins for Kubernetes clusters within Greenhouse.
2.3.1 - Local Plugin Development
Introduction
Let’s illustrate how to leverage Greenhouse Plugins to deploy a Helm Chart into a remote cluster within the local development environment.
This guide will walk you through the process of spinning up the local development environment, creating a new Greenhouse PluginDefinition and deploying it to a local kind cluster.
At the end of the guide you will have spun up the local development environment, onboarded a Cluster, created a PluginDefinition and deployed it as a Plugin to the onboarded Cluster.
[!NOTE] This guide assumes you already have a working Helm chart and will not cover how to create a Helm Chart from scratch. For more information on how to create a Helm Chart, please refer to the Helm documentation.
Requirements
Starting the local development environment
Follow the Local Development documentation to spin up the local Greenhouse development environment.
This will provide you with a running local Greenhouse instance, filled with some example Greenhouse resources, and the Greenhouse UI running on http://localhost:3000.
Onboarding a Cluster
In this step we will create and onboard a new Cluster to the local Greenhouse instance. The local cluster will be created utilizing kind.
In order to onboard a kind cluster follow the onboarding a cluster section of the dev-env README.
After onboarding the cluster you should see the new Cluster in the Greenhouse UI.
Prepare Helm Chart
For this example we will use the bitnami nginx Helm Chart. The packaged chart can be downloaded with:
helm pull oci://registry-1.docker.io/bitnamicharts/nginx --destination ./
After unpacking the *.tgz file there is a folder named nginx containing the Helm Chart.
Generating a PluginDefinition from a Helm Chart
Using the files of the Helm Chart we will create a new Greenhouse PluginDefinition using the greenhousectl CLI.
greenhousectl plugin generate ./nginx ./nginx-plugin
This will create a new folder nginx-plugin containing the PluginDefinition in a nested structure.
Modifying the PluginDefinition
The generated PluginDefinition contains a plugindefinition.yaml file which defines the PluginDefinition. But there are still a few steps required to make it work.
Specify the Helm Chart repository
After generating the PluginDefinition the .spec.helmChart.repository field in the plugindefinition.yaml contains a TODO comment. This field should be set to the repository where the Helm Chart is stored.
For the bitnami nginx Helm Chart this would be oci://registry-1.docker.io/bitnamicharts.
Specify the UI application
A PluginDefinition may specify a UI application that will be integrated into the Greenhouse UI. This tutorial does not cover how to create a UI application. Therefore the section .spec.uiApplication in the plugindefinition.yaml should be removed.
[!INFORMATION] The UI section of the dev-env readme provides a brief introduction to developing a frontend application for Greenhouse.
Modify the Options
The PluginDefinition contains a section .spec.options which defines options that can be set when deploying the Plugin to a Cluster. These options have been generated based on the Helm Chart values.yaml file. You can modify the options to fit your needs.
In general the options are defined as follows:
options:
  - default: true
    value: abcd123
    description: automountServiceAccountToken
    name: automountServiceAccountToken
    required: false
    type: ""
default specifies if the option should provide a default value. If this is set to true, the value specified will be used as the default value. The Plugin can still provide a different value for this option.
description provides a description for the option.
name specifies the Helm Chart value name, as it is used within the Chart’s template files.
required specifies if the option is required. This will be used by the Greenhouse Controllers to determine if a Plugin is valid.
type specifies the type of the option. This can be any of [string, secret, bool, int, list, map]. This will be used by the Greenhouse Controllers to validate the provided value.
For this tutorial we will remove all options.
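To illustrate how the option fields described above interact, here is a hedged Python sketch that resolves the effective values for a Plugin: a value set on the Plugin wins, otherwise the option's default value is used, and a required option without any value is reported. This follows the field descriptions in this guide only; it is not the Greenhouse controllers' validation logic, the function names are hypothetical, and the mapping of option types to Python types is an assumption.

```python
# Assumed mapping from option types to Python types; "secret" and "" are not checked here.
PYTHON_TYPES = {"string": str, "bool": bool, "int": int, "list": list, "map": dict}


def has_expected_type(option_type: str, value) -> bool:
    """Check a value against the option type where this sketch knows the type."""
    expected = PYTHON_TYPES.get(option_type)
    if expected is None:
        return True
    if expected is int and isinstance(value, bool):
        return False  # bool is a subclass of int in Python, reject it explicitly
    return isinstance(value, expected)


def resolve_option_values(options: list, provided: dict) -> dict:
    """Merge Plugin-provided values with option defaults and report problems."""
    resolved, problems = {}, []
    for option in options:
        name = option["name"]
        if name in provided:
            value = provided[name]  # the Plugin may override the default
        elif option.get("default") and "value" in option:
            value = option["value"]  # fall back to the default value
        elif option.get("required"):
            problems.append(f"{name}: required but not set")
            continue
        else:
            continue
        if has_expected_type(option.get("type", ""), value):
            resolved[name] = value
        else:
            problems.append(f"{name}: expected type {option['type']}")
    if problems:
        raise ValueError("; ".join(problems))
    return resolved


options = [
    {
        "default": True,
        "value": "abcd123",
        "description": "automountServiceAccountToken",
        "name": "automountServiceAccountToken",
        "required": False,
        "type": "",
    }
]

print(resolve_option_values(options, {}))
print(resolve_option_values(options, {"automountServiceAccountToken": "xyz"}))
```

The first call falls back to the default value from the option, the second one uses the value supplied by the Plugin.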
Deploying a Plugin to the Kind Cluster
After modifying the PluginDefinition we can deploy it to the local Greenhouse cluster and create a Plugin that will deploy nginx to the onboarded cluster.
kubectl --kubeconfig=./envtest/kubeconfig apply -f ./nginx-plugin/nginx/17.3.2/plugindefinition.yaml
plugindefinition.greenhouse.sap/nginx-17.3.2 created
The Plugin can be configured using the Greenhouse UI running on http://localhost:3000.
Follow these steps to deploy a Plugin for the created PluginDefinition into the onboarded kind cluster:
- Navigate on the Greenhouse UI to Organization>Plugins.
- Click on the Add Plugin button.
- Select the nginx-17.3.2 PluginDefinition.
- Click on the Configure Plugin button.
- Select the cluster in the drop-down.
- Click on the Create Plugin button.
After the Plugin has been created the Plugin Overview page will show the status of the plugin.
The deployment can also be verified in the onboarded cluster by checking the pods in the test-org namespace of the kind cluster.
kind export kubeconfig --name remote-cluster
Set kubectl context to "kind-remote-cluster"
k get pods -n test-org
NAME READY STATUS RESTARTS AGE
nginx-remote-cluster-758bf47c77-pz72l 1/1 Running 0 2m11s
Development Tips
Local Helm Charts
Instead of uploading the Helm Chart to a chart repository, it is possible to load it from the filesystem of the Greenhouse container.
This can be especially useful if you are developing your own chart for a PluginDefinition, as it speeds up the testing loop.
The Docker compose setup mounts the dev-env/helm-charts directory and watches for any changes. This means you can point to this local chart in your plugindefinition.yaml as such:
helmChart:
  name: helm-charts/{filename}.tgz
  repository:
2.3.2 - Testing a Plugin
Overview
Plugin Testing Requirements
All plugins contributed to the plugin-extensions repository should include comprehensive Helm Chart Tests using the bats/bats-detik testing framework. This ensures that our plugins are robust and deployable, and that potential issues are caught early in the development cycle.
What is bats/bats-detik?
The bats/bats-detik framework simplifies end-to-end (e2e) testing in Kubernetes. It combines the Bash Automated Testing System (bats) with Kubernetes-specific assertions (detik). This allows you to write test cases using natural language-like syntax, making your tests easier to read and maintain.
Implementing Tests
- Create a /tests folder inside your Plugin’s Helm Chart templates folder to store your test resources.
- ConfigMap definition:
  - Create a test-<plugin-name>-config.yaml file in the templates/tests directory to define a ConfigMap that will hold your test script.
  - This ConfigMap contains the test script run.sh that will be executed by the test Pod to run your tests.
{{- if .Values.testFramework.enabled -}}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-test
  namespace: {{ .Release.Namespace }}
  labels:
    type: integration-test
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "-5" # Installed and upgraded before the test pod
    "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"
data:
  run.sh: |-
    #!/usr/bin/env bats
    load "/usr/lib/bats/bats-detik/utils"
    load "/usr/lib/bats/bats-detik/detik"
    DETIK_CLIENT_NAME="kubectl"
    @test "Verify successful deployment and running status of the {{ .Release.Name }}-operator pod" {
      verify "there is 1 deployment named '{{ .Release.Name }}-operator'"
      verify "there is 1 service named '{{ .Release.Name }}-operator'"
      try "at most 2 times every 5s to get pods named '{{ .Release.Name }}-operator' and verify that '.status.phase' is 'running'"
    }
    @test "Verify successful creation and bound status of {{ .Release.Name }} persistent volume claims" {
      try "at most 3 times every 5s to get persistentvolumeclaims named '{{ .Release.Name }}.*' and verify that '.status.phase' is 'Bound'"
    }
    @test "Verify successful creation and available replicas of {{ .Release.Name }} Prometheus resource" {
      try "at most 3 times every 5s to get prometheuses named '{{ .Release.Name }}' and verify that '.status.availableReplicas' is more than '0'"
    }
    @test "Verify creation of required custom resource definitions (CRDs) for {{ .Release.Name }}" {
      verify "there is 1 customresourcedefinition named 'prometheuses'"
      verify "there is 1 customresourcedefinition named 'podmonitors'"
    }
{{- end -}}
Note: You can use this guide for reference when writing your test assertions.
- Test Pod Definition:
  - Create a test-<plugin-name>.yaml file in the templates/tests directory to define a Pod that will run your tests.
  - This test Pod will mount the ConfigMap created in the previous step and will execute the test script run.sh.
{{- if .Values.testFramework.enabled -}}
apiVersion: v1
kind: Pod
metadata:
  name: {{ .Release.Name }}-test
  namespace: {{ .Release.Namespace }}
  labels:
    type: integration-test
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"
spec:
  serviceAccountName: {{ .Release.Name }}-test
  containers:
    - name: bats-test
      image: "{{ .Values.testFramework.image.registry}}/{{ .Values.testFramework.image.repository}}:{{ .Values.testFramework.image.tag }}"
      imagePullPolicy: {{ .Values.testFramework.image.pullPolicy }}
      command: ["bats", "-t", "/tests/run.sh"]
      volumeMounts:
        - name: tests
          mountPath: /tests
          readOnly: true
  volumes:
    - name: tests
      configMap:
        name: {{ .Release.Name }}-test
  restartPolicy: Never
{{- end -}}
- RBAC Permissions:
  - Create the necessary RBAC resources in the templates/tests folder with a dedicated ServiceAccount and role authorisations so that the test Pod can cover the test cases.
  - You can use test-permissions.yaml from the kube-monitoring as a reference to configure RBAC permissions for your test Pod.
- Configure the Test Framework in Plugin’s values.yaml:
  - Add the following configuration to your Plugin’s values.yaml file:
testFramework:
  enabled: true
  image:
    registry: ghcr.io
    repository: cloudoperators/greenhouse-extensions-integration-test
    tag: main
    pullPolicy: IfNotPresent # read by the test Pod as .Values.testFramework.image.pullPolicy
- Running the Tests:
Important: Once you have completed all the steps above, you are ready to run the tests. However, before running the tests, ensure that you perform a fresh Helm installation or upgrade of your Plugin’s Helm release against your test Kubernetes cluster (for example, Minikube or Kind) by executing the following command:
# For a new installation
helm install <Release name> <chart-path>
# For an upgrade
helm upgrade <Release name> <chart-path>
- After the Helm installation or upgrade is successful, run the tests against the same test Kubernetes cluster by executing the following command.
helm test <Release name>
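The RBAC step above can be sketched as follows. This is a minimal example, not the test-permissions.yaml of kube-monitoring: it assumes the tests only read namespaced resources (pods, services, persistentvolumeclaims, deployments), and the rules are illustrative. Checks against cluster-scoped or custom resources, such as the customresourcedefinition and prometheuses assertions in the sample script, would additionally need a ClusterRole and ClusterRoleBinding with matching rules.

```yaml
{{- if .Values.testFramework.enabled -}}
# ServiceAccount referenced by the test Pod via serviceAccountName.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Release.Name }}-test
  namespace: {{ .Release.Namespace }}
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "-5" # created before the test Pod
    "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"
---
# Read-only access to the namespaced resources the tests inspect (illustrative rules).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ .Release.Name }}-test
  namespace: {{ .Release.Namespace }}
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "persistentvolumeclaims"]
    verbs: ["get", "list"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list"]
---
# Grants the Role to the test ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ .Release.Name }}-test
  namespace: {{ .Release.Namespace }}
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: {{ .Release.Name }}-test
subjects:
  - kind: ServiceAccount
    name: {{ .Release.Name }}-test
    namespace: {{ .Release.Namespace }}
{{- end -}}
```

Keeping the rules limited to the verbs the assertions need (get and list) keeps the test Pod from holding more privileges than the tests require.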
Contribution Checklist
Before submitting a pull request:
- Ensure your Plugin’s Helm Chart includes a /tests directory.
- Verify the presence of test-<plugin-name>.yaml, test-<plugin-name>-config.yaml, and test-permissions.yaml files.
- Test your Plugin thoroughly using helm test <release-name> and confirm that all tests pass against a test Kubernetes cluster.
- Include a brief description of the tests in your pull request.
- Make sure that your Plugin’s Chart Directory and the Plugin’s Upstream Chart Repository are added to this greenhouse-extensions helm test config file. This will ensure that your Plugin’s tests are automatically run in the GitHub Actions workflow when you submit a pull request for this Plugin.
- Note that the dependencies of your Plugin’s helm chart might also have their own tests. If so, ensure that the tests of the dependencies are also passing.
Important Notes
- Test Coverage: Aim for comprehensive test coverage to ensure your Plugin’s reliability.
- Test Isolation: Design tests that don’t interfere with other plugins or production environments.
2.3.3 - Plugin deployment
Before you begin
This guide describes how to configure and deploy a Greenhouse plugin.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring-martin
  namespace: <organization namespace> # same namespace in remote cluster for resources
spec:
  clusterName: <name of the remote cluster> # k get cluster
  disabled: false
  displayName: <any human readable name>
  pluginDefinition: <PluginDefinition name> # k get plugindefinition
  optionValues:
    - name: <from the plugin options>
      value: <from the plugin options>
    - ...
Exposed services
Plugins deploying Helm Charts into remote clusters support exposed services.
A service in the Helm chart becomes accessible from the central Greenhouse system via a service proxy when the following label is added to it:
greenhouse.sap/expose: "true"
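As an illustration, a Service template in a Plugin's Helm chart could carry the label as shown in this sketch. Only the greenhouse.sap/expose label comes from this guide; the Service name, selector and ports are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-plugin-ui # hypothetical Service name
  labels:
    greenhouse.sap/expose: "true" # marks the Service for exposure via the service proxy
spec:
  selector:
    app.kubernetes.io/name: my-plugin # hypothetical selector
  ports:
    - name: http
      port: 80 # hypothetical ports
      targetPort: 8080
```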
Deploying a Plugin
Create the Plugin resource via the command:
kubectl --namespace=<organization name> create -f plugin.yaml
After deployment
Verify that the Plugin has been properly created:
kubectl --namespace=<organization name> get plugin
When all components of the plugin are successfully created, the plugin should show the state configured.
Check in the remote cluster that all plugin resources are created in the organization namespace.
URLs for exposed services
After deploying the plugin to a remote cluster, the ExposedServices section in the Plugin’s status provides an overview of the Plugin’s services that are centrally exposed. It maps the exposed URL to the service found in the manifest.
- The URLs for exposed services are created in the following pattern: https://$cluster--$hash.$organisation.$basedomain. The $hash is computed from service--$namespace.
- When deploying a plugin to the central cluster, the exposed services won’t have their URLs defined, which will be reflected in the Plugin’s Status.
2.3.4 - Managing Plugins for multiple clusters
Managing Plugins for multiple clusters
This guide describes how to configure and deploy a Greenhouse Plugin with the same configuration into multiple clusters.
The PluginPreset resource is used to create and deploy Plugins with identical configuration into multiple clusters. The list of clusters the Plugins will be deployed to is determined by a LabelSelector.
As a result, whenever a cluster that matches the ClusterSelector is onboarded or offboarded, the Controller for the PluginPresets will take care of the Plugin lifecycle. This means creating or deleting the Plugin for the respective cluster.
The same validation applies to the PluginPreset as to the Plugin. This includes immutable PluginDefinition and ReleaseNamespace fields, as well as the validation of the OptionValues against the PluginDefinition.
In case the PluginPreset is updated, all of the Plugin instances that are managed by the PluginPreset will be updated as well. Each Plugin instance that is created from a PluginPreset has a label greenhouse.sap/pluginpreset: <PluginPreset name>. Also the name of the Plugin follows the scheme <PluginPreset name>-<cluster name>.
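To make the naming scheme concrete, the following sketch shows the metadata a generated Plugin would carry for the kube-monitoring-preset PluginPreset from the example in this guide and a cluster named my-cluster. The cluster name is a hypothetical placeholder, and the remaining spec fields are copied from the PluginPreset and are omitted here.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring-preset-my-cluster # <PluginPreset name>-<cluster name>
  namespace: <organization namespace>
  labels:
    greenhouse.sap/pluginpreset: kube-monitoring-preset # <PluginPreset name>
spec:
  clusterName: my-cluster # hypothetical cluster matched by the clusterSelector
  pluginDefinition: <PluginDefinition name>
```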
Changes made directly to a Plugin that was created from a PluginPreset will be overwritten immediately by the PluginPreset Controller. All changes must be performed on the PluginPreset itself. If a Plugin with the name that the PluginPreset would create already exists, this Plugin will be ignored in subsequent reconciliations.
A PluginPreset with the annotation greenhouse.sap/prevent-deletion may not be deleted. This prevents the accidental deletion of a PluginPreset including the managed Plugins and their deployed Helm releases. A PluginPreset can only be deleted after the annotation has been removed.
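A protected PluginPreset might look like the following sketch. This guide only names the annotation key; the value "true" is an assumption made for illustration, so check the API reference for the exact semantics.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: PluginPreset
metadata:
  name: kube-monitoring-preset
  namespace: <organization namespace>
  annotations:
    greenhouse.sap/prevent-deletion: "true" # assumed value; only the key is documented here
```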
Example PluginPreset
apiVersion: greenhouse.sap/v1alpha1
kind: PluginPreset
metadata:
  name: kube-monitoring-preset
  namespace: <organization namespace>
spec:
  plugin: # this embeds the PluginSpec
    displayName: <any human readable name>
    pluginDefinition: <PluginDefinition name> # k get plugindefinition
    releaseNamespace: <namespace> # namespace where the plugin is deployed to on the remote cluster. Will be created if not exists
    optionValues:
      - name: <from the PluginDefinition options>
        value: <from the PluginDefinition options>
      - ..
  clusterSelector: # LabelSelector for the clusters the Plugin should be deployed to
    matchLabels:
      <label-key>: <label-value>
  clusterOptionOverrides: # allows you to override specific options in a given cluster
    - clusterName: <cluster name where we want to override values>
      overrides:
        - name: <option name to override>
          value: <new value>
        - ..
    - ..
2.3.5 - Plugin Catalog
Before you begin
This guide describes how to explore the catalog of Greenhouse PluginDefinitions.
While all members of an organization can see the Plugin catalog, enabling, disabling and configuring PluginDefinitions for an organization requires organization admin privileges.
Exploring the PluginDefinition catalog
The PluginDefinition resource describes the backend and frontend components as well as mandatory configuration options of a Greenhouse extension.
While the PluginDefinition catalog is managed by the Greenhouse administrators and the respective domain experts, administrators of an organization can configure and tailor Plugins to their specific requirements.
NOTE: The UI also provides a preliminary catalog of Plugins under Organization> Plugin> Add Plugin.
Run the following command to see all available PluginDefinitions.
$ kubectl get plugindefinition
NAME                      VERSION   DESCRIPTION   AGE
cert-manager              1.1.0     Automated certificate management in Kubernetes   182d
digicert-issuer           1.2.0     Extensions to the cert-manager for DigiCert support   182d
disco                     1.0.0     Automated DNS management using the Designate Ingress CNAME operator (DISCO)   179d
doop                      1.0.0     Holistic overview on Gatekeeper policies and violations   177d
external-dns              1.0.0     The kubernetes-sigs/external-dns plugin.   186d
heureka                   1.0.0     Plugin for Heureka, the patch management system.   177d
ingress-nginx             1.1.0     Ingress NGINX controller   187d
kube-monitoring           1.0.1     Kubernetes native deployment and management of Prometheus, Alertmanager and related monitoring components.   51d
prometheus-alertmanager   1.0.0     Prometheus alertmanager   60d
supernova                 1.0.0     Supernova, the holistic alert management UI   187d
teams2slack               1.1.0     Manage Slack handles and channels based on Greenhouse teams and their members   115d
2.4 - Team management
A team is a group of users with shared responsibilities for managing and operating cloud resources within a Greenhouse organization.
These teams enable efficient collaboration, access control, and task assignment, allowing organizations to effectively organize their users and streamline cloud operations within the Greenhouse platform.
This section provides guides for the management of teams within an organization.
2.4.1 - Role-based access control
Contents
- Before you begin
- Greenhouse Team RBAC user guide
- Overview
- Defining TeamRoles
- Seeded default TeamRoles
- Defining TeamRoleBindings
Before you begin
This guide describes how to manage roles and permissions in Greenhouse with the help of TeamRoles and TeamRoleBindings.
While all members of an organization can see the permissions configured with TeamRoles & TeamRoleBindings, configuration of these requires OrganizationAdmin privileges.
Greenhouse Team RBAC user guide
Role-Based Access Control (RBAC) in Greenhouse allows organization administrators to regulate access to Kubernetes resources in onboarded Clusters based on the roles of individual users within an Organization.
Within Greenhouse the RBAC on remote Clusters is managed using TeamRole and TeamRoleBinding. These two Custom Resource Definitions allow for fine-grained control over the permissions of each Team within each Cluster and Namespace.
Overview
- TeamRole: Defines a set of permissions that can be assigned to teams.
- TeamRoleBinding: Assigns a TeamRole to a specific Team for certain Clusters and (optionally) Namespaces.
Defining TeamRoles
TeamRoles define what actions a team can perform within the Kubernetes cluster.
Common roles, including the cluster-admin role listed below, are pre-defined within each organization.
Example
This TeamRole named pod-read grants read access to Pods.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRole
metadata:
  name: pod-read
spec:
  rules:
    - apiGroups:
        - ""
      resources:
        - "pods"
      verbs:
        - "get"
        - "list"
Seeded default TeamRoles
Greenhouse provides a set of default TeamRoles
that are seeded to all clusters:
TeamRole | Description | APIGroups | Resources | Verbs |
---|---|---|---|---|
cluster-admin | Full privileges | * | * | * |
cluster-viewer | get, list and watch all resources | * | * | get, list, watch |
cluster-developer | Aggregated role. Greenhouse aggregates the application-developer and the cluster-viewer. Further TeamRoles can be aggregated. | | | |
application-developer | Set of permissions on pods, deployments and statefulsets necessary to develop applications on k8s | apps | deployments, statefulsets | patch |
application-developer (cont.) | | "" | pods, pods/portforward, pods/eviction, pods/proxy, pods/log, pods/status | get, list, watch, create, update, patch, delete |
node-maintainer | get and patch nodes | "" | nodes | get, patch |
namespace-creator | All permissions on namespaces | "" | namespaces | * |
Defining TeamRoleBindings
TeamRoleBindings
define the permissions of a Greenhouse Team within Clusters by linking to a specific TeamRole
.
TeamRoleBindings have a simple specification that links a Team, a TeamRole, one or more Clusters and optionally one or more Namespaces together. Once the TeamRoleBinding is created, the Team will have the permissions defined in the TeamRole within the specified Clusters and Namespaces. This allows for fine-grained control over the permissions of each Team within each Cluster.
The TeamRoleBinding Controller within Greenhouse deploys rbacv1 resources to the targeted Clusters. The referenced TeamRole is created as a rbacv1.ClusterRole. In case the TeamRoleBinding references a Namespace, the Controller will create a rbacv1.RoleBinding which links the Team with the rbacv1.ClusterRole. In case no Namespace is referenced, the Controller will create a rbacv1.ClusterRoleBinding instead.
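The following sketch illustrates what the resulting rbacv1 resources on a target Cluster could look like for the pod-read TeamRole when the TeamRoleBinding references a Namespace. The resource names, the Namespace and the Group subject are assumptions made for illustration; the actual names and subjects are chosen by the Controller and are not specified in this guide.

```yaml
# ClusterRole derived from the pod-read TeamRole.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-read # hypothetical; the Controller's naming scheme is not described here
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
---
# RoleBinding created because the TeamRoleBinding references a Namespace.
# Without a Namespace, a ClusterRoleBinding would be created instead.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-team-read-access # hypothetical
  namespace: my-namespace # hypothetical Namespace referenced by the TeamRoleBinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-read
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group # assumption: the Team is represented as a Group subject
    name: my-team-idp-group # hypothetical group name
```

Binding a ClusterRole through a RoleBinding limits the granted permissions to the RoleBinding's Namespace, which is how one role definition can be reused across Namespaces.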
Assigning TeamRoles to Teams on a single Cluster
Roles are assigned to teams through the TeamRoleBinding configuration, which links teams to their respective roles within specific clusters.
This TeamRoleBinding assigns the pod-read
TeamRole to the Team named my-team
in the Cluster named my-cluster
.
Example: team-rolebindings.yaml
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRoleBinding
metadata:
  name: my-team-read-access
spec:
  teamRef: my-team
  roleRef: pod-read
  clusterName: my-cluster
Assigning TeamRoles to Teams on multiple Clusters
It is also possible to use a LabelSelector to assign TeamRoleBindings to multiple Clusters at once.
This TeamRoleBinding assigns the pod-read
TeamRole to the Team named my-team
in all Clusters with the label environment: production
.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRoleBinding
metadata:
  name: production-cluster-admins
spec:
  teamRef: my-team
  roleRef: pod-read
  clusterSelector:
    matchLabels:
      environment: production
Aggregating TeamRoles
It is possible with RBAC to aggregate rbacv1.ClusterRoles. This is also supported for TeamRoles. By specifying .spec.labels on a TeamRole the resulting ClusterRole on the target cluster will have the same labels set. Then it is possible to aggregate multiple ClusterRole resources by using a rbacv1.AggregationRule. This can be specified on a TeamRole by setting .spec.aggregationRule.
More details on the concept of Aggregated ClusterRoles can be found in the Kubernetes documentation: Aggregated ClusterRoles
[!NOTE] A TeamRole is only created on a cluster if it is referenced by a TeamRoleBinding. If a TeamRole is not referenced by a TeamRoleBinding it will not be created on any target cluster. A TeamRoleBinding referencing a TeamRole with an aggregationRule will only provide the correct access, if there is at least one TeamRoleBinding referencing a TeamRole with the corresponding label deployed to the same cluster.
The following example shows how an AggregationRule can be used with TeamRoles and TeamRoleBindings.
This TeamRole specifies .spec.labels. The labels will be applied to the resulting ClusterRole on the target cluster.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRole
metadata:
  name: pod-read
spec:
  labels:
    aggregate: "true"
  rules:
    - apiGroups:
        - ""
      resources:
        - "pods"
      verbs:
        - "get"
        - "list"
This TeamRoleBinding assigns the pod-read
TeamRole to the Team named my-team
in all Clusters with the label environment: production
.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRoleBinding
metadata:
  name: production-pod-read
spec:
  teamRef: my-team
  roleRef: pod-read
  clusterSelector:
    matchLabels:
      environment: production
This creates another TeamRole and TeamRoleBinding including the same labels as above.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRole
metadata:
  name: pod-edit
spec:
  labels:
    aggregate: "true"
  rules:
    - apiGroups:
        - ""
      resources:
        - "pods" # Kubernetes resource names are plural
      verbs:
        - "update"
        - "patch"
---
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRoleBinding
metadata:
  name: production-pod-edit
spec:
  teamRef: my-team
  roleRef: pod-edit
  clusterSelector:
    matchLabels:
      environment: production
This TeamRole has an aggregationRule set. This aggregationRule will be added to the ClusterRole created on the target clusters. With the aggregationRule set it will aggregate the ClusterRoles created by the TeamRoles with the label aggregate: "true". The team will have the permissions of both TeamRoles and will be able to get, list, update and patch Pods.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRole
metadata:
  name: aggregated-role
spec:
  aggregationRule:
    clusterRoleSelectors:
      - matchLabels:
          "aggregate": "true"
---
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRoleBinding
metadata:
  name: aggregated-rolebinding
spec:
  teamRef: operators
  roleRef: aggregated-role
  clusterSelector:
    matchLabels:
      environment: production
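For orientation, the ClusterRole that results on a target cluster from the aggregated-role TeamRole would resemble the sketch below. The resource name is an assumption, since this guide does not describe the Controller's naming scheme. With an aggregationRule set, the Kubernetes control plane fills in the rules from all ClusterRoles matching the selector, so they are not written by hand.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: aggregated-role # hypothetical; the actual name is chosen by the Controller
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        aggregate: "true"
rules: [] # populated automatically from ClusterRoles labeled aggregate: "true"
```

In this example the populated rules would combine get and list from pod-read with update and patch from pod-edit.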
2.4.2 - Team creation
Before you begin
This guide describes how to create a team in your Greenhouse organization.
While all members of an organization can see existing teams, their management requires organization admin privileges.
Creating a team
The team resource is used to structure members of your organization and assign fine-grained access and permission levels.
Each Team must be backed by a group in the identity provider (IdP) of the Organization.
- The IdP group should be set on the mappedIdPGroup field in the Team configuration.
- This, along with the SCIM API configured in the Organization, allows for synchronization of TeamMemberships with Greenhouse.
NOTE: The UI is currently in development. For now this guide describes the workflow via the command line.
- To create a new Team, apply a Team resource. It should look similar to this example:
cat <<EOF | kubectl apply -f -
apiVersion: greenhouse.sap/v1alpha1
kind: Team
metadata:
  name: <name>
spec:
  description: My new team
  mappedIdPGroup: <IdP group name>
EOF
3 - Architecture
This section provides an overview of the architecture and design of the Greenhouse platform.
3.1 - High-level architecture
This section provides a high-level overview of the Greenhouse concepts.
Greenhouse components
Conceptually, the Greenhouse platform consists of two types of Kubernetes clusters, each fulfilling a specific purpose.
Central cluster
The central cluster accommodates the core components of the Greenhouse platform, providing the holistic API and dashboard to let users manage their entire cloud infrastructure from a single control point.
Users of the Greenhouse cloud operations platform, depending on their roles, can perform tasks such as managing organizations, configuring and using plugins, monitoring resources, developing custom plugins, and conducting audits to ensure compliance with standards.
Greenhouse’s flexibility allows users to efficiently manage cloud infrastructure and applications while tailoring their experience to organizational needs.
The configuration and metadata is persisted in Kubernetes custom resource definitions (CRDs), acted upon by the Greenhouse operator and managed in the customer cluster.
Customer cluster
Managing and operating Kubernetes clusters can be challenging due to the complexity of tasks related to orchestration, scaling, and ensuring high availability in containerized environments.
By onboarding their Kubernetes clusters into Greenhouse, users can centralize cluster management, streamlining tasks like resource monitoring and access control.
This integration enhances visibility, simplifies operational tasks, and ensures better compliance, enabling users to efficiently manage and optimize their Kubernetes infrastructure. While the central cluster contains the user configuration and metadata, all workloads of user-selected Plugins are run in the customer cluster and managed by Greenhouse.
A simplified architecture of the Greenhouse platform is illustrated below.
3.2 - Product design
Introduction
Vision
“Greenhouse is an extendable Platform that enables Organizations to operate their Infrastructure in a compliant, efficient and transparent manner”
We want to build a Platform that is capable of integrating all the tools that are required to operate services in our private cloud environment in a compliant, effective and transparent manner. Greenhouse aims to be the single stop for Operators in the GCS PlusOne Organization. The primary focus of Greenhouse is to provide a unified interface for all operational aspects and tools, providing a simple shared data model describing the support organization.
As every organization is different, uses different tools and has different requirements, the platform is built in an extendable fashion that allows distributed development of plugins.
While initially developed for the GCS PlusOne Organization the platform is explicitly designed to be of generic use and can be consumed by multiple organizations and teams.
Problem Statements
Consolidation of Toolsuite
The operation of cloud infrastructure and applications includes a large number of tasks that are supported by different tools and automations.
Due to the high complexity of cloud environments, a conglomerate of tools is often used to cover all operational aspects. Configuration and setup of the operations toolchain is a complex and time-consuming task that often lacks automation when it comes to on- and off-boarding people and setting up new teams for new projects.
Visibility of Application Specific permission concepts
At SAP, we are managing identities and access centrally. The Converged Cloud is utilizing Cloud Access Manager for this task.
While it is defined there who has which access level, it gets complicated when you want to figure out the actual Permission Levels on individual Applications that those Access Levels are mapped to.
Management of organizational Groups of People
You often have groups of people that fulfill an organizational purpose:
- Support Groups
- Working Groups
- etc.
We have currently no way to represent and manage those groups.
Harmonization and Standardization of Authorization Concepts
We are missing a tool that supports teams in creating access levels and profiles following a standardized naming scheme that is enforced by design and takes away the burden of coming up with names for access levels and user profiles/roles.
Single Point of Truth for Operations Metadata of an Organization
For automations, it is often critical to retrieve Metadata about an Organizational Unit:
- Who is a member of a certain group of people that may not reflect the HR view of a team?
- Which Tool is giving me access to data x,y,z?
- What action items are due and need to get alerted on?
- Does component x,z,y belong to my organization?
- etc.
Currently, we do not have a single point of truth for this kind of metadata and have to use a variety of tools.
Terms
This section lists terms and their descriptions to ensure a common language when talking in the context of Greenhouse.
Term | Description |
---|---|
Plugin | A Greenhouse plugin provides additional features to the Greenhouse project. It consists of a juno microfrontend that integrates with the Greenhouse UI and/or a backend component. |
PluginSpec | Yaml Specification describing a plugin. Contains reference to components that need to be installed. Describes mandatory and optional configuration values |
Plugin configuration | A specific configuration instance of a PluginSpec in a Greenhouse organization. References the PluginSpec and actual configuration values |
Organization | The top-level entity in Greenhouse. |
Team | A team is part of an organization and consists of users |
Role | A role that can be assigned to teams. Roles are a static set that can be used by UIs to allow/disallow actions (admin, viewer, editor) |
Cluster | A specific Kubernetes cluster to which an Organization and its members have access and which can be registered with Greenhouse. |
Identity Provider | Central authentication provider that provides authentication for the users of an organization. Used by the UI and apiserver to authenticate users. |
Authorization Provider | External system that provides authorization, e.g. team assignments for users |
Greenhouse apiserver | central apiserver for greenhouse. k8s apiserver with greenhouse CRDs |
OIDC Group | A Group provided by the OIDC Provider (Identity Provider) userinfo with the JWT Token. |
Greenhouse Role | A Greenhouse-Specific Role that grants access to Greenhouse. |
Plugin Role | A Role used by a Greenhouse Plugin to decide if a user has access to the Plugin or not and which level of access within the Plugin is provided. The possible Roles are defined by the Plugin itself within the Plugin Spec including a default OIDC Group to Plugin Role Mapping. The final mapping for a Plugin instance can get configured on the Organization Level with the Plugin Configuration. OIDC Groups that are mapped to Plugin Roles are furthermore assigned to Teams which makes users members of an Organization. |
User Profiles
Every Application has Users / Stakeholders, and so does Greenhouse. The User Profiles mentioned here give an abstract overview of the Users / Stakeholders considered for the Application and their Goals and Tasks in the context of the Platform.
Greenhouse admin
Administrator of a Greenhouse installation.
Goals
- Ensure overall function of the Greenhouse platform
Tasks
- Create Organizations
- Enables Plugins
- Operates central infrastructure (Kubernetes cluster, operator, etc.)
- Assign initial organization admins
Organization admin
Administrator of a Greenhouse Organization
Goals
- Manage organization-wide resources
Tasks
- activate/configure plugins for the organization
- Create and manage teams for the organization
- Onboard and manage Kubernetes clusters for the organization
Organization member
Member of an organization who accesses the UI to do ops/support tasks. Is a member of one or more teams of the organization. By default members have view permissions on organization resources.
Goals
- Provide ops/support for the services of the organization
Tasks
- Highly dependent on team membership and plugins configured for the organization. Examples:
- Check alerts for teams the user is assigned to
- Check policy violations for deployed resources owned by the user’s team
- Check for known vulnerabilities in services
Plugin developer
A plugin developer is developing plugins for Greenhouse.
Goals
- Must be easy to create plugins
- Can create and test plugins independently
- Greenhouse provides tooling to assist creating, validating, etc. plugins
- Publishing the plugin to Greenhouse requires admin permissions.
Tasks
- Plugin Development
- Juno UI Development
- Plugin backend development
Auditor
An Auditor audits Greenhouse and/or Greenhouse Plugins for compliance with industry or regulatory standards.
Goals
- Wants to see that we have a record for all changes happening in greenhouse
- Wants to have access to resources required to audit the respective Plugin
Tasks
- Performs Audits
Greenhouse Developer
Develops parts of the Greenhouse platform (Kubernetes, Greenhouse app, Greenhouse backend, …)
Goals
- Provide Greenhouse framework
Tasks
- Provides framework for plugin developers to develop plugins
- Develops Greenhouse framework components (UI or backend)
User Stories
The User Stories outlined in this Section are intended to achieve a common understanding of the capabilities the Platform wants to achieve and the functional requirements that come with them. The Integration / Development of Functionalities is not going to be strictly bound to User Stories; they are rather used as an orientation to ensure that envisioned capabilities are not blocked due to implementation details.
The details of all User Stories are subject to change based on the results of Proof of Concept implementations, User feedback or other unforeseen factors.
Auditor
Auditor 01 - Audit Logging
As an Auditor, I want to see who did which action and when to verify that the Vulnerability and Patch management process is followed according to company policies and that the platform is functioning as expected.
Acceptance Criteria
- Every state-changing action is logged into an immutable log collection, including:
- What action was performed
- Who performed the action
- When was the action performed
- Every authentication to the platform is logged into an immutable log collection, including:
- Who logged in
- When was the login
Greenhouse Admin
Greenhouse Admin 01 - Greenhouse Management Dashboard
As a Greenhouse Admin, I want a central Greenhouse Management Dashboard that allows me to navigate through all organization-wide resources to be able to manage the Platform.
Acceptance Criteria
- Assuming I am on the Greenhouse Management Dashboard view, I can:
- See all Plugins, including the enabled version
- Order not enabled Plugins by last Update Date
- Plugins are Ordered by the Order Attribute
- The order attribute is a numeric value that can be changed to reflect a different ordering of the Plugin:
- 1 is ordered before 2 etc.
- The order attribute is used as well to order the Plugins on the Navigation Bar
- Navigate to “Plugin Detail View” by clicking a Plugin
- See all Organizations, including:
- Number of Organization Admins
- Number of Organization Members
- Navigate to organization creation view by clicking “Create Organization”
- Navigate to Organization Detail View by clicking an Organization
- Only Greenhouse Admins should be able to see the Dashboard
- The Navigation item to the Greenhouse Management Dashboard should only be visible to Greenhouse admins
Greenhouse Admin 02 - Organization Creation View
As a Greenhouse Admin, I want an Organization Creation view that allows me to create a new Organization
Acceptance Criteria
- Assuming I am on the Organization Creation view, I can:
- Give a unique name for the organization
- Provide a short description for the organization
- Provide an OIDC Group that gives Organization Admin Privileges
Greenhouse Admin & Organization Admin
Greenhouse Admin & Organization Admin 01 - Organization Detail View
As a Greenhouse Admin or Organization Admin, I want an Organization Detail view that allows me to view details about an organization
Acceptance Criteria
- Assuming I am on the Organization Detail view, I can:
- See the organization details (name/description)
- See a list of teams created for this organization
- See the list of active plugins
- Add Plugins to the organization by clicking “add Plugin”
- Change the Organization Admin Role Name
Greenhouse Admin & Organization Admin 02 - Plugin Detail View
As a Greenhouse Admin or Organization Admin, I want a Plugin Detail view that allows me to see details about the plugin.
Acceptance Criteria
- Assuming I am on the Plugin Detail View, I can:
- see the plugin name
- see the plugin description
- see the last update date
- see the release reference
- see the UI release reference
- see the helm chart reference
- see the ordering attribute
- see configuration values for the plugin
- set the configuration values for the current organization
- see a change log
- see the actually released (deployed to greenhouse) version
Organization Admin
Organization Admin 01 - Organization Management Dashboard
As an Organization Admin, I want to have a Dashboard showing me the most relevant information about my Organization to be able to manage it efficiently.
Acceptance Criteria
- Assuming I am on the organization management dashboard
- I can see a list of all teams in my organization
- I can see a list of configured plugins
- I can click an “add plugin” button to add a new plugin
- I can see a list of registered clusters
- I can click an “add cluster” button to register a cluster
Organization Admin 02 - Plugin Configuration View
As an organization admin, I want a Plugin configuration view that allows me to enable and configure Greenhouse plugins for my organization
Acceptance Criteria
- Assuming I am on the Plugin configuration View, I can:
- select the type of plugin I want to configure
- enable/disable the plugin (for my org)
- remove the plugin (when already added)
- manage configuration options specific for the plugin
Organization Admin 03 - Cluster registration View
As an organization admin, I want a Cluster registration view to onboard Kubernetes clusters into my organization.
Acceptance Criteria
- Assuming I am on the cluster registration view, I can:
- Get instructions on how to register a Kubernetes cluster
- give a name and description for the registered cluster
- After executing the provided instructions I get feedback that the cluster has been successfully registered
- A cluster can be registered to exactly one organization
Organization Admin 04 - Cluster detail View
As an organization admin, I want a Cluster detail view to get some information about a registered cluster
Acceptance Criteria
- Assuming I am on the cluster detail view, I can:
- see basic details about the cluster:
- name
- api url
- version
- node status
- de-register the cluster from my organization
Organization Admin 05 - Team Detail View
:exclamation: User Story details depending on final decision of ADR-01 |
---|
As an organization admin, I want to have a Team Detail View with the option to configure teams based on role mapping, to be able to manage teams within my organization without managing the permission administration itself on the Platform
Acceptance Criteria
- Assuming I am on the Team Detail view, I can:
- Change the name of the Team
- Change the description of the Team
- Define a single OIDC Group that assigns users to this team
- Define The Greenhouse Role that you get within Greenhouse if you are a member of the team
- On Login of a User into an Organization the Platform verifies if the User has ALL required roles
Organization Admin 05 - Team Creation View
:exclamation: User Story details depending on final decision of ADR-01 |
---|
As an organization admin, I want to have a Team Creation View, to be able to create a new Team
Acceptance Criteria
- Assuming I am on the Team Creation view, I can:
- Set the name of the Team
- Set a description of the Team
- Set an OIDC Group Name that assigns users to this team
Organization Member
Organization Member 01 - Unified task inbox
As an organization member, I want a task inbox that shows my open tasks from all enabled plugins that need my attention, so that I can stay on top of my tasks across all plugins
Acceptance Criteria
- Assuming I am on the task inbox:
- I can see a list of open tasks across all plugins that need attention
- Clicking on an open task jumps to the plugin-specific UI the task belongs to
- I can sort open tasks by name, plugin or date
Plugin Developer
Plugin Developer 01 - Decentrally Managed Plugin
As a Plugin Developer, I want to have a separate Repository for my Plugin which I can own and use to configure plugin internals, to have control over the development efforts and configuration of the Plugin
Acceptance Criteria
- The Plugin lives in its own GitHub repository
- Versions are managed via GitHub Releases using Tags and the release to Greenhouse is managed by the Plugin:
- The version to be pulled by Greenhouse is managed by the Plugin Developer.
- I can configure the Plugin Configuration over a greenhouse.yml in the root of the repository, which at least includes (mandatory):
- description: …
- version: …
- There are optional attributes in the greenhouse.yml:
- icon: which if it has a valid absolute path to an image file on the repository makes the icon selectable as an icon in the plugin detail view (GA02)
- a description of the available configuration options and the attributes that are required for the plugin to function
- I can specify a reference to a UI App
- I can specify a reference to Helm Charts
Plugin Developer 02 - Plugin Role Config
As a Plugin Developer, I want a section within the greenhouse.yml metadata, named “Roles”, where I can set up the Roles used by my Plugin
Acceptance Criteria
:warning: User Story details depending on final decision of ADR-01 and are therefore not further described here |
---|
Plugin Developer 03 - Spec Schema Validation
As a Plugin Developer I want to have the possibility to validate the schema of my greenhouse.yaml to be able to catch errors within my specification early.
Acceptance Criteria
- The schema check should support IDEs
- The schema check should be automatable and integrable into pre-commit hooks and quality gates
- A version with a broken schema should not be released on Greenhouse even when pushing for a pull of the release
- It should be visible on the Plugin detail view when an invalid schema was released with a recent version
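The mandatory and optional attributes listed in Plugin Developer 01 hint at what a minimal schema check might look like. The following is only a sketch under stated assumptions: the file is assumed to be already parsed into a dictionary, `version` is assumed to be a string, and the function name `validate_plugin_metadata` and the error wording are hypothetical rather than part of any Greenhouse tooling.

```python
from typing import Any

# Mandatory attributes named in Plugin Developer 01.
MANDATORY_KEYS = ("description", "version")


def validate_plugin_metadata(metadata: dict[str, Any]) -> list[str]:
    """Return a list of human-readable schema errors (empty if valid)."""
    errors = []
    for key in MANDATORY_KEYS:
        value = metadata.get(key)
        # Assumption: mandatory attributes must be non-empty strings.
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty mandatory attribute: {key}")
    # "icon" is optional, but is assumed to be a string path when present.
    icon = metadata.get("icon")
    if icon is not None and not isinstance(icon, str):
        errors.append("optional attribute 'icon' must be a string")
    return errors
```

Returning a list of errors instead of raising on the first one would let a pre-commit hook or quality gate report every problem in a single run.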
Plugin Developer 04 - Config Value Validation
As a Plugin Developer, I want to be able to write custom regex checks for the configuration options of my plugin, each consisting of the check to be performed on a field and an error message to be shown if an organization configures it incorrectly, to support organization admins in configuring my plugin
Acceptance Criteria
- The validation rules should be controlled by Plugin Developer
- The validation should happen on the frontend before submitting a configuration
- The error message should be shown when an invalid config value is provided
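The pairing of a regex check with an error message can be sketched as follows. This is an illustration of the validation logic only, not the frontend implementation: the `ValidationRule` shape and the `validate_config` function are hypothetical names, and treating a missing value as an empty string is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationRule:
    """Hypothetical rule shape: field name, regex, and error message."""
    field: str
    pattern: str
    message: str


def validate_config(values: dict[str, str], rules: list[ValidationRule]) -> dict[str, str]:
    """Return a mapping of field name to error message for every failed rule."""
    errors = {}
    for rule in rules:
        # Assumption: a missing value is validated as an empty string.
        value = values.get(rule.field, "")
        # fullmatch requires the whole value to match, not just a prefix.
        if re.fullmatch(rule.pattern, value) is None:
            errors[rule.field] = rule.message
    return errors
```

Keeping the rules as plain data means the Plugin Developer controls them while the same check can run before a configuration is submitted.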
Plugin Developer 05 - Plugin development tooling
As a plugin developer I want to have an easy setup for developing and testing greenhouse plugins
Acceptance Criteria
- Dev environment available within X Time
- Possible to have a working local setup with a “mock greenhouse apiserver”
- Has a fully working Bootstrap Project that includes Backend and Frontend which can be run locally immediately
- Has documentation
Product Stages
Overview
This section gives an overview of the different early stages of the Platform that are being developed and which functional requirements need to be fulfilled within those stages.
Proof of Concept (POC)
The Proof of Concept is the stage where fundamental Framework/Platform decisions are proven to be working as intended. At this Stage the Platform is not suitable to be used by the intended audience yet but most necessary core functionalities are implemented.
The desired functionalities in this phase are:
- Frontend with Authentication
- Authorization within Greenhouse (Greenhouse Admin, Org Admin, Org Member)
- Team Management (without UI)
- Org Management (without UI)
- Greenhouse Admin User Stories (without UI)
- Dummy Plugin
- with configuration spec
- Plugin Development Setup (without Documentation)
- Plugin Versioning & Provisioning (without UI)
Minimal viable product (MVP)
This stage is considered to be the earliest stage to open the Platform for General use.
In addition to the PoC functionalities we expect the following requirements to be fulfilled:
- Integrated 3 Plugins:
- Supernova (Alerts)
- DOOP (Violations)
- Heureka
NOTE: Heureka was excluded from MVP as the Heureka API is only available at a later point in time.
- Team management
- Organization management
3.3 - Building Observability Architectures
The main terminologies used in this document can be found in core-concepts.
Introduction to Observability
Observability in software and infrastructure has become essential for operating complex modern cloud environments. The concept centers on understanding the internal states of systems based on the data they produce, enabling teams to:
- Detect and troubleshoot issues quickly,
- Maintain performance and reliability,
- Make data-driven improvements.
Core Signals and Open Source Tools
Key pillars of observability are metrics, logs, and traces, each providing unique insights that contribute to a comprehensive view of a system’s health.
Metrics:
- Metrics are numerical data points representing system health over time (e.g., CPU usage, memory usage, request latency).
- Prometheus is a widely used tool for collecting and querying metrics. It uses a time-series database optimized for real-time data, making it ideal for gathering system health data, enabling alerting, and visualizing trends.
Logs:
- Logs capture detailed, structured/unstructured records of system events, which are essential for post-incident analysis and troubleshooting.
- OpenSearch provides a robust, scalable platform for log indexing, search, and analysis, enabling teams to sift through large volumes of logs to identify issues and understand system behavior over time.
Traces:
- Traces follow a request’s journey through the system, capturing latency and failures across microservices. Traces are key for understanding dependencies and diagnosing bottlenecks.
- Jaeger is a popular open-source tool for distributed tracing, providing a detailed view of request paths and performance across services.
To provide a unified approach to open source observability, OpenTelemetry was developed as a framework for instrumenting applications and infrastructures to collect metrics, logs and traces. In addition to providing a unified API and SDKs for multiple programming languages, OpenTelemetry also simplifies the integration of various backend systems such as Prometheus, OpenSearch and Jaeger.
Observability in Greenhouse
Greenhouse provides a suite of Plugins which consist of pre-packaged configurations for monitoring and logging tools. These Plugins are designed to simplify the setup and configuration of observability components, enabling users to quickly deploy and manage monitoring and logging for their Greenhouse-onboarded Kubernetes clusters.
The following Plugins are available currently:
- Kubernetes Monitoring: Prometheus, to collect custom and Kubernetes specific metrics with standard Kubernetes alerting enabled.
- Thanos: Thanos, to enable long term metric retention and unified metric accessibility.
- Plutono: Grafana fork, to create dynamic dashboards for metrics.
- Alerts: Prometheus Alertmanager and Supernova, to manage and visualize alerts sent by Prometheus.
- OpenTelemetry: OpenTelemetry Pipelines, to collect metrics and logs from applications and forward them to backends like Prometheus and OpenSearch.
Overview Architecture
4 - Reference
This section contains reference documentation for Greenhouse.
4.1 - API
Packages:
greenhouse.sap/v1alpha1
Resource Types:
Authentication
(Appears on: OrganizationSpec)
Field | Description |
---|---|
oidc OIDCConfig | OIDCConfig configures the OIDC provider. |
scim SCIMConfig | SCIMConfig configures the SCIM client. |
Cluster
Cluster is the Schema for the clusters API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec ClusterSpec | |
status ClusterStatus |
ClusterAccessMode
(string alias)
(Appears on: ClusterSpec)
ClusterAccessMode configures the access mode to the customer cluster.
ClusterConditionType
(string alias)
ClusterConditionType is a valid condition of a cluster.
ClusterKubeConfig
(Appears on: ClusterSpec)
ClusterKubeConfig configures kube config values.
Field | Description |
---|---|
maxTokenValidity int32 | MaxTokenValidity specifies the maximum duration for which a token remains valid in hours. |
ClusterKubeconfig
ClusterKubeconfig is the Schema for the clusterkubeconfigs API. ObjectMeta.OwnerReferences is used to link the ClusterKubeconfig to the Cluster. ObjectMeta.Generation is used to detect changes in the ClusterKubeconfig and sync local kubeconfig files. ObjectMeta.Name is designed to be the same as the Cluster name.
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec ClusterKubeconfigSpec | |
status ClusterKubeconfigStatus |
ClusterKubeconfigAuthInfo
(Appears on: ClusterKubeconfigAuthInfoItem)
Field | Description |
---|---|
auth-provider k8s.io/client-go/tools/clientcmd/api.AuthProviderConfig | |
client-certificate-data []byte | |
client-key-data []byte |
ClusterKubeconfigAuthInfoItem
(Appears on: ClusterKubeconfigData)
Field | Description |
---|---|
name string | |
user ClusterKubeconfigAuthInfo |
ClusterKubeconfigCluster
(Appears on: ClusterKubeconfigClusterItem)
Field | Description |
---|---|
server string | |
certificate-authority-data []byte |
ClusterKubeconfigClusterItem
(Appears on: ClusterKubeconfigData)
Field | Description |
---|---|
name string | |
cluster ClusterKubeconfigCluster |
ClusterKubeconfigContext
(Appears on: ClusterKubeconfigContextItem)
Field | Description |
---|---|
cluster string | |
user string | |
namespace string |
ClusterKubeconfigContextItem
(Appears on: ClusterKubeconfigData)
Field | Description |
---|---|
name string | |
context ClusterKubeconfigContext |
ClusterKubeconfigData
(Appears on: ClusterKubeconfigSpec)
ClusterKubeconfigData stores the kubeconfig data ready to use with kubectl or other local tooling. It is a simplified version of clientcmdapi.Config: https://pkg.go.dev/k8s.io/client-go/tools/clientcmd/api#Config
Field | Description |
---|---|
kind string | |
apiVersion string | |
clusters []ClusterKubeconfigClusterItem | |
users []ClusterKubeconfigAuthInfoItem | |
contexts []ClusterKubeconfigContextItem | |
current-context string | |
preferences ClusterKubeconfigPreferences |
ClusterKubeconfigPreferences
(Appears on: ClusterKubeconfigData)
ClusterKubeconfigSpec
(Appears on: ClusterKubeconfig)
ClusterKubeconfigSpec stores the kubeconfig data for the cluster. The idea is to use kubeconfig data locally with minimum effort (with local tools or plain kubectl): kubectl get cluster-kubeconfig $NAME -o yaml | yq -y .spec.kubeconfig
Field | Description |
---|---|
kubeconfig ClusterKubeconfigData |
ClusterKubeconfigStatus
(Appears on: ClusterKubeconfig)
Field | Description |
---|---|
statusConditions StatusConditions |
ClusterOptionOverride
(Appears on: PluginPresetSpec)
ClusterOptionOverride defines which plugin options should be overridden in which cluster
Field | Description |
---|---|
clusterName string | |
overrides []PluginOptionValue |
ClusterSelector
ClusterSelector specifies a selector for clusters by name or by label with the option to exclude specific clusters.
Field | Description |
---|---|
clusterName string | Name of a single Cluster to select. |
labelSelector Kubernetes meta/v1.LabelSelector | LabelSelector is a label query over a set of Clusters. |
excludeList []string | ExcludeList is a list of Cluster names to exclude from LabelSelector query. |
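How the three ClusterSelector fields might interact can be sketched as below. This is an assumption-laden illustration, not the Greenhouse implementation: it assumes that `clusterName` takes precedence over `labelSelector`, models only equality-based label matching (no match expressions), and uses the hypothetical function name `select_clusters`.

```python
from typing import Optional


def select_clusters(
    clusters: dict[str, dict[str, str]],
    cluster_name: Optional[str] = None,
    match_labels: Optional[dict[str, str]] = None,
    exclude_list: Optional[list[str]] = None,
) -> list[str]:
    """Return the sorted names of the selected clusters.

    `clusters` maps each cluster name to its labels.
    """
    if cluster_name:
        # Assumption: selecting by name wins and picks at most one cluster.
        return [cluster_name] if cluster_name in clusters else []
    if match_labels is None:
        # Assumption: no selector at all selects nothing.
        return []
    # The exclude list only filters the result of the label query.
    excluded = set(exclude_list or [])
    return sorted(
        name
        for name, labels in clusters.items()
        if name not in excluded
        and all(labels.get(k) == v for k, v in match_labels.items())
    )
```

For example, `select_clusters(clusters, match_labels={"env": "prod"}, exclude_list=["b"])` would return every cluster labeled `env=prod` except the one named `b`.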
ClusterSpec
(Appears on: Cluster)
ClusterSpec defines the desired state of the Cluster.
Field | Description |
---|---|
accessMode ClusterAccessMode | AccessMode configures how the cluster is accessed from the Greenhouse operator. |
kubeConfig ClusterKubeConfig | KubeConfig contains specific values for |
ClusterStatus
(Appears on: Cluster)
ClusterStatus defines the observed state of Cluster
Field | Description |
---|---|
kubernetesVersion string | KubernetesVersion reflects the detected Kubernetes version of the cluster. |
bearerTokenExpirationTimestamp Kubernetes meta/v1.Time | BearerTokenExpirationTimestamp reflects the expiration timestamp of the bearer token used to access the cluster. |
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the Cluster. |
nodes map[string]./pkg/apis/greenhouse/v1alpha1.NodeStatus | Nodes provides a map of cluster node names to node statuses |
Condition
(Appears on: PropagationStatus, StatusConditions)
Condition contains additional information on the state of a resource.
Field | Description |
---|---|
type ConditionType | Type of the condition. |
status Kubernetes meta/v1.ConditionStatus | Status of the condition. |
reason ConditionReason | Reason is a one-word, CamelCase reason for the condition’s last transition. |
lastTransitionTime Kubernetes meta/v1.Time | LastTransitionTime is the last time the condition transitioned from one status to another. |
message string | Message is an optional human readable message indicating details about the last transition. |
ConditionReason
(string alias)
(Appears on: Condition)
ConditionReason is a valid reason for a condition of a resource.
ConditionType
(string alias)
(Appears on: Condition)
ConditionType is a valid condition of a resource.
HelmChartReference
(Appears on: PluginDefinitionSpec, PluginStatus)
HelmChartReference references a Helm Chart in a chart repository.
Field | Description |
---|---|
name string | Name of the HelmChart chart. |
repository string | Repository of the HelmChart chart. |
version string | Version of the HelmChart chart. |
HelmReleaseStatus
(Appears on: PluginStatus)
HelmReleaseStatus reflects the status of a Helm release.
Field | Description |
---|---|
status string | Status is the status of a HelmChart release. |
firstDeployed Kubernetes meta/v1.Time | FirstDeployed is the timestamp of the first deployment of the release. |
lastDeployed Kubernetes meta/v1.Time | LastDeployed is the timestamp of the last deployment of the release. |
NodeStatus
(Appears on: ClusterStatus)
Field | Description |
---|---|
statusConditions StatusConditions | We mirror the node conditions here for faster reference |
ready bool | Fast track to the node ready condition. |
OIDCConfig
(Appears on: Authentication)
Field | Description |
---|---|
issuer string | Issuer is the URL of the identity service. |
redirectURI string | RedirectURI is the redirect URI. If none is specified, the Greenhouse ID proxy will be used. |
clientIDReference SecretKeyReference | ClientIDReference references the Kubernetes secret containing the client id. |
clientSecretReference SecretKeyReference | ClientSecretReference references the Kubernetes secret containing the client secret. |
Organization
Organization is the Schema for the organizations API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec OrganizationSpec | |
status OrganizationStatus |
OrganizationSpec
(Appears on: Organization)
OrganizationSpec defines the desired state of Organization
Field | Description |
---|---|
displayName string | DisplayName is an optional name for the organization to be displayed in the Greenhouse UI. Defaults to a normalized version of metadata.name. |
authentication Authentication | Authentication configures the organizations authentication mechanism. |
description string | Description provides additional details of the organization. |
mappedOrgAdminIdPGroup string | MappedOrgAdminIDPGroup is the IDP group ID identifying org admins |
OrganizationStatus
(Appears on: Organization)
OrganizationStatus defines the observed state of an Organization
Field | Description |
---|---|
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the Organization. |
Plugin
Plugin is the Schema for the plugins API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec PluginSpec | |
status PluginStatus |
PluginDefinition
PluginDefinition is the Schema for the PluginDefinitions API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec PluginDefinitionSpec | |
status PluginDefinitionStatus |
PluginDefinitionSpec
(Appears on: PluginDefinition)
PluginDefinitionSpec defines the desired state of PluginDefinition
Field | Description |
---|---|
displayName string | DisplayName provides a human-readable label for the pluginDefinition. |
description string | Description provides additional details of the pluginDefinition. |
helmChart HelmChartReference | HelmChart specifies where the Helm Chart for this pluginDefinition can be found. |
uiApplication UIApplicationReference | UIApplication specifies a reference to a UI application |
options []PluginOption | RequiredValues is a list of values required to create an instance of this PluginDefinition. |
version string | Version of this pluginDefinition |
weight int32 | Weight configures the order in which Plugins are shown in the Greenhouse UI. Defaults to alphabetical sorting if not provided or on conflict. |
icon string | Icon specifies the icon to be used for this plugin in the Greenhouse UI. Icons can be either: - A string representing a juno icon in camel case from this list: https://github.com/sapcc/juno/blob/main/libs/juno-ui-components/src/components/Icon/Icon.component.js#L6-L52 - A publicly accessible image reference to a .png file. Will be displayed 100x100px |
docMarkDownUrl string | DocMarkDownUrl specifies the URL to the markdown documentation file for this plugin. Source needs to allow all CORS origins. |
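The ordering rule described for `weight` (lower weights first, alphabetical sorting when the weight is missing or tied) could be sketched like this. Placing unweighted entries after weighted ones is an assumption, as the text does not say where they land, and `order_plugins` is a hypothetical helper, not part of the Greenhouse UI code.

```python
from typing import Optional


def order_plugins(plugins: list[tuple[str, Optional[int]]]) -> list[str]:
    """Order (display name, weight) pairs for display."""

    def sort_key(item: tuple[str, Optional[int]]):
        name, weight = item
        # Weighted entries first (False < True), then by weight, then by name.
        # Assumption: entries without a weight sort after weighted ones.
        return (weight is None, weight if weight is not None else 0, name.lower())

    return [name for name, _ in sorted(plugins, key=sort_key)]
```

Using a tuple as the sort key makes the alphabetical fallback apply automatically on weight conflicts, since ties on the earlier tuple elements are broken by the name.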
PluginDefinitionStatus
(Appears on: PluginDefinition)
PluginDefinitionStatus defines the observed state of PluginDefinition
PluginOption
(Appears on: PluginDefinitionSpec)
Field | Description |
---|---|
name string | Name/Key of the config option. |
default k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON | (Optional) Default provides a default value for the option |
description string | Description provides a human-readable text for the value as shown in the UI. |
displayName string | DisplayName provides a human-readable label for the configuration option |
required bool | Required indicates that this config option is required |
type PluginOptionType | Type of this configuration option. |
regex string | Regex specifies a match rule for validating configuration options. |
PluginOptionType
(string alias)
(Appears on: PluginOption)
PluginOptionType specifies the type of PluginOption.
PluginOptionValue
(Appears on: ClusterOptionOverride, PluginSpec)
PluginOptionValue is the value for a PluginOption.
Field | Description |
---|---|
name string | Name of the values. |
value k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON | Value is the actual value in plain text. |
valueFrom ValueFromSource | ValueFrom references a potentially confidential value in another source. |
PluginPreset
PluginPreset is the Schema for the PluginPresets API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec PluginPresetSpec | |
status PluginPresetStatus |
PluginPresetSpec
(Appears on: PluginPreset)
PluginPresetSpec defines the desired state of PluginPreset
Field | Description |
---|---|
plugin PluginSpec | PluginSpec is the spec of the plugin to be deployed by the PluginPreset. |
clusterSelector Kubernetes meta/v1.LabelSelector | ClusterSelector is a label selector to select the clusters the plugin bundle should be deployed to. |
clusterOptionOverrides []ClusterOptionOverride | ClusterOptionOverrides define plugin option values to override by the PluginPreset |
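One plausible reading of how `clusterOptionOverrides` combine with the option values of the preset's plugin spec is sketched below. The merge semantics (an override replaces a base value with the same `name`, or is appended if no such value exists) are an assumption, and `resolve_option_values` is a hypothetical helper operating on plain dictionaries shaped like the API objects above.

```python
from typing import Any


def resolve_option_values(
    option_values: list[dict[str, Any]],
    cluster_option_overrides: list[dict[str, Any]],
    cluster_name: str,
) -> list[dict[str, Any]]:
    """Merge the preset's base option values with overrides for one cluster."""
    # Index base values by option name; insertion order is preserved.
    merged = {item["name"]: item for item in option_values}
    for override in cluster_option_overrides:
        if override["clusterName"] != cluster_name:
            continue
        for item in override.get("overrides", []):
            # Assumption: an override replaces a base value of the same
            # name, or is added if the base has no such option.
            merged[item["name"]] = item
    return list(merged.values())
```

Clusters without a matching `clusterName` entry simply receive the base option values unchanged.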
PluginPresetStatus
(Appears on: PluginPreset)
PluginPresetStatus defines the observed state of PluginPreset
Field | Description |
---|---|
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the PluginPreset. |
PluginSpec
(Appears on: Plugin, PluginPresetSpec)
PluginSpec defines the desired state of Plugin
Field | Description |
---|---|
pluginDefinition string | PluginDefinition is the name of the PluginDefinition this instance is for. |
displayName string | DisplayName is an optional name for the Plugin to be displayed in the Greenhouse UI. This is especially helpful to distinguish multiple instances of a PluginDefinition in the same context. Defaults to a normalized version of metadata.name. |
disabled bool | Disabled indicates that the plugin is administratively disabled. |
optionValues []PluginOptionValue | Values are the values for a PluginDefinition instance. |
clusterName string | ClusterName is the name of the cluster the plugin is deployed to. If not set, the plugin is deployed to the greenhouse cluster. |
releaseNamespace string | ReleaseNamespace is the namespace in the remote cluster to which the backend is deployed. Defaults to the Greenhouse managed namespace if not set. |
PluginStatus
(Appears on: Plugin)
PluginStatus defines the observed state of Plugin
Field | Description |
---|---|
helmReleaseStatus HelmReleaseStatus | HelmReleaseStatus reflects the status of the latest HelmChart release. This is only configured if the pluginDefinition is backed by HelmChart. |
version string | Version contains the latest pluginDefinition version the config was last applied with successfully. |
helmChart HelmChartReference | HelmChart contains a reference to the Helm chart used for the deployed pluginDefinition version. |
uiApplication UIApplicationReference | UIApplication contains a reference to the frontend that is used for the deployed pluginDefinition version. |
weight int32 | Weight configures the order in which Plugins are shown in the Greenhouse UI. |
description string | Description provides additional details of the plugin. |
exposedServices map[string]./pkg/apis/greenhouse/v1alpha1.Service | ExposedServices provides an overview of the Plugin’s services that are centrally exposed. It maps the exposed URL to the service found in the manifest. |
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the Plugin. |
PropagationStatus
(Appears on: TeamRoleBindingStatus)
PropagationStatus defines the observed state of the TeamRoleBinding’s associated rbacv1 resources on a Cluster
Field | Description |
---|---|
clusterName string | ClusterName is the name of the cluster the rbacv1 resources are created on. |
condition Condition | Condition is the overall Status of the rbacv1 resources created on the cluster |
SCIMConfig
(Appears on: Authentication)
Field | Description |
---|---|
baseURL string | URL to the SCIM server. |
basicAuthUser ValueFromSource | User to be used for basic authentication. |
basicAuthPw ValueFromSource | Password to be used for basic authentication. |
SecretKeyReference
(Appears on: OIDCConfig, ValueFromSource)
SecretKeyReference specifies the secret and key containing the value.
Field | Description |
---|---|
name string | Name of the secret in the same namespace. |
key string | Key in the secret to select the value from. |
Service
(Appears on: PluginStatus)
Service references a Kubernetes service of a Plugin.
Field | Description |
---|---|
namespace string | Namespace is the namespace of the service in the target cluster. |
name string | Name is the name of the service in the target cluster. |
port int32 | Port is the port of the service. |
protocol string | Protocol is the protocol of the service. |
StatusConditions
(Appears on: ClusterKubeconfigStatus, ClusterStatus, NodeStatus, OrganizationStatus, PluginPresetStatus, PluginStatus, TeamMembershipStatus, TeamRoleBindingStatus, TeamStatus)
A StatusConditions contains a list of conditions. Only one condition of a given type may exist in the list.
Field | Description |
---|---|
conditions []Condition |
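The "only one condition of a given type" rule, together with the meaning of LastTransitionTime, can be illustrated with a small upsert helper. This is a sketch under assumptions rather than the Greenhouse code: the `Condition` dataclass mirrors the fields documented above in simplified form, and `set_condition` is a hypothetical name.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Condition:
    type: str
    status: str  # "True", "False" or "Unknown"
    reason: str = ""
    message: str = ""
    last_transition_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def set_condition(conditions: list[Condition], new: Condition) -> None:
    """Insert or update `new` in place, keeping at most one condition per type."""
    for i, existing in enumerate(conditions):
        if existing.type == new.type:
            if existing.status == new.status:
                # No status transition: keep the original transition time
                # (this mutates `new` before it is stored).
                new.last_transition_time = existing.last_transition_time
            conditions[i] = new
            return
    conditions.append(new)
```

Replacing by type instead of appending keeps the list small and lets consumers look up a condition without handling duplicates.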
Team
Team is the Schema for the teams API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamSpec | |
status TeamStatus |
TeamMembership
TeamMembership is the Schema for the teammemberships API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamMembershipSpec | |
status TeamMembershipStatus | |
TeamMembershipSpec
(Appears on: TeamMembership)
TeamMembershipSpec defines the desired state of TeamMembership
Field | Description |
---|---|
members []User | (Optional) Members list users that are part of a team. |
TeamMembershipStatus
(Appears on: TeamMembership)
TeamMembershipStatus defines the observed state of TeamMembership
Field | Description |
---|---|
lastSyncedTime Kubernetes meta/v1.Time | (Optional) LastSyncedTime is the time the membership was last synced. |
lastUpdateTime Kubernetes meta/v1.Time | (Optional) LastChangedTime is the time the membership was last actually changed. |
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the TeamMembership. |
TeamRole
TeamRole is the Schema for the TeamRoles API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamRoleSpec | |
status TeamRoleStatus | |
TeamRoleBinding
TeamRoleBinding is the Schema for the rolebindings API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamRoleBindingSpec | |
status TeamRoleBindingStatus | |
TeamRoleBindingSpec
(Appears on: TeamRoleBinding)
TeamRoleBindingSpec defines the desired state of a TeamRoleBinding
Field | Description |
---|---|
teamRoleRef string | TeamRoleRef references a Greenhouse TeamRole by name |
teamRef string | TeamRef references a Greenhouse Team by name |
clusterName string | ClusterName is the name of the cluster the rbacv1 resources are created on. |
clusterSelector Kubernetes meta/v1.LabelSelector | ClusterSelector is a label selector to select the Clusters the TeamRoleBinding should be deployed to. |
namespaces []string | Namespaces is a list of namespaces in the Greenhouse Clusters to apply the RoleBinding to. If empty, a ClusterRoleBinding will be created on the remote cluster, otherwise a RoleBinding per namespace. |
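A minimal sketch of a TeamRoleBinding using the fields above. It assumes the same apiVersion as the Plugin examples in this documentation; all names and the namespace are hypothetical. Because namespaces is set, a RoleBinding per listed namespace would be created on the remote cluster rather than a ClusterRoleBinding.

```yaml
apiVersion: greenhouse.sap/v1alpha1   # assumed, matching the Plugin examples
kind: TeamRoleBinding
metadata:
  name: example-team-pod-reader       # hypothetical
  namespace: example-org              # hypothetical organization namespace
spec:
  teamRoleRef: pod-reader             # name of a Greenhouse TeamRole
  teamRef: example-team               # name of a Greenhouse Team
  clusterName: example-cluster
  namespaces:
    - example-namespace
```

Leaving namespaces empty would, according to the field description, result in a ClusterRoleBinding instead.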
TeamRoleBindingStatus
(Appears on: TeamRoleBinding)
TeamRoleBindingStatus defines the observed state of the TeamRoleBinding
Field | Description |
---|---|
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the TeamRoleBinding. |
clusters []PropagationStatus | PropagationStatus is the list of clusters the TeamRoleBinding is applied to |
TeamRoleSpec
(Appears on: TeamRole)
TeamRoleSpec defines the desired state of a TeamRole
Field | Description |
---|---|
rules []Kubernetes rbac/v1.PolicyRule | Rules is a list of rbacv1.PolicyRules used on a managed RBAC (Cluster)Role |
aggregationRule Kubernetes rbac/v1.AggregationRule | AggregationRule describes how to locate ClusterRoles to aggregate into the ClusterRole on the remote cluster |
labels map[string]string | Labels are applied to the ClusterRole created on the remote cluster. This allows using TeamRoles as part of AggregationRules by other TeamRoles |
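A TeamRole could be sketched as follows. The rules entries use the standard Kubernetes rbac/v1 PolicyRule fields; the apiVersion is assumed to match the Plugin examples, and the resource name, namespace and label are hypothetical.

```yaml
apiVersion: greenhouse.sap/v1alpha1   # assumed, matching the Plugin examples
kind: TeamRole
metadata:
  name: pod-reader                    # hypothetical
  namespace: example-org              # hypothetical organization namespace
spec:
  rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "list", "watch"]
  labels:
    # Applied to the ClusterRole on the remote cluster, so that another
    # TeamRole's aggregationRule can select it. Hypothetical label.
    example.com/aggregate-to-viewer: "true"
```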
TeamRoleStatus
(Appears on: TeamRole)
TeamRoleStatus defines the observed state of a TeamRole
TeamSpec
(Appears on: Team)
TeamSpec defines the desired state of Team
Field | Description |
---|---|
description string | Description provides additional details of the team. |
mappedIdPGroup string | IdP group id matching team. |
joinUrl string | URL to join the IdP group. |
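For orientation, a Team resource built from these fields might look like the sketch below. The apiVersion is assumed to match the Plugin examples in this documentation; the names, IdP group and URL are placeholders.

```yaml
apiVersion: greenhouse.sap/v1alpha1   # assumed, matching the Plugin examples
kind: Team
metadata:
  name: example-team                  # hypothetical
  namespace: example-org              # hypothetical organization namespace
spec:
  description: Example team owning the demo clusters
  mappedIdPGroup: EXAMPLE_IDP_GROUP   # placeholder IdP group id
  joinUrl: https://idp.example.com/groups/EXAMPLE_IDP_GROUP/join
```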
TeamStatus
(Appears on: Team)
TeamStatus defines the observed state of Team
Field | Description |
---|---|
statusConditions StatusConditions | |
members []User |
UIApplicationReference
(Appears on: PluginDefinitionSpec, PluginStatus)
UIApplicationReference references the UI pluginDefinition to use.
Field | Description |
---|---|
url string | URL specifies the url to a built javascript asset. By default, assets are loaded from the Juno asset server using the provided name and version. |
name string | Name of the UI application. |
version string | Version of the frontend application. |
User
(Appears on: TeamMembershipSpec, TeamStatus)
User specifies a human person.
Field | Description |
---|---|
id string | ID is the unique identifier of the user. |
firstName string | FirstName of the user. |
lastName string | LastName of the user. |
email string | Email of the user. |
ValueFromSource
(Appears on: PluginOptionValue, SCIMConfig)
ValueFromSource is a valid source for a value.
Field | Description |
---|---|
secret SecretKeyReference | Secret references the secret containing the value. |
This page was automatically generated with gen-crd-api-reference-docs
4.2 - Plugin Catalog
This section provides an overview of the available PluginDefinitions in Greenhouse.
4.2.1 - Alerts
Learn more about the alerts plugin. Use it to activate Prometheus alert management for your Greenhouse organisation.
The main terminologies used in this document can be found in core-concepts.
Overview
This Plugin includes a preconfigured Prometheus Alertmanager, which is deployed and managed via the Prometheus Operator, and Supernova, an advanced user interface for Prometheus Alertmanager. Certificates are automatically generated to enable sending alerts from Prometheus to Alertmanager. These alerts can also be sent as Slack notifications with a provided set of notification templates.
Components included in this Plugin:
This Plugin is usually deployed alongside the kube-monitoring Plugin and does not deploy the Prometheus Operator itself. However, if you intend to use it stand-alone, you need to explicitly enable the deployment of the Prometheus Operator; otherwise it will not work. This can be done in the configuration interface of the plugin.
Disclaimer
This is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the plugin according to your needs.
The Plugin is a deeply configured kube-prometheus-stack Helm chart which helps to keep track of versions and community updates.
It is intended as a platform that can be extended by following the guide.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to use alerts as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- The kube-monitoring plugin (which brings in the Prometheus Operator), or, for stand-alone use, the deployment of the Prometheus Operator enabled in this plugin.
Step 1:
You can install the alerts package in your cluster with Helm manually or let the Greenhouse platform lifecycle it for you automatically. For the latter, you can either:
- Go to Greenhouse dashboard and select the Alerts Plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
Step 2:
After the installation, you can access the Supernova UI by navigating to the Alerts tab in the Greenhouse dashboard.
Step 3:
Greenhouse regularly performs integration tests that are bundled with alerts. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
Configuration
Prometheus Alertmanager options
Name | Description | Value |
---|---|---|
alerts.commonLabels | Labels to apply to all resources | {} |
alerts.alertmanager.enabled | Deploy Prometheus Alertmanager | true |
alerts.alertmanager.annotations | Annotations for Alertmanager | {} |
alerts.alertmanager.config | Alertmanager configuration directives. | {} |
alerts.alertmanager.ingress.enabled | Deploy Alertmanager Ingress | false |
alerts.alertmanager.ingress.hosts | Must be provided if Ingress is enabled. | [] |
alerts.alertmanager.ingress.tls | Must be a valid TLS configuration for Alertmanager Ingress. Supernova UI passes the client certificate to retrieve alerts. | {} |
alerts.alertmanager.ingress.ingressClassname | Specifies the ingress-controller | nginx |
alerts.alertmanager.servicemonitor.additionalLabels | kube-monitoring plugin: <plugin.name> to scrape Alertmanager metrics. | {} |
alerts.alertmanager.alertmanagerConfig.slack.routes[].name | Name of the Slack route. | "" |
alerts.alertmanager.alertmanagerConfig.slack.routes[].channel | Slack channel to post alerts to. Must be defined with slack.webhookURL . | "" |
alerts.alertmanager.alertmanagerConfig.slack.routes[].webhookURL | Slack webhookURL to post alerts to. Must be defined with slack.channel . | "" |
alerts.alertmanager.alertmanagerConfig.slack.routes[].matchers | List of matchers that the alert’s label should match. matchType , name , regex , value | [] |
alerts.alertmanager.alertmanagerConfig.webhook.routes[].name | Name of the webhook route. | "" |
alerts.alertmanager.alertmanagerConfig.webhook.routes[].url | Webhook url to post alerts to. | "" |
alerts.alertmanager.alertmanagerConfig.webhook.routes[].matchers | List of matchers that the alert’s label should match. matchType , name , regex , value | [] |
alerts.auth.secretName | Use custom secret for Alertmanager authentication | "" |
alerts.auth.autoGenerateCert.enabled | TLS Certificate Option 1: Use Helm to automatically generate self-signed certificate. | true |
alerts.auth.autoGenerateCert.recreate | If set to true, new key/certificate is generated on Helm upgrade. | false |
alerts.auth.autoGenerateCert.certPeriodDays | Cert period time in days. The default is 365 days. | 365 |
alerts.auth.certFile | Path to your own PEM-encoded certificate. | "" |
alerts.auth.keyFile | Path to your own PEM-encoded private key. | "" |
alerts.auth.caFile | Path to CA cert. | "" |
alerts.defaultRules.create | Creates community Alertmanager alert rules. | true |
alerts.defaultRules.labels | kube-monitoring plugin: <plugin.name> to evaluate Alertmanager rules. | {} |
alerts.alertmanager.alertmanagerSpec.alertmanagerConfiguration | AlertmanagerConfig to be used as top level configuration | false |
Supernova options
theme: Override the default theme. Possible values are "theme-light" or "theme-dark" (default).
endpoint: Alertmanager API endpoint URL, including the /api/v2 path. The host should be one of alerts.alertmanager.ingress.hosts.
silenceExcludedLabels: SilenceExcludedLabels are labels that are excluded by default when creating a silence. However, they can be added if necessary via the advanced options in the silence form. The labels must be an array of strings. Example: ["pod", "pod_name", "instance"]
filterLabels: FilterLabels are the labels shown in the filter dropdown, enabling users to filter alerts based on specific criteria. The ‘Status’ label serves as a default filter; it is automatically computed from the alert status attribute and will not be overwritten. The labels must be an array of strings. Example: ["app", "cluster", "cluster_type"]
predefinedFilters: PredefinedFilters are filters applied in the UI to differentiate between contexts by matching alerts with regular expressions. They are loaded by default when the application is loaded. The format is a list of objects including name, displayName and matchers (keys with their corresponding values). Example:
[
  {
    "name": "prod",
    "displayName": "Productive System",
    "matchers": {
      "region": "^prod-.*"
    }
  }
]
silenceTemplates: SilenceTemplates are used in the schedule-silence modal to offer pre-defined silences for scheduling maintenance windows. The format consists of a list of objects including description, editable_labels (array of strings specifying the labels that users can modify), fixed_labels (map containing fixed labels and their corresponding values), status, and title. Example:
"silenceTemplates": [
  {
    "description": "Description of the silence template",
    "editable_labels": ["region"],
    "fixed_labels": {
      "name": "Marvin"
    },
    "status": "active",
    "title": "Silence"
  }
]
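The Supernova options can presumably be set like any other Plugin option. The fragment below is a sketch that assumes they are passed as top-level option names, in the same way endpoint and filterLabels appear in the Plugin examples of this section; the filter values are taken from the predefinedFilters example above.

```yaml
# Fragment of a Plugin spec; assumes Supernova options are plain option names.
optionValues:
  - name: theme
    value: theme-light
  - name: predefinedFilters
    value:
      - name: prod
        displayName: Productive System
        matchers:
          region: "^prod-.*"
```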
Managing Alertmanager configuration
ref:
- https://prometheus.io/docs/alerting/configuration/#configuration-file
- https://prometheus.io/webtools/alerting/routing-tree-editor/
By default, the Alertmanager instances will start with a minimal configuration which isn’t really useful since it doesn’t send any notification when receiving alerts.
You have multiple options to provide the Alertmanager configuration:
- You can use alerts.alertmanager.config to define an Alertmanager configuration. Example below.
config:
  global:
    resolve_timeout: 5m
  inhibit_rules:
    - source_matchers:
        - "severity = critical"
      target_matchers:
        - "severity =~ warning|info"
      equal:
        - "namespace"
        - "alertname"
    - source_matchers:
        - "severity = warning"
      target_matchers:
        - "severity = info"
      equal:
        - "namespace"
        - "alertname"
    - source_matchers:
        - "alertname = InfoInhibitor"
      target_matchers:
        - "severity = info"
      equal:
        - "namespace"
  route:
    group_by: ["namespace"]
    group_wait: 30s
    group_interval: 5m
    repeat_interval: 12h
    receiver: "null"
    routes:
      - receiver: "null"
        matchers:
          - alertname =~ "InfoInhibitor|Watchdog"
  receivers:
    - name: "null"
  templates:
    - "/etc/alertmanager/config/*.tmpl"
- You can discover AlertmanagerConfig objects. The spec.alertmanagerConfigSelector is always set to matchLabels: plugin: <name> to tell the operator which AlertmanagerConfig objects should be selected and merged with the main Alertmanager configuration. Note: The default strategy for an AlertmanagerConfig object to match alerts is OnNamespace.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: config-example
  labels:
    alertmanagerConfig: example
    pluginDefinition: alerts-example
spec:
  route:
    groupBy: ["job"]
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: "webhook"
  receivers:
    - name: "webhook"
      webhookConfigs:
        - url: "http://example.com/"
- You can use alerts.alertmanager.alertmanagerSpec.alertmanagerConfiguration to reference an AlertmanagerConfig object in the same namespace which defines the main Alertmanager configuration.
# Example selecting a global AlertmanagerConfig
alertmanagerConfiguration:
  name: global-alertmanager-configuration
TLS Certificate Requirement
Greenhouse-onboarded Prometheus installations need to communicate with the Alertmanager component to enable advanced processing of alerts. The Alertmanager Ingress requires a TLS certificate that is configured and trusted by Prometheus to secure this communication. There are various ways in which you can generate and configure the required TLS certificate.
- You can use an automatically generated self-signed certificate by setting alerts.auth.autoGenerateCert.enabled to true. Helm will create a self-signed cert and a secret for you.
- You can use your own generated self-signed certificate by setting alerts.auth.autoGenerateCert.enabled to false. You should provide the necessary values to alerts.auth.certFile, alerts.auth.keyFile, and alerts.auth.caFile.
- You can also sideload a custom certificate by setting alerts.auth.autoGenerateCert.enabled to false while setting your custom cert secret name in alerts.auth.secretName.
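As a sketch of the sideloading option, the following Plugin option values disable certificate generation and point to an existing Secret. The option names are those from the configuration table; the Secret name is hypothetical and the Secret is assumed to already exist.

```yaml
# Fragment of a Plugin spec for the alerts Plugin.
optionValues:
  - name: alerts.auth.autoGenerateCert.enabled
    value: false
  - name: alerts.auth.secretName
    value: tls-alertmanager-custom   # hypothetical, pre-existing Secret
```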
Examples
Deploy alerts with Alertmanager
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: alerts
spec:
  pluginDefinition: alerts
  disabled: false
  displayName: Alerts
  optionValues:
    - name: alerts.alertmanager.enabled
      value: true
    - name: alerts.alertmanager.ingress.enabled
      value: true
    - name: alerts.alertmanager.ingress.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanager.ingress.tls
      value:
        - hosts:
            - alertmanager.dns.example.com
          secretName: tls-alertmanager-dns-example-com
    - name: alerts.alertmanagerConfig.slack.routes
      value:
        - channel: slack-warning-channel
          webhookURL: https://hooks.slack.com/services/some-id
          matchers:
            - name: severity
              matchType: "="
              value: "warning"
        - channel: slack-critical-channel
          webhookURL: https://hooks.slack.com/services/some-id
          matchers:
            - name: severity
              matchType: "="
              value: "critical"
    - name: alerts.alertmanagerConfig.webhook.routes
      value:
        - name: webhook-route
          url: https://some-webhook-url
          matchers:
            - name: alertname
              matchType: "=~"
              value: ".*"
    - name: alerts.alertmanager.serviceMonitor.additionalLabels
      value:
        plugin: kube-monitoring
    - name: alerts.defaultRules.create
      value: true
    - name: alerts.defaultRules.labels
      value:
        plugin: kube-monitoring
    - name: endpoint
      value: https://alertmanager.dns.example.com/api/v2
    - name: filterLabels
      value:
        - job
        - severity
        - status
    - name: silenceExcludedLabels
      value:
        - pod
        - pod_name
        - instance
Deploy alerts without Alertmanager (Bring your own Alertmanager - Supernova UI only)
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: alerts
spec:
  pluginDefinition: alerts
  disabled: false
  displayName: Alerts
  optionValues:
    - name: alerts.alertmanager.enabled
      value: false
    - name: alerts.alertmanager.ingress.enabled
      value: false
    - name: alerts.defaultRules.create
      value: false
    - name: endpoint
      value: https://alertmanager.dns.example.com/api/v2
    - name: filterLabels
      value:
        - job
        - severity
        - status
    - name: silenceExcludedLabels
      value:
        - pod
        - pod_name
        - instance
4.2.2 - Cert-manager
This Plugin provides the cert-manager to automate the management of TLS certificates.
Configuration
This section highlights configuration of selected Plugin features.
All available configuration options are described in the plugin.yaml.
Ingress shim
An Ingress resource in Kubernetes configures external access to services in a Kubernetes cluster.
Securing ingress resources with TLS certificates is a common use-case and the cert-manager can be configured to handle these via the ingress-shim component.
It can be enabled by deploying an issuer in your organization and setting the following options on this plugin.
Option | Type | Description |
---|---|---|
cert-manager.ingressShim.defaultIssuerName | string | Name of the cert-manager issuer to use for TLS certificates |
cert-manager.ingressShim.defaultIssuerKind | string | Kind of the cert-manager issuer to use for TLS certificates |
cert-manager.ingressShim.defaultIssuerGroup | string | Group of the cert-manager issuer to use for TLS certificates |
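The options above could be set on a Plugin as in the following sketch. The PluginDefinition name cert-manager and the issuer name are assumptions; ClusterIssuer and cert-manager.io are the kind and API group of cert-manager's built-in cluster-wide issuer.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: cert-manager
spec:
  pluginDefinition: cert-manager   # assumed PluginDefinition name
  disabled: false
  optionValues:
    - name: cert-manager.ingressShim.defaultIssuerName
      value: example-issuer        # hypothetical issuer deployed in your organization
    - name: cert-manager.ingressShim.defaultIssuerKind
      value: ClusterIssuer
    - name: cert-manager.ingressShim.defaultIssuerGroup
      value: cert-manager.io
```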
4.2.3 - Decentralized Observer of Policies (Violations)
This directory contains the Greenhouse plugin for the Decentralized Observer of Policies (DOOP).
DOOP
To perform automatic validations on Kubernetes objects, we run a deployment of OPA Gatekeeper in each cluster. This dashboard aggregates all policy violations reported by those Gatekeeper instances.
4.2.4 - Designate Ingress CNAME operator (DISCO)
This Plugin provides the Designate Ingress CNAME operator (DISCO) to automate management of DNS entries in OpenStack Designate for Ingress and Services in Kubernetes.
4.2.5 - DigiCert issuer
This Plugin provides the digicert-issuer, an external Issuer extending the cert-manager with the DigiCert cert-central API.
4.2.6 - External DNS
This Plugin provides the external DNS operator which synchronizes exposed Kubernetes Services and Ingresses with DNS providers.
4.2.7 - Github Guard
Github Guard Greenhouse Plugin manages Github teams, team memberships and repository & team assignments.
Hierarchy of Custom Resources
Custom Resources
Github: an installation of a Github App
apiVersion: githubguard.sap/v1
kind: Github
metadata:
  name: com
spec:
  webURL: https://github.com
  v3APIURL: https://api.github.com
  integrationID: 420328
  clientUserAgent: greenhouse-github-guard
  secret: github-com-secret
GithubOrganization with Feature & Action Flags
apiVersion: githubguard.sap/v1
kind: GithubOrganization
metadata:
  name: com--greenhouse-sandbox
  labels:
    githubguard.sap/addTeam: "true"
    githubguard.sap/removeTeam: "true"
    githubguard.sap/addOrganizationOwner: "true"
    githubguard.sap/removeOrganizationOwner: "true"
    githubguard.sap/addRepositoryTeam: "true"
    githubguard.sap/removeRepositoryTeam: "true"
    githubguard.sap/dryRun: "false"
Default team & repository assignments:
GithubTeamRepository for exception team & repository assignments
GithubUsername for external username matching
apiVersion: githubguard.sap/v1
kind: GithubUsername
metadata:
  annotations:
    # Annotation values must be strings, hence the quotes.
    last-check-timestamp: "1681614602"
  name: com-I313226
spec:
  userID: greenhouse_onuryilmaz
  githubUsername: onuryilmaz
  github: com
4.2.8 - Ingress NGINX
This plugin contains the ingress NGINX controller.
Example
To instantiate the plugin, create a Plugin like:
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: ingress-nginx
spec:
  pluginDefinition: ingress-nginx-v4.4.0
  optionValues:
    - name: controller.service.loadBalancerIP
      value: 1.2.3.4
4.2.9 - Kubernetes Monitoring
Learn more about the kube-monitoring plugin. Use it to activate Kubernetes monitoring for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
Observability is often required for operation and automation of service offerings. To get the insights provided by an application and the container runtime environment, you need telemetry data in the form of metrics or logs sent to backends such as Prometheus or OpenSearch. With the kube-monitoring Plugin, you will be able to cover the metrics part of the observability stack.
This Plugin includes a pre-configured package of components that help make getting started easy and efficient. At its core, an automated and managed Prometheus installation is provided using the prometheus-operator. This is complemented by Prometheus target configuration for the most common Kubernetes components providing metrics by default. In addition, Prometheus alerting rules and Plutono dashboards curated by cloud operators are included to provide a comprehensive monitoring solution out of the box.
Components included in this Plugin:
- Prometheus
- Prometheus Operator
- Prometheus target configuration for Kubernetes metrics APIs (e.g. kubelet, apiserver, coredns, etcd)
- Prometheus node exporter
- kube-state-metrics
- kubernetes-operations
Disclaimer
It is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the plugin according to your needs.
The Plugin is a deeply configured kube-prometheus-stack Helm chart which helps to keep track of versions and community updates.
It is intended as a platform that can be extended by following the guide.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to use kube-monitoring as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
Step 1:
You can install the kube-monitoring package in your cluster with Helm manually or let the Greenhouse platform lifecycle it for you automatically. For the latter, you can either:
- Go to Greenhouse dashboard and select the Kubernetes Monitoring plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
Step 2:
After installation, Greenhouse will provide a generated link to the Prometheus user interface. This is done via the annotation greenhouse.sap/expose: "true" at the Prometheus Service resource.
Step 3:
Greenhouse regularly performs integration tests that are bundled with kube-monitoring. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
Configuration
Global options
Name | Description | Value |
---|---|---|
global.commonLabels | Labels to add to all resources. This can be used to add a support_group or service label to all resources and alerting rules. | true |
Prometheus-operator options
Name | Description | Value |
---|---|---|
kubeMonitoring.prometheusOperator.enabled | Manages Prometheus and Alertmanager components | true |
kubeMonitoring.prometheusOperator.alertmanagerInstanceNamespaces | Filter namespaces to look for prometheus-operator Alertmanager resources | [] |
kubeMonitoring.prometheusOperator.alertmanagerConfigNamespaces | Filter namespaces to look for prometheus-operator AlertmanagerConfig resources | [] |
kubeMonitoring.prometheusOperator.prometheusInstanceNamespaces | Filter namespaces to look for prometheus-operator Prometheus resources | [] |
Kubernetes component scraper options
Name | Description | Value |
---|---|---|
kubeMonitoring.kubernetesServiceMonitors.enabled | Flag to disable all the kubernetes component scrapers | true |
kubeMonitoring.kubeApiServer.enabled | Component scraping the kube api server | true |
kubeMonitoring.kubelet.enabled | Component scraping the kubelet and kubelet-hosted cAdvisor | true |
kubeMonitoring.coreDns.enabled | Component scraping coreDns. Use either this or kubeDns | true |
kubeMonitoring.kubeEtcd.enabled | Component scraping etcd | true |
kubeMonitoring.kubeStateMetrics.enabled | Component scraping kube state metrics | true |
kubeMonitoring.nodeExporter.enabled | Deploy node exporter as a daemonset to all nodes | true |
kubeMonitoring.kubeControllerManager.enabled | Component scraping the kube controller manager | false |
kubeMonitoring.kubeScheduler.enabled | Component scraping kube scheduler | false |
kubeMonitoring.kubeProxy.enabled | Component scraping kube proxy | false |
kubeMonitoring.kubeDns.enabled | Component scraping kubeDns. Use either this or coreDns | false |
Prometheus options
Name | Description | Value |
---|---|---|
kubeMonitoring.prometheus.enabled | Deploy a Prometheus instance | true |
kubeMonitoring.prometheus.annotations | Annotations for Prometheus | {} |
kubeMonitoring.prometheus.tlsConfig.caCert | CA certificate to verify technical clients at Prometheus Ingress | Secret |
kubeMonitoring.prometheus.ingress.enabled | Deploy Prometheus Ingress | true |
kubeMonitoring.prometheus.ingress.hosts | Must be provided if Ingress is enabled. | [] |
kubeMonitoring.prometheus.ingress.ingressClassname | Specifies the ingress-controller | nginx |
kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage | How large the persistent volume should be to house the prometheus database. Default 50Gi. | "" |
kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName | The storage class to use for the persistent volume. | "" |
kubeMonitoring.prometheus.prometheusSpec.scrapeInterval | Interval between consecutive scrapes. Defaults to 30s | "" |
kubeMonitoring.prometheus.prometheusSpec.scrapeTimeout | Number of seconds to wait for target to respond before erroring | "" |
kubeMonitoring.prometheus.prometheusSpec.evaluationInterval | Interval between consecutive evaluations | "" |
kubeMonitoring.prometheus.prometheusSpec.externalLabels | External labels to add to any time series or alerts when communicating with external systems like Alertmanager | {} |
kubeMonitoring.prometheus.prometheusSpec.ruleSelector | PrometheusRules to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.serviceMonitorSelector | ServiceMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.podMonitorSelector | PodMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.probeSelector | Probes to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.scrapeConfigSelector | scrapeConfigs to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.retention | How long to retain metrics | "" |
kubeMonitoring.prometheus.prometheusSpec.logLevel | Log level to be configured for Prometheus | "" |
kubeMonitoring.prometheus.prometheusSpec.additionalScrapeConfigs | Next to ScrapeConfig CRD, you can use AdditionalScrapeConfigs, which allows specifying additional Prometheus scrape configurations | "" |
kubeMonitoring.prometheus.prometheusSpec.additionalArgs | Allows setting additional arguments for the Prometheus container | [] |
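To illustrate how the Prometheus options are used, the fragment below sets a few of them as Plugin option values. The option names come from the table above; the chosen values and the host name are examples only.

```yaml
# Fragment of a Plugin spec for the kube-monitoring Plugin.
optionValues:
  - name: kubeMonitoring.prometheus.prometheusSpec.scrapeInterval
    value: 60s
  - name: kubeMonitoring.prometheus.prometheusSpec.evaluationInterval
    value: 60s
  - name: kubeMonitoring.prometheus.prometheusSpec.logLevel
    value: info
  - name: kubeMonitoring.prometheus.ingress.enabled
    value: true
  - name: kubeMonitoring.prometheus.ingress.hosts
    value:
      - prometheus.dns.example.com   # hypothetical host
```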
Alertmanager options
Name | Description | Value |
---|---|---|
alerts.enabled | To send alerts to Alertmanager | false |
alerts.alertmanager.hosts | List of Alertmanager hosts Prometheus can send alerts to | [] |
alerts.alertmanager.tlsConfig.cert | TLS certificate for communication with Alertmanager | Secret |
alerts.alertmanager.tlsConfig.key | TLS key for communication with Alertmanager | Secret |
Service Discovery
The kube-monitoring Plugin uses a PodMonitor to automatically discover the Prometheus metrics of the Kubernetes Pods in any Namespace. The PodMonitor is configured to detect the metrics endpoint of the pods with the port name metrics and the label greenhouse/scrape: "true".
Important: The label needs to be added manually to have the pod scraped and the port name needs to match.
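A Pod that satisfies both conditions could look like the following sketch. The Pod name, image and container port are hypothetical; the label and the port name metrics are the parts the PodMonitor relies on.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  labels:
    greenhouse/scrape: "true"   # required for the Pod to be scraped
spec:
  containers:
    - name: app
      image: registry.example.com/example-app:1.0.0   # hypothetical image
      ports:
        - name: metrics          # the port name must be "metrics"
          containerPort: 8080
```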
Examples
Deploy kube-monitoring into a remote cluster
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth
Deploy Prometheus only
Example Plugin to deploy Prometheus with the kube-monitoring Plugin.
NOTE: If you are using kube-monitoring for the first time in your cluster, it is necessary to set kubeMonitoring.prometheusOperator.enabled to true.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-prometheus-name
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.defaultRules.create
      value: false
    - name: kubeMonitoring.kubernetesServiceMonitors.enabled
      value: false
    - name: kubeMonitoring.prometheusOperator.enabled
      value: false
    - name: kubeMonitoring.kubeStateMetrics.enabled
      value: false
    - name: kubeMonitoring.nodeExporter.enabled
      value: false
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth
Extension of the plugin
kube-monitoring can be extended with your own Prometheus alerting rules and target configurations via the Custom Resource Definitions (CRDs) of the Prometheus operator. The user-defined resources to be incorporated with the desired configuration are selected via label selectors.
The CRD PrometheusRule enables the definition of alerting and recording rules that can be used by Prometheus or Thanos Rule instances. Alerts and recording rules are reconciled and dynamically loaded by the operator without having to restart Prometheus or Thanos Rule.
kube-monitoring Prometheus will automatically discover and load the rules that match the label plugin: <plugin-name>.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule
  labels:
    plugin: <metadata.name>
    ## e.g. plugin: kube-monitoring
spec:
  groups:
    - name: example-group
      rules:
      ...
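As an illustration of a filled-in rule group, the following sketch defines a single alerting rule. The resource name, alert name, expression, duration, and severity label are hypothetical and not part of the kube-monitoring defaults; the plugin label assumes the Plugin's metadata.name is kube-monitoring.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-target-down
  labels:
    # Assumption: the Plugin's metadata.name is kube-monitoring.
    plugin: kube-monitoring
spec:
  groups:
    - name: example-availability
      rules:
        # Hypothetical alert: fires when a scrape target has been
        # unreachable for 10 minutes.
        - alert: ExampleTargetDown
          expr: up == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Target {{ $labels.instance }} is down"
```

Because the operator reconciles these resources dynamically, applying or updating such a rule should not require a Prometheus restart.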
The CRDs PodMonitor, ServiceMonitor, Probe and ScrapeConfig allow the definition of a set of target endpoints to be scraped by Prometheus. The operator will automatically discover and load the configurations that match the label plugin: <plugin-name>.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor
  labels:
    plugin: <metadata.name>
    ## e.g. plugin: kube-monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  podMetricsEndpoints:
    - port: http
  ...
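A ServiceMonitor follows the same pattern when the metrics endpoint is exposed through a Service rather than scraped from Pods directly. The sketch below is an assumption-based counterpart to the PodMonitor above: the resource name, selector labels, namespace, port name, path, and interval are all hypothetical, and the plugin label again assumes a Plugin named kube-monitoring.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-service-monitor
  labels:
    # Assumption: the Plugin's metadata.name is kube-monitoring.
    plugin: kube-monitoring
spec:
  # Selects Services (not Pods) carrying this label.
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  endpoints:
    # "http" must match a named port on the selected Service.
    - port: http
      path: /metrics
      interval: 30s
```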
4.2.10 - Logshipper
This Plugin is intended for shipping container and systemd logs to an Elasticsearch/OpenSearch cluster. It uses Fluent Bit to collect logs. The default configuration can be found under chart/templates/fluent-bit-configmap.yaml.
Components included in this Plugin:
Owner
- @ivogoman
Parameters
Name | Description | Value |
---|---|---|
fluent-bit.parser | Parser used for container logs: docker or cri | “cri” |
fluent-bit.backend.opensearch.host | Host for the Elastic/OpenSearch HTTP Input | |
fluent-bit.backend.opensearch.port | Port for the Elastic/OpenSearch HTTP Input | |
fluent-bit.backend.opensearch.http_user | Username for the Elastic/OpenSearch HTTP Input | |
fluent-bit.backend.opensearch.http_password | Password for the Elastic/OpenSearch HTTP Input | |
fluent-bit.filter.additionalValues | List of key-value pairs used to label logs | [] |
fluent-bit.customConfig.inputs | Multi-line string containing additional inputs | |
fluent-bit.customConfig.filters | Multi-line string containing additional filters | |
fluent-bit.customConfig.outputs | Multi-line string containing additional outputs | |
Custom Configuration
To add custom configuration to the Fluent Bit configuration, see the Fluent Bit documentation.
The fluent-bit.customConfig.inputs, fluent-bit.customConfig.filters and fluent-bit.customConfig.outputs parameters can be used to add custom configuration to the default configuration. The configuration should be added as a multi-line string.
Inputs are rendered after the default inputs. Filters are rendered after the default filters and before the additional values are added. Outputs are rendered after the default outputs.
The additional values are added to all logs regardless of the source.
Example Input configuration:
fluent-bit:
  config:
    inputs: |
      [INPUT]
          Name             tail-audit
          Path             /var/log/containers/greenhouse-controller*.log
          Parser           {{ default "cri" ( index .Values "fluent-bit" "parser" ) }}
          Tag              audit.*
          Refresh_Interval 5
          Mem_Buf_Limit    50MB
          Skip_Long_Lines  Off
          Ignore_Older     1m
          DB               /var/log/fluent-bit-tail-audit.pos.db
Logs collected by the default configuration are prefixed with default_. If logs from additional inputs are to be sent to and processed by the same filters and outputs, the prefix should be used as well.
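Additional filters can be supplied in the same multi-line string form as the input above. The sketch below assumes the value is passed through the fluent-bit.customConfig.filters parameter of a Plugin resource and uses Fluent Bit's grep filter to drop matching records. The Plugin name, the pluginDefinition name logshipper, the record key log, and the excluded pattern are illustrative assumptions, not values taken from the chart.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-logshipper          # hypothetical name
spec:
  pluginDefinition: logshipper      # assumption: name of the PluginDefinition
  optionValues:
    - name: fluent-bit.customConfig.filters
      value: |
        [FILTER]
            # Applies only to records tagged by the custom audit input.
            Name     grep
            Match    audit.*
            # Hypothetical rule: drop records whose "log" key matches "healthz".
            Exclude  log  healthz
```

Since custom filters are rendered after the default filters, a filter like this only needs to match the tags of the inputs it is meant to handle.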
If additional secrets are required, the fluent-bit.env field can be used to add them to the environment of the fluent-bit container. The secrets should be created by adding them to the fluent-bit.backend field.
fluent-bit:
  backend:
    audit:
      http_user: top-secret-audit
      http_password: top-secret-audit
      host: "audit.test"
      tls:
        enabled: true
        verify: true
        debug: false
4.2.11 - OpenTelemetry
Learn more about the OpenTelemetry Plugin. Use it to enable the ingestion, collection and export of telemetry signals (logs and metrics) for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
OpenTelemetry is an observability framework and toolkit for creating and managing telemetry data such as metrics, logs and traces. Unlike other observability tools, OpenTelemetry is vendor and tool agnostic, meaning it can be used with a variety of observability backends, including open source tools such as OpenSearch and Prometheus.
The focus of the plugin is to provide easy-to-use configurations for common use cases of receiving, processing and exporting telemetry data in Kubernetes. The storage and visualization of the same is intentionally left to other tools.
Components included in this Plugin:
Architecture
Note
It is the intention to add more configuration over time, and contributions of your very own configuration are highly appreciated. If you discover bugs or want to add functionality to the plugin, feel free to create a pull request.
Quick Start
This guide provides a quick and straightforward way to use OpenTelemetry as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- For logs, an OpenSearch instance to store them. If you don’t have one, reach out to your observability team to get access to one.
- To gather metrics, you must have a Prometheus instance in the onboarded cluster for storage and for managing Prometheus-specific CRDs. If you do not have an instance, install the kube-monitoring Plugin first.
Step 1:
You can install the OpenTelemetry package in your cluster manually with Helm, or let the Greenhouse platform lifecycle do it for you automatically. For the latter, you can either:
- Go to the Greenhouse dashboard and select the OpenTelemetry plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
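For the second option, a Plugin resource might look like the sketch below. It reuses the Plugin layout and the valueFrom.secret form shown for kube-monitoring, and the option names come from the Configuration table of this plugin. The Plugin name, the pluginDefinition name opentelemetry, the clusterName field value, and the Secret name and keys are assumptions for illustration.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-opentelemetry            # hypothetical name
spec:
  pluginDefinition: opentelemetry        # assumption: name of the PluginDefinition
  clusterName: example-cluster           # assumption: the onboarded target cluster
  disabled: false
  optionValues:
    - name: openTelemetry.logsCollector.enabled
      value: true
    - name: openTelemetry.metricsCollector.enabled
      value: true
    - name: openTelemetry.region
      value: example-region
    - name: openTelemetry.cluster
      value: example-cluster
    # Options of type "secret" are read from a Secret
    # (hypothetical Secret name and keys).
    - name: openTelemetry.openSearchLogs.endpoint
      valueFrom:
        secret:
          name: example-opensearch-credentials
          key: endpoint
    - name: openTelemetry.openSearchLogs.username
      valueFrom:
        secret:
          name: example-opensearch-credentials
          key: username
    - name: openTelemetry.openSearchLogs.password
      valueFrom:
        secret:
          name: example-opensearch-credentials
          key: password
```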
Step 2:
The package will deploy the OpenTelemetry Operator which works as a manager for the collectors and auto-instrumentation of the workload. By default, the package will include a configuration for collecting metrics and logs. The log-collector is currently processing data from the preconfigured receivers:
- Files via the Filelog Receiver
- Kubernetes Events from the Kubernetes API server
- Journald events from systemd journal
- its own metrics
You can disable the collection of logs by setting openTelemetry.logsCollector.enabled to false. The same is true for disabling the collection of metrics by setting openTelemetry.metricsCollector.enabled to false.
The logsCollector comes with a standard set of log-processing, such as adding cluster information and common labels for Journald events.
In addition, we provide default pipelines for common log types. Currently the following log types have default configurations that can be enabled (requires logsCollector.enabled to be set to true):
- KVM: openTelemetry.logsCollector.kvmConfig: Logs from Kernel-based Virtual Machines (KVMs) providing insights into virtualization activities, resource usage, and system performance
- Ceph: openTelemetry.logsCollector.cephConfig: Logs from Ceph storage systems, capturing information about cluster operations, performance metrics, and health status
These default configurations provide common labels and Grok parsing for logs emitted through the respective services.
Based on the backend selection, the telemetry data will be exported to the backend.
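The default pipelines for KVM and Ceph logs are switched on through their own options, on top of the logs collector itself. As a hedged sketch, the optionValues fragment of a Plugin could enable the KVM pipeline while leaving the Ceph pipeline off; the choice of which pipeline to enable here is purely illustrative.

```yaml
optionValues:
  # The logs collector must be enabled for either pipeline to take effect.
  - name: openTelemetry.logsCollector.enabled
    value: true
  - name: openTelemetry.logsCollector.kvmConfig.enabled
    value: true
  - name: openTelemetry.logsCollector.cephConfig.enabled
    value: false
```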
Step 3:
Greenhouse regularly performs integration tests that are bundled with OpenTelemetry. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
Configuration
Name | Description | Type | required |
---|---|---|---|
openTelemetry.logsCollector.enabled | Activates the standard configuration for logs | bool | false |
openTelemetry.logsCollector.kvmConfig.enabled | Activates the configuration for KVM logs (requires logsCollector to be enabled) | bool | false |
openTelemetry.logsCollector.cephConfig.enabled | Activates the configuration for Ceph logs (requires logsCollector to be enabled) | bool | false |
openTelemetry.metricsCollector.enabled | Activates the standard configuration for metrics | bool | false |
openTelemetry.openSearchLogs.username | Username for OpenSearch endpoint | secret | false |
openTelemetry.openSearchLogs.password | Password for OpenSearch endpoint | secret | false |
openTelemetry.openSearchLogs.endpoint | Endpoint URL for OpenSearch | secret | false |
openTelemetry.region | Region label for logging | string | false |
openTelemetry.cluster | Cluster label for logging | string | false |
openTelemetry.prometheus.additionalLabels | Label selector for Prometheus resources to be picked-up by the operator | map | false |
openTelemetry.prometheus.rules.additionalRuleLabels | Additional labels for PrometheusRule alerts | map | false |
openTelemetry.prometheus.serviceMonitor.enabled | Activates the service-monitoring for the Logs Collector | bool | false |
openTelemetry.prometheus.podMonitor.enabled | Activates the pod-monitoring for the Logs Collector | bool | false |
openTelemetry.prometheus.rules.create | Enables PrometheusRule resources to be created | bool | false |
openTelemetry.prometheus.rules.disabled | List of PrometheusRules to disable | map | false |
openTelemetry.prometheus.rules.labels | Labels for PrometheusRules | map | false |
openTelemetry.prometheus.rules.annotations | Annotations for PrometheusRules | map | false |
opentelemetry-operator.admissionWebhooks.certManager.enabled | Activate to use the CertManager for generating self-signed certificates | bool | false |
opentelemetry-operator.admissionWebhooks.autoGenerateCert.enabled | Activate to use Helm to create self-signed certificates | bool | false |
opentelemetry-operator.admissionWebhooks.autoGenerateCert.recreate | Activate to recreate the cert after a defined period (certPeriodDays default is 365) | bool | false |
opentelemetry-operator.kubeRBACProxy.enabled | Activate to enable Kube-RBAC-Proxy for OpenTelemetry | bool | false |
opentelemetry-operator.manager.prometheusRule.defaultRules.enabled | Activate to enable default rules for monitoring the OpenTelemetry Manager | bool | false |
opentelemetry-operator.manager.prometheusRule.enabled | Activate to enable rules for monitoring the OpenTelemetry Manager | bool | false |
Examples
TBD
4.2.12 - Perses
[!WARNING] This plugin is in beta. Please report any bugs by creating an issue here.
Table of Contents
- Overview
- Disclaimer
- Quick Start
- Configuration
- Create a custom dashboard
- Add Dashboards as ConfigMaps
Learn more about the Perses Plugin. Use it to visualize Prometheus/Thanos metrics for your Greenhouse remote cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
Observability is often required for the operation and automation of service offerings. Perses is a CNCF project that aims to become an open standard for dashboards and visualization. It provides you with tools to display Prometheus metrics on live dashboards with insightful charts and visualizations. In the Greenhouse context, this complements the kube-monitoring plugin, which is automatically recognized by Perses as a data source. In addition, the Plugin provides a mechanism that automates the lifecycle of datasources and dashboards without having to restart Perses.
Disclaimer
This is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick Start
This guide provides a quick and straightforward way to use Perses as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-managed Kubernetes remote cluster
- The kube-monitoring Plugin installed with .spec.kubeMonitoring.prometheus.persesDatasource: true and at least one Prometheus instance running in the cluster
The plugin works by default with anonymous access enabled. This plugin comes with some default dashboards and the kube-monitoring datasource will be automatically discovered by the plugin.
Step 1: Add your dashboards and datasources
Dashboards are selected from ConfigMaps across namespaces. The plugin searches for ConfigMaps with the label perses.dev/resource: "true" and imports them into Perses. The ConfigMap must contain a key like my-dashboard.json with the dashboard JSON content. Please refer to the section Add Dashboards as ConfigMaps for more information.
A guide on how to create custom dashboards in the UI can be found in the section Create a custom dashboard.
Configuration
Parameter | Description | Default |
---|---|---|
perses.additionalLabels | Additional labels to add to all resources | {} |
perses.annotations | Statefulset annotations | {} |
perses.config.annotations | Annotations for config | {} |
perses.config.database.file.extension | Database file extension | json |
perses.config.database.file.folder | Database file folder path | /perses |
perses.config.database.sql | SQL database configuration | {} |
perses.config.important_dashboards | List of important dashboards | [] |
perses.config.provisioning.folders.0 | Provisioning folder path | /etc/perses/provisioning |
perses.config.provisioning.interval | Provisioning check interval | 10s |
perses.config.schemas.datasources_path | Datasource schemas path | /etc/perses/cue/schemas/datasources |
perses.config.schemas.interval | Schema check interval | 5m |
perses.config.schemas.panels_path | Panel schemas path | /etc/perses/cue/schemas/panels |
perses.config.schemas.queries_path | Query schemas path | /etc/perses/cue/schemas/queries |
perses.config.schemas.variables_path | Variable schemas path | /etc/perses/cue/schemas/variables |
perses.config.security.cookie.same_site | Cookie SameSite attribute | lax |
perses.config.security.cookie.secure | Enable secure cookies | false |
perses.config.security.enableAuth | Enable authentication | false |
perses.config.security.readOnly | Run the Perses instance in read-only mode | false |
perses.datasources | Configure datasources (DEPRECATED). Please use the ‘sidecar’ configuration to provision datasources | [] |
perses.fullnameOverride | Override fully qualified app name | "" |
perses.image.name | Container image name | persesdev/perses |
perses.image.pullPolicy | Image pull policy | IfNotPresent |
perses.image.version | Override default image tag | "" |
perses.ingress.annotations | Additional annotations for the Ingress resource. To enable certificate autogeneration, place here your cert-manager annotations. | {} |
perses.ingress.enabled | Configure the ingress resource that allows you to access Perses | false |
perses.ingress.hosts | Ingress hostnames | ["perses.local"] |
perses.ingress.ingressClassName | IngressClass that will be used to implement the Ingress (Kubernetes 1.18+) | "" |
perses.ingress.path | Ingress path | / |
perses.ingress.pathType | Ingress path type | Prefix |
perses.ingress.tls | Ingress TLS configuration | [] |
perses.livenessProbe.enabled | Enable liveness probe | true |
perses.livenessProbe.failureThreshold | Liveness probe failure threshold | 5 |
perses.livenessProbe.initialDelaySeconds | Liveness probe initial delay | 10 |
perses.livenessProbe.periodSeconds | Liveness probe period | 60 |
perses.livenessProbe.successThreshold | Liveness probe success threshold | 1 |
perses.livenessProbe.timeoutSeconds | Liveness probe timeout | 5 |
perses.logLevel | Logging level - available options “panic”, “error”, “warning”, “info”, “debug”, “trace” level | info |
perses.nameOverride | Override chart name | "" |
perses.persistence.accessModes | PVC access modes for data volume | ["ReadWriteOnce"] |
perses.persistence.annotations | PVC annotations | {} |
perses.persistence.enabled | Enable persistence. If disabled, an emptyDir volume is used | false |
perses.persistence.labels | PVC labels | {} |
perses.persistence.securityContext.fsGroup | Security context for the PVC when persistence is enabled | 2000 |
perses.persistence.size | PVC storage size | 8Gi |
perses.readinessProbe.enabled | Enable readiness probe | true |
perses.readinessProbe.failureThreshold | Readiness probe failure threshold | 5 |
perses.readinessProbe.initialDelaySeconds | Readiness probe initial delay | 5 |
perses.readinessProbe.periodSeconds | Readiness probe period | 10 |
perses.readinessProbe.successThreshold | Readiness probe success threshold | 1 |
perses.readinessProbe.timeoutSeconds | Readiness probe timeout | 5 |
perses.replicas | Number of replicas | 1 |
perses.resources | Resource limits and requests | {} |
perses.service.annotations | Service annotations | {} |
perses.service.labels | Service labels | {} |
perses.service.port | Service port | 8080 |
perses.service.portName | Service port name | http |
perses.service.targetPort | Container target port | 8080 |
perses.service.type | Service type | ClusterIP |
perses.serviceAccount.annotations | Service account annotations | {} |
perses.serviceAccount.create | Create service account | true |
perses.serviceAccount.name | Service account name | "" |
perses.sidecar.enabled | Enable sidecar to auto discover the configmaps holding perses dashboards and datasources | false |
perses.sidecar.image.repository | Container image repository for the sidecar | kiwigrid/k8s-sidecar |
perses.sidecar.image.tag | Container image tag for the sidecar | 1.28.0 |
perses.sidecar.label | Label key to watch for ConfigMaps containing Perses resources | perses.dev/resource |
perses.sidecar.labelValue | Label value to watch for ConfigMaps containing Perses resources | "true" |
perses.volumeMounts | Additional volume mounts | [] |
perses.volumes | Additional volumes | [] |
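To show how these parameters are set in practice, the following sketch passes a few of them through a Plugin resource, using the same layout as the kube-monitoring examples. The Plugin name, the pluginDefinition name perses, the IngressClass name, and the chosen values are assumptions; only the option names are taken from the table above.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-perses               # hypothetical name
spec:
  pluginDefinition: perses           # assumption: name of the PluginDefinition
  disabled: false
  optionValues:
    # Let the sidecar discover ConfigMaps labelled perses.dev/resource: "true".
    - name: perses.sidecar.enabled
      value: true
    # Keep dashboards on a PVC instead of an emptyDir volume.
    - name: perses.persistence.enabled
      value: true
    - name: perses.persistence.size
      value: 8Gi
    # Expose Perses through an Ingress (hypothetical IngressClass).
    - name: perses.ingress.enabled
      value: true
    - name: perses.ingress.ingressClassName
      value: nginx
```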
Create a custom dashboard
- Add a new Project by clicking on ADD PROJECT in the top right corner. Give it a name and click Add.
- Add a new dashboard by clicking on ADD DASHBOARD. Give it a name and click Add.
- Now you can add variables and panels to your dashboard.
- You can group your panels by adding the panels to a Panel Group.
- Move and resize the panels as needed.
- Watch this gif to learn more.
- You do not need to add the kube-monitoring datasource manually. It will be automatically discovered by Perses.
- Click Save after you have made changes.
- Export the dashboard.
- Click on the {} icon in the top right corner of the dashboard.
- Copy the entire JSON model.
- See the next section for detailed instructions on how and where to paste the copied dashboard JSON model.
Add Dashboards as ConfigMaps
By default, a sidecar container is deployed in the Perses pod. This container watches all configmaps in the cluster and filters out the ones with the label perses.dev/resource: "true". The files defined in those configmaps are written to a folder and this folder is accessed by Perses. Changes to the configmaps are continuously monitored and are reflected in Perses within 10 seconds.
A recommendation is to use one configmap per dashboard. This way, you can easily manage the dashboards in your git repository.
Recommended folder structure
Folder structure:
dashboards/
├── dashboard1.json
├── dashboard2.json
├── prometheusdatasource1.json
└── prometheusdatasource2.json
templates/
└── dashboard-json-configmap.yaml
Helm template to create a configmap for each dashboard:
{{- range $path, $bytes := .Files.Glob "dashboards/*.json" }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ printf "%s-%s" $.Release.Name $path | replace "/" "-" | trunc 63 }}
  labels:
    perses.dev/resource: "true"
data:
{{ printf "%s: |-" $path | replace "/" "-" | indent 2 }}
{{ printf "%s" $bytes | indent 4 }}
{{- end }}
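To make the effect of the template concrete: assuming a release named my-release (hypothetical) and the file dashboards/dashboard1.json from the folder structure above, the loop would render roughly the following ConfigMap. The JSON body is only a placeholder standing in for an exported dashboard model, and the project name in it is an assumption.

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  # "my-release-dashboards/dashboard1.json" with "/" replaced by "-",
  # truncated to at most 63 characters.
  name: my-release-dashboards-dashboard1.json
  labels:
    perses.dev/resource: "true"
data:
  # The data key is the file path with "/" replaced by "-".
  dashboards-dashboard1.json: |-
    {
      "kind": "Dashboard",
      "metadata": { "name": "dashboard1", "project": "example-project" },
      "spec": { "duration": "1h", "panels": {}, "layouts": [] }
    }
```

One ConfigMap is emitted per file matched by the glob, so the datasource JSON files in the same folder are rendered the same way.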
4.2.13 - Plutono
Learn more about the plutono Plugin. Use it to install the web dashboarding system Plutono to collect, correlate, and visualize Prometheus metrics for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
Observability is often required for the operation and automation of service offerings. Plutono provides you with tools to display Prometheus metrics on live dashboards with insightful charts and visualizations. In the Greenhouse context, this complements the kube-monitoring plugin, which is automatically recognized by Plutono as a data source. In addition, the Plugin provides a mechanism that automates the lifecycle of datasources and dashboards without having to restart Plutono.
Disclaimer
This is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick Start
This guide provides a quick and straightforward way to use Plutono as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-managed Kubernetes cluster
- The kube-monitoring Plugin installed, so that at least one Prometheus instance is running in the cluster
The plugin works by default with anonymous access enabled. If you use the standard configuration in the kube-monitoring plugin, the data source and some kubernetes-operations dashboards are already pre-installed.
Step 1: Add your dashboards
Dashboards are selected from ConfigMaps across namespaces. The plugin searches for ConfigMaps with the label plutono-dashboard: "true" and imports them into Plutono. The ConfigMap must contain a key like my-dashboard.json with the dashboard JSON content.
A guide on how to create dashboards can be found here.
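A dashboard ConfigMap matching the description above could look like the following sketch. The ConfigMap name and the dashboard title and uid are hypothetical, and the JSON body is a minimal placeholder rather than a complete exported dashboard.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-plutono-dashboard    # hypothetical name
  labels:
    # Label the plugin searches for when importing dashboards.
    plutono-dashboard: "true"
data:
  # The key must end in .json; the value is the dashboard JSON model.
  my-dashboard.json: |-
    {
      "title": "Example dashboard",
      "uid": "example-dashboard",
      "panels": []
    }
```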
Step 2: Add your datasources
Data sources are selected from Secrets across namespaces. The plugin searches for Secrets with the label plutono-dashboard: "true" and imports them into Plutono. The Secrets should contain valid datasource configuration YAML.
Configuration
Parameter | Description | Default |
---|---|---|
plutono.replicas | Number of nodes | 1 |
plutono.deploymentStrategy | Deployment strategy | { "type": "RollingUpdate" } |
plutono.livenessProbe | Liveness Probe settings | { "httpGet": { "path": "/api/health", "port": 3000 }, "initialDelaySeconds": 60, "timeoutSeconds": 30, "failureThreshold": 10 } |
plutono.readinessProbe | Readiness Probe settings | { "httpGet": { "path": "/api/health", "port": 3000 } } |
plutono.securityContext | Deployment securityContext | {"runAsUser": 472, "runAsGroup": 472, "fsGroup": 472} |
plutono.priorityClassName | Name of Priority Class to assign pods | nil |
plutono.image.registry | Image registry | ghcr.io |
plutono.image.repository | Image repository | credativ/plutono |
plutono.image.tag | Overrides the Plutono image tag whose default is the chart appVersion (Must be >= 5.0.0 ) | `` |
plutono.image.sha | Image sha (optional) | `` |
plutono.image.pullPolicy | Image pull policy | IfNotPresent |
plutono.image.pullSecrets | Image pull secrets (can be templated) | [] |
plutono.service.enabled | Enable plutono service | true |
plutono.service.ipFamilies | Kubernetes service IP families | [] |
plutono.service.ipFamilyPolicy | Kubernetes service IP family policy | "" |
plutono.service.type | Kubernetes service type | ClusterIP |
plutono.service.port | Kubernetes port where service is exposed | 80 |
plutono.service.portName | Name of the port on the service | service |
plutono.service.appProtocol | Adds the appProtocol field to the service | `` |
plutono.service.targetPort | Internal service port | 3000 |
plutono.service.nodePort | Kubernetes service nodePort | nil |
plutono.service.annotations | Service annotations (can be templated) | {} |
plutono.service.labels | Custom labels | {} |
plutono.service.clusterIP | internal cluster service IP | nil |
plutono.service.loadBalancerIP | IP address to assign to load balancer (if supported) | nil |
plutono.service.loadBalancerSourceRanges | list of IP CIDRs allowed access to lb (if supported) | [] |
plutono.service.externalIPs | service external IP addresses | [] |
plutono.service.externalTrafficPolicy | change the default externalTrafficPolicy | nil |
plutono.headlessService | Create a headless service | false |
plutono.extraExposePorts | Additional service ports for sidecar containers | [] |
plutono.hostAliases | adds rules to the pod’s /etc/hosts | [] |
plutono.ingress.enabled | Enables Ingress | false |
plutono.ingress.annotations | Ingress annotations (values are templated) | {} |
plutono.ingress.labels | Custom labels | {} |
plutono.ingress.path | Ingress accepted path | / |
plutono.ingress.pathType | Ingress type of path | Prefix |
plutono.ingress.hosts | Ingress accepted hostnames | ["chart-example.local"] |
plutono.ingress.extraPaths | Ingress extra paths to prepend to every host configuration. Useful when configuring custom actions with AWS ALB Ingress Controller. Requires ingress.hosts to have one or more host entries. | [] |
plutono.ingress.tls | Ingress TLS configuration | [] |
plutono.ingress.ingressClassName | Ingress Class Name. MAY be required for Kubernetes versions >= 1.18 | "" |
plutono.resources | CPU/Memory resource requests/limits | {} |
plutono.nodeSelector | Node labels for pod assignment | {} |
plutono.tolerations | Toleration labels for pod assignment | [] |
plutono.affinity | Affinity settings for pod assignment | {} |
plutono.extraInitContainers | Init containers to add to the plutono pod | {} |
plutono.extraContainers | Sidecar containers to add to the plutono pod | "" |
plutono.extraContainerVolumes | Volumes that can be mounted in sidecar containers | [] |
plutono.extraLabels | Custom labels for all manifests | {} |
plutono.schedulerName | Name of the k8s scheduler (other than default) | nil |
plutono.persistence.enabled | Use persistent volume to store data | false |
plutono.persistence.type | Type of persistence (pvc or statefulset ) | pvc |
plutono.persistence.size | Size of persistent volume claim | 10Gi |
plutono.persistence.existingClaim | Use an existing PVC to persist data (can be templated) | nil |
plutono.persistence.storageClassName | Type of persistent volume claim | nil |
plutono.persistence.accessModes | Persistence access modes | [ReadWriteOnce] |
plutono.persistence.annotations | PersistentVolumeClaim annotations | {} |
plutono.persistence.finalizers | PersistentVolumeClaim finalizers | [ "kubernetes.io/pvc-protection" ] |
plutono.persistence.extraPvcLabels | Extra labels to apply to a PVC. | {} |
plutono.persistence.subPath | Mount a sub dir of the persistent volume (can be templated) | nil |
plutono.persistence.inMemory.enabled | If persistence is not enabled, whether to mount the local storage in-memory to improve performance | false |
plutono.persistence.inMemory.sizeLimit | SizeLimit for the in-memory local storage | nil |
plutono.persistence.disableWarning | Hide NOTES warning, useful when persisting to a database | false |
plutono.initChownData.enabled | If false, don’t reset data ownership at startup | true |
plutono.initChownData.image.registry | init-chown-data container image registry | docker.io |
plutono.initChownData.image.repository | init-chown-data container image repository | busybox |
plutono.initChownData.image.tag | init-chown-data container image tag | 1.31.1 |
plutono.initChownData.image.sha | init-chown-data container image sha (optional) | "" |
plutono.initChownData.image.pullPolicy | init-chown-data container image pull policy | IfNotPresent |
plutono.initChownData.resources | init-chown-data pod resource requests & limits | {} |
plutono.schedulerName | Alternate scheduler name | nil |
plutono.env | Extra environment variables passed to pods | {} |
plutono.envValueFrom | Environment variables from alternate sources. See the API docs on EnvVarSource for format details. Can be templated | {} |
plutono.envFromSecret | Name of a Kubernetes secret (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | "" |
plutono.envFromSecrets | List of Kubernetes secrets (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | [] |
plutono.envFromConfigMaps | List of Kubernetes ConfigMaps (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | [] |
plutono.envRenderSecret | Sensitive environment variables passed to pods and stored as secret. (passed through tpl) | {} |
plutono.enableServiceLinks | Inject Kubernetes services as environment variables. | true |
plutono.extraSecretMounts | Additional plutono server secret mounts | [] |
plutono.extraVolumeMounts | Additional plutono server volume mounts | [] |
plutono.extraVolumes | Additional Plutono server volumes | [] |
plutono.automountServiceAccountToken | Mount the service account token on the plutono pod. Mandatory if sidecars are enabled | true |
plutono.createConfigmap | Enable creating the plutono configmap | true |
plutono.extraConfigmapMounts | Additional plutono server configMap volume mounts (values are templated) | [] |
plutono.extraEmptyDirMounts | Additional plutono server emptyDir volume mounts | [] |
plutono.plugins | Plugins to be loaded along with Plutono | [] |
plutono.datasources | Configure plutono datasources (passed through tpl) | {} |
plutono.alerting | Configure plutono alerting (passed through tpl) | {} |
plutono.notifiers | Configure plutono notifiers | {} |
plutono.dashboardProviders | Configure plutono dashboard providers | {} |
plutono.dashboards | Dashboards to import | {} |
plutono.dashboardsConfigMaps | ConfigMaps reference that contains dashboards | {} |
plutono.plutono.ini | Plutono’s primary configuration | {} |
global.imageRegistry | Global image pull registry for all images. | null |
global.imagePullSecrets | Global image pull secrets (can be templated). Allows either an array of {name: pullSecret} maps (k8s-style), or an array of strings (more common helm-style). | [] |
plutono.ldap.enabled | Enable LDAP authentication | false |
plutono.ldap.existingSecret | The name of an existing secret containing the ldap.toml file, this must have the key ldap-toml . | "" |
plutono.ldap.config | Plutono’s LDAP configuration | "" |
plutono.annotations | Deployment annotations | {} |
plutono.labels | Deployment labels | {} |
plutono.podAnnotations | Pod annotations | {} |
plutono.podLabels | Pod labels | {} |
plutono.podPortName | Name of the plutono port on the pod | plutono |
plutono.lifecycleHooks | Lifecycle hooks for postStart and preStop | {} |
plutono.sidecar.image.registry | Sidecar image registry | quay.io |
plutono.sidecar.image.repository | Sidecar image repository | kiwigrid/k8s-sidecar |
plutono.sidecar.image.tag | Sidecar image tag | 1.26.0 |
plutono.sidecar.image.sha | Sidecar image sha (optional) | "" |
plutono.sidecar.imagePullPolicy | Sidecar image pull policy | IfNotPresent |
plutono.sidecar.resources | Sidecar resources | {} |
plutono.sidecar.securityContext | Sidecar securityContext | {} |
plutono.sidecar.enableUniqueFilenames | Sets the kiwigrid/k8s-sidecar UNIQUE_FILENAMES environment variable. If set to true the sidecar will create unique filenames where duplicate data keys exist between ConfigMaps and/or Secrets within the same or multiple Namespaces. | false |
plutono.sidecar.alerts.enabled | Enables the cluster wide search for alerts and adds/updates/deletes them in plutono | false |
plutono.sidecar.alerts.label | Label that config maps with alerts should have to be added | plutono_alert |
plutono.sidecar.alerts.labelValue | Label value that config maps with alerts should have to be added | "" |
plutono.sidecar.alerts.searchNamespace | Namespaces list. If specified, the sidecar will search for alerts config-maps inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces. | nil |
plutono.sidecar.alerts.watchMethod | Method to use to detect ConfigMap changes. With WATCH the sidecar issues a WATCH request; with SLEEP it lists all ConfigMaps, then sleeps for 60 seconds. | WATCH |
plutono.sidecar.alerts.resource | Whether the sidecar should look into secrets, configmaps, or both. | both |
plutono.sidecar.alerts.reloadURL | Full URL of the alerting configuration reload API endpoint, to invoke after a config-map change | "http://localhost:3000/api/admin/provisioning/alerting/reload" |
plutono.sidecar.alerts.skipReload | Enabling this omits defining the REQ_URL and REQ_METHOD environment variables | false |
plutono.sidecar.alerts.initAlerts | Set to true to deploy the alerts sidecar as an initContainer. This is needed if skipReload is true, to load any alerts defined at startup time. | false |
plutono.sidecar.alerts.extraMounts | Additional alerts sidecar volume mounts. | [] |
plutono.sidecar.dashboards.enabled | Enables the cluster wide search for dashboards and adds/updates/deletes them in plutono | false |
plutono.sidecar.dashboards.SCProvider | Enables creation of sidecar provider | true |
plutono.sidecar.dashboards.provider.name | Unique name of the plutono provider | sidecarProvider |
plutono.sidecar.dashboards.provider.orgid | Id of the organisation, to which the dashboards should be added | 1 |
plutono.sidecar.dashboards.provider.folder | Logical folder in which plutono groups dashboards | "" |
plutono.sidecar.dashboards.provider.folderUid | Allows you to specify the static UID for the logical folder above | "" |
plutono.sidecar.dashboards.provider.disableDelete | Activate to avoid the deletion of imported dashboards | false |
plutono.sidecar.dashboards.provider.allowUiUpdates | Allow updating provisioned dashboards from the UI | false |
plutono.sidecar.dashboards.provider.type | Provider type | file |
plutono.sidecar.dashboards.provider.foldersFromFilesStructure | Allow Plutono to replicate dashboard structure from filesystem. | false |
plutono.sidecar.dashboards.watchMethod | Method to use to detect ConfigMap changes. With WATCH the sidecar issues a WATCH request; with SLEEP it lists all ConfigMaps, then sleeps for 60 seconds. | WATCH |
plutono.sidecar.skipTlsVerify | Set to true to skip tls verification for kube api calls | nil |
plutono.sidecar.dashboards.label | Label that config maps with dashboards should have to be added | plutono_dashboard |
plutono.sidecar.dashboards.labelValue | Label value that config maps with dashboards should have to be added | "" |
plutono.sidecar.dashboards.folder | Folder in the pod that should hold the collected dashboards (unless sidecar.dashboards.defaultFolderName is set). This path will be mounted. | /tmp/dashboards |
plutono.sidecar.dashboards.folderAnnotation | The annotation the sidecar will look for in configmaps to override the destination folder for files | nil |
plutono.sidecar.dashboards.defaultFolderName | The default folder name, it will create a subfolder under the sidecar.dashboards.folder and put dashboards in there instead | nil |
plutono.sidecar.dashboards.searchNamespace | Namespaces list. If specified, the sidecar will search for dashboards config-maps inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces. | nil |
plutono.sidecar.dashboards.script | Absolute path to shell script to execute after a configmap got reloaded. | nil |
plutono.sidecar.dashboards.reloadURL | Full url of dashboards configuration reload API endpoint, to invoke after a config-map change | "http://localhost:3000/api/admin/provisioning/dashboards/reload" |
plutono.sidecar.dashboards.skipReload | Enabling this omits defining the REQ_USERNAME, REQ_PASSWORD, REQ_URL and REQ_METHOD environment variables | false |
plutono.sidecar.dashboards.resource | Whether the sidecar should look into secrets, configmaps, or both. | both |
plutono.sidecar.dashboards.extraMounts | Additional dashboard sidecar volume mounts. | [] |
plutono.sidecar.datasources.enabled | Enables the cluster wide search for datasources and adds/updates/deletes them in plutono | false |
plutono.sidecar.datasources.label | Label that config maps with datasources should have to be added | plutono_datasource |
plutono.sidecar.datasources.labelValue | Label value that config maps with datasources should have to be added | "" |
plutono.sidecar.datasources.searchNamespace | Namespaces list. If specified, the sidecar will search for datasources config-maps inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces. | nil |
plutono.sidecar.datasources.watchMethod | Method to use to detect ConfigMap changes. With WATCH the sidecar issues a WATCH request; with SLEEP it lists all ConfigMaps, then sleeps for 60 seconds. | WATCH |
plutono.sidecar.datasources.resource | Whether the sidecar should look into secrets, configmaps, or both. | both |
plutono.sidecar.datasources.reloadURL | Full URL of the datasource configuration reload API endpoint, to invoke after a config-map change | "http://localhost:3000/api/admin/provisioning/datasources/reload" |
plutono.sidecar.datasources.skipReload | Enabling this omits defining the REQ_URL and REQ_METHOD environment variables | false |
plutono.sidecar.datasources.initDatasources | Set to true to deploy the datasource sidecar as an initContainer in addition to a container. This is needed if skipReload is true, to load any datasources defined at startup time. | false |
plutono.sidecar.notifiers.enabled | Enables the cluster wide search for notifiers and adds/updates/deletes them in plutono | false |
plutono.sidecar.notifiers.label | Label that config maps with notifiers should have to be added | plutono_notifier |
plutono.sidecar.notifiers.labelValue | Label value that config maps with notifiers should have to be added | "" |
plutono.sidecar.notifiers.searchNamespace | Namespaces list. If specified, the sidecar will search for notifiers config-maps (or secrets) inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces. | nil |
plutono.sidecar.notifiers.watchMethod | Method to use to detect ConfigMap changes. With WATCH the sidecar issues a WATCH request; with SLEEP it lists all ConfigMaps, then sleeps for 60 seconds. | WATCH |
plutono.sidecar.notifiers.resource | Whether the sidecar should look into secrets, configmaps, or both. | both |
plutono.sidecar.notifiers.reloadURL | Full URL of the notifier configuration reload API endpoint, to invoke after a config-map change | "http://localhost:3000/api/admin/provisioning/notifications/reload" |
plutono.sidecar.notifiers.skipReload | Enabling this omits defining the REQ_URL and REQ_METHOD environment variables | false |
plutono.sidecar.notifiers.initNotifiers | Set to true to deploy the notifier sidecar as an initContainer in addition to a container. This is needed if skipReload is true, to load any notifiers defined at startup time. | false |
plutono.smtp.existingSecret | The name of an existing secret containing the SMTP credentials. | "" |
plutono.smtp.userKey | The key in the existing SMTP secret containing the username. | "user" |
plutono.smtp.passwordKey | The key in the existing SMTP secret containing the password. | "password" |
plutono.admin.existingSecret | The name of an existing secret containing the admin credentials (can be templated). | "" |
plutono.admin.userKey | The key in the existing admin secret containing the username. | "admin-user" |
plutono.admin.passwordKey | The key in the existing admin secret containing the password. | "admin-password" |
plutono.serviceAccount.automountServiceAccountToken | Automount the service account token on all pods where this service account is used | false |
plutono.serviceAccount.annotations | ServiceAccount annotations | |
plutono.serviceAccount.create | Create service account | true |
plutono.serviceAccount.labels | ServiceAccount labels | {} |
plutono.serviceAccount.name | Service account name to use, when empty will be set to created account if serviceAccount.create is set else to default | `` |
plutono.serviceAccount.nameTest | Service account name to use for test, when empty will be set to created account if serviceAccount.create is set else to default | nil |
plutono.rbac.create | Create and use RBAC resources | true |
plutono.rbac.namespaced | Creates Role and RoleBinding instead of the default ClusterRole and ClusterRoleBinding for the plutono instance | false |
plutono.rbac.useExistingRole | Set to a role name to use an existing role. Role creation is skipped, but the service account and the role binding to the role name set here are still created. | nil |
plutono.rbac.pspEnabled | Create PodSecurityPolicy (with rbac.create , grant roles permissions as well) | false |
plutono.rbac.pspUseAppArmor | Enforce AppArmor in created PodSecurityPolicy (requires rbac.pspEnabled ) | false |
plutono.rbac.extraRoleRules | Additional rules to add to the Role | [] |
plutono.rbac.extraClusterRoleRules | Additional rules to add to the ClusterRole | [] |
plutono.command | Define command to be executed by plutono container at startup | nil |
plutono.args | Define additional args if command is used | nil |
plutono.testFramework.enabled | Whether to create test-related resources | true |
plutono.testFramework.image.registry | test-framework image registry. | docker.io |
plutono.testFramework.image.repository | test-framework image repository. | bats/bats |
plutono.testFramework.image.tag | test-framework image tag. | v1.4.1 |
plutono.testFramework.imagePullPolicy | test-framework image pull policy. | IfNotPresent |
plutono.testFramework.securityContext | test-framework securityContext | {} |
plutono.downloadDashboards.env | Environment variables to be passed to the download-dashboards container | {} |
plutono.downloadDashboards.envFromSecret | Name of a Kubernetes secret (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | "" |
plutono.downloadDashboards.resources | Resources of download-dashboards container | {} |
plutono.downloadDashboardsImage.registry | Curl docker image registry | docker.io |
plutono.downloadDashboardsImage.repository | Curl docker image repository | curlimages/curl |
plutono.downloadDashboardsImage.tag | Curl docker image tag | 7.73.0 |
plutono.downloadDashboardsImage.sha | Curl docker image sha (optional) | "" |
plutono.downloadDashboardsImage.pullPolicy | Curl docker image pull policy | IfNotPresent |
plutono.namespaceOverride | Override the deployment namespace | "" (Release.Namespace ) |
plutono.serviceMonitor.enabled | Use servicemonitor from prometheus operator | false |
plutono.serviceMonitor.namespace | Namespace this servicemonitor is installed in | |
plutono.serviceMonitor.interval | How frequently Prometheus should scrape | 1m |
plutono.serviceMonitor.path | Path to scrape | /metrics |
plutono.serviceMonitor.scheme | Scheme to use for metrics scraping | http |
plutono.serviceMonitor.tlsConfig | TLS configuration block for the endpoint | {} |
plutono.serviceMonitor.labels | Labels for the servicemonitor passed to Prometheus Operator | {} |
plutono.serviceMonitor.scrapeTimeout | Timeout after which the scrape is ended | 30s |
plutono.serviceMonitor.relabelings | RelabelConfigs to apply to samples before scraping. | [] |
plutono.serviceMonitor.metricRelabelings | MetricRelabelConfigs to apply to samples before ingestion. | [] |
plutono.revisionHistoryLimit | Number of old ReplicaSets to retain | 10 |
plutono.networkPolicy.enabled | Enable creation of NetworkPolicy resources. | false |
plutono.networkPolicy.allowExternal | Don’t require client label for connections | true |
plutono.networkPolicy.explicitNamespacesSelector | A Kubernetes LabelSelector to explicitly select namespaces from which traffic could be allowed | {} |
plutono.networkPolicy.ingress | Enable the creation of an ingress network policy | true |
plutono.networkPolicy.egress.enabled | Enable the creation of an egress network policy | false |
plutono.networkPolicy.egress.ports | An array of ports to allow for the egress | [] |
plutono.enableKubeBackwardCompatibility | Enable backward compatibility of kubernetes where pod’s definition version below 1.13 doesn’t have the enableServiceLinks option | false |
Example of extraVolumeMounts and extraVolumes
Configure additional volumes with extraVolumes and volume mounts with extraVolumeMounts.
Example for extraVolumeMounts and corresponding extraVolumes:
extraVolumeMounts:
  - name: plugins
    mountPath: /var/lib/plutono/plugins
    subPath: configs/plutono/plugins
    readOnly: false
  - name: dashboards
    mountPath: /var/lib/plutono/dashboards
    hostPath: /usr/shared/plutono/dashboards
    readOnly: false
extraVolumes:
  - name: plugins
    existingClaim: existing-plutono-claim
  - name: dashboards
    hostPath: /usr/shared/plutono/dashboards
Volumes default to emptyDir. Set to persistentVolumeClaim, hostPath, csi, or configMap for other types. For a persistentVolumeClaim, specify an existing claim name with existingClaim.
Import dashboards
There are a few methods to import dashboards to Plutono. Below are some examples and explanations as to how to use each method:
dashboards:
  default:
    some-dashboard:
      json: |
        {
          "annotations":
          ...
          # Complete json file here
          ...
          "title": "Some Dashboard",
          "uid": "abcd1234",
          "version": 1
        }
    custom-dashboard:
      # This is a path to a file inside the dashboards directory inside the chart directory
      file: dashboards/custom-dashboard.json
    prometheus-stats:
      # Ref: https://plutono.com/dashboards/2
      gnetId: 2
      revision: 2
      datasource: Prometheus
    loki-dashboard-quick-search:
      gnetId: 12019
      revision: 2
      datasource:
      - name: DS_PROMETHEUS
        value: Prometheus
    local-dashboard:
      url: https://raw.githubusercontent.com/user/repository/master/dashboards/dashboard.json
Create a dashboard
1. Click Dashboards in the main menu.
2. Click New and select New Dashboard.
3. Click Add new empty panel.
4. Important: Add a datasource variable, because datasources are provisioned in the cluster.
   - Go to Dashboard settings.
   - Click Variables.
   - Click Add variable.
   - General: Give the variable a proper Name and set Type to Datasource.
   - Data source options: Select the data source Type, e.g. Prometheus.
   - Click Update.
   - Go back.
5. Develop your panels.
   - On the Edit panel view, choose your desired Visualization.
   - Select the datasource variable you just created.
   - Write or construct a query in the query language of your data source.
   - Move and resize the panels as needed.
6. Optionally add a tag to the dashboard to make grouping easier.
   - Go to Dashboard settings.
   - In the General section, add a Tag.
7. Click Save. Note that the dashboard is saved in the browser’s local storage.
8. Export the dashboard.
   - Go to Dashboard settings.
   - Click JSON Model.
   - Copy the JSON model.
   - Go to your GitHub repository and create a new JSON file in the dashboards directory.
BASE64 dashboards
Dashboards may be stored on a server that does not return JSON directly but returns a Base64-encoded file instead (e.g. Gerrit). For this case the url use case accepts an additional parameter: if you set b64content to true after the url entry, the file is Base64-decoded before it is saved to disk. If this entry is not set or is false, no decoding is applied before the file is saved to disk.
Gerrit use case
The Gerrit API for downloading files has the following schema: https://yourgerritserver/a/{project-name}/branches/{branch-id}/files/{file-id}/content. The values of {project-name} and {file-id} usually contain ‘/’ characters, which MUST be replaced by %2F. For example, if project-name is user/repo, branch-id is master and file-id is dir1/dir2/dashboard, the url value is https://yourgerritserver/a/user%2Frepo/branches/master/files/dir1%2Fdir2%2Fdashboard/content
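As a minimal sketch, the Gerrit case could be combined with b64content in the dashboards values as follows. The dashboard name gerrit-dashboard is a hypothetical placeholder, and the url is the example value from the paragraph above; adjust both to your setup.

```yaml
dashboards:
  default:
    # "gerrit-dashboard" is a hypothetical name chosen for this sketch
    gerrit-dashboard:
      # Slashes inside project-name and file-id are encoded as %2F
      url: https://yourgerritserver/a/user%2Frepo/branches/master/files/dir1%2Fdir2%2Fdashboard/content
      # Gerrit returns the file Base64-encoded, so decode it before saving
      b64content: true
```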
Sidecar for dashboards
If the parameter sidecar.dashboards.enabled is set, a sidecar container is deployed in the plutono pod. This container watches all configmaps (or secrets) in the cluster and filters out the ones with a label as defined in sidecar.dashboards.label. The files defined in those configmaps are written to a folder and accessed by plutono. Changes to the configmaps are monitored and the imported dashboards are deleted/updated.
A recommendation is to use one configmap per dashboard, as a reduction of multiple dashboards inside one configmap is currently not properly mirrored in plutono.
NOTE: Configure your data sources in your dashboards as variables to keep them portable across clusters.
Example dashboard config:
Folder structure:
dashboards/
├── dashboard1.json
├── dashboard2.json
templates/
├── dashboard-json-configmap.yaml
Helm template to create a configmap for each dashboard:
{{- range $path, $bytes := .Files.Glob "dashboards/*.json" }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ printf "%s-%s" $.Release.Name $path | replace "/" "-" | trunc 63 }}
  labels:
    plutono-dashboard: "true"
data:
{{ printf "%s: |-" $path | replace "/" "-" | indent 2 }}
{{ printf "%s" $bytes | indent 4 }}
{{- end }}
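For the sidecar to pick up ConfigMaps created by this template, the configured label has to match the label key on the ConfigMaps. The following values fragment is a sketch under the assumption that the chart values are nested under plutono, as the parameter names in the table suggest. It sets the label explicitly because the template uses plutono-dashboard while the parameter table lists plutono_dashboard as the default.

```yaml
plutono:
  sidecar:
    dashboards:
      # Deploy the dashboard sidecar next to Plutono
      enabled: true
      # Must equal the label key used on the dashboard ConfigMaps
      label: plutono-dashboard
      # Search all namespaces instead of only the sidecar's own namespace
      searchNamespace: ALL
```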
Sidecar for datasources
If the parameter sidecar.datasources.enabled is set, an init container is deployed in the plutono pod. This container lists all secrets (or configmaps, though not recommended) in the cluster and filters out the ones with a label as defined in sidecar.datasources.label. The files defined in those secrets are written to a folder and accessed by plutono on startup. Using these yaml files, the data sources in plutono can be imported.
Should you aim for reloading datasources in Plutono each time the config is changed, set sidecar.datasources.skipReload: false and adjust sidecar.datasources.reloadURL to http://<svc-name>.<namespace>.svc.cluster.local/api/admin/provisioning/datasources/reload.
Secrets are recommended over configmaps for this use case because datasources usually contain private data like usernames and passwords. Secrets are the more appropriate cluster resource to manage those.
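A values fragment for the reload setup described above might look like the following sketch. It assumes the chart values are nested under plutono, as the parameter names in the table suggest; <svc-name> and <namespace> are placeholders to be replaced with your Plutono service name and namespace.

```yaml
plutono:
  sidecar:
    datasources:
      # Deploy the datasource sidecar
      enabled: true
      # Keep reloading enabled so changed Secrets are applied without a restart
      skipReload: false
      # Placeholders: substitute the real service name and namespace
      reloadURL: "http://<svc-name>.<namespace>.svc.cluster.local/api/admin/provisioning/datasources/reload"
```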
Example datasource config:
apiVersion: v1
kind: Secret
metadata:
  name: plutono-datasources
  labels:
    # default value for: sidecar.datasources.label
    plutono-datasource: "true"
stringData:
  datasources.yaml: |-
    apiVersion: 1
    datasources:
      - name: my-prometheus
        type: prometheus
        access: proxy
        orgId: 1
        url: my-url-domain:9090
        isDefault: false
        jsonData:
          httpMethod: 'POST'
        editable: false
NOTE: If your datasource configuration includes credentials, do not use stringData; use base64-encoded data instead.
apiVersion: v1
kind: Secret
metadata:
  name: my-datasource
  labels:
    plutono-datasource: "true"
data:
  # The key must contain a unique name and the .yaml file type
  my-datasource.yaml: {{ include (print $.Template.BasePath "my-datasource.yaml") . | b64enc }}
Example values to add a datasource adapted from Grafana:
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      # <string, required> Sets the name you use to refer to
      # the data source in panels and queries.
      - name: my-prometheus
        # <string, required> Sets the data source type.
        type: prometheus
        # <string, required> Sets the access mode, either
        # proxy or direct (Server or Browser in the UI).
        # Some data sources are incompatible with any setting
        # but proxy (Server).
        access: proxy
        # <int> Sets the organization id. Defaults to orgId 1.
        orgId: 1
        # <string> Sets a custom UID to reference this
        # data source in other parts of the configuration.
        # If not specified, Plutono generates one.
        uid:
        # <string> Sets the data source's URL, including the
        # port.
        url: my-url-domain:9090
        # <string> Sets the database user, if necessary.
        user:
        # <string> Sets the database name, if necessary.
        database:
        # <bool> Enables basic authorization.
        basicAuth:
        # <string> Sets the basic authorization username.
        basicAuthUser:
        # <bool> Enables credential headers.
        withCredentials:
        # <bool> Toggles whether the data source is pre-selected
        # for new panels. You can set only one default
        # data source per organization.
        isDefault: false
        # <map> Fields to convert to JSON and store in jsonData.
        jsonData:
          httpMethod: 'POST'
          # <bool> Enables TLS authentication using a client
          # certificate configured in secureJsonData.
          # tlsAuth: true
          # <bool> Enables TLS authentication using a CA
          # certificate.
          # tlsAuthWithCACert: true
        # <map> Fields to encrypt before storing in jsonData.
        secureJsonData:
          # <string> Defines the CA cert, client cert, and
          # client key for encrypted authentication.
          # tlsCACert: '...'
          # tlsClientCert: '...'
          # tlsClientKey: '...'
          # <string> Sets the database password, if necessary.
          # password:
          # <string> Sets the basic authorization password.
          # basicAuthPassword:
        # <int> Sets the version. Used to compare versions when
        # updating. Ignored when creating a new data source.
        version: 1
        # <bool> Allows users to edit data sources from the
        # Plutono UI.
        editable: false
How to serve Plutono with a path prefix (/plutono)
In order to serve Plutono with a prefix (e.g., http://example.com/plutono), add the following to your values.yaml.
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
    nginx.ingress.kubernetes.io/use-regex: "true"
  path: /plutono/?(.*)
  hosts:
    - k8s.example.dev
plutono.ini:
  server:
    root_url: http://localhost:3000/plutono # this host can be localhost
How to securely reference secrets in plutono.ini
This example uses Plutono file providers for secret values and the extraSecretMounts configuration flag (Additional plutono server secret mounts) to mount the secrets.
In plutono.ini:
plutono.ini:
  [auth.generic_oauth]
  enabled = true
  client_id = $__file{/etc/secrets/auth_generic_oauth/client_id}
  client_secret = $__file{/etc/secrets/auth_generic_oauth/client_secret}
Existing secret, or created along with helm:
---
apiVersion: v1
kind: Secret
metadata:
  name: auth-generic-oauth-secret
type: Opaque
stringData:
  client_id: <value>
  client_secret: <value>
Include in the extraSecretMounts configuration flag:
- extraSecretMounts:
  - name: auth-generic-oauth-secret-mount
    secretName: auth-generic-oauth-secret
    defaultMode: 0440
    mountPath: /etc/secrets/auth_generic_oauth
    readOnly: true
4.2.14 - Service exposure test
This Plugin provides a simple exposed service for manual testing.
By adding the following label to a service, it becomes accessible from the central Greenhouse system via a service proxy:
greenhouse.sap/expose: "true"
This Plugin creates an nginx deployment with an exposed service for testing.
Configuration
Specific port
By default, expose always uses the first port of the service. If you need another port, you must specify it by name:
greenhouse.sap/exposeNamedPort: YOURPORTNAME
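Put together, an exposed Service with a non-default port might look like the sketch below. The Service name, selector and ports are hypothetical, and it assumes that greenhouse.sap/exposeNamedPort is set as a label next to greenhouse.sap/expose; verify this against your Greenhouse version.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service # hypothetical name
  labels:
    # Makes the service reachable through the Greenhouse service proxy
    greenhouse.sap/expose: "true"
    # Assumption: set as a label; selects the port named "metrics"
    # instead of the first port in the list
    greenhouse.sap/exposeNamedPort: metrics
spec:
  selector:
    app: my-app # hypothetical selector
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: metrics
      port: 9090
      targetPort: 9090
```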
4.2.15 - Teams2Slack
Introduction
This Plugin provides a Slack integration for a Greenhouse organization.
It manages Slack entities like channels, groups, handles, etc. and their members based on the teams configured in your Greenhouse organization.
Important: Please ensure that only one deployment of Teams2Slack runs against the same set of groups in Slack. Secondary instances should run in the provided Dry-Run mode. Otherwise you might notice inconsistencies if the Teammembership objects of the clusters are unequal.
Requirements
- A Kubernetes Cluster to run against
- The presence of the Greenhouse Teammemberships CRD and corresponding objects.
Architecture
A Teammembership object contains the members of a team. Changes to an object create an event in Kubernetes. This event is consumed by the first controller, which creates a mirrored SlackGroup object that reflects the content of the Teammembership object. This approach has the advantage that the deletion of a team can be reliably detected by using finalizers. The second controller detects changes on SlackGroup objects and aligns the users present in a team with the corresponding Slack group.
Configuration
Deploy the Teams2Slack Plugin together with its Secret. They have the following structure (only the mandatory fields are included):
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: teams2slack
  namespace: default
spec:
  pluginDefinition: teams2slack
  disabled: false
  optionValues:
    - name: groupNamePrefix
      value:
    - name: groupNameSuffix
      value:
    - name: infoChannelID
      value:
    - name: token
      valueFrom:
        secret:
          key: SLACK_TOKEN
          name: teams2slack-secret
---
apiVersion: v1
kind: Secret
metadata:
  name: teams2slack-secret
type: Opaque
data:
  SLACK_TOKEN: # Slack token, base64-encoded
The values that can or need to be provided have the following meaning:
Environment Variable | Meaning |
---|---|
groupNamePrefix (mandatory) | The prefix the created slack group should have. Choose a prefix that matches your organization. |
groupNameSuffix (mandatory) | The suffix the created slack group should have. Choose a suffix that matches your organization. |
infoChannelID (mandatory) | The channel ID created Slack Groups should have. You can currently define one slack ID which will be applied to all created groups. Make sure to take the channel ID and not the channel name. |
token (mandatory) | The Slack token used to authenticate against Slack. |
eventRequeueTimer (optional) | If a Slack API request fails due to a network error, or because data is currently being fetched, it is requeued to the operator's workqueue. Uses the Go duration format (1s = every second, 1m = every minute). |
loadDataBackoffTimer (optional) | Defines when a Slack API data call occurs. Uses the Go duration format. |
dryRun (optional) | Slack write operations are not executed if the value is set to true. Requires a valid SLACK_TOKEN; the other environment variables can be mocked. |
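The optional values are set in the same optionValues list as the mandatory ones. The fragment below is a sketch for a secondary instance running in Dry-Run mode; the timer value is purely illustrative, and the value types (boolean, duration string) are assumptions.

```yaml
spec:
  optionValues:
    # Do not execute Slack write operations (secondary instance)
    - name: dryRun
      value: true
    # Illustrative value: requeue failed Slack API requests every minute
    - name: eventRequeueTimer
      value: 1m
```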
4.2.16 - Thanos
Learn more about the Thanos Plugin. Use it to enable extended metrics retention and querying across Prometheus servers and Greenhouse clusters.
The main terminologies used in this document can be found in core-concepts.
Overview
Thanos is a set of components that can be used to extend the storage and retrieval of metrics in Prometheus. It allows you to store metrics in a remote object store and query them across multiple Prometheus servers and Greenhouse clusters. This Plugin is intended to provide a set of pre-configured Thanos components that enable a proven composition. At the core, a set of Thanos components is installed that adds long-term storage capability to a single kube-monitoring Plugin and makes both current and historical data available again via one Thanos Query component.
The Thanos Sidecar is a component that is deployed as a container together with a Prometheus instance. This allows Thanos to optionally upload metrics to the object store and Thanos Query to access Prometheus data via a common, efficient StoreAPI.
The Thanos Compact component applies the Prometheus 2.0 Storage Engine compaction process to data uploaded to the object store. The Compactor is also responsible for applying the configured retention and downsampling of the data.
The Thanos Store also implements the StoreAPI and serves the historical data from an object store. It acts primarily as an API gateway and has no persistence itself.
Thanos Query implements the Prometheus HTTP v1 API for querying data in a Thanos cluster via PromQL. In short, it collects the data needed to evaluate the query from the connected StoreAPIs, evaluates the query and returns the result.
This plugin deploys the following Thanos components:
Planned components:
This Plugin does not deploy the following components:
- Thanos Sidecar This component is installed in the kube-monitoring plugin.
Disclaimer
This Plugin is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to use Thanos as a Greenhouse Plugin on your Kubernetes cluster. The guide is meant to build the following setup.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- Ready to use credentials for a compatible object store
- kube-monitoring plugin installed. Thanos Sidecar on the Prometheus must be enabled by providing the required object store credentials.
Step 1:
Create a Kubernetes Secret with your object store credentials following the Object Store preparation section.
Step 2:
Enable the Thanos Sidecar on the Prometheus in the kube-monitoring plugin by providing the required object store credentials. Follow the kube-monitoring plugin enablement section.
Step 3:
Create a Thanos Query Plugin by following the Thanos Query section.
Configuration
Object Store preparation
To run Thanos, you need object storage credentials. Get the credentials of your provider and add them to a Kubernetes Secret. The Thanos documentation provides a great overview on the different supported store types.
Usually this looks somewhat like this:
type: $STORAGE_TYPE
config:
  user:
  password:
  domain:
  ...
Once you have everything in a file, deploy it to your remote cluster in the namespace where Prometheus and Thanos will run.
Important: $THANOS_PLUGIN_NAME is needed later for the respective Thanos plugin, and the two names must be identical!
kubectl create secret generic $THANOS_PLUGIN_NAME-metrics-objectstore --from-file=thanos.yaml=/path/to/your/file
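If you prefer a declarative manifest over the kubectl command, the Secret could be sketched as follows. The name assumes a hypothetical $THANOS_PLUGIN_NAME of thanos-kube, the namespace assumes the default kube-monitoring, and the S3 store is only one of the types Thanos supports; all values are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Assumption: $THANOS_PLUGIN_NAME is "thanos-kube"
  name: thanos-kube-metrics-objectstore
  # Assumption: Prometheus and Thanos run in kube-monitoring
  namespace: kube-monitoring
type: Opaque
stringData:
  # The key must be thanos.yaml, matching --from-file=thanos.yaml=...
  thanos.yaml: |
    type: S3
    config:
      bucket: my-metrics-bucket   # placeholder
      endpoint: s3.example.com    # placeholder
      access_key: <access-key>    # placeholder
      secret_key: <secret-key>    # placeholder
```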
kube-monitoring plugin enablement
Prometheus in kube-monitoring needs to be altered to have a sidecar and ship metrics to the new object store too. You have to provide the Secret you’ve just created to the (most likely already existing) kube-monitoring plugin. Add this:
spec:
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret.key
      value: thanos.yaml
    - name: kubeMonitoring.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret.name
      value: $THANOS_PLUGIN_NAME-metrics-objectstore
Values used here are described in the Prometheus Operator Spec.
Thanos Query
This is the real deal now: Define your Thanos Query by creating a plugin.
NOTE1: $THANOS_PLUGIN_NAME needs to be consistent with your secret created earlier.
NOTE2: The releaseNamespace needs to be the namespace in which kube-monitoring resides. By default this is kube-monitoring.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: $YOUR_CLUSTER_NAME
spec:
  pluginDefinition: thanos
  disabled: false
  clusterName: $YOUR_CLUSTER_NAME
  releaseNamespace: kube-monitoring
[OPTIONAL] Handling your Prometheus and Thanos Stores.
Default Prometheus and Thanos Endpoint
Thanos Query automatically adds the Prometheus and Thanos endpoints. If you just have a single Prometheus with Thanos enabled, this works out of the box. Details follow in the next two sections. See Standalone Query for your own configuration.
Prometheus Endpoint
Thanos Query checks for a service prometheus-operated in the same namespace that exposes the gRPC port 10901. The CLI option looks like this and is configured in the Plugin itself:
--store=prometheus-operated:10901
Thanos Endpoint
Thanos Query checks for a Thanos endpoint named releaseName-store. The associated command line flag looks like this:
--store=thanos-kube-store:10901
If you have only one occurrence of this Thanos plugin deployed, the default option works and nothing else is needed.
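To make the defaults concrete, here is a minimal Go sketch of how the two --store flags above are composed. The function name defaultStoreFlags and the release name thanos-kube are only illustrative; the service name prometheus-operated, the -store suffix and port 10901 are the values described in this guide.

```go
package main

import "fmt"

// grpcPort is the Thanos gRPC port used throughout this guide.
const grpcPort = 10901

// defaultStoreFlags is a hypothetical helper that returns the two --store
// flags Thanos Query receives by default for a given release name.
func defaultStoreFlags(releaseName string) []string {
	return []string{
		// Prometheus sidecar, reachable via the prometheus-operated service.
		fmt.Sprintf("--store=prometheus-operated:%d", grpcPort),
		// Thanos Store of the same release, named <releaseName>-store.
		fmt.Sprintf("--store=%s-store:%d", releaseName, grpcPort),
	}
}

func main() {
	// "thanos-kube" is an example release name, not a required value.
	for _, f := range defaultStoreFlags("thanos-kube") {
		fmt.Println(f)
	}
}
```

With a single Thanos plugin per cluster, both endpoints follow from the release name alone, which is why no extra configuration is required.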
Standalone Query
In case you want to achieve a setup like above and have an overarching Thanos Query run with multiple Stores, you can set it to standalone and add your own store list. Set up your Plugin like this:
spec:
  optionValues:
    - name: thanos.query.standalone
      value: true
This would enable you to either:
- query multiple stores with a single Query
spec:
  optionValues:
    - name: thanos.query.stores
      value:
        - thanos-kube-1-store:10901
        - thanos-kube-2-store:10901
        - kube-monitoring-1-prometheus:10901
        - kube-monitoring-2-prometheus:10901
- query multiple Thanos Queries with a single Query. Note that there is no -store suffix in this case.
spec:
  optionValues:
    - name: thanos.query.stores
      value:
        - thanos-kube-1:10901
        - thanos-kube-2:10901
Operations
Thanos Compactor
If you deploy the plugin with the default values, Thanos compactor will be shipped too and use the same secret ($THANOS_PLUGIN_NAME-metrics-objectstore
) to retrieve, compact and push back timeseries.
Based on experience, a 100Gi PVC is used in order not to overload the ephemeral storage of the Kubernetes nodes. Depending on the configured retention and the amount of metrics, this may not be sufficient and larger volumes may be required. In any case, it is always safe to clear the volume of the compactor and increase it if necessary.
The object storage costs are heavily impacted by how granularly timeseries are stored (see Downsampling). These are the pre-configured defaults; you can change them as needed:
raw: 7776000s (90d)
5m: 7776000s (90d)
1h: 157680000s (5y)
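When changing these values, it helps to convert the intended retention period into seconds before writing it into the configuration. A minimal Go sketch, assuming a year is counted as 365 days (the helper name retentionSeconds is illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// retentionSeconds converts a retention period given in days into seconds.
func retentionSeconds(days int) int64 {
	return int64((time.Duration(days) * 24 * time.Hour).Seconds())
}

func main() {
	fmt.Printf("90d = %ds\n", retentionSeconds(90))    // 90 days
	fmt.Printf("5y  = %ds\n", retentionSeconds(5*365)) // 5 years of 365 days each
}
```

By this calculation, 90 days correspond to 7776000 seconds and five 365-day years to 157680000 seconds.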
5 - Contribute
The Greenhouse core platform serves as a comprehensive cloud operations solution, providing centralized control and management for cloud infrastructure and applications.
Its extensibility is achieved through the development and integration of plugins, allowing organizations to adapt and enhance the platform to accommodate their specific operational needs, ultimately promoting efficiency and compliance across their cloud environments.
The Greenhouse team welcomes all contributions to the project.
5.1 - Local development setup
What is Greenhouse?
Greenhouse is a Kubernetes operator built with Kubebuilder and a UI on top of the k8s API.
It expands the Kubernetes API via CustomResourceDefinitions. The different aspects of the CRDs are reconciled by several controllers. It also acts as an admission webhook.
The Greenhouse Dashboard is a UI acting on the k8s apiserver of the cluster Greenhouse is running in. It includes a dashboard and an Organization admin consisting of several Juno micro frontends.
This guide provides the following:
Local Setup
1.1 Mock k8s Server
1.2 Greenhouse Controller
1.3 Greenhouse UI
1.4 docker compose
1.5 Bootstrap
Local Setup
Quick start the local setup with docker compose
Note: For the time being, the images published in our registry are linux/amd64 only. If your default platform differs, export this environment variable so that Docker uses the linux/amd64 platform:
export DOCKER_DEFAULT_PLATFORM=linux/amd64
Env Var Overview
Env Var | Meaning |
---|---|
KUBEBUILDER_ATTACH_CONTROL_PLANE_OUTPUT | If set to true, the mock server will additionally log apiserver and etcd logs |
DEV_ENV_CONTEXT | Mocks permissions on the mock api server, see Mock k8s Server for details |
Mock k8s Server, a.k.a. envtest
The Greenhouse controller needs a Kubernetes API to run its reconciliation against. This k8s API needs to know about the Greenhouse CRDs to maintain the state of their respective resources. It also needs to know about any running admission/validation webhooks.
We provide a local mock k8s apiserver and etcd leveraging the envtest package of SIG controller-runtime. This comes with the CRDs and MutatingWebhookConfiguration installed and provides a little bit of utility. Find the docker image on our registry.
Additionally it will bootstrap some users with different permissions in a test-org
. The test-org
resource does not yet exist on the apiserver, but can be bootstrapped from the test-org.yaml which is done for you if you use the docker compose setup.
Running the image will:
- spin up the apiserver and etcd
- deploy CRDs and the webhook
- create some users with respective contexts and certificates
- finally proxy the apiserver via kubectl proxy to 127.0.0.1:8090.
The latter is done to avoid painful authentication to the local apiserver.
We still can showcase different permission levels on the apiserver, by setting context
via the env var DEV_ENV_CONTEXT
.
DEV_ENV_CONTEXT | Permissions |
---|---|
unset | all, a.k.a. k8s cluster-admin |
test-org-member | org-member as provided by the org controller |
test-org-admin | org-admin as provided by the org controller |
test-org-cluster-admin | cluster-admin as provided by the org controller |
test-org-plugin-admin | plugin-admin as provided by the org controller |
To access the running apiserver instance, some kubeconfig
files and client certificates are created on the container in the /envtest
folder.
The internal.kubeconfig file uses the different certificates and contexts to directly address the apiserver running on port 6884.
The kubeconfig
file uses the proxied context without authentication running on port 8090
. It is also scoped to the namespace test-org
.
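Because the proxied endpoint requires no authentication, plain HTTP requests are enough to inspect the mock apiserver. The following Go sketch builds the URL for listing Greenhouse resources in the test-org namespace and queries it. The helper listURL and the resource name plugins are assumptions made for illustration; the API group greenhouse.sap/v1alpha1 is the one used in the Plugin manifests in this documentation.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// proxyAddr is where the envtest image exposes `kubectl proxy`.
const proxyAddr = "http://127.0.0.1:8090"

// listURL is a hypothetical helper that builds the list URL for a namespaced
// Greenhouse resource; resource is the lowercase plural, e.g. "plugins".
func listURL(namespace, resource string) string {
	return fmt.Sprintf("%s/apis/greenhouse.sap/v1alpha1/namespaces/%s/%s",
		proxyAddr, namespace, resource)
}

func main() {
	url := listURL("test-org", "plugins")
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		// This branch is taken when the envtest container is not running.
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(url, "->", resp.Status)
}
```

The same request against port 6884 would need the client certificates from the internal.kubeconfig file, which is why the proxied port is the more convenient one for quick checks.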
Choose the respective ports to be exposed on your localhost when running the image, or expose them all by running in network host mode.
We are reusing the autogenerated certificates of the dev-env
for authenticating the webhook server on localhost
. The files are stored on /webhook-certs
on the container.
It is good practice to mount local volumes to these folders, running the image as such:
docker run --network host -e DEV_ENV_CONTEXT=<your-context> -v ./envtest:/envtest -v /tmp/k8s-webhook-server/serving-certs:/webhook-certs ghcr.io/cloudoperators/greenhouse-dev-env:main
Greenhouse Controller
Run your local go code from ./cmd/greenhouse
with the minimal configuration necessary (this example points the controller to run against the local mock apiserver):
go run . --dns-domain localhost --kubeconfig ./envtest/kubeconfig
Make sure the webhook server certs are placed in /tmp/k8s-webhook-server/serving-certs
Or run our greenhouse image as such:
docker run --network host -e KUBECONFIG=/envtest/kubeconfig -v ./envtest:/envtest -v /tmp/k8s-webhook-server/serving-certs:/tmp/k8s-webhook-server/serving-certs ghcr.io/cloudoperators/greenhouse:main --dns-domain localhost
See all available flags here.
Greenhouse UI
Use the latest upstream
Either pull or start the docker-compose to retrieve the latest juno-app-greenhouse release.
Building the UI locally
The Greenhouse UI is located in the cloudoperators/juno repository.
Clone the repository cloudoperators/juno
Run docker buildx build:
$ docker buildx build --platform=linux/amd64 -t ghcr.io/cloudoperators/juno-app-greenhouse:latest -f apps/greenhouse/docker/Dockerfile .
NOTE: Building the image is rather resource heavy on your machine. For reference, using colima:
Able to build? | PROFILE | STATUS | ARCH | CPUS | MEMORY | DISK | RUNTIME | ADDRESS |
---|---|---|---|---|---|---|---|---|
No | default | Running | aarch64 | 2 | 4GiB | 100GiB | docker | |
Yes | default | Running | aarch64 | 4 | 8GiB | 100GiB | docker | |
Start the UI Docker
$ docker run -p 3000:80 -v ./ui/appProps.json:/appProps.json ghcr.io/cloudoperators/juno-app-greenhouse:latest
Note: We inject a props template prepared for the dev-env that expects the k8s API to run on 127.0.0.1:8090, which is the default exported by the mock API server image. Authentication will also be mocked. Have a look at the props template to point your local UI to other running Greenhouse instances.
Start the UI with node (with support for live reloads). Follow the instructions in the cloudoperators/juno repository.
Access the UI on localhost:3000.
Note: Running the code locally only watches and live reloads the local code (changes) of the dashboard micro frontend (MFE). This is not true for the embedded MFEs. Run those separately with respective props pointing to the mock k8s apiserver for development.
docker compose
If you do not need or want to run your local code but want to run a set of Greenhouse images we provide a setup with docker compose:
Navigate to the dev-env dir, and start docker compose.
cd ./dev-env
docker compose up
You might need to build the dev-ui image manually; in that case, follow the steps above.
(Alternative) The network-host.docker-compose.yaml provides the same setup but starts all containers in host network mode instead.
Bootstrap
The docker-compose setup per default bootstraps an Organization test-org to your cluster, which is the bare minimum to get the dev-env
working.
By running:
docker compose run bootstrap kubectl apply -f /bootstrap/additional_resources
or by uncommenting the “additional resources” in the command of the bootstrap container in the docker-compose file, the following resources are created automatically:
- test-team-1, test-team-2, test-team-3 within Organization test-org,
- respective dummy teammemberships for both teams,
- cluster-1, cluster-2, cluster-3 and self with different conditions and states,
- some dummy nodes for clusters,
- some plugindefinitions with plugins across the clusters.
Note: These resources are intended to showcase the UI and produce a lot of “noise” on the Greenhouse controller.
Add any additional resources you need to the ./bootstrap
folder.
Run And Debug The Code
Spin up the envtest
container only, e.g. via:
docker compose up envtest
Reuse the certs created by envtest
for locally serving the webhooks by copying them to the default location kubebuilder expects webhook certs at:
cp ./webhook-certs/* /tmp/k8s-webhook-server/serving-certs
Note: On macOS, use $TMPDIR instead of /tmp.
Start your debugging process in your IDE, pointing it to the envtest kubeconfig at ./envtest/kubeconfig. Do not forget to pass the --dns-domain=localhost flag.
Run The Tests
For running e2e
tests see here.
As with the local setup, our unit tests run against an envtest mock cluster. To install the setup-envtest tool run
make envtest
which will install setup-envtest
to your $(LOCALBIN)
, usually ./bin
.
To run all tests from cli:
make test
To run tests independently make sure the $(KUBEBUILDER_ASSETS)
env var is set. This variable contains the path to the binary to use for starting up the mock controlplane with the respective k8s version on your architecture.
Print the path by executing:
./bin/setup-envtest use <your-preferred-k8s-version> -p path
Env Vars Overview In Testing
Env Var | Meaning |
---|---|
TEST_EXPORT_KUBECONFIG | If set to true, the kubeconfigs of the envtest controlplanes will be written to temporary files and their location will be printed on screen. Useful for accessing the mock clusters when setting break points in tests. |
5.2 - Contributing a Plugin
What is a Plugin?
A Plugin is a key component that provides additional features and functionalities, and may add new tools or integrations to the Greenhouse project.
Plugins are developed in a decentralized manner by the respective domain experts.
A YAML specification outlines the components that are to be installed and describes mandatory and optional, instance-specific configuration values.
It can consist of two main parts:
- Juno micro frontend: This integrates with the Greenhouse dashboard, allowing users to interact with the Plugin’s features seamlessly within the Greenhouse UI.
- Backend component: It can include backend logic that supports the Plugin’s functionality.
Contribute
Additional ideas for plugins are very welcome!
The Greenhouse plugin catalog is defined in the Greenhouse extensions repository.
To get started, please file an issue and provide a concise description of the proposed plugin here.
A Greenhouse plugin consists of a Juno micro frontend that integrates with the Greenhouse UI and/or a backend component described via Helm chart.
Contributing a plugin requires the technical skills to write Helm charts and proficiency in JavaScript.
Moreover, documentation needs to be developed to help users understand the plugin capabilities as well as how to incorporate it.
Additionally, the plugin needs to be maintained by at least one individual or a team to ensure ongoing functionality and usability within the Greenhouse ecosystem.
Development
Developing a plugin for the Greenhouse platform involves several steps, including defining the plugin, creating the necessary components, and integrating them into Greenhouse.
Here’s a high-level overview of how to develop a plugin for Greenhouse:
Define the Plugin:
- Clearly define the purpose and functionality of your plugin.
- What problem does it solve, and what features will it provide?
Plugin Configuration (plugin.yml):
- Create a greenhouse.yml file in the root of your repository to specify the plugin’s metadata and configuration options. This YAML file should include details like the plugin’s description, version, and any configuration values required.
Plugin Components:
- Develop the plugin’s components, which may include both frontend and backend components.
- For the frontend, you can use Juno microfrontend components to integrate with the Greenhouse UI seamlessly.
- The backend component handles the logic and functionality of your plugin. This may involve interacting with external APIs, processing data, and more.
Testing & Validation:
- Test your plugin thoroughly to ensure it works as intended. Verify that both the frontend and backend components function correctly.
- Implement validation for your plugin’s configuration options. This helps prevent users from providing incorrect or incompatible values.
- Implement Helm Chart Tests for your plugin if it includes a Helm Chart. For more information on how to write Helm Chart Tests, please refer to this guide.
Documentation:
- Create comprehensive documentation for your plugin. This should include installation instructions, configuration details, and usage guidelines.
Integration with Greenhouse:
- Integrate your plugin with the Greenhouse platform by configuring it using the Greenhouse UI. This may involve specifying which organizations can use the plugin and setting up any required permissions.
Publishing:
- Publish your plugin to Greenhouse once it’s fully tested and ready for use. This makes it available for organizations to install and configure.
Support and Maintenance:
- Provide ongoing support for your plugin, including bug fixes and updates to accommodate changes in Greenhouse or external dependencies.
Community Involvement:
- Consider engaging with the Greenhouse community, if applicable, by seeking feedback, addressing issues, and collaborating with other developers.
5.3 - Greenhouse Controller Development
Bootstrap a new Controller
Before getting started please make sure you have read the contribution guidelines.
Greenhouse is built using Kubebuilder as the framework for Kubernetes controllers. To create a new controller, you can use the kubebuilder CLI tool.
This project was generated with Kubebuilder v3, which requires Kubebuilder CLI <= v3.15.1. Since this project does not follow the Kubebuilder v3 scaffolding structure, it is necessary to create a symlink to the main.go:
ln -s ./cmd/greenhouse/main.go main.go
To create a new controller, run the following command:
kubebuilder create api --group greenhouse --version v1alpha1 --kind MyResource
Now that the files have been generated, they need to be copied to the correct location:
mv ./apis/greenhouse/v1alpha1/myresource_types.go ./pkg/apis/v1alpha1/
mv ./controllers/greenhouse/myresource_controller.go ./pkg/controllers/<kind>/myresource_controller.go
After having moved the files, you need to fix the imports in the myresource_controller.go file.
Also ensure that the entry for the resource in the PROJECT
file points to the correct location.
The new Kind should be added to the list under charts/manager/crds/kustomization.yaml.
The new Controller needs to be registered in the controller manager in cmd/greenhouse/main.go.
All other generated files can be deleted.
Now you can generate all manifests with make generate-manifests
and start implementing your controller logic.
Implementing the Controller
Within Greenhouse the controllers implement the lifecycle.Reconciler
interface. This allows for consistency between the controllers and ensures finalizers, status updates and other common controller logic is implemented in a consistent way. For examples on how this is used please refer to the existing controllers.
Testing the Controller
Unit/Integration tests for the controllers use Kubebuilder’s envtest environment and are implemented using Ginkgo and Gomega. For examples on how to write tests please refer to the existing tests. There are also some helper functions in the pkg/test
package that can be used to simplify the testing of controllers.
For e2e tests, please refer to the test/e2e/README.md
.
6 -
All ADRs have been migrated to cloudoperators/documentation.
7 -
Greenhouse documentation
This directory contains the documentation for Greenhouse, the PlusOne operations platform.
All directories containing an _index.md
with the following content are synchronized to the website.
---
title: "<title>"
linkTitle: "<link>"
landingSectionIndex: <true|false>
description: >
<Long description of the content>
---
You can execute the following command to serve the documentation locally:
make serve-docs