This section contains reference documentation for Greenhouse.
Reference
- 1: API
- 2: Plugin Catalog
- 2.1: Alerts
- 2.2: Cert-manager
- 2.3: Decentralized Observer of Policies (Violations)
- 2.4: Designate Ingress CNAME operator (DISCO)
- 2.5: DigiCert issuer
- 2.6: External DNS
- 2.7: Github Guard
- 2.8: Ingress NGINX
- 2.9: Kubernetes Monitoring
- 2.10: Logs Plugin
- 2.11: Logshipper
- 2.12: OpenSearch
- 2.13: Perses
- 2.14: Plutono
- 2.15: Prometheus
- 2.16: Service exposure test
- 2.17: Teams2Slack
- 2.18: Thanos
1 - API
Packages:
greenhouse.sap/v1alpha1
Resource Types:
Authentication
(Appears on: OrganizationSpec)
Field | Description |
---|---|
oidc OIDCConfig | OIDCConfig configures the OIDC provider. |
scim SCIMConfig | SCIMConfig configures the SCIM client. |
Cluster
Cluster is the Schema for the clusters API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec ClusterSpec | |
status ClusterStatus | |
ClusterAccessMode
(string alias)
(Appears on: ClusterSpec)
ClusterAccessMode configures the access mode to the customer cluster.
ClusterConditionType
(string alias)
ClusterConditionType is a valid condition of a cluster.
ClusterKubeConfig
(Appears on: ClusterSpec)
ClusterKubeConfig configures kube config values.
Field | Description |
---|---|
maxTokenValidity int32 | MaxTokenValidity specifies the maximum duration for which a token remains valid in hours. |
ClusterKubeconfig
ClusterKubeconfig is the Schema for the clusterkubeconfigs API. ObjectMeta.OwnerReferences is used to link the ClusterKubeconfig to the Cluster. ObjectMeta.Generation is used to detect changes in the ClusterKubeconfig and sync local kubeconfig files. ObjectMeta.Name is designed to be the same as the Cluster name.
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec ClusterKubeconfigSpec | |
status ClusterKubeconfigStatus | |
ClusterKubeconfigAuthInfo
(Appears on: ClusterKubeconfigAuthInfoItem)
Field | Description |
---|---|
auth-provider k8s.io/client-go/tools/clientcmd/api.AuthProviderConfig | |
client-certificate-data []byte | |
client-key-data []byte |
ClusterKubeconfigAuthInfoItem
(Appears on: ClusterKubeconfigData)
Field | Description |
---|---|
name string | |
user ClusterKubeconfigAuthInfo |
ClusterKubeconfigCluster
(Appears on: ClusterKubeconfigClusterItem)
Field | Description |
---|---|
server string | |
certificate-authority-data []byte |
ClusterKubeconfigClusterItem
(Appears on: ClusterKubeconfigData)
Field | Description |
---|---|
name string | |
cluster ClusterKubeconfigCluster |
ClusterKubeconfigContext
(Appears on: ClusterKubeconfigContextItem)
Field | Description |
---|---|
cluster string | |
user string | |
namespace string |
ClusterKubeconfigContextItem
(Appears on: ClusterKubeconfigData)
Field | Description |
---|---|
name string | |
context ClusterKubeconfigContext |
ClusterKubeconfigData
(Appears on: ClusterKubeconfigSpec)
ClusterKubeconfigData stores the kubeconfig data, ready to use with kubectl or other local tooling. It is a simplified version of clientcmdapi.Config: https://pkg.go.dev/k8s.io/client-go/tools/clientcmd/api#Config
Field | Description |
---|---|
kind string | |
apiVersion string | |
clusters []ClusterKubeconfigClusterItem | |
users []ClusterKubeconfigAuthInfoItem | |
contexts []ClusterKubeconfigContextItem | |
current-context string | |
preferences ClusterKubeconfigPreferences |
ClusterKubeconfigPreferences
(Appears on: ClusterKubeconfigData)
ClusterKubeconfigSpec
(Appears on: ClusterKubeconfig)
ClusterKubeconfigSpec stores the kubeconfig data for the cluster. The idea is to use the kubeconfig data locally with minimum effort (with local tools or plain kubectl): kubectl get cluster-kubeconfig $NAME -o yaml | yq -y .spec.kubeconfig
Field | Description |
---|---|
kubeconfig ClusterKubeconfigData |
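To illustrate how the ClusterKubeconfigData fields nest, the following is a minimal sketch of what the spec.kubeconfig section might contain. The field names come from the tables above; every value (cluster name, server URL, user name, auth provider name) is a hypothetical placeholder, and the kind and apiVersion values assume the standard kubeconfig format.

```yaml
# Sketch only: all values are placeholders, not taken from a real cluster.
spec:
  kubeconfig:
    kind: Config                 # assumed, as in a standard kubeconfig file
    apiVersion: v1               # assumed, as in a standard kubeconfig file
    clusters:
      - name: my-cluster
        cluster:
          server: https://api.my-cluster.example.com
          certificate-authority-data: "<base64-encoded CA bundle>"
    users:
      - name: oidc-user
        user:
          auth-provider:
            name: oidc           # hypothetical auth provider name
    contexts:
      - name: my-cluster
        context:
          cluster: my-cluster
          user: oidc-user
          namespace: default
    current-context: my-cluster
```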
ClusterKubeconfigStatus
(Appears on: ClusterKubeconfig)
Field | Description |
---|---|
statusConditions StatusConditions |
ClusterOptionOverride
(Appears on: PluginPresetSpec)
ClusterOptionOverride defines which plugin options should be overridden in which cluster.
Field | Description |
---|---|
clusterName string | |
overrides []PluginOptionValue |
ClusterSelector
ClusterSelector specifies a selector for clusters by name or by label with the option to exclude specific clusters.
Field | Description |
---|---|
clusterName string | Name of a single Cluster to select. |
labelSelector Kubernetes meta/v1.LabelSelector | LabelSelector is a label query over a set of Clusters. |
excludeList []string | ExcludeList is a list of Cluster names to exclude from LabelSelector query. |
ClusterSpec
(Appears on: Cluster)
ClusterSpec defines the desired state of the Cluster.
Field | Description |
---|---|
accessMode ClusterAccessMode | AccessMode configures how the cluster is accessed from the Greenhouse operator. |
kubeConfig ClusterKubeConfig | KubeConfig contains specific values for |
ClusterStatus
(Appears on: Cluster)
ClusterStatus defines the observed state of Cluster
Field | Description |
---|---|
kubernetesVersion string | KubernetesVersion reflects the detected Kubernetes version of the cluster. |
bearerTokenExpirationTimestamp Kubernetes meta/v1.Time | BearerTokenExpirationTimestamp reflects the expiration timestamp of the bearer token used to access the cluster. |
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the Cluster. |
nodes map[string]NodeStatus | Nodes provides a map of cluster node names to node statuses. |
Condition
(Appears on: ManagedPluginStatus, PropagationStatus, StatusConditions)
Condition contains additional information on the state of a resource.
Field | Description |
---|---|
type ConditionType | Type of the condition. |
status Kubernetes meta/v1.ConditionStatus | Status of the condition. |
reason ConditionReason | Reason is a one-word, CamelCase reason for the condition’s last transition. |
lastTransitionTime Kubernetes meta/v1.Time | LastTransitionTime is the last time the condition transitioned from one status to another. |
message string | Message is an optional human readable message indicating details about the last transition. |
ConditionReason
(string alias)
(Appears on: Condition)
ConditionReason is a valid reason for a condition of a resource.
ConditionType
(string alias)
(Appears on: Condition)
ConditionType is a valid condition of a resource.
HelmChartReference
(Appears on: PluginDefinitionSpec, PluginStatus)
HelmChartReference references a Helm Chart in a chart repository.
Field | Description |
---|---|
name string | Name of the HelmChart chart. |
repository string | Repository of the HelmChart chart. |
version string | Version of the HelmChart chart. |
HelmReleaseStatus
(Appears on: PluginStatus)
HelmReleaseStatus reflects the status of a Helm release.
Field | Description |
---|---|
status string | Status is the status of a HelmChart release. |
firstDeployed Kubernetes meta/v1.Time | FirstDeployed is the timestamp of the first deployment of the release. |
lastDeployed Kubernetes meta/v1.Time | LastDeployed is the timestamp of the last deployment of the release. |
pluginOptionChecksum string | PluginOptionChecksum is the checksum of plugin option values. |
diff string | Diff contains the difference between the deployed helm chart and the helm chart in the last reconciliation |
ManagedPluginStatus
(Appears on: PluginPresetStatus)
ManagedPluginStatus defines the Ready condition of a managed Plugin identified by its name.
Field | Description |
---|---|
pluginName string | |
readyCondition Condition |
NodeStatus
(Appears on: ClusterStatus)
Field | Description |
---|---|
statusConditions StatusConditions | We mirror the node conditions here for faster reference |
ready bool | Fast track to the node ready condition. |
OIDCConfig
(Appears on: Authentication)
Field | Description |
---|---|
issuer string | Issuer is the URL of the identity service. |
redirectURI string | RedirectURI is the redirect URI to be used for the OIDC flow against the upstream IdP. If none is specified, the Greenhouse ID proxy will be used. |
clientIDReference SecretKeyReference | ClientIDReference references the Kubernetes secret containing the client id. |
clientSecretReference SecretKeyReference | ClientSecretReference references the Kubernetes secret containing the client secret. |
oauth2ClientRedirectURIs []string | OAuth2ClientRedirectURIs are a registered set of redirect URIs. When redirecting from the idproxy to the client application, the URI requested to redirect to must be contained in this list. |
Organization
Organization is the Schema for the organizations API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec OrganizationSpec | |
status OrganizationStatus | |
OrganizationSpec
(Appears on: Organization)
OrganizationSpec defines the desired state of Organization
Field | Description |
---|---|
displayName string | DisplayName is an optional name for the organization to be displayed in the Greenhouse UI. Defaults to a normalized version of metadata.name. |
authentication Authentication | Authentication configures the organization's authentication mechanism. |
description string | Description provides additional details of the organization. |
mappedOrgAdminIdPGroup string | MappedOrgAdminIDPGroup is the IDP group ID identifying org admins |
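As an illustration of how these fields combine, here is a minimal sketch of an Organization manifest with OIDC authentication. The field names follow the OrganizationSpec, Authentication, OIDCConfig and SecretKeyReference tables; the organization name, issuer URL, group ID and secret names are hypothetical placeholders.

```yaml
# Sketch only: names, URLs and secret references are placeholders.
apiVersion: greenhouse.sap/v1alpha1
kind: Organization
metadata:
  name: my-organization
spec:
  displayName: My Organization
  description: Example organization
  mappedOrgAdminIdPGroup: MY_ORG_ADMINS    # hypothetical IdP group ID
  authentication:
    oidc:
      issuer: https://idp.example.com      # hypothetical identity service URL
      clientIDReference:
        name: oidc-credentials             # hypothetical secret name
        key: clientID
      clientSecretReference:
        name: oidc-credentials
        key: clientSecret
```

Because redirectURI is omitted in this sketch, the Greenhouse ID proxy would be used, as described in the OIDCConfig table.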
OrganizationStatus
(Appears on: Organization)
OrganizationStatus defines the observed state of an Organization
Field | Description |
---|---|
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the Organization. |
Plugin
Plugin is the Schema for the plugins API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec PluginSpec | |
status PluginStatus | |
PluginDefinition
PluginDefinition is the Schema for the PluginDefinitions API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec PluginDefinitionSpec | |
status PluginDefinitionStatus | |
PluginDefinitionSpec
(Appears on: PluginDefinition)
PluginDefinitionSpec defines the desired state of PluginDefinition
Field | Description |
---|---|
displayName string | DisplayName provides a human-readable label for the pluginDefinition. |
description string | Description provides additional details of the pluginDefinition. |
helmChart HelmChartReference | HelmChart specifies where the Helm Chart for this pluginDefinition can be found. |
uiApplication UIApplicationReference | UIApplication specifies a reference to a UI application |
options []PluginOption | Options is a list of values required to create an instance of this PluginDefinition. |
version string | Version of this pluginDefinition |
weight int32 | Weight configures the order in which Plugins are shown in the Greenhouse UI. Defaults to alphabetical sorting if not provided or on conflict. |
icon string | Icon specifies the icon to be used for this plugin in the Greenhouse UI. An icon can be either a string representing a Juno icon in camel case from this list: https://github.com/sapcc/juno/blob/main/libs/juno-ui-components/src/components/Icon/Icon.component.js#L6-L52, or a publicly accessible image reference to a .png file, which is displayed at 100x100px. |
docMarkDownUrl string | DocMarkDownUrl specifies the URL to the markdown documentation file for this plugin. Source needs to allow all CORS origins. |
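To show how the PluginDefinitionSpec fields fit together, the following is a minimal sketch of a Helm-backed PluginDefinition. The field names come from the PluginDefinitionSpec, HelmChartReference and PluginOption tables; the chart name, repository, option name and the option type value are assumptions, since the allowed PluginOptionType values are not listed in this reference.

```yaml
# Sketch only: chart coordinates and option details are placeholders.
apiVersion: greenhouse.sap/v1alpha1
kind: PluginDefinition
metadata:
  name: example-plugin
spec:
  displayName: Example Plugin
  description: Deploys an example workload
  version: "1.0.0"
  weight: 1
  helmChart:
    name: example-chart
    repository: oci://registry.example.com/charts   # hypothetical repository
    version: "1.0.0"
  options:
    - name: replicaCount
      displayName: Replica count
      description: Number of replicas to run
      type: int          # assumed PluginOptionType value
      default: 1
      required: false
```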
PluginDefinitionStatus
(Appears on: PluginDefinition)
PluginDefinitionStatus defines the observed state of PluginDefinition
PluginOption
(Appears on: PluginDefinitionSpec)
Field | Description |
---|---|
name string | Name/Key of the config option. |
default k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON | (Optional) Default provides a default value for the option |
description string | Description provides a human-readable text for the value as shown in the UI. |
displayName string | DisplayName provides a human-readable label for the configuration option |
required bool | Required indicates that this config option is required |
type PluginOptionType | Type of this configuration option. |
regex string | Regex specifies a match rule for validating configuration options. |
PluginOptionType
(string alias)
(Appears on: PluginOption)
PluginOptionType specifies the type of PluginOption.
PluginOptionValue
(Appears on: ClusterOptionOverride, PluginSpec)
PluginOptionValue is the value for a PluginOption.
Field | Description |
---|---|
name string | Name of the values. |
value k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON | Value is the actual value in plain text. |
valueFrom ValueFromSource | ValueFrom references a potentially confidential value in another source. |
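The difference between value and valueFrom can be sketched as follows. The structure follows the PluginOptionValue, ValueFromSource and SecretKeyReference tables; the option names and the secret name are hypothetical.

```yaml
# Sketch only: option names and the secret are placeholders.
optionValues:
  # Plain-text value stored directly in the resource.
  - name: example.logLevel
    value: debug
  # Confidential value read from a secret in the same namespace.
  - name: example.apiToken
    valueFrom:
      secret:
        name: example-credentials
        key: apiToken
```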
PluginPreset
PluginPreset is the Schema for the PluginPresets API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec PluginPresetSpec | |
status PluginPresetStatus | |
PluginPresetSpec
(Appears on: PluginPreset)
PluginPresetSpec defines the desired state of PluginPreset
Field | Description |
---|---|
plugin PluginSpec | PluginSpec is the spec of the plugin to be deployed by the PluginPreset. |
clusterSelector Kubernetes meta/v1.LabelSelector | ClusterSelector is a label selector to select the clusters the plugin bundle should be deployed to. |
clusterOptionOverrides []ClusterOptionOverride | ClusterOptionOverrides define plugin option values to override by the PluginPreset |
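As an illustration, a PluginPreset that deploys a Plugin to all clusters matching a label and overrides one option for a single cluster might look like the sketch below. The field names follow the PluginPresetSpec, PluginSpec and ClusterOptionOverride tables; the names, namespace, label and option are hypothetical.

```yaml
# Sketch only: names, labels and option values are placeholders.
apiVersion: greenhouse.sap/v1alpha1
kind: PluginPreset
metadata:
  name: example-preset
  namespace: my-organization       # hypothetical organization namespace
spec:
  plugin:
    pluginDefinition: example-plugin
    releaseNamespace: example
    optionValues:
      - name: replicaCount
        value: 1
  clusterSelector:
    matchLabels:
      environment: production      # hypothetical cluster label
  clusterOptionOverrides:
    - clusterName: my-cluster
      overrides:
        - name: replicaCount
          value: 3
```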
PluginPresetStatus
(Appears on: PluginPreset)
PluginPresetStatus defines the observed state of PluginPreset
Field | Description |
---|---|
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the PluginPreset. |
pluginStatuses []ManagedPluginStatus | PluginStatuses contains statuses of Plugins managed by the PluginPreset. |
availablePlugins int | AvailablePlugins is the number of available Plugins managed by the PluginPreset. |
readyPlugins int | ReadyPlugins is the number of ready Plugins managed by the PluginPreset. |
failedPlugins int | FailedPlugins is the number of failed Plugins managed by the PluginPreset. |
PluginSpec
(Appears on: Plugin, PluginPresetSpec)
PluginSpec defines the desired state of Plugin
Field | Description |
---|---|
pluginDefinition string | PluginDefinition is the name of the PluginDefinition this instance is for. |
displayName string | DisplayName is an optional name for the Plugin to be displayed in the Greenhouse UI. This is especially helpful to distinguish multiple instances of a PluginDefinition in the same context. Defaults to a normalized version of metadata.name. |
optionValues []PluginOptionValue | Values are the values for a PluginDefinition instance. |
clusterName string | ClusterName is the name of the cluster the plugin is deployed to. If not set, the plugin is deployed to the greenhouse cluster. |
releaseNamespace string | ReleaseNamespace is the namespace in the remote cluster to which the backend is deployed. Defaults to the Greenhouse managed namespace if not set. |
PluginStatus
(Appears on: Plugin)
PluginStatus defines the observed state of Plugin
Field | Description |
---|---|
helmReleaseStatus HelmReleaseStatus | HelmReleaseStatus reflects the status of the latest HelmChart release. This is only configured if the pluginDefinition is backed by HelmChart. |
version string | Version contains the latest pluginDefinition version the config was last applied with successfully. |
helmChart HelmChartReference | HelmChart contains a reference to the Helm chart used for the deployed pluginDefinition version. |
uiApplication UIApplicationReference | UIApplication contains a reference to the frontend that is used for the deployed pluginDefinition version. |
weight int32 | Weight configures the order in which Plugins are shown in the Greenhouse UI. |
description string | Description provides additional details of the plugin. |
exposedServices map[string]Service | ExposedServices provides an overview of the Plugin's services that are centrally exposed. It maps the exposed URL to the service found in the manifest. |
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the Plugin. |
PropagationStatus
(Appears on: TeamRoleBindingStatus)
PropagationStatus defines the observed state of the TeamRoleBinding’s associated rbacv1 resources on a Cluster
Field | Description |
---|---|
clusterName string | ClusterName is the name of the cluster the rbacv1 resources are created on. |
condition Condition | Condition is the overall Status of the rbacv1 resources created on the cluster |
SCIMConfig
(Appears on: Authentication)
Field | Description |
---|---|
baseURL string | URL to the SCIM server. |
authType github.com/cloudoperators/greenhouse/internal/scim.AuthType | AuthType defines the authentication type to use. |
basicAuthUser ValueFromSource | User to be used for basic authentication. |
basicAuthPw ValueFromSource | Password to be used for basic authentication. |
bearerToken ValueFromSource | BearerToken to be used for bearer token authorization |
bearerPrefix string | BearerPrefix defines the bearer token prefix. |
bearerHeader string | BearerHeader defines the bearer token header. |
SecretKeyReference
(Appears on: OIDCConfig, ValueFromSource)
SecretKeyReference specifies the secret and key containing the value.
Field | Description |
---|---|
name string | Name of the secret in the same namespace. |
key string | Key in the secret to select the value from. |
Service
(Appears on: PluginStatus)
Service references a Kubernetes service of a Plugin.
Field | Description |
---|---|
namespace string | Namespace is the namespace of the service in the target cluster. |
name string | Name is the name of the service in the target cluster. |
port int32 | Port is the port of the service. |
protocol string | Protocol is the protocol of the service. |
StatusConditions
(Appears on: ClusterKubeconfigStatus, ClusterStatus, NodeStatus, OrganizationStatus, PluginPresetStatus, PluginStatus, TeamMembershipStatus, TeamRoleBindingStatus, TeamStatus)
A StatusConditions contains a list of conditions. Only one condition of a given type may exist in the list.
Field | Description |
---|---|
conditions []Condition |
Team
Team is the Schema for the teams API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamSpec | |
status TeamStatus | |
TeamMembership
TeamMembership is the Schema for the teammemberships API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamMembershipSpec | |
status TeamMembershipStatus | |
TeamMembershipSpec
(Appears on: TeamMembership)
TeamMembershipSpec defines the desired state of TeamMembership
Field | Description |
---|---|
members []User | (Optional) Members list users that are part of a team. |
TeamMembershipStatus
(Appears on: TeamMembership)
TeamMembershipStatus defines the observed state of TeamMembership
Field | Description |
---|---|
lastSyncedTime Kubernetes meta/v1.Time | (Optional) LastSyncedTime is the time the membership was last synced. |
lastUpdateTime Kubernetes meta/v1.Time | (Optional) LastChangedTime is the time the membership was last actually changed. |
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the TeamMembership. |
TeamRole
TeamRole is the Schema for the TeamRoles API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamRoleSpec | |
status TeamRoleStatus | |
TeamRoleBinding
TeamRoleBinding is the Schema for the rolebindings API
Field | Description |
---|---|
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec TeamRoleBindingSpec | |
status TeamRoleBindingStatus | |
TeamRoleBindingSpec
(Appears on: TeamRoleBinding)
TeamRoleBindingSpec defines the desired state of a TeamRoleBinding
Field | Description |
---|---|
teamRoleRef string | TeamRoleRef references a Greenhouse TeamRole by name |
teamRef string | TeamRef references a Greenhouse Team by name |
usernames []string | Usernames defines list of users to add to the (Cluster-)RoleBindings |
clusterName string | ClusterName is the name of the cluster the rbacv1 resources are created on. |
clusterSelector Kubernetes meta/v1.LabelSelector | ClusterSelector is a label selector to select the Clusters the TeamRoleBinding should be deployed to. |
namespaces []string | Namespaces is a list of namespaces in the Greenhouse Clusters to apply the RoleBinding to. If empty, a ClusterRoleBinding will be created on the remote cluster, otherwise a RoleBinding per namespace. |
createNamespaces bool | When CreateNamespaces is enabled, the controller creates namespaces for RoleBindings if they do not exist. |
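To make the interaction of these fields concrete, the following sketch binds a Team to a TeamRole on all clusters matching a label. The field names come from the TeamRoleBindingSpec table; the team, role, namespace and label values are hypothetical.

```yaml
# Sketch only: team, role, label and namespace names are placeholders.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRoleBinding
metadata:
  name: example-binding
  namespace: my-organization       # hypothetical organization namespace
spec:
  teamRef: example-team
  teamRoleRef: pod-reader
  clusterSelector:
    matchLabels:
      environment: production      # hypothetical cluster label
  namespaces:
    - example                      # a RoleBinding is created per listed namespace
  createNamespaces: true
```

Leaving namespaces empty would, according to the table above, result in a ClusterRoleBinding on the remote cluster instead of a RoleBinding per namespace.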
TeamRoleBindingStatus
(Appears on: TeamRoleBinding)
TeamRoleBindingStatus defines the observed state of the TeamRoleBinding
Field | Description |
---|---|
statusConditions StatusConditions | StatusConditions contain the different conditions that constitute the status of the TeamRoleBinding. |
clusters []PropagationStatus | PropagationStatus is the list of clusters the TeamRoleBinding is applied to |
TeamRoleSpec
(Appears on: TeamRole)
TeamRoleSpec defines the desired state of a TeamRole
Field | Description |
---|---|
rules []Kubernetes rbac/v1.PolicyRule | Rules is a list of rbacv1.PolicyRules used on a managed RBAC (Cluster)Role |
aggregationRule Kubernetes rbac/v1.AggregationRule | AggregationRule describes how to locate ClusterRoles to aggregate into the ClusterRole on the remote cluster |
labels map[string]string | Labels are applied to the ClusterRole created on the remote cluster. This allows using TeamRoles as part of AggregationRules by other TeamRoles |
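A minimal sketch of a TeamRole, assuming a read-only role for pods, could look like this. The rules use the standard Kubernetes rbac/v1 PolicyRule fields; the role name, namespace and label are hypothetical.

```yaml
# Sketch only: the role name, namespace and label are placeholders.
apiVersion: greenhouse.sap/v1alpha1
kind: TeamRole
metadata:
  name: pod-reader
  namespace: my-organization       # hypothetical organization namespace
spec:
  rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "list", "watch"]
  labels:
    aggregate: "true"              # hypothetical label, applied to the created ClusterRole
```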
TeamRoleStatus
(Appears on: TeamRole)
TeamRoleStatus defines the observed state of a TeamRole
TeamSpec
(Appears on: Team)
TeamSpec defines the desired state of Team
Field | Description |
---|---|
description string | Description provides additional details of the team. |
mappedIdPGroup string | IdP group id matching team. |
joinUrl string | URL to join the IdP group. |
TeamStatus
(Appears on: Team)
TeamStatus defines the observed state of Team
Field | Description |
---|---|
statusConditions StatusConditions | |
members []User |
UIApplicationReference
(Appears on: PluginDefinitionSpec, PluginStatus)
UIApplicationReference references the UI pluginDefinition to use.
Field | Description |
---|---|
url string | URL specifies the url to a built javascript asset. By default, assets are loaded from the Juno asset server using the provided name and version. |
name string | Name of the UI application. |
version string | Version of the frontend application. |
User
(Appears on: TeamMembershipSpec, TeamStatus)
User specifies a human person.
Field | Description |
---|---|
id string | ID is the unique identifier of the user. |
firstName string | FirstName of the user. |
lastName string | LastName of the user. |
email string | Email of the user. |
ValueFromSource
(Appears on: PluginOptionValue, SCIMConfig)
ValueFromSource is a valid source for a value.
Field | Description |
---|---|
secret SecretKeyReference | Secret references the secret containing the value. |
This page was automatically generated with gen-crd-api-reference-docs
2 - Plugin Catalog
This section provides an overview of the available PluginDefinitions in Greenhouse.
2.1 - Alerts
Learn more about the alerts plugin. Use it to activate Prometheus alert management for your Greenhouse organisation.
The main terminologies used in this document can be found in core-concepts.
Overview
This Plugin includes a preconfigured Prometheus Alertmanager, which is deployed and managed via the Prometheus Operator, and Supernova, an advanced user interface for Prometheus Alertmanager. Certificates are automatically generated to enable sending alerts from Prometheus to Alertmanager. These alerts can also be sent as Slack notifications with a provided set of notification templates.
Components included in this Plugin:
- Prometheus Alertmanager
- Supernova
This Plugin is usually deployed alongside the kube-monitoring Plugin and does not deploy the Prometheus Operator itself. However, if you intend to use it stand-alone, you need to explicitly enable the deployment of the Prometheus Operator, otherwise it will not work. This can be done in the configuration interface of the plugin.
Disclaimer
This is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the plugin according to your needs.
The Plugin is a deeply configured kube-prometheus-stack Helm chart which helps to keep track of versions and community updates.
It is intended as a platform that can be extended by following the guide.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to use alerts as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- The kube-monitoring plugin (which brings in the Prometheus Operator). For stand-alone use, enable the deployment of the Prometheus Operator with this plugin instead.
Step 1:
You can install the alerts package in your cluster with Helm manually or let the Greenhouse platform lifecycle it for you automatically. For the latter, you can either:
- Go to the Greenhouse dashboard and select the Alerts Plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
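As a rough illustration of the second option, a Plugin resource for alerts might look like the sketch below. The structure follows the PluginSpec reference and the option names come from the configuration tables in this section; the PluginDefinition name, organization namespace, cluster name and host are assumptions.

```yaml
# Sketch only: names, namespace and host are placeholders.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: alerts
  namespace: my-organization         # hypothetical organization namespace
spec:
  pluginDefinition: alerts           # assumes the PluginDefinition is named "alerts"
  displayName: Alerts
  clusterName: my-cluster            # hypothetical onboarded cluster
  optionValues:
    - name: alerts.alertmanager.enabled
      value: true
    - name: alerts.alertmanager.ingress.enabled
      value: true
    - name: alerts.alertmanager.ingress.hosts
      value:
        - alertmanager.example.com   # hypothetical host
```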
Step 2:
After the installation, you can access the Supernova UI by navigating to the Alerts tab in the Greenhouse dashboard.
Step 3:
Greenhouse regularly performs integration tests that are bundled with alerts. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
Configuration
Prometheus Alertmanager options
Name | Description | Value |
---|---|---|
global.caCert | Additional caCert to add to the CA bundle | "" |
alerts.commonLabels | Labels to apply to all resources | {} |
alerts.defaultRules.create | Creates community Alertmanager alert rules. | true |
alerts.defaultRules.labels | kube-monitoring plugin: <plugin.name> to evaluate Alertmanager rules. | {} |
alerts.alertmanager.enabled | Deploy Prometheus Alertmanager | true |
alerts.alertmanager.annotations | Annotations for Alertmanager | {} |
alerts.alertmanager.config | Alertmanager configuration directives. | {} |
alerts.alertmanager.ingress.enabled | Deploy Alertmanager Ingress | false |
alerts.alertmanager.ingress.hosts | Must be provided if Ingress is enabled. | [] |
alerts.alertmanager.ingress.tls | Must be a valid TLS configuration for Alertmanager Ingress. Supernova UI passes the client certificate to retrieve alerts. | {} |
alerts.alertmanager.ingress.ingressClassname | Specifies the ingress-controller | nginx |
alerts.alertmanager.servicemonitor.additionalLabels | kube-monitoring plugin: <plugin.name> to scrape Alertmanager metrics. | {} |
alerts.alertmanager.alertmanagerConfig.slack.routes[].name | Name of the Slack route. | "" |
alerts.alertmanager.alertmanagerConfig.slack.routes[].channel | Slack channel to post alerts to. Must be defined with slack.webhookURL. | "" |
alerts.alertmanager.alertmanagerConfig.slack.routes[].webhookURL | Slack webhookURL to post alerts to. Must be defined with slack.channel. | "" |
alerts.alertmanager.alertmanagerConfig.slack.routes[].matchers | List of matchers that the alert’s label should match: matchType, name, regex, value | [] |
alerts.alertmanager.alertmanagerConfig.webhook.routes[].name | Name of the webhook route. | "" |
alerts.alertmanager.alertmanagerConfig.webhook.routes[].url | Webhook url to post alerts to. | "" |
alerts.alertmanager.alertmanagerConfig.webhook.routes[].matchers | List of matchers that the alert’s label should match: matchType, name, regex, value | [] |
alerts.alertmanager.alertmanagerSpec.alertmanagerConfiguration | AlertmanagerConfig to be used as top level configuration | false |
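To show how the Slack route options relate to each other, the following values fragment is a minimal sketch. The nesting is derived from the option paths in the table; the route name, channel, webhook URL and matcher are placeholders, and the matchType value assumes the matcher syntax of the Prometheus Operator.

```yaml
# Sketch only: route name, channel, webhook URL and matcher are placeholders.
alerts:
  alertmanager:
    alertmanagerConfig:
      slack:
        routes:
          - name: slack-critical
            channel: "#alerts-critical"
            webhookURL: https://hooks.slack.example.com/services/PLACEHOLDER
            matchers:
              - matchType: "="     # assumed operator matcher syntax
                name: severity
                value: critical
```

Note that channel and webhookURL are set together, as the table requires.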
cert-manager options
Name | Description | Value |
---|---|---|
alerts.certManager.enabled | Creates jetstack/cert-manager resources to generate Issuer and Certificates for Prometheus authentication. | true |
alerts.certManager.rootCert.duration | Duration, how long the root certificate is valid. | "5y" |
alerts.certManager.admissionCert.duration | Duration, how long the admission certificate is valid. | "1y" |
alerts.certManager.issuerRef.name | Name of the existing Issuer to use. | "" |
Supernova options
theme
: Override the default theme. Possible values are "theme-light" or "theme-dark" (default).
endpoint
: Alertmanager API endpoint URL /api/v2. Should be one of alerts.alertmanager.ingress.hosts.
silenceExcludedLabels
: SilenceExcludedLabels are labels that are excluded by default when creating a silence. However, they can be added if necessary when using the advanced options in the silence form. The labels must be an array of strings. Example: ["pod", "pod_name", "instance"]
filterLabels
: FilterLabels are the labels shown in the filter dropdown, enabling users to filter alerts based on specific criteria. The ‘Status’ label serves as a default filter; it is automatically computed from the alert status attribute and will not be overwritten. The labels must be an array of strings. Example: ["app", "cluster", "cluster_type"]
predefinedFilters
: PredefinedFilters are filters applied in the UI to differentiate between contexts by matching alerts with regular expressions. They are loaded by default when the application is loaded. The format is a list of objects including name, displayName and matchers (containing keys with corresponding values). Example:
[
  {
    "name": "prod",
    "displayName": "Productive System",
    "matchers": {
      "region": "^prod-.*"
    }
  }
]
silenceTemplates
: SilenceTemplates are used in the Modal (schedule silence) to allow pre-defined silences to be used to schedule maintenance windows. The format consists of a list of objects including description, editable_labels (array of strings specifying the labels that users can modify), fixed_labels (map containing fixed labels and their corresponding values), status, and title. Example:
"silenceTemplates": [
{
"description": "Description of the silence template",
"editable_labels": ["region"],
"fixed_labels": {
"name": "Marvin",
},
"status": "active",
"title": "Silence"
}
]
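The Supernova options above could be passed through the Plugin's optionValues in the same way the Examples section sets endpoint and filterLabels. The following is a minimal sketch, not a verified configuration: it assumes that predefinedFilters and silenceTemplates accept structured values under the same option names, and all values are illustrative.

```yaml
# Sketch only: fragment of spec.optionValues for the alerts Plugin.
# Assumes the Supernova options are exposed under the names listed above.
optionValues:
  - name: theme
    value: theme-light
  - name: predefinedFilters
    value:
      - name: prod
        displayName: Productive System
        matchers:
          region: "^prod-.*"
  - name: silenceTemplates
    value:
      - description: Description of the silence template
        editable_labels:
          - region
        fixed_labels:
          name: Marvin
        status: active
        title: Silence
```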
Managing Alertmanager configuration
ref:
- https://prometheus.io/docs/alerting/configuration/#configuration-file
- https://prometheus.io/webtools/alerting/routing-tree-editor/
By default, the Alertmanager instances will start with a minimal configuration which isn’t really useful since it doesn’t send any notification when receiving alerts.
You have multiple options to provide the Alertmanager configuration:
- You can use alerts.alertmanager.config to define an Alertmanager configuration. Example below.
config:
  global:
    resolve_timeout: 5m
  inhibit_rules:
    - source_matchers:
        - "severity = critical"
      target_matchers:
        - "severity =~ warning|info"
      equal:
        - "namespace"
        - "alertname"
    - source_matchers:
        - "severity = warning"
      target_matchers:
        - "severity = info"
      equal:
        - "namespace"
        - "alertname"
    - source_matchers:
        - "alertname = InfoInhibitor"
      target_matchers:
        - "severity = info"
      equal:
        - "namespace"
  route:
    group_by: ["namespace"]
    group_wait: 30s
    group_interval: 5m
    repeat_interval: 12h
    receiver: "null"
    routes:
      - receiver: "null"
        matchers:
          - alertname =~ "InfoInhibitor|Watchdog"
  receivers:
    - name: "null"
  templates:
    - "/etc/alertmanager/config/*.tmpl"
- You can discover AlertmanagerConfig objects. The spec.alertmanagerConfigSelector is always set to matchLabels: plugin: <name> to tell the operator which AlertmanagerConfig objects should be selected and merged with the main Alertmanager configuration. Note: The default strategy for an AlertmanagerConfig object to match alerts is OnNamespace.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: config-example
  labels:
    alertmanagerConfig: example
    pluginDefinition: alerts-example
spec:
  route:
    groupBy: ["job"]
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: "webhook"
  receivers:
    - name: "webhook"
      webhookConfigs:
        - url: "http://example.com/"
- You can use alerts.alertmanager.alertmanagerSpec.alertmanagerConfiguration to reference an AlertmanagerConfig object in the same namespace which defines the main Alertmanager configuration.
# Example selecting a global AlertmanagerConfig
alertmanagerConfiguration:
  name: global-alertmanager-configuration
TLS Certificate Requirement
Greenhouse-onboarded Prometheus installations need to communicate with the Alertmanager component to enable processing of alerts. If an Alertmanager Ingress is enabled, this requires a TLS certificate to be configured and trusted by Alertmanager to secure the communication. To enable automatic self-signed TLS certificate provisioning via cert-manager, set the alerts.certManager.enabled value to true.
Note: This feature requires an installed jetstack/cert-manager, which can be provided via the Greenhouse cert-manager Plugin.
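As a rough illustration of the setting described above, the certificate options from the options table could be set on the alerts Plugin as follows. This is a sketch that only restates the documented defaults; it assumes cert-manager is already running in the cluster.

```yaml
# Sketch only: fragment of spec.optionValues for the alerts Plugin.
# Durations repeat the documented defaults.
optionValues:
  - name: alerts.certManager.enabled
    value: true
  - name: alerts.certManager.rootCert.duration
    value: 5y
  - name: alerts.certManager.admissionCert.duration
    value: 1y
```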
Examples
Deploy alerts with Alertmanager
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: alerts
spec:
  pluginDefinition: alerts
  disabled: false
  displayName: Alerts
  optionValues:
    - name: alerts.alertmanager.enabled
      value: true
    - name: alerts.alertmanager.ingress.enabled
      value: true
    - name: alerts.alertmanager.ingress.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanager.ingress.tls
      value:
        - hosts:
            - alertmanager.dns.example.com
          secretName: tls-alertmanager-dns-example-com
    - name: alerts.alertmanagerConfig.slack.routes
      value:
        - channel: slack-warning-channel
          webhookURL: https://hooks.slack.com/services/some-id
          matchers:
            - name: severity
              matchType: "="
              value: "warning"
        - channel: slack-critical-channel
          webhookURL: https://hooks.slack.com/services/some-id
          matchers:
            - name: severity
              matchType: "="
              value: "critical"
    - name: alerts.alertmanagerConfig.webhook.routes
      value:
        - name: webhook-route
          url: https://some-webhook-url
          matchers:
            - name: alertname
              matchType: "=~"
              value: ".*"
    - name: alerts.alertmanager.serviceMonitor.additionalLabels
      value:
        plugin: kube-monitoring
    - name: alerts.defaultRules.create
      value: true
    - name: alerts.defaultRules.labels
      value:
        plugin: kube-monitoring
    - name: endpoint
      value: https://alertmanager.dns.example.com/api/v2
    - name: filterLabels
      value:
        - job
        - severity
        - status
    - name: silenceExcludedLabels
      value:
        - pod
        - pod_name
        - instance
Deploy alerts without Alertmanager (Bring your own Alertmanager - Supernova UI only)
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: alerts
spec:
  pluginDefinition: alerts
  disabled: false
  displayName: Alerts
  optionValues:
    - name: alerts.alertmanager.enabled
      value: false
    - name: alerts.alertmanager.ingress.enabled
      value: false
    - name: alerts.defaultRules.create
      value: false
    - name: endpoint
      value: https://alertmanager.dns.example.com/api/v2
    - name: filterLabels
      value:
        - job
        - severity
        - status
    - name: silenceExcludedLabels
      value:
        - pod
        - pod_name
        - instance
2.2 - Cert-manager
This Plugin provides the cert-manager to automate the management of TLS certificates.
Configuration
This section highlights configuration of selected Plugin features.
All available configuration options are described in the plugin.yaml.
Ingress shim
An Ingress resource in Kubernetes configures external access to services in a Kubernetes cluster.
Securing ingress resources with TLS certificates is a common use-case and the cert-manager can be configured to handle these via the ingress-shim component.
It can be enabled by deploying an issuer in your organization and setting the following options on this plugin.
Option | Type | Description |
---|---|---|
cert-manager.ingressShim.defaultIssuerName | string | Name of the cert-manager issuer to use for TLS certificates |
cert-manager.ingressShim.defaultIssuerKind | string | Kind of the cert-manager issuer to use for TLS certificates |
cert-manager.ingressShim.defaultIssuerGroup | string | Group of the cert-manager issuer to use for TLS certificates |
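A Plugin that sets the three ingress-shim options might look like the sketch below. The PluginDefinition name cert-manager and the issuer name example-issuer are assumptions for illustration. ClusterIssuer and cert-manager.io are the kind and API group of cert-manager's built-in cluster-scoped issuer; an external issuer would use its own kind and group.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: cert-manager
spec:
  pluginDefinition: cert-manager   # assumed PluginDefinition name
  disabled: false
  optionValues:
    - name: cert-manager.ingressShim.defaultIssuerName
      value: example-issuer        # hypothetical issuer deployed in your organization
    - name: cert-manager.ingressShim.defaultIssuerKind
      value: ClusterIssuer
    - name: cert-manager.ingressShim.defaultIssuerGroup
      value: cert-manager.io
```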
2.3 - Decentralized Observer of Policies (Violations)
This directory contains the Greenhouse plugin for the Decentralized Observer of Policies (DOOP).
DOOP
To perform automatic validations on Kubernetes objects, we run a deployment of OPA Gatekeeper in each cluster. This dashboard aggregates all policy violations reported by those Gatekeeper instances.
2.4 - Designate Ingress CNAME operator (DISCO)
This Plugin provides the Designate Ingress CNAME operator (DISCO) to automate management of DNS entries in OpenStack Designate for Ingress and Services in Kubernetes.
2.5 - DigiCert issuer
This Plugin provides the digicert-issuer, an external Issuer extending the cert-manager with the DigiCert cert-central API.
2.6 - External DNS
This Plugin provides the external DNS operator, which synchronizes exposed Kubernetes Services and Ingresses with DNS providers.
2.7 - Github Guard
Github Guard Greenhouse Plugin manages Github teams, team memberships and repository & team assignments.
Hierarchy of Custom Resources
Custom Resources
Github – an installation of Github App
apiVersion: githubguard.sap/v1
kind: Github
metadata:
  name: com
spec:
  webURL: https://github.com
  v3APIURL: https://api.github.com
  integrationID: 420328
  clientUserAgent: greenhouse-github-guard
  secret: github-com-secret
GithubOrganization with Feature & Action Flags
apiVersion: githubguard.sap/v1
kind: GithubOrganization
metadata:
  name: com--greenhouse-sandbox
  labels:
    githubguard.sap/addTeam: "true"
    githubguard.sap/removeTeam: "true"
    githubguard.sap/addOrganizationOwner: "true"
    githubguard.sap/removeOrganizationOwner: "true"
    githubguard.sap/addRepositoryTeam: "true"
    githubguard.sap/removeRepositoryTeam: "true"
    githubguard.sap/dryRun: "false"
Default team & repository assignments:
GithubTeamRepository for exception team & repository assignments
GithubUsername for external username matching
apiVersion: githubguard.sap/v1
kind: GithubUsername
metadata:
  annotations:
    # Annotation values must be strings, hence the quotes.
    last-check-timestamp: "1681614602"
  name: com-I313226
spec:
  userID: greenhouse_onuryilmaz
  githubUsername: onuryilmaz
  github: com
2.8 - Ingress NGINX
This plugin contains the ingress NGINX controller.
Example
To instantiate the plugin, create a Plugin like:
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: ingress-nginx
spec:
  pluginDefinition: ingress-nginx-v4.4.0
  optionValues:
    - name: controller.service.loadBalancerIP
      value: 1.2.3.4
2.9 - Kubernetes Monitoring
Learn more about the kube-monitoring plugin. Use it to activate Kubernetes monitoring for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
Observability is often required for operation and automation of service offerings. To get the insights provided by an application and the container runtime environment, you need telemetry data in the form of metrics or logs sent to backends such as Prometheus or OpenSearch. With the kube-monitoring Plugin, you will be able to cover the metrics part of the observability stack.
This Plugin includes a pre-configured package of components that help make getting started easy and efficient. At its core, an automated and managed Prometheus installation is provided using the prometheus-operator. This is complemented by Prometheus target configuration for the most common Kubernetes components providing metrics by default. In addition, Cloud operators curated Prometheus alerting rules and Plutono dashboards are included to provide a comprehensive monitoring solution out of the box.
Components included in this Plugin:
- Prometheus
- Prometheus Operator
- Prometheus target configuration for Kubernetes metrics APIs (e.g. kubelet, apiserver, coredns, etcd)
- Prometheus node exporter
- kube-state-metrics
- kubernetes-operations
Disclaimer
It is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the plugin according to your needs.
The Plugin is a deeply configured kube-prometheus-stack Helm chart which helps to keep track of versions and community updates.
It is intended as a platform that can be extended by following the guide.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to use kube-monitoring as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
Step 1:
You can install the kube-monitoring package in your cluster manually with Helm or let the Greenhouse platform manage its lifecycle for you automatically. For the latter, you can either:
- Go to the Greenhouse dashboard and select the Kubernetes Monitoring plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
Step 2:
After installation, Greenhouse will provide a generated link to the Prometheus user interface. This is done via the annotation greenhouse.sap/expose: "true" at the Prometheus Service resource.
Step 3:
Greenhouse regularly performs integration tests that are bundled with kube-monitoring. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
Values
Alertmanager options
Key | Type | Default | Description |
---|---|---|---|
alerts.alertmanagers.hosts | list | [] | List of Alertmanager hosts to send alerts to |
alerts.alertmanagers.tlsConfig.cert | string | "" | TLS certificate for communication with Alertmanager |
alerts.alertmanagers.tlsConfig.key | string | "" | TLS key for communication with Alertmanager |
alerts.enabled | bool | false | To send alerts to Alertmanager |
Global options
Key | Type | Default | Description |
---|---|---|---|
global.commonLabels | object | {} | Labels to apply to all resources This can be used to add a support_group or service label to all resources and alerting rules. |
Kubernetes component scraper options
Key | Type | Default | Description |
---|---|---|---|
kubeMonitoring.coreDns.enabled | bool | true | Component scraping coreDns. Use either this or kubeDns |
kubeMonitoring.kubeApiServer.enabled | bool | true | Component scraping the kube API server |
kubeMonitoring.kubeControllerManager.enabled | bool | false | Component scraping the kube controller manager |
kubeMonitoring.kubeDns.enabled | bool | false | Component scraping kubeDns. Use either this or coreDns |
kubeMonitoring.kubeEtcd.enabled | bool | true | Component scraping etcd |
kubeMonitoring.kubeProxy.enabled | bool | false | Component scraping kube proxy |
kubeMonitoring.kubeScheduler.enabled | bool | false | Component scraping kube scheduler |
kubeMonitoring.kubeStateMetrics.enabled | bool | true | Component scraping kube state metrics |
kubeMonitoring.kubelet.enabled | bool | true | Component scraping the kubelet and kubelet-hosted cAdvisor |
kubeMonitoring.kubernetesServiceMonitors.enabled | bool | true | Flag to disable all the Kubernetes component scrapers |
kubeMonitoring.nodeExporter.enabled | bool | true | Deploy node exporter as a daemonset to all nodes |
Prometheus options
Key | Type | Default | Description |
---|---|---|---|
kubeMonitoring.prometheus.annotations | object | {} | Annotations for Prometheus |
kubeMonitoring.prometheus.enabled | bool | true | Deploy a Prometheus instance |
kubeMonitoring.prometheus.ingress.enabled | bool | false | Deploy Prometheus Ingress |
kubeMonitoring.prometheus.ingress.hosts | list | [] | Must be provided if Ingress is enabled |
kubeMonitoring.prometheus.ingress.ingressClassname | string | "nginx" | Specifies the ingress-controller |
kubeMonitoring.prometheus.prometheusSpec.additionalArgs | list | [] | Allows setting additional arguments for the Prometheus container |
kubeMonitoring.prometheus.prometheusSpec.additionalScrapeConfigs | string | "" | Next to ScrapeConfig CRD, you can use AdditionalScrapeConfigs, which allows specifying additional Prometheus scrape configurations |
kubeMonitoring.prometheus.prometheusSpec.evaluationInterval | string | "" | Interval between consecutive evaluations |
kubeMonitoring.prometheus.prometheusSpec.externalLabels | object | {} | External labels to add to any time series or alerts when communicating with external systems like Alertmanager |
kubeMonitoring.prometheus.prometheusSpec.logLevel | string | "" | Log level to be configured for Prometheus |
kubeMonitoring.prometheus.prometheusSpec.podMonitorSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | PodMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } |
kubeMonitoring.prometheus.prometheusSpec.probeSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | Probes to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } |
kubeMonitoring.prometheus.prometheusSpec.retention | string | "" | How long to retain metrics |
kubeMonitoring.prometheus.prometheusSpec.ruleSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | PrometheusRules to be selected for target discovery. If {}, select all PrometheusRules. Defaults to { matchLabels: { plugin: <metadata.name> } } |
kubeMonitoring.prometheus.prometheusSpec.scrapeConfigSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | scrapeConfigs to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } |
kubeMonitoring.prometheus.prometheusSpec.scrapeInterval | string | "" | Interval between consecutive scrapes. Defaults to 30s |
kubeMonitoring.prometheus.prometheusSpec.scrapeTimeout | string | "" | Number of seconds to wait for target to respond before erroring |
kubeMonitoring.prometheus.prometheusSpec.serviceMonitorSelector | object | {"matchLabels":{"plugin":"{{ $.Release.Name }}"}} | ServiceMonitors to be selected for target discovery. If {}, select all ServiceMonitors. Defaults to { matchLabels: { plugin: <metadata.name> } } |
kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources | object | {"requests":{"storage":"50Gi"}} | How large the persistent volume should be to house the Prometheus database. Default 50Gi. |
kubeMonitoring.prometheus.tlsConfig.caCert | string | "Secret" | CA certificate to verify technical clients at Prometheus Ingress |
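To illustrate how the Prometheus options above combine, the following sketch enables the Prometheus Ingress and sets a few prometheusSpec values. The hostname is a placeholder, and the fragment is meant to be merged into the optionValues of a kube-monitoring Plugin rather than used on its own.

```yaml
# Sketch only: fragment of spec.optionValues for the kube-monitoring Plugin.
optionValues:
  - name: kubeMonitoring.prometheus.ingress.enabled
    value: true
  - name: kubeMonitoring.prometheus.ingress.hosts
    value:
      - prometheus.dns.example.com   # placeholder hostname
  - name: kubeMonitoring.prometheus.ingress.ingressClassname
    value: nginx
  - name: kubeMonitoring.prometheus.prometheusSpec.scrapeInterval
    value: 30s
  - name: kubeMonitoring.prometheus.prometheusSpec.logLevel
    value: info
```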
Prometheus-operator options
Key | Type | Default | Description |
---|---|---|---|
kubeMonitoring.prometheusOperator.alertmanagerConfigNamespaces | list | [] | Filter namespaces to look for prometheus-operator AlertmanagerConfig resources |
kubeMonitoring.prometheusOperator.alertmanagerInstanceNamespaces | list | [] | Filter namespaces to look for prometheus-operator Alertmanager resources |
kubeMonitoring.prometheusOperator.enabled | bool | true | Manages Prometheus and Alertmanager components |
kubeMonitoring.prometheusOperator.prometheusInstanceNamespaces | list | [] | Filter namespaces to look for prometheus-operator Prometheus resources |
Service Discovery
The kube-monitoring Plugin provides a PodMonitor to automatically discover the Prometheus metrics of the Kubernetes Pods in any Namespace. The PodMonitor is configured to detect the metrics endpoint of the Pods if the following annotations are set:
metadata:
  annotations:
    greenhouse/scrape: "true"
    greenhouse/target: <kube-monitoring plugin name>
Note: The annotations need to be added manually to have the Pod scraped, and the port name needs to match.
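A Pod prepared for this discovery mechanism could look roughly like the sketch below. It assumes the PodMonitor looks for a container port named metrics, as the wording above suggests, and that the Plugin is named kube-monitoring. The image, container name and port number are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    greenhouse/scrape: "true"
    greenhouse/target: kube-monitoring   # name of the kube-monitoring Plugin (assumed)
spec:
  containers:
    - name: example-app
      image: example.org/example-app:latest   # hypothetical image
      ports:
        - name: metrics        # assumed port name expected by the PodMonitor
          containerPort: 8080  # hypothetical port
```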
Examples
Deploy kube-monitoring into a remote cluster
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: kube-monitoring
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth
Deploy Prometheus only
Example Plugin to deploy Prometheus with the kube-monitoring Plugin.
NOTE: If you are using kube-monitoring for the first time in your cluster, it is necessary to set kubeMonitoring.prometheusOperator.enabled to true.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: example-prometheus-name
spec:
  pluginDefinition: kube-monitoring
  disabled: false
  optionValues:
    - name: kubeMonitoring.defaultRules.create
      value: false
    - name: kubeMonitoring.kubernetesServiceMonitors.enabled
      value: false
    - name: kubeMonitoring.prometheusOperator.enabled
      value: false
    - name: kubeMonitoring.kubeStateMetrics.enabled
      value: false
    - name: kubeMonitoring.nodeExporter.enabled
      value: false
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-<org-name>-prometheus-auth
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-<org-name>-prometheus-auth
Extension of the plugin
kube-monitoring can be extended with your own Prometheus alerting rules and target configurations via the Custom Resource Definitions (CRDs) of the Prometheus operator. The user-defined resources to be incorporated with the desired configuration are defined via label selections.
The CRD PrometheusRule enables the definition of alerting and recording rules that can be used by Prometheus or Thanos Rule instances. Alerts and recording rules are reconciled and dynamically loaded by the operator without having to restart Prometheus or Thanos Rule.
kube-monitoring Prometheus will automatically discover and load the rules that match labels plugin: <plugin-name>.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule
  labels:
    plugin: <metadata.name>
    ## e.g plugin: kube-monitoring
spec:
  groups:
    - name: example-group
      rules:
      ...
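For orientation, a complete rule could be sketched as follows. The alert name, job label, threshold duration and severity are hypothetical and only show the shape of a rule entry. The plugin label assumes the Plugin is named kube-monitoring.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-target-down        # hypothetical name
  labels:
    plugin: kube-monitoring        # must match the Plugin's metadata.name
spec:
  groups:
    - name: example-group
      rules:
        - alert: ExampleTargetDown             # hypothetical alert
          expr: up{job="example-app"} == 0     # fires when the scrape target is unreachable
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Target {{ $labels.instance }} of job {{ $labels.job }} is down."
```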
The CRDs PodMonitor, ServiceMonitor, Probe and ScrapeConfig allow the definition of a set of target endpoints to be scraped by Prometheus. The operator will automatically discover and load the configurations that match labels plugin: <plugin-name>.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor
  labels:
    plugin: <metadata.name>
    ## e.g plugin: kube-monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  podMetricsEndpoints:
    - port: http
  ...
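The same labelling convention applies to the other target CRDs. As a sketch, a ServiceMonitor for a Service that exposes a port named http might look like this. All names, the metrics path and the interval are illustrative, and the plugin label assumes the Plugin is named kube-monitoring.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-service-monitor
  labels:
    plugin: kube-monitoring   # must match the Plugin's metadata.name
spec:
  selector:
    matchLabels:
      app: example-app        # labels on the Service, not on the Pods
  namespaceSelector:
    matchNames:
      - example-namespace
  endpoints:
    - port: http              # name of the Service port
      path: /metrics
      interval: 30s
```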
2.10 - Logs Plugin
Learn more about the Logs Plugin. Use it to enable the ingestion, collection and export of telemetry signals (logs and metrics) for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
OpenTelemetry is an observability framework and toolkit for creating and managing telemetry data such as metrics, logs and traces. Unlike other observability tools, OpenTelemetry is vendor and tool agnostic, meaning it can be used with a variety of observability backends, including open source tools such as OpenSearch and Prometheus.
The focus of the Plugin is to provide easy-to-use configurations for common use cases of receiving, processing and exporting telemetry data in Kubernetes. The storage and visualization of the same is intentionally left to other tools.
Components included in this Plugin:
Architecture
Note
The intention is to add more configuration over time, and contributions of your own configurations are highly appreciated. If you discover bugs or want to add functionality to the Plugin, feel free to create a pull request.
Quick Start
This guide provides a quick and straightforward way to use OpenTelemetry for Logs as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- For logs, an OpenSearch instance to store them. If you don’t have one, reach out to your observability team to get access to one.
- We recommend a running cert-manager in the cluster before installing the Logs Plugin.
- To gather metrics, you must have a Prometheus instance in the onboarded cluster for storage and for managing Prometheus-specific CRDs. If you do not have an instance, install the kube-monitoring Plugin first.
Step 1:
You can install the Logs package in your cluster manually with Helm or let the Greenhouse platform manage its lifecycle for you automatically. For the latter, you can either:
- Go to the Greenhouse dashboard and select the Logs Plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
Step 2:
The package will deploy the OpenTelemetry Operator which works as a manager for the collectors and auto-instrumentation of the workload. By default, the package will include a configuration for collecting metrics and logs. The log-collector is currently processing data from the preconfigured receivers:
- Files via the Filelog Receiver
- Kubernetes Events from the Kubernetes API server
- Journald events from systemd journal
- its own metrics
You can disable the collection of logs by setting openTelemetry.logsCollector.enabled to false. The same is true for disabling the collection of metrics by setting openTelemetry.metricsCollector.enabled to false.
The logsCollector comes with a standard set of log-processing steps, such as adding cluster information and common labels for Journald events.
In addition, we provide default pipelines for common log types. Currently the following log types have default configurations that can be enabled (requires openTelemetry.logsCollector.enabled to be set to true):
- KVM: openTelemetry.logsCollector.kvmConfig: Logs from Kernel-based Virtual Machines (KVMs) providing insights into virtualization activities, resource usage, and system performance
- Ceph: openTelemetry.logsCollector.cephConfig: Logs from Ceph storage systems, capturing information about cluster operations, performance metrics, and health status
These default configurations provide common labels and Grok parsing for logs emitted through the respective services.
Based on the backend selection, the telemetry data will be exported to the backend.
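The default pipelines described above are switched on through their enabled flags. The fragment below is a sketch that derives the keys from the defaults in the Values table (kvmConfig and cephConfig are objects with an enabled field).

```yaml
# Sketch only: fragment of spec.optionValues for the Logs Plugin.
optionValues:
  - name: openTelemetry.logsCollector.enabled
    value: true
  - name: openTelemetry.logsCollector.kvmConfig.enabled
    value: true
  - name: openTelemetry.logsCollector.cephConfig.enabled
    value: true
```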
Step 3:
Greenhouse regularly performs integration tests that are bundled with the Logs Plugin. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the Plugin status and also in the Greenhouse dashboard.
Failover Connector
The Logs Plugin comes with a Failover Connector for OpenSearch for two users. The connector periodically tries to establish a stable connection for the preferred user (failover_username_a). If that attempt fails, it tries to establish a connection with the fallback user (failover_username_b). This feature can be used to secure the shipping of logs in case of expiring credentials or password rotation.
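A Plugin that configures both users could be sketched as follows. The option names are taken from the Values table. The PluginDefinition name logs, the Secret name opensearch-credentials with its keys, and all endpoint, index and label values are assumptions for illustration. The valueFrom.secret form mirrors the kube-monitoring examples.

```yaml
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: logs
spec:
  pluginDefinition: logs           # assumed PluginDefinition name
  disabled: false
  optionValues:
    - name: openTelemetry.cluster
      value: example-cluster
    - name: openTelemetry.region
      value: example-region
    - name: openTelemetry.openSearchLogs.endpoint
      value: https://opensearch.example.com:9200   # placeholder endpoint
    - name: openTelemetry.openSearchLogs.index
      value: example-logs
    - name: openTelemetry.logsCollector.failover.enabled
      value: true
    - name: openTelemetry.openSearchLogs.failover_username_a
      valueFrom:
        secret:
          name: opensearch-credentials   # hypothetical Secret
          key: username_a
    - name: openTelemetry.openSearchLogs.failover_password_a
      valueFrom:
        secret:
          name: opensearch-credentials
          key: password_a
    - name: openTelemetry.openSearchLogs.failover_username_b
      valueFrom:
        secret:
          name: opensearch-credentials
          key: username_b
    - name: openTelemetry.openSearchLogs.failover_password_b
      valueFrom:
        secret:
          name: opensearch-credentials
          key: password_b
```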
Values
Key | Type | Default | Description |
---|---|---|---|
commonLabels | object | {} | common labels to apply to all resources. |
logs-operator.admissionWebhooks.autoGenerateCert | object | {"recreate":false} | Activate to use Helm to create self-signed certificates. |
logs-operator.admissionWebhooks.autoGenerateCert.recreate | bool | false | Activate to recreate the cert after a defined period (certPeriodDays default is 365). |
logs-operator.admissionWebhooks.certManager | object | {"enabled":false} | Activate to use the CertManager for generating self-signed certificates. |
logs-operator.admissionWebhooks.failurePolicy | string | "Ignore" | Defines if the admission webhooks should Ignore errors or Fail on errors when communicating with the API server. |
logs-operator.crds.create | bool | false | The required CRDs used by this dependency are version-controlled in this repository under ./crds. If you want to use the upstream CRDs, set this variable to true. |
logs-operator.kubeRBACProxy | object | {"enabled":false} | the kubeRBACProxy can be enabled to allow the operator perform RBAC authorization against the Kubernetes API. |
logs-operator.manager.collectorImage.repository | string | "ghcr.io/cloudoperators/opentelemetry-collector-contrib" | overrides the default image repository for the OpenTelemetry Collector image. |
logs-operator.manager.collectorImage.tag | string | "b05fdf1" | overrides the default image tag for the OpenTelemetry Collector image. |
logs-operator.manager.image.repository | string | "ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator" | overrides the default image repository for the OpenTelemetry Operator image. |
logs-operator.manager.image.tag | string | "v0.120.0" | overrides the default image tag for the OpenTelemetry Operator image. |
logs-operator.manager.serviceMonitor | object | {"enabled":true} | Enable serviceMonitor for Prometheus metrics scrape |
logs-operator.nameOverride | string | "operator" | Provide a name in place of the default name opentelemetry-operator. |
openTelemetry.cluster | string | nil | Cluster label for Logging |
openTelemetry.customLabels | object | {} | custom Labels applied to servicemonitor, secrets and collectors |
openTelemetry.logsCollector.cephConfig | object | {"enabled":false} | Activates the configuration for Ceph logs (requires logsCollector to be enabled). |
openTelemetry.logsCollector.enabled | bool | true | Activates the standard configuration for Logs. |
openTelemetry.logsCollector.failover | object | {"enabled":true} | Activates the failover mechanism for shipping logs using the failover_username_b and failover_password_b credentials in case the credentials failover_username_a and failover_password_a have expired. |
openTelemetry.logsCollector.kvmConfig | object | {"enabled":false} | Activates the configuration for KVM logs (requires logsCollector to be enabled). |
openTelemetry.metricsCollector | object | {"enabled":false} | Activates the standard configuration for metrics. |
openTelemetry.openSearchLogs.endpoint | string | nil | Endpoint URL for OpenSearch |
openTelemetry.openSearchLogs.failover_password_a | string | nil | Password for OpenSearch endpoint |
openTelemetry.openSearchLogs.failover_password_b | string | nil | Second Password (as a failover) for OpenSearch endpoint |
openTelemetry.openSearchLogs.failover_username_a | string | nil | Username for OpenSearch endpoint |
openTelemetry.openSearchLogs.failover_username_b | string | nil | Second Username (as a failover) for OpenSearch endpoint |
openTelemetry.openSearchLogs.index | string | nil | Name for OpenSearch index |
openTelemetry.prometheus.additionalLabels | object | {} | Label selectors for the Prometheus resources to be picked up by prometheus-operator. |
openTelemetry.prometheus.podMonitor | object | {"enabled":true} | Activates the pod-monitoring for the Logs Collector. |
openTelemetry.prometheus.rules | object | {"additionalRuleLabels":null,"annotations":{},"create":true,"disabled":[],"labels":{}} | Default rules for monitoring the opentelemetry components. |
openTelemetry.prometheus.rules.additionalRuleLabels | string | nil | Additional labels for PrometheusRule alerts. |
openTelemetry.prometheus.rules.annotations | object | {} | Annotations for PrometheusRules. |
openTelemetry.prometheus.rules.create | bool | true | Enables PrometheusRule resources to be created. |
openTelemetry.prometheus.rules.disabled | list | [] | PrometheusRules to disable. |
openTelemetry.prometheus.rules.labels | object | {} | Labels for PrometheusRules. |
openTelemetry.prometheus.serviceMonitor | object | {"enabled":true} | Activates the service-monitoring for the Logs Collector. |
openTelemetry.region | string | nil | Region label for Logging |
testFramework.enabled | bool | true | Activates the Helm chart testing framework. |
testFramework.image.registry | string | "ghcr.io" | Defines the image registry for the test framework. |
testFramework.image.repository | string | "cloudoperators/greenhouse-extensions-integration-test" | Defines the image repository for the test framework. |
testFramework.image.tag | string | "main" | Defines the image tag for the test framework. |
testFramework.imagePullPolicy | string | "IfNotPresent" | Defines the image pull policy for the test framework. |
Examples
TBD
2.11 - Logshipper
This Plugin is intended for shipping container and systemd logs to an Elasticsearch/OpenSearch cluster. It uses fluentbit to collect logs. The default configuration can be found under chart/templates/fluent-bit-configmap.yaml.
Components included in this Plugin:
Owner
- @ivogoman
Parameters
Name | Description | Value |
---|---|---|
fluent-bit.parser | Parser used for container logs. One of docker or cri | "cri" |
fluent-bit.backend.opensearch.host | Host for the Elastic/OpenSearch HTTP Input | |
fluent-bit.backend.opensearch.port | Port for the Elastic/OpenSearch HTTP Input | |
fluent-bit.backend.opensearch.http_user | Username for the Elastic/OpenSearch HTTP Input | |
fluent-bit.backend.opensearch.http_password | Password for the Elastic/OpenSearch HTTP Input | |
fluent-bit.filter.additionalValues | List of key-value pairs used to label logs | [] |
fluent-bit.customConfig.inputs | Multi-line string containing additional inputs | |
fluent-bit.customConfig.filters | Multi-line string containing additional filters | |
fluent-bit.customConfig.outputs | Multi-line string containing additional outputs | |
Custom Configuration
To add custom configuration to the fluent-bit configuration please check the fluentbit documentation here.
The fluent-bit.customConfig.inputs, fluent-bit.customConfig.filters and fluent-bit.customConfig.outputs parameters can be used to add custom configuration to the default configuration. The configuration should be added as a multi-line string.
Inputs are rendered after the default inputs, filters are rendered after the default filters and before the additional values are added. Outputs are rendered after the default outputs.
The additional values are added to all logs regardless of the source.
Example Input configuration:
fluent-bit:
  config:
    inputs: |
      [INPUT]
          Name tail-audit
          Path /var/log/containers/greenhouse-controller*.log
          Parser {{ default "cri" ( index .Values "fluent-bit" "parser" ) }}
          Tag audit.*
          Refresh_Interval 5
          Mem_Buf_Limit 50MB
          Skip_Long_Lines Off
          Ignore_Older 1m
          DB /var/log/fluent-bit-tail-audit.pos.db
Logs collected by the default configuration are prefixed with default_. If logs from additional inputs should be sent to and processed by the same filters and outputs, they should use the same prefix.
If additional secrets are required, the fluent-bit.env field can be used to add them to the environment of the fluent-bit container. The secrets should be created by adding them to the fluent-bit.backend field.
fluent-bit:
  backend:
    audit:
      http_user: top-secret-audit
      http_password: top-secret-audit
      host: "audit.test"
      tls:
        enabled: true
        verify: true
        debug: false
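As a rough sketch of how such a secret could then be exposed to the container, the fragment below adds two environment variables through fluent-bit.env. It assumes that fluent-bit.env accepts standard Kubernetes env entries, as the upstream fluent-bit Helm chart does. The Secret name, its keys and the variable names are hypothetical placeholders, because the name of the Secret rendered from fluent-bit.backend is not documented here.

```yaml
fluent-bit:
  env:
    # Hypothetical variable names; pick names that your custom outputs reference.
    - name: AUDIT_HTTP_USER
      valueFrom:
        secretKeyRef:
          name: my-logshipper-secret   # hypothetical Secret name
          key: audit_http_user         # hypothetical key
    - name: AUDIT_HTTP_PASSWORD
      valueFrom:
        secretKeyRef:
          name: my-logshipper-secret   # hypothetical Secret name
          key: audit_http_password     # hypothetical key
```

A custom output could then read the credentials with fluent-bit's ${AUDIT_HTTP_USER} variable syntax instead of embedding them in the configuration string.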
2.12 - OpenSearch
OpenSearch Plugin
The OpenSearch plugin sets up an OpenSearch environment using the OpenSearch Operator, automating deployment, provisioning, management, and orchestration of OpenSearch clusters and dashboards. It functions as the backend for logs gathered by collectors such as OpenTelemetry collectors, enabling storage and visualization of logs for Greenhouse-onboarded Kubernetes clusters.
The main terms used in this document can be found in core-concepts.
Overview
OpenSearch is a distributed search and analytics engine designed for real-time log and event data analysis. The OpenSearch Operator simplifies the management of OpenSearch clusters by providing declarative APIs for configuration and scaling.
Components included in this Plugin:
- OpenSearch Operator
- OpenSearch Cluster Management
- OpenSearch Dashboards Deployment
- OpenSearch Index Management
- OpenSearch Security Configuration
Architecture
The OpenSearch Operator automates the management of OpenSearch clusters within a Kubernetes environment. The architecture consists of:
- OpenSearchCluster CRD: Defines the structure and configuration of OpenSearch clusters, including node roles, scaling policies, and version management.
- OpenSearchDashboards CRD: Manages OpenSearch Dashboards deployments, ensuring high availability and automatic upgrades.
- OpenSearchISMPolicy CRD: Implements index lifecycle management, defining policies for retention, rollover, and deletion.
- OpenSearchIndexTemplate CRD: Enables the definition of index mappings, settings, and template structures.
- Security Configuration via OpenSearchRole and OpenSearchUser: Manages authentication and authorization for OpenSearch users and roles.
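To make the index lifecycle part more concrete, the chart's default cluster.ismPolicies value (listed as JSON in the Values table of this section) can be written as YAML. This is only a re-rendering of that default, not an additional recommendation: indices matching logs* roll over after 7 days, 30 GB or 50 million documents, and are deleted after 30 days.

```yaml
cluster:
  ismPolicies:
    - name: logs-rollover-policy
      description: "Policy to rollover logs after 7d, 30GB or 50M docs and delete after 30d"
      defaultState: hot
      ismTemplate:
        indexPatterns:
          - "logs*"
        priority: 100
      states:
        # Indices start in "hot" and roll over when any condition is met.
        - name: hot
          actions:
            - rollover:
                minIndexAge: "7d"
                minSize: "30gb"
                minDocCount: 50000000
          transitions:
            - stateName: delete
              conditions:
                minIndexAge: "30d"
        # Terminal state: the index is removed.
        - name: delete
          actions:
            - delete: {}
          transitions: []
```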
Note
More configurations will be added over time, and contributions of custom configurations are highly appreciated. If you discover bugs or want to add functionality to the plugin, feel free to create a pull request.
Quick Start
This guide provides a quick and straightforward way to use OpenSearch as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- The OpenSearch Operator installed via Helm or Kubernetes manifests.
- An OpenTelemetry or similar log ingestion pipeline configured to send logs to OpenSearch.
Installation
Install via Greenhouse
- Navigate to the Greenhouse Dashboard.
- Select the OpenSearch plugin from the catalog.
- Specify the target cluster and configuration options.
Values
Key | Type | Default | Description |
---|---|---|---|
cluster.actionGroups | list | [] | List of OpensearchActionGroup. Check values.yaml file for examples. |
cluster.cluster.annotations | object | {} | OpenSearchCluster annotations |
cluster.cluster.bootstrap.additionalConfig | object | {} | bootstrap additional configuration, key-value pairs that will be added to the opensearch.yml configuration |
cluster.cluster.bootstrap.affinity | object | {} | bootstrap pod affinity rules |
cluster.cluster.bootstrap.jvm | string | "" | bootstrap pod jvm options. If jvm is not provided then the java heap size will be set to half of resources.requests.memory which is the recommended value for data nodes. If jvm is not provided and resources.requests.memory does not exist then value will be -Xmx512M -Xms512M |
cluster.cluster.bootstrap.nodeSelector | object | {} | bootstrap pod node selectors |
cluster.cluster.bootstrap.resources | object | {} | bootstrap pod cpu and memory resources |
cluster.cluster.bootstrap.tolerations | list | [] | bootstrap pod tolerations |
cluster.cluster.confMgmt.smartScaler | bool | true | Enable nodes to be safely removed from the cluster |
cluster.cluster.dashboards.additionalConfig | object | {} | Additional properties for opensearch_dashboards.yaml |
cluster.cluster.dashboards.affinity | object | {} | dashboards pod affinity rules |
cluster.cluster.dashboards.annotations | object | {} | dashboards annotations |
cluster.cluster.dashboards.basePath | string | "" | dashboards Base Path for Opensearch Clusters running behind a reverse proxy |
cluster.cluster.dashboards.enable | bool | true | Enable dashboards deployment |
cluster.cluster.dashboards.env | list | [] | dashboards pod env variables |
cluster.cluster.dashboards.image | string | "docker.io/opensearchproject/opensearch-dashboards" | dashboards image |
cluster.cluster.dashboards.imagePullPolicy | string | "IfNotPresent" | dashboards image pull policy |
cluster.cluster.dashboards.imagePullSecrets | list | [] | dashboards image pull secrets |
cluster.cluster.dashboards.labels | object | {"greenhouse.sap/expose":"true"} | dashboards labels |
cluster.cluster.dashboards.nodeSelector | object | {} | dashboards pod node selectors |
cluster.cluster.dashboards.opensearchCredentialsSecret | object | {} | Secret that contains fields username and password for dashboards to use to login to opensearch, must only be supplied if a custom securityconfig is provided |
cluster.cluster.dashboards.pluginsList | list | [] | List of dashboards plugins to install |
cluster.cluster.dashboards.podSecurityContext | object | {} | dashboards pod security context configuration |
cluster.cluster.dashboards.replicas | int | 1 | number of dashboards replicas |
cluster.cluster.dashboards.resources | object | {} | dashboards pod cpu and memory resources |
cluster.cluster.dashboards.securityContext | object | {} | dashboards security context configuration |
cluster.cluster.dashboards.service.loadBalancerSourceRanges | list | [] | source ranges for a loadbalancer |
cluster.cluster.dashboards.service.type | string | "ClusterIP" | dashboards service type |
cluster.cluster.dashboards.tls.caSecret | object | {} | Secret that contains the ca certificate as ca.crt. If this and generate=true is set the existing CA cert from that secret is used to generate the node certs. In this case must contain ca.crt and ca.key fields |
cluster.cluster.dashboards.tls.enable | bool | false | Enable HTTPS for dashboards |
cluster.cluster.dashboards.tls.generate | bool | true | generate certificate, if false secret must be provided |
cluster.cluster.dashboards.tls.secret | string | nil | Optional, name of a TLS secret that contains ca.crt, tls.key and tls.crt data. If ca.crt is in a different secret provide it via the caSecret field |
cluster.cluster.dashboards.tolerations | list | [] | dashboards pod tolerations |
cluster.cluster.dashboards.version | string | "2.19.1" | dashboards version |
cluster.cluster.general.additionalConfig | object | {} | Extra items to add to the opensearch.yml |
cluster.cluster.general.additionalVolumes | list | [] | Additional volumes to mount to all pods in the cluster. Supported volume types configMap, emptyDir, secret (with default Kubernetes configuration schema) |
cluster.cluster.general.drainDataNodes | bool | true | Controls whether to drain data nodes on rolling restart operations |
cluster.cluster.general.httpPort | int | 9200 | Opensearch service http port |
cluster.cluster.general.image | string | "docker.io/opensearchproject/opensearch" | Opensearch image |
cluster.cluster.general.imagePullPolicy | string | "IfNotPresent" | Default image pull policy |
cluster.cluster.general.keystore | list | [] | Populate opensearch keystore before startup |
cluster.cluster.general.monitoring.enable | bool | true | Enable cluster monitoring |
cluster.cluster.general.monitoring.monitoringUserSecret | string | "" | Secret with ‘username’ and ‘password’ keys for monitoring user. You could also use OpenSearchUser CRD instead of setting it. |
cluster.cluster.general.monitoring.pluginUrl | string | "https://github.com/Virtimo/prometheus-exporter-plugin-for-opensearch/releases/download/v2.19.1/prometheus-exporter-2.19.1.0.zip" | Custom URL for the monitoring plugin |
cluster.cluster.general.monitoring.scrapeInterval | string | "30s" | How often to scrape metrics |
cluster.cluster.general.monitoring.tlsConfig | object | {"insecureSkipVerify":true} | Override the tlsConfig of the generated ServiceMonitor |
cluster.cluster.general.pluginsList | list | [] | List of Opensearch plugins to install |
cluster.cluster.general.podSecurityContext | object | {} | Opensearch pod security context configuration |
cluster.cluster.general.securityContext | object | {} | Opensearch securityContext |
cluster.cluster.general.serviceAccount | string | "" | Opensearch serviceAccount name. If Service Account doesn’t exist it could be created by setting serviceAccount.create and serviceAccount.name |
cluster.cluster.general.serviceName | string | "" | Opensearch service name |
cluster.cluster.general.setVMMaxMapCount | bool | true | Enable setVMMaxMapCount. OpenSearch requires the Linux kernel vm.max_map_count option to be set to at least 262144 |
cluster.cluster.general.snapshotRepositories | list | [] | Opensearch snapshot repositories configuration |
cluster.cluster.general.vendor | string | "Opensearch" | |
cluster.cluster.general.version | string | "2.19.1" | Opensearch version |
cluster.cluster.ingress.dashboards.annotations | object | {} | dashboards ingress annotations |
cluster.cluster.ingress.dashboards.className | string | "" | Ingress class name |
cluster.cluster.ingress.dashboards.enabled | bool | false | Enable ingress for dashboards service |
cluster.cluster.ingress.dashboards.hosts | list | [] | Ingress hostnames |
cluster.cluster.ingress.dashboards.tls | list | [] | Ingress tls configuration |
cluster.cluster.ingress.opensearch.annotations | object | {} | Opensearch ingress annotations |
cluster.cluster.ingress.opensearch.className | string | "" | Opensearch Ingress class name |
cluster.cluster.ingress.opensearch.enabled | bool | false | Enable ingress for Opensearch service |
cluster.cluster.ingress.opensearch.hosts | list | [] | Opensearch Ingress hostnames |
cluster.cluster.ingress.opensearch.tls | list | [] | Opensearch tls configuration |
cluster.cluster.initHelper.imagePullPolicy | string | "IfNotPresent" | initHelper image pull policy |
cluster.cluster.initHelper.imagePullSecrets | list | [] | initHelper image pull secret |
cluster.cluster.initHelper.resources | object | {} | initHelper pod cpu and memory resources |
cluster.cluster.initHelper.version | string | "1.36" | initHelper version |
cluster.cluster.labels | object | {} | OpenSearchCluster labels |
cluster.cluster.name | string | "" | OpenSearchCluster name, by default release name is used |
cluster.cluster.nodePools | list | [{"component":"main","diskSize":"30Gi","replicas":3,"resources":{"limits":{"cpu":"500m","memory":"1Gi"},"requests":{"cpu":"500m","memory":"1Gi"}},"roles":["cluster_manager"]},{"component":"data","diskSize":"30Gi","replicas":1,"resources":{"limits":{"cpu":"500m","memory":"1Gi"},"requests":{"cpu":"500m","memory":"1Gi"}},"roles":["data"]},{"component":"client","diskSize":"30Gi","replicas":1,"resources":{"limits":{"cpu":"500m","memory":"1Gi"},"requests":{"cpu":"500m","memory":"1Gi"}},"roles":["client"]}] | Opensearch nodes configuration |
cluster.cluster.security.config.adminCredentialsSecret | object | {} | Secret that contains fields username and password to be used by the operator to access the opensearch cluster for node draining. Must be set if custom securityconfig is provided. |
cluster.cluster.security.config.adminSecret | object | {} | TLS Secret that contains a client certificate (tls.key, tls.crt, ca.crt) with admin rights in the opensearch cluster. Must be set if transport certificates are provided by user and not generated |
cluster.cluster.security.config.securityConfigSecret | object | {} | Secret that contains the different yml files of the opensearch-security config (config.yml, internal_users.yml, etc) |
cluster.cluster.security.tls.http.caSecret | object | {} | Optional, secret that contains the ca certificate as ca.crt. If this and generate=true is set the existing CA cert from that secret is used to generate the node certs. In this case must contain ca.crt and ca.key fields |
cluster.cluster.security.tls.http.generate | bool | true | If set to true the operator will generate a CA and certificates for the cluster to use, if false - secrets with existing certificates must be supplied |
cluster.cluster.security.tls.http.secret | object | {} | Optional, name of a TLS secret that contains ca.crt, tls.key and tls.crt data. If ca.crt is in a different secret provide it via the caSecret field |
cluster.cluster.security.tls.transport.adminDn | list | [] | DNs of certificates that should have admin access, mainly used for securityconfig updates via securityadmin.sh, only used when existing certificates are provided |
cluster.cluster.security.tls.transport.caSecret | object | {} | Optional, secret that contains the ca certificate as ca.crt. If this and generate=true is set the existing CA cert from that secret is used to generate the node certs. In this case must contain ca.crt and ca.key fields |
cluster.cluster.security.tls.transport.generate | bool | true | If set to true the operator will generate a CA and certificates for the cluster to use, if false secrets with existing certificates must be supplied |
cluster.cluster.security.tls.transport.nodesDn | list | [] | Allowed Certificate DNs for nodes, only used when existing certificates are provided |
cluster.cluster.security.tls.transport.perNode | bool | true | Separate certificate per node |
cluster.cluster.security.tls.transport.secret | object | {} | Optional, name of a TLS secret that contains ca.crt, tls.key and tls.crt data. If ca.crt is in a different secret provide it via the caSecret field |
cluster.componentTemplates | list | [] | List of OpensearchComponentTemplate. Check values.yaml file for examples. |
cluster.fullnameOverride | string | "" | |
cluster.indexTemplates | list | [] | List of OpensearchIndexTemplate. Check values.yaml file for examples. |
cluster.indexTemplatesWorkAround | list | [{"dataStream":{"timestamp_field":{"name":"@timestamp"}},"indexPatterns":["logs*"],"name":"logs-index-template","priority":100,"templateSpec":{"mappings":{"properties":{"@timestamp":{"type":"date"},"message":{"type":"text"}}},"settings":{"index":{"number_of_replicas":1,"number_of_shards":1,"refresh_interval":"1s"}}}}] | List of OpensearchIndexTemplate. Check values.yaml file for examples. |
cluster.ismPolicies | list | [{"defaultState":"hot","description":"Policy to rollover logs after 7d, 30GB or 50M docs and delete after 30d","ismTemplate":{"indexPatterns":["logs*"],"priority":100},"name":"logs-rollover-policy","states":[{"actions":[{"rollover":{"minDocCount":50000000,"minIndexAge":"7d","minSize":"30gb"}}],"name":"hot","transitions":[{"conditions":{"minIndexAge":"30d"},"stateName":"delete"}]},{"actions":[{"delete":{}}],"name":"delete","transitions":[]}]}] | List of OpenSearchISMPolicy. Check values.yaml file for examples. |
cluster.nameOverride | string | "" | |
cluster.roles | list | [{"clusterPermissions":["cluster_monitor","cluster_composite_ops","cluster:admin/ingest/pipeline/put","cluster:admin/ingest/pipeline/get","indices:admin/template/get","cluster_manage_index_templates"],"indexPermissions":[{"allowedActions":["indices:admin/template/get","indices:admin/template/put","indices:admin/mapping/put","indices:admin/create","indices:data/write/bulk*","indices:data/write/index","indices:data/read*","indices:monitor*","indices_all"],"indexPatterns":["logs*"]}],"name":"logs-role"}] | List of OpensearchRole. Check values.yaml file for examples. |
cluster.serviceAccount.annotations | object | {} | Service Account annotations |
cluster.serviceAccount.create | bool | false | Create Service Account |
cluster.serviceAccount.name | string | "" | Service Account name. Set general.serviceAccount to use this Service Account for the Opensearch cluster |
cluster.tenants | list | [] | List of additional tenants. Check values.yaml file for examples. |
cluster.users | list | [{"backendRoles":[],"name":"logs","opendistroSecurityRoles":["logs-role"],"password":"","secretKey":"password","secretName":"logs-credentials"}] | List of OpensearchUser. Check values.yaml file for examples. |
cluster.usersRoleBinding | list | [{"name":"logs-access","roles":["logs-role"],"users":["logs"]}] | Links any number of users, backend roles and roles with an OpensearchUserRoleBinding. Each user in the binding is granted each role. Check values.yaml file for examples. |
operator.fullnameOverride | string | "opensearch-operator" | |
operator.installCRDs | bool | false | |
operator.kubeRbacProxy.enable | bool | true | |
operator.kubeRbacProxy.image.repository | string | "gcr.io/kubebuilder/kube-rbac-proxy" | |
operator.kubeRbacProxy.image.tag | string | "v0.15.0" | |
operator.kubeRbacProxy.livenessProbe.failureThreshold | int | 3 | |
operator.kubeRbacProxy.livenessProbe.httpGet.path | string | "/healthz" | |
operator.kubeRbacProxy.livenessProbe.httpGet.port | int | 10443 | |
operator.kubeRbacProxy.livenessProbe.httpGet.scheme | string | "HTTPS" | |
operator.kubeRbacProxy.livenessProbe.initialDelaySeconds | int | 10 | |
operator.kubeRbacProxy.livenessProbe.periodSeconds | int | 15 | |
operator.kubeRbacProxy.livenessProbe.successThreshold | int | 1 | |
operator.kubeRbacProxy.livenessProbe.timeoutSeconds | int | 3 | |
operator.kubeRbacProxy.readinessProbe.failureThreshold | int | 3 | |
operator.kubeRbacProxy.readinessProbe.httpGet.path | string | "/healthz" | |
operator.kubeRbacProxy.readinessProbe.httpGet.port | int | 10443 | |
operator.kubeRbacProxy.readinessProbe.httpGet.scheme | string | "HTTPS" | |
operator.kubeRbacProxy.readinessProbe.initialDelaySeconds | int | 10 | |
operator.kubeRbacProxy.readinessProbe.periodSeconds | int | 15 | |
operator.kubeRbacProxy.readinessProbe.successThreshold | int | 1 | |
operator.kubeRbacProxy.readinessProbe.timeoutSeconds | int | 3 | |
operator.kubeRbacProxy.resources.limits.cpu | string | "50m" | |
operator.kubeRbacProxy.resources.limits.memory | string | "50Mi" | |
operator.kubeRbacProxy.resources.requests.cpu | string | "25m" | |
operator.kubeRbacProxy.resources.requests.memory | string | "25Mi" | |
operator.kubeRbacProxy.securityContext.allowPrivilegeEscalation | bool | false | |
operator.kubeRbacProxy.securityContext.capabilities.drop[0] | string | "ALL" | |
operator.kubeRbacProxy.securityContext.readOnlyRootFilesystem | bool | true | |
operator.manager.dnsBase | string | "cluster.local" | |
operator.manager.extraEnv | list | [] | |
operator.manager.image.pullPolicy | string | "Always" | |
operator.manager.image.repository | string | "opensearchproject/opensearch-operator" | |
operator.manager.image.tag | string | "" | |
operator.manager.imagePullSecrets | list | [] | |
operator.manager.livenessProbe.failureThreshold | int | 3 | |
operator.manager.livenessProbe.httpGet.path | string | "/healthz" | |
operator.manager.livenessProbe.httpGet.port | int | 8081 | |
operator.manager.livenessProbe.initialDelaySeconds | int | 10 | |
operator.manager.livenessProbe.periodSeconds | int | 15 | |
operator.manager.livenessProbe.successThreshold | int | 1 | |
operator.manager.livenessProbe.timeoutSeconds | int | 3 | |
operator.manager.loglevel | string | "debug" | |
operator.manager.parallelRecoveryEnabled | bool | true | |
operator.manager.pprofEndpointsEnabled | bool | false | |
operator.manager.readinessProbe.failureThreshold | int | 3 | |
operator.manager.readinessProbe.httpGet.path | string | "/readyz" | |
operator.manager.readinessProbe.httpGet.port | int | 8081 | |
operator.manager.readinessProbe.initialDelaySeconds | int | 10 | |
operator.manager.readinessProbe.periodSeconds | int | 15 | |
operator.manager.readinessProbe.successThreshold | int | 1 | |
operator.manager.readinessProbe.timeoutSeconds | int | 3 | |
operator.manager.resources.limits.cpu | string | "200m" | |
operator.manager.resources.limits.memory | string | "500Mi" | |
operator.manager.resources.requests.cpu | string | "100m" | |
operator.manager.resources.requests.memory | string | "350Mi" | |
operator.manager.securityContext.allowPrivilegeEscalation | bool | false | |
operator.manager.watchNamespace | string | nil | |
operator.nameOverride | string | "" | |
operator.namespace | string | "" | |
operator.nodeSelector | object | {} | |
operator.podAnnotations | object | {} | |
operator.podLabels | object | {} | |
operator.priorityClassName | string | "" | |
operator.securityContext.runAsNonRoot | bool | true | |
operator.serviceAccount.create | bool | true | |
operator.serviceAccount.name | string | "opensearch-operator-controller-manager" | |
operator.tolerations | list | [] | |
operator.useRoleBindings | bool | false | |
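As an illustration of how these values fit together, the following sketch overrides cluster.cluster.nodePools to run three data nodes with more disk and memory. The structure is taken from the default shown in the table; the replica count, disk size and resource figures for the data pool are hypothetical and should be sized for your own log volume.

```yaml
cluster:
  cluster:
    nodePools:
      # Unchanged from the chart default.
      - component: main
        replicas: 3
        diskSize: "30Gi"
        roles:
          - cluster_manager
        resources:
          requests: {cpu: "500m", memory: "1Gi"}
          limits: {cpu: "500m", memory: "1Gi"}
      # Hypothetical sizing for a larger data tier.
      - component: data
        replicas: 3
        diskSize: "100Gi"
        roles:
          - data
        resources:
          requests: {cpu: "500m", memory: "4Gi"}
          limits: {cpu: "500m", memory: "4Gi"}
      # Unchanged from the chart default.
      - component: client
        replicas: 1
        diskSize: "30Gi"
        roles:
          - client
        resources:
          requests: {cpu: "500m", memory: "1Gi"}
          limits: {cpu: "500m", memory: "1Gi"}
```

Helm replaces list values wholesale instead of merging them, so the override repeats the main and client pools even though only the data pool changes.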
Usage
Once deployed, OpenSearch can be accessed via OpenSearch Dashboards.
kubectl port-forward svc/opensearch-dashboards 5601:5601
Visit http://localhost:5601 in your browser and log in using the configured credentials.
Conclusion
This guide ensures that OpenSearch is fully integrated into the Greenhouse ecosystem, providing scalable log management and visualization. Additional custom configurations can be introduced to meet specific operational needs.
For troubleshooting and further details, check out the OpenSearch documentation.
2.13 - Perses
Warning
This plugin is in beta. Please report any bugs by creating an issue here.
Table of Contents
- Table of Contents
- Overview
- Disclaimer
- Quick Start
- Configuration
- Create a custom dashboard
- Add Dashboards as ConfigMaps
Learn more about the Perses Plugin. Use it to visualize Prometheus/Thanos metrics for your Greenhouse remote cluster.
The main terms used in this document can be found in core-concepts.
Overview
Observability is often required for the operation and automation of service offerings. Perses is a CNCF project that aims to become an open standard for dashboards and visualization. It provides tools to display Prometheus metrics on live dashboards with insightful charts and visualizations. In the Greenhouse context, it complements the kube-monitoring plugin, which is automatically registered as a data source that Perses recognizes. In addition, the Plugin provides a mechanism that automates the lifecycle of datasources and dashboards without having to restart Perses.
Disclaimer
This is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick Start
This guide provides a quick and straightforward way to use Perses as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-managed Kubernetes remote cluster
- The kube-monitoring Plugin installed with .spec.kubeMonitoring.prometheus.persesDatasource: true and at least one Prometheus instance running in the cluster
The plugin works by default with anonymous access enabled. This plugin comes with some default dashboards and the kube-monitoring datasource will be automatically discovered by the plugin.
Step 1: Add your dashboards and datasources
Dashboards are selected from ConfigMaps across namespaces. The plugin searches for ConfigMaps with the label perses.dev/resource: "true" and imports them into Perses. The ConfigMap must contain a key like my-dashboard.json with the dashboard JSON content. Please refer to this section for more information.
A guide on how to create custom dashboards on the UI can be found here.
Values
Key | Type | Default | Description |
---|---|---|---|
global.commonLabels | object | {} | Labels to add to all resources. This can be used to add a support_group or service label to all resources and alerting rules. |
greenhouse | object | {"defaultDashboards":{"enabled":true}} | Setting defaultDashboards.enabled to true installs a set of default dashboards |
perses.additionalLabels | object | {} | |
perses.annotations | object | {} | Statefulset Annotations |
perses.config.annotations | object | {} | Annotations for config |
perses.config.api_prefix | string | "/perses" | |
perses.config.database | object | {"file":{"extension":"json","folder":"/perses"}} | Database config based on data base type |
perses.config.database.file | object | {"extension":"json","folder":"/perses"} | file system configs |
perses.config.frontend | object | {"important_dashboards":[]} | Important dashboards list |
perses.config.provisioning | object | {"folders":["/etc/perses/provisioning"]} | provisioning config |
perses.config.schemas | object | {"datasources_path":"/etc/perses/cue/schemas/datasources","interval":"5m","panels_path":"/etc/perses/cue/schemas/panels","queries_path":"/etc/perses/cue/schemas/queries","variables_path":"/etc/perses/cue/schemas/variables"} | Schemas paths |
perses.config.security.cookie | object | {"same_site":"lax","secure":false} | cookie config |
perses.config.security.enable_auth | bool | false | Enable Authentication |
perses.config.security.readonly | bool | false | Configure Perses instance as readonly |
perses.fullnameOverride | string | "" | Override fully qualified app name |
perses.image | object | {"name":"persesdev/perses","pullPolicy":"IfNotPresent","version":""} | Image of Perses |
perses.image.name | string | "persesdev/perses" | Perses image repository and name |
perses.image.pullPolicy | string | "IfNotPresent" | Default image pull policy |
perses.image.version | string | "" | Overrides the image tag whose default is the chart appVersion. |
perses.ingress | object | {"annotations":{},"enabled":false,"hosts":[{"host":"perses.local","paths":[{"path":"/","pathType":"Prefix"}]}],"ingressClassName":"","tls":[]} | Configure the ingress resource that allows you to access Perses Frontend ref: https://kubernetes.io/docs/concepts/services-networking/ingress/ |
perses.ingress.annotations | object | {} | Additional annotations for the Ingress resource. To enable certificate autogeneration, place here your cert-manager annotations. For a full list of possible ingress annotations, please see ref: https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/annotations.md |
perses.ingress.enabled | bool | false | Enable ingress controller resource |
perses.ingress.hosts | list | [{"host":"perses.local","paths":[{"path":"/","pathType":"Prefix"}]}] | Default host for the ingress resource |
perses.ingress.ingressClassName | string | "" | IngressClass that will be used to implement the Ingress. This is supported in Kubernetes 1.18+ and required if you have more than one IngressClass marked as the default for your cluster. ref: https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/ |
perses.ingress.tls | list | [] | Ingress TLS configuration |
perses.livenessProbe | object | {"enabled":true,"failureThreshold":5,"initialDelaySeconds":10,"periodSeconds":60,"successThreshold":1,"timeoutSeconds":5} | Liveness probe configuration Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ |
perses.logLevel | string | "info" | Log level for Perses. Available options: “panic”, “error”, “warning”, “info”, “debug”, “trace” |
perses.nameOverride | string | "" | Override name of the chart used in Kubernetes object names. |
perses.persistence | object | {"accessModes":["ReadWriteOnce"],"annotations":{},"enabled":false,"labels":{},"securityContext":{"fsGroup":2000},"size":"8Gi"} | Persistence parameters |
perses.persistence.accessModes | list | ["ReadWriteOnce"] | PVC Access Modes for data volume |
perses.persistence.annotations | object | {} | Annotations for the PVC |
perses.persistence.enabled | bool | false | If disabled, an emptyDir volume is used |
perses.persistence.labels | object | {} | Labels for the PVC |
perses.persistence.securityContext | object | {"fsGroup":2000} | Security context for the PVC when persistence is enabled |
perses.persistence.size | string | "8Gi" | PVC Storage Request for data volume |
perses.readinessProbe | object | {"enabled":true,"failureThreshold":5,"initialDelaySeconds":5,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":5} | Readiness probe configuration Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ |
perses.replicas | int | 1 | Number of pod replicas. |
perses.resources | object | {} | Resource limits & requests. Update according to your own use case as these values might be too low for a typical deployment. ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ |
perses.service | object | {"annotations":{},"labels":{"greenhouse.sap/expose":"true"},"port":8080,"portName":"http","targetPort":8080,"type":"ClusterIP"} | Expose the Perses service to be accessed from outside the cluster (LoadBalancer service) or from within the cluster (ClusterIP service). Set the service type and the port to serve it. |
perses.service.annotations | object | {} | Annotations to add to the service |
perses.service.labels | object | {"greenhouse.sap/expose":"true"} | Labels to add to the service |
perses.service.port | int | 8080 | Service Port |
perses.service.portName | string | "http" | Service Port Name |
perses.service.targetPort | int | 8080 | Perses running port |
perses.service.type | string | "ClusterIP" | Service Type |
perses.serviceAccount | object | {"annotations":{},"create":true,"name":""} | Service account for Perses to use. |
perses.serviceAccount.annotations | object | {} | Annotations to add to the service account |
perses.serviceAccount.create | bool | true | Specifies whether a service account should be created |
perses.serviceAccount.name | string | "" | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
perses.sidecar | object | {"allNamespaces":true,"enabled":true,"label":"perses.dev/resource","labelValue":"true"} | Sidecar configuration that watches for ConfigMaps with the specified label/labelValue and loads them into Perses provisioning |
perses.sidecar.allNamespaces | bool | true | Check for ConfigMaps in all namespaces. When set to false, only ConfigMaps in the same namespace as the Perses instance are checked. |
perses.sidecar.enabled | bool | true | Enable the sidecar container for ConfigMap provisioning |
perses.sidecar.label | string | "perses.dev/resource" | Label key to watch for ConfigMaps containing Perses resources |
perses.sidecar.labelValue | string | "true" | Label value to watch for ConfigMaps containing Perses resources |
perses.volumeMounts | list | [] | Additional VolumeMounts on the output StatefulSet definition. |
perses.volumes | list | [] | Additional volumes on the output StatefulSet definition. |
Create a custom dashboard
- Add a new Project by clicking on ADD PROJECT in the top right corner. Give it a name and click Add.
- Add a new dashboard by clicking on ADD DASHBOARD. Give it a name and click Add.
- Now you can add variables and panels to your dashboard.
- You can group your panels by adding the panels to a Panel Group.
- Move and resize the panels as needed.
- Watch this gif to learn more.
- You do not need to add the kube-monitoring datasource manually. It will be automatically discovered by Perses.
- Click Save after you have made changes.
- Export the dashboard.
- Click on the {} icon in the top right corner of the dashboard.
- Copy the entire JSON model.
- See the next section for detailed instructions on how and where to paste the copied dashboard JSON model.
Add Dashboards as ConfigMaps
By default, a sidecar container is deployed in the Perses pod. This container watches all ConfigMaps in the cluster and selects the ones with the label perses.dev/resource: "true". The files defined in those ConfigMaps are written to a folder that Perses reads. Changes to the ConfigMaps are continuously monitored and are reflected in Perses within 10 minutes.
A recommendation is to use one configmap per dashboard. This way, you can easily manage the dashboards in your git repository.
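To illustrate the mechanism, a single hand-written ConfigMap that the sidecar would pick up could look like the sketch below. The ConfigMap name, the data key, and the project name are hypothetical, and the JSON body is an abbreviated placeholder rather than a complete dashboard; replace it with the JSON model copied from the Perses UI.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Hypothetical name; any valid ConfigMap name works.
  name: my-perses-dashboard
  labels:
    # Default label key/value watched by the sidecar
    # (perses.sidecar.label and perses.sidecar.labelValue).
    perses.dev/resource: "true"
data:
  # Hypothetical file name; the value is a placeholder for the exported JSON model.
  my-dashboard.json: |-
    { "kind": "Dashboard", "metadata": { "name": "my-dashboard", "project": "my-project" }, "spec": { } }
```

If perses.sidecar.allNamespaces is set to false, the ConfigMap has to live in the same namespace as the Perses instance.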
Recommended folder structure
Folder structure:
dashboards/
├── dashboard1.json
├── dashboard2.json
├── prometheusdatasource1.json
├── prometheusdatasource2.json
templates/
├── dashboard-json-configmap.yaml
Helm template to create a configmap for each dashboard:
{{- range $path, $bytes := .Files.Glob "dashboards/*.json" }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ printf "%s-%s" $.Release.Name $path | replace "/" "-" | trunc 63 }}
  labels:
    perses.dev/resource: "true"
data:
{{ printf "%s: |-" $path | replace "/" "-" | indent 2 }}
{{ printf "%s" $bytes | indent 4 }}
{{- end }}
2.14 - Plutono
Learn more about the plutono Plugin. Use it to install the web dashboarding system Plutono to collect, correlate, and visualize Prometheus metrics for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
Observability is often required for the operation and automation of service offerings. Plutono provides you with tools to display Prometheus metrics on live dashboards with insightful charts and visualizations. In the Greenhouse context, this complements the kube-monitoring plugin, which automatically provides a data source that Plutono recognizes. In addition, the Plugin provides a mechanism that automates the lifecycle of datasources and dashboards without having to restart Plutono.
Disclaimer
This is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick Start
This guide provides a quick and straightforward way to use Plutono as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-managed Kubernetes cluster
- kube-monitoring Plugin installed to have at least one Prometheus instance running in the cluster
The plugin works by default with anonymous access enabled. If you use the standard configuration in the kube-monitoring plugin, the data source and some kubernetes-operations dashboards are already pre-installed.
Step 1: Add your dashboards
Dashboards are selected from ConfigMaps across namespaces. The plugin searches for ConfigMaps with the label plutono-dashboard: "true" and imports them into Plutono. The ConfigMap must contain a key like my-dashboard.json with the dashboard JSON content. Example
A guide on how to create dashboards can be found here.
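As a minimal sketch of such a ConfigMap, the following manifest carries the label and a single JSON key. The ConfigMap name and the dashboard fields are placeholders chosen for illustration; a real dashboard exported from Plutono contains many more fields.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Hypothetical name; any valid ConfigMap name works.
  name: my-dashboard
  labels:
    # Default label key/value the dashboard sidecar looks for.
    plutono-dashboard: "true"
data:
  # The key must end in .json; the value is the dashboard JSON model (abbreviated placeholder here).
  my-dashboard.json: |-
    {
      "title": "My Dashboard",
      "uid": "my-dashboard",
      "panels": [],
      "version": 1
    }
```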
Step 2: Add your datasources
Data sources are selected from Secrets across namespaces. The plugin searches for Secrets with the label plutono-datasource: "true" and imports them into Plutono. The Secrets should contain valid datasource configuration YAML. Example
Values
Key | Type | Default | Description |
---|---|---|---|
global.imagePullSecrets | list | [] | To help compatibility with other charts which use global.imagePullSecrets. Allow either an array of {name: pullSecret} maps (k8s-style), or an array of strings (more common helm-style). Can be templated. global: imagePullSecrets: - name: pullSecret1 - name: pullSecret2 or global: imagePullSecrets: - pullSecret1 - pullSecret2 |
global.imageRegistry | string | nil | Overrides the Docker registry globally for all images |
plutono."plutono.ini"."auth.anonymous".enabled | bool | true | |
plutono."plutono.ini"."auth.anonymous".org_role | string | "Admin" | |
plutono."plutono.ini".auth.disable_login_form | bool | true | |
plutono."plutono.ini".log.mode | string | "console" | |
plutono."plutono.ini".paths.data | string | "/var/lib/plutono/" | |
plutono."plutono.ini".paths.logs | string | "/var/log/plutono" | |
plutono."plutono.ini".paths.plugins | string | "/var/lib/plutono/plugins" | |
plutono."plutono.ini".paths.provisioning | string | "/etc/plutono/provisioning" | |
plutono.admin.existingSecret | string | "" | |
plutono.admin.passwordKey | string | "admin-password" | |
plutono.admin.userKey | string | "admin-user" | |
plutono.adminPassword | string | "strongpassword" | |
plutono.adminUser | string | "admin" | |
plutono.affinity | object | {} | Affinity for pod assignment (evaluated as template) ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
plutono.alerting | object | {} | |
plutono.assertNoLeakedSecrets | bool | true | |
plutono.automountServiceAccountToken | bool | true | Should the service account be auto mounted on the pod |
plutono.autoscaling | object | {"behavior":{},"enabled":false,"maxReplicas":5,"minReplicas":1,"targetCPU":"60","targetMemory":""} | Create HorizontalPodAutoscaler object for deployment type |
plutono.containerSecurityContext.allowPrivilegeEscalation | bool | false | |
plutono.containerSecurityContext.capabilities.drop[0] | string | "ALL" | |
plutono.containerSecurityContext.seccompProfile.type | string | "RuntimeDefault" | |
plutono.createConfigmap | bool | true | Enable creating the plutono configmap |
plutono.dashboardProviders | object | {} | |
plutono.dashboards | object | {} | |
plutono.dashboardsConfigMaps | object | {} | |
plutono.datasources | object | {} | |
plutono.deploymentStrategy | object | {"type":"RollingUpdate"} | See kubectl explain deployment.spec.strategy for more. ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy |
plutono.dnsConfig | object | {} | |
plutono.dnsPolicy | string | nil | dns configuration for pod |
plutono.downloadDashboards.env | object | {} | |
plutono.downloadDashboards.envFromSecret | string | "" | |
plutono.downloadDashboards.envValueFrom | object | {} | |
plutono.downloadDashboards.resources | object | {} | |
plutono.downloadDashboards.securityContext.allowPrivilegeEscalation | bool | false | |
plutono.downloadDashboards.securityContext.capabilities.drop[0] | string | "ALL" | |
plutono.downloadDashboards.securityContext.seccompProfile.type | string | "RuntimeDefault" | |
plutono.downloadDashboardsImage.pullPolicy | string | "IfNotPresent" | |
plutono.downloadDashboardsImage.registry | string | "docker.io" | The Docker registry |
plutono.downloadDashboardsImage.repository | string | "curlimages/curl" | |
plutono.downloadDashboardsImage.sha | string | "" | |
plutono.downloadDashboardsImage.tag | string | "8.13.0" | |
plutono.enableKubeBackwardCompatibility | bool | false | Enable backward compatibility for Kubernetes versions below 1.13, which don’t have the enableServiceLinks option |
plutono.enableServiceLinks | bool | true | |
plutono.env | object | {} | |
plutono.envFromConfigMaps | list | [] | The names of configmaps in the same kubernetes namespace which contain values to be added to the environment. Each entry should contain a name key, and can optionally specify whether the configmap must be defined with an optional key. Name is templated. ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#configmapenvsource-v1-core |
plutono.envFromSecret | string | "" | The name of a secret in the same kubernetes namespace which contain values to be added to the environment This can be useful for auth tokens, etc. Value is templated. |
plutono.envFromSecrets | list | [] | The names of secrets in the same kubernetes namespace which contain values to be added to the environment Each entry should contain a name key, and can optionally specify whether the secret must be defined with an optional key. Name is templated. |
plutono.envRenderSecret | object | {} | Sensitive environment variables that will be rendered as a new secret object. This can be useful for auth tokens, etc. If the secret values contain “{{”, they’ll need to be properly escaped so that they are not interpreted by Helm. ref: https://helm.sh/docs/howto/charts_tips_and_tricks/#using-the-tpl-function |
plutono.envValueFrom | object | {} | |
plutono.extraConfigmapMounts | list | [] | Values are templated. |
plutono.extraContainerVolumes | list | [] | Volumes that can be used in init containers that will not be mounted to deployment pods |
plutono.extraContainers | string | "" | Specify additional containers in extraContainers. This is meant to allow adding an authentication proxy to a plutono pod |
plutono.extraEmptyDirMounts | list | [] | |
plutono.extraExposePorts | list | [] | |
plutono.extraInitContainers | list | [] | Additional init containers (evaluated as template) ref: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ |
plutono.extraLabels | object | {"plugin":"plutono"} | Apply extra labels to common labels. |
plutono.extraObjects | list | [] | Create dynamic manifests via values. |
plutono.extraSecretMounts | list | [] | The additional plutono server secret mounts Defines additional mounts with secrets. Secrets must be manually created in the namespace. |
plutono.extraVolumeMounts | list | [] | The additional plutono server volume mounts Defines additional volume mounts. |
plutono.extraVolumes | list | [] | |
plutono.gossipPortName | string | "gossip" | |
plutono.headlessService | bool | false | Create a headless service for the deployment |
plutono.hostAliases | list | [] | overrides pod.spec.hostAliases in the plutono deployment’s pods |
plutono.image | object | {"pullPolicy":"IfNotPresent","pullSecrets":[],"registry":"ghcr.io","repository":"credativ/plutono","sha":"","tag":"v7.5.37"} | Container image settings for Plutono (registry, repository, tag, pull policy and pull secrets). |
plutono.image.pullSecrets | list | [] | Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace. ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ Can be templated. |
plutono.image.tag | string | "v7.5.37" | Overrides the Plutono image tag whose default is the chart appVersion |
plutono.ingress.annotations | object | {} | |
plutono.ingress.enabled | bool | false | |
plutono.ingress.extraPaths | list | [] | Extra paths to prepend to every host configuration. This is useful when working with annotation based services. |
plutono.ingress.hosts[0] | string | "chart-example.local" | |
plutono.ingress.labels | object | {} | |
plutono.ingress.path | string | "/" | |
plutono.ingress.pathType | string | "Prefix" | pathType is only for k8s >= 1.18 |
plutono.ingress.tls | list | [] | |
plutono.lifecycleHooks | object | {} | |
plutono.livenessProbe.failureThreshold | int | 10 | |
plutono.livenessProbe.httpGet.path | string | "/api/health" | |
plutono.livenessProbe.httpGet.port | int | 3000 | |
plutono.livenessProbe.initialDelaySeconds | int | 60 | |
plutono.livenessProbe.timeoutSeconds | int | 30 | |
plutono.namespaceOverride | string | "" | |
plutono.networkPolicy.allowExternal | bool | true | Don’t require a client label for connections. When set to false, only pods with the correct client label have network access to the plutono port. When true, plutono accepts connections from any source (with the correct destination port). |
plutono.networkPolicy.egress.blockDNSResolution | bool | false | When enabled, DNS resolution is blocked for all pods in the plutono namespace. |
plutono.networkPolicy.egress.enabled | bool | false | When enabled, an egress network policy is created that allows plutono to connect to external data sources from the kubernetes cluster. |
plutono.networkPolicy.egress.ports | list | [] | Individual ports to be allowed by the egress policy |
plutono.networkPolicy.egress.to | list | [] | |
plutono.networkPolicy.enabled | bool | false | Enable creation of NetworkPolicy resources. Only Ingress traffic is filtered for now. |
plutono.networkPolicy.explicitNamespacesSelector | object | {} | A Kubernetes LabelSelector to explicitly select namespaces from which traffic is allowed. If explicitNamespacesSelector is missing or set to {}, only client Pods that are in the networkPolicy’s namespace and that match the other criteria (that is, carry the expected label) can reach plutono. To make plutono accessible to clients from other namespaces, use this LabelSelector to select those namespaces; note that the networkPolicy’s namespace must then also be added explicitly. Example: explicitNamespacesSelector: matchLabels: role: frontend matchExpressions: - {key: role, operator: In, values: [frontend]} |
plutono.networkPolicy.ingress | bool | true | When true, enables the creation of an ingress network policy. |
plutono.nodeSelector | object | {} | Node labels for pod assignment ref: https://kubernetes.io/docs/user-guide/node-selection/ |
plutono.persistence | object | {"accessModes":["ReadWriteOnce"],"disableWarning":false,"enabled":false,"extraPvcLabels":{},"finalizers":["kubernetes.io/pvc-protection"],"inMemory":{"enabled":false},"lookupVolumeName":true,"size":"10Gi","type":"pvc"} | Enable persistence using Persistent Volume Claims ref: http://kubernetes.io/docs/user-guide/persistent-volumes/ |
plutono.persistence.extraPvcLabels | object | {} | Extra labels to apply to a PVC. |
plutono.persistence.inMemory | object | {"enabled":false} | If persistence is not enabled, this allows mounting the local storage in-memory to improve performance. |
plutono.persistence.lookupVolumeName | bool | true | If ’lookupVolumeName’ is set to true, Helm will attempt to retrieve the current value of ‘spec.volumeName’ and incorporate it into the template. |
plutono.plugins | list | [] | |
plutono.podDisruptionBudget | object | {} | See kubectl explain poddisruptionbudget.spec for more. ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/ |
plutono.podPortName | string | "plutono" | |
plutono.rbac.create | bool | true | |
plutono.rbac.extraClusterRoleRules | list | [] | |
plutono.rbac.extraRoleRules | list | [] | |
plutono.rbac.namespaced | bool | false | |
plutono.rbac.pspEnabled | bool | false | Use an existing ClusterRole/Role (depending on rbac.namespaced false/true) useExistingRole: name-of-some-role useExistingClusterRole: name-of-some-clusterRole |
plutono.rbac.pspUseAppArmor | bool | false | |
plutono.readinessProbe.httpGet.path | string | "/api/health" | |
plutono.readinessProbe.httpGet.port | int | 3000 | |
plutono.replicas | int | 1 | |
plutono.resources | object | {} | |
plutono.revisionHistoryLimit | int | 10 | |
plutono.securityContext.fsGroup | int | 472 | |
plutono.securityContext.runAsGroup | int | 472 | |
plutono.securityContext.runAsNonRoot | bool | true | |
plutono.securityContext.runAsUser | int | 472 | |
plutono.service | object | {"annotations":{},"appProtocol":"","enabled":true,"ipFamilies":[],"ipFamilyPolicy":"","labels":{"greenhouse.sap/expose":"true"},"loadBalancerClass":"","loadBalancerIP":"","loadBalancerSourceRanges":[],"port":80,"portName":"service","targetPort":3000,"type":"ClusterIP"} | Expose the plutono service to be accessed from outside the cluster (LoadBalancer service) or from within the cluster (ClusterIP service). Set the service type and the port to serve it. ref: http://kubernetes.io/docs/user-guide/services/ |
plutono.service.annotations | object | {} | Service annotations. Can be templated. |
plutono.service.appProtocol | string | "" | Adds the appProtocol field to the service. This allows to work with istio protocol selection. Ex: “http” or “tcp” |
plutono.service.ipFamilies | list | [] | Sets the families that should be supported and the order in which they should be applied to ClusterIP as well. Can be IPv4 and/or IPv6. |
plutono.service.ipFamilyPolicy | string | "" | Set the IP family policy to configure dual-stack; see Configure dual-stack |
plutono.serviceAccount.automountServiceAccountToken | bool | false | |
plutono.serviceAccount.create | bool | true | |
plutono.serviceAccount.labels | object | {} | |
plutono.serviceAccount.name | string | nil | |
plutono.serviceAccount.nameTest | string | nil | |
plutono.serviceMonitor.enabled | bool | false | If true, a ServiceMonitor CR is created for a prometheus operator. ref: https://github.com/coreos/prometheus-operator |
plutono.serviceMonitor.interval | string | "30s" | |
plutono.serviceMonitor.labels | object | {} | namespace: monitoring (defaults to use the namespace this chart is deployed to) |
plutono.serviceMonitor.metricRelabelings | list | [] | |
plutono.serviceMonitor.path | string | "/metrics" | |
plutono.serviceMonitor.relabelings | list | [] | |
plutono.serviceMonitor.scheme | string | "http" | |
plutono.serviceMonitor.scrapeTimeout | string | "30s" | |
plutono.serviceMonitor.targetLabels | list | [] | |
plutono.serviceMonitor.tlsConfig | object | {} | |
plutono.sidecar | object | {"alerts":{"enabled":false,"env":{},"extraMounts":[],"initAlerts":false,"label":"plutono_alert","labelValue":"","reloadURL":"http://localhost:3000/api/admin/provisioning/alerting/reload","resource":"both","script":null,"searchNamespace":null,"sizeLimit":{},"skipReload":false,"watchMethod":"WATCH"},"dashboards":{"SCProvider":true,"defaultFolderName":null,"enabled":true,"env":{},"envValueFrom":{},"extraMounts":[],"folder":"/tmp/dashboards","folderAnnotation":null,"label":"plutono-dashboard","labelValue":"true","provider":{"allowUiUpdates":false,"disableDelete":false,"folder":"","folderUid":"","foldersFromFilesStructure":false,"name":"sidecarProvider","orgid":1,"type":"file"},"reloadURL":"http://localhost:3000/api/admin/provisioning/dashboards/reload","resource":"both","script":null,"searchNamespace":"ALL","sizeLimit":{},"skipReload":false,"watchMethod":"WATCH"},"datasources":{"enabled":true,"env":{},"envValueFrom":{},"initDatasources":false,"label":"plutono-datasource","labelValue":"true","reloadURL":"http://localhost:3000/api/admin/provisioning/datasources/reload","resource":"both","script":null,"searchNamespace":"ALL","sizeLimit":{},"skipReload":false,"watchMethod":"WATCH"},"enableUniqueFilenames":false,"image":{"registry":"quay.io","repository":"kiwigrid/k8s-sidecar","sha":"","tag":"1.30.3"},"imagePullPolicy":"IfNotPresent","livenessProbe":{},"notifiers":{"enabled":false,"env":{},"initNotifiers":false,"label":"plutono_notifier","labelValue":"","reloadURL":"http://localhost:3000/api/admin/provisioning/notifications/reload","resource":"both","script":null,"searchNamespace":null,"sizeLimit":{},"skipReload":false,"watchMethod":"WATCH"},"readinessProbe":{},"resources":{},"securityContext":{"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"seccompProfile":{"type":"RuntimeDefault"}}} | Sidecars that collect the configmaps with the specified label and store the included files in the respective folders. Requires at least Plutono 5 to work and can’t be used together with the parameters dashboardProviders, datasources and dashboards. |
plutono.sidecar.alerts.env | object | {} | Additional environment variables for the alerts sidecar |
plutono.sidecar.alerts.label | string | "plutono_alert" | label that the configmaps with alert are marked with |
plutono.sidecar.alerts.labelValue | string | "" | value of label that the configmaps with alert are set to |
plutono.sidecar.alerts.resource | string | "both" | search in configmap, secret or both |
plutono.sidecar.alerts.searchNamespace | string | nil | If specified, the sidecar will search for alert config-maps inside this namespace. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces |
plutono.sidecar.alerts.watchMethod | string | "WATCH" | Method to use to detect ConfigMap changes. With WATCH the sidecar issues WATCH requests; with SLEEP it lists all ConfigMaps, then sleeps for 60 seconds. |
plutono.sidecar.dashboards.defaultFolderName | string | nil | The default folder name, it will create a subfolder under the folder and put dashboards in there instead |
plutono.sidecar.dashboards.extraMounts | list | [] | Additional dashboard sidecar volume mounts |
plutono.sidecar.dashboards.folder | string | "/tmp/dashboards" | folder in the pod that should hold the collected dashboards (unless defaultFolderName is set) |
plutono.sidecar.dashboards.folderAnnotation | string | nil | If specified, the sidecar will look for annotation with this name to create folder and put graph here. You can use this parameter together with provider.foldersFromFilesStructure to annotate configmaps and create folder structure. |
plutono.sidecar.dashboards.provider | object | {"allowUiUpdates":false,"disableDelete":false,"folder":"","folderUid":"","foldersFromFilesStructure":false,"name":"sidecarProvider","orgid":1,"type":"file"} | watchServerTimeout: request to the server, asking it to cleanly close the connection after that. defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S watchServerTimeout: 3600 watchClientTimeout: is a client-side timeout, configuring your local socket. If you have a network outage dropping all packets with no RST/FIN, this is how long your client waits before realizing & dropping the connection. defaults to 66sec (sic!) watchClientTimeout: 60 provider configuration that lets plutono manage the dashboards |
plutono.sidecar.dashboards.provider.allowUiUpdates | bool | false | allow updating provisioned dashboards from the UI |
plutono.sidecar.dashboards.provider.disableDelete | bool | false | disableDelete to activate an import-only behaviour |
plutono.sidecar.dashboards.provider.folder | string | "" | folder in which the dashboards should be imported in plutono |
plutono.sidecar.dashboards.provider.folderUid | string | "" | folder UID. will be automatically generated if not specified |
plutono.sidecar.dashboards.provider.foldersFromFilesStructure | bool | false | allow Plutono to replicate dashboard structure from filesystem |
plutono.sidecar.dashboards.provider.name | string | "sidecarProvider" | name of the provider, should be unique |
plutono.sidecar.dashboards.provider.orgid | int | 1 | orgid as configured in plutono |
plutono.sidecar.dashboards.provider.type | string | "file" | type of the provider |
plutono.sidecar.dashboards.reloadURL | string | "http://localhost:3000/api/admin/provisioning/dashboards/reload" | Endpoint to send request to reload dashboards |
plutono.sidecar.dashboards.searchNamespace | string | "ALL" | Namespaces list. If specified, the sidecar will search for config-maps/secrets inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces. |
plutono.sidecar.dashboards.sizeLimit | object | {} | Sets the size limit of the dashboard sidecar emptyDir volume |
plutono.sidecar.datasources.env | object | {} | Additional environment variables for the datasources sidecar |
plutono.sidecar.datasources.initDatasources | bool | false | This is needed if skipReload is true, to load any datasources defined at startup time. Deploy the datasources sidecar as an initContainer. |
plutono.sidecar.datasources.reloadURL | string | "http://localhost:3000/api/admin/provisioning/datasources/reload" | Endpoint to send request to reload datasources |
plutono.sidecar.datasources.resource | string | "both" | search in configmap, secret or both |
plutono.sidecar.datasources.script | string | nil | Absolute path to shell script to execute after a datasource got reloaded |
plutono.sidecar.datasources.searchNamespace | string | "ALL" | If specified, the sidecar will search for datasource config-maps inside this namespace. Otherwise the namespace in which the sidecar is running will be used. It’s also possible to specify ALL to search in all namespaces |
plutono.sidecar.datasources.watchMethod | string | "WATCH" | Method to use to detect ConfigMap changes. With WATCH the sidecar issues WATCH requests; with SLEEP it lists all ConfigMaps, then sleeps for 60 seconds. |
plutono.sidecar.image.registry | string | "quay.io" | The Docker registry |
plutono.testFramework.enabled | bool | true | |
plutono.testFramework.image.registry | string | "ghcr.io" | |
plutono.testFramework.image.repository | string | "cloudoperators/greenhouse-extensions-integration-test" | |
plutono.testFramework.image.tag | string | "main" | |
plutono.testFramework.imagePullPolicy | string | "IfNotPresent" | |
plutono.testFramework.resources | object | {} | |
plutono.testFramework.securityContext | object | {} | |
plutono.tolerations | list | [] | Tolerations for pod assignment ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
plutono.topologySpreadConstraints | list | [] | Topology Spread Constraints ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ |
plutono.useStatefulSet | bool | false | |
Example of extraVolumeMounts and extraVolumes
Configure additional volumes with extraVolumes and volume mounts with extraVolumeMounts.
Example for extraVolumeMounts and corresponding extraVolumes:
extraVolumeMounts:
  - name: plugins
    mountPath: /var/lib/plutono/plugins
    subPath: configs/plutono/plugins
    readOnly: false
  - name: dashboards
    mountPath: /var/lib/plutono/dashboards
    hostPath: /usr/shared/plutono/dashboards
    readOnly: false
extraVolumes:
  - name: plugins
    existingClaim: existing-plutono-claim
  - name: dashboards
    hostPath: /usr/shared/plutono/dashboards
Volumes default to emptyDir. Set to persistentVolumeClaim, hostPath, csi, or configMap for other types. For a persistentVolumeClaim, specify an existing claim name with existingClaim.
Import dashboards
There are a few methods to import dashboards to Plutono. Below are some examples and explanations as to how to use each method:
dashboards:
  default:
    some-dashboard:
      json: |
        {
          "annotations":
          ...
          # Complete json file here
          ...
          "title": "Some Dashboard",
          "uid": "abcd1234",
          "version": 1
        }
    custom-dashboard:
      # This is a path to a file inside the dashboards directory inside the chart directory
      file: dashboards/custom-dashboard.json
    prometheus-stats:
      # Ref: https://plutono.com/dashboards/2
      gnetId: 2
      revision: 2
      datasource: Prometheus
    loki-dashboard-quick-search:
      gnetId: 12019
      revision: 2
      datasource:
      - name: DS_PROMETHEUS
        value: Prometheus
    local-dashboard:
      url: https://raw.githubusercontent.com/user/repository/master/dashboards/dashboard.json
Create a dashboard
Click Dashboards in the main menu.
Click New and select New Dashboard.
Click Add new empty panel.
Important: Add a datasource variable, because data sources are provisioned in the cluster.
- Go to Dashboard settings.
- Click Variables.
- Click Add variable.
- General: Configure the variable with a proper Name and the Type Datasource.
- Data source options: Select the data source Type, e.g. Prometheus.
- Click Update.
- Go back.
Develop your panels.
- On the Edit panel view, choose your desired Visualization.
- Select the datasource variable you just created.
- Write or construct a query in the query language of your data source.
- Move and resize the panels as needed.
Optionally add a tag to the dashboard to make grouping easier.
- Go to Dashboard settings.
- In the General section, add a Tag.
Click Save. Note that the dashboard is saved in the browser’s local storage.
Export the dashboard.
- Go to Dashboard settings.
- Click JSON Model.
- Copy the JSON model.
- Go to your GitHub repository and create a new JSON file in the dashboards directory.
BASE64 dashboards
Dashboards can be stored on a server that does not return JSON directly but returns a Base64-encoded file instead (e.g. Gerrit). For this case, the url use case accepts an additional parameter: if you set b64content to true after the url entry, the content is Base64-decoded before the file is saved to disk. If this entry is not set or is false, no decoding is applied to the file before it is saved to disk.
Gerrit use case
The Gerrit API for downloading files has the following schema: https://yourgerritserver/a/{project-name}/branches/{branch-id}/files/{file-id}/content, where {project-name} and {file-id} usually contain ‘/’ characters, which MUST be replaced by %2F. For example, if project-name is user/repo, branch-id is master and file-id is dir1/dir2/dashboard, the url value is https://yourgerritserver/a/user%2Frepo/branches/master/files/dir1%2Fdir2%2Fdashboard/content
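Combining the Base64 option with the Gerrit URL schema, a dashboard entry might be declared as in the following sketch. The dashboard key gerrit-dashboard is hypothetical, the host is the placeholder used in the text, and the keys are written relative to the chart values in the same way as the import example earlier in this section.

```yaml
dashboards:
  default:
    # Hypothetical dashboard key, chosen for illustration.
    gerrit-dashboard:
      # project-name "user/repo" and file-id "dir1/dir2/dashboard", with "/" encoded as %2F
      url: https://yourgerritserver/a/user%2Frepo/branches/master/files/dir1%2Fdir2%2Fdashboard/content
      # Decode the Base64 response before the file is written to disk
      b64content: true
```

Leaving out b64content (or setting it to false) stores the downloaded content unchanged.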
Sidecar for dashboards
If the parameter sidecar.dashboards.enabled is set, a sidecar container is deployed in the plutono pod. This container watches all configmaps (or secrets) in the cluster and selects the ones with a label as defined in sidecar.dashboards.label. The files defined in those configmaps are written to a folder and accessed by plutono. Changes to the configmaps are monitored and the imported dashboards are deleted/updated.
A recommendation is to use one configmap per dashboard, as a reduction of multiple dashboards inside one configmap is currently not properly mirrored in plutono.
NOTE: Configure your data sources in your dashboards as variables to keep them portable across clusters.
Example dashboard config:
Folder structure:
dashboards/
├── dashboard1.json
├── dashboard2.json
templates/
├── dashboard-json-configmap.yaml
Helm template to create a configmap for each dashboard:
{{- range $path, $bytes := .Files.Glob "dashboards/*.json" }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ printf "%s-%s" $.Release.Name $path | replace "/" "-" | trunc 63 }}
  labels:
    plutono-dashboard: "true"
data:
{{ printf "%s: |-" $path | replace "/" "-" | indent 2 }}
{{ printf "%s" $bytes | indent 4 }}
{{- end }}
Sidecar for datasources
If the parameter sidecar.datasources.enabled is set, an init container is deployed in the plutono pod. This container lists all secrets (or configmaps, though not recommended) in the cluster and filters out the ones with a label as defined in sidecar.datasources.label. The files defined in those secrets are written to a folder and accessed by plutono on startup. Using these yaml files, the data sources in plutono can be imported.
If you want Plutono to reload datasources each time the config changes, set sidecar.datasources.skipReload: false and adjust sidecar.datasources.reloadURL to http://<svc-name>.<namespace>.svc.cluster.local/api/admin/provisioning/datasources/reload.
Secrets are recommended over configmaps for this use case because datasources usually contain private data like usernames and passwords. Secrets are the more appropriate cluster resource to manage those.
Example datasource config:
apiVersion: v1
kind: Secret
metadata:
  name: plutono-datasources
  labels:
    # default value for: sidecar.datasources.label
    plutono-datasource: "true"
stringData:
  datasources.yaml: |-
    apiVersion: 1
    datasources:
      - name: my-prometheus
        type: prometheus
        access: proxy
        orgId: 1
        url: my-url-domain:9090
        isDefault: false
        jsonData:
          httpMethod: 'POST'
        editable: false
NOTE: If you include credentials in your datasource configuration, make sure not to use stringData but base64-encoded data instead.
apiVersion: v1
kind: Secret
metadata:
  name: my-datasource
  labels:
    plutono-datasource: "true"
data:
  # The key must contain a unique name and the .yaml file type
  my-datasource.yaml: {{ include (print $.Template.BasePath "my-datasource.yaml") . | b64enc }}
Example values to add a datasource adapted from Grafana:
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      # <string, required> Sets the name you use to refer to
      # the data source in panels and queries.
      - name: my-prometheus
        # <string, required> Sets the data source type.
        type: prometheus
        # <string, required> Sets the access mode, either
        # proxy or direct (Server or Browser in the UI).
        # Some data sources are incompatible with any setting
        # but proxy (Server).
        access: proxy
        # <int> Sets the organization id. Defaults to orgId 1.
        orgId: 1
        # <string> Sets a custom UID to reference this
        # data source in other parts of the configuration.
        # If not specified, Plutono generates one.
        uid:
        # <string> Sets the data source's URL, including the
        # port.
        url: my-url-domain:9090
        # <string> Sets the database user, if necessary.
        user:
        # <string> Sets the database name, if necessary.
        database:
        # <bool> Enables basic authorization.
        basicAuth:
        # <string> Sets the basic authorization username.
        basicAuthUser:
        # <bool> Enables credential headers.
        withCredentials:
        # <bool> Toggles whether the data source is pre-selected
        # for new panels. You can set only one default
        # data source per organization.
        isDefault: false
        # <map> Fields to convert to JSON and store in jsonData.
        jsonData:
          httpMethod: 'POST'
          # <bool> Enables TLS authentication using a client
          # certificate configured in secureJsonData.
          # tlsAuth: true
          # <bool> Enables TLS authentication using a CA
          # certificate.
          # tlsAuthWithCACert: true
        # <map> Fields to encrypt before storing in jsonData.
        secureJsonData:
          # <string> Defines the CA cert, client cert, and
          # client key for encrypted authentication.
          # tlsCACert: '...'
          # tlsClientCert: '...'
          # tlsClientKey: '...'
          # <string> Sets the database password, if necessary.
          # password:
          # <string> Sets the basic authorization password.
          # basicAuthPassword:
        # <int> Sets the version. Used to compare versions when
        # updating. Ignored when creating a new data source.
        version: 1
        # <bool> Allows users to edit data sources from the
        # Plutono UI.
        editable: false
How to serve Plutono with a path prefix (/plutono)
In order to serve Plutono with a prefix (e.g., http://example.com/plutono), add the following to your values.yaml.
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
    nginx.ingress.kubernetes.io/use-regex: "true"
  path: /plutono/?(.*)
  hosts:
    - k8s.example.dev
plutono.ini:
  server:
    root_url: http://localhost:3000/plutono # this host can be localhost
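How the path pattern and the rewrite target interact can be checked with a small Python sketch. It imitates the capture-and-rewrite step only, under the assumption that the pattern is matched from the start of the request path; the function name rewrite is hypothetical and no ingress controller code is involved.

```python
import re
from typing import Optional

# Same pattern as the ingress path above, anchored at the start of the request path.
PATH_PATTERN = re.compile(r"^/plutono/?(.*)")


def rewrite(request_path: str) -> Optional[str]:
    """Return the path forwarded to Plutono, or None if the ingress rule does not match."""
    match = PATH_PATTERN.match(request_path)
    if match is None:
        return None
    # rewrite-target "/$1": a slash followed by the first capture group.
    return "/" + match.group(1)


print(rewrite("/plutono/api/health"))
```

The prefix is stripped before the request reaches Plutono, which is why root_url has to carry the /plutono suffix so that generated links point back to the prefixed location.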
How to securely reference secrets in plutono.ini
This example uses Plutono file providers for secret values and the extraSecretMounts configuration flag (Additional plutono server secret mounts) to mount the secrets.
In plutono.ini:
plutono.ini:
  [auth.generic_oauth]
    enabled = true
    client_id = $__file{/etc/secrets/auth_generic_oauth/client_id}
    client_secret = $__file{/etc/secrets/auth_generic_oauth/client_secret}
Existing secret, or created along with helm:
---
apiVersion: v1
kind: Secret
metadata:
  name: auth-generic-oauth-secret
type: Opaque
stringData:
  client_id: <value>
  client_secret: <value>
Include in the extraSecretMounts configuration flag:
extraSecretMounts:
  - name: auth-generic-oauth-secret-mount
    secretName: auth-generic-oauth-secret
    defaultMode: 0440
    mountPath: /etc/secrets/auth_generic_oauth
    readOnly: true
2.15 - Prometheus
Learn more about the prometheus plugin. Use it to deploy a single Prometheus for your Greenhouse cluster.
The main terminologies used in this document can be found in core-concepts.
Overview
Observability is often required for operation and automation of service offerings. To get the insights provided by an application and the container runtime environment, you need telemetry data in the form of metrics or logs sent to backends such as Prometheus or OpenSearch. With the prometheus Plugin, you will be able to cover the metrics part of the observability stack.
This Plugin includes a pre-configured package of Prometheus that helps make getting started easy and efficient. At its core, an automated and managed Prometheus installation is provided using the prometheus-operator.
Components included in this Plugin:
- Prometheus
- optional: Prometheus Operator
Disclaimer
This Plugin is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the plugin according to your needs.
The Plugin is a configured kube-prometheus-stack Helm chart, which helps to keep track of versions and community updates. The intention is to deliver a pre-configured package that works out of the box and can be extended by following the guide.
It is also worth mentioning that we reuse the existing kube-monitoring Greenhouse plugin Helm chart, which already preconfigures Prometheus, just by disabling the Kubernetes component scrapers and exporters.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to deploy prometheus as a Greenhouse Plugin on your Kubernetes cluster.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- Installed prometheus-operator and its custom resource definitions (CRDs). As a foundation we recommend installing the kube-monitoring plugin first in your cluster to provide the prometheus-operator and its CRDs.
There are two paths to do it:
- Go to Greenhouse dashboard and select the Prometheus plugin from the catalog. Specify the cluster and required option values.
- Create and specify a Plugin resource in your Greenhouse central cluster according to the examples.
Step 1:
If you want to run the prometheus plugin without installing kube-monitoring in the first place, then you need to switch kubeMonitoring.prometheusOperator.enabled and kubeMonitoring.crds.enabled to true.
Step 2:
After installation, Greenhouse will provide a generated link to the Prometheus user interface. This is done via the annotation greenhouse.sap/expose: "true" at the Prometheus Service resource.
Step 3:
Greenhouse regularly performs integration tests that are bundled with prometheus. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
Configuration
Global options
Name | Description | Value |
---|---|---|
global.commonLabels | Labels to add to all resources. This can be used to add a support_group or service label to all resources and alerting rules. | true |
Prometheus-operator options
Name | Description | Value |
---|---|---|
kubeMonitoring.prometheusOperator.enabled | Manages Prometheus and Alertmanager components | true |
kubeMonitoring.prometheusOperator.alertmanagerInstanceNamespaces | Filter namespaces to look for prometheus-operator Alertmanager resources | [] |
kubeMonitoring.prometheusOperator.alertmanagerConfigNamespaces | Filter namespaces to look for prometheus-operator AlertmanagerConfig resources | [] |
kubeMonitoring.prometheusOperator.prometheusInstanceNamespaces | Filter namespaces to look for prometheus-operator Prometheus resources | [] |
Prometheus options
Name | Description | Value |
---|---|---|
kubeMonitoring.prometheus.enabled | Deploy a Prometheus instance | true |
kubeMonitoring.prometheus.annotations | Annotations for Prometheus | {} |
kubeMonitoring.prometheus.tlsConfig.caCert | CA certificate to verify technical clients at Prometheus Ingress | Secret |
kubeMonitoring.prometheus.ingress.enabled | Deploy Prometheus Ingress | true |
kubeMonitoring.prometheus.ingress.hosts | Must be provided if Ingress is enabled. | [] |
kubeMonitoring.prometheus.ingress.ingressClassname | Specifies the ingress-controller | nginx |
kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage | How large the persistent volume should be to house the prometheus database. Default 50Gi. | "" |
kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName | The storage class to use for the persistent volume. | "" |
kubeMonitoring.prometheus.prometheusSpec.scrapeInterval | Interval between consecutive scrapes. Defaults to 30s | "" |
kubeMonitoring.prometheus.prometheusSpec.scrapeTimeout | Number of seconds to wait for target to respond before erroring | "" |
kubeMonitoring.prometheus.prometheusSpec.evaluationInterval | Interval between consecutive evaluations | "" |
kubeMonitoring.prometheus.prometheusSpec.externalLabels | External labels to add to any time series or alerts when communicating with external systems like Alertmanager | {} |
kubeMonitoring.prometheus.prometheusSpec.ruleSelector | PrometheusRules to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.serviceMonitorSelector | ServiceMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.podMonitorSelector | PodMonitors to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.probeSelector | Probes to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.scrapeConfigSelector | scrapeConfigs to be selected for target discovery. Defaults to { matchLabels: { plugin: <metadata.name> } } | {} |
kubeMonitoring.prometheus.prometheusSpec.retention | How long to retain metrics | "" |
kubeMonitoring.prometheus.prometheusSpec.logLevel | Log level to be configured for Prometheus | "" |
kubeMonitoring.prometheus.prometheusSpec.additionalScrapeConfigs | Next to ScrapeConfig CRD, you can use AdditionalScrapeConfigs, which allows specifying additional Prometheus scrape configurations | "" |
kubeMonitoring.prometheus.prometheusSpec.additionalArgs | Allows setting additional arguments for the Prometheus container | [] |
Alertmanager options
Name | Description | Value |
---|---|---|
alerts.enabled | To send alerts to Alertmanager | false |
alerts.alertmanager.hosts | List of Alertmanager hosts Prometheus can send alerts to | [] |
alerts.alertmanager.tlsConfig.cert | TLS certificate for communication with Alertmanager | Secret |
alerts.alertmanager.tlsConfig.key | TLS key for communication with Alertmanager | Secret |
Service Discovery
The prometheus Plugin provides a PodMonitor to automatically discover the Prometheus metrics of the Kubernetes Pods in any Namespace. The PodMonitor is configured to detect the metrics endpoint of the Pods if the following annotations are set:
metadata:
  annotations:
    greenhouse/scrape: "true"
    greenhouse/target: <prometheus plugin name>
Note: The annotations need to be added manually to have the pod scraped, and the port name needs to match.
Examples
Deploy prometheus into a remote cluster
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: prometheus
spec:
  pluginDefinition: prometheus
  disabled: false
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.retention
      value: 30d
    - name: kubeMonitoring.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage
      value: 100Gi
    - name: kubeMonitoring.prometheus.service.labels
      value:
        greenhouse.sap/expose: "true"
    - name: kubeMonitoring.prometheus.prometheusSpec.externalLabels
      value:
        cluster: example-cluster
        organization: example-org
        region: example-region
    - name: alerts.enabled
      value: true
    - name: alerts.alertmanagers.hosts
      value:
        - alertmanager.dns.example.com
    - name: alerts.alertmanagers.tlsConfig.cert
      valueFrom:
        secret:
          key: tls.crt
          name: tls-prometheus-<org-name>
    - name: alerts.alertmanagers.tlsConfig.key
      valueFrom:
        secret:
          key: tls.key
          name: tls-prometheus-<org-name>
Extension of the plugin
prometheus can be extended with your own alerting rules and target configurations via the Custom Resource Definitions (CRDs) of the prometheus-operator. The user-defined resources to be incorporated with the desired configuration are defined via label selections.
The CRD PrometheusRule enables the definition of alerting and recording rules that can be used by Prometheus or Thanos Rule instances. Alerts and recording rules are reconciled and dynamically loaded by the operator without having to restart Prometheus or Thanos Rule.
prometheus will automatically discover and load the rules that match labels plugin: <plugin-name>.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule
  labels:
    plugin: <metadata.name>
    ## e.g plugin: prometheus-network
spec:
  groups:
    - name: example-group
      rules:
      ...
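Selection by matchLabels is a subset check: a resource is selected when every key/value pair of the selector is present in its labels. The Python sketch below illustrates that rule only; the function name is hypothetical and the label values are taken from the example comment above.

```python
def matches_labels(match_labels: dict, resource_labels: dict) -> bool:
    """True if every selector pair is present, with the same value, on the resource."""
    return all(resource_labels.get(key) == value for key, value in match_labels.items())


selector = {"plugin": "prometheus-network"}
print(matches_labels(selector, {"plugin": "prometheus-network", "team": "example"}))
```

Extra labels on the resource do not prevent a match; a missing or different plugin label does.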
The CRDs PodMonitor, ServiceMonitor, Probe and ScrapeConfig allow the definition of a set of target endpoints to be scraped by prometheus. The operator will automatically discover and load the configurations that match labels plugin: <plugin-name>.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor
  labels:
    plugin: <metadata.name>
    ## e.g plugin: prometheus-network
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  podMetricsEndpoints:
    - port: http
  ...
2.16 - Service exposure test
This Plugin is just providing a simple exposed service for manual testing.
By adding the following label to a service it will become accessible from the central greenhouse system via a service proxy:
greenhouse.sap/expose: "true"
This plugin creates an nginx deployment with an exposed service for testing.
Configuration
Specific port
By default expose would always use the first port. If you need another port, you’ve got to specify it by name:
greenhouse.sap/exposeNamedPort: YOURPORTNAME
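The port selection rule described above (first port by default, otherwise the port with the given name) can be sketched as follows. This is an illustrative reading of the rule, not Greenhouse's implementation; the function name and the port dictionaries are assumptions.

```python
from typing import Optional


def select_exposed_port(ports: list, named_port: Optional[str] = None) -> dict:
    """Pick the service port to expose.

    ports: service ports in declaration order, e.g. {"name": "http", "port": 80}.
    named_port: value of greenhouse.sap/exposeNamedPort, if set.
    """
    if named_port is None:
        # Default behaviour: the first declared port is used.
        return ports[0]
    for port in ports:
        if port.get("name") == named_port:
            return port
    raise LookupError(f"no service port named {named_port!r}")


service_ports = [{"name": "http", "port": 80}, {"name": "metrics", "port": 9090}]
print(select_exposed_port(service_ports, "metrics"))
```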
2.17 - Teams2Slack
Introduction
This Plugin provides a Slack integration for a Greenhouse organization.
It manages Slack entities like channels, groups, handles, etc. and their members based on the teams configured in your Greenhouse organization.
Important: Please ensure that only one deployment of Teams2Slack runs against the same set of groups in Slack. Secondary instances should run in the provided Dry-Run mode. Otherwise you might notice inconsistencies if the Teammembership objects of the clusters are unequal.
Requirements
- A Kubernetes Cluster to run against
- The presence of the Greenhouse Teammemberships CRD and corresponding objects.
Architecture
The Teammembership objects contain the members of a team. Changes to an object create an event in Kubernetes. This event is consumed by the first controller, which creates a mirrored SlackGroup object that reflects the content of the Teammembership object. This approach has the advantage that the deletion of a team can be reliably detected by using finalizers. The second controller detects changes on SlackGroup objects and aligns the users present in a team with a Slack group.
Configuration
Deploy the Teams2Slack Plugin with a Plugin resource and a Secret structured as follows (only the mandatory fields are shown):
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: teams2slack
  namespace: default
spec:
  pluginDefinition: teams2slack
  disabled: false
  optionValues:
    - name: groupNamePrefix
      value:
    - name: groupNameSuffix
      value:
    - name: infoChannelID
      value:
    - name: token
      valueFrom:
        secret:
          key: SLACK_TOKEN
          name: teams2slack-secret
---
apiVersion: v1
kind: Secret
metadata:
  name: teams2slack-secret
type: Opaque
data:
  SLACK_TOKEN: # Slack token, base64-encoded
The values that can or need to be provided have the following meaning:
Environment Variable | Meaning |
---|---|
groupNamePrefix (mandatory) | The prefix the created slack group should have. Choose a prefix that matches your organization. |
groupNameSuffix (mandatory) | The suffix the created slack group should have. Choose a suffix that matches your organization. |
infoChannelID (mandatory) | The channel ID created Slack Groups should have. You can currently define one slack ID which will be applied to all created groups. Make sure to take the channel ID and not the channel name. |
token (mandatory) | The Slack token to authenticate against Slack. |
eventRequeueTimer (optional) | If a Slack API request fails due to a network error, or because data is currently being fetched, it is requeued to the operator's workQueue. Uses the Go duration format (1s = every second, 1m = every minute). |
loadDataBackoffTimer (optional) | Defines when a Slack API data call occurs. Uses the Go duration format. |
dryRun (optional) | Slack write operations are not executed if the value is set to true. Requires a valid SLACK_TOKEN; the other environment variables can be mocked. |
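Assuming the timer values are duration strings of the kind Go's time.ParseDuration accepts (for example 1s, 1m or 1h30m), the following Python sketch shows how such a value translates into a time span. It is a simplified parser written for illustration: it handles unsigned values only, and the function name is hypothetical.

```python
import re
from datetime import timedelta

# Seconds per unit for the suffixes used in Go duration strings.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> timedelta:
    """Parse an unsigned duration string such as '1s', '1m' or '1h30m'."""
    position = 0
    total_seconds = 0.0
    for token in _TOKEN.finditer(text):
        if token.start() != position:
            raise ValueError(f"invalid duration: {text!r}")
        total_seconds += float(token.group(1)) * _UNIT_SECONDS[token.group(2)]
        position = token.end()
    if position == 0 or position != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(seconds=total_seconds)


print(parse_duration("1m"))
```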
2.18 - Thanos
Learn more about the Thanos Plugin. Use it to enable extended metrics retention and querying across Prometheus servers and Greenhouse clusters.
The main terminologies used in this document can be found in core-concepts.
Overview
Thanos is a set of components that can be used to extend the storage and retrieval of metrics in Prometheus. It allows you to store metrics in a remote object store and query them across multiple Prometheus servers and Greenhouse clusters. This Plugin is intended to provide a set of pre-configured Thanos components that enable a proven composition. At the core, a set of Thanos components is installed that adds long-term storage capability to a single kube-monitoring Plugin and makes both current and historical data available again via one Thanos Query component.
The Thanos Sidecar is a component that is deployed as a container together with a Prometheus instance. This allows Thanos to optionally upload metrics to the object store and Thanos Query to access Prometheus data via a common, efficient StoreAPI.
The Thanos Compact component applies the Prometheus 2.0 Storage Engine compaction process to data uploaded to the object store. The Compactor is also responsible for applying the configured retention and downsampling of the data.
The Thanos Store also implements the StoreAPI and serves the historical data from an object store. It acts primarily as an API gateway and has no persistence itself.
Thanos Query implements the Prometheus HTTP v1 API for querying data in a Thanos cluster via PromQL. In short, it collects the data needed to evaluate the query from the connected StoreAPIs, evaluates the query and returns the result.
This plugin deploys the following Thanos components:
Planned components:
This Plugin does not deploy the following components:
- Thanos Sidecar: this component is installed in the kube-monitoring plugin.
Disclaimer
It is not meant to be a comprehensive package that covers all scenarios. If you are an expert, feel free to configure the Plugin according to your needs.
Contribution is highly appreciated. If you discover bugs or want to add functionality to the plugin, then pull requests are always welcome.
Quick start
This guide provides a quick and straightforward way to use Thanos as a Greenhouse Plugin on your Kubernetes cluster. The guide is meant to build the following setup.
Prerequisites
- A running and Greenhouse-onboarded Kubernetes cluster. If you don’t have one, follow the Cluster onboarding guide.
- Ready to use credentials for a compatible object store
- kube-monitoring plugin installed. Thanos Sidecar on the Prometheus must be enabled by providing the required object store credentials.
Step 1:
Create a Kubernetes Secret with your object store credentials following the Object Store preparation section.
Step 2:
Enable the Thanos Sidecar on the Prometheus in the kube-monitoring plugin by providing the required object store credentials. Follow the kube-monitoring plugin enablement section.
Step 3:
Create a Thanos Query Plugin by following the Thanos Query section.
Configuration
Object Store preparation
To run Thanos, you need object storage credentials. Get the credentials of your provider and add them to a Kubernetes Secret. The Thanos documentation provides a great overview on the different supported store types.
Usually this looks somewhat like this
type: $STORAGE_TYPE
config:
  user:
  password:
  domain:
  ...
If you’ve got everything in a file, deploy it in your remote cluster in the namespace where Prometheus and Thanos will run.
Important: $THANOS_PLUGIN_NAME is needed later for the respective Thanos plugin, and the two names must match!
kubectl create secret generic $THANOS_PLUGIN_NAME-metrics-objectstore --from-file=thanos.yaml=/path/to/your/file
kube-monitoring plugin enablement
Prometheus in kube-monitoring needs to be altered to have a sidecar and ship metrics to the new object store too. You have to provide the Secret you’ve just created to the (most likely already existing) kube-monitoring plugin. Add this:
spec:
  optionValues:
    - name: kubeMonitoring.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret.key
      value: thanos.yaml
    - name: kubeMonitoring.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret.name
      value: $THANOS_PLUGIN_NAME-metrics-objectstore
Values used here are described in the Prometheus Operator Spec.
Thanos Query
This is the real deal now: Define your Thanos Query by creating a plugin.
NOTE1: $THANOS_PLUGIN_NAME needs to be consistent with your secret created earlier.
NOTE2: The releaseNamespace needs to be the same as the namespace where kube-monitoring resides. By default this is kube-monitoring.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: $YOUR_CLUSTER_NAME
spec:
  pluginDefinition: thanos
  disabled: false
  clusterName: $YOUR_CLUSTER_NAME
  releaseNamespace: kube-monitoring
Thanos Ruler
Thanos Ruler evaluates Prometheus rules against a chosen query API. This allows evaluation of rules using metrics from different Prometheus instances.
To enable Thanos Ruler component creation (Thanos Ruler is disabled by default) you have to set:
spec:
  optionValues:
    - name: thanos.ruler.enabled
      value: true
Configuration
Alertmanager
For Thanos Ruler to communicate with Alertmanager, enable the appropriate configuration and provide the Plugin with the names of the secret and keys that contain the necessary SSO key and certificate.
Example of Plugin setup with Thanos Ruler using Alertmanager
spec:
  optionValues:
    - name: thanos.ruler.enabled
      value: true
    - name: thanos.ruler.alertmanagers.enabled
      value: true
    - name: thanos.ruler.alertmanagers.authentication.ssoCert
      valueFrom:
        secret:
          key: $KEY_NAME
          name: $SECRET_NAME
    - name: thanos.ruler.alertmanagers.authentication.ssoKey
      valueFrom:
        secret:
          key: $KEY_NAME
          name: $SECRET_NAME
[OPTIONAL] Handling your Prometheus and Thanos Stores.
Default Prometheus and Thanos Endpoint
Thanos Query is automatically adding the Prometheus and Thanos endpoints. If you just have a single Prometheus with Thanos enabled this will work out of the box. Details in the next two chapters. See Standalone Query for your own configuration.
Prometheus Endpoint
Thanos Query checks for a service prometheus-operated in the same namespace that exposes GRPC port 10901. The CLI option looks like this and is configured in the Plugin itself:
--store=prometheus-operated:10901
Thanos Endpoint
Thanos Query checks for a Thanos endpoint named like releaseName-store. The associated command line flag for this parameter looks like:
--store=thanos-kube-store:10901
If you have just one occurrence of this Thanos plugin deployed, the default option works and nothing else is needed.
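The two default endpoints described above follow a simple naming pattern, which the Python sketch below spells out. The function name is hypothetical; the service name prometheus-operated, the -store suffix and port 10901 are taken from the text, and thanos-kube is the release name used in the example flag.

```python
GRPC_PORT = 10901


def default_store_flags(release_name: str) -> list:
    """Return the --store flags for the default Prometheus and Thanos Store endpoints."""
    return [
        # Prometheus with Thanos Sidecar, reachable via the operator-managed service.
        f"--store=prometheus-operated:{GRPC_PORT}",
        # Thanos Store of this plugin release: <releaseName>-store.
        f"--store={release_name}-store:{GRPC_PORT}",
    ]


print(default_store_flags("thanos-kube"))
```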
Standalone Query
In case you want to achieve a setup like above and have an overarching Thanos Query to run with multiple Stores, you can set it to standalone and add your own store list. Set up your Plugin like this:
spec:
  optionValues:
    - name: thanos.query.standalone
      value: true
This would enable you to either:
- query multiple stores with a single Query
  spec:
    optionValues:
      - name: thanos.query.stores
        value:
          - thanos-kube-1-store:10901
          - thanos-kube-2-store:10901
          - kube-monitoring-1-prometheus:10901
          - kube-monitoring-2-prometheus:10901
- query multiple Thanos Queries with a single Query. Note that there is no -store suffix in this case.
  spec:
    optionValues:
      - name: thanos.query.stores
        value:
          - thanos-kube-1:10901
          - thanos-kube-2:10901
Query GRPC Ingress
To expose the Thanos Query GRPC endpoint externally, you can configure an ingress resource. This is useful for enabling external tools or other clusters to query the Thanos Query component. Example configuration for enabling GRPC ingress:
grpc:
  enabled: true
  hosts:
    - host: thanos.local
      paths:
        - path: /
          pathType: ImplementationSpecific
TLS Ingress
To enable TLS for the Thanos Query GRPC endpoint, you can configure a TLS secret. This is useful for securing the communication between external clients and the Thanos Query component. Example configuration for enabling TLS ingress:
tls:
  - secretName: ingress-cert
    hosts: [thanos.local]
Thanos Global Query
In the case of a multi-cluster setup, you may want your Thanos Query to be able to query all Thanos components in all clusters. This is possible by leveraging GRPC Ingress and TLS Ingress.
If your remote clusters are reachable via a common domain, you can add the endpoints of the remote clusters to the stores list in the Thanos Query configuration. This allows the Thanos Query to query all Thanos components across all clusters.
spec:
  optionValues:
    - name: thanos.query.stores
      value:
        - thanos.local-1:443
        - thanos.local-2:443
        - thanos.local-3:443
Pay attention to port numbers. The default port for GRPC is 443.
Disable Individual Thanos Components
It is possible to disable certain Thanos components for your deployment. To do so, add the necessary configuration to your Plugin (currently it is not possible to disable the query component):
- name: thanos.store.enabled
  value: false
- name: thanos.compactor.enabled
  value: false
Thanos Component | Enabled by default | Deactivatable | Flag |
---|---|---|---|
Query | True | False | n/a |
Store | True | True | thanos.store.enabled |
Compactor | True | True | thanos.compactor.enabled |
Ruler | False | True | thanos.ruler.enabled |
Operations
Thanos Compactor
If you deploy the plugin with the default values, Thanos compactor will be shipped too and use the same secret ($THANOS_PLUGIN_NAME-metrics-objectstore) to retrieve, compact and push back timeseries.
Based on experience, a 100Gi PVC is used in order not to overload the ephemeral storage of the Kubernetes Nodes. Depending on the configured retention and the amount of metrics, this may not be sufficient and larger volumes may be required. In any case, it is always safe to clear the volume of the compactor and increase it if necessary.
The object storage costs are heavily impacted by how granularly timeseries are stored (reference Downsampling). These are the pre-configured defaults; you can change them as needed:
raw: 7776000s (90d)
5m: 7776000s (90d)
1h: 157680000s (5y)
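The retention values are plain second counts. As a quick check against the defaults listed in the Values table (7776000s and 157680000s), the conversion from days can be written out in Python. The helper name is made up for this sketch, and a year is counted as 365 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days: int) -> int:
    """Convert a retention period in days to seconds."""
    return days * SECONDS_PER_DAY


raw_and_5m = days_to_seconds(90)      # 90 days
one_hour = days_to_seconds(5 * 365)   # 5 years of 365 days
print(f"{raw_and_5m}s {one_hour}s")
```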
Thanos ServiceMonitor
ServiceMonitor configures Prometheus to scrape metrics from all the deployed Thanos components.
To enable the creation of a ServiceMonitor we can use the Thanos Plugin configuration.
NOTE: You have to provide the serviceMonitorSelector matchLabels of your Prometheus instance. In the greenhouse context this should look like ‘plugin: $PROMETHEUS_PLUGIN_NAME’
spec:
  optionValues:
    - name: thanos.serviceMonitor.selfMonitor
      value: true
    - name: thanos.serviceMonitor.labels
      value:
        plugin: $PROMETHEUS_PLUGIN_NAME
Values
Key | Type | Default | Description |
---|---|---|---|
global.commonLabels | object | the chart will add some internal labels automatically | Labels to apply to all resources |
global.imageRegistry | string | nil | Overrides the registry globally for all images |
thanos.compactor.additionalArgs | list | [] | Adding additional arguments to Thanos Compactor |
thanos.compactor.annotations | object | {} | Annotations to add to the Thanos Compactor resources |
thanos.compactor.compact.cleanupInterval | string | 1800s | Set Thanos Compactor compact.cleanup-interval |
thanos.compactor.compact.concurrency | string | 1 | Set Thanos Compactor compact.concurrency |
thanos.compactor.compact.waitInterval | string | 900s | Set Thanos Compactor wait-interval |
thanos.compactor.consistencyDelay | string | 1800s | Set Thanos Compactor consistency-delay |
thanos.compactor.containerLabels | object | {} | Labels to add to the Thanos Compactor container |
thanos.compactor.deploymentLabels | object | {} | Labels to add to the Thanos Compactor deployment |
thanos.compactor.enabled | bool | true | Enable Thanos Compactor component |
thanos.compactor.httpGracePeriod | string | 120s | Set Thanos Compactor http-grace-period |
thanos.compactor.logLevel | string | info | Thanos Compactor log level |
thanos.compactor.retentionResolution1h | string | 157680000s | Set Thanos Compactor retention.resolution-1h |
thanos.compactor.retentionResolution5m | string | 7776000s | Set Thanos Compactor retention.resolution-5m |
thanos.compactor.retentionResolutionRaw | string | 7776000s | Set Thanos Compactor retention.resolution-raw |
thanos.compactor.serviceLabels | object | {} | Labels to add to the Thanos Compactor service |
thanos.compactor.volume.labels | list | [] | Labels to add to the Thanos Compactor PVC resource |
thanos.compactor.volume.size | string | 100Gi | Set Thanos Compactor PersistentVolumeClaim size in Gi |
thanos.grpcAddress | string | 0.0.0.0:10901 | GRPC-address used across the stack |
thanos.httpAddress | string | 0.0.0.0:10902 | HTTP-address used across the stack |
thanos.image.pullPolicy | string | "IfNotPresent" | Thanos image pull policy |
thanos.image.repository | string | "quay.io/thanos/thanos" | Thanos image repository |
thanos.image.tag | string | "v0.38.0" | Thanos image tag |
thanos.query.additionalArgs | list | [] | Adding additional arguments to Thanos Query |
thanos.query.annotations | object | {} | Annotations to add to the Thanos Query resources |
thanos.query.autoDownsampling | bool | true | |
thanos.query.containerLabels | object | {} | Labels to add to the Thanos Query container |
thanos.query.deploymentLabels | object | {} | Labels to add to the Thanos Query deployment |
thanos.query.ingress.annotations | object | {} | Additional annotations for the Ingress resource. To enable automatic certificate generation, add your cert-manager annotations here. For a full list of possible ingress annotations, see https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/annotations.md |
thanos.query.ingress.enabled | bool | false | Enable ingress controller resource |
thanos.query.ingress.grpc.annotations | object | {} | Additional annotations for the Ingress resource (GRPC). To enable automatic certificate generation, add your cert-manager annotations here. For a full list of possible ingress annotations, see https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/annotations.md |
thanos.query.ingress.grpc.enabled | bool | false | Enable ingress controller resource (GRPC) |
thanos.query.ingress.grpc.hosts | list | [{"host":"thanos.local","paths":[{"path":"/","pathType":"Prefix"}]}] | Default host for the ingress resource (GRPC) |
thanos.query.ingress.grpc.ingressClassName | string | "" | IngressClass used to implement the Ingress (GRPC). Supported in Kubernetes 1.18+ and required if more than one IngressClass is marked as the default for your cluster. Ref: https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/ |
thanos.query.ingress.grpc.tls | list | [] | Ingress TLS configuration (GRPC) |
thanos.query.ingress.hosts | list | [{"host":"thanos.local","paths":[{"path":"/","pathType":"Prefix"}]}] | Default host for the ingress resource |
thanos.query.ingress.ingressClassName | string | "" | IngressClass used to implement the Ingress. Supported in Kubernetes 1.18+ and required if more than one IngressClass is marked as the default for your cluster. Ref: https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/ |
thanos.query.ingress.tls | list | [] | Ingress TLS configuration |
thanos.query.logLevel | string | info | Thanos Query log level |
thanos.query.persesDatasource.create | bool | false | Creates a Perses datasource for standalone Thanos Query |
thanos.query.persesDatasource.selector | object | {} | Label selectors for the Perses sidecar to detect this datasource. |
thanos.query.plutonoDatasource.create | bool | false | Creates a Plutono datasource for standalone Thanos Query |
thanos.query.plutonoDatasource.selector | object | {} | Label selectors for the Plutono sidecar to detect this datasource. |
thanos.query.replicaLabel | string | nil | |
thanos.query.replicas | string | nil | Number of Thanos Query replicas to deploy |
thanos.query.serviceLabels | object | {} | Labels to add to the Thanos Query service |
thanos.query.standalone | bool | false | |
thanos.query.stores | list | [] | |
thanos.query.tls.data | object | {} | |
thanos.query.tls.secretName | string | "" | |
thanos.query.web.externalPrefix | string | nil | |
thanos.query.web.routePrefix | string | nil | |
thanos.ruler.alertmanagers | object | nil | Configures the list of Alertmanager endpoints to send alerts to. The configuration format is defined at https://thanos.io/tip/components/rule.md/#alertmanager. |
thanos.ruler.alertmanagers.authentication.enabled | bool | true | Enable Alertmanager authentication for Thanos Ruler |
thanos.ruler.alertmanagers.authentication.ssoCert | string | nil | SSO Cert for Alertmanager authentication |
thanos.ruler.alertmanagers.authentication.ssoKey | string | nil | SSO Key for Alertmanager authentication |
thanos.ruler.alertmanagers.enabled | bool | true | Enable Thanos Ruler Alertmanager config |
thanos.ruler.alertmanagers.hosts | string | nil | List of host endpoints to send alerts to |
thanos.ruler.annotations | object | {} | Annotations to add to the Thanos Ruler resources |
thanos.ruler.enabled | bool | false | Enable Thanos Ruler components |
thanos.ruler.externalPrefix | string | "/ruler" | Set Thanos Ruler external prefix |
thanos.ruler.labels | object | {} | Labels to add to the Thanos Ruler deployment |
thanos.ruler.matchLabel | string | nil | TO DO |
thanos.ruler.serviceLabels | object | {} | Labels to add to the Thanos Ruler service |
thanos.serviceMonitor.labels | object | {} | Labels to add to the ServiceMonitor |
thanos.serviceMonitor.selfMonitor | bool | true | Create a serviceMonitor for Thanos components |
thanos.store.additionalArgs | list | [] | Additional arguments to pass to Thanos Store |
thanos.store.annotations | object | {} | Annotations to add to the Thanos Store resources |
thanos.store.chunkPoolSize | string | 4GB | Set Thanos Store chunk-pool-size |
thanos.store.containerLabels | object | {} | Labels to add to the Thanos Store container |
thanos.store.deploymentLabels | object | {} | Labels to add to the Thanos Store deployment |
thanos.store.enabled | bool | true | Enable Thanos Store component |
thanos.store.indexCacheSize | string | 1GB | Set Thanos Store index-cache-size |
thanos.store.logLevel | string | info | Thanos Store log level |
thanos.store.serviceLabels | object | {} | Labels to add to the Thanos Store service |
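
The dotted keys in the table above correspond to nested YAML keys. The fragment below is a minimal sketch of how a few of them might be combined, assuming the values are supplied as a nested Helm values structure. The key names and defaults are taken from the table; the `ingressClassName` value is a hypothetical placeholder, and how the values reach the chart in your setup is not shown here.

```yaml
# Illustrative values fragment, not a recommended configuration.
# Key names and defaults come from the table above.
thanos:
  compactor:
    enabled: true
    retentionResolutionRaw: 7776000s   # default, equals 90 days
    retentionResolution5m: 7776000s    # default, equals 90 days
    retentionResolution1h: 157680000s  # default, equals 1825 days (about 5 years)
    volume:
      size: 100Gi                      # default PVC size
  query:
    logLevel: info
    ingress:
      enabled: true                    # default is false
      ingressClassName: nginx          # hypothetical class name, adjust to your cluster
      hosts:
        - host: thanos.local           # default host from the table
          paths:
            - path: /
              pathType: Prefix
  ruler:
    enabled: false                     # default
  store:
    enabled: true
    indexCacheSize: 1GB                # default
    chunkPoolSize: 4GB                 # default
```

The retention values are given in seconds, so converting them to days (divide by 86400) makes it easier to compare the raw, 5m, and 1h resolutions when changing them.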