ProxyRequestErrorsHigh
Alert Description
This alert fires when more than 10% of HTTP requests result in 4xx (excluding 401/403) or 5xx errors for a proxy service for 15 minutes.
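As a worked example of the 10% threshold, the following is a minimal sketch that computes the error percentage from raw request counts. The helper name `error_percent` and the sample numbers are illustrative assumptions, not part of the alert rule itself.

```shell
# Hypothetical helper: percentage of requests that this alert counts as errors.
# Arguments: <4xx count excluding 401/403> <5xx count> <total request count>
error_percent() {
  awk -v c4="$1" -v c5="$2" -v total="$3" 'BEGIN {
    # Avoid division by zero when there was no traffic at all
    if (total == 0) { print "0.00"; exit }
    printf "%.2f\n", (c4 + c5) * 100 / total
  }'
}

# Example: 40 client errors and 80 server errors out of 1000 requests
error_percent 40 80 1000   # prints 12.00, which is above the 10% threshold
```

A value above 10 sustained for 15 minutes corresponds to the condition described above.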
What does this alert mean?
Greenhouse proxy services (like service-proxy, cors-proxy, idproxy) handle HTTP traffic for various purposes. High error rates indicate that requests are failing, which affects user experience and functionality.
This could be due to:
- Backend services being unavailable or unhealthy
- Misconfigured routing or proxy rules
- Authentication/authorization issues (note that 401/403 responses themselves are excluded from this alert)
- Network connectivity problems to backend services
- Resource exhaustion in the proxy pod
- Invalid requests from clients
Diagnosis
Identify the Affected Proxy Service
The alert label proxy identifies which proxy service has high error rates:
- greenhouse-service-proxy - Proxies requests to services in remote clusters. It is deployed to the <org-name> namespace, not greenhouse!
- greenhouse-cors-proxy - Handles CORS for frontend applications
- greenhouse-idproxy - Handles authentication and identity proxying
From here on, the placeholder <proxy-name> refers to the name above without the greenhouse- prefix, e.g. idproxy.
Check Proxy Metrics
Access the Prometheus instance monitoring your Greenhouse cluster and query the proxy request metrics using the following PromQL queries:
# Total HTTP requests by status code
http_requests_total{service="<proxy-name>"}
# Successful requests (2xx)
http_requests_total{service="<proxy-name>",status=~"2.."}
# Client errors (4xx, excluding 401/403)
http_requests_total{service="<proxy-name>",status=~"4..",status!~"40[13]"}
# Server errors (5xx)
http_requests_total{service="<proxy-name>",status=~"5.."}
# Error rate
sum(rate(http_requests_total{service="<proxy-name>",status=~"[45]..",status!~"40[13]"}[5m])) / sum(rate(http_requests_total{service="<proxy-name>"}[5m]))
Replace <proxy-name> with the proxy service name from the alert, without the greenhouse- prefix (e.g., service-proxy, cors-proxy, idproxy).
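If you prefer the command line over the Prometheus UI, the error-ratio query can also be sent to the Prometheus HTTP API. This is a minimal sketch: `PROM_URL`, `PROXY_NAME`, and the helper names are assumptions for illustration, and the default URL is only a placeholder for wherever your Prometheus instance is reachable.

```shell
# Assumptions: PROM_URL points at your Prometheus instance and
# PROXY_NAME is the proxy from the alert (both are placeholders here).
PROM_URL="${PROM_URL:-http://localhost:9090}"
PROXY_NAME="${PROXY_NAME:-idproxy}"

# Build the error-ratio PromQL expression for one proxy.
# sum() collapses the per-status series so the division matches on both sides.
error_ratio_query() {
  printf 'sum(rate(http_requests_total{service="%s",status=~"[45]..",status!~"40[13]"}[5m])) / sum(rate(http_requests_total{service="%s"}[5m]))' "$1" "$1"
}

# Run an instant query against the Prometheus HTTP API (/api/v1/query).
query_prometheus() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (requires network access to Prometheus):
# query_prometheus "$(error_ratio_query "$PROXY_NAME")"
```

The response is JSON; a result value above 0.1 corresponds to the 10% threshold of this alert.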
Check Proxy Logs
Important! The service-proxy is deployed to the <org-name> namespace, not greenhouse!
Review proxy logs for detailed error messages:
kubectl logs -n greenhouse -l app.kubernetes.io/name=<proxy-name> --tail=500 | grep -i error
For service-proxy specifically:
kubectl logs -n <org-name> -l app.kubernetes.io/name=service-proxy --tail=500 | grep -E "error|status.*[45][0-9]{2}"
Look for:
- Backend connection failures
- Timeout errors
- Authentication/authorization failures
- Invalid routing or target service issues
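To see which error codes dominate, the log output can be tallied by status code. The sketch below assumes that each relevant log line contains the word "status" followed within a few characters by a three-digit code (for example `status=502`); the actual proxy log format may differ, and `count_error_statuses` is a hypothetical helper name.

```shell
# Hypothetical helper: tally 4xx/5xx status codes found in log lines on stdin.
# Assumes lines look roughly like: ... status=502 ... (the log format is an assumption).
count_error_statuses() {
  grep -oE 'status[^0-9]{0,5}[45][0-9]{2}' |   # keep only "status<sep>NNN" fragments
    grep -oE '[45][0-9]{2}$' |                 # extract the three-digit code
    sort | uniq -c | sort -rn |                # count per code, most frequent first
    awk '{print $2, $1}'                       # print as "<code> <count>"
}

# Usage (requires cluster access):
# kubectl logs -n greenhouse -l app.kubernetes.io/name=<proxy-name> --tail=500 | count_error_statuses
```

A tally dominated by 5xx codes points towards backend or proxy problems, while mostly 4xx codes suggest invalid client requests or routing issues.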
Check Backend Service Health
If the proxy is routing to backend services, verify they are healthy. For service-proxy, check plugins with exposed services:
kubectl get plugins --all-namespaces -l greenhouse.sap/plugin-exposed-services=true
# Check if any plugins are not ready
kubectl get plugins --all-namespaces -l greenhouse.sap/plugin-exposed-services=true -o json | jq -r '.items[] | select(.status.statusConditions.conditions[]? | select(.type=="Ready" and .status!="True")) | "\(.metadata.namespace)/\(.metadata.name)"'
Check Proxy Pod Resource Usage
Verify the proxy pod has sufficient resources:
kubectl top pod -n greenhouse -l app.kubernetes.io/name=<proxy-name>
kubectl describe pod -n greenhouse -l app.kubernetes.io/name=<proxy-name>
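Resource exhaustion often shows up as container restarts. As a minimal sketch, the helper below lists restart counts and the last termination reason per proxy pod; the function name is hypothetical, and it assumes the pods carry the `app.kubernetes.io/name` label used for the log commands in this runbook.

```shell
# Hypothetical helper: show restart count and last termination reason per pod.
# Arguments: <namespace> <proxy-name>
pod_restart_summary() {
  kubectl get pods -n "$1" -l "app.kubernetes.io/name=$2" \
    -o custom-columns='NAME:.metadata.name,RESTARTS:.status.containerStatuses[*].restartCount,LAST_REASON:.status.containerStatuses[*].lastState.terminated.reason'
}

# Usage (requires cluster access):
# pod_restart_summary greenhouse idproxy
```

A LAST_REASON of OOMKilled means the container was terminated for exceeding its memory limit.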