TLDR: Templating VPA minAllowed from deployment resource requests prevents VPA from reducing resources. Popular charts make these fields optional and static. After fixing this in our charts: 93% CPU reduction on one service (200m → 15m).
I was debugging why our pods were still over-provisioned after months of VPA running in Initial mode. VPA recommendations looked reasonable - suggesting 15m CPU when we were requesting 200m - but nothing was changing.
Turns out our Helm charts had a trap built in.
## The anti-pattern
Here’s the problematic template pattern I found in multiple charts:
```yaml
# templates/vpa.yaml
{{- if .Values.vpa.enabled }}
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: {{ include "app.fullname" . }}
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "app.fullname" . }}
  updatePolicy:
    updateMode: Initial
  resourcePolicy:
    containerPolicies:
    - containerName: {{ .Chart.Name }}
      minAllowed:
        cpu: {{ .Values.resources.requests.cpu }}        # ← THE TRAP
        memory: {{ .Values.resources.requests.memory }}
{{- end }}
```
VPA can NEVER recommend lower than the current resource requests because minAllowed is hardcoded to match them. This defeats the entire purpose of right-sizing.
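To make the clamp concrete, here's roughly what that template renders to for a service requesting 200m CPU and 256Mi memory (the service name and memory figure are illustrative, not taken from a real chart):

```yaml
# Rendered output (illustrative): the request becomes the floor
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-service-a
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service-a
  updatePolicy:
    updateMode: Initial
  resourcePolicy:
    containerPolicies:
    - containerName: api-service-a
      minAllowed:
        cpu: 200m       # copied from resources.requests.cpu
        memory: 256Mi   # copied from resources.requests.memory
```

The recommender still computes the lower figure internally (it shows up as uncappedTarget in the VPA status), but the target it actually applies is clamped to minAllowed, so the pod stays at 200m.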
## The impact
Before I fixed this:
| Service | Chart request | VPA wanted | minAllowed blocked it at | After fix |
|---|---|---|---|---|
| api-service-a | 200m CPU | 15m | 200m | 93% reduction |
| search-service-b | 500m CPU | 2m | 500m | 99.6% reduction |
| worker-service-c | 100m CPU | 8m | 100m | 92% reduction |
VPA ran for months with no effect. The recommendations were there, but minAllowed prevented them from being applied.
## Why this happens
The trap: you template minAllowed from your initial resource guess. If your guess was too high (it usually is), VPA can never fix it.
```
Your guess (200m)
  → becomes minAllowed (200m)
  → blocks VPA from recommending lower (15m)
  → you stay at 200m forever
```
## What minAllowed and maxAllowed are for
These fields are guardrails, not starting points:
- maxAllowed: Prevent VPA from requesting more than node capacity
- minAllowed: Prevent VPA from going below functional minimum
Both are optional. Most workloads only need maxAllowed as a safety cap.
The critical mistake: templating minAllowed from current requests creates a circular dependency where VPA can never recommend changes.
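As a sketch, a guardrails-only policy sets just a static maxAllowed and leaves minAllowed out entirely (the container name and numbers here are placeholders):

```yaml
resourcePolicy:
  containerPolicies:
  - containerName: app          # placeholder
    maxAllowed:
      cpu: 2000m                # static ceiling, sized from node capacity
      memory: 4Gi
    # no minAllowed: VPA is free to recommend all the way down
```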
## How upstream charts do it
Popular charts (Prometheus, Bitnami, Fluent-bit) make these fields optional and static. For example, Bitnami Redis:
```yaml
vpa:
  enabled: false
  resourcePolicy:
    containerPolicies:
    - containerName: redis
      minAllowed:
        cpu: 50m         # Static - Redis minimum to function
        memory: 64Mi
      maxAllowed:
        cpu: 4000m       # Static - based on node capacity
        memory: 8Gi
```
Static values based on engineering judgment (“Redis needs at least 50m”), not templated from deployment values.
## The fix
Make minAllowed and maxAllowed optional:
```yaml
# templates/vpa.yaml
{{- if .Values.vpa.enabled }}
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: {{ include "app.fullname" . }}
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "app.fullname" . }}
  updatePolicy:
    updateMode: {{ .Values.vpa.updateMode | default "Initial" }}
  {{- if .Values.vpa.resourcePolicy }}
  resourcePolicy:
    containerPolicies:
    {{- range .Values.vpa.resourcePolicy.containerPolicies }}
    - containerName: {{ .containerName }}
      {{- if .minAllowed }}
      minAllowed:
        {{- toYaml .minAllowed | nindent 8 }}
      {{- end }}
      {{- if .maxAllowed }}
      maxAllowed:
        {{- toYaml .maxAllowed | nindent 8 }}
      {{- end }}
    {{- end }}
  {{- end }}
{{- end }}
```
And in values.yaml:
```yaml
vpa:
  enabled: false
  updateMode: Initial
  resourcePolicy:
    containerPolicies: []

  # Optional: add a static maxAllowed as a safety cap
  # resourcePolicy:
  #   containerPolicies:
  #   - containerName: app
  #     maxAllowed:
  #       cpu: 2000m    # Static - prevents runaway recommendations
  #       memory: 2Gi
```
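With that template, enabling a safety cap for a single service becomes a per-environment values override. A hypothetical example (the service name and numbers are made up):

```yaml
# values-production.yaml (hypothetical override)
vpa:
  enabled: true
  updateMode: Initial
  resourcePolicy:
    containerPolicies:
    - containerName: api-service-a
      maxAllowed:
        cpu: 2000m      # static cap, sized from the node, not from requests
        memory: 2Gi
      # minAllowed deliberately omitted so VPA can shrink requests
```

Rendering the chart with `helm template . -f values-production.yaml` and grepping for minAllowed is a quick way to confirm nothing emits a floor you didn't ask for.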
## Applying the fix
VPA's Initial mode only applies recommendations at pod creation, so after updating the VPA template you need to restart the pods:

```bash
kubectl rollout restart deployment/my-app -n production
```
This is expected behaviour (kubernetes/autoscaler#5452). VPA’s admission controller mutates pod specs during creation, so existing pods keep their old requests until recreated.
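If several deployments are affected, a loop over the VPAs works; this sketch assumes every VPA's targetRef is a Deployment with a matching name in the same namespace:

```bash
# Restart every deployment targeted by a VPA in the namespace
for d in $(kubectl get vpa -n production -o jsonpath='{.items[*].spec.targetRef.name}'); do
  kubectl rollout restart "deployment/$d" -n production
done
```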
Verify VPA recommendations:
```bash
# Check what VPA wants to recommend
kubectl get vpa my-app -n production -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'

# After restart, verify pod got new requests
kubectl get pod my-app-xxx -n production -o yaml | grep -A5 requests:
```
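For a quick overview across services, custom-columns can list every VPA's current target in one table (the column names are my own choice):

```bash
kubectl get vpa -n production \
  -o custom-columns='NAME:.metadata.name,CPU_TARGET:.status.recommendation.containerRecommendations[0].target.cpu,MEM_TARGET:.status.recommendation.containerRecommendations[0].target.memory'
```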
## Guidelines
- Never template minAllowed from requests - this is the anti-pattern that blocks right-sizing.
- maxAllowed from limits is acceptable as a safety cap (it prevents runaway recommendations), but it creates config coupling. Better to use static values or leave it empty.
- Only set minAllowed if you have evidence the app won't function below that threshold. Most workloads don't need it (see the sketch below).
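When there is such evidence, keep the floor static and annotate why; a sketch (the numbers are illustrative, not from this post):

```yaml
containerPolicies:
- containerName: app
  minAllowed:
    memory: 256Mi     # measured: the app fails its readiness probe below this
  maxAllowed:
    cpu: 2000m
    memory: 2Gi
```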
## Further reading
- GKE cluster autoscaler - Overview of VPA, HPA, and cluster autoscaling
- Vertical Pod Autoscaler - The Definitive Guide
- Kubernetes VPA documentation