I was checking the GKE cluster autoscaler configuration. Here’s how.

How long does scale-up take?

In my experience with GKE:

  • Scale-up: 2-5 minutes from pending pod to running
  • Scale-down: 10+ minutes (configurable, conservative by default)

The scale-up time depends on node pool configuration. Preemptible/spot nodes can be slightly faster. If you need faster scale-up, consider keeping a small buffer of spare capacity.
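To check these numbers against your own cluster, one rough approach is to compare a pod's creation time with the time it was scheduled. The following is a minimal sketch, not a definitive measurement: it assumes GNU `date`, the helper names are made up for illustration, and `my-pod` in the usage line is a hypothetical pod name. It only covers the pending-to-scheduled part; image pull and container start come on top of that.

```shell
# Seconds between two RFC 3339 timestamps (assumes GNU date for -d).
seconds_between() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( end - start ))
}

# Time between pod creation and the last change of its PodScheduled condition.
# For a pod that is still pending, this is when it was marked unschedulable,
# so run it only against pods that have already been scheduled.
# Usage: pod_schedule_delay <pod> [namespace]
pod_schedule_delay() {
  local pod="$1" ns="${2:-default}" created scheduled
  created=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.metadata.creationTimestamp}')
  scheduled=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].lastTransitionTime}')
  seconds_between "$created" "$scheduled"
}

# Example (hypothetical pod name):
#   pod_schedule_delay my-pod default
```

A result in the range of a couple of hundred seconds for a pod that triggered a scale-up would match the 2-5 minute figure above.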

View configuration

View autoscaling config:

gcloud container clusters describe my-cluster \
  --region=europe-north1 \
  --format="yaml(autoscaling)"

View node pool autoprovisioning defaults:

gcloud container clusters describe my-cluster \
  --region=europe-north1 \
  --format="yaml(autoscaling.autoprovisioningNodePoolDefaults)"

Check autoscaler status in the cluster:

kubectl get cm/cluster-autoscaler-status -n kube-system -o yaml

View node allocatable resources:

kubectl describe nodes | grep -A5 "Allocatable"

Check scaling activity

List node taints to see which nodes the autoscaler is removing:

kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

Nodes with the ToBeDeletedByClusterAutoscaler taint are being removed.
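On a large cluster the full taint listing gets noisy. As a sketch, the same check can be narrowed to just the node names carrying that taint; the function name `nodes_being_removed` is made up for illustration.

```shell
# Print the names of nodes that carry the autoscaler's deletion taint.
nodes_being_removed() {
  # One line per node: name, a tab, then the space-separated taint keys.
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints[*].key}{"\n"}{end}' \
    | awk '/ToBeDeletedByClusterAutoscaler/ {print $1}'
}

# Example:
#   nodes_being_removed
```

Empty output means no node is currently marked for removal.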

Cost considerations

  • Min nodes too high - Paying for idle capacity
  • Min nodes too low - Cold starts during traffic spikes
  • Scale-down too aggressive - Nodes churning up and down
  • Scale-down too conservative - Paying for unused nodes

I typically set min to handle baseline traffic, max to handle peak + 20%, and leave scale-down delay at the default (10 minutes). For cost savings, use spot/preemptible nodes for workloads that can handle interruption.
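The "peak + 20%" rule is simple arithmetic, sketched here with a made-up helper `max_nodes` that rounds up to a whole node. The commented `gcloud` line shows one way such limits could be applied; the pool name `default-pool` and the node counts are assumptions for illustration, not values from this cluster.

```shell
# Max node count: peak node count plus 20% headroom, rounded up.
max_nodes() {
  local peak="$1"
  echo $(( ($peak * 120 + 99) / 100 ))
}

# Example: a peak of 10 nodes gives a max of 12.
#   max_nodes 10
#
# Applying limits to a node pool (hypothetical pool name and counts).
# Note: in a regional cluster, --min-nodes and --max-nodes apply per zone.
#   gcloud container clusters update my-cluster \
#     --region=europe-north1 \
#     --node-pool=default-pool \
#     --enable-autoscaling --min-nodes=1 --max-nodes=4
```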

Monitor scaling by running kubectl get nodes under watch and checking resource usage with kubectl top.
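A minimal sketch of what that monitoring might look like, assuming "top" refers to `kubectl top` and that the metrics API is available in the cluster. The function name `scaling_snapshot` is made up for illustration.

```shell
# One snapshot of the cluster: node list plus per-node CPU and memory usage.
scaling_snapshot() {
  kubectl get nodes
  kubectl top nodes
}

# Interactive form: refresh the node list every 10 seconds
# to see nodes join and leave as the autoscaler works.
#   watch -n 10 kubectl get nodes
```

Watching the node count alongside usage makes it easier to tell whether a scale-up was driven by real load or by pods with oversized resource requests.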

Further reading