TL;DR: Go doesn’t auto-detect container memory limits. Without GOMEMLIMIT, the GC lets the heap double freely until the OOM killer strikes. Read the cgroup limit at startup and set GOMEMLIMIT to ~85% of it via an entrypoint script so it adapts automatically when VPA adjusts your limits.


I was investigating a pod that had been crashlooping for 25 hours. The usual suspects — liveness probe failures, connection resets, CrashLoopBackOff. After deleting the pod and watching it come back healthy on a different node, I assumed it was a flaky spot instance. But looking closer, it was OOMing.

The container had a 290Mi memory limit (set by the Vertical Pod Autoscaler), and the Go process had absolutely no idea.

Go doesn’t know about your container Link to heading

This is the bit that surprised me. Go 1.19 introduced GOMEMLIMIT, a soft memory limit that tells the garbage collector to work harder as the heap approaches a threshold. But it’s not set by default, and the runtime doesn’t read cgroup limits on its own.

Without GOMEMLIMIT, Go’s GC is governed only by GOGC, which defaults to 100 — roughly, “trigger the next collection when the heap doubles.” If your live heap is at 150Mi after the last GC, the next collection won’t start until it approaches 300Mi. In a container with a 290Mi limit, that’s an OOM kill.

The problem compounds with concurrent requests. Each search operation allocates its own temporary buffers. Between GC cycles, these stack up. A handful of simultaneous requests can push memory well past the limit before the garbage collector even wakes up.

The fix Link to heading

Set GOMEMLIMIT to around 85% of the container’s memory limit. This gives Go’s GC a target to aim for, and it’ll collect more aggressively as the heap approaches it. The remaining 15% is headroom for non-heap memory — goroutine stacks, memory-mapped files, OS-level allocations.

The question is how to set it. Hardcoding a value is fragile — if you’re using a VPA, the memory limit changes over time as the autoscaler adjusts recommendations. A value that works today might be too high or too low next week.

I’d recommend reading the cgroup limit at container startup and deriving GOMEMLIMIT dynamically. A simple entrypoint script does the job:

#!/bin/sh
# Derive GOMEMLIMIT from the cgroup v2 memory limit, unless already set.
if [ -z "$GOMEMLIMIT" ] && [ -f /sys/fs/cgroup/memory.max ]; then
    MEM_MAX=$(cat /sys/fs/cgroup/memory.max)
    if [ "$MEM_MAX" != "max" ]; then
        MEM_PERCENTAGE=${GOMEMLIMIT_PERCENTAGE:-85}
        # A bare number is read by the Go runtime as bytes.
        GOMEMLIMIT=$(( MEM_MAX * MEM_PERCENTAGE / 100 ))
        export GOMEMLIMIT
    fi
fi
exec /app/server "$@"

A few details worth noting:

  • The -z "$GOMEMLIMIT" check means you can still override it manually via an environment variable if needed.
  • /sys/fs/cgroup/memory.max is the cgroup v2 path. If you’re on cgroup v1 (older kernels), the file is /sys/fs/cgroup/memory/memory.limit_in_bytes.
  • The value max means no limit is set — the script skips GOMEMLIMIT in that case.
  • The percentage is configurable via GOMEMLIMIT_PERCENTAGE so you can tune it per deployment without rebuilding the image.

Wire it into your Dockerfile:

COPY --chmod=0755 entrypoint.sh /app/
CMD ["/app/entrypoint.sh"]

If something sits in front of your binary (like secrets-init or a similar wrapper), it’ll exec the entrypoint script, which in turn execs your Go binary. The chain works transparently.

Why not use a library? Link to heading

There’s automaxprocs, which sets GOMAXPROCS from the container’s CPU quota, and automemlimit, which does roughly what this entrypoint script does but from within Go code. Both are fine options. I landed on the shell script approach because it avoids adding a dependency and works regardless of the language — if you ever swap the Go binary for something else, the entrypoint still works.

Reduce your allocation footprint too Link to heading

Setting GOMEMLIMIT is the big win, but it’s worth looking at what’s allocating memory in the first place. In my case, the service was fetching 100 search results internally for every request, even when clients only wanted 10. Each result loaded all stored fields and computed highlights. Reducing that internal buffer from 100 to 20 cut per-request memory significantly and made the GC’s job much easier.

The VPA connection Link to heading

This is especially relevant if you’re using a Vertical Pod Autoscaler. The VPA adjusts memory limits over time based on observed usage, but it only sees the process’s actual consumption — it doesn’t know that Go’s GC is leaving a huge gap between “memory in use” and “memory limit.” The VPA sees the process using 150Mi, recommends 290Mi as a comfortable target, and has no idea that Go won’t GC until 300Mi. The entrypoint script approach works well here because it re-reads the cgroup limit on every pod restart, so as the VPA tightens or loosens the allocation, GOMEMLIMIT follows automatically.

Further reading Link to heading