
stall — A Terminal PSI Monitor Written in Go

CPU% is a blunt instrument. A machine sitting at 30% CPU can still be grinding — everything waiting on IO, pages being swapped, memory pressure stalling tasks — and top will show you nothing useful. The kernel has known this for a while. PSI was the answer.

stall is a small terminal monitor that puts those numbers on screen.

What Is Linux PSI and Why Does It Matter?

PSI — Pressure Stall Information — measures the percentage of time tasks were stalled waiting for a resource. Not utilization. Stall time. The distinction matters: a CPU at 30% utilization tells you how busy it is; PSI tells you whether anything is actually stuck waiting.

The kernel tracks three resources:

Resource  What it tracks
CPU       Runnable tasks waiting to be scheduled on a CPU
IO        Tasks waiting for block IO (typically disk) to complete
MEMORY    Tasks waiting for memory to be reclaimed or pages to be read back in

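PSI was merged in Linux 4.20. It’s been in the kernel long enough that most distributions ship with it compiled in (CONFIG_PSI=y). Some build it disabled by default (CONFIG_PSI_DEFAULT_DISABLED=y), in which case it has to be switched on with the psi=1 boot parameter. Check yours with cat /proc/pressure/cpu: if it returns data, you have it.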

How Do You Build and Run stall?

Requires Go 1.21+. No third-party packages. Build from source:

go build -o stall .
./stall

The binary is self-contained.

Press Ctrl+C to quit. The display refreshes every 2 seconds.

What Does stall Display?

Three sections (CPU, IO, MEMORY), each with one row per pressure type (some, full) and these columns:

Column         Description
TYPE           some or full (explained below)
10s            % of time stalled over the last 10 seconds
60s            % of time stalled over the last 60 seconds
5min           % of time stalled over the last 5 minutes
PRESSURE       Visual bar that fills and changes color as pressure increases
STALLED TOTAL  Cumulative stall time since boot, in microseconds

What Is the Difference Between some and full?

The kernel defines two pressure metrics, some and full. Both apply to IO and MEMORY; for CPU, only some is meaningful system-wide:

some: at least one task was stalled. Other tasks may still have been running. The system was making progress overall, but something was waiting.

full: every non-idle task was stalled simultaneously. Nothing was running. This is the serious number. A non-zero full value means the machine genuinely ground to a halt for that fraction of the measurement window.

CPU full is undefined at the system level. If one task is waiting for CPU, another is running, so all tasks cannot be waiting for CPU at the same time. Kernels before 5.13 omit the full line from /proc/pressure/cpu; later kernels print one, and current kernels report it as zero there. IO and MEMORY don’t have that constraint: everything can block on a disk read at once.

On an idle machine, all values read 0.00. Run a large rsync, a find /, or a memory-hungry compile and watch the relevant section respond in real time.

How Does stall Read PSI Without Root Access?

The kernel exposes PSI data in three files, readable by any user:

/proc/pressure/cpu
/proc/pressure/io
/proc/pressure/memory

Each file has the same format. /proc/pressure/io looks like:

some avg10=0.00 avg60=0.12 avg300=0.08 total=1482930
full avg10=0.00 avg60=0.04 avg300=0.02 total=512041

stall parses this with a line scanner. Each field is key=value. The averages are exponentially weighted moving averages maintained by the kernel — stall reads them directly; there’s no local averaging. The total field is cumulative microseconds since boot.
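A line-scanner parser for this format fits in a few dozen lines. The following is a minimal sketch under the format shown above, not stall's actual psi.go: the Line type and parsePSI function are illustrative names. It reads from a string so it can run anywhere; in practice the input would be the contents of one of the /proc/pressure files.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Line holds one parsed row of a pressure file.
type Line struct {
	Kind                 string // "some" or "full"
	Avg10, Avg60, Avg300 float64
	Total                uint64 // cumulative stall time in microseconds
}

// parsePSI parses text in the /proc/pressure format, one Line per row.
// Unknown keys are ignored so newer kernel fields do not break parsing.
func parsePSI(text string) ([]Line, error) {
	var out []Line
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		ln := Line{Kind: fields[0]}
		for _, kv := range fields[1:] {
			key, val, ok := strings.Cut(kv, "=")
			if !ok {
				return nil, fmt.Errorf("malformed field %q", kv)
			}
			var err error
			switch key {
			case "avg10":
				ln.Avg10, err = strconv.ParseFloat(val, 64)
			case "avg60":
				ln.Avg60, err = strconv.ParseFloat(val, 64)
			case "avg300":
				ln.Avg300, err = strconv.ParseFloat(val, 64)
			case "total":
				ln.Total, err = strconv.ParseUint(val, 10, 64)
			}
			if err != nil {
				return nil, fmt.Errorf("field %q: %w", kv, err)
			}
		}
		out = append(out, ln)
	}
	return out, sc.Err()
}

func main() {
	sample := "some avg10=0.00 avg60=0.12 avg300=0.08 total=1482930\n" +
		"full avg10=0.00 avg60=0.04 avg300=0.02 total=512041\n"
	lines, err := parsePSI(sample)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Printf("%-4s 10s=%.2f 60s=%.2f 5min=%.2f total=%dus\n",
			l.Kind, l.Avg10, l.Avg60, l.Avg300, l.Total)
	}
}
```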

What Do the Pressure Colors Mean?

Color   Threshold  Meaning
Gray    0.00       No stalls, idle
Green   < 10%      Low pressure, healthy
Yellow  10–30%     Moderate pressure, worth watching
Red     30%+       High pressure, something is struggling

The visual bar in the PRESSURE column fills left-to-right and switches color at the same thresholds. At a glance, a red bar on IO full is an immediate signal: tasks are completely blocked on IO.

Does stall Have Any External Dependencies?

No. The go.mod has no require entries; everything comes from the standard library. Four files:

main.go    — main loop, signal handling, render orchestration
psi.go     — reads and parses /proc/pressure/cpu, io, memory
render.go  — all display logic: sections, pressure bars, colors
sys.go     — terminal size detection via ioctl

Terminal width comes from the TIOCGWINSZ ioctl, re-queried every tick so resizing the terminal works. Same approach as newtop and netsock.

How Does stall Relate to newtop and netsock?

stall is the third tool in the same family. newtop covers CPU, memory, disk, and processes. netsock covers open sockets and their exposure scope. stall covers pressure stall — the metric that catches what the others miss. A machine can look normal in newtop while IO full is quietly non-zero, indicating periodic complete stalls that don’t show up as sustained CPU load.

Together they answer: what is the machine doing, what is it talking to, and where is it actually struggling?

FAQ

Q: What is stall? A terminal UI for monitoring Linux Pressure Stall Information (PSI). It reads /proc/pressure directly — no root required, no external dependencies — and displays CPU, IO, and memory stall percentages with color-coded pressure bars, updated every 2 seconds.

Q: What is PSI and how is it different from CPU%? PSI measures the percentage of time tasks were stalled waiting for a resource — CPU, IO, or memory. CPU% measures utilization: how busy the processor is. A machine can be at 30% CPU and still have significant IO pressure if tasks are constantly waiting on disk. PSI catches what utilization misses.

Q: What does the full pressure metric mean? full means every non-idle task was simultaneously stalled: nothing was executing. A non-zero full value on IO or MEMORY means the machine experienced complete stalls during that window. CPU full is undefined at the system level, and current kernels report it as zero there.

Q: Does stall require root or sudo? No. It reads from /proc/pressure/cpu, /proc/pressure/io, and /proc/pressure/memory, which are readable by any user on a standard Linux system.

Q: What kernel version does PSI require? Kernel 4.20 or later with CONFIG_PSI=y. Most modern distributions ship with it enabled. Verify with cat /proc/pressure/cpu — if it returns data, PSI is available.

Q: What does stall read from /proc/pressure? Three files: /proc/pressure/cpu, /proc/pressure/io, and /proc/pressure/memory. Each contains some and full lines (the CPU file has only some on kernels before 5.13) with exponentially weighted moving averages over 10-second, 60-second, and 5-minute windows, plus a cumulative total in microseconds since boot.

Tested on Debian/CrunchBang++. Go 1.21+. Linux 4.20+ with CONFIG_PSI=y. Reads /proc/pressure directly.