See also: fftool — A Terminal UI for ffmpeg Written in Go

ffmpeg can do almost anything with audio and video. The problem is remembering how.

I kept a text file of commands I'd looked up once and didn't want to look up again. Trim a
video without re-encoding. Two-pass loudness normalization. Generate a Mandelbrot zoom. Each
one took ten minutes of documentation reading the first time. The second time I still had to
look it up because the flags aren't memorable; they're precise.

fftool replaces the notes file. It's a terminal UI that walks through every operation, takes
the inputs, and shows you the exact command before running anything. The philosophy is memory
aid, not abstraction layer. You see the flags. You can copy the command and run it yourself.
Nothing is hidden.


The spec review

Before writing any code I read the spec, a detailed markdown document, and flagged every
ambiguity. A few things that would have caused real problems mid-build:

The Field struct was incomplete. The spec described select-style fields (cycling through
mp3/aac/flac/wav options) but the Field struct had no Type field and no Options []string.
That's a schema problem that touches every single form in the app.
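
A minimal sketch of what the completed schema might look like. Only the Type field and Options []string come from the spec review above; FieldType, its constants, Label, Value, and the Cycle method are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// FieldType distinguishes free-text inputs from select-style inputs.
// The type and constant names are illustrative, not fftool's own.
type FieldType int

const (
	FieldText FieldType = iota
	FieldSelect
)

// Field is one form input. Options is only used when Type is FieldSelect.
type Field struct {
	Label   string
	Type    FieldType
	Value   string
	Options []string
}

// Cycle moves a select field by delta positions, wrapping at both ends.
// It does nothing for text fields or for selects without options.
func (f *Field) Cycle(delta int) {
	if f.Type != FieldSelect || len(f.Options) == 0 {
		return
	}
	idx := 0
	for i, opt := range f.Options {
		if opt == f.Value {
			idx = i
			break
		}
	}
	n := len(f.Options)
	idx = ((idx+delta)%n + n) % n // keep the index non-negative
	f.Value = f.Options[idx]
}

func main() {
	f := Field{
		Label:   "Format",
		Type:    FieldSelect,
		Value:   "mp3",
		Options: []string{"mp3", "aac", "flac", "wav"},
	}
	f.Cycle(1)
	fmt.Println(f.Value) // aac
	f.Cycle(-2)
	fmt.Println(f.Value) // wav
}
```

Keeping the current choice in Value means text and select fields can be read the same way when a preset builds its command.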

Two-pass presets had no architecture. Stabilize and Normalize both require two ffmpeg
invocations. The spec defined Build() ([]string, error), a single-pass signature. Who builds
the second command? Who runs it? The confirm screen, run screen, and result screen all had
implicit assumptions about one command, one run.

Normalize's second pass can't be built at form-submit time. Pass 2 of EBU R128 loudnorm
requires values measured in pass 1: the JSON block that the loudnorm filter prints when its
print_format=json option is set. You don't have those values until pass 1 actually runs.
Build() needed a way to express "this argument will be filled in later."

Resolving all of this before touching code saved a full rewrite. The cost is a longer design
session. The payoff is not discovering that your interface is wrong when you're halfway through
the preset files.


The architecture decisions that mattered

Build() returns [][]string

The signature changed from the spec's ([]string, error) to ([][]string, error), a slice of
argv slices. Single-pass presets return one element. Two-pass presets return two. The runner
iterates. Every preset speaks the same interface regardless of how many passes it needs.
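
The shape of that interface can be sketched as follows. Only the Build() ([][]string, error) signature comes from the text; the Preset interface name, the Trim and Stabilize structs, their fields, and RunAll are assumptions for illustration, and the Stabilize arguments assume an ffmpeg build that includes the vidstab filters.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Preset is the shared interface: one argv slice per ffmpeg invocation.
type Preset interface {
	Build() ([][]string, error)
}

// Trim is a hypothetical single-pass preset (stream copy, no re-encode).
type Trim struct{ Input, Start, Duration, Output string }

func (t Trim) Build() ([][]string, error) {
	if t.Input == "" || t.Output == "" {
		return nil, errors.New("trim: input and output are required")
	}
	return [][]string{
		{"-ss", t.Start, "-i", t.Input, "-t", t.Duration, "-c", "copy", t.Output},
	}, nil
}

// Stabilize is a hypothetical two-pass preset.
type Stabilize struct{ Input, Output string }

func (s Stabilize) Build() ([][]string, error) {
	if s.Input == "" || s.Output == "" {
		return nil, errors.New("stabilize: input and output are required")
	}
	return [][]string{
		{"-i", s.Input, "-vf", "vidstabdetect", "-f", "null", "-"}, // pass 1: analyze
		{"-i", s.Input, "-vf", "vidstabtransform", s.Output},       // pass 2: apply
	}, nil
}

// RunAll executes each pass in order and stops at the first failure.
// run is injected so the loop can be exercised without ffmpeg installed.
func RunAll(p Preset, run func(argv []string) error) error {
	passes, err := p.Build()
	if err != nil {
		return err
	}
	for i, argv := range passes {
		if err := run(argv); err != nil {
			return fmt.Errorf("pass %d of %d: %w", i+1, len(passes), err)
		}
	}
	return nil
}

func main() {
	printCmd := func(argv []string) error {
		fmt.Println("ffmpeg", strings.Join(argv, " "))
		return nil
	}
	presets := []Preset{
		Trim{Input: "in.mp4", Start: "00:00:05", Duration: "10", Output: "cut.mp4"},
		Stabilize{Input: "shaky.mp4", Output: "steady.mp4"},
	}
	for _, p := range presets {
		if err := RunAll(p, printCmd); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```

The caller never asks how many passes a preset has; it only ranges over whatever Build() returns.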

Sentinel values and placeholder tokens

Two presets need runtime data injected into their commands.

Concat needs a temp file path: the list of input files in ffmpeg's concat format. Build()
returns the literal string "CONCAT_LIST_FILE" in the argv slice. The runner sees it, creates
the temp file from the form's comma-separated input, substitutes the real path, and cleans up
on completion or error. Build() stays a pure function.
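
A sketch of the runner side of the Concat sentinel, under stated assumptions: the "CONCAT_LIST_FILE" literal is from the text, while concatListBody, substitute, withConcatList, and the temp file name pattern are hypothetical. The list lines use the concat demuxer's file '...' syntax.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// concatSentinel is the literal placeholder that Build() puts in argv.
const concatSentinel = "CONCAT_LIST_FILE"

// concatListBody turns comma-separated paths into concat demuxer lines.
func concatListBody(inputs string) string {
	var b strings.Builder
	for _, p := range strings.Split(inputs, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		// Inside single quotes, a literal quote is written as '\''.
		fmt.Fprintf(&b, "file '%s'\n", strings.ReplaceAll(p, "'", `'\''`))
	}
	return b.String()
}

// substitute returns a copy of argv with each sentinel replaced by path.
func substitute(argv []string, path string) []string {
	out := make([]string, len(argv))
	for i, a := range argv {
		if a == concatSentinel {
			a = path
		}
		out[i] = a
	}
	return out
}

// withConcatList writes the temp list, hands the resolved argv to run,
// and removes the file whether run succeeds or fails.
func withConcatList(argv []string, inputs string, run func([]string) error) error {
	f, err := os.CreateTemp("", "fftool-concat-*.txt")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	_, werr := f.WriteString(concatListBody(inputs))
	cerr := f.Close()
	if werr != nil {
		return werr
	}
	if cerr != nil {
		return cerr
	}
	return run(substitute(argv, f.Name()))
}

func main() {
	argv := []string{"-f", "concat", "-safe", "0", "-i", concatSentinel, "-c", "copy", "out.mp4"}
	err := withConcatList(argv, "/videos/a.mp4, /videos/b.mp4", func(resolved []string) error {
		fmt.Println("ffmpeg", strings.Join(resolved, " "))
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Because the file is created and removed inside one function, the cleanup on error comes from a single defer rather than from every exit path.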

Normalize's second pass needs the loudness measurements from pass 1. Build() puts tokens like
{{norm:input_i}} in the pass 2 argv. After pass 1 runs, the runner extracts the JSON block
from ffmpeg's stderr, parses it, and substitutes all tokens. The double-brace format is
unambiguous: it won't collide with real ffmpeg arguments.
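
A sketch of that token replacement, under assumptions: the {{norm:...}} token format is from the text and the key names are the ones loudnorm prints, but parseLoudnorm, fillTokens, and the sample measurement values below are made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// tokenRe matches placeholders such as {{norm:input_i}}.
var tokenRe = regexp.MustCompile(`\{\{norm:([a-z_]+)\}\}`)

// parseLoudnorm pulls the last {...} block out of stderr and decodes it.
// loudnorm prints its measurements as quoted strings.
func parseLoudnorm(stderr string) (map[string]string, error) {
	start := strings.LastIndex(stderr, "{")
	end := strings.LastIndex(stderr, "}")
	if start < 0 || end < start {
		return nil, errors.New("no JSON block found in stderr")
	}
	var m map[string]string
	if err := json.Unmarshal([]byte(stderr[start:end+1]), &m); err != nil {
		return nil, fmt.Errorf("parse loudnorm output: %w", err)
	}
	return m, nil
}

// fillTokens replaces every token in argv, failing on unknown keys so a
// half-substituted command is never run.
func fillTokens(argv []string, m map[string]string) ([]string, error) {
	out := make([]string, len(argv))
	for i, a := range argv {
		missing := ""
		out[i] = tokenRe.ReplaceAllStringFunc(a, func(tok string) string {
			key := tokenRe.FindStringSubmatch(tok)[1]
			v, ok := m[key]
			if !ok {
				missing = key
			}
			return v
		})
		if missing != "" {
			return nil, fmt.Errorf("no measurement for %q", missing)
		}
	}
	return out, nil
}

func main() {
	// Sample pass 1 stderr tail; the numbers are invented.
	stderr := `[Parsed_loudnorm_0 @ 0x5581]
{
	"input_i" : "-27.61",
	"input_tp" : "-4.47",
	"input_lra" : "18.06",
	"input_thresh" : "-39.20",
	"target_offset" : "0.58"
}`
	pass2 := []string{"-i", "in.wav", "-af",
		"loudnorm=I=-16:TP=-1.5:LRA=11" +
			":measured_I={{norm:input_i}}:measured_TP={{norm:input_tp}}" +
			":measured_LRA={{norm:input_lra}}:measured_thresh={{norm:input_thresh}}" +
			":offset={{norm:target_offset}}:linear=true",
		"out.wav"}

	m, err := parseLoudnorm(stderr)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	argv, err := fillTokens(pass2, m)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("ffmpeg", strings.Join(argv, " "))
}
```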

Both approaches keep the preset code simple and push the runtime complexity into one place: the
runner.

The Executor interface

ui/run.go needs to execute ffmpeg, but ui can't import main: Go does not allow importing a
main package. The fix is an Executor interface defined in the ui package. main wraps the
FFmpeg struct in a thin type that satisfies it. The ui package depends on nothing from main.
Dependency direction stays clean.
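
The arrangement can be sketched in a single file, with comments marking which package each piece would live in. The Executor and FFmpeg names come from the text; the Run method's shape, RunScreen, and ffmpegExecutor are assumptions, and the stand-in FFmpeg only echoes the command instead of starting a process.

```go
package main

import (
	"fmt"
	"strings"
)

// ---- would live in package ui ----

// Executor is everything the run screen needs from the outside world.
// The method shape is an assumption for this sketch.
type Executor interface {
	Run(argv []string, onLine func(line string)) error
}

// RunScreen depends on the interface only, never on package main.
type RunScreen struct {
	Exec  Executor
	Lines []string
}

// Start runs every pass and collects output lines for display.
func (r *RunScreen) Start(passes [][]string) error {
	for _, argv := range passes {
		err := r.Exec.Run(argv, func(line string) {
			r.Lines = append(r.Lines, line)
		})
		if err != nil {
			return err
		}
	}
	return nil
}

// ---- would live in package main ----

// FFmpeg stands in for the real struct; here it only echoes the command.
type FFmpeg struct{ Path string }

func (f *FFmpeg) exec(argv []string, onLine func(string)) error {
	onLine(f.Path + " " + strings.Join(argv, " "))
	return nil
}

// ffmpegExecutor is the thin wrapper that satisfies ui's Executor.
type ffmpegExecutor struct{ ff *FFmpeg }

func (e ffmpegExecutor) Run(argv []string, onLine func(string)) error {
	return e.ff.exec(argv, onLine)
}

func main() {
	screen := &RunScreen{Exec: ffmpegExecutor{ff: &FFmpeg{Path: "ffmpeg"}}}
	if err := screen.Start([][]string{{"-version"}}); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(screen.Lines)
}
```

A side benefit of defining the interface on the consumer side is that the run screen can be tested with a fake Executor and no ffmpeg binary.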


The bug that looked like nothing

After the first working build, ./fftool did nothing. No output. No error. Just returned to
the prompt. ./fftool 2>/tmp/fftool.log produced an empty log.

Clean exit, no stderr, terminal restored, which meant something was sending tea.Quit almost
immediately. The only candidates in the code were a Ctrl+C handler, a q keypress handler,
and a terminal size check. No keys were being pressed. That left the size check.

I added file-based logging. The log showed:

Update: state=0 msg=tea.WindowSizeMsg {Width:100 Height:9}
Update: state=0 msg=tea.sequenceMsg [...]
Update: state=0 msg=tea.printLineMessage {fftool: terminal too small (need 60x20, got 100x9)}
bubbletea exited cleanly

The terminal was 9 rows tall. The size check was working exactly as written. But tea.Printf
is no use under tea.WithAltScreen(): Bubble Tea documents that it prints nothing while the
alt screen is active, and anything drawn on the alt screen is discarded when the program
exits and the terminal restores the primary screen. Either way, the user sees nothing.

The fix: check terminal size in main.go before tea.NewProgram is called, using a direct
syscall.TIOCGWINSZ ioctl. If the terminal is too small, print to real stderr and exit before
bubbletea ever touches the terminal.

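// winsize mirrors the kernel struct that TIOCGWINSZ fills in.
type winsize struct{ Row, Col, Xpixel, Ypixel uint16 }

// termSize asks the kernel for the size of the terminal on stdout.
// It needs the os, syscall, and unsafe imports. If the ioctl fails
// (for example, stdout is not a terminal), ws stays zeroed and 0, 0 is returned.
func termSize() (cols, rows int) {
    ws := &winsize{}
    syscall.Syscall(syscall.SYS_IOCTL,
        uintptr(os.Stdout.Fd()),
        syscall.TIOCGWINSZ,
        uintptr(unsafe.Pointer(ws)))
    return int(ws.Col), int(ws.Row)
}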

The general lesson: anything that needs to survive program exit must be written to stderr
*before* p.Run(). Once you're in alt screen, you're in alt screen.
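
The ordering can be sketched like this. The 60x20 minimum and the message text match the log above; checkSize, minCols, and minRows are hypothetical names, and the dimensions are hard-coded here where the real program would call termSize().

```go
package main

import (
	"fmt"
	"os"
)

// Minimum terminal size, taken from the requirement stated in the post.
const (
	minCols = 60
	minRows = 20
)

// checkSize returns a descriptive error when the terminal is too small.
func checkSize(cols, rows int) error {
	if cols < minCols || rows < minRows {
		return fmt.Errorf("fftool: terminal too small (need %dx%d, got %dx%d)",
			minCols, minRows, cols, rows)
	}
	return nil
}

func main() {
	// Hard-coded for the sketch; the real values would come from termSize().
	cols, rows := 100, 9

	if err := checkSize(cols, rows); err != nil {
		// Still on the primary screen, so this message stays visible.
		fmt.Fprintln(os.Stderr, err)
		return // a real program would exit with a non-zero status here
	}

	// Only past this point would the program call
	// tea.NewProgram(model, tea.WithAltScreen()) and then Run().
	fmt.Println("terminal is large enough, starting UI")
}
```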


What's in it

40 Go source files, ~3,200 lines. The breakdown:

  • Foundation: ffmpeg detection, version parsing, subprocess execution with streaming stderr,
    progress line parsing

  • UI layer: menu, form, confirm, run screen with live output and progress bar, result screen
  • 27 preset implementations across video, audio, image, generative, and info categories

The preset files average around 30 lines each. Most of the complexity is in ui/run.go (454
lines), which handles all four execution modes, the sentinel substitution, the normalize token
replacement, and the fallback retry logic for audio stream copying.


Get it

fftool — download and documentation

Linux x86-64 binary, ~3.5MB. Requires ffmpeg on PATH and a terminal at least 60×20.