Daemon Architecture: Process Model and Browser Lifecycle
Two Modes of Operation
Vibium operates in two modes, chosen per-command:
Daemon Mode (Default)
A background process keeps Chrome running between commands. Fast and stateful.
# First command: daemon starts, Chrome launches
vibe-check navigate https://example.com # ~2s (cold start)
# Subsequent commands: reuse existing browser
vibe-check text "h1" # ~100ms (hot)
vibe-check click "a" # ~200ms (hot)
vibe-check screenshot -o shot.png # ~300ms (hot)
# Explicit control
vibe-check daemon status # Check if running
vibe-check daemon stop # Kill daemon + browser
Best for:
- Interactive testing sessions
- Agent-driven multi-step flows
- Development and debugging
Oneshot Mode
Fresh browser per command. Isolated and stateless.
# Each command: launch → execute → teardown
vibe-check navigate https://example.com --oneshot # ~2s
vibe-check text "h1" --oneshot # ~2s (new browser!)
# Or via environment variable
VIBIUM_ONESHOT=1 vibe-check navigate https://example.com
Best for:
- CI/CD pipelines
- Parallel test execution
- Tests that need clean state
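The per-command choice between the two modes can be sketched in Go as follows. This is a minimal illustration, not Vibium's actual code: `resolveMode` and the `Mode` type are hypothetical names, and the sketch assumes that only the value `1` for `VIBIUM_ONESHOT` enables oneshot mode, which the text above does not confirm.

```go
package main

import (
	"fmt"
	"os"
)

// Mode describes how a single command obtains its browser.
type Mode string

const (
	Daemon  Mode = "daemon"
	Oneshot Mode = "oneshot"
)

// resolveMode is a hypothetical helper. It returns Oneshot when the
// --oneshot flag is present or VIBIUM_ONESHOT is "1" (an assumption),
// and Daemon otherwise, since daemon mode is the default.
func resolveMode(args []string, getenv func(string) string) Mode {
	for _, a := range args {
		if a == "--oneshot" {
			return Oneshot
		}
	}
	if getenv("VIBIUM_ONESHOT") == "1" {
		return Oneshot
	}
	return Daemon
}

func main() {
	fmt.Println(resolveMode(os.Args[1:], os.Getenv))
}
```

Passing the environment lookup as a function keeps the decision testable without mutating the real environment.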
The Process Tree
When Vibium launches a browser session:
clicker (Go binary, ~10MB)
└── chromedriver
    └── Chrome for Testing (main browser process)
        ├── chrome_crashpad_handler (crash reporting)
        ├── GPU helper
        ├── Network helper
        ├── Storage helper
        ├── Renderer helper (one per tab/frame)
        └── ...more helpers
A single session spawns 8-12 OS processes. This matters for:
- Resource planning (memory, CPU)
- Process cleanup (killing one doesn't kill all)
- CI environments (container limits)
The Zombie Problem
What Goes Wrong
When chromedriver is killed, its children (Chrome + helpers) get reparented to PID 1 (launchd on macOS, init on Linux) before the cleanup code can reach them. They become orphans:
Before kill:              After chromedriver dies:
clicker                   clicker
└── chromedriver          (gone - killed)
    └── Chrome            Chrome (parent = PID 1, orphaned!)
        └── GPU helper    └── GPU helper
        └── Renderer      └── Renderer
Why Not Kill Chrome First?
The naive approach — "just kill Chrome before chromedriver" — fails because:
- Sending DELETE /session to chromedriver can be interrupted by signals
- The HTTP request might time out
- Chromedriver might die before Chrome fully terminates
- Race conditions between the cleanup sequence and OS process management
The Solution: Three-Phase Cleanup
Implemented in launcher.go:Close():
// Phase 1: Polite request
// Send DELETE /session to chromedriver → asks Chrome to quit gracefully
// Best effort, may fail
// Phase 2: Kill process tree
func killProcessTree(pid int) {
	descendants := getDescendants(pid) // recursive pgrep -P
	// Kill deepest children first (reverse order)
	for i := len(descendants) - 1; i >= 0; i-- {
		syscall.Kill(descendants[i], syscall.SIGKILL)
	}
	syscall.Kill(pid, syscall.SIGKILL)
}
// Phase 3: Orphan sweep
// Find Chrome/chromedriver processes with parent PID 1
// These escaped the tree kill — terminate them
killOrphanedChromeProcesses()
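Phase 2 depends on the order in which descendants are listed. The sketch below shows one way to produce that order from a snapshot of the process table. It is an illustration under assumptions: `descendantsOf` is a hypothetical stand-in for `getDescendants`, and it takes a pid-to-parent map instead of shelling out to `pgrep -P`, so it can run without real processes. The list is breadth-first, so walking it in reverse visits children before their parents.

```go
package main

import (
	"fmt"
	"sort"
)

// descendantsOf returns every descendant of root in breadth-first order,
// so each parent appears before its own children. parentOf maps pid -> ppid.
func descendantsOf(parentOf map[int]int, root int) []int {
	children := make(map[int][]int)
	for pid, ppid := range parentOf {
		children[ppid] = append(children[ppid], pid)
	}
	for _, kids := range children {
		sort.Ints(kids) // deterministic order for a given snapshot
	}

	var out []int
	seen := map[int]bool{root: true} // guards against malformed, cyclic input
	queue := []int{root}
	for len(queue) > 0 {
		p := queue[0]
		queue = queue[1:]
		for _, kid := range children[p] {
			if seen[kid] {
				continue
			}
			seen[kid] = true
			out = append(out, kid)
			queue = append(queue, kid)
		}
	}
	return out
}

func main() {
	// Hypothetical snapshot: 100 = clicker, 200 = chromedriver,
	// 300 = Chrome, 301 and 302 = Chrome helpers.
	table := map[int]int{100: 1, 200: 100, 300: 200, 301: 300, 302: 300}
	d := descendantsOf(table, 200)
	for i := len(d) - 1; i >= 0; i-- {
		fmt.Println("would kill", d[i])
	}
	fmt.Println("would kill", 200)
}
```

A snapshot can go stale between listing and killing, which is one reason the orphan sweep in Phase 3 is still needed.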
What If the Clicker Itself Dies?
| Scenario | What Happens | Cleanup |
|---|---|---|
| Normal exit | Close() runs all three phases | Automatic |
| Ctrl+C (SIGINT) | Signal handler calls KillAll() | Automatic |
| kill -9 (SIGKILL) | Nothing can intercept this | Orphans remain until next session |
| System crash | Process table wiped | OS handles it |
For development, the make double-tap target kills all Chrome for Testing and chromedriver processes:
# Manual cleanup during development
make double-tap
# Debugging: check for orphans
pgrep -lf 'Chrome for Testing'
pgrep -lf chromedriver
# Check parent PIDs (orphans have PPID = 1)
ps -o pid,ppid,comm -p $(pgrep -f 'Chrome for Testing')
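The PPID check above can also be done programmatically. The sketch below parses text in the `ps -o pid,ppid,comm` layout and keeps browser-related rows whose parent is PID 1. It is an illustration only: `findOrphans` and the `proc` type are hypothetical, and the match strings follow the macOS names used above. On Linux the comm column is usually a short executable name such as `chrome`, so the strings would likely need adjusting.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type proc struct {
	PID, PPID int
	Comm      string
}

// findOrphans parses `ps -o pid,ppid,comm` style output and returns
// Chrome for Testing / chromedriver processes whose parent is PID 1.
func findOrphans(psOutput string) []proc {
	var orphans []proc
	for _, line := range strings.Split(psOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 {
			continue
		}
		pid, err1 := strconv.Atoi(fields[0])
		ppid, err2 := strconv.Atoi(fields[1])
		if err1 != nil || err2 != nil {
			continue // header row or malformed line
		}
		comm := strings.Join(fields[2:], " ") // command names may contain spaces
		isBrowser := strings.Contains(comm, "Chrome for Testing") ||
			strings.Contains(comm, "chromedriver")
		if ppid == 1 && isBrowser {
			orphans = append(orphans, proc{pid, ppid, comm})
		}
	}
	return orphans
}

func main() {
	// Hypothetical sample output, not captured from a real machine.
	sample := "  PID  PPID COMM\n" +
		" 4001     1 Chrome for Testing\n" +
		" 4002  3999 chromedriver\n"
	for _, p := range findOrphans(sample) {
		fmt.Printf("orphan pid=%d comm=%q\n", p.PID, p.Comm)
	}
}
```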
Daemon Communication
The daemon listens on a local WebSocket. CLI commands connect to it:
vibe-check click "button"
│
▼
Daemon (clicker binary, persistent)
│
▼ WebSocket (BiDi)
Chrome (persistent)
If the daemon isn't running, the first command starts it automatically. This is the "auto-launch" behavior that makes the tool feel seamless.
Implications for Test Frameworks
Daemon Mode for Test Suites
# Start of test suite: ensure clean state
vibe-check daemon stop 2>/dev/null
vibe-check daemon start
# Run tests (all share the same browser)
run_test "login_test"
run_test "dashboard_test"
run_test "checkout_test"
# Cleanup
vibe-check daemon stop
Pro: Fast, with no browser restart between tests
Con: State leaks between tests (cookies, localStorage, etc.)
Oneshot Mode for Isolated Tests
# Each test gets a fresh browser
VIBIUM_ONESHOT=1 run_test "login_test"
VIBIUM_ONESHOT=1 run_test "dashboard_test"
Pro: Perfect isolation, with no state leaks
Con: Slow, with ~2s overhead per test for browser launch
Hybrid: Daemon + Manual State Reset
# Keep daemon for speed, but reset state between tests
vibe-check daemon start
tests=(login_test dashboard_test checkout_test)
for test in "${tests[@]}"; do
  vibe-check eval "localStorage.clear(); sessionStorage.clear()"
  vibe-check eval "document.cookie.split(';').forEach(c => document.cookie = c.trim().split('=')[0] + '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/')"
  vibe-check navigate "about:blank"
  run_test "$test"
done
vibe-check daemon stop
Pro: Fast + mostly isolated
Con: Manual state management; misses HttpOnly cookies (which document.cookie cannot see) and server-side session state
CI/CD Considerations
Headless Mode
# No display needed
vibe-check navigate https://example.com --headless
Container Environments
# Chrome needs these dependencies on Linux
RUN apt-get update && apt-get install -y libgbm1 libnss3 libatk-bridge2.0-0 \
    libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 \
    libxfixes3 libxrandr2 libasound2
Parallel Execution
Each daemon instance uses a separate port. For parallel tests, use oneshot mode or explicitly manage daemon instances.
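If daemon instances are managed explicitly, each one needs a port that nothing else is using. One common technique is to bind to port 0 and let the OS pick. The Go sketch below shows only that step. `freePort` is a hypothetical helper, and how a chosen port would be handed to a Vibium daemon is not covered above, so it is left out. The port is released before the caller uses it, so another process could claim it in between.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the OS for an unused TCP port on the loopback interface
// by binding to port 0, then releases it for the caller to reuse.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	// One port per hypothetical parallel worker.
	for worker := 0; worker < 3; worker++ {
		port, err := freePort()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("worker %d -> port %d\n", worker, port)
	}
}
```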
Interview Talking Point
"Vibium's daemon architecture is a pragmatic trade-off between speed and isolation. The daemon keeps Chrome alive between commands — reducing per-command latency from ~2 seconds to ~100ms — while oneshot mode gives you fresh-browser isolation for CI. The interesting technical challenge is process cleanup: Chrome spawns 8-12 child processes, and killing the driver process orphans them. Vibium solves this with a three-phase cleanup: graceful shutdown request, recursive process tree kill, then an orphan sweep for any that escaped. For our test framework, we use daemon mode during development for speed and oneshot in CI for reliability."