
How to Present Your Framework: 5, 15, and 30 Minute Versions

The 5-Minute Elevator Pitch

Use this for initial screens, hallway conversations, or when the interviewer says "tell me about your approach."


"I've built an AI-augmented test automation framework that uses Claude Code with the Vibium browser automation skill.

The core idea: Instead of writing brittle Selenium scripts, we define tests as natural language steps and let an AI agent execute them through a CLI that drives Chrome via the WebDriver BiDi protocol.

Why this works: The agent can reason about failures, find alternative selectors when the UI changes, and adapt to unexpected states. Traditional tests break when a button ID changes. Our tests find the button by text content and keep going.

Key numbers: 29x cheaper in token costs than MCP-based approaches. 100ms per browser command via daemon mode. Five actionability checks run server-side before every interaction — same as Playwright, but implemented once in Go instead of per-client.

The trade-off: Non-deterministic execution, but we mitigate this with command logging, failure screenshots, and a three-tier self-healing strategy."


The 15-Minute Technical Overview

Use this for technical interviews or architecture discussions.

Part 1: Problem Statement (2 min)

"Traditional browser test automation has three problems:

  1. Maintenance burden — When the UI changes, every affected test breaks. Teams spend 40-60% of their time maintaining tests, not writing new ones.

  2. Brittleness — Tests depend on exact CSS selectors. A simple HTML restructuring breaks dozens of tests.

  3. No reasoning — When something unexpected happens (a modal appears, a redirect occurs, a loading spinner takes longer), traditional tests just fail. A human tester would handle it.

AI agents solve all three — they can reason about the UI, adapt to changes, and handle unexpected states."

Part 2: Architecture (5 min)

"Three layers, each with a clear responsibility:

Test Definitions (YAML):

test: Login with valid credentials
steps:
  - Navigate to the login page
  - Enter valid email and password
  - Click login
  - Verify dashboard loads
selectors:
  email: 'input[name=email]'
  password: 'input[name=password]'
  submit: 'button[type=submit]'
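A minimal runner sketch (Python for illustration; the doc specifies only the YAML shape above, so every function and parameter name here is hypothetical) showing how a harness might hand each natural-language step to the agent while passing the selectors along as hints:

```python
# Hypothetical harness: the test definition mirrors the YAML above.
# Step execution is delegated to an agent callable; all names are
# illustrative, not part of the actual framework.

TEST = {
    "test": "Login with valid credentials",
    "steps": [
        "Navigate to the login page",
        "Enter valid email and password",
        "Click login",
        "Verify dashboard loads",
    ],
    "selectors": {
        "email": "input[name=email]",
        "password": "input[name=password]",
        "submit": "button[type=submit]",
    },
}

def run_test(test, execute_step):
    """Hand each natural-language step to the agent with selector hints."""
    results = [execute_step(step, hints=test["selectors"])
               for step in test["steps"]]
    return all(results)

# Stub agent for demonstration: a real agent would drive the browser CLI
# and return True/False per step based on verification.
ok = run_test(TEST, lambda step, hints: True)
```

The point of the split: the steps carry intent, the selectors are optional hints the agent may use or discard when the UI changes.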

Agent Layer (Claude Code + vibe-check skill):

  • The skill is a 100-line markdown file teaching the agent 22 browser commands
  • Cost: ~1,000 tokens initial load, ~130 per command
  • The agent reads the test, executes via Bash commands, verifies via text extraction

Browser Layer (Vibium):

  • Go binary (~10MB), WebDriver BiDi protocol
  • Daemon mode (100ms/cmd) or oneshot (2s/cmd for CI)
  • Five actionability checks: visible, stable, receives-events, enabled, editable
  • Custom BiDi extension commands: vibium:find, vibium:click, vibium:type
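The five actionability checks can be sketched as a server-side gate (Python for illustration; Vibium implements this once in Go, and the element fields used here are assumptions for the sketch):

```python
# Illustrative actionability gate: an interaction proceeds only if every
# check passes. The field names on `el` are assumptions for this sketch.

CHECKS = {
    "visible":         lambda el: el["visible"],
    "stable":          lambda el: not el["animating"],  # bounding box not moving
    "receives-events": lambda el: not el["obscured"],   # not covered by another element
    "enabled":         lambda el: not el["disabled"],
    "editable":        lambda el: not el["readonly"],   # relevant when typing
}

def actionable(el):
    """Return the names of failed checks; empty means the element is actionable."""
    return [name for name, check in CHECKS.items() if not check(el)]

button = {"visible": True, "animating": False, "obscured": False,
          "disabled": False, "readonly": False}
failures = actionable(button)
```

Running the gate before every click or type is what removes most flakiness: the command blocks until the checks pass rather than interacting with a half-rendered element.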

Why CLI Skills over MCP:

  • 29x lower token cost (3,200 vs 92,000 tokens for a 20-step test)
  • Leaves 98% of context window for reasoning
  • Composable with other CLI tools
  • Even Playwright's docs acknowledge CLI+Skills is more token-efficient"
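The 29x figure follows directly from the quoted totals; the 200k-token context window used to sanity-check the "98% left for reasoning" claim is an assumption, not stated in the doc:

```python
# Arithmetic on the numbers quoted above for a 20-step test.
skills_tokens = 3_200    # via CLI skill (from the doc)
mcp_tokens    = 92_000   # via MCP (from the doc)

ratio = mcp_tokens / skills_tokens               # 28.75 -> "~29x cheaper"

context_window = 200_000                         # assumed window size
remaining = 1 - skills_tokens / context_window   # 0.984 -> "~98% left"
```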

Part 3: Self-Healing (3 min)

"When a selector fails:

Tier 1 — Agent discovers existing elements: vibe-check find-all 'button' → matches by text → retries. Cost: ~500 tokens. Handles 80% of failures.

Tier 2 — Agent takes screenshot + reads page text. Catches loading issues, redirects, modals. Cost: ~800 tokens.

Tier 3 — Falls back to MCP accessibility tree for semantic analysis. For major redesigns. Cost: ~5,500 tokens.

We track healing events. Repeated healing on the same selector means the selector is stale (update the test). Healing scattered across many tests and runs signals flakiness (fix the root cause)."
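The escalation logic reads naturally as a fallback chain. A sketch, with the tier names and token costs taken from the tiers above and everything else (function shapes, the stub results) purely illustrative:

```python
def heal(selector, tiers, log):
    """Try each healing tier in order; record which tier resolved the failure."""
    for name, attempt, cost in tiers:
        result = attempt(selector)
        if result is not None:
            log.append({"tier": name, "tokens": cost, "original": selector})
            return result
    return None  # unrecoverable: fail the test with artifacts attached

# Stub tiers mirroring the doc: find-all by text (~500 tokens),
# screenshot + page text (~800), MCP accessibility tree (~5,500).
tiers = [
    ("find-all",      lambda s: None,        500),    # no text match this time
    ("screenshot",    lambda s: "button#go", 800),    # recovered after page settled
    ("accessibility", lambda s: "button#go", 5_500),
]
log = []
found = heal("button.old-class", tiers, log)
```

The log is what makes the heuristic above actionable: each entry says which tier fired and for which selector, so stale selectors and flaky tests show up as distinct patterns.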

Part 4: CI/CD (3 min)

"GitHub Actions with matrix strategy:

  • Headless Chrome, oneshot mode for isolation
  • 4-5 parallel test groups
  • Failure artifacts: screenshots, page text, URL, command log
  • JUnit XML for dashboard integration
  • ~$25 per full suite run (500 tests at $0.05/test)
  • Docker image pre-built with Chrome + Vibium for consistent environments"
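The cost figures reproduce directly; everything below is arithmetic on numbers quoted in this doc (the 10 runs/day frequency appears in the follow-up Q&A):

```python
cost_per_test = 0.05   # dollars, from the doc
suite_size    = 500    # tests
runs_per_day  = 10     # from the "cost at scale" answer
groups        = 5      # parallel test groups

per_run = cost_per_test * suite_size   # $25 per full suite run
per_day = per_run * runs_per_day       # $250/day
per_group = suite_size / groups        # 100 tests per parallel group
```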

Part 5: Results (2 min)

"Key outcomes:

  • Test maintenance time reduced by 60-85% (agent adapts instead of breaking)
  • Test reliability >98% (actionability checks eliminate most flakiness)
  • New tests written in natural language, not code
  • Failure debugging time reduced (screenshot + agent reasoning vs stack traces)"

The 30-Minute Deep Dive

Use this for final-round technical interviews or architecture review boards.

Expand the 15-minute version with:

Additional Section: Token Economics (5 min)

Walk through the actual numbers from 03-skills-vs-mcp/token-budget-analysis.md. Show the per-step comparison. Explain why this matters for scalability.
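A per-step view of the same totals, derived from the doc's 20-step figures (the split between initial skill load and per-command cost is quoted earlier for skills; for MCP only the total is given, so the per-step number is a simple average):

```python
steps        = 20
skills_total = 3_200     # CLI skill, from the doc
mcp_total    = 92_000    # MCP, from the doc

skills_per_step = skills_total / steps   # 160 tokens/step on average,
                                         # roughly in line with the ~1,000-token
                                         # load + ~130/command quoted earlier
mcp_per_step    = mcp_total / steps      # 4,600 tokens/step on average
```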

Additional Section: WebDriver BiDi (5 min)

Cover the protocol evolution (Selenium → WebDriver → CDP → BiDi). Explain how Vibium's proxy architecture works with extension commands. Show the message flow diagram.

Additional Section: Competitive Landscape (5 min)

Compare Vibium, Playwright MCP, browser-use, agent-browser. Explain your decision framework for choosing tools. Show the decision matrix.

Additional Section: Live Demo or Walkthrough (remaining time)

If possible, show:

  1. Installing the skill: npx skills add https://github.com/VibiumDev/vibium --skill vibe-check
  2. Running a simple test: agent navigates, clicks, verifies
  3. Showing the command log
  4. Showing a failure with screenshot + recovery

Common Follow-Up Questions (Be Ready)

  • "What about visual testing?" → Screenshots + vision models for comparison
  • "How do you version tests?" → YAML files in git, just like code
  • "What if the agent makes a mistake?" → Command log for reproducibility, CI gates on pass/fail
  • "Cost at scale?" → ~$0.05/test × 500 tests × 10 runs/day = $250/day
  • "Why not just use Cypress?" → No reasoning capability, no self-healing, same maintenance burden
  • "How do you handle pop-ups/alerts?" → vibe-check eval 'window.confirm = () => true', or the agent reasons about them
  • "What about mobile testing?" → Vibium is desktop Chrome; mobile via responsive mode: vibe-check eval 'window.resizeTo(375, 812)'
  • "Who maintains the tests?" → QA writes natural language definitions, the agent handles execution. Maintenance is updating intents, not selectors.

Body Language & Delivery Tips

  1. Draw diagrams — The three-layer architecture diagram sells the framework better than words
  2. Use concrete numbers — "29x cheaper," "98% reliability," "100ms per command" are memorable
  3. Name-drop thoughtfully — "Vibium was created by the person behind Selenium and Appium" adds credibility
  4. Acknowledge trade-offs first — "The trade-off is non-determinism, and here's how we handle it" shows maturity
  5. Have opinions — "I believe CLI skills will become the standard interface for AI agents because..." — architects respect conviction backed by reasoning