Future Directions: Where AI-Driven Testing Is Heading
Near-Term (2026-2027)
AI-Powered Locators
What: Instead of CSS selectors, you describe elements in natural language:
await vibe.do("click the login button");
await vibe.check("verify the dashboard loaded");
const el = await vibe.find("the blue submit button");
Status: On Vibium's V2 roadmap. Open questions include:
- Local vision model (Qwen-VL) vs API (Claude Vision)?
- Screenshot → model → coordinates, or DOM → model → selector?
- How to handle ambiguity ("the button" when there are 5)?
- Caching/memoization of element locations?
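The last two open questions (ambiguity and caching) can be explored together. The following is a minimal sketch, not Vibium's design: `LocatorCache`, `Candidate`, `Resolution`, and the `margin` threshold are all hypothetical names introduced here, and the model call is replaced by a plain function so the logic can run without a model.

```typescript
// Hypothetical sketch: none of these names come from Vibium.
interface Candidate {
  selector: string; // concrete selector the model proposed
  score: number;    // model confidence in [0, 1]
}

type Resolution =
  | { kind: "match"; selector: string; fromCache: boolean }
  | { kind: "ambiguous"; candidates: Candidate[] }
  | { kind: "none" };

// Stand-in for the vision or DOM model; synchronous to keep the sketch small.
type Resolver = (description: string) => Candidate[];

class LocatorCache {
  private readonly cache = new Map<string, string>();
  private readonly resolver: Resolver;
  private readonly margin: number;

  constructor(resolver: Resolver, margin = 0.2) {
    this.resolver = resolver;
    this.margin = margin;
  }

  private key(pageUrl: string, description: string): string {
    return `${pageUrl}::${description.trim().toLowerCase()}`;
  }

  resolve(pageUrl: string, description: string): Resolution {
    const key = this.key(pageUrl, description);
    const cached = this.cache.get(key);
    if (cached !== undefined) {
      return { kind: "match", selector: cached, fromCache: true };
    }
    const ranked = [...this.resolver(description)].sort((a, b) => b.score - a.score);
    const [best, second] = ranked;
    if (best === undefined) return { kind: "none" };
    // Treat near-ties as ambiguity instead of silently picking one.
    if (second !== undefined && best.score - second.score < this.margin) {
      return { kind: "ambiguous", candidates: ranked };
    }
    this.cache.set(key, best.selector);
    return { kind: "match", selector: best.selector, fromCache: false };
  }

  // Call when a cached selector stops matching, e.g. after a redesign.
  invalidate(pageUrl: string, description: string): void {
    this.cache.delete(this.key(pageUrl, description));
  }
}
```

Returning an explicit `ambiguous` result lets the caller decide whether to ask for a more specific description or fail the step, rather than clicking one of five buttons at random.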
Impact on QA: Would remove the #1 maintenance burden, CSS selectors, so that tests can be written entirely in natural language.
Timeline: 2026-2027 (high uncertainty — Vibium estimates 3-6 weeks of research)
Cortex: Application Memory
What: SQLite-backed datastore that builds an "app map" — a persistent model of the application's pages, flows, and elements.
How it helps:
- Agent doesn't rediscover the same flows across sessions
- Pathfinding: "How do I get from the login page to the settings page?" → Dijkstra shortest path
- Test intelligence: "This page used to have 5 buttons, now it has 4" → automatic regression detection
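The pathfinding bullet can be made concrete with a small Dijkstra implementation over an app map. This is a sketch under assumptions: the `Edge` shape, the `cost` field, and the page names are invented for illustration and are not Cortex's schema.

```typescript
// Hypothetical app-map edge: performing `action` on page `from` leads to page `to`.
interface Edge {
  from: string;
  to: string;
  action: string;
  cost: number; // e.g. number of steps or expected seconds; must be non-negative
}

function shortestPath(edges: Edge[], start: string, goal: string): Edge[] | null {
  const dist = new Map<string, number>([[start, 0]]);
  const prev = new Map<string, Edge>();
  const done = new Set<string>();

  while (true) {
    // Pick the closest unvisited page (linear scan; fine for small maps).
    let current: string | null = null;
    let best = Infinity;
    for (const [page, d] of dist) {
      if (!done.has(page) && d < best) {
        best = d;
        current = page;
      }
    }
    if (current === null) return null; // goal unreachable
    if (current === goal) break;
    done.add(current);

    // Relax every outgoing edge of the current page.
    for (const e of edges) {
      if (e.from !== current) continue;
      const candidate = best + e.cost;
      if (candidate < (dist.get(e.to) ?? Infinity)) {
        dist.set(e.to, candidate);
        prev.set(e.to, e);
      }
    }
  }

  // Walk back from the goal to recover the action sequence.
  const path: Edge[] = [];
  for (let page = goal; page !== start; ) {
    const e = prev.get(page);
    if (e === undefined) return null;
    path.unshift(e);
    page = e.from;
  }
  return path;
}
```

An agent could replay the returned `action` values in order instead of exploring. For a large map, a binary heap would replace the linear scan.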
Components:
- SQLite database with schema for pages, actions, sessions
- sqlite-vec for embedding-based search
- Graph builder and pathfinding
- MCP server for agent queries
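The regression-detection idea above (a page that had 5 buttons now has 4) amounts to diffing two stored snapshots of the same page. A rough sketch follows. `PageSnapshot` and `diffSnapshots` are hypothetical, since Cortex's actual schema is not given here, and elements are compared by role and label only.

```typescript
// Hypothetical snapshot shape for illustration.
interface ElementRef {
  role: string;  // e.g. "button", "link"
  label: string; // visible text or accessible name
}

interface PageSnapshot {
  url: string;
  elements: ElementRef[];
}

interface SnapshotDiff {
  removed: string[];
  added: string[];
}

function diffSnapshots(before: PageSnapshot, after: PageSnapshot): SnapshotDiff {
  const key = (e: ElementRef): string => `${e.role}:${e.label}`;
  const beforeKeys = new Set(before.elements.map(key));
  const afterKeys = new Set(after.elements.map(key));
  return {
    removed: Array.from(beforeKeys).filter((k) => !afterKeys.has(k)),
    added: Array.from(afterKeys).filter((k) => !beforeKeys.has(k)),
  };
}
```

A non-empty `removed` list is only a signal, not a verdict: the agent or a reviewer still has to decide whether the change was intentional.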
When to build (from Vibium's roadmap):
"When users report that agents are repeatedly rediscovering the same flows, losing context across sessions, or unable to plan multi-step navigation."
Retina: Passive Observation
What: Chrome extension that passively records ALL browser activity, regardless of what's driving it.
Use cases:
- Record human testing sessions for replay
- Debug what happened during agent runs
- Generate training data from human interaction patterns
Components:
- Chrome Manifest V3 extension
- Click/keypress/navigation listeners
- DOM snapshot capture
- Screenshot capture
- JSONL export to Cortex
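JSONL simply means one JSON object per line, which makes recordings easy to append to and stream. The event shape below is an assumption made for illustration, and Retina's real format may differ. Only the serialization idea is the point.

```typescript
// Hypothetical event shape; field names are illustrative only.
type RetinaEvent =
  | { type: "click"; ts: number; selector: string }
  | { type: "keypress"; ts: number; key: string }
  | { type: "navigation"; ts: number; url: string };

// One JSON object per line, no enclosing array, trailing newline when non-empty.
function toJsonl(events: RetinaEvent[]): string {
  if (events.length === 0) return "";
  return events.map((e) => JSON.stringify(e)).join("\n") + "\n";
}

// Inverse of toJsonl; blank lines are skipped so partial files still load.
function fromJsonl(text: string): RetinaEvent[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RetinaEvent);
}
```

Because `JSON.stringify` escapes newlines inside strings, each event is guaranteed to stay on a single line.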
Video Recording
What: Built-in screen recording of browser sessions.
Why it matters for QA: Video artifacts support:
- Test failure debugging (see the exact visual sequence)
- Demo generation (automated demos of features)
- Compliance/audit trails (proof of testing)
Implementation: Screenshot capture at ~10fps, encoded to MP4/WebM via FFmpeg.
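As a rough illustration of the encoding step, the helper below builds an FFmpeg argument list for a directory of numbered PNG frames. The frame naming pattern, output paths, and codec choices are assumptions for this sketch, not Vibium's implementation. The flags themselves (`-framerate`, `-i`, `-c:v`, `-pix_fmt`, `-y`) are standard FFmpeg options.

```typescript
// Builds FFmpeg arguments for turning numbered PNG frames into a video.
// Codec choices here are illustrative defaults, not a documented Vibium setting.
function ffmpegArgs(framePattern: string, output: string, fps = 10): string[] {
  const codec = output.endsWith(".webm")
    ? ["-c:v", "libvpx-vp9"]
    : ["-c:v", "libx264", "-pix_fmt", "yuv420p"]; // yuv420p for broad player support
  return [
    "-y",                      // overwrite the output file if it exists
    "-framerate", String(fps), // input frame rate of the image sequence
    "-i", framePattern,        // e.g. frames/frame_%05d.png
    ...codec,
    output,
  ];
}
```

In Node, the resulting array could be passed to `child_process.spawn("ffmpeg", args)`. Capturing at a fixed 10 fps keeps the frame count predictable: a 60-second test produces about 600 frames.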
Medium-Term (2027-2028)
Autonomous Test Generation
What: The agent explores the application and generates test cases automatically, without human-written test definitions.
How it works:
- Agent navigates the entire application using vibe-check
- Cortex builds the app map
- Agent identifies critical paths (login → checkout → confirmation)
- Agent generates test definitions for each path
- Agent executes generated tests and refines based on results
Example from OpenObserve: They built a "Council of Sub-Agents" — 8 specialized AI agents that:
- Analyze new features from code diffs
- Generate test scenarios
- Write and execute tests
- Review results
Result: Coverage grew from 380 to 700+ tests.
Impact: QA engineers shift from writing tests to reviewing and curating AI-generated tests.
Multi-Browser Testing via BiDi
What: As Firefox, Edge, and Safari complete BiDi support, Vibium will be able to drive all major browsers through the same protocol.
Current state (2026):
- Chrome: Full BiDi support
- Firefox: Full BiDi support (native, no separate driver!)
- Edge: Full support (Chromium-based)
- Safari: Partial (improving each release)
Impact: Cross-browser testing without browser-specific code. One test, four browsers, same protocol.
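"Same protocol" is quite literal: a BiDi command is a JSON message with an `id`, a `method`, and `params`, and the same message is valid for any conforming browser. The sketch below builds a `browsingContext.navigate` command (a method defined in the WebDriver BiDi specification) and shows that the payload does not vary per browser. The `fanOut` helper and the browser list are illustrative, and how each browser's WebSocket endpoint is obtained is left out.

```typescript
// Shape of a WebDriver BiDi command message.
interface BidiCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

let nextId = 0;

// browsingContext.navigate takes a context id, a URL, and an optional wait condition.
function navigateCommand(context: string, url: string): BidiCommand {
  return {
    id: ++nextId,
    method: "browsingContext.navigate",
    params: { context, url, wait: "complete" },
  };
}

// Illustrative helper: the serialized payload is identical for every target browser.
function fanOut(command: BidiCommand): { browser: string; payload: string }[] {
  const browsers = ["chrome", "firefox", "edge", "safari"];
  return browsers.map((browser) => ({ browser, payload: JSON.stringify(command) }));
}
```

In practice the `context` id differs per session because each browser assigns its own, but the method name and parameter structure stay the same.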
Network Interception
What: Capture and mock network requests during testing.
Use cases:
- Mock API responses for deterministic testing
- Capture HAR files for debugging
- Simulate slow networks or error conditions
- Verify API calls are made correctly
Status: BiDi network module is being standardized. Vibium V2 plans to implement it.
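To give a feel for what mocking looks like at the protocol level, the sketch below builds the two messages involved: `network.addIntercept` to pause matching requests, and `network.provideResponse` to answer one with a canned body. The method and parameter names follow the network module of the WebDriver BiDi draft as best understood at the time of writing and should be checked against the current specification. Whatever API Vibium eventually exposes may wrap them differently.

```typescript
interface BidiMessage {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Step 1: ask the browser to pause matching requests before they are sent.
function addIntercept(id: number, urlPattern: string): BidiMessage {
  return {
    id,
    method: "network.addIntercept",
    params: {
      phases: ["beforeRequestSent"],
      urlPatterns: [{ type: "string", pattern: urlPattern }],
    },
  };
}

// Step 2: answer a paused request with a canned JSON body instead of hitting the network.
function provideJsonResponse(
  id: number,
  requestId: string,
  body: unknown,
  statusCode = 200,
): BidiMessage {
  return {
    id,
    method: "network.provideResponse",
    params: {
      request: requestId,
      statusCode,
      headers: [
        { name: "content-type", value: { type: "string", value: "application/json" } },
      ],
      body: { type: "string", value: JSON.stringify(body) },
    },
  };
}
```

The `requestId` would come from a `network.beforeRequestSent` event, which the client has to subscribe to first. Returning a fixed body like this is what makes a test deterministic, and swapping in `statusCode` 500 covers the error-condition use case.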
Long-Term (2028+)
Self-Writing Test Suites
What: The AI continuously monitors the application, detects changes, generates tests, runs them, and reports issues — all without human intervention.
The loop:
Application updated →
Agent detects changes →
Agent generates new tests →
Agent runs all tests →
Agent reports issues →
Developer fixes bugs →
Application updated → (repeat)
Visual Testing at Scale
What: Vision models compare screenshots across versions, browsers, and devices to detect visual regressions.
Why it's hard today: Vision API calls are slow and costly ($0.01-0.05 per screenshot), and they require careful prompting. As local vision models improve and costs decrease, this becomes practical at scale.
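A quick back-of-the-envelope calculation shows why the per-screenshot price matters. The function below multiplies out a test matrix using the $0.01-0.05 range quoted above. The page, browser, and viewport counts in the example are made-up inputs, and prices are kept in cents to avoid floating-point rounding.

```typescript
// Cost range for one visual-regression run at 1 to 5 cents per screenshot.
function visualRunCost(pages: number, browsers: number, viewports: number) {
  const screenshots = pages * browsers * viewports;
  const lowCents = screenshots * 1;
  const highCents = screenshots * 5;
  return {
    screenshots,
    lowUsd: lowCents / 100,
    highUsd: highCents / 100,
  };
}

// Example: 200 pages, 4 browsers, 3 viewports.
const estimate = visualRunCost(200, 4, 3);
console.log(estimate); // { screenshots: 2400, lowUsd: 24, highUsd: 120 }
```

At $24 to $120 per run, running on every commit adds up quickly, which is why comparing only changed pages or moving to a local model is attractive.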
Test-Driven Development with AI
What: Developer describes a feature in natural language → AI generates the test → AI generates the code → AI runs the test → if it passes, commit.
This is already partially possible with Claude Code:
- "Write a test for the new user registration flow"
- Claude generates a YAML test definition + vibe-check commands
- "Now implement the registration page to pass this test"
- Claude writes the code
- Claude runs the test and iterates
Industry Convergence Points
Protocol: Everyone Is Moving to BiDi
2023: Puppeteer adds BiDi ───────────────────┐
2024: Selenium 4 adds BiDi ──────────────────┤
2025: Vibium launches on BiDi ───────────────┤── W3C WebDriver BiDi
2026: Playwright acknowledges BiDi future ───┘
Within 2-3 years, BiDi will likely be the dominant protocol.
Interface: Skills Are Becoming Standard
2025: Agent Skills concept introduced ──┐
2025: Skills Directory (skills.sh) ─────┤
2025: Vibium vibe-check skill ──────────┤── Skills as standard agent interface
2026: 50,000+ skills in directory ──────┤
2026: Playwright docs endorse CLI ──────┘
Intelligence: From Scripts to Reasoning
2004: Scripts ─────────────────────────────┐
2015: Page Objects ────────────────────────┤
2020: Self-healing (rule-based) ───────────┤── Testing intelligence spectrum
2025: Agent reasoning (LLM-based) ─────────┤
2027: Autonomous generation ───────────────┘
What This Means for Your Career
Skills to Develop
- AI agent architectures — ReAct patterns, multi-agent systems, tool use
- Prompt engineering for testing — Writing effective test definitions and skill files
- WebDriver BiDi protocol — The technical standard underneath
- Token economics — Understanding and optimizing AI costs
- CI/CD for AI workflows — Running agent-driven tests in pipelines
Skills That Are Declining
- Manual selector management — AI finds elements
- Explicit wait strategies — Actionability checks handle this
- Page Object boilerplate — Agent reasoning replaces abstractions
- Browser-specific workarounds — BiDi standardization eliminates these
The QA Engineer of 2028
Instead of writing and maintaining test scripts, you'll:
- Define testing intent in natural language
- Curate AI-generated test suites
- Review self-healing decisions
- Manage AI test infrastructure costs
- Design test strategies that agents can execute
- Interpret AI-generated failure analysis
The job title might change from "QA Automation Engineer" to "AI Test Architect" or "Test Intelligence Engineer."
Interview Talking Point
"The industry is converging on three things: WebDriver BiDi as the standard protocol, CLI skills as the standard agent interface, and LLM reasoning as the standard intelligence layer. Vibium is positioned at the intersection of all three. The near-term developments I'm most excited about are AI-powered locators (eliminating selector maintenance) and application memory via Cortex (so agents don't rediscover flows). The long-term direction is autonomous test generation — the agent explores the app and creates the test suite. Our role as QA engineers is evolving from writing tests to architecting the AI systems that generate, execute, and maintain tests."