Future Directions: Where AI-Driven Testing Is Heading

Near-Term (2026-2027)

AI-Powered Locators

What: Instead of CSS selectors, you describe elements in natural language:

await vibe.do("click the login button");
await vibe.check("verify the dashboard loaded");
const el = await vibe.find("the blue submit button");

Status: On Vibium's V2 roadmap. Open questions include:

Local vision model (Qwen-VL) vs API (Claude Vision)?
Screenshot → model → coordinates, or DOM → model → selector?
How to handle ambiguity ("the button" when there are 5)?
Caching/memoization of element locations?
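
The last two open questions (ambiguity and caching) can be explored without a real model. The following is a minimal sketch, not Vibium's design: `LocatorCache`, `Candidate`, `Resolver`, and the 0.2 confidence margin are all hypothetical names and values chosen for illustration, and the resolver is a stub standing in for whichever vision or DOM model is eventually used.

```typescript
// Hypothetical sketch: none of these names come from Vibium.
interface Candidate {
  selector: string; // concrete selector the model proposed
  score: number; // model confidence in [0, 1]
}

type Resolution =
  | { kind: "found"; selector: string; fromCache: boolean }
  | { kind: "ambiguous"; candidates: Candidate[] }
  | { kind: "missing" };

type Resolver = (description: string) => Candidate[];

class LocatorCache {
  private readonly cache = new Map<string, string>();
  private readonly resolve: Resolver;
  private readonly margin: number;
  modelCalls = 0;

  constructor(resolve: Resolver, margin = 0.2) {
    this.resolve = resolve;
    this.margin = margin;
  }

  find(pageUrl: string, description: string): Resolution {
    // Key on page + normalized description so repeat lookups skip the model.
    const key = `${pageUrl}\n${description.trim().toLowerCase()}`;
    const cached = this.cache.get(key);
    if (cached !== undefined) {
      return { kind: "found", selector: cached, fromCache: true };
    }

    this.modelCalls += 1; // only cache misses reach the (expensive) model
    const ranked = [...this.resolve(description)].sort((a, b) => b.score - a.score);
    const [best, runnerUp] = ranked;
    if (best === undefined) return { kind: "missing" };

    // Treat a close runner-up as ambiguity instead of guessing.
    if (runnerUp !== undefined && best.score - runnerUp.score < this.margin) {
      return { kind: "ambiguous", candidates: ranked };
    }
    this.cache.set(key, best.selector);
    return { kind: "found", selector: best.selector, fromCache: false };
  }
}

// Stub resolver standing in for a vision or DOM model.
const stubResolver: Resolver = (description) =>
  description.includes("login")
    ? [{ selector: "#login", score: 0.95 }, { selector: "#signup", score: 0.3 }]
    : [{ selector: "button.a", score: 0.6 }, { selector: "button.b", score: 0.55 }];

const locators = new LocatorCache(stubResolver);
console.log(locators.find("/home", "the login button").kind); // found
console.log(locators.find("/home", "the button").kind); // ambiguous
```

Ambiguous results are deliberately not cached, so a later, more specific description ("the blue submit button") still gets a fresh model call. A real cache would also need invalidation when the page changes.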

Impact on QA: Targets what is usually the largest maintenance burden in UI test suites: brittle CSS selectors. Test steps can be written entirely in natural language.

Timeline: 2026-2027 (high uncertainty — Vibium estimates 3-6 weeks of research)

Cortex: Application Memory

What: SQLite-backed datastore that builds an "app map" — a persistent model of the application's pages, flows, and elements.

How it helps:

Agent doesn't rediscover the same flows across sessions
Pathfinding: "How do I get from the login page to the settings page?" → Dijkstra shortest path
Test intelligence: "This page used to have 5 buttons, now it has 4" → automatic regression detection
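
The pathfinding idea can be sketched as Dijkstra's algorithm over a graph whose nodes are pages and whose edges are actions. This is an illustration under assumptions, not Cortex's implementation: the `AppMap` and `Edge` shapes, the page names, and the idea of using a numeric `cost` per action (for example click count or observed latency) are all hypothetical.

```typescript
// Hypothetical app-map shape: page name -> outgoing actions.
interface Edge {
  to: string; // destination page
  action: string; // natural-language action that performs the transition
  cost: number; // assumed non-negative weight, e.g. clicks or seconds
}
type AppMap = Map<string, Edge[]>;

interface Route {
  actions: string[];
  cost: number;
}

function shortestPath(map: AppMap, start: string, goal: string): Route | null {
  const dist = new Map<string, number>([[start, 0]]);
  const prev = new Map<string, { from: string; action: string }>();
  const done = new Set<string>();

  while (true) {
    // Pick the unvisited page with the smallest known distance.
    // A linear scan keeps the sketch short; a heap would scale better.
    let current: string | null = null;
    let best = Infinity;
    for (const [page, d] of Array.from(dist)) {
      if (!done.has(page) && d < best) {
        best = d;
        current = page;
      }
    }
    if (current === null) return null; // goal is unreachable
    if (current === goal) break;
    done.add(current);

    // Relax every outgoing edge of the chosen page.
    for (const edge of map.get(current) ?? []) {
      const candidate = best + edge.cost;
      if (candidate < (dist.get(edge.to) ?? Infinity)) {
        dist.set(edge.to, candidate);
        prev.set(edge.to, { from: current, action: edge.action });
      }
    }
  }

  // Walk the predecessor links back from the goal to recover the actions.
  const actions: string[] = [];
  let page = goal;
  while (page !== start) {
    const step = prev.get(page);
    if (step === undefined) return null;
    actions.unshift(step.action);
    page = step.from;
  }
  return { actions, cost: dist.get(goal) ?? 0 };
}

const appMap: AppMap = new Map([
  ["login", [
    { to: "dashboard", action: "submit credentials", cost: 1 },
    { to: "help", action: "click the help link", cost: 1 },
  ]],
  ["dashboard", [{ to: "settings", action: "click the gear icon", cost: 1 }]],
  ["help", [{ to: "settings", action: "follow the settings guide", cost: 5 }]],
]);

console.log(shortestPath(appMap, "login", "settings"));
```

With these weights the route through the dashboard (total cost 2) beats the route through the help page (total cost 6).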

Components:

SQLite database with schema for pages, actions, sessions
sqlite-vec for embedding-based search
Graph builder and pathfinding
MCP server for agent queries
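
The "5 buttons, now 4" regression check described earlier reduces to diffing two stored snapshots of the same page. A minimal sketch, assuming a hypothetical `PageSnapshot` record that stores element counts per role; the real schema for pages, actions, and sessions is not specified in the roadmap.

```typescript
// Hypothetical snapshot record: element role -> how many were seen on the page.
interface PageSnapshot {
  url: string;
  elements: Record<string, number>;
}

function diffSnapshots(before: PageSnapshot, after: PageSnapshot): string[] {
  const roles = Array.from(
    new Set([...Object.keys(before.elements), ...Object.keys(after.elements)]),
  ).sort();

  const findings: string[] = [];
  for (const role of roles) {
    const was = before.elements[role] ?? 0; // role absent means zero elements
    const now = after.elements[role] ?? 0;
    if (was !== now) {
      findings.push(`${after.url}: ${role} count changed from ${was} to ${now}`);
    }
  }
  return findings;
}

const lastWeek: PageSnapshot = { url: "/settings", elements: { button: 5, link: 3 } };
const today: PageSnapshot = { url: "/settings", elements: { button: 4, link: 3, input: 1 } };

console.log(diffSnapshots(lastWeek, today));
```

A count diff only flags that something changed. Deciding whether the change is a regression or an intended redesign still needs a human or a further reasoning step.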

When to build (from Vibium's roadmap):

"When users report that agents are repeatedly rediscovering the same flows, losing context across sessions, or unable to plan multi-step navigation."

Retina: Passive Observation

What: Chrome extension that passively records ALL browser activity, regardless of what's driving it.

Use cases:

Record human testing sessions for replay
Debug what happened during agent runs
Generate training data from human interaction patterns

Components:

Chrome Manifest V3 extension
Click/keypress/navigation listeners
DOM snapshot capture
Screenshot capture
JSONL export to Cortex
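
JSONL means one JSON object per line, which makes recordings easy to append to and stream. The sketch below shows the round trip; the `RecordedEvent` fields are assumptions for illustration, since the actual event format Retina would emit is not documented here.

```typescript
// Hypothetical event shape; the real Retina schema may differ.
interface RecordedEvent {
  ts: number; // milliseconds since the Unix epoch
  type: "click" | "keypress" | "navigation";
  url: string;
  target?: string; // description or selector of the element, if any
  key?: string; // key pressed, for keypress events
}

// Serialize: one JSON document per line, each terminated by "\n".
// JSON.stringify escapes newlines inside strings, so a record never spans lines.
function toJsonl(events: RecordedEvent[]): string {
  return events.map((event) => JSON.stringify(event) + "\n").join("");
}

// Parse: skip blank lines so a trailing newline is harmless.
function fromJsonl(text: string): RecordedEvent[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RecordedEvent);
}

const session: RecordedEvent[] = [
  { ts: 1767225600000, type: "navigation", url: "https://example.test/login" },
  { ts: 1767225601200, type: "click", url: "https://example.test/login", target: "Sign in\nbutton" },
  { ts: 1767225602500, type: "keypress", url: "https://example.test/login", key: "Enter" },
];

const exported = toJsonl(session);
console.log(exported);
console.log(fromJsonl(exported).length);
```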

Video Recording

What: Built-in screen recording of browser sessions.

Why it matters for QA: Video artifacts for:

Test failure debugging (see the exact visual sequence)
Demo generation (automated demos of features)
Compliance/audit trails (proof of testing)

Implementation: Screenshot capture at ~10fps, encoded to MP4/WebM via FFmpeg.
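
To make the numbers concrete: 10fps means one screenshot every 100 ms, and the saved frames can be stitched together with FFmpeg's image-sequence input. The sketch below only builds the argument list; the helper names, the frame naming pattern, and the codec choices (`libx264` for MP4, `libvpx-vp9` for WebM) are assumptions, not Vibium's actual pipeline.

```typescript
// Hypothetical options for a screenshot-based recorder.
interface RecordingOptions {
  fps: number; // capture rate, e.g. 10
  framePattern: string; // printf-style pattern of the saved frames
  output: string; // target file; the extension selects the codec
}

// Delay between screenshots needed to hit the requested frame rate.
function frameIntervalMs(fps: number): number {
  return 1000 / fps;
}

// Build the ffmpeg arguments that encode a numbered image sequence.
function ffmpegArgs(opts: RecordingOptions): string[] {
  const codec = opts.output.endsWith(".webm") ? "libvpx-vp9" : "libx264";
  return [
    "-y", // overwrite the output file if it exists
    "-framerate", String(opts.fps), // input rate of the image sequence
    "-i", opts.framePattern,
    "-c:v", codec,
    "-pix_fmt", "yuv420p", // widely supported pixel format for playback
    opts.output,
  ];
}

const recording: RecordingOptions = {
  fps: 10,
  framePattern: "frames/frame-%05d.png",
  output: "run.mp4",
};

console.log(frameIntervalMs(recording.fps)); // 100
console.log(["ffmpeg", ...ffmpegArgs(recording)].join(" "));
```

The resulting array can be passed to whatever process-spawning API the host runtime provides.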

Medium-Term (2027-2028)

Autonomous Test Generation

What: The agent explores the application and generates test cases automatically, without human-written test definitions.

How it works:

Agent navigates the entire application using vibe-check
Cortex builds the app map
Agent identifies critical paths (login → checkout → confirmation)
Agent generates test definitions for each path
Agent executes generated tests and refines based on results
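
The step from "critical path" to "test definition" can be pictured as a simple transformation. This is a speculative sketch: `PathStep`, `TestDefinition`, and the wording of the generated steps are invented for illustration, and a real agent would likely generate richer assertions than "the page loaded".

```typescript
// Hypothetical shapes for a discovered path and the test generated from it.
interface PathStep {
  from: string; // page where the action starts
  action: string; // natural-language action taken
  to: string; // page the action leads to
}

interface TestDefinition {
  name: string;
  steps: string[];
}

function testFromPath(path: PathStep[]): TestDefinition {
  const [first] = path;
  if (first === undefined) {
    throw new Error("a critical path needs at least one step");
  }
  // Name the test after the pages it visits, in order.
  const pages = [first.from, ...path.map((step) => step.to)];
  // Pair every action with a check that the expected page was reached.
  const steps = path.map((step) => `${step.action}, then verify the ${step.to} page loaded`);
  return {
    name: pages.join(" -> "),
    steps: [`open the ${first.from} page`, ...steps],
  };
}

const checkoutPath: PathStep[] = [
  { from: "login", action: "sign in and open the cart", to: "checkout" },
  { from: "checkout", action: "submit the order", to: "confirmation" },
];

console.log(testFromPath(checkoutPath));
```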

Example from OpenObserve: They built a "Council of Sub-Agents" — 8 specialized AI agents that:

Analyze new features from code diffs
Generate test scenarios
Write and execute tests
Review results
Reported outcome: coverage grew from 380 to 700+ tests

Impact: QA engineers shift from writing tests to reviewing and curating AI-generated tests.

Multi-Browser Testing via BiDi

What: As Firefox, Edge, and Safari complete BiDi support, Vibium will be able to drive all major browsers through the same protocol.

Current state (2026):

Chrome: Full BiDi support
Firefox: Full BiDi support (native, no separate driver!)
Edge: Full support (Chromium-based)
Safari: Partial (improving each release)

Impact: Cross-browser testing without browser-specific code. One test, four browsers, same protocol (with Safari limited to the BiDi features it currently implements).

Network Interception

What: Capture and mock network requests during testing.

Use cases:

Mock API responses for deterministic testing
Capture HAR files for debugging
Simulate slow networks or error conditions
Verify API calls are made correctly

Status: BiDi network module is being standardized. Vibium V2 plans to implement it.
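
For a sense of what mocking looks like at the protocol level, the sketch below builds the two JSON commands involved: `network.addIntercept` to pause matching requests, and `network.provideResponse` to answer a paused request with a canned body. The method and field names follow my reading of the W3C WebDriver BiDi draft and should be checked against the current spec; the helper functions, the `/api/users` path, and the request ID are hypothetical, and none of this is Vibium's API.

```typescript
// Generic shape of a WebDriver BiDi command sent over the WebSocket.
interface BidiCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Ask the browser to pause requests whose path matches, before they are sent.
function addIntercept(id: number, pathname: string): BidiCommand {
  return {
    id,
    method: "network.addIntercept",
    params: {
      phases: ["beforeRequestSent"],
      urlPatterns: [{ type: "pattern", pathname }],
    },
  };
}

// Answer a paused request with a canned JSON body instead of hitting the server.
// `requestId` would come from the network.beforeRequestSent event for that request.
function provideJsonResponse(
  id: number,
  requestId: string,
  statusCode: number,
  body: unknown,
): BidiCommand {
  return {
    id,
    method: "network.provideResponse",
    params: {
      request: requestId,
      statusCode,
      headers: [
        { name: "content-type", value: { type: "string", value: "application/json" } },
      ],
      body: { type: "string", value: JSON.stringify(body) },
    },
  };
}

const intercept = addIntercept(1, "/api/users");
const mocked = provideJsonResponse(2, "request-id-from-event", 200, [{ id: 1, name: "Ada" }]);

// Each command is sent as one JSON text frame.
console.log(JSON.stringify(intercept));
console.log(JSON.stringify(mocked));
```

Simulating an error condition would follow the same pattern with a non-2xx `statusCode`.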

Long-Term (2028+)

Self-Writing Test Suites

What: The AI continuously monitors the application, detects changes, generates tests, runs them, and reports issues, with no human involved until a developer fixes the reported bugs.

The loop:

Application updated →
  Agent detects changes →
    Agent generates new tests →
      Agent runs all tests →
        Agent reports issues →
          Developer fixes bugs →
            Application updated → (repeat)

Visual Testing at Scale

What: Vision models compare screenshots across versions, browsers, and devices to detect visual regressions.

Why it's hard today: Vision API calls are slow and costly ($0.01-0.05 per screenshot), and they require careful prompting. As local vision models improve and costs decrease, this becomes practical at scale.
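
A quick calculation shows why per-screenshot pricing matters at scale. The sketch below uses the $0.01-0.05 range quoted above (held as whole cents to avoid floating-point drift); the suite size of 200 screens, 4 browsers, and 3 devices is a made-up example, and `VisualRun` and `costRangeUsd` are hypothetical names.

```typescript
// Hypothetical description of one full visual-regression run.
interface VisualRun {
  screens: number; // distinct pages or states captured
  browsers: number;
  devices: number; // device or viewport profiles
}

interface CostRange {
  calls: number;
  minUsd: number;
  maxUsd: number;
}

// One vision API call per screenshot; prices default to 1-5 cents per call.
function costRangeUsd(run: VisualRun, minCents = 1, maxCents = 5): CostRange {
  const calls = run.screens * run.browsers * run.devices;
  return {
    calls,
    minUsd: (calls * minCents) / 100,
    maxUsd: (calls * maxCents) / 100,
  };
}

const nightly: VisualRun = { screens: 200, browsers: 4, devices: 3 };
const estimate = costRangeUsd(nightly);

// 200 * 4 * 3 = 2400 calls, so $24 to $120 per run.
console.log(estimate);
```

Run nightly, that hypothetical suite would cost roughly $720 to $3,600 over 30 days, before counting the time the API calls take.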

Test-Driven Development with AI

What: Developer describes a feature in natural language → AI generates the test → AI generates the code → AI runs the test → if it passes, commit.

This is already partially possible with Claude Code:

"Write a test for the new user registration flow"
Claude generates a YAML test definition + vibe-check commands
"Now implement the registration page to pass this test"
Claude writes the code
Claude runs the test and iterates

Industry Convergence Points

Protocol: Everyone Is Moving to BiDi

2023: Puppeteer adds BiDi ───────────────────┐
2024: Selenium 4 adds BiDi ──────────────────┤
2025: Vibium launches on BiDi ───────────────┤── W3C WebDriver BiDi
2026: Playwright acknowledges BiDi future ───┘

Within 2-3 years, BiDi will likely be the dominant protocol.

Interface: Skills Are Becoming Standard

2025: Agent Skills concept introduced ──┐
2025: Skills Directory (skills.sh) ─────┤
2025: Vibium vibe-check skill ──────────┤── Skills as standard
2026: 50,000+ skills in directory ──────┤   agent interface
2026: Playwright docs endorse CLI ──────┘

Intelligence: From Scripts to Reasoning

2004: Scripts ─────────────────────────────┐
2015: Page Objects ────────────────────────┤
2020: Self-healing (rule-based) ───────────┤── Testing intelligence
2025: Agent reasoning (LLM-based) ─────────┤   spectrum
2027: Autonomous generation ───────────────┘

What This Means for Your Career

Skills to Develop

AI agent architectures — ReAct patterns, multi-agent systems, tool use
Prompt engineering for testing — Writing effective test definitions and skill files
WebDriver BiDi protocol — The technical standard underneath
Token economics — Understanding and optimizing AI costs
CI/CD for AI workflows — Running agent-driven tests in pipelines

Skills That Are Declining

Manual selector management — AI finds elements
Explicit wait strategies — Actionability checks handle this
Page Object boilerplate — Agent reasoning replaces abstractions
Browser-specific workarounds — BiDi standardization eliminates these

The QA Engineer of 2028

Instead of writing and maintaining test scripts, you'll:

Define testing intent in natural language
Curate AI-generated test suites
Review self-healing decisions
Manage AI test infrastructure costs
Design test strategies that agents can execute
Interpret AI-generated failure analysis

The job title might change from "QA Automation Engineer" to "AI Test Architect" or "Test Intelligence Engineer."

Interview Talking Point

"The industry is converging on three things: WebDriver BiDi as the standard protocol, CLI skills as the standard agent interface, and LLM reasoning as the standard intelligence layer. Vibium is positioned at the intersection of all three. The near-term developments I'm most excited about are AI-powered locators (eliminating selector maintenance) and application memory via Cortex (so agents don't rediscover flows). The long-term direction is autonomous test generation — the agent explores the app and creates the test suite. Our role as QA engineers is evolving from writing tests to architecting the AI systems that generate, execute, and maintain tests."