
WebDriver BiDi Protocol Overview

What It Is

WebDriver BiDi (Bidirectional) is a W3C specification for browser automation. It combines:

  • The standardization of WebDriver (W3C, cross-browser)
  • The bidirectionality of CDP (WebSocket, events, real-time)

Message Format

All communication is JSON over WebSocket.

Client → Browser (Commands)

{
  "id": 1,
  "method": "browsingContext.navigate",
  "params": {
    "context": "context-id-123",
    "url": "https://example.com",
    "wait": "complete"
  }
}

Browser → Client (Responses)

{
  "id": 1,
  "type": "success",
  "result": {
    "navigation": "nav-id-456",
    "url": "https://example.com"
  }
}

Browser → Client (Events — Unsolicited)

{
  "type": "event",
  "method": "log.entryAdded",
  "params": {
    "level": "error",
    "text": "Uncaught TypeError: Cannot read property 'foo' of undefined",
    "timestamp": 1707500000000
  }
}

Events are pushed by the browser without the client asking — this is the "bidirectional" part.
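
To make the three message shapes concrete, here is a minimal Python sketch of how a client might tell them apart and match responses to the commands it sent. The names `classify` and `Dispatcher` are illustrative and not part of any BiDi library. The sketch relies on the `type` field (`success`, `error`, or `event`) that the specification puts on browser-to-client messages, and falls back to the presence of `id` when a message is abbreviated.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def classify(message: Message) -> str:
    """Return 'success', 'error', or 'event' for an incoming BiDi message."""
    kind = message.get("type")
    if kind in ("success", "error", "event"):
        return kind
    # Fallback for abbreviated messages: responses echo a command id, events do not.
    return "success" if "id" in message else "event"


class Dispatcher:
    """Matches responses to pending commands by id and fans events out to handlers."""

    def __init__(self) -> None:
        self.pending: Dict[int, str] = {}       # command id -> method that was sent
        self.results: Dict[int, Message] = {}   # command id -> response received
        self.handlers: Dict[str, List[Callable[[Message], None]]] = {}

    def sent(self, command: Message) -> None:
        """Record a command the client has just written to the socket."""
        self.pending[command["id"]] = command["method"]

    def on(self, method: str, handler: Callable[[Message], None]) -> None:
        """Register a handler for one event name, e.g. 'log.entryAdded'."""
        self.handlers.setdefault(method, []).append(handler)

    def receive(self, message: Message) -> str:
        """Route one decoded message and return its kind."""
        kind = classify(message)
        if kind == "event":
            for handler in self.handlers.get(message["method"], []):
                handler(message.get("params", {}))
        else:
            command_id = message.get("id")  # may be null on some error responses
            if command_id is not None:
                self.pending.pop(command_id, None)
                self.results[command_id] = message
        return kind
```

Because events carry no `id`, they can arrive between a command and its response without disturbing the bookkeeping in `pending`.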


Core Modules

Session Module

Manages the connection between client and browser.

// Create session
{"id": 1, "method": "session.new", "params": {"capabilities": {}}}

// Subscribe to events
{"id": 2, "method": "session.subscribe", "params": {"events": ["log.entryAdded"]}}
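
As a rough illustration of how a client could produce these commands, the Python sketch below assigns increasing `id` values and builds the two session commands shown above. `CommandFactory` and its method names are hypothetical helpers invented for this example; only the `method` strings and `params` keys come from the protocol.

```python
import itertools
import json
from typing import Any, Dict, List, Optional


class CommandFactory:
    """Builds BiDi command dicts with unique, increasing ids."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)

    def build(self, method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        """Wrap a method name and params in the id/method/params envelope."""
        return {"id": next(self._ids), "method": method, "params": params or {}}

    def new_session(self, capabilities: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        return self.build("session.new", {"capabilities": capabilities or {}})

    def subscribe(self, events: List[str], contexts: Optional[List[str]] = None) -> Dict[str, Any]:
        params: Dict[str, Any] = {"events": list(events)}
        if contexts:  # omit the key entirely for a global subscription
            params["contexts"] = list(contexts)
        return self.build("session.subscribe", params)


if __name__ == "__main__":
    factory = CommandFactory()
    # Each dict would be serialized and written to the WebSocket as one text frame.
    print(json.dumps(factory.new_session()))
    print(json.dumps(factory.subscribe(["log.entryAdded"])))
```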

Browsing Context Module

Manages tabs, iframes, and navigation.

// Navigate
{"id": 3, "method": "browsingContext.navigate", "params": {"context": "ctx-1", "url": "https://example.com"}}

// Get tree (all tabs and iframes)
{"id": 4, "method": "browsingContext.getTree", "params": {}}

// Create new tab
{"id": 5, "method": "browsingContext.create", "params": {"type": "tab"}}

// Close tab
{"id": 6, "method": "browsingContext.close", "params": {"context": "ctx-1"}}

Script Module

Executes JavaScript in a browsing context.

// Evaluate expression
{"id": 7, "method": "script.evaluate", "params": {
  "expression": "document.title",
  "target": {"context": "ctx-1"},
  "awaitPromise": true
}}

// Call function (more powerful - can pass arguments)
{"id": 8, "method": "script.callFunction", "params": {
  "functionDeclaration": "(selector) => document.querySelector(selector)?.textContent",
  "arguments": [{"type": "string", "value": "h1"}],
  "target": {"context": "ctx-1"},
  "awaitPromise": true
}}
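
The `arguments` array above uses the protocol's typed value encoding rather than raw JSON. The following Python sketch shows one way a client might convert ordinary values into that encoding before calling `script.callFunction`. The helper names `to_local_value` and `call_function` are assumptions made for this example, and the sketch covers only primitives, arrays, and plain objects; it does not handle `bigint`, negative zero, dates, or remote references.

```python
import math
from typing import Any, Dict


def to_local_value(value: Any) -> Dict[str, Any]:
    """Encode a Python value in the typed form used for script arguments."""
    if value is None:
        return {"type": "null"}
    if isinstance(value, bool):  # must precede the int check: bool is an int subclass
        return {"type": "boolean", "value": value}
    if isinstance(value, (int, float)):
        if isinstance(value, float) and math.isnan(value):
            return {"type": "number", "value": "NaN"}
        if isinstance(value, float) and math.isinf(value):
            return {"type": "number", "value": "Infinity" if value > 0 else "-Infinity"}
        return {"type": "number", "value": value}
    if isinstance(value, str):
        return {"type": "string", "value": value}
    if isinstance(value, (list, tuple)):
        return {"type": "array", "value": [to_local_value(v) for v in value]}
    if isinstance(value, dict):
        # Objects are encoded as a list of [key, value] pairs.
        return {"type": "object", "value": [[str(k), to_local_value(v)] for k, v in value.items()]}
    raise TypeError(f"cannot encode {type(value).__name__} as a script argument")


def call_function(command_id: int, context: str, declaration: str, *args: Any) -> Dict[str, Any]:
    """Build a script.callFunction command with encoded arguments."""
    return {
        "id": command_id,
        "method": "script.callFunction",
        "params": {
            "functionDeclaration": declaration,
            "arguments": [to_local_value(a) for a in args],
            "target": {"context": context},
            "awaitPromise": True,
        },
    }
```

For instance, `call_function(8, "ctx-1", "(selector) => document.querySelector(selector)?.textContent", "h1")` yields the same command as the example above.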

Input Module

Simulate user input (mouse, keyboard).

// Click at coordinates
{"id": 9, "method": "input.performActions", "params": {
  "context": "ctx-1",
  "actions": [{
    "type": "pointer",
    "id": "mouse",
    "actions": [
      {"type": "pointerMove", "x": 200, "y": 300},
      {"type": "pointerDown", "button": 0},
      {"type": "pointerUp", "button": 0}
    ]
  }]
}}

// Type text
{"id": 10, "method": "input.performActions", "params": {
  "context": "ctx-1",
  "actions": [{
    "type": "key",
    "id": "keyboard",
    "actions": [
      {"type": "keyDown", "value": "H"},
      {"type": "keyUp", "value": "H"},
      {"type": "keyDown", "value": "i"},
      {"type": "keyUp", "value": "i"}
    ]
  }]
}}
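
Writing action lists by hand gets tedious quickly, so clients usually generate them. The Python sketch below builds the same click and typing commands from simple inputs. `click_at` and `type_text` are hypothetical helper names, and the source ids `"mouse"` and `"keyboard"` are simply the ones used in the examples above.

```python
from typing import Any, Dict, List


def _perform_actions(command_id: int, context: str, source: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap one input source in an input.performActions command."""
    return {
        "id": command_id,
        "method": "input.performActions",
        "params": {"context": context, "actions": [source]},
    }


def click_at(command_id: int, context: str, x: int, y: int, button: int = 0) -> Dict[str, Any]:
    """Move the pointer to (x, y), then press and release the given button."""
    steps: List[Dict[str, Any]] = [
        {"type": "pointerMove", "x": x, "y": y},
        {"type": "pointerDown", "button": button},
        {"type": "pointerUp", "button": button},
    ]
    return _perform_actions(command_id, context, {"type": "pointer", "id": "mouse", "actions": steps})


def type_text(command_id: int, context: str, text: str) -> Dict[str, Any]:
    """Emit a keyDown/keyUp pair for every character in text."""
    steps: List[Dict[str, Any]] = []
    for char in text:
        steps.append({"type": "keyDown", "value": char})
        steps.append({"type": "keyUp", "value": char})
    return _perform_actions(command_id, context, {"type": "key", "id": "keyboard", "actions": steps})
```

`type_text(10, "ctx-1", "Hi")` reproduces the four key actions in the typing example.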

Network Module

Intercept and inspect network requests.

// Add an intercept: matching requests pause at this phase until the client continues or fails them
{"id": 11, "method": "network.addIntercept", "params": {
  "phases": ["beforeRequestSent"]
}}

// Event: request made
{"type": "event", "method": "network.beforeRequestSent", "params": {
  "request": {"url": "https://api.example.com/data", "method": "GET"}
}}
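
Once an intercept is registered, a paused request stays paused until the client responds to it. The Python sketch below shows one possible policy: fail requests to an unwanted host and let everything else through. It assumes the full event payload, which carries an `isBlocked` flag and a request id at `request.request` (both omitted from the abbreviated event above), and it uses the `network.continueRequest` and `network.failRequest` commands. The function name `decide` and the `blocked_hosts` parameter are invented for this example.

```python
from typing import Any, Dict, Iterable, Optional
from urllib.parse import urlparse


def decide(event_params: Dict[str, Any], next_id: int,
           blocked_hosts: Iterable[str] = ()) -> Optional[Dict[str, Any]]:
    """Return the command to send for a beforeRequestSent event, or None.

    None means the request was only observed (not paused by an intercept),
    so no reply is needed.
    """
    if not event_params.get("isBlocked"):
        return None
    request = event_params["request"]
    host = urlparse(request["url"]).hostname
    method = "network.failRequest" if host in set(blocked_hosts) else "network.continueRequest"
    # Both commands identify the paused request by its id.
    return {"id": next_id, "method": method, "params": {"request": request["request"]}}
```

A client that forgets to answer a blocked request will leave the page waiting on it, so a catch-all `network.continueRequest` branch like the one here is a sensible default.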

Log Module

Console log events pushed from the browser.

// Subscribe
{"id": 12, "method": "session.subscribe", "params": {"events": ["log.entryAdded"]}}

// Event received
{"type": "event", "method": "log.entryAdded", "params": {
  "level": "error",
  "text": "TypeError: x is not a function",
  "source": {"realm": "realm-1"},
  "timestamp": 1707500000000
}}

Sessions

Standard Session (via classic WebDriver HTTP)

Client                                        Browser
  │                                              │
  │── HTTP POST /session ───────────────────────►│
  │   {"capabilities": {"alwaysMatch":           │
  │     {"webSocketUrl": true}}}                 │
  │◄── HTTP 200 ─────────────────────────────────│
  │   {"value": {"sessionId": "...",             │
  │     "capabilities": {"webSocketUrl": "ws://..."}}}
  │                                              │
  │── WebSocket connect ────────────────────────►│
  │◄── WebSocket connected ──────────────────────│
  │                                              │
  │   Now use BiDi commands...                   │
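
The HTTP half of this handshake can be sketched in Python as two small pure functions: one builds the request body, the other pulls the WebSocket URL out of the response. The sketch assumes the classic WebDriver envelope, where requested capabilities sit under `alwaysMatch` and the response is wrapped in `value` with the URL inside `capabilities`; it also tolerates a flattened response shape. Both function names are hypothetical, and no HTTP or WebSocket library is involved.

```python
from typing import Any, Dict


def new_session_payload() -> Dict[str, Any]:
    """Body for POST /session that asks the driver to enable BiDi."""
    return {"capabilities": {"alwaysMatch": {"webSocketUrl": True}}}


def extract_websocket_url(response_body: Dict[str, Any]) -> str:
    """Find the BiDi WebSocket URL in a new-session response."""
    value = response_body.get("value", response_body)   # unwrap the classic envelope
    capabilities = value.get("capabilities", value)      # URL is returned as a capability
    url = capabilities.get("webSocketUrl")
    if not isinstance(url, str):
        # The driver returns a string URL only when BiDi is supported and was requested.
        raise ValueError("driver did not return a webSocketUrl; BiDi may be unsupported")
    return url
```

The returned URL is what the client would then pass to its WebSocket library for the second half of the diagram.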

BiDi-Only Session (direct WebSocket)

Client                                        Browser
  │                                              │
  │── WebSocket connect to ws://.../ ───────────►│
  │◄── WebSocket connected ──────────────────────│
  │                                              │
  │── session.new ──────────────────────────────►│
  │◄── session.new result ───────────────────────│
  │                                              │
  │   Now use BiDi commands...                   │

Key Concepts

Browsing Context

A "browsing context" is a tab or iframe. Each has a unique ID.

Browser
├── Context "ctx-1" (Tab 1 - https://example.com)
│   ├── Context "ctx-2" (iframe - ad)
│   └── Context "ctx-3" (iframe - widget)
└── Context "ctx-4" (Tab 2 - https://docs.example.com)
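
A `browsingContext.getTree` result nests iframes under their parent tab in a `children` list. As an illustration, the Python sketch below walks such a result and flattens it into rows that record each context's parent and depth. `flatten_contexts` is a hypothetical helper, and the sample data in the usage section simply mirrors the tree drawn above.

```python
from typing import Any, Dict, List, Optional


def flatten_contexts(tree_result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a getTree result into a depth-first list of contexts."""
    rows: List[Dict[str, Any]] = []

    def walk(node: Dict[str, Any], parent: Optional[str], depth: int) -> None:
        rows.append({
            "context": node["context"],
            "url": node.get("url"),
            "parent": parent,
            "depth": depth,
        })
        # children may be null when the tree was requested with a depth limit
        for child in node.get("children") or []:
            walk(child, node["context"], depth + 1)

    for top_level in tree_result.get("contexts", []):
        walk(top_level, None, 0)
    return rows


if __name__ == "__main__":
    sample = {"contexts": [
        {"context": "ctx-1", "url": "https://example.com", "children": [
            {"context": "ctx-2", "url": "about:blank", "children": []},
            {"context": "ctx-3", "url": "about:blank", "children": []},
        ]},
        {"context": "ctx-4", "url": "https://docs.example.com", "children": []},
    ]}
    for row in flatten_contexts(sample):
        print("  " * row["depth"] + row["context"])
```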

User Context

User contexts provide isolation (like incognito profiles):

// Create isolated context
{"id": 1, "method": "browser.createUserContext", "params": {}}

// Create tab in isolated context
{"id": 2, "method": "browsingContext.create", "params": {
  "type": "tab",
  "userContext": "user-ctx-1"
}}

Event Subscriptions

Clients must explicitly subscribe to events they want:

// Subscribe to console logs and network events
{"id": 1, "method": "session.subscribe", "params": {
  "events": ["log.entryAdded", "network.beforeRequestSent"],
  "contexts": ["ctx-1"]  // Optional: only for specific tab
}}

Browser Support (2026)

Browser   BiDi Support            Driver
Chrome    Full                    chromedriver
Firefox   Full (native)           No separate driver needed
Edge      Full (Chromium-based)   msedgedriver
Safari    Partial                 safaridriver

How Vibium Fits In

Vibium's clicker binary is a BiDi proxy that sits between clients and Chrome:

Client ──WebSocket──► Clicker Proxy ──WebSocket──► Chrome
                      │
                      ├── Forwards standard BiDi commands
                      ├── Intercepts vibium:* extension commands
                      ├── Runs actionability checks via script.callFunction
                      └── Performs clicks via input.performActions

The proxy pattern means:

  • Clients don't need to know Chrome's BiDi endpoint directly
  • Custom commands (vibium:click, etc.) are transparent to the client
  • The proxy can add features (auto-wait, screenshots) without changing the protocol
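
The routing decision at the heart of this pattern is small enough to sketch. The Python below is a conceptual illustration only, not Vibium's actual code (the real proxy is a Go binary): it forwards any standard command unchanged and maps a `vibium:click` command to the two standard BiDi methods named in the diagram. The function name `plan` and the error handling for unknown extensions are assumptions made for this example.

```python
from typing import Any, Dict, List

EXTENSION_PREFIX = "vibium:"


def plan(command: Dict[str, Any]) -> List[str]:
    """Return the BiDi methods a proxy would issue to the browser for one client command."""
    method = command.get("method", "")
    if not method.startswith(EXTENSION_PREFIX):
        return [method]  # standard BiDi: forwarded as-is
    if method == "vibium:click":
        # First check the element is actionable, then perform the click.
        return ["script.callFunction", "input.performActions"]
    raise ValueError(f"unknown extension command: {method}")
```

From the client's side both kinds of command look the same: one request goes out and one response comes back, regardless of how many browser calls the proxy made in between.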

Interview Talking Point

"WebDriver BiDi is the W3C's bidirectional successor to classic WebDriver and a standardized alternative to CDP. It uses WebSocket for bidirectional JSON messaging: the client can send commands and the browser can push events. The protocol is organized into modules, including session, browsing context, script, input, network, and log. Vibium uses BiDi as its transport layer, with the Go binary acting as a proxy that forwards standard commands and intercepts custom vibium:* extensions. The key advantage over CDP is standardization: BiDi is a W3C spec with buy-in from all major browser vendors, not a Chromium-specific protocol that can change without notice."