Vibium Extension Commands Over BiDi

The Extension Mechanism

The WebDriver BiDi specification explicitly supports implementation-defined extension modules. The naming convention requires a colon separator:

standard:    browsingContext.navigate
extension:   vibium:click

This is not a hack — it's a designed extension point in the W3C spec.
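As a minimal sketch of how a router could tell the two naming styles apart, the Go snippet below splits a method name on the colon or the dot. The helper name splitMethod and its return shape are illustrative assumptions, not part of Vibium or the BiDi specification.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMethod separates a method name into a module (or extension prefix)
// and a command. Names containing ":" are treated as extension commands;
// all others are treated as standard "module.command" names.
// Hypothetical helper for illustration only.
func splitMethod(method string) (module, command string, isExtension bool) {
	if prefix, rest, ok := strings.Cut(method, ":"); ok {
		return prefix, rest, true
	}
	if mod, rest, ok := strings.Cut(method, "."); ok {
		return mod, rest, false
	}
	return "", method, false
}

func main() {
	for _, m := range []string{"browsingContext.navigate", "vibium:click"} {
		mod, cmd, ext := splitMethod(m)
		fmt.Printf("%-26s module=%-16s command=%-9s extension=%v\n", m, mod, cmd, ext)
	}
}
```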


The Three Vibium Extension Commands

vibium:find

Purpose: Wait for an element to exist in the DOM.

Request:

{
  "id": 1,
  "method": "vibium:find",
  "params": {
    "context": "browsing-context-id",
    "selector": "button.submit",
    "timeout": 30000
  }
}

What the proxy does internally:

  1. Start a polling loop (100ms interval)
  2. On each iteration, send script.callFunction to Chrome:
    {
      "method": "script.callFunction",
      "params": {
        "functionDeclaration": "(s) => !!document.querySelector(s)",
        "arguments": [{"type": "string", "value": "button.submit"}],
        "target": {"context": "browsing-context-id"}
      }
    }
    
  3. If true, return success. If false, retry until timeout.

Success Response:

{"id": 1, "type": "success", "result": {"found": true}}

Timeout Response:

{
  "id": 1,
  "type": "error",
  "error": {"error": "timeout", "message": "timeout after 30s waiting for 'button.submit'"}
}

vibium:click

Purpose: Wait for an element to be actionable, then click it.

Request:

{
  "id": 2,
  "method": "vibium:click",
  "params": {
    "context": "browsing-context-id",
    "selector": "button.submit",
    "timeout": 30000
  }
}

What the proxy does internally:

Step 1: Find element (vibium:find behavior)
  → script.callFunction: document.querySelector(selector) exists?
  → Repeat until found or timeout

Step 2: Check Visible
  → script.callFunction: getBoundingClientRect + getComputedStyle
  → Must have non-zero size, not hidden

Step 3: Check Stable
  → script.callFunction: getBoundingClientRect at T
  → Wait 50ms
  → script.callFunction: getBoundingClientRect at T+50ms
  → Compare: must be identical

Step 4: Check ReceivesEvents
  → script.callFunction: elementFromPoint at center
  → Must hit the target element (or its child)

Step 5: Check Enabled
  → script.callFunction: check disabled, aria-disabled, fieldset
  → Must not be disabled

Step 6: Get bounding box
  → script.callFunction: getBoundingClientRect
  → Calculate center coordinates: (x + width/2, y + height/2)

Step 7: Perform click
  → input.performActions:
    pointerMove to (centerX, centerY)
    pointerDown button 0
    pointerUp button 0

Steps 1-5 run inside a polling loop. If any check fails, the proxy waits 100ms and restarts from step 1.
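The geometry in steps 3, 6 and 7 can be sketched in Go as below: comparing two bounding-box samples, computing the center, and building the pointer action sequence for input.performActions. The Rect type, the helper names, the input source id "mouse", and the rounding to whole pixels are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// Rect holds the fields of getBoundingClientRect used by the checks.
type Rect struct{ X, Y, Width, Height float64 }

// Center returns the click point from step 6: (x + width/2, y + height/2).
func (r Rect) Center() (x, y float64) { return r.X + r.Width/2, r.Y + r.Height/2 }

// Stable reports whether two samples taken 50ms apart are identical (step 3).
func Stable(a, b Rect) bool { return a == b }

// pointerActions builds the "actions" list for input.performActions (step 7):
// move to the point, press button 0, release button 0.
func pointerActions(x, y float64) []map[string]any {
	return []map[string]any{{
		"type": "pointer",
		"id":   "mouse", // arbitrary input source id
		"actions": []map[string]any{
			{"type": "pointerMove", "x": int(math.Round(x)), "y": int(math.Round(y))},
			{"type": "pointerDown", "button": 0},
			{"type": "pointerUp", "button": 0},
		},
	}}
}

func main() {
	first := Rect{X: 100, Y: 200, Width: 80, Height: 40}
	second := first // second sample, unchanged after 50ms

	if !Stable(first, second) {
		fmt.Println("element is still moving; retry")
		return
	}
	cx, cy := first.Center()
	fmt.Println("center:", cx, cy) // center: 140 220

	out, err := json.Marshal(pointerActions(cx, cy))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```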

Success Response:

{"id": 2, "type": "success", "result": {"clicked": true}}

vibium:type

Purpose: Wait for an element to be actionable AND editable, then type text.

Request:

{
  "id": 3,
  "method": "vibium:type",
  "params": {
    "context": "browsing-context-id",
    "selector": "input[name=email]",
    "text": "user@example.com",
    "timeout": 30000
  }
}

What the proxy does internally:

Same as vibium:click steps 1-5, PLUS:

Step 5b: Check Editable
  → script.callFunction: check readonly, aria-readonly, input type
  → Must accept text input

Step 6: Focus element
  → script.callFunction: document.querySelector(selector).focus()

Step 7: Clear existing text (if any)
  → input.performActions: Ctrl+A, then Delete

Step 8: Type text character by character
  → input.performActions:
    For each character in "user@example.com":
      keyDown character
      keyUp character

Character-by-character typing triggers all expected DOM events: keydown, keypress, input, keyup. This is critical for:

  • Form validation that runs on input events
  • Autocomplete that triggers on keystroke
  • Character count limits
  • Real-time search
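A rough Go sketch of step 8 is shown below: it expands a string into alternating keyDown/keyUp actions, one pair per character, and wraps them in a key input source for input.performActions. The names keyAction and keyActionsFor and the source id "keyboard" are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyAction is one entry in a key input source's action list.
type keyAction struct {
	Type  string `json:"type"`  // "keyDown" or "keyUp"
	Value string `json:"value"` // the character being pressed or released
}

// keyActionsFor returns a keyDown/keyUp pair for every character in text.
func keyActionsFor(text string) []keyAction {
	actions := make([]keyAction, 0, 2*len(text))
	for _, ch := range text { // range yields runes, so multi-byte characters stay whole
		actions = append(actions,
			keyAction{"keyDown", string(ch)},
			keyAction{"keyUp", string(ch)})
	}
	return actions
}

func main() {
	source := map[string]any{
		"type":    "key",
		"id":      "keyboard", // arbitrary input source id
		"actions": keyActionsFor("hi"),
	}
	out, err := json.Marshal(source)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```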

Why Not Standard BiDi Commands?

Standard BiDi has input.performActions for clicks and keyboard input. Why add custom commands?

Without Extension Commands (Client Must Implement)

Client                           Chrome
  │                                │
  │─ script.callFunction ─────────►│  (check if element exists)
  │◄─ result: false ───────────────│
  │   wait 100ms                   │
  │─ script.callFunction ─────────►│  (check again)
  │◄─ result: true ────────────────│
  │─ script.callFunction ─────────►│  (check visible)
  │◄─ result ──────────────────────│
  │─ script.callFunction ─────────►│  (check stable T1)
  │◄─ result ──────────────────────│
  │   wait 50ms                    │
  │─ script.callFunction ─────────►│  (check stable T2)
  │◄─ result ──────────────────────│
  │─ script.callFunction ─────────►│  (check receivesEvents)
  │◄─ result ──────────────────────│
  │─ script.callFunction ─────────►│  (check enabled)
  │◄─ result ──────────────────────│
  │─ script.callFunction ─────────►│  (get bounding box)
  │◄─ result ──────────────────────│
  │─ input.performActions ─────────►│  (click)
  │◄─ result ──────────────────────│

The sequence above takes 9 round trips (8 if the element is found on the first poll), each adding network latency if the client is remote.

With Extension Commands (Proxy Handles)

Client              Proxy             Chrome
  │                   │                  │
  │─ vibium:click ───►│                  │
  │                   │─ script.call ───►│  (all checks happen locally)
  │                   │◄─ result ────────│
  │                   │─ script.call ───►│
  │                   │◄─ result ────────│
  │                   │  ... (local loop)│
  │                   │─ input.perform ─►│
  │                   │◄─ result ────────│
  │◄─ success ────────│                  │

The client sends one message and gets one response. All the complexity lives in the proxy, which talks to Chrome over a local WebSocket, where latency is negligible.


The Implementation in Go

Located in clicker/internal/proxy/router.go:

func (r *Router) OnClientMessage(msg []byte) {
    var req BiDiMessage
    if err := json.Unmarshal(msg, &req); err != nil {
        // Not parseable as a BiDi message: pass it through unchanged
        r.forwardToChrome(msg)
        return
    }

    switch req.Method {
    case "vibium:find":
        r.handleVibiumFind(req)
    case "vibium:click":
        r.handleVibiumClick(req)
    case "vibium:type":
        r.handleVibiumType(req)
    default:
        r.forwardToChrome(msg)  // Standard command: pass through
    }
}

func (r *Router) handleVibiumClick(req BiDiMessage) {
    selector := req.Params.Selector
    timeout := req.Params.Timeout
    deadline := time.Now().Add(time.Duration(timeout) * time.Millisecond)
    lastFailedCheck := "" // most recent check that failed, reported on timeout

    for {
        if time.Now().After(deadline) {
            r.sendError(req.ID, "timeout", fmt.Sprintf(
                "timeout after %dms waiting for '%s': check '%s' failed",
                timeout, selector, lastFailedCheck))
            return
        }

        // Run all actionability checks via script.callFunction,
        // stopping at the first one that fails
        switch {
        case !r.checkVisible(selector):
            lastFailedCheck = "visible"
        case !r.checkStable(selector):
            lastFailedCheck = "stable"
        case !r.checkReceivesEvents(selector):
            lastFailedCheck = "receivesEvents"
        case !r.checkEnabled(selector):
            lastFailedCheck = "enabled"
        default:
            // All checks passed: perform the click
            box := r.getBoundingBox(selector)
            r.performClick(box.CenterX, box.CenterY)
            r.sendSuccess(req.ID, map[string]bool{"clicked": true})
            return
        }

        // A check failed: wait 100ms, then run the checks again
        time.Sleep(100 * time.Millisecond)
    }
}
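The router code reads req.ID, req.Method, req.Params.Selector and req.Params.Timeout, but the BiDiMessage type itself is not shown. One plausible shape, given the JSON requests earlier in this section, is sketched below; the field layout and the parseMessage helper are guesses for illustration, not the actual definition in router.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// BiDiMessage is a hypothetical definition matching the request examples.
type BiDiMessage struct {
	ID     int    `json:"id"`
	Method string `json:"method"`
	Params struct {
		Context  string `json:"context"`
		Selector string `json:"selector"`
		Text     string `json:"text,omitempty"` // only used by vibium:type
		Timeout  int    `json:"timeout"`        // milliseconds
	} `json:"params"`
}

// parseMessage decodes a raw client message and reports malformed JSON.
func parseMessage(raw []byte) (BiDiMessage, error) {
	var msg BiDiMessage
	err := json.Unmarshal(raw, &msg)
	return msg, err
}

func main() {
	raw := []byte(`{
	  "id": 2,
	  "method": "vibium:click",
	  "params": {"context": "browsing-context-id", "selector": "button.submit", "timeout": 30000}
	}`)

	msg, err := parseMessage(raw)
	if err != nil {
		fmt.Println("bad message:", err)
		return
	}
	timeout := time.Duration(msg.Params.Timeout) * time.Millisecond
	fmt.Println(msg.ID, msg.Method, msg.Params.Selector, timeout)
}
```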

Interview Talking Point

"Vibium uses WebDriver BiDi's extension mechanism (vibium:find, vibium:click, vibium:type) to push actionability logic into the proxy server. Without this, each click would require 8-9 WebSocket round trips between client and browser. With extension commands, the client sends one message and gets one response. The proxy handles the polling loop locally, where latency is negligible. This is a key architectural insight: by co-locating the intelligence with the browser connection, you get both simpler clients and lower latency. And it's not a protocol hack: BiDi explicitly supports extension modules with the colon naming convention."