# The ReAct Core Loop for Testing

## From Scripts to Agents: A Fundamental Shift
Traditional test automation is imperative: you write step-by-step scripts that execute deterministically. Agentic testing is fundamentally different. An agent observes the state of the system, reasons about what to do next, acts, and evaluates the outcome -- all within a loop that can adapt to unexpected situations.
ReAct (Reason + Act) is the foundational pattern for agentic systems. Originally described for general-purpose LLM agents, it maps directly to testing workflows.
## The Four Phases
```
+-----------------------------------------------------+
|                                                     |
|   OBSERVE --> THINK --> ACT --> EVALUATE --> LOOP   |
|      |          |        |         |                |
|   Read page   Decide   Execute  Check if            |
|   state,      what to  the      assertion           |
|   logs,       test     action   passed or           |
|   errors      next              failed              |
|                                                     |
+-----------------------------------------------------+
```
### Phase 1: OBSERVE
The agent gathers information about the current state of the system under test. In browser testing, this means reading the page content, URL, console errors, and network activity. In API testing, it means reading response status codes, headers, and bodies.
```python
from dataclasses import dataclass

@dataclass
class Observation:
    url: str                          # Current page URL
    text: str                         # Visible text content (truncated)
    errors: list[str]                 # Console errors
    network_requests: list[dict]      # Recent HTTP requests
    screenshot_path: str | None       # Path to current screenshot
    page_title: str                   # Document title
```
**Key insight:** The quality of the observation determines the quality of the agent's reasoning. A rich observation (text + errors + network) produces better decisions than a sparse one (just the URL).
### Phase 2: THINK
The agent analyzes the observation and decides what to do next. This is where the LLM's reasoning ability is essential. It considers:
- The test objective (what are we trying to verify?)
- The current state (where are we now?)
- The history (what have we already tried?)
- The constraints (how many steps remain? what actions are allowed?)
```python
prompt = f"""
Objective: {test_objective}
Current URL: {observation.url}
Page content (truncated): {observation.text[:2000]}
Console errors: {observation.errors}
History (last 5 actions): {self.history[-5:]}

What should I do next? Choose one:
- NAVIGATE <url>
- CLICK <selector>
- TYPE <selector> <text>
- ASSERT <condition>
- DONE <pass|fail> <reason>
"""
decision = self.llm.generate(prompt)
```
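The model's output is free text, so in practice it pays to validate the decision against the allowed action grammar before acting on it. A minimal sketch (the `parse_decision` helper and its argument-count table are illustrative assumptions, not part of the prompt format above):

```python
# Allowed verbs mapped to how many arguments each expects.
# TYPE and DONE take a trailing free-text argument, so we split at most
# once more after the verb and first argument.
ACTIONS = {
    "NAVIGATE": 1,
    "CLICK": 1,
    "TYPE": 2,
    "ASSERT": 1,
    "DONE": 2,
}

def parse_decision(decision: str) -> tuple[str, list[str]]:
    """Split an LLM decision into (verb, args), rejecting anything
    outside the allowed action grammar."""
    line = decision.strip().splitlines()[0]  # models sometimes append prose
    verb, _, rest = line.partition(" ")
    if verb not in ACTIONS:
        raise ValueError(f"Unknown action: {verb!r}")
    args = rest.split(" ", ACTIONS[verb] - 1) if rest else []
    if len(args) != ACTIONS[verb]:
        raise ValueError(
            f"{verb} expects {ACTIONS[verb]} argument(s), got {len(args)}")
    return verb, args
```

Rejecting malformed decisions early lets the agent re-prompt instead of executing garbage against the system under test.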
### Phase 3: ACT
The agent executes the decided action against the system under test. This is the only phase that changes state.
```python
def execute(self, decision: str) -> ActionResult:
    # ActionResult.success()/failure() are assumed helpers on the result type.
    if decision.startswith("NAVIGATE"):
        url = decision.split(" ", 1)[1]
        self.browser.navigate(url)
        return ActionResult.success()
    elif decision.startswith("CLICK"):
        selector = decision.split(" ", 1)[1]
        self.browser.click(selector)
        return ActionResult.success()
    elif decision.startswith("TYPE"):
        _, selector, text = decision.split(" ", 2)
        self.browser.type(selector, text)
        return ActionResult.success()
    elif decision.startswith("ASSERT"):
        condition = decision.split(" ", 1)[1]
        return self.evaluate_assertion(condition)
    elif decision.startswith("DONE"):
        return self.parse_final_result(decision)
    # The model produced something outside the action grammar
    return ActionResult.failure(f"Unrecognized action: {decision!r}")
```
### Phase 4: EVALUATE
After acting, the agent evaluates whether the action succeeded and whether the test objective is met. If the objective is not yet met, the loop continues.
```python
if result.is_terminal:
    return TestResult(
        status=result.status,
        reason=result.reason,
        steps_taken=self.step_count,
        history=self.history,
    )
# Otherwise, loop back to OBSERVE
```
## A Complete ReAct Agent Implementation
```python
class ReActTestAgent:
    def __init__(self, llm, browser, max_steps=20):
        self.llm = llm
        self.browser = browser
        self.max_steps = max_steps
        self.history = []

    def run(self, test_objective: str) -> TestResult:
        """Execute a test using the ReAct loop."""
        self.history = []  # Start each run with a clean history
        for step in range(self.max_steps):
            # OBSERVE
            observation = self.browser.get_state()

            # THINK
            prompt = f"""
Objective: {test_objective}
Current URL: {observation.url}
Page content (truncated): {observation.text[:2000]}
Console errors: {observation.errors}
History (last 5 actions): {self.history[-5:]}

What should I do next? Choose one:
- NAVIGATE <url>
- CLICK <selector>
- TYPE <selector> <text>
- ASSERT <condition>
- DONE <pass|fail> <reason>
"""
            decision = self.llm.generate(prompt)

            # ACT
            action_result = self.execute(decision)
            self.history.append({
                "step": step,
                "observation": observation.summary,
                "decision": decision,
                "result": action_result,
            })

            # EVALUATE
            if decision.startswith("DONE"):
                return self.parse_final_result(decision)

        return TestResult(
            status="TIMEOUT",
            reason=f"Exceeded {self.max_steps} steps",
        )
```
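To see the loop run end to end without a real model or browser, the sketch below wires a compressed version of the agent to hand-written stubs. `MiniReActAgent`, `StubLLM`, and `StubBrowser` are illustrative assumptions, and the ACT phase is elided to keep the sketch short:

```python
from dataclasses import dataclass

@dataclass
class TestResult:
    status: str
    reason: str
    steps_taken: int

class MiniReActAgent:
    """A trimmed-down ReAct loop: observe, think, record, evaluate."""
    def __init__(self, llm, browser, max_steps=20):
        self.llm, self.browser, self.max_steps = llm, browser, max_steps
        self.history = []

    def run(self, objective: str) -> TestResult:
        for step in range(self.max_steps):
            obs = self.browser.get_state()                   # OBSERVE
            decision = self.llm.generate(
                f"Objective: {objective}\nURL: {obs.url}\n"
                f"History: {self.history[-5:]}")             # THINK
            self.history.append(decision)                    # ACT (elided)
            if decision.startswith("DONE"):                  # EVALUATE
                _, verdict, reason = decision.split(" ", 2)
                return TestResult(verdict, reason, step + 1)
        return TestResult("TIMEOUT",
                          f"Exceeded {self.max_steps} steps",
                          self.max_steps)

class StubBrowser:
    class _State:
        url = "https://app.example.com/login"
    def get_state(self):
        return self._State()

class StubLLM:
    """Scripted responses standing in for a real model."""
    def __init__(self):
        self.turn = 0
    def generate(self, prompt: str) -> str:
        self.turn += 1
        if self.turn == 1:
            return "TYPE input[name=email] test@example.com"
        return "DONE pass Login form renders as expected"

agent = MiniReActAgent(llm=StubLLM(), browser=StubBrowser(), max_steps=5)
result = agent.run("Verify the login form renders")
```

Swapping the stubs for a real LLM client and a browser driver turns this skeleton into the full agent above.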
## Why ReAct Matters for Testing
The key insight: the agent does not follow a fixed script. It adapts its next action based on what it observes. This creates capabilities that deterministic scripts cannot match:
| Capability | Script-Based Testing | ReAct Agent Testing |
|---|---|---|
| Missing element | Test fails immediately | Agent looks for alternative selector |
| Unexpected popup | Test fails or blocks | Agent dismisses popup and continues |
| Slow page load | Timeout (fixed wait) | Agent observes loading state, waits intelligently |
| Changed layout | Test fails (selector broken) | Agent reasons about new layout |
| Error state | Test follows happy path only | Agent investigates the error |
| New feature | Test ignores it | Agent notices and can explore |
### Example: Adaptive Error Investigation

```
Step 1: OBSERVE — Login page loaded, email and password fields visible
Step 2: THINK   — "I should fill in valid credentials and click login"
Step 3: ACT     — Type email, type password, click submit
Step 4: OBSERVE — Error banner appeared: "Account locked after 5 failed attempts"
Step 5: THINK   — "Unexpected error. This is not a test failure, it is test
                   environment state. The account was locked by a previous test
                   run. I should report this as an environment issue."
Step 6: ACT     — DONE fail "Account locked - environment state issue, not a bug"
```
A deterministic script would simply fail with "Expected dashboard URL, got login URL." The agent provides diagnostic information.
## The History Buffer
The history buffer is critical for preventing infinite loops and enabling learning within a test run.
```python
# Keep the last N steps to fit in the context window
self.history = self.history[-5:]

# Each history entry looks like:
{
    "step": 3,
    "observation": "Page shows login form with email/password fields",
    "decision": "TYPE input[name=email] test@example.com",
    "result": "success",
}
```
**Without history:** The agent might click the same button repeatedly, not remembering it already tried that.

**With history:** The agent reasons, "I already tried clicking the submit button and it did not navigate. Let me check if there is a validation error I missed."
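One practical use of the buffer is loop detection: if the last few decisions are identical, the agent is almost certainly stuck. A minimal sketch (the `is_looping` helper and its window size are illustrative assumptions):

```python
def is_looping(history: list[dict], window: int = 3) -> bool:
    """Return True if the last `window` decisions are identical -- a sign
    the agent is stuck repeating an action that does not change state."""
    recent = [entry["decision"] for entry in history[-window:]]
    return len(recent) == window and len(set(recent)) == 1
```

An agent can check this before each THINK phase and, when it fires, either inject a "you are repeating yourself" hint into the prompt or terminate with a diagnostic result.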
## Limitations of ReAct for Testing

- **Non-deterministic.** The same test objective may take different paths on different runs. This is a feature for exploration but a problem for regression testing.
- **Expensive.** Each step requires an LLM call. A 20-step test costs 20x more in tokens than a script.
- **Slow.** LLM latency (100-500ms per call) adds up. A 20-step agent takes 2-10 seconds just for reasoning, plus action execution time.
- **Hallucination risk.** The agent might "see" elements that do not exist or misinterpret page content.
- **Debugging difficulty.** When a ReAct agent fails, you have to trace through its reasoning history to understand why. This is harder than debugging a linear script.
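A small trace formatter softens the debugging problem: rendering the history as readable steps is the agent equivalent of reading a stack trace. A sketch, assuming history entries shaped like the example above (the `format_trace` helper is an illustrative assumption):

```python
def format_trace(history: list[dict]) -> str:
    """Render the agent's step history as a readable trace for
    post-mortem debugging."""
    lines = []
    for entry in history:
        lines.append(f"step {entry['step']}: saw {entry['observation']!r}")
        lines.append(f"  -> decided {entry['decision']!r} ({entry['result']})")
    return "\n".join(lines)
```

Printing this trace on failure gives reviewers the observation-decision pairs in order, which is usually enough to see where the agent's reasoning went wrong.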
## When to Use ReAct vs Scripts
| Use ReAct | Use Scripts |
|---|---|
| Exploratory testing | Regression testing |
| New feature discovery | Known flow verification |
| Self-healing test suites | CI/CD gate tests |
| Complex multi-step workflows | Simple CRUD operations |
| Environments with frequent UI changes | Stable environments |
## Key Takeaway
The ReAct loop is the building block of all agentic testing. It transforms testing from "follow these exact steps" to "achieve this objective by adapting to what you observe." Master this pattern and you understand the foundation of every agentic testing tool on the market.