Synthetic Monitoring

What Is Synthetic Monitoring?

Synthetic monitoring means running automated tests against production continuously, around the clock, not only during deployments. These are not load tests; they are lightweight probes that verify critical user journeys still work. Because they run on a schedule rather than on code changes, they catch problems that arise between deployments, which a CI pipeline would never see.

Examples of what synthetic monitoring catches:

  • Third-party service degradation (payment provider down)
  • Certificate expiration
  • CDN misconfigurations
  • DNS resolution failures
  • Regional connectivity issues
  • Database connection pool exhaustion
  • Slow memory leaks that accumulate over hours
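
Of the failure modes listed above, certificate expiration is one of the easiest to probe directly. The following is a minimal sketch, assuming a Node.js runtime; the function names, the 14-day warning threshold, and the `cert_days_remaining` metric name are illustrative choices, not something the original setup prescribes.

```typescript
import * as tls from 'node:tls';

// Whole days remaining until a certificate's expiry date.
// Negative when the certificate has already expired.
function daysUntilExpiry(validTo: string, now: Date = new Date()): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay);
}

// Opens a TLS connection and reads the peer certificate's expiry date.
// With default settings an already-expired certificate fails the handshake,
// so the promise rejects, which is itself a useful signal.
function fetchCertExpiry(host: string, port = 443): Promise<string> {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, servername: host }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve(cert.valid_to);
    });
    socket.setTimeout(5_000, () => socket.destroy(new Error('TLS handshake timed out')));
    socket.on('error', reject);
  });
}

// Hypothetical policy: report healthy only while at least `warnDays` remain.
async function checkCertificate(host: string, warnDays = 14): Promise<boolean> {
  const remaining = daysUntilExpiry(await fetchCertExpiry(host));
  console.log(`METRIC cert_days_remaining=${remaining}`);
  return remaining >= warnDays;
}
```

A probe like this could run alongside the browser journeys, for example `checkCertificate('store.example.com')`, and feed the same alerting path.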

Synthetic Monitoring Architecture

  +-------------------+
  | Scheduler (cron)  |  -- Every 5 minutes from 3 regions
  +---------+---------+
            |
            v
  +---------+---------+     +---------+---------+     +---------+---------+
  | us-east-1 runner  |     | eu-west-1 runner  |     | ap-south-1 runner |
  | - Login flow      |     | - Login flow      |     | - Login flow      |
  | - Search flow     |     | - Search flow     |     | - Search flow     |
  | - Checkout flow   |     | - Checkout flow   |     | - Checkout flow   |
  +---------+---------+     +---------+---------+     +---------+---------+
            |                         |                         |
            +------------+------------+------------+------------+
                         |
                         v
              +----------+----------+
              | Metrics / Alerting  |   <-- Grafana, Datadog, PagerDuty
              | - Pass/fail status  |
              | - Response times    |
              | - Screenshot diffs  |
              +---------------------+
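
The aggregation step at the bottom of the diagram can be sketched as follows. This is an illustration under assumptions: the `ProbeResult` shape and the three verdict names are hypothetical, and a real pipeline would more likely express the same logic as alert rules in Grafana or Datadog.

```typescript
interface ProbeResult {
  region: string;      // e.g. 'us-east-1'
  journey: string;     // e.g. 'login', 'search', 'checkout'
  passed: boolean;
  durationMs: number;
}

type Verdict = 'healthy' | 'regional' | 'global';

// Classifies the latest results for ONE journey across all regions:
// no failures -> healthy, some regions failing -> regional, all failing -> global.
// An empty result list is reported as healthy; a real system may prefer
// to treat missing data as its own alert condition.
function classifyJourney(results: ProbeResult[]): { verdict: Verdict; failingRegions: string[] } {
  const failingRegions = results.filter((r) => !r.passed).map((r) => r.region);
  if (failingRegions.length === 0) {
    return { verdict: 'healthy', failingRegions };
  }
  const verdict: Verdict = failingRegions.length === results.length ? 'global' : 'regional';
  return { verdict, failingRegions };
}

const latest: ProbeResult[] = [
  { region: 'us-east-1', journey: 'checkout', passed: true, durationMs: 8200 },
  { region: 'eu-west-1', journey: 'checkout', passed: true, durationMs: 9100 },
  { region: 'ap-south-1', journey: 'checkout', passed: false, durationMs: 30000 },
];
console.log(classifyJourney(latest)); // verdict 'regional', failing region 'ap-south-1'
```

Separating regional from global failures lets the alert say where to look first: a single failing region points at networking or CDN issues, while a failure everywhere points at the application or a shared dependency.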

Playwright Synthetic Monitor Script

// synthetic/checkout-flow.spec.ts
// Runs every 5 minutes against production
import { test, expect } from '@playwright/test';

// Use the BASE_URL supplied by the runner; fall back to the production store
const BASE_URL = process.env.BASE_URL ?? 'https://store.example.com';

test.describe('Production Checkout Flow', () => {
  test.setTimeout(30_000); // 30s hard timeout for synthetic tests

  test('complete purchase of a test product', async ({ page }) => {
    // Step 1: Navigate and verify homepage
    const startTime = Date.now();
    await page.goto(BASE_URL);
    await expect(page.locator('h1')).toContainText('Welcome');

    const homepageLoadTime = Date.now() - startTime;
    console.log(`METRIC homepage_load_ms=${homepageLoadTime}`);

    // Step 2: Search for test product
    await page.fill('[data-testid="search-input"]', 'synthetic-test-product');
    await page.click('[data-testid="search-button"]');
    await expect(page.locator('[data-testid="search-results"]')).toBeVisible();

    // Step 3: Add to cart
    await page.click(
      '[data-testid="product-card"]:first-child [data-testid="add-to-cart"]'
    );
    await expect(page.locator('[data-testid="cart-count"]')).toHaveText('1');

    // Step 4: Begin checkout (using test payment method)
    await page.click('[data-testid="cart-icon"]');
    await page.click('[data-testid="checkout-button"]');

    // Use test payment method that does not charge
    await page.fill('[data-testid="card-number"]', '4242424242424242');
    await page.fill('[data-testid="card-expiry"]', '12/28');
    await page.fill('[data-testid="card-cvc"]', '123');

    await page.click('[data-testid="place-order"]');

    // Step 5: Verify order confirmation
    await expect(page.locator('[data-testid="order-confirmation"]')).toBeVisible({
      timeout: 15_000,
    });
    await expect(page.locator('[data-testid="order-id"]')).toBeVisible();

    const totalFlowTime = Date.now() - startTime;
    console.log(`METRIC checkout_flow_total_ms=${totalFlowTime}`);

    // Assert performance budget
    expect(totalFlowTime).toBeLessThan(15_000);
  });
});
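
The script above emits its timings as `METRIC name=value` lines. One way a metrics pusher might pick them up is sketched below; the function name is hypothetical, and where the text comes from (raw stdout, or output collected in a report file) depends on the reporter in use.

```typescript
// Extracts `METRIC name=value` lines from test output into a name -> number map.
// Lines that do not match the pattern, or whose value is not numeric, are ignored.
function parseMetricLines(output: string): Map<string, number> {
  const metrics = new Map<string, number>();
  const pattern = /^METRIC\s+([A-Za-z_][A-Za-z0-9_]*)=(-?\d+(?:\.\d+)?)$/;
  for (const line of output.split(/\r?\n/)) {
    const match = pattern.exec(line.trim());
    if (match) {
      metrics.set(match[1], Number(match[2]));
    }
  }
  return metrics;
}

const sampleOutput = [
  'Running 1 test using 1 worker',
  'METRIC homepage_load_ms=812',
  'METRIC checkout_flow_total_ms=9340',
].join('\n');

for (const [name, value] of parseMetricLines(sampleOutput)) {
  console.log(`${name} -> ${value}`);
}
```

Keeping the metric format this simple means the test itself needs no monitoring SDK; any collector that can read text can forward the numbers.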

Running Synthetics as a Cron Job in CI

# .github/workflows/synthetic-monitor.yml
name: Synthetic Monitoring
on:
  schedule:
    - cron: '*/5 * * * *'  # every 5 minutes (the shortest interval GitHub allows; runs can start late under load)
  workflow_dispatch: {}     # allow manual trigger

jobs:
  synthetic-us-east:
    runs-on: ubuntu-latest  # GitHub-hosted runner; its region is not guaranteed, so pin us-east-1 with a self-hosted runner if needed
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

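      - name: Run synthetic tests
        run: npx playwright test synthetic/ --reporter=json
        env:
          BASE_URL: https://store.example.com
          SYNTHETIC_MODE: 'true'
          # The JSON reporter prints to stdout unless an output file is set
          PLAYWRIGHT_JSON_OUTPUT_NAME: playwright-report/results.json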

      - name: Push metrics to Datadog
        if: always()
        run: |
          python scripts/push_synthetic_metrics.py \
            --results playwright-report/results.json \
            --region us-east-1 \
            --datadog-api-key ${{ secrets.DATADOG_API_KEY }}

      - name: Alert on failure
        if: failure()
        run: |
          curl -X POST "${{ secrets.PAGERDUTY_EVENTS_URL }}" \
            -H "Content-Type: application/json" \
            -d '{
              "routing_key": "${{ secrets.PD_ROUTING_KEY }}",
              "event_action": "trigger",
              "payload": {
                "summary": "Synthetic checkout flow FAILED in us-east-1",
                "severity": "critical",
                "source": "github-actions-synthetic"
              }
            }'
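
With a 5-minute schedule, the alert step above sends a fresh trigger on every failing run and never resolves anything. One possible refinement, sketched here as an assumption rather than as part of the original workflow, is to give each journey and region a stable `dedup_key` (a field of the PagerDuty Events API v2) so repeated failures attach to one incident and a passing run can resolve it. The key format and the `buildEvent` helper are hypothetical.

```typescript
interface PagerDutyEvent {
  routing_key: string;
  event_action: 'trigger' | 'resolve';
  dedup_key: string;
  // payload is only needed for trigger events
  payload?: {
    summary: string;
    severity: 'critical' | 'error' | 'warning' | 'info';
    source: string;
  };
}

// Builds the event body for one journey/region outcome.
// The same dedup_key is used for trigger and resolve so they correlate.
function buildEvent(routingKey: string, journey: string, region: string, passed: boolean): PagerDutyEvent {
  const dedupKey = `synthetic-${journey}-${region}`; // hypothetical key format
  if (passed) {
    return { routing_key: routingKey, event_action: 'resolve', dedup_key: dedupKey };
  }
  return {
    routing_key: routingKey,
    event_action: 'trigger',
    dedup_key: dedupKey,
    payload: {
      summary: `Synthetic ${journey} flow FAILED in ${region}`,
      severity: 'critical',
      source: 'github-actions-synthetic',
    },
  };
}

console.log(JSON.stringify(buildEvent('ROUTING_KEY_PLACEHOLDER', 'checkout', 'us-east-1', false), null, 2));
```

The resulting object can be posted as the JSON body in place of the inline payload; sending it on success as well as failure is what lets incidents close themselves.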

Designing Effective Synthetic Tests

What to Monitor

Journey               Priority  Frequency    Timeout
--------------------  --------  -----------  -------
Homepage load         Critical  Every 1 min  10s
User login            Critical  Every 2 min  15s
Product search        High      Every 5 min  15s
Add to cart           High      Every 5 min  15s
Checkout flow         Critical  Every 5 min  30s
API health endpoints  Critical  Every 30s    5s
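
The schedule above translates directly into data, which makes it easy to estimate how many probe runs it generates. The sketch below is illustrative: the `JourneySchedule` type and helper names are assumptions, and the figures come only from the table.

```typescript
interface JourneySchedule {
  journey: string;
  priority: 'Critical' | 'High';
  intervalSeconds: number;
  timeoutSeconds: number;
}

const schedules: JourneySchedule[] = [
  { journey: 'Homepage load', priority: 'Critical', intervalSeconds: 60, timeoutSeconds: 10 },
  { journey: 'User login', priority: 'Critical', intervalSeconds: 120, timeoutSeconds: 15 },
  { journey: 'Product search', priority: 'High', intervalSeconds: 300, timeoutSeconds: 15 },
  { journey: 'Add to cart', priority: 'High', intervalSeconds: 300, timeoutSeconds: 15 },
  { journey: 'Checkout flow', priority: 'Critical', intervalSeconds: 300, timeoutSeconds: 30 },
  { journey: 'API health endpoints', priority: 'Critical', intervalSeconds: 30, timeoutSeconds: 5 },
];

const SECONDS_PER_DAY = 86_400;

// Runs per day for one journey across the given number of regions.
function runsPerDay(schedule: JourneySchedule, regions = 1): number {
  return (SECONDS_PER_DAY / schedule.intervalSeconds) * regions;
}

// Total runs per day for the whole schedule.
function totalRunsPerDay(all: JourneySchedule[], regions = 1): number {
  return all.reduce((sum, s) => sum + runsPerDay(s, regions), 0);
}

// A run that can outlast its interval would overlap the next run.
function canOverlap(schedule: JourneySchedule): boolean {
  return schedule.timeoutSeconds >= schedule.intervalSeconds;
}

console.log(totalRunsPerDay(schedules, 1)); // 5904 runs per day in one region
console.log(totalRunsPerDay(schedules, 3)); // 17712 runs per day across three regions
```

Numbers like these matter on platforms that bill per run or per runner minute, and the overlap check confirms that every timeout in the table is shorter than its interval.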

Best Practices

  1. Use dedicated test data. Create a "synthetic-test-product" that is always in stock, never on sale, and easily identified in analytics filters.

  2. Tag synthetic traffic. Add a header or query parameter (?synthetic=true) so analytics and billing systems can filter out synthetic activity.

  3. Run from multiple regions. A test passing in us-east-1 and failing in ap-south-1 immediately identifies regional issues.

  4. Keep tests simple. Synthetic tests should be the simplest possible path through a critical flow. Complex multi-branch test logic belongs in CI, not in production monitoring.

  5. Set aggressive timeouts. If a checkout flow takes more than 30 seconds in production, it is effectively broken regardless of whether it eventually succeeds.

  6. Capture screenshots on failure. Attach screenshots to alert payloads so the on-call engineer can see what the user would see.
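
Practice 2 (tagging synthetic traffic) can be sketched as two small helpers. The `?synthetic=true` parameter comes from the text above; the header names `X-Synthetic-Test` and `X-Synthetic-Region` are hypothetical and should match whatever the analytics and billing filters actually look for.

```typescript
// Hypothetical header name; the practice only calls for "a header or query parameter".
const SYNTHETIC_HEADER = 'X-Synthetic-Test';

// Adds synthetic=true to a URL while preserving any existing query string.
function tagSyntheticUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.searchParams.set('synthetic', 'true');
  return url.toString();
}

// Headers that mark a request as synthetic and record the probing region.
function syntheticHeaders(region: string): Record<string, string> {
  return {
    [SYNTHETIC_HEADER]: 'true',
    'X-Synthetic-Region': region, // hypothetical, useful for regional dashboards
  };
}

console.log(tagSyntheticUrl('https://store.example.com/search?q=shoes'));
// https://store.example.com/search?q=shoes&synthetic=true
console.log(syntheticHeaders('eu-west-1'));
```

In a Playwright setup, headers like these could be supplied once through the `extraHTTPHeaders` option, and practice 6 maps onto the `screenshot: 'only-on-failure'` option, so individual tests do not have to repeat either.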


Synthetic Monitoring Platforms

Platform                   Type                Browser Support     Regions                Cost
-------------------------  ------------------  ------------------  ---------------------  -------------------------
Datadog Synthetic          SaaS                Chrome, API         100+ locations         Per-test pricing
Grafana Synthetic          SaaS / Self-hosted  Chrome, API         30+ locations          Included in Grafana Cloud
Checkly                    SaaS                Playwright-native   20+ locations          Per-check pricing
GitHub Actions (DIY)       Self-hosted         Any (Playwright)    GitHub runner regions  Runner minutes
AWS CloudWatch Synthetics  SaaS                Chrome (Puppeteer)  All AWS regions        Per-canary pricing

Choosing a Platform

  • Checkly is a strong choice for teams already using Playwright, as it runs Playwright scripts natively
  • Datadog Synthetic integrates seamlessly if you already use Datadog for monitoring
  • GitHub Actions DIY is cost-effective for small teams willing to build their own reporting pipeline
  • Grafana Synthetic is ideal for teams invested in the Grafana ecosystem

Metrics to Track from Synthetic Monitoring

Metric                            Purpose               Alert Threshold
--------------------------------  --------------------  ----------------------
Pass/fail rate per journey        Overall health        Any failure
Availability (% of passing runs)  SLO compliance        < 99.5% over 7 days
Response time per step            Performance tracking  > 2x baseline
Regional availability             Geographic health     Any region < 99%
Screenshot diff score             Visual regression     > 10% pixel difference
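
To make the availability threshold concrete: a journey probed every 5 minutes runs 288 times a day, or 2016 times in 7 days. A 99.5% target therefore tolerates 10 failed runs in the window (99.504%), while an 11th failure drops it to about 99.45% and breaches the SLO. The sketch below shows that arithmetic along with the response-time rule; the type and function names are illustrative assumptions.

```typescript
interface RunRecord {
  passed: boolean;
}

// Percentage of passing runs in the window.
// An empty window returns 100 here; treating "no data" as healthy is a
// simplification, and a real alert rule may want to flag it instead.
function availabilityPercent(runs: RunRecord[]): number {
  if (runs.length === 0) return 100;
  const passed = runs.filter((r) => r.passed).length;
  return (passed / runs.length) * 100;
}

// True when availability falls below the SLO target (99.5% in the table above).
function breachesSlo(runs: RunRecord[], sloPercent = 99.5): boolean {
  return availabilityPercent(runs) < sloPercent;
}

// True when a step exceeds its baseline by more than the given factor (2x in the table).
function isSlow(stepMs: number, baselineMs: number, factor = 2): boolean {
  return stepMs > baselineMs * factor;
}

// Builds a 7-day window of 5-minute runs with the given number of failures.
function windowWithFailures(failures: number, totalRuns = 2016): RunRecord[] {
  return Array.from({ length: totalRuns }, (_, i) => ({ passed: i >= failures }));
}

console.log(breachesSlo(windowWithFailures(10))); // false
console.log(breachesSlo(windowWithFailures(11))); // true
```

Working out the failure allowance in advance helps when choosing between alerting on any single failure and alerting only on an SLO breach.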

Synthetic monitoring is the night watch of your production environment. When your team is asleep, synthetic tests are verifying that critical user journeys still work.