# Pipeline Optimization

## Why Speed Matters
Slow pipelines erode developer trust. When the pipeline takes 45 minutes, developers stop waiting for it and merge anyway. When the pipeline takes 5 minutes, developers treat it as a reliable safety net and actually look at the results before merging.
The goal: under 5 minutes for the PR feedback loop (lint + unit tests) and under 20 minutes for the full pipeline (including browser tests).
## Caching
Caching is the single most impactful optimization. Without caching, every pipeline run downloads and installs dependencies from scratch -- often taking 2-5 minutes that add zero value.
### What to Cache

| What | Cache Key | Typical Savings |
|---|---|---|
| Node modules | package-lock.json hash | 1-3 minutes |
| Python virtualenvs | requirements.txt or poetry.lock hash | 1-2 minutes |
| Playwright browsers | package-lock.json hash | 2-4 minutes |
| Docker layers | Dockerfile hash | 2-10 minutes |
| Gradle/Maven dependencies | build.gradle or pom.xml hash | 1-5 minutes |
| Go modules | go.sum hash | 30s-2 minutes |
### GitHub Actions Caching

```yaml
# Cache node_modules (automatic with setup-node)
- uses: actions/setup-node@v4
  with:
    node-version: 20
    cache: 'npm'

# Cache Playwright browsers (manual)
- uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

- run: npx playwright install --with-deps
  if: steps.playwright-cache.outputs.cache-hit != 'true'
```

The `cache-hit` output lets you skip installation entirely when the cache is valid. This turns a 3-minute Playwright browser installation into a 5-second cache restore.
### Cache Invalidation Strategy

Cache keys should change when dependencies change and remain stable otherwise:

```yaml
# Good: changes only when the lockfile changes
key: deps-${{ hashFiles('package-lock.json') }}

# Bad: changes on every commit, so the cache is never reused
key: deps-${{ github.sha }}

# Better: fall back to a partial cache match
key: deps-${{ hashFiles('package-lock.json') }}
restore-keys: |
  deps-
```

The `restore-keys` fallback finds the most recent cache whose key starts with `deps-`, which may be slightly stale but is much faster than installing from scratch.
## Parallelization

### Test Sharding

Split your test suite across multiple runners. Each runner executes a fraction of the tests, so the total wall-clock time is roughly total_time / number_of_shards, plus per-shard setup overhead.
```yaml
# Playwright sharding
strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npx playwright test --shard=${{ matrix.shard }}/4
```

```yaml
# Jest sharding
strategy:
  matrix:
    shard: [1, 2, 3]
steps:
  - run: npx jest --shard=${{ matrix.shard }}/3
```

```yaml
# pytest-xdist (automatic parallelization within a single runner)
steps:
  - run: pytest -n auto  # uses all available CPU cores
```
Choosing the right number of shards:
- Start with 3-4 shards and measure
- Each shard should take roughly the same time (balanced distribution)
- Too many shards means overhead from setup/teardown dominates
- Too few means each shard is still slow
### Matrix Strategy for Cross-Cutting Concerns

Combine sharding with other dimensions:

```yaml
strategy:
  fail-fast: false
  matrix:
    browser: [chromium, firefox]
    shard: [1, 2, 3]
# Creates 6 parallel jobs: chromium-1, chromium-2, chromium-3,
# firefox-1, firefox-2, firefox-3
```
### Parallel Jobs vs Parallel Tests

- Parallel jobs (matrix strategy): Each job runs on a separate runner. Good for isolation and cross-browser testing.
- Parallel tests (within a job): Use multi-core runners and tools like `pytest-xdist` or Jest workers. Good for CPU-bound unit tests.

For maximum speed, combine both: run 4 shards on 4 runners, and within each shard, run tests on 2 CPU cores.
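A sketch of combining both levels, assuming Playwright on the standard 2-core `ubuntu-latest` runner (the job name and shard count are illustrative):

```yaml
# Hypothetical job: 4-way sharding across runners, 2 workers per shard.
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      # Each runner executes a quarter of the suite, two tests at a time
      - run: npx playwright test --shard=${{ matrix.shard }}/4 --workers=2
```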
## Fail-Fast Strategy

The `fail-fast` setting controls whether other matrix jobs are cancelled when one fails.

```yaml
strategy:
  fail-fast: true   # Cancel all jobs if one fails
```

```yaml
strategy:
  fail-fast: false  # Let all jobs complete regardless
```

When to use `fail-fast: true`:

- Unit tests: If one shard fails, the others are likely broken too. Cancel them to save time.
- Lint/typecheck: If linting fails, there is no point running tests.

When to use `fail-fast: false`:

- Browser tests: You want to know the full scope of failures across all browsers and shards. A test that fails in Firefox but passes in Chrome is a different bug than one that fails everywhere.
- Integration tests with external services: Failures might be isolated to specific test cases.
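Applied per job type, the two settings might look like this (job names and shard counts are illustrative):

```yaml
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: true           # one failing shard cancels the rest
      matrix:
        shard: [1, 2, 3]
    steps:
      - run: npx jest --shard=${{ matrix.shard }}/3

  browser-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # report every browser/shard combination
      matrix:
        browser: [chromium, firefox]
        shard: [1, 2, 3]
    steps:
      - run: npx playwright test --project=${{ matrix.browser }} --shard=${{ matrix.shard }}/3
```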
## Conditional Execution

Skip expensive work when it is not needed.

### Path-Based Filtering

```yaml
# Skip tests when only documentation changed
on:
  push:
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - '.github/ISSUE_TEMPLATE/**'
      - 'LICENSE'
```

```yaml
# Only run backend tests when backend code changed
on:
  push:
    paths:
      - 'src/api/**'
      - 'src/services/**'
      - 'tests/integration/**'
```
### Job-Level Conditions

```yaml
# Only run browser tests on PRs targeting main
browser-tests:
  if: github.event_name == 'pull_request' && github.base_ref == 'main'

# Skip expensive tests for draft PRs
e2e-tests:
  if: github.event.pull_request.draft == false
```

### Step-Level Conditions

```yaml
# Only upload artifacts on failure
- uses: actions/upload-artifact@v4
  if: failure()

# Only run deployment on main branch
- run: npm run deploy
  if: github.ref == 'refs/heads/main'
```
## Pipeline Configuration for Test Performance

Environment variables that affect test speed:

```yaml
env:
  TEST_TIMEOUT: 30000  # 30s timeout per test (prevent hanging tests)
  RETRY_COUNT: 2       # Retry failed tests twice (handle infrastructure flakes)
  HEADLESS: true       # Run browsers in headless mode (faster)
  SLOW_MO: 0           # No artificial delay between actions
  WORKERS: 4           # Number of parallel test workers
```
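Note that these names are project conventions, not settings any test runner reads automatically. A sketch of wiring them into a Playwright invocation (Playwright's `--timeout`, `--retries`, and `--workers` flags are real; the variable mapping itself is an assumption):

```yaml
# Hypothetical step: map the env vars above onto Playwright CLI flags.
- run: >
    npx playwright test
    --timeout=$TEST_TIMEOUT
    --retries=$RETRY_COUNT
    --workers=$WORKERS
  env:
    TEST_TIMEOUT: 30000
    RETRY_COUNT: 2
    WORKERS: 4
```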
### Retry Logic

Retries should handle infrastructure flakes (network timeouts, container startup delays), not test bugs. If a test consistently needs retries to pass, it is flaky and needs fixing.

```yaml
# Playwright retry configuration
- run: npx playwright test --retries=2

# Job-level retry (GitHub Actions does not support this natively;
# use reusable workflows or third-party actions)
```
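In the absence of native job retries, a single step can retry its own command with a small shell loop (a sketch; the three-attempt limit is arbitrary):

```yaml
# Hypothetical step-level retry: up to 3 attempts, fail only if all fail.
- run: |
    for attempt in 1 2 3; do
      npx playwright test && exit 0
      echo "Attempt $attempt failed; retrying..."
    done
    exit 1
```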
### Timeout Settings

Always set timeouts to prevent hung jobs from consuming runners:

```yaml
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 10  # Kill the job if it takes longer than 10 minutes

  browser-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30  # Browser tests get more time
```
## Measuring Pipeline Performance
Track these metrics to identify optimization opportunities:
| Metric | How to Measure | Target |
|---|---|---|
| PR feedback time | Time from push to unit test results | < 5 minutes |
| Full pipeline time | Time from push to all checks complete | < 20 minutes |
| Cache hit rate | Check cache action logs | > 90% |
| Flaky test rate | Tests that pass on retry | < 2% |
| Queue time | Time jobs spend waiting for a runner | < 1 minute |
If queue time is high, you need more runners or better scheduling. If cache hit rate is low, your cache keys are too specific.
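One way to sample pipeline duration is the GitHub CLI. A sketch, assuming a workflow file named `ci.yml` and `jq` on the runner (verify the exact `--json` field names against `gh run list --help`):

```yaml
# Hypothetical step: print conclusion and duration for the last 20 runs.
- run: >
    gh run list --workflow ci.yml --limit 20
    --json conclusion,createdAt,updatedAt |
    jq -r '.[] | "\(.conclusion)\t\((.updatedAt | fromdate) - (.createdAt | fromdate))s"'
  env:
    GH_TOKEN: ${{ github.token }}
```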
## Advanced Optimization Techniques

### Reusable Workflows

Extract common patterns into reusable workflows to avoid duplication:

```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      test-command:
        required: true
        type: string

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: ${{ inputs.test-command }}
```
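A caller workflow then invokes it with `uses:` at the job level (the caller filename and command are illustrative):

```yaml
# Hypothetical caller: .github/workflows/ci.yml
jobs:
  unit-tests:
    uses: ./.github/workflows/reusable-test.yml
    with:
      test-command: npm test
```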
### Dependency Graph Optimization

If your repository has multiple packages (monorepo), only test packages affected by the change:

```yaml
# Use a tool like Nx or Turborepo to detect affected packages
- run: npx nx affected --target=test --base=origin/main --head=HEAD
```
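The Turborepo equivalent uses `--filter` with a git range; `...[origin/main]` selects packages changed since `origin/main` plus their dependents:

```yaml
# Turborepo variant: test only packages affected since origin/main.
- run: npx turbo run test --filter="...[origin/main]"
```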
## Hands-On Exercise

- Measure your current pipeline time end-to-end. Write down each stage's duration.
- Add caching for dependencies and Playwright browsers. Measure the improvement.
- Add test sharding with 3-4 shards. Measure the improvement.
- Add path filtering to skip tests when only docs change.
- Set appropriate `fail-fast` settings for each job type.
- Set timeouts on all jobs to prevent hung pipelines.
- Compare before and after metrics. Target: 50% reduction in total pipeline time.