# Pipeline Optimization

## Why Speed Matters
Slow pipelines erode developer trust. When the pipeline takes 45 minutes, developers stop waiting for it and merge anyway. When the pipeline takes 5 minutes, developers treat it as a reliable safety net and actually look at the results before merging.
The goal: under 5 minutes for the PR feedback loop (lint + unit tests) and under 20 minutes for the full pipeline (including browser tests).
## Caching
Caching is the single most impactful optimization. Without caching, every pipeline run downloads and installs dependencies from scratch -- often taking 2-5 minutes that add zero value.
### What to Cache

| What | Cache Key | Typical Savings |
|---|---|---|
| Node modules | package-lock.json hash | 1-3 minutes |
| Python virtualenvs | requirements.txt or poetry.lock hash | 1-2 minutes |
| Playwright browsers | package-lock.json hash | 2-4 minutes |
| Docker layers | Dockerfile hash | 2-10 minutes |
| Gradle/Maven dependencies | build.gradle or pom.xml hash | 1-5 minutes |
| Go modules | go.sum hash | 30s-2 minutes |
### GitHub Actions Caching

```yaml
# Cache node_modules (automatic with setup-node)
- uses: actions/setup-node@v4
  with:
    node-version: 20
    cache: 'npm'

# Cache Playwright browsers (manual)
- uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

- run: npx playwright install --with-deps
  if: steps.playwright-cache.outputs.cache-hit != 'true'
```

The `cache-hit` output lets you skip installation entirely when the cache is valid. This turns a 3-minute Playwright browser installation into a 5-second cache restore.
### Cache Invalidation Strategy

Cache keys should change when dependencies change and remain stable otherwise:

```yaml
# Good: changes only when the lockfile changes
key: deps-${{ hashFiles('package-lock.json') }}

# Bad: changes on every commit, so the cache is never reused
key: deps-${{ github.sha }}

# Better: fall back to a partial cache match
key: deps-${{ hashFiles('package-lock.json') }}
restore-keys: |
  deps-
```

The `restore-keys` fallback finds the most recent cache whose key starts with `deps-`, which may be slightly stale but is much faster than installing from scratch.
## Parallelization

### Test Sharding

Split your test suite across multiple runners. Each runner executes a fraction of the tests, so the total wall-clock time is roughly total_time / number_of_shards, plus per-shard setup overhead.
```yaml
# Playwright sharding
strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npx playwright test --shard=${{ matrix.shard }}/4
```

```yaml
# Jest sharding
strategy:
  matrix:
    shard: [1, 2, 3]
steps:
  - run: npx jest --shard=${{ matrix.shard }}/3
```

```yaml
# pytest-xdist (automatic parallelization within a single runner)
steps:
  - run: pytest -n auto  # uses all available CPU cores
```
Choosing the right number of shards:
- Start with 3-4 shards and measure
- Each shard should take roughly the same time (balanced distribution)
- Too many shards means overhead from setup/teardown dominates
- Too few means each shard is still slow
### Matrix Strategy for Cross-Cutting Concerns

Combine sharding with other dimensions:

```yaml
strategy:
  fail-fast: false
  matrix:
    browser: [chromium, firefox]
    shard: [1, 2, 3]
# Creates 6 parallel jobs: chromium-1, chromium-2, chromium-3,
# firefox-1, firefox-2, firefox-3
```
### Parallel Jobs vs Parallel Tests

- Parallel jobs (matrix strategy): Each job runs on a separate runner. Good for isolation and cross-browser testing.
- Parallel tests (within a job): Use multi-core runners and tools like `pytest-xdist` or Jest workers. Good for CPU-bound unit tests.

For maximum speed, combine both: run 4 shards on 4 runners, and within each shard, run tests on 2 CPU cores.
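A sketch of combining both levels, assuming Playwright on the standard 2-core `ubuntu-latest` runner (the job name and shard count are illustrative):

```yaml
# Hypothetical job: 4-way sharding across runners, 2 workers per shard.
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      # Each runner executes a quarter of the suite, two tests at a time
      - run: npx playwright test --shard=${{ matrix.shard }}/4 --workers=2
```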
## Fail-Fast Strategy

The `fail-fast` setting controls whether other matrix jobs are cancelled when one fails.

```yaml
strategy:
  fail-fast: true   # Cancel all jobs if one fails
```

```yaml
strategy:
  fail-fast: false  # Let all jobs complete regardless
```

When to use `fail-fast: true`:

- Unit tests: If one shard fails, the others are likely broken too. Cancel them to save time.
- Lint/typecheck: If linting fails, there is no point running tests.

When to use `fail-fast: false`:

- Browser tests: You want to know the full scope of failures across all browsers and shards. A test that fails in Firefox but passes in Chrome is a different bug than one that fails everywhere.
- Integration tests with external services: Failures might be isolated to specific test cases.
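Applied per job type, the two settings might look like this (job names and shard counts are illustrative):

```yaml
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: true           # one failing shard cancels the rest
      matrix:
        shard: [1, 2, 3]
    steps:
      - run: npx jest --shard=${{ matrix.shard }}/3

  browser-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # report every browser/shard combination
      matrix:
        browser: [chromium, firefox]
        shard: [1, 2, 3]
    steps:
      - run: npx playwright test --project=${{ matrix.browser }} --shard=${{ matrix.shard }}/3
```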
## Conditional Execution

Skip expensive work when it is not needed.

### Path-Based Filtering

```yaml
# Skip tests when only documentation changed
on:
  push:
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - '.github/ISSUE_TEMPLATE/**'
      - 'LICENSE'
```

```yaml
# Only run backend tests when backend code changed
on:
  push:
    paths:
      - 'src/api/**'
      - 'src/services/**'
      - 'tests/integration/**'
```
### Job-Level Conditions

```yaml
# Only run browser tests on PRs targeting main
browser-tests:
  if: github.event_name == 'pull_request' && github.base_ref == 'main'

# Skip expensive tests for draft PRs
e2e-tests:
  if: github.event.pull_request.draft == false
```

### Step-Level Conditions

```yaml
# Only upload artifacts on failure
- uses: actions/upload-artifact@v4
  if: failure()

# Only run deployment on main branch
- run: npm run deploy
  if: github.ref == 'refs/heads/main'
```
## Pipeline Configuration for Test Performance

Environment variables that affect test speed:

```yaml
env:
  TEST_TIMEOUT: 30000  # 30s timeout per test (prevent hanging tests)
  RETRY_COUNT: 2       # Retry failed tests twice (handle infrastructure flakes)
  HEADLESS: true       # Run browsers in headless mode (faster)
  SLOW_MO: 0           # No artificial delay between actions
  WORKERS: 4           # Number of parallel test workers
```
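Note that these names are project conventions, not settings any test runner reads automatically. A sketch of wiring them into a Playwright invocation (Playwright's `--timeout`, `--retries`, and `--workers` flags are real; the variable mapping itself is an assumption):

```yaml
# Hypothetical step: map the env vars above onto Playwright CLI flags.
- run: >
    npx playwright test
    --timeout=$TEST_TIMEOUT
    --retries=$RETRY_COUNT
    --workers=$WORKERS
  env:
    TEST_TIMEOUT: 30000
    RETRY_COUNT: 2
    WORKERS: 4
```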
### Retry Logic

Retries should handle infrastructure flakes (network timeouts, container startup delays), not test bugs. If a test consistently needs retries to pass, it is flaky and needs fixing.

```yaml
# Playwright retry configuration
- run: npx playwright test --retries=2

# Job-level retry (GitHub Actions does not support this natively;
# use reusable workflows or third-party actions)
```
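In the absence of native job retries, a single step can retry its own command with a small shell loop (a sketch; the three-attempt limit is arbitrary):

```yaml
# Hypothetical step-level retry: up to 3 attempts, fail only if all fail.
- run: |
    for attempt in 1 2 3; do
      npx playwright test && exit 0
      echo "Attempt $attempt failed; retrying..."
    done
    exit 1
```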
### Timeout Settings

Always set timeouts to prevent hung jobs from consuming runners:

```yaml
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 10  # Kill the job if it takes longer than 10 minutes

  browser-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30  # Browser tests get more time
```
## Measuring Pipeline Performance
Track these metrics to identify optimization opportunities:
| Metric | How to Measure | Target |
|---|---|---|
| PR feedback time | Time from push to unit test results | < 5 minutes |
| Full pipeline time | Time from push to all checks complete | < 20 minutes |
| Cache hit rate | Check cache action logs | > 90% |
| Flaky test rate | Tests that pass on retry | < 2% |
| Queue time | Time jobs spend waiting for a runner | < 1 minute |
If queue time is high, you need more runners or better scheduling. If cache hit rate is low, your cache keys are too specific.
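One way to sample pipeline duration is the GitHub CLI. A sketch, assuming a workflow file named `ci.yml` and `jq` on the runner (verify the exact `--json` field names against `gh run list --help`):

```yaml
# Hypothetical step: print conclusion and duration for the last 20 runs.
- run: >
    gh run list --workflow ci.yml --limit 20
    --json conclusion,createdAt,updatedAt |
    jq -r '.[] | "\(.conclusion)\t\((.updatedAt | fromdate) - (.createdAt | fromdate))s"'
  env:
    GH_TOKEN: ${{ github.token }}
```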
## Advanced Optimization Techniques

### Reusable Workflows

Extract common patterns into reusable workflows to avoid duplication:

```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      test-command:
        required: true
        type: string

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - run: ${{ inputs.test-command }}
```
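A caller workflow then invokes it with `uses:` at the job level (the caller filename and command are illustrative):

```yaml
# Hypothetical caller: .github/workflows/ci.yml
jobs:
  unit-tests:
    uses: ./.github/workflows/reusable-test.yml
    with:
      test-command: npm test
```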
### Dependency Graph Optimization

If your repository has multiple packages (monorepo), only test packages affected by the change:

```yaml
# Use a tool like Nx or Turborepo to detect affected packages
- run: npx nx affected --target=test --base=origin/main --head=HEAD
```
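The Turborepo equivalent uses `--filter` with a git range; `...[origin/main]` selects packages changed since `origin/main` plus their dependents:

```yaml
# Turborepo variant: test only packages affected since origin/main.
- run: npx turbo run test --filter="...[origin/main]"
```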
## Hands-On Exercise

- Measure your current pipeline time end-to-end. Write down each stage's duration.
- Add caching for dependencies and Playwright browsers. Measure the improvement.
- Add test sharding with 3-4 shards. Measure the improvement.
- Add path filtering to skip tests when only docs change.
- Set appropriate `fail-fast` settings for each job type.
- Set timeouts on all jobs to prevent hung pipelines.
- Compare before and after metrics. Target: 50% reduction in total pipeline time.