Combinatorial Test Suite Generation

The Combinatorial Explosion Problem

When you have multiple input parameters, exhaustive testing is impractical. A user registration form with 5 parameters, each with 3-5 values, produces hundreds or thousands of combinations. Testing all of them is expensive and most combinations do not reveal unique bugs.

Combinatorial test design solves this by selecting a subset of combinations that guarantees every pair (or n-tuple) of parameter values appears in at least one test case. This dramatically reduces test count while maintaining high defect detection.
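The arithmetic behind the explosion is a simple product of domain sizes. The sketch below makes it concrete; the function name and the domain sizes are illustrative assumptions, not taken from a real form.

```python
from itertools import product
from math import prod


def exhaustive_count(domain_sizes: list[int]) -> int:
    """Number of test cases needed to try every combination of values."""
    return prod(domain_sizes)


# Five parameters with 3 values each (the small end of the range in the text).
print(exhaustive_count([3, 3, 3, 3, 3]))  # 243

# Five parameters with 5 values each (the large end).
print(exhaustive_count([5, 5, 5, 5, 5]))  # 3125

# Enumerating the combinations gives the same count as the product.
mixed_sizes = [3, 4, 5, 3, 4]
all_combinations = list(product(*(range(n) for n in mixed_sizes)))
print(len(all_combinations))  # 720
```

Adding one more 4-valued parameter multiplies every count above by 4, which is why exhaustive testing stops scaling almost immediately.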

Pairwise (2-Way) Testing

Pairwise testing ensures that every combination of values for any two parameters appears in at least one test case. NIST studies report that most software defects are triggered by a single parameter value or by an interaction between two parameters (approximately 70-90% of defects, depending on the domain).

Prompt for Pairwise Generation

Generate a pairwise combinatorial test suite for a user registration form with
these parameters:

- Browser: Chrome, Firefox, Safari, Edge
- OS: Windows, macOS, Linux
- Language: English, Spanish, Japanese
- Account type: Free, Pro, Enterprise
- Auth method: Email/password, Google SSO, SAML

Requirements:
- Every pair of parameter values must appear in at least one test case
- Minimize total number of test cases
- Output as a numbered table

Example Output

A strong LLM can produce a near-minimal pairwise set of roughly 16-20 test configurations instead of the full 4 x 3 x 3 x 3 x 3 = 324 combinations:

#	Browser	OS	Language	Account	Auth
1	Chrome	Windows	English	Free	Email/password
2	Chrome	macOS	Spanish	Pro	Google SSO
3	Chrome	Linux	Japanese	Enterprise	SAML
4	Firefox	Windows	Spanish	Enterprise	SAML
5	Firefox	macOS	Japanese	Free	Email/password
6	Firefox	Linux	English	Pro	Google SSO
7	Safari	Windows	Japanese	Pro	Email/password
8	Safari	macOS	English	Enterprise	SAML
9	Safari	Linux	Spanish	Free	Google SSO
10	Edge	Windows	English	Free	Google SSO
11	Edge	macOS	Spanish	Enterprise	Email/password
12	Edge	Linux	Japanese	Pro	SAML
13	Chrome	Windows	Japanese	Pro	Google SSO
14	Firefox	macOS	English	Enterprise	Google SSO
15	Safari	Linux	English	Free	SAML
16	Edge	Linux	Spanish	Free	Email/password

That is 16 tests covering all pairs instead of 324 exhaustive combinations, a 95% reduction in test count.
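For comparison with an LLM-produced table, a deterministic generator is short enough to sketch. The version below is a plain greedy set cover, assuming the parameter space is small enough to enumerate in full; the helper names `pairs_of` and `greedy_pairwise` are hypothetical. It is not the algorithm used by any particular tool, and it makes no claim to produce the minimum suite.

```python
from itertools import combinations, product


def pairs_of(test: dict) -> set:
    """All ((param, value), (param, value)) pairs covered by one test case."""
    items = sorted(test.items())  # sorting keeps pair orientation consistent
    return set(combinations(items, 2))


def greedy_pairwise(parameters: dict) -> list[dict]:
    """Repeatedly pick the candidate that covers the most uncovered pairs."""
    names = list(parameters)
    # Enumerate the full space: only viable for small parameter sets.
    candidates = [dict(zip(names, combo)) for combo in product(*parameters.values())]
    uncovered = set().union(*(pairs_of(c) for c in candidates))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


parameters = {
    "browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "os": ["Windows", "macOS", "Linux"],
    "language": ["English", "Spanish", "Japanese"],
    "account": ["Free", "Pro", "Enterprise"],
    "auth": ["Email/password", "Google SSO", "SAML"],
}

suite = greedy_pairwise(parameters)
print(f"{len(suite)} tests instead of 324")
for number, test in enumerate(suite, start=1):
    print(number, *test.values(), sep="\t")
```

Because the loop only stops when no pair is left uncovered, the result is complete by construction. No suite for these parameters can have fewer than 4 x 3 = 12 tests, since each Browser/OS pair needs its own row.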

Validating Pairwise Coverage

After generation, verify that every pair actually appears:

from itertools import combinations

def verify_pairwise_coverage(test_suite: list[dict], parameters: dict) -> list[str]:
    """Check that every pair of parameter values appears in at least one test."""
    missing_pairs = []
    param_names = list(parameters.keys())

    for p1, p2 in combinations(param_names, 2):
        for v1 in parameters[p1]:
            for v2 in parameters[p2]:
                found = any(
                    test[p1] == v1 and test[p2] == v2
                    for test in test_suite
                )
                if not found:
                    missing_pairs.append(f"({p1}={v1}, {p2}={v2})")

    return missing_pairs

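# Usage
# test_suite is a list of dicts keyed by the same names as `parameters`,
# one dict per table row, e.g. {"browser": "Chrome", "os": "Windows", ...}.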
parameters = {
    "browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "os": ["Windows", "macOS", "Linux"],
    "language": ["English", "Spanish", "Japanese"],
    "account": ["Free", "Pro", "Enterprise"],
    "auth": ["Email/password", "Google SSO", "SAML"],
}

missing = verify_pairwise_coverage(test_suite, parameters)
if missing:
    print(f"INCOMPLETE: {len(missing)} pairs not covered:")
    for pair in missing:
        print(f"  - {pair}")
else:
    print("All pairs covered!")
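The verifier needs the suite as a list of dicts, while an LLM usually returns a table. The sketch below shows one way to bridge that gap, assuming the table is tab-separated with a header row and a leading `#` column. The names `parse_table` and `uncovered_pairs` are hypothetical, and headers are lowercased so they match the keys of the `parameters` dict.

```python
import csv
import io
from itertools import combinations


def parse_table(tsv_text: str) -> list[dict]:
    """Parse a tab-separated table into dicts with lowercase keys, dropping '#'."""
    reader = csv.DictReader(io.StringIO(tsv_text.strip()), delimiter="\t")
    return [
        {key.strip().lower(): (value or "").strip()
         for key, value in row.items()
         if key and key.strip() != "#"}
        for row in reader
    ]


def uncovered_pairs(test_suite: list[dict], parameters: dict) -> list[tuple]:
    """Set-based coverage check: collect seen pairs once, then diff."""
    seen = set()
    for test in test_suite:
        for p1, p2 in combinations(parameters, 2):
            seen.add((p1, test[p1], p2, test[p2]))
    return [
        (p1, v1, p2, v2)
        for p1, p2 in combinations(parameters, 2)
        for v1 in parameters[p1]
        for v2 in parameters[p2]
        if (p1, v1, p2, v2) not in seen
    ]


# A deliberately incomplete two-parameter table, built with explicit tabs.
rows = [
    ["#", "Browser", "OS"],
    ["1", "Chrome", "Windows"],
    ["2", "Chrome", "macOS"],
    ["3", "Firefox", "Windows"],
]
table = "\n".join("\t".join(row) for row in rows)
small_parameters = {"browser": ["Chrome", "Firefox"], "os": ["Windows", "macOS"]}

small_suite = parse_table(table)
print(uncovered_pairs(small_suite, small_parameters))
# [('browser', 'Firefox', 'os', 'macOS')]
```

Running a check like this on every generated table is cheap, and it catches the case where a suite looks plausible but silently omits a pair.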

Higher-Order Combinatorial Testing (3-Way, N-Way)

When defects are triggered by three or more interacting parameters, you need higher-order coverage.

When 3-Way Testing Is Necessary

Security-critical systems (authentication, authorization, encryption)
Financial calculations (currency + rounding + tax jurisdiction)
Hardware/firmware testing (signal combinations)
Configuration-dependent behavior (feature flags + environments + user roles)

Prompt for 3-Way Generation

Generate a 3-way combinatorial test suite for the payment processing module:

Parameters:
- Payment method: credit_card, debit_card, bank_transfer, paypal
- Currency: USD, EUR, GBP, JPY
- Amount range: small (< $10), medium ($10-$1000), large (> $1000)
- Customer type: new, returning, VIP
- Region: domestic, international

Requirements:
- Every combination of values for any 3 parameters must appear in at least one test
- Minimize total number of test cases
- Output as a numbered table with all parameters

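The 3-way test count is significantly higher than pairwise. For this parameter set it cannot drop below 48, because the three largest parameters alone have 4 x 4 x 3 = 48 value combinations that must each appear in some test; expect roughly 48-60 tests in practice. That is still far fewer than the exhaustive 4 x 4 x 3 x 3 x 2 = 288 combinations.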
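The size of a t-way problem can be estimated before generating anything. The sketch below counts the value combinations a t-way suite has to cover and computes a simple lower bound on suite size (the product of the t largest domains). The function names are hypothetical and the parameter values are the ones from the prompt above.

```python
from itertools import combinations
from math import prod


def t_way_tuple_count(parameters: dict, t: int) -> int:
    """Number of distinct t-way value combinations a suite must cover."""
    sizes = [len(values) for values in parameters.values()]
    return sum(prod(group) for group in combinations(sizes, t))


def t_way_lower_bound(parameters: dict, t: int) -> int:
    """A suite can never be smaller than the product of the t largest domains."""
    sizes = sorted((len(values) for values in parameters.values()), reverse=True)
    return prod(sizes[:t])


payment = {
    "payment_method": ["credit_card", "debit_card", "bank_transfer", "paypal"],
    "currency": ["USD", "EUR", "GBP", "JPY"],
    "amount_range": ["small", "medium", "large"],
    "customer_type": ["new", "returning", "VIP"],
    "region": ["domestic", "international"],
}

print(prod(len(v) for v in payment.values()))  # 288 exhaustive combinations
print(t_way_tuple_count(payment, 3))           # 314 value triples to cover
print(t_way_lower_bound(payment, 3))           # 48 tests at minimum for 3-way
print(t_way_lower_bound(payment, 2))           # 16 tests at minimum for pairwise
```

Each test covers one triple for each of the 10 ways to choose 3 of the 5 parameters, which is how a suite far smaller than 288 can still cover all 314 triples.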

AI vs Dedicated Combinatorial Tools

Approach	Best For	Limitation
AI-generated pairwise	Quick exploration, small parameter spaces (4-7 params)	No guarantee of full coverage or minimal size; output must be verified
PICT (Microsoft)	Large parameter spaces (10+ params), provable coverage	Requires installation and configuration
ACTS (NIST)	Research-grade n-way coverage with constraints	Java dependency, steeper learning curve
AI + PICT hybrid	AI identifies parameters and constraints, PICT generates combos	Extra setup, but most rigorous

The AI + PICT Hybrid Workflow

This is the most rigorous approach and worth mentioning in interviews:

Step 1: Ask AI to identify all relevant parameters and their values
        from the specification or requirements document.

Step 2: Ask AI to identify constraints (invalid combinations that
        should be excluded, e.g., "SAML auth is only available for
        Enterprise accounts").

Step 3: Generate a PICT model file from the AI output:

# registration.pict (generated by AI, verified by human)
Browser:  Chrome, Firefox, Safari, Edge
OS:       Windows, macOS, Linux
Language: English, Spanish, Japanese
Account:  Free, Pro, Enterprise
Auth:     EmailPassword, GoogleSSO, SAML

# Constraints (AI-identified)
IF [Auth] = "SAML" THEN [Account] = "Enterprise";
IF [OS] = "macOS" THEN [Browser] <> "Edge";

Step 4: Run PICT to generate a compact test suite with full pair coverage:
        $ pict registration.pict > test_combinations.tsv

Step 5: Ask AI to convert the TSV into executable test code
        matching your framework and style.
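Step 3 can be scripted so that the model file is always rendered from one reviewed data structure. The sketch below is a minimal example under stated assumptions: `to_pict_model` is a hypothetical helper, the parameters and constraints are the ones from the model above, and constraint strings are passed through verbatim, so a human still has to check that they are valid PICT syntax.

```python
def to_pict_model(parameters: dict[str, list[str]], constraints: list[str]) -> str:
    """Render parameters and constraints as plain-text PICT model content."""
    lines = [f"{name}: {', '.join(values)}" for name, values in parameters.items()]
    if constraints:
        lines.append("")  # blank line between parameters and constraints
        # PICT constraints end with a semicolon; add one if it is missing.
        lines.extend(c if c.endswith(";") else c + ";" for c in constraints)
    return "\n".join(lines) + "\n"


registration = {
    "Browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "OS": ["Windows", "macOS", "Linux"],
    "Language": ["English", "Spanish", "Japanese"],
    "Account": ["Free", "Pro", "Enterprise"],
    "Auth": ["EmailPassword", "GoogleSSO", "SAML"],
}
registration_constraints = [
    'IF [Auth] = "SAML" THEN [Account] = "Enterprise"',
    'IF [OS] = "macOS" THEN [Browser] <> "Edge";',
]

model = to_pict_model(registration, registration_constraints)
print(model)
```

Writing `model` to `registration.pict` and running the command from Step 4 then produces the TSV that the test code consumes.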

Converting Combinatorial Tables to Test Code

import csv

import pytest


def load_pairwise_tests(filepath: str) -> list[dict]:
    """Load pairwise test cases from a tab-separated file (e.g. PICT output)."""
    with open(filepath, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return list(reader)


# Loaded at import time so pytest can parametrize over the rows.
PAIRWISE_TESTS = load_pairwise_tests("test_combinations.tsv")


class TestRegistrationCombinatorial:
    """Pairwise combinatorial tests for user registration."""

    @pytest.mark.parametrize(
        "combo",
        PAIRWISE_TESTS,
        ids=[f"combo-{i}" for i in range(len(PAIRWISE_TESTS))],
    )
    def test_registration_combination(self, browser_driver, combo):
        """Each pairwise combination should either succeed or fail gracefully."""
        # browser_driver is a project-specific fixture (assumed here) that
        # returns a page-object style driver for the requested environment.
        driver = browser_driver(
            browser=combo["Browser"],
            os=combo["OS"],
            language=combo["Language"],
        )

        # Navigate to registration
        driver.navigate("/register")

        # Fill form based on combination
        driver.select_account_type(combo["Account"])
        driver.select_auth_method(combo["Auth"])

        # Submit the form. submit() is a hypothetical method on the assumed
        # driver; without a submit step the URL assertion below is vacuous.
        driver.submit()

        # Assert: registration either succeeds or shows a clear error
        # (no crashes, no blank pages, no 500 errors)
        assert driver.current_url in ["/dashboard", "/register"]
        if driver.current_url == "/register":
            assert driver.find_element(".error-message").is_displayed()

Practical Guidelines

How Many Parameters Warrant Combinatorial Testing?

2-3 parameters: Test all combinations manually. Combinatorial tools are overkill.
4-7 parameters: Pairwise is the sweet spot. AI generation is fast and sufficient.
8-15 parameters: Use PICT or ACTS. AI-generated pairwise may miss pairs.
16+ parameters: You probably need to reduce the parameter space first. Consult with the team about which interactions actually matter.

How to Present This in Interviews

"For features with multiple interacting parameters, I use pairwise
combinatorial testing to reduce the test space. For a 5-parameter form
with 3-5 values each, this cuts 300+ combinations to about 16-20 tests
while covering all parameter pair interactions. I generate the initial
matrix using AI from the spec, verify coverage programmatically, then
convert it to parametrized tests. For larger parameter spaces or
safety-critical features, I use Microsoft PICT with AI-identified
constraints to get guaranteed pair coverage from a compact suite."

Common Pitfalls

Forgetting constraints. Not all combinations are valid. SAML auth may only work with Enterprise accounts. If you do not model constraints, you get test cases that should never exist.
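Confusing pairwise with random. Random sampling does not guarantee pair coverage. For the registration example, a random selection of 20 tests from the 324 combinations misses roughly 10-15% of the 102 value pairs on average, while a pairwise suite of similar size misses none.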
Ignoring negative combinations. Pairwise covers valid interactions, but you also need tests for invalid combinations (e.g., negative quantity + expired coupon + missing auth).
Over-relying on AI optimality. AI-generated pairwise suites are usually good, but they can miss pairs and are not always minimal. Always verify coverage programmatically, and for critical systems generate the suite with a dedicated tool.
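The gap between random sampling and pairwise coverage is easy to measure rather than take on faith. The sketch below is a small Monte Carlo estimate, assuming tests are drawn uniformly without replacement from the full registration space. The function name, trial count, and seed are arbitrary choices for illustration.

```python
import random
from itertools import combinations, product


def missed_pair_fraction(parameters: dict, sample_size: int, rng: random.Random) -> float:
    """Fraction of value pairs left uncovered by one random sample of tests."""
    names = list(parameters)
    all_tests = list(product(*parameters.values()))
    sample = rng.sample(all_tests, sample_size)
    total = 0
    missed = 0
    for i, j in combinations(range(len(names)), 2):
        possible = len(parameters[names[i]]) * len(parameters[names[j]])
        seen = {(test[i], test[j]) for test in sample}
        total += possible
        missed += possible - len(seen)
    return missed / total


form = {
    "browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "os": ["Windows", "macOS", "Linux"],
    "language": ["English", "Spanish", "Japanese"],
    "account": ["Free", "Pro", "Enterprise"],
    "auth": ["Email/password", "Google SSO", "SAML"],
}

rng = random.Random(0)  # fixed seed so repeated runs give the same estimate
trials = [missed_pair_fraction(form, 20, rng) for _ in range(500)]
average_missed = sum(trials) / len(trials)
print(f"average share of pairs missed by 20 random tests: {average_missed:.1%}")
print(f"share of trials with full pair coverage: {sum(t == 0 for t in trials) / len(trials):.1%}")
```

The second printed figure is the more telling one for test planning: it shows how rarely a random sample of that size happens to cover every pair.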

Key Takeaway

Combinatorial testing is the bridge between "test everything" (impractical) and "test a few things" (insufficient). AI makes it accessible by generating pairwise suites from natural language descriptions of parameters. For maximum rigor, combine AI parameter identification with dedicated tools like PICT for generation.