Combinatorial Test Suite Generation
The Combinatorial Explosion Problem
When a feature has multiple input parameters, exhaustive testing quickly becomes impractical. A user registration form with 5 parameters, each with 3-5 values, produces between 3^5 = 243 and 5^5 = 3,125 combinations. Testing all of them is expensive, and most combinations do not reveal unique bugs.
Combinatorial test design solves this by selecting a subset of combinations that guarantees every pair (or n-tuple) of parameter values appears in at least one test case. This dramatically reduces test count while maintaining high defect detection.
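To make the scale concrete, here is a minimal sketch that computes the exhaustive test count for a set of parameter sizes. The function name `exhaustive_count` is illustrative only and not part of any tool mentioned in this lesson.

```python
from math import prod


def exhaustive_count(values_per_parameter: list[int]) -> int:
    """Number of test cases needed to cover every full combination."""
    return prod(values_per_parameter)


# Five parameters with 3 values each, then with 5 values each.
smallest = exhaustive_count([3] * 5)
largest = exhaustive_count([5] * 5)
print(smallest, largest)
```

Because the count is a product, adding one more parameter multiplies the suite size, which is why exhaustive testing stops scaling so quickly.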
Pairwise (2-Way) Testing
Pairwise testing ensures that every combination of values for any two parameters appears in at least one test case. NIST studies report that a large share of software defects (approximately 70-90%, varying by application domain) are triggered by a single parameter value or by an interaction between two parameters.
Prompt for Pairwise Generation
Generate a pairwise combinatorial test suite for a user registration form with
these parameters:
- Browser: Chrome, Firefox, Safari, Edge
- OS: Windows, macOS, Linux
- Language: English, Spanish, Japanese
- Account type: Free, Pro, Enterprise
- Auth method: Email/password, Google SSO, SAML
Requirements:
- Every pair of parameter values must appear in at least one test case
- Minimize total number of test cases
- Output as a numbered table
Example Output
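A strong LLM can produce a compact pairwise set of approximately 16-20 test configurations (the theoretical minimum here is at least 4 x 3 = 12) instead of the full 4 x 3 x 3 x 3 x 3 = 324 combinations. Treat the output as a draft until its coverage has been checked: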
| # | Browser | OS | Language | Account | Auth |
|---|---|---|---|---|---|
| 1 | Chrome | Windows | English | Free | Email/password |
| 2 | Chrome | macOS | Spanish | Pro | Google SSO |
| 3 | Chrome | Linux | Japanese | Enterprise | SAML |
| 4 | Firefox | Windows | Spanish | Enterprise | SAML |
| 5 | Firefox | macOS | Japanese | Free | Email/password |
| 6 | Firefox | Linux | English | Pro | Google SSO |
| 7 | Safari | Windows | Japanese | Pro | Email/password |
| 8 | Safari | macOS | English | Enterprise | SAML |
| 9 | Safari | Linux | Spanish | Free | Google SSO |
| 10 | Edge | Windows | English | Free | Google SSO |
| 11 | Edge | macOS | Spanish | Enterprise | Email/password |
| 12 | Edge | Linux | Japanese | Pro | SAML |
| 13 | Chrome | Linux | Japanese | Pro | Email/password |
| 14 | Firefox | macOS | English | Enterprise | Google SSO |
| 15 | Safari | Linux | English | Free | SAML |
| 16 | Edge | Windows | Japanese | Free | Google SSO |
That is 16 tests covering all pairs instead of 324 exhaustive combinations -- a 95% reduction in test count.
Validating Pairwise Coverage
After generation, verify that every pair actually appears:
from itertools import combinations


def verify_pairwise_coverage(test_suite: list[dict], parameters: dict) -> list[str]:
    """Check that every pair of parameter values appears in at least one test."""
    missing_pairs = []
    param_names = list(parameters.keys())
    for p1, p2 in combinations(param_names, 2):
        for v1 in parameters[p1]:
            for v2 in parameters[p2]:
                found = any(
                    test[p1] == v1 and test[p2] == v2
                    for test in test_suite
                )
                if not found:
                    missing_pairs.append(f"({p1}={v1}, {p2}={v2})")
    return missing_pairs


# Usage
parameters = {
    "browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "os": ["Windows", "macOS", "Linux"],
    "language": ["English", "Spanish", "Japanese"],
    "account": ["Free", "Pro", "Enterprise"],
    "auth": ["Email/password", "Google SSO", "SAML"],
}

# One dict per row of the generated table, keyed by the same names as `parameters`.
test_suite = [
    {"browser": "Chrome", "os": "Windows", "language": "English",
     "account": "Free", "auth": "Email/password"},
    # ... remaining rows of the table
]

missing = verify_pairwise_coverage(test_suite, parameters)
if missing:
    print(f"INCOMPLETE: {len(missing)} pairs not covered:")
    for pair in missing:
        print(f"  - {pair}")
else:
    print("All pairs covered!")
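The verifier expects the suite as a list of dicts, while an LLM usually returns a table. One possible way to bridge the two is to parse the pipe-delimited table directly. This is a sketch under the assumption that the output follows the table layout shown in Example Output; `parse_markdown_table` is a hypothetical helper name.

```python
def parse_markdown_table(table_text: str) -> list[dict]:
    """Turn a pipe-delimited table into dicts keyed by lowercased headers."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        rows.append(cells)
    header = [name.lower() for name in rows[0]]
    suite = []
    for cells in rows[2:]:  # rows[1] is the |---|---| separator line
        record = dict(zip(header, cells))
        record.pop("#", None)  # drop the row-number column
        suite.append(record)
    return suite


sample = """
| # | Browser | OS | Language | Account | Auth |
|---|---|---|---|---|---|
| 1 | Chrome | Windows | English | Free | Email/password |
| 2 | Chrome | macOS | Spanish | Pro | Google SSO |
"""
test_suite = parse_markdown_table(sample)
print(test_suite[0])
```

Lowercasing the headers makes the keys line up with the `parameters` dict used by `verify_pairwise_coverage`. It would not handle cell values that themselves contain a pipe character.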
Higher-Order Combinatorial Testing (3-Way, N-Way)
When defects are triggered by three or more interacting parameters, you need higher-order coverage.
When 3-Way Testing Is Necessary
- Security-critical systems (authentication, authorization, encryption)
- Financial calculations (currency + rounding + tax jurisdiction)
- Hardware/firmware testing (signal combinations)
- Configuration-dependent behavior (feature flags + environments + user roles)
Prompt for 3-Way Generation
Generate a 3-way combinatorial test suite for the payment processing module:
Parameters:
- Payment method: credit_card, debit_card, bank_transfer, paypal
- Currency: USD, EUR, GBP, JPY
- Amount range: small (< $10), medium ($10-$1000), large (> $1000)
- Customer type: new, returning, VIP
- Region: domestic, international
Requirements:
- Every combination of any 3 parameter values must appear in at least one test
- Minimize total number of test cases
- Output as a numbered table with all parameters
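The 3-way test count is significantly higher than pairwise. For this parameter set it is at least 48 tests, because the three largest parameters alone have 4 x 4 x 3 = 48 value triples that must each appear, and real suites are typically somewhat larger. That is still far fewer than the exhaustive 4 x 4 x 3 x 3 x 2 = 288 combinations.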
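The pairwise check generalizes to any interaction strength. The sketch below is one way to do it, with illustrative function names: it computes a simple lower bound on suite size (the product of the t largest value counts) and lists every t-way value combination that a suite fails to cover. The short parameter keys are assumptions made for the example.

```python
from itertools import combinations, product
from math import prod


def t_way_lower_bound(parameters: dict, t: int) -> int:
    """Any t-way suite needs at least the product of the t largest value counts."""
    sizes = sorted((len(values) for values in parameters.values()), reverse=True)
    return prod(sizes[:t])


def missing_t_way(test_suite: list[dict], parameters: dict, t: int) -> list[tuple]:
    """Return every t-tuple of (parameter, value) pairs that no test covers."""
    missing = []
    for names in combinations(parameters, t):
        covered = {tuple(test[n] for n in names) for test in test_suite}
        for values in product(*(parameters[n] for n in names)):
            if values not in covered:
                missing.append(tuple(zip(names, values)))
    return missing


payment = {
    "method": ["credit_card", "debit_card", "bank_transfer", "paypal"],
    "currency": ["USD", "EUR", "GBP", "JPY"],
    "amount": ["small", "medium", "large"],
    "customer": ["new", "returning", "VIP"],
    "region": ["domestic", "international"],
}
print(t_way_lower_bound(payment, 3))  # 4 * 4 * 3 = 48
```

Running `missing_t_way(suite, payment, 3)` on an AI-generated table gives the same kind of gap report as the pairwise verifier, which matters more here because 3-way suites are harder to get right by inspection.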
AI vs Dedicated Combinatorial Tools
| Approach | Best For | Limitation |
|---|---|---|
| AI-generated pairwise | Quick exploration, small parameter spaces (5-7 params) | Not mathematically optimal for large spaces |
| PICT (Microsoft) | Large parameter spaces (10+ params), provable coverage | Requires installation and configuration |
| ACTS (NIST) | Research-grade n-way coverage with constraints | Java dependency, steeper learning curve |
| AI + PICT hybrid | AI identifies parameters and constraints, PICT generates combos | Extra setup, but most rigorous |
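To give a feel for what algorithmic generation involves, here is a naive greedy sketch. It is not the algorithm used by PICT or ACTS; it enumerates every full combination, so it only suits small parameter spaces, and the function names are illustrative.

```python
from itertools import combinations, product


def pairs_of(test: dict) -> set:
    """All value pairs one test covers, as ((param, value), (param, value))."""
    items = sorted(test.items())  # fixed order keeps each pair canonical
    return set(combinations(items, 2))


def greedy_pairwise(parameters: dict) -> list[dict]:
    """Repeatedly pick the full combination that covers the most uncovered pairs."""
    names = list(parameters)
    candidates = [dict(zip(names, combo)) for combo in product(*parameters.values())]
    uncovered = set().union(*(pairs_of(c) for c in candidates))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


registration = {
    "browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "os": ["Windows", "macOS", "Linux"],
    "language": ["English", "Spanish", "Japanese"],
    "account": ["Free", "Pro", "Enterprise"],
    "auth": ["Email/password", "Google SSO", "SAML"],
}
suite = greedy_pairwise(registration)
print(len(suite), "tests")
```

The loop always terminates with full pair coverage because every uncovered pair is contained in at least one candidate. The result is not guaranteed to be minimal, and the sketch has no constraint handling, which is one reason to reach for a dedicated tool.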
The AI + PICT Hybrid Workflow
This is the most rigorous approach and worth mentioning in interviews:
Step 1: Ask AI to identify all relevant parameters and their values
from the specification or requirements document.
Step 2: Ask AI to identify constraints (invalid combinations that
should be excluded, e.g., "SAML auth is only available for
Enterprise accounts").
Step 3: Generate a PICT model file from the AI output:
# registration.pict (generated by AI, verified by human)
Browser: Chrome, Firefox, Safari, Edge
OS: Windows, macOS, Linux
Language: English, Spanish, Japanese
Account: Free, Pro, Enterprise
Auth: EmailPassword, GoogleSSO, SAML
# Constraints (AI-identified)
IF [Auth] = "SAML" THEN [Account] = "Enterprise";
IF [OS] = "macOS" THEN [Browser] <> "Edge";
Step 4: Run PICT to generate a compact test suite with full pair coverage:
$ pict registration.pict > test_combinations.tsv
Step 5: Ask AI to convert the TSV into executable test code
matching your framework and style.
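Step 3 can be scripted once the AI output has been reviewed. A minimal sketch, assuming the parameters arrive as a dict and the constraints as ready-made PICT constraint strings; `build_pict_model` is an illustrative name, not part of PICT itself.

```python
def build_pict_model(parameters: dict, constraints: list[str]) -> str:
    """Render parameters and constraint lines as PICT model text."""
    lines = [f"{name}: {', '.join(values)}" for name, values in parameters.items()]
    if constraints:
        lines.append("")  # blank line between parameters and constraints
        lines.extend(constraints)
    return "\n".join(lines) + "\n"


model = build_pict_model(
    {
        "Browser": ["Chrome", "Firefox", "Safari", "Edge"],
        "OS": ["Windows", "macOS", "Linux"],
        "Language": ["English", "Spanish", "Japanese"],
        "Account": ["Free", "Pro", "Enterprise"],
        "Auth": ["EmailPassword", "GoogleSSO", "SAML"],
    },
    [
        'IF [Auth] = "SAML" THEN [Account] = "Enterprise";',
        'IF [OS] = "macOS" THEN [Browser] <> "Edge";',
    ],
)
print(model)
```

Writing `model` to `registration.pict` and running the command from Step 4 completes the pipeline. Keeping the model in version control makes it easy to review any constraint the AI proposed.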
Converting Combinatorial Tables to Test Code
import csv

import pytest


def load_pairwise_tests(filepath: str) -> list[dict]:
    """Load pairwise test cases from a TSV file."""
    with open(filepath, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return list(reader)


PAIRWISE_TESTS = load_pairwise_tests("test_combinations.tsv")


class TestRegistrationCombinatorial:
    """Pairwise combinatorial tests for user registration."""

    @pytest.mark.parametrize("combo", PAIRWISE_TESTS,
                             ids=[f"combo-{i}" for i in range(len(PAIRWISE_TESTS))])
    def test_registration_combination(self, browser_driver, combo):
        """Each pairwise combination should either succeed or fail gracefully."""
        # browser_driver is a project-specific fixture, not part of pytest
        driver = browser_driver(
            browser=combo["Browser"],
            os=combo["OS"],
            language=combo["Language"]
        )
        # Navigate to registration
        driver.navigate("/register")
        # Fill form based on combination
        driver.select_account_type(combo["Account"])
        driver.select_auth_method(combo["Auth"])
        # Submit the form (hypothetical helper; use your driver's own submit call)
        driver.submit_registration()
        # Assert: registration either succeeds or shows a clear error
        # (no crashes, no blank pages, no 500 errors)
        assert driver.current_url in ["/dashboard", "/register"]
        if driver.current_url == "/register":
            assert driver.find_element(".error-message").is_displayed()
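Before parametrizing, it can be worth checking the loaded rows against the business constraints, in case the model file and the TSV have drifted apart. The sketch below is one way to do that. The two rules mirror the example constraints in the PICT model above, and the names `CONSTRAINTS` and `constraint_violations` are hypothetical.

```python
from typing import Callable

Row = dict[str, str]

# Each rule returns True when the row is valid. Adapt these to your real rules.
CONSTRAINTS: dict[str, Callable[[Row], bool]] = {
    "SAML requires Enterprise": lambda r: r["Auth"] != "SAML" or r["Account"] == "Enterprise",
    "No Edge on macOS": lambda r: not (r["OS"] == "macOS" and r["Browser"] == "Edge"),
}


def constraint_violations(rows: list[Row]) -> list[tuple[int, str]]:
    """Return (row index, rule name) for every rule that a row breaks."""
    return [
        (index, name)
        for index, row in enumerate(rows)
        for name, is_valid in CONSTRAINTS.items()
        if not is_valid(row)
    ]


rows = [
    {"Browser": "Chrome", "OS": "Windows", "Language": "English",
     "Account": "Enterprise", "Auth": "SAML"},
    {"Browser": "Edge", "OS": "macOS", "Language": "Spanish",
     "Account": "Free", "Auth": "SAML"},
]
print(constraint_violations(rows))
```

An empty result means every row respects the rules. A non-empty one points to rows that should be regenerated rather than silently skipped.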
Practical Guidelines
How Many Parameters Warrant Combinatorial Testing?
- 2-3 parameters: Test all combinations manually. Combinatorial tools are overkill.
- 4-7 parameters: Pairwise is the sweet spot. AI generation is fast and sufficient.
- 8-15 parameters: Use PICT or ACTS. AI-generated pairwise may miss pairs.
- 15+ parameters: You probably need to reduce the parameter space first. Consult with the team about which interactions actually matter.
How to Present This in Interviews
"For features with multiple interacting parameters, I use pairwise
combinatorial testing to reduce the test space. For a 5-parameter form
with 3-5 values each, this cuts 300+ combinations to about 16-20 tests
while covering all parameter pair interactions. I generate the initial
matrix using AI from the spec, verify coverage programmatically, then
convert it to parametrized tests. For larger parameter spaces or
safety-critical features, I use Microsoft PICT with AI-identified
constraints to get a compact suite with verifiably complete coverage."
Common Pitfalls
Forgetting constraints. Not all combinations are valid. SAML auth may only work with Enterprise accounts. If you do not model constraints, you get test cases that should never exist.
Confusing pairwise with random. Random sampling does not guarantee pair coverage. A random selection of 20 tests from the 324 combinations misses about 12-13% of the 102 value pairs on average, whereas a pairwise suite of similar size misses none.
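The size of that gap can be computed exactly rather than guessed. The sketch below assumes tests are drawn uniformly without replacement from the full set of combinations; `expected_missed_fraction` is an illustrative name.

```python
from itertools import combinations
from math import comb, prod


def expected_missed_fraction(sizes: list[int], sample_size: int) -> float:
    """Expected share of value pairs absent from a random sample of distinct tests."""
    total = prod(sizes)
    pair_count = 0
    expected_missed = 0.0
    for a, b in combinations(sizes, 2):
        containing = total // (a * b)  # full combinations containing one given pair
        # Probability that none of the sampled tests contains that pair
        p_missed = comb(total - containing, sample_size) / comb(total, sample_size)
        pair_count += a * b
        expected_missed += a * b * p_missed
    return expected_missed / pair_count


# Registration example: value counts 4, 3, 3, 3, 3 and a sample of 20 tests.
print(f"{expected_missed_fraction([4, 3, 3, 3, 3], 20):.1%}")
```

Raising `sample_size` shows how many random tests it takes to approach the coverage that a deliberately constructed pairwise suite reaches with far fewer.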
Ignoring negative combinations. Pairwise covers valid interactions, but you also need tests for invalid combinations (e.g., negative quantity + expired coupon + missing auth).
Over-relying on AI optimality. AI-generated pairwise suites are often good, but they can silently miss pairs and are not always minimal. Always run a coverage check on the output, and for critical systems generate the suite with a dedicated tool.
Key Takeaway
Combinatorial testing is the bridge between "test everything" (impractical) and "test a few things" (insufficient). AI makes it accessible by generating pairwise suites from natural language descriptions of parameters. For maximum rigor, combine AI parameter identification with dedicated tools like PICT for generation.