Data Structures for QA

Understanding data structures is not academic trivia — it directly impacts how you write test assertions, process test data, validate API responses, and build efficient test utilities. You do not need to implement a red-black tree, but you must be fluent with lists, dictionaries, and sets.

Lists and Arrays

Lists (Python) and arrays (JavaScript) are the most common data structures in test automation. API responses contain lists of items. Test results are lists of pass/fail records. Log files are lists of lines.

Python Lists

# Filtering test results
results = [
    {"name": "test_login", "status": "PASS", "duration": 1.2},
    {"name": "test_signup", "status": "FAIL", "duration": 3.5},
    {"name": "test_logout", "status": "PASS", "duration": 0.8},
    {"name": "test_profile", "status": "FAIL", "duration": 2.1},
]

# List comprehension: filter failed tests
failed = [t for t in results if t["status"] == "FAIL"]
assert len(failed) == 2

# Sort by duration (slowest first)
slowest = sorted(results, key=lambda t: t["duration"], reverse=True)
assert slowest[0]["name"] == "test_signup"

# Extract just the names
names = [t["name"] for t in results]
assert "test_login" in names

# Check all tests passed (returns False if any failed)
all_passed = all(t["status"] == "PASS" for t in results)
assert not all_passed

# Check at least one test passed
any_passed = any(t["status"] == "PASS" for t in results)
assert any_passed

JavaScript/TypeScript Arrays

const results = [
    { name: "test_login", status: "PASS", duration: 1.2 },
    { name: "test_signup", status: "FAIL", duration: 3.5 },
    { name: "test_logout", status: "PASS", duration: 0.8 },
    { name: "test_profile", status: "FAIL", duration: 2.1 },
];

// Filter failed tests
const failed = results.filter(t => t.status === "FAIL");
expect(failed).toHaveLength(2);

// Sort by duration (slowest first). sort() mutates the array in place, so sort a copy.
const slowest = [...results].sort((a, b) => b.duration - a.duration);
expect(slowest[0].name).toBe("test_signup");

// Extract just the names
const names = results.map(t => t.name);
expect(names).toContain("test_login");

// Check all/any
const allPassed = results.every(t => t.status === "PASS");
const anyPassed = results.some(t => t.status === "PASS");
expect(allPassed).toBe(false);  // two tests failed
expect(anyPassed).toBe(true);

Common List Operations in Testing

Operation	Python	JavaScript
Filter	`[x for x in list if cond]` or `filter()`	`array.filter(fn)`
Transform	`[fn(x) for x in list]` or `map()`	`array.map(fn)`
Sort	`sorted(list, key=fn)`	`[...array].sort(fn)`
Find first	`next((x for x in list if cond), None)`	`array.find(fn)`
Check all	`all(cond for x in list)`	`array.every(fn)`
Check any	`any(cond for x in list)`	`array.some(fn)`
Flatten	`[item for sub in nested for item in sub]`	`array.flat()`
Unique	`list(set(list))` (order lost) or `list(dict.fromkeys(list))` (order kept)	`[...new Set(array)]` (order kept)
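The less obvious rows of the table (find first, flatten, unique) can be sketched in Python as follows. The `suites` data is hypothetical and exists only for illustration.

```python
# Hypothetical nested data: one list of test names per suite.
suites = [
    ["test_login", "test_logout"],
    ["test_signup", "test_login"],
]

# Flatten one level of nesting.
all_tests = [name for suite in suites for name in suite]
assert all_tests == ["test_login", "test_logout", "test_signup", "test_login"]

# Find first: next() with a default returns None instead of raising StopIteration.
first_signup = next((n for n in all_tests if "signup" in n), None)
no_match = next((n for n in all_tests if "payment" in n), None)
assert first_signup == "test_signup"
assert no_match is None

# Unique via set: duplicates are removed, but the original order is not guaranteed.
unique_unordered = set(all_tests)
assert unique_unordered == {"test_login", "test_logout", "test_signup"}

# Unique via dict.fromkeys: keeps the first occurrence of each item, in order.
unique_in_order = list(dict.fromkeys(all_tests))
assert unique_in_order == ["test_login", "test_logout", "test_signup"]
```

Prefer the `dict.fromkeys` form whenever a later assertion compares against an ordered list.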

Dictionaries and Objects

Dictionaries (Python) and objects (JavaScript) are the native format for JSON data, API responses, and configuration. You will work with them constantly.

Python Dictionaries

# API response handling (`response` is assumed to be a requests-style response object)
user = {"email": "test@example.com", "role": "admin", "active": True}
assert response.json()["email"] == user["email"]

# Safely access nested data
config = {
    "environments": {
        "staging": {"url": "https://staging.example.com", "timeout": 30},
        "production": {"url": "https://example.com", "timeout": 10}
    }
}
staging_url = config.get("environments", {}).get("staging", {}).get("url")
assert staging_url == "https://staging.example.com"

# Dictionary comparison for response validation
expected = {"id": 1, "name": "Alice", "role": "admin"}
actual = response.json()
# Check expected is a subset of actual (actual may have extra fields)
for key, value in expected.items():
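    # Check presence first so a missing key fails with a clear message, not a KeyError
    assert key in actual, f"Missing key: {key}"
    assert actual[key] == value, f"Mismatch on {key}: expected {value!r}, got {actual[key]!r}"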

# Merge dictionaries (Python 3.9+)
default_headers = {"Content-Type": "application/json"}
auth_headers = {"Authorization": "Bearer token123"}
headers = default_headers | auth_headers
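A subset check that stops at the first failing key hides any later problems. As a minimal sketch, the helper below collects every mismatch in one pass. The name `find_mismatches`, the `"<missing>"` marker, and the sample data are all assumptions made for illustration.

```python
def find_mismatches(expected: dict, actual: dict) -> dict:
    """Return {key: (expected_value, actual_value)} for every key that differs.

    Keys absent from `actual` are reported with the marker "<missing>"
    instead of raising KeyError, so one run surfaces every problem.
    """
    marker = "<missing>"
    return {
        key: (value, actual.get(key, marker))
        for key, value in expected.items()
        if actual.get(key, marker) != value
    }


# Hypothetical response body: one extra field, one wrong value, one missing field.
actual_body = {"id": 1, "name": "Alice", "role": "viewer", "created_at": "2024-01-01"}
expected_body = {"id": 1, "name": "Alice", "role": "admin", "email": "alice@example.com"}

mismatches = find_mismatches(expected_body, actual_body)
assert mismatches == {
    "role": ("admin", "viewer"),
    "email": ("alice@example.com", "<missing>"),
}

# Extra fields in the actual body are ignored, so a matching subset is clean.
assert find_mismatches({"id": 1}, actual_body) == {}
```

In a test, `assert not mismatches, f"Response mismatches: {mismatches}"` reports all differences in a single failure message.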

JavaScript/TypeScript Objects

// Destructuring API responses
const { id, name, email } = response.data;
expect(id).toBeDefined();
expect(name).toBe("Alice");

// Spread operator for merging
const defaultHeaders = { "Content-Type": "application/json" };
const authHeaders = { Authorization: "Bearer token123" };
const headers = { ...defaultHeaders, ...authHeaders };

// Optional chaining for safe access
const stagingUrl = config?.environments?.staging?.url;
expect(stagingUrl).toBe("https://staging.example.com");

// Check object shape
expect(Object.keys(user)).toEqual(expect.arrayContaining(["id", "name", "email"]));

Sets

Sets provide fast membership checks, deduplication, and set operations (union, intersection, difference). They are essential for validating response fields and detecting duplicates.

Python Sets

# Validate required fields in API response
required = {"id", "name", "email", "created_at"}
actual = set(response.json().keys())
missing = required - actual
assert not missing, f"Missing fields: {missing}"

# Check no sensitive fields are exposed
forbidden = {"password", "password_hash", "ssn", "credit_card"}
exposed = forbidden & actual  # intersection
assert not exposed, f"Sensitive fields exposed: {exposed}"

# Detect duplicate IDs across paginated responses
page1_ids = {u["id"] for u in page1_response.json()["items"]}
page2_ids = {u["id"] for u in page2_response.json()["items"]}
overlap = page1_ids & page2_ids
assert not overlap, f"Duplicate IDs across pages: {overlap}"

# Verify all expected statuses are present
expected_statuses = {"pending", "processing", "completed", "failed"}
actual_statuses = {o["status"] for o in orders}
missing_statuses = expected_statuses - actual_statuses
# (This tells you which statuses are not represented in the data)

JavaScript Sets

// Deduplication
const ids = responses.map(r => r.id);
const uniqueIds = new Set(ids);
expect(uniqueIds.size).toBe(ids.length);  // no duplicates

// Field validation
const required = new Set(["id", "name", "email", "created_at"]);
const actual = new Set(Object.keys(response.data));
const missing = [...required].filter(f => !actual.has(f));
expect(missing).toHaveLength(0);

Set Operations Reference

Operation	Python	Use Case
Union	`a | b`	All fields from two responses combined
Intersection	`a & b`	Fields present in both responses
Difference	`a - b`	Fields in a but not in b
Symmetric Diff	`a ^ b`	Fields in one but not both
Subset check	`a <= b`	Are all required fields present?
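Each operator in the table can be checked on a small example. The field sets below are hypothetical, standing in for two versions of the same endpoint.

```python
# Hypothetical field sets from two versions of the same endpoint.
v1_fields = {"id", "name", "email"}
v2_fields = {"id", "name", "created_at"}

union = v1_fields | v2_fields            # fields in either version
intersection = v1_fields & v2_fields     # fields in both versions
difference = v1_fields - v2_fields       # fields only in v1
symmetric_diff = v1_fields ^ v2_fields   # fields in exactly one version

assert union == {"id", "name", "email", "created_at"}
assert intersection == {"id", "name"}
assert difference == {"email"}
assert symmetric_diff == {"email", "created_at"}

# Subset check: are all required fields present?
required = {"id", "name"}
assert required <= v1_fields
assert not ({"id", "phone"} <= v1_fields)

# The method forms accept any iterable, so a list of keys works without conversion.
assert required.issubset(["id", "name", "email"])
```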

Tuples and Named Tuples

Tuples are immutable sequences — useful for fixed data like locator pairs in page objects.

from selenium.webdriver.common.by import By

# Tuple for locators (cannot be accidentally modified)
EMAIL_INPUT = (By.CSS_SELECTOR, "input[name='email']")
PASSWORD_INPUT = (By.CSS_SELECTOR, "input[name='password']")

driver.find_element(*EMAIL_INPUT).send_keys("test@example.com")

# Named tuples for structured test data
from collections import namedtuple
TestUser = namedtuple("TestUser", ["email", "password", "role"])
admin = TestUser("admin@test.com", "AdminPass123!", "admin")
viewer = TestUser("viewer@test.com", "ViewerPass123!", "viewer")

assert admin.role == "admin"
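To illustrate what immutability means in practice, the sketch below repeats the `TestUser` definition and shows that assignment fails while `_replace` builds a modified copy. The variable names `demoted` and `assignment_blocked` are introduced here for illustration only.

```python
from collections import namedtuple

TestUser = namedtuple("TestUser", ["email", "password", "role"])
admin = TestUser("admin@test.com", "AdminPass123!", "admin")

# Fields cannot be reassigned: the attempt raises AttributeError.
try:
    admin.role = "viewer"
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True
assert assignment_blocked

# _replace returns a new tuple and leaves the original untouched.
demoted = admin._replace(role="viewer")
assert demoted.role == "viewer"
assert admin.role == "admin"

# A named tuple still unpacks like a plain tuple and converts to a dict,
# which is handy for building request payloads.
email, password, role = admin
assert email == "admin@test.com"
assert admin._asdict() == {
    "email": "admin@test.com",
    "password": "AdminPass123!",
    "role": "admin",
}
```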

Choosing the Right Data Structure

Need	Use	Why
Ordered collection of items	List / Array	Preserves order, allows duplicates
Key-value lookup	Dict / Object	Average O(1) lookup by key
Unique collection, membership checks	Set	Average O(1) membership check, automatic dedup
Immutable ordered data	Tuple	Cannot be accidentally modified
Counting occurrences	`collections.Counter`	Histogram of values
Ordered dict (insertion order)	`dict` (Python 3.7+)	Guaranteed insertion order
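The `collections.Counter` row is the only structure in the table without an example elsewhere in this section. A minimal sketch, using hypothetical test results:

```python
from collections import Counter

# Hypothetical test results, in the same shape as the earlier list examples.
results = [
    {"name": "test_login", "status": "PASS"},
    {"name": "test_signup", "status": "FAIL"},
    {"name": "test_logout", "status": "PASS"},
    {"name": "test_profile", "status": "FAIL"},
    {"name": "test_search", "status": "PASS"},
]

# Count how many results carry each status.
status_counts = Counter(t["status"] for t in results)
assert dict(status_counts) == {"PASS": 3, "FAIL": 2}

# Missing keys count as zero instead of raising KeyError.
assert status_counts["SKIP"] == 0

# most_common(n) returns the n highest counts as (value, count) pairs.
assert status_counts.most_common(1) == [("PASS", 3)]
```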

Practical Exercise

Given an API that returns paginated user lists, write functions that:

Collect all users across all pages into a single list
Verify no duplicate IDs exist across pages
Find users that exist in the "active users" endpoint but not in the "all users" endpoint (use sets)
Group users by role using a dictionary
Sort users by creation date and verify they are in chronological order
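One possible starting point for the exercise is sketched below. It covers collecting pages, detecting duplicate IDs, and grouping by role, and leaves the endpoint comparison and the date ordering to you. Everything here is an assumption: `FAKE_PAGES` and `fetch_page` are hypothetical stand-ins for a real API, and the `next_page` field is an invented pagination scheme that your API may not share.

```python
from collections import defaultdict

# Hypothetical stand-in for the paginated API: page number -> response body.
FAKE_PAGES = {
    1: {"items": [{"id": 1, "role": "admin"}, {"id": 2, "role": "viewer"}], "next_page": 2},
    2: {"items": [{"id": 3, "role": "viewer"}], "next_page": None},
}


def fetch_page(page: int) -> dict:
    """Hypothetical fetcher; a real one would call the API with an HTTP client."""
    return FAKE_PAGES[page]


def collect_all_users(fetch=fetch_page) -> list:
    """Follow next_page links until the API reports no further page."""
    users, page = [], 1
    while page is not None:
        body = fetch(page)
        users.extend(body["items"])
        page = body["next_page"]
    return users


def find_duplicate_ids(users: list) -> set:
    """Return the IDs that appear more than once."""
    seen, duplicates = set(), set()
    for user in users:
        if user["id"] in seen:
            duplicates.add(user["id"])
        seen.add(user["id"])
    return duplicates


def group_ids_by_role(users: list) -> dict:
    """Map each role to the list of user IDs that hold it."""
    groups = defaultdict(list)
    for user in users:
        groups[user["role"]].append(user["id"])
    return dict(groups)


users = collect_all_users()
assert [u["id"] for u in users] == [1, 2, 3]
assert find_duplicate_ids(users) == set()
assert find_duplicate_ids(users + [{"id": 2, "role": "viewer"}]) == {2}
assert group_ids_by_role(users) == {"admin": [1], "viewer": [2, 3]}
```

Passing the fetcher as a parameter keeps the functions testable against canned pages like these before pointing them at a live endpoint.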

Key Takeaways

Lists/arrays: filtering, sorting, and transforming test data
Dicts/objects: JSON response handling, configuration, key-value lookups
Sets: field validation, duplicate detection, membership checks
Choose the right structure for the job — sets for uniqueness, dicts for lookup, lists for order
Master list comprehensions (Python) and array methods (JavaScript) for concise test code