Data Structures for QA
Understanding data structures is not academic trivia — it directly impacts how you write test assertions, process test data, validate API responses, and build efficient test utilities. You do not need to implement a red-black tree, but you must be fluent with lists, dictionaries, and sets.
Lists and Arrays
Lists (Python) and arrays (JavaScript) are the most common data structures in test automation. API responses contain lists of items, test results are lists of pass/fail records, and log files are lists of lines.
Python Lists
# Filtering test results
results = [
    {"name": "test_login", "status": "PASS", "duration": 1.2},
    {"name": "test_signup", "status": "FAIL", "duration": 3.5},
    {"name": "test_logout", "status": "PASS", "duration": 0.8},
    {"name": "test_profile", "status": "FAIL", "duration": 2.1},
]
# List comprehension: filter failed tests
failed = [t for t in results if t["status"] == "FAIL"]
assert len(failed) == 2
# Sort by duration (slowest first)
slowest = sorted(results, key=lambda t: t["duration"], reverse=True)
assert slowest[0]["name"] == "test_signup"
# Extract just the names
names = [t["name"] for t in results]
assert "test_login" in names
# Check all tests passed (returns False if any failed)
all_passed = all(t["status"] == "PASS" for t in results)
assert not all_passed
# Check at least one test passed
any_passed = any(t["status"] == "PASS" for t in results)
assert any_passed
JavaScript/TypeScript Arrays
const results = [
  { name: "test_login", status: "PASS", duration: 1.2 },
  { name: "test_signup", status: "FAIL", duration: 3.5 },
  { name: "test_logout", status: "PASS", duration: 0.8 },
  { name: "test_profile", status: "FAIL", duration: 2.1 },
];
// Filter failed tests
const failed = results.filter(t => t.status === "FAIL");
expect(failed).toHaveLength(2);
// Sort by duration (slowest first). sort() mutates the array in place, so copy it with spread first.
const slowest = [...results].sort((a, b) => b.duration - a.duration);
expect(slowest[0].name).toBe("test_signup");
// Extract just the names
const names = results.map(t => t.name);
expect(names).toContain("test_login");
// Check all/any
const allPassed = results.every(t => t.status === "PASS");
expect(allPassed).toBe(false);
const anyPassed = results.some(t => t.status === "PASS");
expect(anyPassed).toBe(true);
Common List Operations in Testing
| Operation | Python | JavaScript |
|---|---|---|
| Filter | [x for x in items if cond] or filter(fn, items) | array.filter(fn) |
| Transform | [fn(x) for x in items] or map(fn, items) | array.map(fn) |
| Sort | sorted(items, key=fn) | [...array].sort(fn) |
| Find first | next((x for x in items if cond), None) | array.find(fn) |
| Check all | all(cond for x in items) | array.every(fn) |
| Check any | any(cond for x in items) | array.some(fn) |
| Flatten (one level) | [item for sub in nested for item in sub] | array.flat() |
| Unique | list(set(items)) (order not preserved) | [...new Set(array)] |
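The Python column of the table can be tried out on a small sample. The sketch below is illustrative only: the `records` data and its `tags` field are invented here and do not come from any real test run.

```python
# Hypothetical sample data, invented for this sketch.
records = [
    {"name": "test_login", "status": "PASS", "tags": ["smoke", "auth"]},
    {"name": "test_signup", "status": "FAIL", "tags": ["auth"]},
    {"name": "test_search", "status": "FAIL", "tags": ["smoke", "search"]},
]

# Find first: next() with a default avoids StopIteration when nothing matches.
first_failed = next((r for r in records if r["status"] == "FAIL"), None)
no_match = next((r for r in records if r["status"] == "SKIP"), None)

# Flatten: one level of nesting, order preserved.
all_tags = [tag for r in records for tag in r["tags"]]

# Unique: set() drops duplicates but does not keep order.
# dict.fromkeys() keeps first-seen order (dicts preserve insertion order in Python 3.7+).
unique_unordered = set(all_tags)
unique_ordered = list(dict.fromkeys(all_tags))

print(first_failed["name"])
print(all_tags)
print(unique_ordered)
```

When the order of results matters in an assertion, the `dict.fromkeys` form is usually the safer choice; comparing against a `set` is enough when it does not.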
Dictionaries and Objects
Dictionaries (Python) and objects (JavaScript) are the native format for JSON data, API responses, and configuration. You will work with them constantly.
Python Dictionaries
# API response handling (`response` is assumed to come from your HTTP client, e.g. requests)
user = {"email": "test@example.com", "role": "admin", "active": True}
assert response.json()["email"] == user["email"]
# Safely access nested data
config = {
    "environments": {
        "staging": {"url": "https://staging.example.com", "timeout": 30},
        "production": {"url": "https://example.com", "timeout": 10}
    }
}
staging_url = config.get("environments", {}).get("staging", {}).get("url")
assert staging_url == "https://staging.example.com"
# Dictionary comparison for response validation
expected = {"id": 1, "name": "Alice", "role": "admin"}
actual = response.json()
# Check expected is a subset of actual (actual may have extra fields)
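for key, value in expected.items():
    # Check presence first so a missing key fails with a clear message, not a KeyError
    assert key in actual, f"Missing key: {key}"
    assert actual[key] == value, f"Mismatch on {key}: expected {value}, got {actual[key]}"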
# Merge dictionaries (Python 3.9+)
default_headers = {"Content-Type": "application/json"}
auth_headers = {"Authorization": "Bearer token123"}
headers = default_headers | auth_headers
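The subset comparison above stops at the first failing key. One possible variation, sketched here, collects every mismatch before failing so a single test run shows the whole picture. The helper name `find_mismatches` is hypothetical, and `actual` is a hand-written stand-in for `response.json()`.

```python
_MISSING = object()  # sentinel so a missing key is distinguishable from a None value


def find_mismatches(expected: dict, actual: dict) -> dict:
    """Return {key: (expected_value, actual_value)} for every key that differs.

    Keys present only in `actual` are ignored, so extra fields do not fail the check.
    """
    mismatches = {}
    for key, value in expected.items():
        got = actual.get(key, _MISSING)
        if got != value:
            mismatches[key] = (value, "<missing>" if got is _MISSING else got)
    return mismatches


# Stand-in for response.json(); invented for this sketch.
actual = {"id": 1, "name": "Alice", "role": "viewer", "created_at": "2024-01-01"}
expected = {"id": 1, "name": "Alice", "role": "admin", "email": "alice@example.com"}

mismatches = find_mismatches(expected, actual)
print(mismatches)
```

In a test this would typically end with `assert not mismatches, f"Response mismatches: {mismatches}"`, which reports both the wrong `role` and the missing `email` in one failure message.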
JavaScript/TypeScript Objects
// Destructuring API responses
const { id, name, email } = response.data;
expect(id).toBeDefined();
expect(name).toBe("Alice");
// Spread operator for merging
const defaultHeaders = { "Content-Type": "application/json" };
const authHeaders = { Authorization: "Bearer token123" };
const headers = { ...defaultHeaders, ...authHeaders };
// Optional chaining for safe access
const stagingUrl = config?.environments?.staging?.url;
expect(stagingUrl).toBe("https://staging.example.com");
// Check object shape
expect(Object.keys(user)).toEqual(expect.arrayContaining(["id", "name", "email"]));
Sets
Sets provide fast membership checks, deduplication, and set operations (union, intersection, difference). They are essential for validating response fields and detecting duplicates.
Python Sets
# Validate required fields in API response
required = {"id", "name", "email", "created_at"}
actual = set(response.json().keys())
missing = required - actual
assert not missing, f"Missing fields: {missing}"
# Check no sensitive fields are exposed
forbidden = {"password", "password_hash", "ssn", "credit_card"}
exposed = forbidden & actual # intersection
assert not exposed, f"Sensitive fields exposed: {exposed}"
# Detect duplicate IDs across paginated responses
page1_ids = {u["id"] for u in page1_response.json()["items"]}
page2_ids = {u["id"] for u in page2_response.json()["items"]}
overlap = page1_ids & page2_ids
assert not overlap, f"Duplicate IDs across pages: {overlap}"
# Verify all expected statuses are present
expected_statuses = {"pending", "processing", "completed", "failed"}
actual_statuses = {o["status"] for o in orders}
missing_statuses = expected_statuses - actual_statuses
# (This tells you which statuses are not represented in the data)
JavaScript Sets
// Duplicate detection: a Set drops repeats, so equal sizes mean every id is unique
const ids = responses.map(r => r.id);
const uniqueIds = new Set(ids);
expect(uniqueIds.size).toBe(ids.length); // no duplicates
// Field validation
const required = new Set(["id", "name", "email", "created_at"]);
const actual = new Set(Object.keys(response.data));
const missing = [...required].filter(f => !actual.has(f));
expect(missing).toHaveLength(0);
Set Operations Reference
| Operation | Python | Use Case |
|---|---|---|
| Union | a \| b | All fields from two responses combined |
| Intersection | a & b | Fields present in both responses |
| Difference | a - b | Fields in a but not in b |
| Symmetric Diff | a ^ b | Fields in one but not both |
| Subset check | a <= b | Are all required fields present? |
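Each operator in the table can be seen on two small field sets. The field names below are hypothetical, chosen to resemble two versions of the same API response.

```python
# Hypothetical field names from two API response versions, invented for this sketch.
v1_fields = {"id", "name", "email"}
v2_fields = {"id", "name", "email_address", "created_at"}
required = {"id", "name"}

union = v1_fields | v2_fields          # every field seen in either version
common = v1_fields & v2_fields         # fields both versions share
removed_in_v2 = v1_fields - v2_fields  # fields v1 had that v2 dropped
changed = v1_fields ^ v2_fields        # fields present in exactly one version
has_required = required <= v2_fields   # True when every required field is in v2

# sorted() gives stable output, since sets have no defined order.
print(sorted(union))
print(sorted(common))
print(sorted(removed_in_v2))
print(sorted(changed))
print(has_required)
```

Sorting before printing or comparing against a list keeps assertions deterministic, because iteration order over a set is not something to rely on.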
Tuples and Named Tuples
Tuples are immutable sequences — useful for fixed data like locator pairs in page objects.
from selenium.webdriver.common.by import By
# Tuple for locators (cannot be accidentally modified)
EMAIL_INPUT = (By.CSS_SELECTOR, "input[name='email']")
PASSWORD_INPUT = (By.CSS_SELECTOR, "input[name='password']")
driver.find_element(*EMAIL_INPUT).send_keys("test@example.com")
# Named tuples for structured test data
from collections import namedtuple
TestUser = namedtuple("TestUser", ["email", "password", "role"])
admin = TestUser("admin@test.com", "AdminPass123!", "admin")
viewer = TestUser("viewer@test.com", "ViewerPass123!", "viewer")
assert admin.role == "admin"
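As a variation on the named tuple above, `typing.NamedTuple` allows type hints and default values. This is a sketch, not the only way to model test users: the class name `Credentials` is hypothetical, and it deliberately avoids a `Test` prefix because pytest's default discovery looks at classes whose names start with `Test`.

```python
from typing import NamedTuple


class Credentials(NamedTuple):
    """Hypothetical test-data record; fields mirror the namedtuple example."""
    email: str
    password: str
    role: str = "viewer"  # default for the common case


admin = Credentials("admin@test.com", "AdminPass123!", "admin")
viewer = Credentials("viewer@test.com", "ViewerPass123!")  # role falls back to the default

# Fields are read-only: assignment raises AttributeError.
try:
    admin.role = "viewer"
    mutated = True
except AttributeError:
    mutated = False

# _replace() returns a modified copy and leaves the original untouched.
suspended = admin._replace(role="suspended")

# Still a tuple, so positional unpacking works as it does for locator pairs.
email, password, role = viewer
print(mutated, suspended.role, admin.role, role)
```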
Choosing the Right Data Structure
| Need | Use | Why |
|---|---|---|
| Ordered collection of items | List / Array | Preserves order, allows duplicates |
| Key-value lookup | Dict / Object | Average O(1) lookup by key |
| Unique collection, membership checks | Set | Average O(1) membership check, automatic dedup |
| Immutable ordered data | Tuple | Cannot be accidentally modified |
| Counting occurrences | collections.Counter | Histogram of values |
| Ordered dict (insertion order) | dict (Python 3.7+) | Guaranteed insertion order |
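The `collections.Counter` row may be the least familiar one, so here is a short sketch of how it could be used for a status histogram and for spotting repeated IDs. The `statuses` and `ids` lists are invented sample data.

```python
from collections import Counter

# Hypothetical sample data, invented for this sketch.
statuses = ["PASS", "FAIL", "PASS", "PASS", "SKIP", "FAIL"]
ids = [101, 102, 103, 102, 101, 104]

# Histogram of values: maps each status to how often it occurs.
counts = Counter(statuses)
top_status, top_count = counts.most_common(1)[0]

# A key that never occurred reads as 0 instead of raising KeyError.
error_count = counts["ERROR"]

# Any ID counted more than once is a duplicate.
duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)

print(dict(counts))
print(top_status, top_count)
print(error_count)
print(duplicates)
```

Compared with the set-size check, the `Counter` approach also tells you which values are duplicated, which tends to make failure messages more useful.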
Practical Exercise
Given an API that returns paginated user lists, write functions that:
- Collect all users across all pages into a single list
- Verify no duplicate IDs exist across pages
- Find users that exist in the "active users" endpoint but not in the "all users" endpoint (use sets)
- Group users by role using a dictionary
- Sort users by creation date and verify they are in chronological order
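One way to start on the exercise is sketched below against an in-memory fake instead of a real API. Everything here is an assumption made for illustration: the `fetch_page` function, the `items` and `next_page` keys, the `created_at` ISO format, and the sample users. A real endpoint will have its own pagination scheme.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical stand-in for a paginated "all users" endpoint.
FAKE_PAGES = {
    1: {"items": [
        {"id": 1, "role": "admin", "created_at": "2024-01-05T09:00:00"},
        {"id": 2, "role": "viewer", "created_at": "2024-02-10T12:30:00"},
    ], "next_page": 2},
    2: {"items": [
        {"id": 3, "role": "viewer", "created_at": "2024-03-01T08:15:00"},
    ], "next_page": None},
}
# Hypothetical "active users" result; id 4 is deliberately absent from the pages above.
ACTIVE_USERS = [{"id": 2}, {"id": 4}]


def fetch_page(page: int) -> dict:
    """Stand-in for an HTTP GET that takes a page parameter."""
    return FAKE_PAGES[page]


def collect_all_users() -> list:
    """Follow next_page links until the last page and return one flat list."""
    users, page = [], 1
    while page is not None:
        body = fetch_page(page)
        users.extend(body["items"])
        page = body["next_page"]
    return users


def duplicate_ids(users: list) -> set:
    """Return the IDs that appear more than once."""
    seen, dupes = set(), set()
    for user in users:
        if user["id"] in seen:
            dupes.add(user["id"])
        else:
            seen.add(user["id"])
    return dupes


def group_by_role(users: list) -> dict:
    """Map each role to the list of user IDs that have it."""
    groups = defaultdict(list)
    for user in users:
        groups[user["role"]].append(user["id"])
    return dict(groups)


def is_chronological(users: list) -> bool:
    """True when created_at never decreases from one user to the next."""
    stamps = [datetime.fromisoformat(u["created_at"]) for u in users]
    return stamps == sorted(stamps)


all_users = collect_all_users()
# Set difference: active IDs that the "all users" listing does not contain.
active_not_in_all = {u["id"] for u in ACTIVE_USERS} - {u["id"] for u in all_users}
by_role = group_by_role(all_users)
print(len(all_users), duplicate_ids(all_users), active_not_in_all, by_role)
```

Swapping `fetch_page` for a real HTTP call leaves the other functions unchanged, which is the main reason for keeping the data-structure logic separate from the request code.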
Key Takeaways
- Lists/arrays: filtering, sorting, and transforming test data
- Dicts/objects: JSON response handling, configuration, key-value lookups
- Sets: field validation, duplicate detection, membership checks
- Choose the right structure for the job — sets for uniqueness, dicts for lookup, lists for order
- Master list comprehensions (Python) and array methods (JavaScript) for concise test code