QA Engineer Skills 2026

Structured Logging Best Practices

Why Structured Logging Matters for QA

Unstructured logs are human-readable but machine-hostile. When a production incident occurs at 3 AM, you need to query logs programmatically -- filter by service, time range, error type, and user ID. Structured logging outputs events in a parseable format (JSON), enabling automated analysis, filtering, and correlation that are impossible with free-text logs.


Unstructured vs. Structured Logging

# BAD: Unstructured -- human-readable, machine-hostile
2026-01-15 14:23:45 ERROR Failed to process order for user john@example.com: timeout after 30s

# GOOD: Structured JSON -- machine-parseable, human-readable with tooling
{
  "timestamp": "2026-01-15T14:23:45.123Z",
  "level": "error",
  "service": "order-service",
  "event": "order_processing_failed",
  "user_id": "usr_42",
  "order_id": "ord_789",
  "error_type": "timeout",
  "timeout_seconds": 30,
  "downstream_service": "payment-service",
  "trace_id": "abc123def456",
  "span_id": "span_789",
  "environment": "production",
  "version": "2.4.1"
}

The structured version enables queries like:

  • "Show all timeout errors in the last hour for the payment-service dependency"
  • "Count errors by type for user usr_42"
  • "Correlate this error with the trace abc123def456 to see the full request flow"
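Such queries are normally issued through the aggregation stack, but the mechanics are easy to sketch in plain Python. The field names below match the sample entry above; `timeout_errors_last_hour` is a hypothetical helper, not part of any library:

```python
import json
from datetime import datetime, timedelta, timezone

def timeout_errors_last_hour(log_lines, downstream="payment-service"):
    """Filter structured log lines: timeout errors hitting one downstream service."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=1)
    results = []
    for line in log_lines:
        entry = json.loads(line)
        # "Z" suffix -> explicit UTC offset so fromisoformat can parse it
        ts = datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))
        if (entry.get("error_type") == "timeout"
                and entry.get("downstream_service") == downstream
                and ts >= cutoff):
            results.append(entry)
    return results
```

The same filter is a one-liner in LogQL or KQL once the logs are aggregated; the point is that every clause keys on a structured field rather than on substring matching.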

Implementing Structured Logging (Python)

# structured_logging_setup.py
import structlog
import logging

# Configure structlog for JSON output
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,      # thread-safe context
        structlog.processors.add_log_level,           # add "level" field
        structlog.processors.StackInfoRenderer(),     # include stack traces
        structlog.dev.set_exc_info,                   # attach exception info
        structlog.processors.TimeStamper(fmt="iso"),  # ISO 8601 timestamps
        structlog.processors.JSONRenderer(),          # output as JSON
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)

log = structlog.get_logger()

def process_order(user_id: str, order_id: str, items: list):
    # Bind context that will appear in ALL subsequent log entries
    log_ctx = log.bind(user_id=user_id, order_id=order_id, item_count=len(items))

    log_ctx.info("order_processing_started")

    try:
        total = calculate_total(items)
        log_ctx.info("order_total_calculated", total_cents=total)

        payment_result = charge_payment(user_id, total)
        log_ctx.info("payment_processed",
                     payment_id=payment_result.id,
                     payment_method=payment_result.method)

        reserve_inventory(items)
        log_ctx.info("inventory_reserved")

        log_ctx.info("order_processing_completed", duration_ms=elapsed())

    except PaymentTimeoutError as e:
        log_ctx.error("payment_timeout",
                      downstream_service="payment-service",
                      timeout_seconds=e.timeout,
                      retry_count=e.retries)
        raise

    except InsufficientInventoryError as e:
        log_ctx.warning("inventory_insufficient",
                        missing_items=e.missing_items,
                        available=e.available)
        raise
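The listing above calls an `elapsed()` helper it never defines. One way to supply `duration_ms` is a small timer context manager (a sketch; `log_timer` is illustrative, not part of structlog):

```python
import time
from contextlib import contextmanager

@contextmanager
def log_timer():
    """Yield a callable that reports milliseconds elapsed since entry."""
    start = time.monotonic()  # monotonic clock: immune to wall-clock jumps
    yield lambda: round((time.monotonic() - start) * 1000, 2)

# usage:
# with log_timer() as elapsed:
#     ...  # do the work
#     log_ctx.info("order_processing_completed", duration_ms=elapsed())
```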

Node.js/TypeScript Example

// structured-logger.ts
import pino from 'pino';

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
  base: {
    service: 'order-service',
    version: process.env.APP_VERSION,
    environment: process.env.NODE_ENV,
  },
});

export function processOrder(userId: string, orderId: string, items: Item[]) {
  const orderLog = logger.child({ userId, orderId, itemCount: items.length });

  orderLog.info('order_processing_started');

  try {
    const total = calculateTotal(items);
    orderLog.info({ totalCents: total }, 'order_total_calculated');

    const payment = chargePayment(userId, total);
    orderLog.info({ paymentId: payment.id, method: payment.method }, 'payment_processed');

    orderLog.info('order_processing_completed');
  } catch (err) {
    orderLog.error({ err, downstreamService: 'payment-service' }, 'order_processing_failed');
    throw err;
  }
}

Essential Fields for Every Log Entry

Field        Required?        Purpose
timestamp    Yes              When the event occurred (ISO 8601)
level        Yes              Severity (debug, info, warn, error)
service      Yes              Which service emitted the log
event        Yes              Machine-readable event name (snake_case)
trace_id     Yes              Correlation with distributed traces
span_id      Recommended      Correlation with specific trace span
environment  Recommended      production, staging, development
version      Recommended      Application version for deployment correlation
user_id      When available   User attribution (hashed for PII compliance)
request_id   Recommended      Correlation within a single request
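Rather than trusting every call site to supply these fields, they can be enforced in one place. A minimal sketch, independent of any logging library (`make_entry` and its defaults are illustrative):

```python
import json
import os
import uuid
from datetime import datetime, timezone
from typing import Optional

SERVICE = "order-service"
REQUIRED = ("timestamp", "level", "service", "event", "trace_id")

def make_entry(event: str, level: str = "info",
               trace_id: Optional[str] = None, **fields) -> str:
    """Build one JSON log line carrying the mandatory base fields."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "service": SERVICE,
        "event": event,
        # fall back to a fresh ID so the field is never absent
        "trace_id": trace_id or uuid.uuid4().hex,
        "environment": os.environ.get("APP_ENV", "development"),
        **fields,
    }
    missing = [f for f in REQUIRED if not entry.get(f)]
    if missing:
        raise ValueError(f"log entry missing required fields: {missing}")
    return json.dumps(entry)
```

In structlog or pino the same effect comes from processors and `base` bindings, as shown in the earlier examples; the value of a central builder is that a missing mandatory field fails loudly in tests instead of silently in production.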

Logging Dos and Don'ts for Testability

Do:

  • Log at every significant state transition
  • Include correlation IDs (trace_id, request_id)
  • Use consistent event names across services
  • Include timing data for operations
  • Separate business events from technical events
  • Use structured fields for every variable value

Don't:

  • Log raw request/response bodies (PII risk)
  • Use string interpolation for log messages
  • Log at DEBUG level in production
  • Log secrets, tokens, or passwords
  • Create logs only useful during development
  • Embed values in the message string

Anti-Pattern: Values in Message Strings

# BAD -- values embedded in string, impossible to query
log.info(f"User {user_id} placed order {order_id} for ${total}")

# GOOD -- values as structured fields, queryable
log.info("order_placed", user_id=user_id, order_id=order_id, total_cents=total)

Log-Based Testing

Structured logs enable a powerful testing pattern: asserting on log output to verify system behavior.

# test_order_logging.py
import json

def test_order_flow_produces_expected_logs(capsys, order_service):
    """Verify the order flow logs all expected events."""
    order_service.process_order("user_1", "order_1", [{"sku": "A", "qty": 1}])

    captured = capsys.readouterr()
    log_lines = [json.loads(line) for line in captured.out.strip().split('\n')]

    # Verify expected events are logged in order
    events = [log["event"] for log in log_lines]
    assert "order_processing_started" in events
    assert "order_total_calculated" in events
    assert "payment_processed" in events
    assert "inventory_reserved" in events
    assert "order_processing_completed" in events

    # Verify context is propagated
    for log in log_lines:
        assert log["user_id"] == "user_1"
        assert log["order_id"] == "order_1"

    # Verify timing data is present
    completed = next(l for l in log_lines if l["event"] == "order_processing_completed")
    assert "duration_ms" in completed
    assert completed["duration_ms"] > 0
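A cheaper variant of the same pattern checks that every emitted line is valid single-line JSON carrying the mandatory fields from the table above (a sketch; `validate_log_line` is illustrative):

```python
import json

MANDATORY = ("timestamp", "level", "service", "event")

def validate_log_line(line: str) -> list:
    """Return a list of problems with one log line; an empty list means valid."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(entry, dict):
        return ["not a JSON object"]
    return [f"missing field: {f}" for f in MANDATORY if f not in entry]
```

Running this over captured output in CI catches the most common regressions: a stray `print()`, a multi-line stack trace breaking one-event-per-line, or a refactor that drops a mandatory field.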

Log Aggregation Architecture

[Service A]     [Service B]     [Service C]
  stdout          stdout          stdout
    |               |               |
    v               v               v
[Log Collector: Fluentd / Fluent Bit / Vector]
    |
    v
[Log Storage: Elasticsearch / Loki / CloudWatch Logs]
    |
    v
[Query UI: Kibana / Grafana / CloudWatch Insights]
    |
    v
[Alerting: based on log patterns and frequency]

Tool Recommendations

Component   Recommended                      Alternative
Collection  Fluent Bit (low resource)        Fluentd, Vector
Storage     Grafana Loki (cost-effective)    Elasticsearch, CloudWatch
Query       Grafana (LogQL)                  Kibana (KQL), CloudWatch Insights
Alerting    Grafana Alerts                   ElastAlert, CloudWatch Alarms

PII in Logs: A Testing Concern

QA architects must verify that production logs do not contain personally identifiable information (PII). Add automated PII detection to your log pipeline:

# pii_log_scanner.py
import json
import re

PII_PATTERNS = {
    "email": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[-\s]?){3}\d{4}\b"),
}

def scan_log_entry(log_json: dict) -> list:
    """Scan a structured log entry for PII."""
    findings = []
    text = json.dumps(log_json)

    for pii_type, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        real_matches = [m for m in matches if "example" not in m and "test" not in m]
        if real_matches:
            findings.append({"type": pii_type, "count": len(real_matches)})

    return findings
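Detection can be paired with redaction so PII never leaves the process in the first place. A sketch using the same pattern style as the scanner above (`redact` and its placeholder format are illustrative):

```python
import re

REDACTIONS = {
    "email": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(value: str) -> str:
    """Replace PII substrings with a typed placeholder, e.g. [REDACTED:email]."""
    for pii_type, pattern in REDACTIONS.items():
        value = pattern.sub(f"[REDACTED:{pii_type}]", value)
    return value
```

Hooked in as a structlog processor or pino `redact` path list, this runs on every field value before serialization; the scanner then acts as a safety net in the pipeline rather than the only line of defense.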

Structured logging is not just a developer convenience -- it is a prerequisite for every other observability practice in this chapter. Invest in it early.