QA Engineer Skills 2026

Context Feeding Strategies

The Core Problem: LLMs Are Only as Good as Their Input

An LLM generating tests without context is like a QA engineer writing tests without reading the requirements. The output might look syntactically correct, but it will miss domain constraints, use wrong method names, and hallucinate APIs that do not exist.

The art of context feeding is deciding what to include, how much to include, and in what order -- given that every token of context costs money and competes for space in the model's attention window.


The Context Hierarchy

Not all context is equal. When your token budget is limited, prioritize ruthlessly:

Priority 1: The actual specification (OpenAPI schema, AC, Figma annotations)
Priority 2: Existing test patterns in the codebase (so AI matches style)
Priority 3: Domain constraints (business rules not in the spec)
Priority 4: Technical stack details (frameworks, helpers, fixtures)
Priority 5: Examples of good vs bad tests from prior reviews

Why this order? Priority 1 prevents hallucination (the AI tests what actually exists). Priority 2 prevents style drift (the AI writes tests your team recognizes). Priority 3 catches business logic that specs often omit. Priority 4 ensures the code compiles. Priority 5 is a bonus that improves quality over time.
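The hierarchy above can be sketched as a small prompt assembler that packs sections in priority order under a token budget. This is a minimal sketch, assuming a rough 4-characters-per-token estimate; the section labels and helper names are illustrative, not part of any specific tool:

```python
# Sketch: assemble context sections in priority order under a token budget.
# The 4-chars-per-token ratio is a rough approximation, not an exact count.

def estimate_tokens(text: str) -> int:
    """Crude heuristic: ~4 characters per token for English prose and code."""
    return len(text) // 4

def build_context(sections: list[tuple[int, str, str]], budget: int) -> str:
    """Concatenate (priority, label, text) sections, highest priority first,
    skipping any section that would push the prompt over the token budget."""
    parts, used = [], 0
    for _, label, text in sorted(sections, key=lambda s: s[0]):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # drop low-priority context rather than truncate the spec
        parts.append(f"## {label}\n{text}")
        used += cost
    return "\n\n".join(parts)

sections = [
    (1, "OpenAPI spec", "paths: /api/v2/orders ..."),
    (2, "Existing test patterns", "class TestOrderCreation: ..."),
    (3, "Domain constraints", "Coupons cannot be combined ..."),
]
prompt_context = build_context(sections, budget=8000)
```

The key design choice: when the budget runs out, whole low-priority sections are dropped instead of truncating the specification, so Priority 1 content always arrives intact.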


What to Feed and How

| Source | How to feed | Why it matters |
|---|---|---|
| OpenAPI/Swagger spec | Paste the relevant endpoint JSON/YAML directly | Exact field names, types, constraints -- eliminates guesswork |
| User story + AC | Copy from Jira/Linear verbatim | Preserves original intent and edge cases mentioned in comments |
| Existing test file | Paste 2-3 representative tests as a "style guide" | AI matches naming, structure, assertion style, fixtures |
| Database schema | Paste CREATE TABLE statements | Reveals constraints the AI can test (NOT NULL, UNIQUE, FK, CHECK) |
| Error code documentation | Paste the error catalogue | AI generates tests triggering each documented error |
| UI mockup/Figma | Describe the layout, or use a screenshot with a vision model | Generates accessibility and layout tests |
| CI configuration | Paste relevant test commands | AI understands how tests will be run (parallel, coverage flags) |

Feeding an OpenAPI Schema

Here is the OpenAPI schema for the endpoint under test:

```yaml
paths:
  /api/v2/orders:
    post:
      summary: Create a new order
      security:
        - BearerAuth: [customer]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateOrder'
      responses:
        '201':
          description: Order created
        '400':
          description: Validation error
        '401':
          description: Unauthorized
        '409':
          description: Duplicate order (idempotency key conflict)

components:
  schemas:
    CreateOrder:
      type: object
      required: [items, shipping_address, idempotency_key]
      properties:
        items:
          type: array
          minItems: 1
          maxItems: 50
          items:
            type: object
            required: [product_id, quantity]
            properties:
              product_id:
                type: string
                format: uuid
              quantity:
                type: integer
                minimum: 1
                maximum: 100
        shipping_address:
          $ref: '#/components/schemas/Address'
        idempotency_key:
          type: string
          format: uuid
        coupon_code:
          type: string
          pattern: "^[A-Z0-9]{8}$"
```

Notice that every constraint in the schema (minItems, maxItems, minimum, maximum, pattern, format) is a test case waiting to be generated. The LLM sees these constraints and produces boundary value tests automatically.
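The mapping from constraints to boundary tests is mechanical, which is exactly why LLMs handle it well. A minimal sketch of the classic boundary-value heuristic (value, value minus one, value plus one) applied to the `quantity` constraints from the schema above:

```python
# Sketch: derive boundary-value test inputs from OpenAPI-style numeric
# constraints. The constraint dict mirrors the schema's quantity field.

def boundary_cases(constraints: dict) -> dict:
    """For each bound, the bound itself is a valid case and the value
    just past it is an invalid case."""
    cases = {"valid": [], "invalid": []}
    if "minimum" in constraints:
        lo = constraints["minimum"]
        cases["valid"].append(lo)
        cases["invalid"].append(lo - 1)
    if "maximum" in constraints:
        hi = constraints["maximum"]
        cases["valid"].append(hi)
        cases["invalid"].append(hi + 1)
    return cases

quantity = boundary_cases({"minimum": 1, "maximum": 100})
# quantity["valid"] -> [1, 100]; quantity["invalid"] -> [0, 101]
```

The same rule applies to `minItems`/`maxItems` on the items array and, with string lengths, to the `pattern` and `format` constraints.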

Feeding Existing Tests as a Style Guide

Here are two existing tests from our codebase. Match their style exactly:

```python
from uuid import uuid4

# VALID_ADDRESS is a shared constant defined in the team's test helpers.

class TestOrderCreation:
    """Tests for POST /api/v2/orders endpoint."""

    def test_should_create_order_when_valid_payload(
        self, api_client, auth_headers, product_factory
    ):
        # Arrange
        product = product_factory.create()
        payload = {
            "items": [{"product_id": str(product.id), "quantity": 2}],
            "shipping_address": VALID_ADDRESS,
            "idempotency_key": str(uuid4()),
        }

        # Act
        response = api_client.post(
            "/api/v2/orders", json=payload, headers=auth_headers
        )

        # Assert
        assert response.status_code == 201
        order = response.json()
        assert order["status"] == "pending"
        assert len(order["items"]) == 1
        assert order["items"][0]["quantity"] == 2

    def test_should_reject_order_when_empty_items_list(
        self, api_client, auth_headers
    ):
        # Arrange
        payload = {
            "items": [],
            "shipping_address": VALID_ADDRESS,
            "idempotency_key": str(uuid4()),
        }

        # Act
        response = api_client.post(
            "/api/v2/orders", json=payload, headers=auth_headers
        )

        # Assert
        assert response.status_code == 400
        assert "items" in response.json()["detail"].lower()
```

By showing two tests -- one happy path, one validation error -- the AI learns:

  • Class structure with docstrings
  • Fixture-based dependency injection (api_client, auth_headers, product_factory)
  • Naming convention: test_should_X_when_Y
  • Comment markers for Arrange/Act/Assert
  • Assertion style (status code + specific field checks)
  • Use of VALID_ADDRESS constant and uuid4()

Anti-Pattern: The Context Dump

Do not paste your entire codebase into the prompt. LLMs degrade with irrelevant context. A focused 200-line excerpt produces better tests than a 5000-line dump.

This is the needle-in-a-haystack problem -- the more hay, the harder the model works to find the needle. Research from 2024-2025, often summarized as the "lost in the middle" effect, consistently shows that models perform best when relevant context sits at the beginning or end of the prompt, and that performance degrades as irrelevant middle content grows.

Symptoms of Context Overload

  • Generated tests reference functions from the wrong file
  • Tests mix styles from different parts of the codebase
  • The LLM "forgets" constraints mentioned early in the prompt
  • Output is shorter and less detailed than expected (bloated input leaves less of the context window for the response)

The Fix: Context Windowing

Instead of dumping everything, use a context window approach:

Step 1: Feed the spec (Priority 1) -- generate initial tests
Step 2: Review output -- identify style mismatches
Step 3: Feed 2-3 existing tests as style examples (Priority 2) -- regenerate
Step 4: Review output -- identify missing domain rules
Step 5: Add domain constraints (Priority 3) -- regenerate specific tests

This iterative approach keeps each prompt focused and produces better results than a single massive prompt.
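The five steps above amount to a loop: add one context layer, regenerate, review, repeat. A minimal sketch of that shape, where `generate` stands in for a real LLM call (here it only echoes which layers were supplied):

```python
# Sketch of the context-windowing loop. `generate` is a stand-in for a
# real LLM call; it just reports which context layers it received.

def generate(context_layers: list[str]) -> str:
    return f"tests generated from: {', '.join(context_layers)}"

layers = ["OpenAPI spec", "style examples", "domain constraints"]
context: list[str] = []
for layer in layers:
    context.append(layer)       # add one priority layer per round
    output = generate(context)  # regenerate with the focused context
    # ...human review happens here before the next layer is added...

print(output)
# -> tests generated from: OpenAPI spec, style examples, domain constraints
```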


Advanced Strategy: Context Compression

When you must include a lot of context, compress it. Instead of pasting a 500-line source file, summarize it:

The UserService class has these public methods:
- create_user(dto: CreateUserDTO) -> User — validates email uniqueness, hashes password
- get_user(id: UUID) -> User — raises NotFoundError if missing
- update_user(id: UUID, dto: UpdateUserDTO) -> User — partial update, re-validates email if changed
- delete_user(id: UUID) -> None — soft delete (sets deleted_at timestamp)

Key constraints:
- Email must be unique (case-insensitive)
- Password minimum 8 chars, must include a number
- Soft-deleted users cannot log in but their data is retained for 30 days

This 10-line summary carries the same test-relevant information as the full 500-line source file.
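Summaries like this can even be generated automatically. One way, sketched below, uses Python's ast module to pull public method names and docstrings out of a source file; the `UserService` excerpt here is a cut-down stand-in for the real class:

```python
# Sketch: compress a source file into a method-level summary using the
# stdlib ast module. SOURCE is a toy stand-in for a real 500-line file.
import ast

SOURCE = '''
class UserService:
    def create_user(self, dto):
        """Validates email uniqueness, hashes password."""
    def _hash(self, pw):
        """Internal helper (excluded from the summary)."""
'''

def summarize(source: str) -> list[str]:
    """Return 'name -- docstring' lines for public methods only."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and not node.name.startswith("_"):
            doc = ast.get_docstring(node) or "(no docstring)"
            lines.append(f"{node.name} -- {doc}")
    return lines

print("\n".join(summarize(SOURCE)))
# -> create_user -- Validates email uniqueness, hashes password.
```

Private helpers are filtered out because the LLM only needs the public surface it will be writing tests against.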


Strategy: Multi-Turn Context Building

For complex features, build context across multiple prompts in a conversation:

Turn 1: "Here is the OpenAPI schema for the payments endpoint. Summarize the
         test scenarios you would create."

Turn 2: "Good. Here are our existing payment tests for reference style.
         Now generate the first 10 tests matching this style."

Turn 3: "These look good. Now add tests for the edge cases: expired cards,
         insufficient funds, and currency conversion rounding."

Turn 4: "Review all generated tests. Which ones are testing the mock
         instead of the real behavior? Flag any tautology tests."

Each turn adds context incrementally, and the conversation history provides implicit context from prior turns. This is more effective than a single massive prompt because:

  • The LLM's attention is focused on one concern at a time
  • You can course-correct between turns
  • You build a review-as-you-go workflow
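The four turns above map directly onto the role/content message format most chat-style LLM APIs use, where each request resends the accumulated history. A minimal sketch of how that history builds up (no real model is called; the replies are placeholders):

```python
# Sketch: how multi-turn context accumulates as chat history. The
# role/content dict format follows the common chat-API convention;
# the assistant replies are placeholders, not real model output.

history: list[dict[str, str]] = []

def send(user_msg: str, fake_reply: str) -> None:
    """Append one user turn and one (placeholder) assistant turn."""
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": fake_reply})

send("Here is the OpenAPI schema. Summarize the test scenarios.", "12 scenarios...")
send("Here are existing tests for style. Generate the first 10.", "10 tests...")
send("Add edge cases: expired cards, insufficient funds, rounding.", "6 more tests...")
send("Flag any tests that assert on mocks instead of behavior.", "2 flagged...")

# Each new turn implicitly carries all prior turns as context.
print(len(history))
```

This is also why long conversations eventually hit the same overload problem as a single massive prompt: the history itself becomes the haystack.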

Context Feeding Checklist

Before sending a test generation prompt, verify:

[ ] Specification artifact is included (schema, AC, or story)
[ ] Only relevant portions are included (not the entire file)
[ ] Existing test examples are provided for style matching
[ ] Business rules not in the spec are stated explicitly
[ ] Framework and language are specified
[ ] Auth mechanism and test helpers are named exactly
[ ] Output format expectations are clear
[ ] Token budget is reasonable (< 8K input for focused generation)
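A checklist like this can be partially automated as a pre-flight check before the prompt is sent. A minimal sketch, assuming the prompt labels its sections with headers; the section names and the 4-chars-per-token estimate are illustrative:

```python
# Sketch: pre-flight check for a test-generation prompt. Section names
# are assumed labels; the token estimate is a rough chars/4 heuristic.

REQUIRED_SECTIONS = ["Specification", "Style examples", "Business rules",
                     "Stack", "Output format"]

def preflight(prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the prompt is ready."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if section not in prompt:
            problems.append(f"missing section: {section}")
    if len(prompt) // 4 > 8000:
        problems.append("over token budget: trim low-priority context")
    return problems

prompt = "Specification: ...\nStyle examples: ...\nOutput format: pytest"
print(preflight(prompt))
# -> ['missing section: Business rules', 'missing section: Stack']
```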

Key Takeaway

Context feeding is the highest-leverage skill in AI-augmented test design. The right 200 lines of context produce better tests than 5000 lines of context dump. Prioritize specifications over code, feed incrementally rather than all at once, and always include style examples from your existing test suite.