Anatomy of a SKILL.md File

The Two-Part Structure

Every skill is defined by a single markdown file, SKILL.md, made of YAML frontmatter followed by a markdown body:

┌─────────────────────────────┐
│  ---                        │  ← YAML frontmatter start
│  name: my-skill             │
│  description: ...           │  ← Metadata (selection signal)
│  allowed-tools: Bash,Read   │
│  ---                        │  ← YAML frontmatter end
│                             │
│  # Instructions             │
│  Step 1: Do this...         │  ← Markdown body (execution instructions)
│  Step 2: Then this...       │
└─────────────────────────────┘
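
A minimal sketch of how the two parts could be separated, assuming the frontmatter is delimited by `---` lines exactly as in the diagram. The helper name `split_skill_file` is hypothetical, and the frontmatter is returned as raw text rather than parsed, to avoid a third-party YAML dependency:

```python
def split_skill_file(text: str) -> tuple[str, str]:
    """Split SKILL.md text into (frontmatter, body).

    Assumes the file starts with a '---' line and that the next '---' line
    closes the frontmatter. Hypothetical helper, not an official loader.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("missing closing '---' frontmatter delimiter")


example = """---
name: my-skill
description: Example skill
allowed-tools: Bash,Read
---

# Instructions
Step 1: Do this...
"""

frontmatter, body = split_skill_file(example)
```

Here `frontmatter` holds the three metadata lines and `body` holds everything from `# Instructions` onward.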

YAML Frontmatter: The Selection Signal

Required Fields

name (string, max 64 chars)

Lowercase letters, numbers, hyphens only
This becomes the invocation command: /my-skill or skill: "my-skill"
Must be unique across all installed skills

name: vibe-check  # Good
name: Vibe Check  # Bad — no spaces or uppercase
name: vibe_check  # Bad — no underscores
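
The naming rules above can be checked mechanically. A small sketch, assuming the rules are exactly as listed (lowercase letters, digits, hyphens, at most 64 characters); the function name is hypothetical, and the uniqueness requirement is not covered because it depends on which other skills are installed:

```python
import re

# Lowercase letters, digits, and hyphens only; 1 to 64 characters.
NAME_PATTERN = re.compile(r"[a-z0-9-]{1,64}")


def is_valid_skill_name(name: str) -> bool:
    """Return True if the name satisfies the character and length rules."""
    return NAME_PATTERN.fullmatch(name) is not None
```

With this check, `vibe-check` passes, while `Vibe Check` and `vibe_check` are rejected.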

description (string)

The most important field. Claude reads the description of every installed skill to decide which one matches the user's intent
Must be specific enough to differentiate from other skills
Should describe the capability, not the implementation

# Good — tells Claude when to use this skill
description: |
  Browser automation via CLI. Navigate pages, click elements,
  fill forms, take screenshots, extract text from web pages.

# Bad — too vague, overlaps with other skills
description: "Helps with web stuff"

# Bad — describes implementation, not capability
description: "Runs vibe-check commands using the Bash tool"

Optional Fields

allowed-tools (comma-separated string)

Tools the skill may use, temporarily granted during execution
Without this, the skill inherits the session's current permissions
Supports wildcards: Bash(git:*) allows only git commands

allowed-tools: Bash           # Can run any shell command
allowed-tools: Read,Write     # Can read and write files only
allowed-tools: Bash,Read,Write,Glob,Grep  # Shell plus file read, write, and search
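
To make the comma-separated format concrete, here is a sketch of splitting an `allowed-tools` value into individual entries. How the real runtime parses this field is not specified here; the function name `parse_allowed_tools` is hypothetical, and the parenthesis handling is a defensive assumption so that a pattern such as `Bash(git:*)` stays in one piece:

```python
def parse_allowed_tools(value: str) -> list[str]:
    """Split an allowed-tools string on commas that sit outside parentheses."""
    tools: list[str] = []
    current: list[str] = []
    depth = 0  # how many '(' are currently open
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current entry.
            tools.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tools.append("".join(current).strip())
    # Drop empty entries caused by stray or trailing commas.
    return [t for t in tools if t]
```

For example, `parse_allowed_tools("Bash(git:*), Read")` yields two entries, the restricted Bash pattern and `Read`.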

model (string)

Override which Claude model executes the skill
Useful for cost optimization (use Haiku for simple skills)

model: claude-haiku-4-5-20251001  # Cheaper, faster
model: claude-sonnet-4-5-20250929 # Balanced

version (string)

version: "1.0.0"

disable-model-invocation (boolean)

When true, the skill can only be invoked explicitly (via /skill-name), never automatically

disable-model-invocation: true

Markdown Body: The Execution Instructions

The body is what Claude reads when the skill is invoked. It should be structured for an AI agent, not a human reader.

Best Practices

Lead with a one-line summary — What does this skill do?
List commands in a reference table — Quick lookup format
Show common patterns — The 80% use cases
Include tips — Gotchas the agent needs to know
Stay under 500 lines — Longer skills bloat context; use /references/ for details
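
The length guideline is easy to enforce in a pre-commit check. A minimal sketch, assuming "under 500 lines" means strictly fewer than 500 and that only the markdown body is counted; both function names are hypothetical:

```python
MAX_BODY_LINES = 500  # guideline from the list above


def body_line_count(body: str) -> int:
    """Count the lines in a skill's markdown body."""
    return len(body.splitlines())


def is_within_budget(body: str, limit: int = MAX_BODY_LINES) -> bool:
    """Return True if the body has strictly fewer lines than the limit."""
    return body_line_count(body) < limit
```

A body that fails this check is a candidate for moving detail into /references/.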

Example: The vibe-check SKILL.md Structure

# Vibium Browser Automation — CLI Reference

The `vibe-check` CLI automates Chrome via the command line.
The browser auto-launches on first use.

## Commands

### Navigation
- `vibe-check navigate <url>` — go to a page
- `vibe-check url` — print current URL
...

## Common Patterns

**Read a page:**
```sh
vibe-check navigate https://example.com
vibe-check text
```

## Tips
- All click/type/hover actions auto-wait for the element
- Use `vibe-check find` to inspect before interacting

This structure works because:

The agent can scan the command reference to find what it needs
The patterns section provides copy-paste workflows
The tips prevent common mistakes

Bundled Resources (Optional Directories)

Skills can include additional directories alongside SKILL.md:

`/scripts/` — Executable Code

my-skill/
├── SKILL.md
└── scripts/
    ├── setup.sh          # Run once on install
    ├── validate.py       # Called by agent via Bash
    └── generate-report.sh

The agent invokes scripts via the Bash tool, using the interpreter that matches the script: python {baseDir}/scripts/validate.py

`/references/` — Documentation the Agent Can Read

my-skill/
├── SKILL.md
└── references/
    ├── api-schema.json       # Loaded into context on demand
    ├── selector-patterns.md  # CSS selector cheat sheet
    └── error-codes.md        # Troubleshooting reference

The agent loads references via Read tool: Read {baseDir}/references/error-codes.md

This is progressive disclosure: the SKILL.md body enters the context when the skill is invoked, while references are loaded only when needed.
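
A toy model of this loading order, assuming the directory layout shown above. The `SkillContext` class is hypothetical and only tracks which files have been read; it is not how the real runtime is implemented:

```python
from pathlib import Path
import tempfile


class SkillContext:
    """Toy model of progressive disclosure; not the real runtime."""

    def __init__(self, base_dir: Path):
        self.base_dir = base_dir
        self.loaded: dict[str, str] = {}  # relative path -> file contents

    def invoke(self) -> str:
        """Load SKILL.md, the first file the agent sees for this skill."""
        return self._load("SKILL.md")

    def read_reference(self, name: str) -> str:
        """Load one file from references/ only when the agent asks for it."""
        return self._load(f"references/{name}")

    def _load(self, relative: str) -> str:
        # Read each file at most once and remember its contents.
        if relative not in self.loaded:
            path = self.base_dir / relative
            self.loaded[relative] = path.read_text(encoding="utf-8")
        return self.loaded[relative]


# Build a throwaway skill directory so the sketch runs end to end.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "references").mkdir()
    (base / "SKILL.md").write_text("# Instructions\n", encoding="utf-8")
    (base / "references" / "error-codes.md").write_text("E1: timeout\n", encoding="utf-8")

    ctx = SkillContext(base)
    ctx.invoke()
    loaded_after_invoke = sorted(ctx.loaded)
    ctx.read_reference("error-codes.md")
    loaded_after_read = sorted(ctx.loaded)
```

After `invoke()` only SKILL.md has been read; the reference file appears in `loaded` only once `read_reference` is called.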

`/assets/` — Templates and Static Files

my-skill/
├── SKILL.md
└── assets/
    ├── report-template.html  # Referenced by path, not loaded into context
    └── logo.png

The `vibe-check` Skill Specifically

The vibe-check skill is intentionally minimal — just a single SKILL.md file with zero bundled resources:

skills/vibe-check/
└── SKILL.md    # ~100 lines. That's it.

This is a deliberate design choice:

The Vibium CLI binary handles all complexity (browser management, actionability, BiDi proxy)
The skill just needs to teach the agent the command interface
No scripts needed because vibe-check is the script
No references needed because the SKILL.md itself is concise enough

This is the gold standard for CLI-wrapping skills: a thin instruction layer over a capable binary.

How the Agent Uses the Skill at Runtime

Here's what happens when you say "Go to example.com and take a screenshot":

1. User: "Go to example.com and take a screenshot"

2. Claude reads available skills → finds vibe-check description matches

3. Claude invokes: Skill(skill="vibe-check")

4. System injects SKILL.md content into conversation context

5. Claude now knows all 22 vibe-check commands

6. Claude executes via Bash:
   → vibe-check navigate https://example.com
   → vibe-check screenshot -o screenshot.png

7. Claude reports: "Done. Screenshot saved to screenshot.png"

The key insight: steps 2-5 happen transparently. The user never sees the SKILL.md, only the agent driving a browser.
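
The context side of this flow can be sketched in a few lines. This is a simplified illustration, not the actual runtime: the `Skill` dataclass, both function names, and `other-skill` are hypothetical, and the choice of skill is passed in as a parameter because the matching itself is done by the model:

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    body: str  # markdown instructions from SKILL.md


def initial_context(skills: list[Skill]) -> list[str]:
    """Step 2: before invocation, only names and descriptions are visible."""
    return [f"{s.name}: {s.description}" for s in skills]


def invoke_skill(context: list[str], skills: list[Skill], chosen: str) -> list[str]:
    """Steps 3-4: the chosen skill's body is appended to the context."""
    by_name = {s.name: s for s in skills}
    return context + [by_name[chosen].body]


skills = [
    Skill("vibe-check", "Browser automation via CLI.", "# Vibium Browser Automation\n..."),
    Skill("other-skill", "Hypothetical second skill.", "# Other\n..."),
]
context = initial_context(skills)
context = invoke_skill(context, skills, chosen="vibe-check")
```

Only the invoked skill's body enters the context; the other skill contributes nothing beyond its one-line description.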

Interview Talking Point

"Agent skills are a fundamentally different architectural choice from MCP servers. Where MCP adds tool schemas to the context window — often thousands of tokens per server — skills inject procedural knowledge as markdown. The vibe-check skill is about 100 lines that teach the agent 22 browser commands. Those same capabilities via MCP would require exposing 22 separate tool definitions with JSON schemas, input validation, and response formats. Skills are the 'recipes'; MCP is the 'kitchen equipment.'"