The Complete AGENTS.md Guide: Configure Claude Code's Autonomous Agents
AGENTS.md is the configuration file for autonomous Claude Code agents — defining what they can do, when to escalate, and how to self-recover when things go wrong.
Referenced in Anthropic's official Claude Code documentation and SDK guides
What is AGENTS.md? — CLAUDE.md for Autonomous Agents
If you've used Claude Code for any serious project, you know CLAUDE.md: the configuration file that lives in your project root and tells Claude how to behave in interactive sessions. CLAUDE.md handles your preferences, project context, writing style, and workflow rules — all the things Claude needs to work effectively when a human is in the loop.
AGENTS.md is the autonomous counterpart. It tells Claude Code how to behave when running without a human present — in automated pipelines, scheduled loops, background jobs, and as spawned subagents in multi-agent workflows.
⚡
The core distinction: In an interactive session, Claude can always ask you a clarifying question. In agentic mode, there's nobody home to answer. AGENTS.md pre-answers all those questions — before Claude even starts.
Interactive Sessions vs. Agentic Sessions
The difference matters more than it might seem. In an interactive session, every uncertain decision can be surfaced: "Should I overwrite this file?" "This API returned an unexpected schema — how do you want me to handle it?" You can course-correct in real time.
In agentic mode, Claude runs for minutes or hours without interruption. A single ambiguous decision early in the session can cascade into dozens of downstream actions. AGENTS.md is how you constrain that decision space before the agent even starts.
| Dimension | CLAUDE.md | AGENTS.md |
| --- | --- | --- |
| Read when | Interactive session starts | Agent / background mode activates |
| Human present | Always | Never (or rarely) |
| Primary purpose | Style, context, preferences | Permissions, boundaries, escalation |
| Scope decisions | Made on-the-fly by asking | Pre-decided before the run |
| Error handling | Ask the human | Follow recovery procedure |
| Permission model | Implicit (follows instructions) | Explicit (written authorization list) |
| Escalation rules | Not needed (can always ask) | Required — defines what triggers a stop |
Where AGENTS.md Lives
AGENTS.md can live in two places, with clear priority rules:
- `./AGENTS.md` — project root; the primary location, read first
- `./.claude/AGENTS.md` — inside the .claude directory; overrides root if present
If both exist, .claude/AGENTS.md takes precedence. The file in the root is the fallback. This lets you have a general project AGENTS.md while allowing subdirectories or specific deployments to override with more restrictive rules.
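Under those rules, the resolution logic is simple enough to sketch in a few lines of Python. This is an illustrative pre-flight helper, not part of any Claude Code API; the function name and the `None` fallback are assumptions:

```python
from pathlib import Path
from typing import Optional


def resolve_agents_md(project_root: str) -> Optional[Path]:
    """Return the AGENTS.md that applies, honoring .claude/ precedence."""
    root = Path(project_root)
    # .claude/AGENTS.md overrides the project-root file when both exist
    for candidate in (root / ".claude" / "AGENTS.md", root / "AGENTS.md"):
        if candidate.is_file():
            return candidate
    return None  # no AGENTS.md: the agentic run falls back to CLAUDE.md only
```

A runner script can call this before a run to log which config file the agent will actually read.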
Section 02
Anatomy of AGENTS.md — A Complete Annotated Example
Every production AGENTS.md has the same six core components. Here is a complete, real-world example with annotations explaining what each section does and why it matters.
AGENTS.md
Markdown
# ═══════════════════════════════════════════════════════════
# AGENTS.md — [Project Name] Autonomous Agent Configuration
# ═══════════════════════════════════════════════════════════
# This file is read ONLY in agentic/background mode.
# For interactive session config, see CLAUDE.md.

## Agent Identity
You are an autonomous agent running the [Project Name] pipeline.
Session: {{SESSION_ID}}
Run type: {{RUN_TYPE}} (daily | weekly | on-demand)
Started: {{START_TIME_UTC}}
# WHY: Grounding the agent in its role prevents scope creep.
# The agent knows exactly what it is and what kind of run this is.

## What You Are Authorized To Do
Without asking, you MAY:
- Read any file in this repository
- Write to outputs/ and brain/ directories
- Run Python scripts in code/ (read-only impact)
- Make API calls to: FRED, Polygon, OpenAI (for analysis only)
- Create or update files matching patterns: *.json, *.html, *.md
- Git add + commit (commit message must include session ID)
- Send emails via the configured mailer (code/send_email.py)
ONLY after validate_email.py passes with exit code 0
# WHY: Explicit authorization list means the agent never has to
# guess what it's allowed to do. Ambiguity = bad decisions at 3am.

## What Requires Escalation
STOP and write escalation details to outputs/escalation_needed.json if:
- Any API returns an unexpected data format (new schema detected)
- A critical output file fails its validator (exit code != 0)
- You discover a data anomaly that would change a signal direction
- Any financial calculation differs by more than 5% from prior run
- You need to modify files in: production/, live/, archived/
- Any external service returns HTTP 429 (rate limit) three times
- You encounter a permission error on a file you expect to write
# WHY: Define exactly what "too uncertain to proceed" looks like.
# The agent shouldn't decide this on the fly — it's pre-decided here.

## Hard Boundaries — NEVER Do These
These are absolute. No exception, no "but it would help":
- NEVER git push to any remote (stash if needed, never push)
- NEVER delete any file (write a deprecation marker instead)
- NEVER modify files outside your authorized scope (outputs/, brain/)
- NEVER make real trades or submit any financial orders
- NEVER send emails without passing validate_email.py first
- NEVER overwrite files in archived/ regardless of content
- NEVER call any API not in the authorized list above
# WHY: The NEVER list is your blast radius limiter. Everything
# you don't put here is implicitly allowed by omission. Be thorough.

## Error Recovery Procedure
When you encounter any error, follow this sequence exactly:
1. Log the error to outputs/agent_errors.jsonl with fields:
{timestamp, task_name, error_type, error_message, stack_trace, context}
2. Attempt recovery using brain/fallback-procedures.md for this task.
Try the fallback ONCE. If it also fails, do not retry further.
3. If fallback fails: mark the task status as SKIPPED with reason.
Log the skip to outputs/agent_task_log.jsonl and continue.
Do NOT halt the entire session for a single task failure.
4. At session end: include all SKIPPED tasks in the handoff summary
under the "tasks_skipped" key with full context for human review.
# WHY: Without a recovery procedure, the agent either loops
# on errors (wastes tokens) or halts entirely (wastes the whole run).
# Pre-defined recovery means failures are isolated, not cascading.

## Output Standards
Every session MUST produce:
- brain/session_handoff.md with ONLY verified accomplishments
(never claim done if you haven't checked the output file exists
and passes its validator)
- outputs/agent_task_log.jsonl with one entry per task attempted
- Any generated HTML must pass visual_qa.py before being logged as done
Format requirements:
- All timestamps: ISO 8601 UTC (2026-04-15T14:23:00Z)
- All filenames: snake_case, no spaces
- All JSON: valid, minified for machine-read files, pretty for humans
# WHY: Output standards prevent "I ran the script" being logged
# as "task complete." Done means output verified, not process run.

## Definition of Done
A task is DONE only when ALL of the following are true:
1. The output file exists at the expected path
2. The output file passes its designated validator (exit code 0)
3. You have manually spot-checked at least 3 data points in the output
4. The task is logged in agent_task_log.jsonl with status "complete"
NOT done conditions (do not log as complete):
- Script ran without Python errors (output might still be wrong)
- File was written (content might be empty or malformed)
- Prior run's output is still present (might be stale, not regenerated)
# WHY: The most common agent failure is logging tasks as done
# when only the process ran, not when the output was verified.
# This section makes "done" unambiguous.

## Communication Protocol
When you complete a session, output this exact structured summary:
```json
{
"session_id": "{{SESSION_ID}}",
"run_type": "{{RUN_TYPE}}",
"completed_at": "ISO-8601-UTC",
"tasks_completed": ["task_name_1", "task_name_2"],
"tasks_skipped": [{"task": "name", "reason": "explanation"}],
"issues_found": [{"issue": "description", "severity": "low|medium|high"}],
"escalation_needed": false,
"next_session_priorities": ["priority_1", "priority_2", "priority_3"]
}
```
# WHY: Structured output means the next human (or next agent)
# can parse results programmatically. Freeform prose summaries
# don't compose well in automated pipelines.
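Because the summary is structured, the next stage of a pipeline can validate it before trusting it. A minimal checker might look like the sketch below; the function name and error strings are illustrative, not part of any SDK:

```python
import json

# Keys every session summary must carry, per the schema above
REQUIRED_KEYS = {
    "session_id", "run_type", "completed_at", "tasks_completed",
    "tasks_skipped", "issues_found", "escalation_needed",
    "next_session_priorities",
}


def validate_handoff(raw: str) -> list:
    """Return a list of problems; an empty list means the summary is usable."""
    try:
        summary = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    problems = []
    missing = REQUIRED_KEYS - summary.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if not isinstance(summary.get("escalation_needed"), bool):
        problems.append("escalation_needed must be a boolean")
    return problems
```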
Section-by-Section Breakdown
🤖
Agent Identity — Grounds the agent in its exact role and run type. Prevents the agent from expanding its scope beyond the defined run. Template variables like {{SESSION_ID}} are injected by your runner script before the file is passed to Claude.
✅
What You Are Authorized To Do — The positive permission list. Anything not on this list is implicitly unauthorized. Be explicit. "Make API calls" is too broad — name the specific APIs. "Write to outputs/" is better than "write files."
⚠️
What Requires Escalation — The gray zone. These are situations the agent encounters where the right answer is "stop and flag it" rather than proceed or fail. Schema changes, data anomalies, and files outside scope are classic escalation triggers.
🚫
Hard Boundaries — NEVER Do These — The blast radius limiter. Irreversible actions (git push, file deletion, real financial orders) must be in this list. These are non-negotiable regardless of how helpful it might seem in context.
🔄
Error Recovery — Pre-decides what "error" means and exactly what to do about it. The agent doesn't need to figure out whether to retry, skip, or halt — the procedure tells it. This is what keeps a single failure from becoming a session failure.
Section 03
How Claude Code Reads AGENTS.md
Understanding the technical mechanics of how AGENTS.md is loaded helps you write it more effectively and debug unexpected behavior.
When AGENTS.md Is Activated
AGENTS.md is read specifically in agentic and background contexts:
CLI
claude --dangerously-skip-permissions
Runs Claude without permission prompts. AGENTS.md defines the permission model that replaces interactive approval.
SDK
Agent tool (subagent spawning)
When a parent agent spawns a subagent via the Agent tool in the Claude Code SDK, the subagent reads AGENTS.md from its working directory.
LOOP
Automated loop runners
Shell scripts or LaunchAgents that invoke Claude programmatically. AGENTS.md is the agent's operating constitution for each loop iteration.
File Location Priority
File resolution order
Priority
# Claude Code checks these locations in order:
1. .claude/AGENTS.md   # Highest priority — project-specific override
2. AGENTS.md           # Project root — primary location
3. ../AGENTS.md        # Parent directory — if running from subdir
4. (none found)        # Falls back to CLAUDE.md only

# Recommendation: use project root AGENTS.md for most projects.
# Use .claude/AGENTS.md when you need environment-specific overrides
# (e.g., staging vs production agent configurations).
Interaction with CLAUDE.md
Both files can coexist. When Claude reads AGENTS.md in agentic mode, it applies the following rules:
→Topic covered in AGENTS.md: AGENTS.md takes precedence. The CLAUDE.md instruction for that topic is ignored.
→Conflict between files: AGENTS.md wins. Design AGENTS.md to be the final word on anything agentic.
⚠️
Best practice: AGENTS.md should always be more restrictive than CLAUDE.md, never less. If CLAUDE.md says "use your judgment on file writes," AGENTS.md should specify exactly which directories are writable. Vague permissions in agentic mode are dangerous.
Environment Variable Injection
Template variables in AGENTS.md ({{SESSION_ID}}, {{RUN_TYPE}}, {{START_TIME_UTC}}) are not automatically substituted by Claude — they are placeholders that your runner script should replace before invoking Claude. Here is a minimal bash pattern:
run_agent.sh
Bash
#!/bin/bash
SESSION_ID=$(date +%Y%m%d_%H%M%S)
RUN_TYPE="daily"
START_TIME_UTC=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
# Substitute template variables into AGENTS.md before the run
sed -e "s/{{SESSION_ID}}/$SESSION_ID/g" \
-e "s/{{RUN_TYPE}}/$RUN_TYPE/g" \
-e "s/{{START_TIME_UTC}}/$START_TIME_UTC/g" \
AGENTS.md > .claude/AGENTS.resolved.md
# Invoke Claude Code with the resolved config
claude --dangerously-skip-permissions \
--system-file .claude/AGENTS.resolved.md \
"Run the daily pipeline per AGENTS.md instructions"
# Clean up resolved file
rm -f .claude/AGENTS.resolved.md
Section 04
The Authorization Matrix — Designing Agent Permissions
The authorization matrix is the core of any AGENTS.md. It organizes all possible agent actions into four tiers based on reversibility and risk. Design your authorization matrix first — then write the AGENTS.md around it.
Tier 1
Always OK — No Logging Required
Read any file in the repository
Analyze and compute (in memory only)
Draft output files before writing them
Call read-only API endpoints
Check file existence with ls / glob
Run tests in watch mode (no writes)
Tier 2
OK with Logging — Standard Agent Work
Write to designated output directories
Make mutating API calls (authorized list only)
Create new files matching approved patterns
Git add + commit (with session ID in message)
Send internal notifications (Slack, log files)
Update cache or state files
Tier 3
Requires Pre-Approval — Write to Escalation
Schema changes in any database or config
External email sends (outside internal team)
Any financial calculation or order placement
Modifying files outside authorized scope
Dependency updates or version bumps
Any action affecting >50 records at once
Tier 4
NEVER — Hard Stop, No Exceptions
Git push to any remote branch
Delete any file (write deprecation marker instead)
Production deploys or live environment changes
Real financial orders or transactions
Calling APIs not on the authorized list
Overwriting files in archived/ or production/
💡
Design principle: Start with everything in Tier 4 and selectively promote actions up to Tier 1 as you gain confidence. It's much easier to relax restrictions after a successful run than to recover from an overly permissive agent that took destructive action at 3am.
Authorization Matrix Template
AGENTS.md — Authorization sections
Markdown
## Authorization Matrix

### Tier 1 — Always Permitted (No Logging Required)
- Read access: all files in repository
- In-memory analysis and computation
- API: GET requests to FRED, Polygon (read-only endpoints)
- File existence checks, directory listings
### Tier 2 — Permitted with Task Log Entry
- Write access: outputs/*.json, outputs/*.html, brain/*.md
- API: POST to internal mailer (validate_email.py must pass first)
- Git: add + commit (format: "[{SESSION_ID}] {task_name}: {summary}")
- Create new files in: outputs/, brain/, code/cache/
### Tier 3 — Requires Escalation Before Proceeding
- Any file write outside outputs/ and brain/
- API calls not listed in Tier 1 or Tier 2
- Actions affecting more than 10 records/files simultaneously
- Any action that cannot be easily reversed
### Tier 4 — Absolute Prohibition
- git push (to any remote, any branch)
- File deletion (rename with a deprecated_ prefix instead)
- Any action in: production/, live/, .env files
- Real financial transactions or order submission
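One way to make the matrix operational is to mirror it in the runner as a lookup table, so a pre-flight check can classify any proposed action before the agent takes it. The action names, tier mapping, and dispatch strings below are an illustrative sketch of that idea, not a standard interface:

```python
# Illustrative tier table mirroring the template above; a real project
# would derive this from its own AGENTS.md authorization sections.
TIERS = {
    "read_repo_file": 1,   # always permitted
    "write_outputs":  2,   # permitted, must be logged
    "bulk_update":    3,   # requires escalation first
    "git_push":       4,   # absolute prohibition
}


def check_action(action: str) -> str:
    """Map an action name to the handling the matrix prescribes."""
    tier = TIERS.get(action, 3)  # unknown actions default to escalation
    return {
        1: "proceed",
        2: "proceed_and_log",
        3: "escalate",
        4: "refuse",
    }[tier]
```

Defaulting unknown actions to Tier 3 matches the design principle above: anything you forgot to classify stops the agent rather than slipping through.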
Section 05
AGENTS.md vs CLAUDE.md — Deep Comparison
Both files configure Claude's behavior, but they serve fundamentally different purposes and are read in different contexts. Understanding which file owns which concern is essential for avoiding contradictions and gaps.
| Configuration Dimension | CLAUDE.md | AGENTS.md |
| --- | --- | --- |
| Persona definition | ✅ Primary location — who Claude is on this project | ❌ Not needed — agent role defined in identity block |
| Project context | ✅ Full context — architecture, team, conventions | ⚠️ Can reference brain/ files by path; keep AGENTS.md lean |
| Writing style / tone | ✅ Yes — matters in interactive responses | ❌ Irrelevant — agent outputs are machine-read JSON |
| Coding conventions | ✅ Yes — Claude follows these when writing code | ⚠️ Only if the agent generates code as an output |
| Permission boundaries | ⚠️ Optional — Claude asks when unsure | ✅ Critical — defines the authorization matrix |
| Escalation rules | ❌ Not typical — human is always present to ask | ✅ Required — no human to ask means rules must be pre-defined |
| Error recovery procedure | ❌ Not needed — ask the human | ✅ Required — prevents cascading failures |
| Session logging format | ❌ Optional — conversation is the log | ✅ Required — machine-readable handoff for the next run |
| Hard stops (NEVER DO) | ❌ Rare — human can stop Claude mid-action | ✅ Explicit required list — agent must know its limits |
| Definition of "done" | ❌ Implicit — human reviews and confirms | ✅ Required — must be precise and verifiable |
| Overrides CLAUDE.md? | — | ✅ Yes — AGENTS.md wins on any conflicting topic |
📐
The simple rule: If Claude can ask a human about it, put it in CLAUDE.md. If Claude has to decide alone at 2am with no human available, it must be in AGENTS.md. There should be zero topics that AGENTS.md leaves to the agent's discretion.
Section 06
Real-World Examples — Three Agent Types
Here are complete AGENTS.md configurations for three common agent archetypes. Each is production-ready and can be adapted to your use case.
🔄
Daily Data Pipeline Agent — Fetches macro data from external APIs, runs analysis, generates HTML reports, sends email digests. Runs on a 6am cron, no human present.
AGENTS.md — Data Pipeline
Markdown
# AGENTS.md — Market Data Pipeline Agent

## Agent Identity
You are the daily market data pipeline agent.
Session: {{SESSION_ID}} | Run type: daily | Time: {{START_TIME_UTC}}
Your job: fetch data → analyze → generate outputs → send digest.
Complete all tasks in order. Do not skip unless recovery procedure applies.
## Authorized Actions
Without approval, you MAY:
- Fetch from: FRED API, Polygon API, OpenAI API (gpt-4 analysis only)
- Read/write: code/cache/, outputs/, brain/
- Run: code/*.py scripts (listed in pipeline_config.json)
- Git add + commit with format: "auto [{SESSION_ID}]: {summary}"
- Send email via code/send_email.py AFTER validate_email.py passes
STOP and write to outputs/escalation_needed.json if:
- API returns HTTP 5xx three times in a row
- Any output file fails its validator
- A signal changes direction by more than 2 standard deviations
- send_email.py returns non-zero exit code
NEVER:
- git push
- Delete or overwrite files in outputs/signal_snapshots/ (append only)
- Call any API not in the authorized list
- Send emails if validate_email.py exits non-zero
## Pipeline Task Order
1. Fetch FRED data: python3 code/fetch_fred.py
2. Fetch price data: python3 code/fetch_prices.py
3. Run signal generation: python3 code/run_daily.py
4. Generate dashboard: python3 code/generate_command_center.py
5. Run visual QA: python3 code/visual_qa.py --site ept
6. Send email digest: python3 code/send_email.py (after validate_email.py)
## Error Recovery
Step fails → log to outputs/agent_errors.jsonl → check brain/fallback-procedures.md
→ try fallback once → if fails, mark SKIPPED, continue to step N+1.
Never halt the pipeline for a single step failure.
If steps 1-3 all fail: escalate immediately (no output = no session value).
## Done Definition
Session complete when:
- outputs/daily_email.html exists and visual_qa.py passes
- brain/session_handoff.md written with verified task list
- Git commit created with session summary
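The recovery procedure this config prescribes (log the error, try the fallback once, mark SKIPPED, continue) is mechanical enough to sketch as a runner loop. The task-tuple shape, error-log path, and field names here are illustrative:

```python
import json
import time


def run_pipeline(tasks, error_log="outputs/agent_errors.jsonl"):
    """Run (name, task, fallback) tuples in order. A failure is logged,
    retried exactly once via its fallback, then marked SKIPPED; the
    pipeline never halts for a single task failure."""
    statuses = {}
    for name, task, fallback in tasks:
        try:
            task()
            statuses[name] = "complete"
        except Exception as exc:
            # Step 1 of the procedure: log with context before anything else
            with open(error_log, "a") as fh:
                fh.write(json.dumps({"timestamp": time.time(),
                                     "task_name": name,
                                     "error_message": str(exc)}) + "\n")
            try:
                fallback()  # step 2: try the fallback exactly once
                statuses[name] = "complete"
            except Exception:
                statuses[name] = "SKIPPED"  # step 3: isolate, don't cascade
    return statuses
```

The "escalate if steps 1-3 all fail" rule would sit on top of this loop: after the run, inspect the returned statuses and write the escalation file if every fetch task came back SKIPPED.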
✍️
Content Generation Agent — Creates SEO articles and landing pages, validates content quality, commits to git. Runs on-demand or weekly. Content goes through quality gates before any deploy action.
AGENTS.md — Content Agent
Markdown
# AGENTS.md — Content Generation Agent

## Agent Identity
You are the SEO content generation agent for brainfile.io.
Session: {{SESSION_ID}} | Run type: {{RUN_TYPE}} | Time: {{START_TIME_UTC}}
Your job: create high-quality, accurate SEO content and commit it to git.
You do NOT deploy. Deployment requires human approval via EP Dashboard.
## Authorized Actions
Without approval, you MAY:
- Research: read all files in /deploy-sites/brainfile/ and /brain/
- Write new HTML files to: /deploy-sites/brainfile/{slug}.html
- Fetch external data for research: news APIs, pricing pages (read-only)
- Run: code/validate_html.py on any file you create
- Git add + commit: "content [{SESSION_ID}]: add {slug}.html"
- Update: sitemap.xml to include new pages
STOP and escalate (write to outputs/content_escalation.json) if:
- Any content involves legal, medical, or financial advice claims
- A new page would overwrite an existing file (check first, always)
- External research surfaces contradictory information you can't resolve
- Validator reports any error (do not commit failing content)
NEVER:
- Deploy or push to any live site
- Overwrite existing content files (only create NEW files)
- Include specific performance claims without verified data sources
- Use placeholder content in final files — real content only
- Skip validate_html.py before committing
## Content Quality Gates
Before committing any HTML file:
1. validate_html.py must pass (exit code 0)
2. Word count must be > 2000 words
3. All internal links must point to real pages (check against sitemap.xml)
4. No broken image references
5. Schema.org JSON-LD must be valid: extract each <script type="application/ld+json"> block and parse it with Python's json module (parsing the raw HTML file as JSON will not work)
6. Title tag must contain the primary keyword
## Prohibited Content Topics
Do not create content about:
- Specific investment advice or stock picks
- Competitor products with unverified claims
- "AI is conscious / sentient" framings
- Any claims about Claude that contradict Anthropic's published docs
## Done Definition
Page is done when:
- File exists at /deploy-sites/brainfile/{slug}.html
- validate_html.py passes
- Git commit created
- sitemap.xml updated
- Session handoff includes page URL, word count, and primary keyword
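Quality gates like these are straightforward to automate. The sketch below checks two of them, minimum word count and internal links against the sitemap; the function name, the crude tag stripping, and the href pattern are assumptions for illustration (a real gate would also shell out to validate_html.py and parse the JSON-LD blocks):

```python
import re


def passes_gates(html: str, sitemap_urls: set, min_words: int = 2000) -> list:
    """Return the quality-gate failures for one generated page.
    An empty list means the checked gates all passed."""
    failures = []
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag strip, fine for a gate
    if len(text.split()) < min_words:
        failures.append("word count below minimum")
    # every internal link must already exist in the sitemap
    for href in re.findall(r'href="(/[^"]*)"', html):
        if href not in sitemap_urls:
            failures.append(f"broken internal link: {href}")
    return failures
```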
📡
Monitoring & Alert Agent — Runs health checks on dashboards and pipelines, detects anomalies, sends Slack alerts. Runs every 30 minutes. Has strict alert rate limits to prevent alert fatigue.
AGENTS.md — Monitor Agent
Markdown
# AGENTS.md — Monitoring & Alert Agent

## Agent Identity
You are the health monitoring agent. You run every 30 minutes.
Session: {{SESSION_ID}} | Run type: monitor | Time: {{START_TIME_UTC}}
Your job: check system health, detect anomalies, send alerts if needed.
You observe and alert. You do NOT fix issues yourself.
## Authorized Actions
Without approval, you MAY:
- Read: all output files, log files, signal files
- Run: python3 code/visual_qa.py --site ept (read-only)
- Run: python3 code/validate_command_center.py
- Write: outputs/health_log.jsonl (append only, NEVER truncate)
- Send Slack alert via code/send_slack.py (subject to rate limits below)
Alert Rate Limits (enforced by checking outputs/alert_log.json):
- Maximum 3 alerts per rolling 2-hour window
- Same issue: do not re-alert for 4 hours after first alert
- CRITICAL alerts (data loss, pipeline down): no rate limit
- If rate limit would be exceeded: log to health_log.jsonl, skip Slack
STOP and write CRITICAL alert if:
- Pipeline has not run in > 25 hours (check outputs/ file timestamps)
- Any validator returns errors on 2 consecutive checks
- Signal file is more than 6 hours stale during market hours (9am-4pm ET)
- Any output directory is empty that should have files
NEVER:
- Attempt to fix any issue you detect — alert and log only
- Delete or overwrite any file
- Send more than 3 non-critical alerts in a 2-hour window
- Modify any config or pipeline file
## Alert Severity Levels
CRITICAL: data loss, pipeline failure, zero outputs — alert immediately, no rate limit
HIGH: stale data, validator failures — alert, subject to rate limits
MEDIUM: performance degradation, slow runs — log only unless sustained 2+ checks
LOW: cosmetic issues, minor timing — log only, weekly digest
## Check Order (run in sequence)
1. Check pipeline last-run timestamp (< 25h ago = OK)
2. Validate command center: python3 code/validate_command_center.py
3. Visual QA: python3 code/visual_qa.py --site ept
4. Check signal file age: outputs/signals.json mtime
5. Check output file count in outputs/ (should match prior baseline)
6. Write health report to outputs/health_log.jsonl
7. Send alerts if thresholds exceeded (check rate limits first)
## Done Definition
Each check run is done when:
- All 6 checks completed (or marked FAILED with reason)
- outputs/health_log.jsonl updated with this run's results
- Any alerts sent (or deliberately skipped due to rate limits) are logged
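Rate-limit rules like these are easy to get subtly wrong in prose, so it can help to pin them down in code. This sketch implements the three rules from the config above (rolling 2-hour window, 4-hour same-issue suppression, CRITICAL bypass); the function signature and the (timestamp, issue) history format are illustrative:

```python
import time

WINDOW = 2 * 3600       # rolling window for the non-critical alert cap
DEDUPE = 4 * 3600       # same-issue suppression period
MAX_ALERTS = 3          # non-critical alerts allowed per window


def should_alert(issue, severity, history, now=None):
    """history is a list of (timestamp, issue) tuples for alerts already sent."""
    now = time.time() if now is None else now
    if severity == "CRITICAL":
        return True  # data loss / pipeline down: no rate limit
    recent = [t for t, _ in history if now - t < WINDOW]
    if len(recent) >= MAX_ALERTS:
        return False  # window full: log to health_log.jsonl, skip Slack
    if any(i == issue and now - t < DEDUPE for t, i in history):
        return False  # same issue already alerted within 4 hours
    return True
```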
Section 07
Best Practices — 8 Rules for Writing Good AGENTS.md
Rule 01
Be Explicit, Not Implicit
Agents cannot read minds. "Use your best judgment" is the most dangerous phrase you can put in an AGENTS.md. Every decision point needs a pre-made answer. If you catch yourself writing "use judgment" — replace it with the specific rule that judgment would have followed.
Rule 02
Define "Done" with Precision
The default agent interpretation of "done" is "the script ran without errors." This is almost always wrong. Define done as: output file exists + validator passes + spot-check passes + logged in task log. Make it impossible to claim done without verifying output.
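The automatable parts of that test translate directly into a check the runner can execute. A minimal sketch, assuming the validator is an external command that exits 0 on success (the function name and the command shape are illustrative; the manual spot-check is the one part code cannot do for you):

```python
import os
import subprocess


def is_done(output_path, validator_cmd):
    """'Done' = the output exists, is non-empty, AND its validator exits 0.
    validator_cmd is a command list, e.g. ["python3", "validate.py", path]."""
    if not os.path.isfile(output_path) or os.path.getsize(output_path) == 0:
        return False  # a written-but-empty file is not done
    return subprocess.run(validator_cmd).returncode == 0
```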
Rule 03
Every Destructive Action Needs a NEVER
If an action can't be easily reversed — git push, file deletion, live deployment, financial transaction — it must appear in the NEVER DO list. Anything you don't explicitly prohibit is implicitly permitted. Be paranoid when writing this section.
Rule 04
Plan for Errors: Write the Recovery Path
Every agent run will encounter at least one error. If your AGENTS.md has no error recovery procedure, the agent will improvise — and improvisation at 3am with no human present is where disasters happen. Define exactly: log it, try fallback once, skip and continue.
Rule 05
Tighter Than You Think You Need
Start maximally restrictive. Give the agent only the permissions it needs for the first 10 runs. Expand after you've reviewed logs and gained confidence. Relaxing permissions is easy. Recovering from an overly permissive agent that did something irreversible is not.
Rule 06
Version Control Your AGENTS.md
AGENTS.md is code. It deserves the same git discipline as any other file. Track every change with a commit message. When something goes wrong in an agent run, the first question is "what changed in AGENTS.md between the last good run and now?" You need git history to answer that.
Rule 07
Test on Low-Stakes Runs First
Before granting full autonomy to an agent, run it on a read-only or reversible task. Let it run with --dangerously-skip-permissions on a staging copy of your data. Read every line of the logs. The agent will always find an edge case you didn't anticipate.
Rule 08
Review Logs After Every Early Run
Trust but verify — especially in the first 20 runs. Every decision the agent made that surprised you is a gap in your AGENTS.md. Build the discipline of reviewing agent_task_log.jsonl after every run and patching AGENTS.md for each unexpected behavior.
Section 08
Common Mistakes — 5 Patterns and How to Fix Them
❌
No File Write Boundaries — Agent Writes Everywhere
Problem: AGENTS.md says "write output files" without specifying which directories. Agent writes cache files to root, updates configs, overwrites source files.
Fix: Enumerate exact directories with write permission: outputs/, brain/, code/cache/. Anything not listed is off-limits.
❌
No Error Recovery Procedure — Single Failure Halts Everything
Problem: Step 3 of 8 throws an exception. Agent has no guidance, so it stops. Steps 4-8 never run. You wake up to a half-run pipeline with no explanation.
Fix: Write an explicit recovery procedure: log the error, try the fallback once, mark SKIPPED if fallback fails, continue to the next task. Only halt if your core data fetch (step 1) fails.
❌
"Use Your Judgment" on Irreversible Actions
Problem: AGENTS.md says "use your judgment when deciding whether to send emails." Agent sends 200 test emails to your subscriber list because the logic seemed correct.
Fix: Any action that is irreversible or external must have an explicit rule. Email sends: "ONLY after validate_email.py passes with exit code 0." No judgment calls on anything external.
❌
Vague "Done" Definition — Agent Marks Tasks Complete When Scripts Run
Problem: AGENTS.md says "complete each task and log it." Agent runs every script, logs them all as complete. Half the output files are empty or malformed — the scripts ran but produced no valid output.
Fix: Define done as a three-part test: (1) output file exists, (2) validator passes with exit code 0, (3) you spot-checked at least 3 data points in the output. Anything less = not done.
❌
AGENTS.md and CLAUDE.md Contradict Each Other
Problem: CLAUDE.md says "always ask before writing to any file." AGENTS.md says "write freely to outputs/." Agent sees a conflict and either stops to resolve it (which it can't, no human) or picks an interpretation inconsistently.
Fix: AGENTS.md always wins on overlapping topics. After writing AGENTS.md, read CLAUDE.md and flag every section that AGENTS.md overrides. Add a comment in CLAUDE.md: "# Overridden by AGENTS.md in agentic mode."
Section 09
Multi-Agent Architecture — How AGENTS.md Works with Subagents
When you build multi-agent workflows with Claude Code's Agent tool, each agent in the hierarchy can have its own AGENTS.md. The key design principle: each layer should be more restrictive than the one above it.
The Orchestrator-Specialist Pattern
The most common multi-agent pattern has an orchestrator agent (broader permissions, coordinates work) spawning specialist subagents (narrow permissions, specific tasks). Each subagent reads its own AGENTS.md from its working directory.
Multi-Agent Permission Hierarchy

┌─────────────────────────────────────────────────────────┐
│ ORCHESTRATOR AGENT                                      │
│ AGENTS.md: broad permissions                            │
│ - Read: all files                                       │
│ - Write: outputs/, brain/                               │
│ - Spawn: up to 4 subagents                              │
│ - Git: add + commit                                     │
└──────────────┬──────────────────┬───────────────────────┘
               │ spawns           │ spawns
               ▼                  ▼
┌──────────────────┐   ┌──────────────────────────────────┐
│ DATA AGENT       │   │ CONTENT AGENT                    │
│ .claude/         │   │ .claude/                         │
│ AGENTS.md:       │   │ AGENTS.md:                       │
│ - Read: cache/   │   │ - Read: templates/, brain/       │
│ - Write: cache/  │   │ - Write: outputs/html/           │
│ - API: FRED      │   │ - No API calls                   │
│ - No git         │   │ - No git                         │
│ - No email       │   │ - No email                       │
└──────────────────┘   └──────────────────────────────────┘
Rule: Child AGENTS.md scope ⊆ Parent AGENTS.md scope
Children can never have MORE permissions than their parent.
Orchestrator collects results and handles all external actions.
Key Rules for Multi-Agent AGENTS.md
1.Child permissions are a subset of parent permissions. If the orchestrator can't git push, no subagent can git push. This is not enforced automatically — you must enforce it by writing restrictive subagent AGENTS.md files.
2.External actions belong to the orchestrator. Email sends, git commits, API calls with side effects — these should be orchestrator-level actions. Specialist subagents return data; the orchestrator decides what to do with it.
3.Subagents do not share file state. The orchestrator must write results to disk before spawning a subagent that needs to read them. In-memory state does not transfer across agent boundaries.
4.Each subagent needs its own AGENTS.md. Don't assume a subagent will inherit the parent's AGENTS.md. Explicitly write a subagent-specific config in the subagent's working directory or .claude/ folder.
5.Narrow scope = easier debugging. When a multi-agent system fails, knowing each subagent's exact scope makes it trivial to identify which agent did what. Broad subagent permissions make post-mortem analysis nearly impossible.
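Since nothing enforces rule 1 automatically, a pre-spawn check in the orchestrator's runner can catch violations early. The permission-string format below is invented for illustration; any canonical representation of each config's grants would work:

```python
def violates_hierarchy(parent_perms, child_perms):
    """Return the permissions a child claims that its parent lacks.
    An empty set means the child's scope is a valid subset."""
    return set(child_perms) - set(parent_perms)


# Illustrative grants for an orchestrator and a data-fetcher subagent
orchestrator = {"read:*", "write:outputs/", "write:brain/", "git:commit"}
data_agent = {"read:*", "write:cache/"}
extra = violates_hierarchy(orchestrator, data_agent)
# extra contains "write:cache/": the parent never granted it, so fix
# the child config (or widen the parent) before spawning
```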
Directory structure — multi-agent setup
Structure
# Orchestrator reads from project root
./AGENTS.md                                # Orchestrator config (broad permissions)

# Each specialist agent has its own config
./agents/data-fetcher/.claude/AGENTS.md    # Data agent (API + cache only)
./agents/content-gen/.claude/AGENTS.md     # Content agent (templates + HTML only)
./agents/validator/.claude/AGENTS.md       # Validator agent (read-only)

# Shared state (written by orchestrator, read by subagents)
./outputs/                                 # Orchestrator writes here; subagents read
./brain/session_context.json               # Current session state — commit before spawning
Section 10
Getting Started — 7-Step Setup
From zero to a production-ready AGENTS.md in one focused session.
1
Create AGENTS.md in your project root
Start with a blank file at the project root. Don't copy-paste from examples yet — write the agent identity block first. Who is this agent? What is its single job? What run type is this? Get those three sentences right before writing anything else.
2
Map your authorization tiers
List every action your agent might need to take. Sort each action into Tier 1 (always OK), Tier 2 (OK with logging), Tier 3 (requires escalation), or Tier 4 (never). Do this exercise on paper first — the resulting tier list is the skeleton of your AGENTS.md.
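Once the paper exercise is done, the tier list translates almost mechanically into code your runner can enforce. A sketch — the action names and tier assignments below are illustrative assumptions, not a standard:

```python
# Illustrative tier map: your own actions and assignments come from the
# paper exercise above.
TIERS = {
    "read_file": 1,       # Tier 1: always OK
    "write_outputs": 2,   # Tier 2: OK with logging
    "git_commit": 2,
    "send_email": 3,      # Tier 3: requires escalation
    "git_push": 4,        # Tier 4: never
}

POLICY = {1: "proceed", 2: "proceed_and_log", 3: "escalate", 4: "refuse"}

def authorize(action: str) -> str:
    # Unknown actions default to Tier 3: when in doubt, escalate.
    return POLICY[TIERS.get(action, 3)]

assert authorize("git_push") == "refuse"
assert authorize("format_disk") == "escalate"
```

The default-to-Tier-3 line is the important design choice: anything you forgot to classify escalates instead of proceeding.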
3
Write your NEVER DO list
Write down every action that would be catastrophic if the agent did it unexpectedly. Git push. File deletion. Live financial orders. External sends without validation. Each one gets its own line in the hard boundaries section. Be specific — "don't break things" is not a hard boundary.
4
Add your error recovery procedure
Write the exact 4-step sequence: (1) log the error with full context, (2) try the fallback from brain/fallback-procedures.md, (3) if the fallback fails, mark the task SKIPPED, (4) continue to the next task. Then define the one exception: the critical step whose failure should halt the whole session.
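The 4-step sequence can be sketched as a wrapper in your runner. Everything here is an assumption about your setup — the task/fallback shapes, the log schema, and the status strings are hypothetical:

```python
import json
from datetime import datetime, timezone

def run_with_recovery(tasks, fallbacks, critical=None,
                      log_path="agent_task_log.jsonl"):
    """Sketch of the 4-step sequence: log, try fallback, mark SKIPPED, continue."""
    results = {}
    for name, fn in tasks:
        try:
            fn()
            results[name] = "OK"
            continue
        except Exception as err:
            # (1) log the error with full context
            with open(log_path, "a") as log:
                log.write(json.dumps({
                    "ts": datetime.now(timezone.utc).isoformat(),
                    "task": name,
                    "error": repr(err),
                }) + "\n")
        # (2) try the fallback for this task
        fallback = fallbacks.get(name)
        if fallback is not None:
            try:
                fallback()
                results[name] = "FALLBACK_OK"
                continue
            except Exception:
                pass
        # (3) fallback failed or none exists: mark SKIPPED
        results[name] = "SKIPPED"
        # (4) continue to the next task, unless this is the one critical step
        if name == critical:
            raise RuntimeError(f"critical step {name!r} failed; halting session")
    return results
```

Note that the one-exception rule lives in a single `if` at the end: everything else degrades gracefully.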
5
Define your output standards and "done" definition
What files must every session produce? Where do they go? What format? And critically: what does "done" mean for each task? Write the three-condition done test: file exists, validator passes, spot-check complete. No exceptions.
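The three-condition done test is simple enough to express directly; this sketch assumes your validator is an external command that exits non-zero on failure and that the spot-check is a callable you supply:

```python
import os
import subprocess

def is_done(path, validator_cmd, spot_check):
    """Three-condition 'done' test: exists, validator passes, spot-check passes."""
    if not os.path.exists(path):                        # 1. file exists
        return False
    if subprocess.run(validator_cmd).returncode != 0:   # 2. validator passes
        return False
    return bool(spot_check(path))                       # 3. spot-check complete
```

A task that fails any one condition is not done, full stop; the agent should never be allowed to declare partial completion as success.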
6
Test with claude --dangerously-skip-permissions on a low-stakes task
Pick a task that is fully reversible — regenerating a cached file, re-running analysis on yesterday's data, creating a test output in a temp directory. Run the agent and watch what happens. Read every line of the output. Do not grant autonomy on production tasks until you've done this step at least 3 times.
7
Review logs and tighten boundaries as needed
After each early run, open agent_task_log.jsonl and read every decision the agent made. Every decision that surprised you = a gap in your AGENTS.md. Patch the gap before the next run. After 10 clean runs with no surprises, you have earned the right to trust the agent fully.
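A small helper makes that log review faster. This sketch assumes each line of agent_task_log.jsonl is a JSON object with an "action" field — an assumption about your log schema, not a fixed format:

```python
import json
from collections import Counter

def summarize_log(log_path="agent_task_log.jsonl"):
    """Count log entries per action type so unusual decisions stand out."""
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            counts[entry.get("action", "unknown")] += 1
    return dict(counts)
```

An action type you did not expect to see at all (or an expected one appearing far too often) is exactly the kind of surprise that signals a gap in your AGENTS.md.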
🚀
Quick start: Brainfile's free starter kit includes a pre-built AGENTS.md template for each agent type — data pipeline, content generation, monitoring, developer workflows. Download it at brainfile.io/free-starter-kit and have a working configuration in under 15 minutes.
Section 11 — Brainfile Templates
Pre-Built AGENTS.md Templates for Every Agent Type
Stop writing AGENTS.md from scratch. Brainfile packages include production-ready configurations with sane defaults, already hardened against the most common agent mistakes.
AGENTS.md is referenced in Anthropic's Claude Code SDK documentation as the recommended pattern for configuring autonomous agent behavior. Like CLAUDE.md before it, it originated as a community practice that Anthropic has formally adopted into the Claude Code specification. The file format is plain Markdown — Claude Code reads it and treats it as authoritative instructions for agentic sessions. Both CLAUDE.md and AGENTS.md are part of what Anthropic calls the "configuration layer" for Claude Code projects.
Yes, if you run Claude in agentic or automated mode. CLAUDE.md is designed for interactive sessions where a human is present and can answer questions. AGENTS.md handles the fundamentally different scenario where Claude runs autonomously — no one to answer questions, no one to catch errors in real time. Even if your CLAUDE.md is comprehensive, it almost certainly lacks explicit permission boundaries, escalation triggers, error recovery procedures, and hard-stop rules. These are AGENTS.md-specific concerns. If you only ever run Claude interactively, CLAUDE.md alone is sufficient.
AGENTS.md takes precedence on any topic it covers. If CLAUDE.md says "always ask before writing files" and AGENTS.md says "write freely to outputs/," the AGENTS.md instruction wins in agentic mode. The practical recommendation: after writing AGENTS.md, review CLAUDE.md section by section and identify every place where AGENTS.md overrides it. Add a comment to those CLAUDE.md sections noting the override. This makes your configuration easy to audit and prevents confusion when both files exist in the same repository.
Technically yes — AGENTS.md can grant permissions that CLAUDE.md doesn't mention. But this is almost always the wrong design. AGENTS.md should be more restrictive than CLAUDE.md, not less. In an interactive session, Claude can ask permission for edge cases — that's the safety valve. In agentic mode, there's no safety valve except what you've written in AGENTS.md. Granting an autonomous agent broader permissions than your interactive session is asking for problems. Start tighter and relax permissions after successful runs.
Claude Code does not automatically substitute template variables. The double-brace syntax ({{SESSION_ID}}, {{RUN_TYPE}}, {{START_TIME_UTC}}) is a convention you implement yourself in your runner script. Before invoking Claude, your script uses sed, Python string replacement, or any templating tool to substitute the values into a temporary resolved copy of AGENTS.md. That resolved copy is what Claude reads. This pattern is powerful because it makes each agent run traceable — every log entry includes the session ID, every commit includes the run type, and the agent's behavior can be audited run by run.
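The substitution step itself is only a few lines in a Python runner. A minimal sketch using the variable names from the convention above; the fail-loudly check on leftover placeholders is a design choice, not part of any spec:

```python
import re

def resolve_template(template: str, values: dict) -> str:
    """Replace {{VAR}} placeholders; fail loudly if any are left unresolved."""
    for key, val in values.items():
        template = template.replace("{{" + key + "}}", val)
    leftover = re.findall(r"\{\{(\w+)\}\}", template)
    if leftover:
        raise ValueError(f"unresolved template variables: {leftover}")
    return template

resolved = resolve_template(
    "Session {{SESSION_ID}} ({{RUN_TYPE}}) started {{START_TIME_UTC}}.",
    {"SESSION_ID": "a1b2", "RUN_TYPE": "nightly",
     "START_TIME_UTC": "2025-01-01T02:00Z"},
)
assert resolved == "Session a1b2 (nightly) started 2025-01-01T02:00Z."
```

Your runner would read AGENTS.md as the template, write the resolved copy to a temporary path, and point Claude at that copy; failing on unresolved variables means a typo in a placeholder name stops the run instead of silently shipping literal `{{SESSION_ID}}` text into the agent's instructions.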
A production AGENTS.md is typically 80-200 lines of Markdown. Short enough to be read in full every session start (Claude does read the whole file), long enough to cover all six core sections: agent identity, authorized actions, escalation triggers, hard boundaries, error recovery, and output standards. If your AGENTS.md is under 40 lines, you've probably left important decisions to the agent's discretion. Over 400 lines, and you're either over-engineering it or should be splitting configuration into multiple referenced documents via brain/ files.
One per agent deployment, not one per project. If your project runs three different agent types (data pipeline, content generator, monitor), each should have its own AGENTS.md tailored to its specific scope and permissions. The easiest pattern: keep the general project AGENTS.md in the root for the orchestrator, and put specialist agent configs in subdirectories under agents/ or as .claude/AGENTS.md in agent-specific working directories. Never share one AGENTS.md across agents with different permission levels — the config must match the specific agent's role.
Three tests in order: First, run the agent on a completely reversible task (read-only analysis, regenerating a cached file) and review agent_task_log.jsonl for every decision. Second, intentionally trigger each error condition — cause an API call to fail, make a validator return non-zero, put a file in the wrong format — and verify the agent follows the recovery procedure exactly. Third, review the session handoff: is it accurate? Does "done" mean what you defined? The best agents are those where the task log tells the same story as the actual outputs, with no gaps or surprises.