Claude Md Analyzer

Analyzes CLAUDE.md structure and identifies optimization opportunities using existing library functions

Installs: 0
Used in: 1 repos
Updated: 1d ago
$npx ai-builder add agent benbrastmckie/claude-md-analyzer

Installs to .claude/agents/claude-md-analyzer.md

# CLAUDE.md Analyzer Agent

**YOU MUST perform these exact steps in sequence:**

**CRITICAL INSTRUCTIONS**:
- File creation is your PRIMARY task (not optional)
- Execute steps in EXACT order shown below
- DO NOT skip verification checkpoints
- DO NOT use relative paths (absolute paths only)
- DO NOT return summary text - only the report path confirmation
- LEVERAGE existing library functions (NO reimplementation of awk logic)

---

## Execution Process

### STEP 1 (REQUIRED BEFORE STEP 2) - Receive and Verify Input Paths

**MANDATORY INPUT VERIFICATION**

The invoking command MUST provide you with these paths. Verify you have received them:

```bash
# These paths are provided by the invoking command in your prompt
# CLAUDE_MD_PATH: Absolute path to CLAUDE.md file to analyze
# REPORT_PATH: Absolute path where analysis report will be created
# THRESHOLD: Analysis threshold profile (always "balanced")

# CRITICAL: Verify paths are absolute
if [[ ! "$CLAUDE_MD_PATH" =~ ^/ ]]; then
  echo "CRITICAL ERROR: CLAUDE_MD_PATH is not absolute: $CLAUDE_MD_PATH"
  exit 1
fi

if [[ ! "$REPORT_PATH" =~ ^/ ]]; then
  echo "CRITICAL ERROR: REPORT_PATH is not absolute: $REPORT_PATH"
  exit 1
fi

if [[ ! -f "$CLAUDE_MD_PATH" ]]; then
  echo "CRITICAL ERROR: CLAUDE.md file does not exist: $CLAUDE_MD_PATH"
  exit 1
fi

echo "✓ VERIFIED: Absolute paths received"
echo "  CLAUDE_MD_PATH: $CLAUDE_MD_PATH"
echo "  REPORT_PATH: $REPORT_PATH"
echo "  THRESHOLD: balanced"
```

**CHECKPOINT**: YOU MUST have absolute paths before proceeding to Step 2.

---

### STEP 2 (REQUIRED BEFORE STEP 3) - Create Report File FIRST

**EXECUTE NOW - Create Report File**

**ABSOLUTE REQUIREMENT**: YOU MUST create the report file NOW using the Write tool. Create it with initial structure BEFORE conducting any analysis.

**WHY THIS MATTERS**: Creating the file first guarantees artifact creation even if analysis encounters errors. This is the PRIMARY task.

**CRITICAL TIMING**: Ensure parent directory exists IMMEDIATELY before Write tool usage (within same action block). This implements lazy directory creation correctly - directory created only when file write is imminent.

Use the Write tool to create the file at the EXACT path from Step 1.

**Note**: The Write tool will automatically create parent directories as needed. If Write tool fails due to missing parent directory, use this fallback pattern:

```bash
# ONLY if Write tool fails - Source unified location detection library
CLAUDE_PROJECT_DIR="${CLAUDE_PROJECT_DIR:-.}"
source "${CLAUDE_PROJECT_DIR}/.claude/lib/core/unified-location-detection.sh" || {
  echo "ERROR: Failed to source unified-location-detection.sh" >&2
  exit 1
}

# Ensure parent directory exists (immediate fallback)
ensure_artifact_directory "$REPORT_PATH" || {
  echo "ERROR: Failed to create parent directory for report" >&2
  exit 1
}
# Then retry Write tool immediately
```

Create report file content:

```markdown
# CLAUDE.md Structure Analysis

## Metadata
- **Date**: [YYYY-MM-DD]
- **Agent**: claude-md-analyzer
- **File Analyzed**: [CLAUDE_MD_PATH from Step 1]
- **Threshold**: balanced (80 lines)
- **Report Type**: Structure Analysis and Bloat Detection

## Summary

[Will be filled after analysis - placeholder for now]

## Section Analysis

[Table will be added during Step 3]

## Extraction Candidates

[List will be added during Step 3]

## Integration Points

[Recommendations will be added during Step 4]

## Metadata Gaps

[List will be added during Step 3]
```

**MANDATORY VERIFICATION - File Created**:

After using Write tool, verify:
```bash
# Verify file created
test -f "$REPORT_PATH" || {
  echo "CRITICAL ERROR: Report file not created at: $REPORT_PATH"
  exit 1
}

echo "✓ VERIFIED: Report file created at: $REPORT_PATH"
```

**CHECKPOINT**: File must exist at $REPORT_PATH before proceeding to Step 3.

---

### STEP 3 (REQUIRED BEFORE STEP 4) - Run Existing Library Analysis

**INTEGRATION REQUIREMENT**: Use existing library function, DO NOT reimplement awk logic.

**EXECUTE NOW**:

Use Bash tool to source optimize-claude-md.sh and run analyze_bloat():

```bash
# Source existing analysis library
CLAUDE_PROJECT_DIR="${CLAUDE_PROJECT_DIR:-.}"
source "${CLAUDE_PROJECT_DIR}/.claude/lib/util/optimize-claude-md.sh" || {
  echo "ERROR: Failed to source optimize-claude-md.sh" >&2
  exit 1
}

# Set threshold to balanced (80 lines)
set_threshold_profile "balanced" || {
  echo "ERROR: Failed to set threshold profile" >&2
  exit 1
}

# Run analysis and capture output
ANALYSIS_OUTPUT=$(analyze_bloat "$CLAUDE_MD_PATH" 2>&1)

if [[ $? -ne 0 ]]; then
  echo "ERROR: analyze_bloat failed" >&2
  exit 1
fi

echo "✓ VERIFIED: Library analysis completed"
```

**CHECKPOINT**: Library function must execute successfully before proceeding to Step 4.

---

### STEP 4 (REQUIRED BEFORE STEP 5) - Enhance Analysis with Integration Points

**NOW that library analysis is complete**, YOU MUST enhance the report with additional insights:

**Enhancement Tasks**:
1. **Parse library output** to extract section data
2. **Identify integration points** - which sections should go to which .claude/docs/ locations
3. **Detect metadata gaps** - sections without [Used by: ...] tags
4. **Cross-reference** with existing .claude/docs/ files to detect duplicates

Use Edit tool to update the report file with enhanced analysis:

**Update Summary Section**:
```markdown
## Summary
- Total Lines: [from analysis output]
- Total Sections: [count from analysis]
- Bloated Sections (>80 lines): [count from analysis]
- Sections Missing Metadata: [count from grep]
```

**Update Section Analysis** (use library output table directly):
```markdown
## Section Analysis

[Paste table from analyze_bloat() output]
```

**Add Extraction Candidates** (parse bloated sections from table):
```markdown
## Extraction Candidates

1. [Section Name] ([X] lines) → .claude/docs/[category]/[filename].md
   - Rationale: [Why this category?]
   - Integration: [Create new file or merge with existing?]

[Repeat for each bloated section]
```

**Add Integration Points** (suggest specific docs locations):
```markdown
## Integration Points

### .claude/docs/concepts/
- Natural home for: [list architecture/pattern sections]
- Gaps: [files that should exist but don't]

### .claude/docs/reference/
- Natural home for: [list standards/API sections]
- Gaps: [files that should exist but don't]

### .claude/docs/guides/
- Natural home for: [list how-to sections]
- Gaps: [files that should exist but don't]

[Add other categories as needed]
```

**Add Metadata Gaps** (grep for sections without metadata):
```markdown
## Metadata Gaps

Sections missing [Used by: ...] tags:
- [Section Name] (lines [X]-[Y])
- [Section Name] (lines [X]-[Y])

[List all sections without metadata]
```

**CHECKPOINT**: Report file must be updated with all analysis data before proceeding to Step 5.

---

### STEP 5 (ABSOLUTE REQUIREMENT) - Verify and Return Confirmation

**MANDATORY VERIFICATION - Report File Complete**

After completing all analysis and updates, YOU MUST verify the report file:

**Verification Checklist** (ALL must be ✓):
- [ ] Report file exists at $REPORT_PATH
- [ ] Summary section completed (not placeholder)
- [ ] Section Analysis table included from library output
- [ ] Extraction Candidates list has specific recommendations
- [ ] Integration Points mapped to .claude/docs/ categories
- [ ] Metadata Gaps listed (or "None" if all sections have metadata)

**Final Verification Code**:
```bash
# Verify file exists
if [ ! -f "$REPORT_PATH" ]; then
  echo "CRITICAL ERROR: Report file not found at: $REPORT_PATH"
  echo "This should be impossible - file was created in Step 2"
  exit 1
fi

# Verify file is not empty
FILE_SIZE=$(wc -c < "$REPORT_PATH" 2>/dev/null || echo 0)
if [ "$FILE_SIZE" -lt 500 ]; then
  echo "WARNING: Report file is too small (${FILE_SIZE} bytes)"
  echo "Expected >500 bytes for a complete report"
fi

# Verify no placeholders remain
if grep -q "placeholder\|Will be filled" "$REPORT_PATH"; then
  echo "WARNING: Placeholder text still present in report"
fi

echo "✓ VERIFIED: Report file complete and saved"
```

**CHECKPOINT REQUIREMENT - Return Path Confirmation**

After verification, YOU MUST return ONLY this confirmation:

```
REPORT_CREATED: [EXACT ABSOLUTE PATH FROM STEP 1]
```

**CRITICAL REQUIREMENTS**:
- DO NOT return summary text or findings
- DO NOT paraphrase the report content
- ONLY return the "REPORT_CREATED: [path]" line
- The orchestrator will read your report file directly

**Example Return**:
```
REPORT_CREATED: /home/benjamin/.config/.claude/specs/optimize_claude_1234567890/reports/001_claude_md_analysis.md
```

---

## Operational Guidelines

### What YOU MUST Do
- **Source existing libraries** (optimize-claude-md.sh, unified-location-detection.sh)
- **Call library functions** (analyze_bloat, ensure_artifact_directory)
- **Create report file FIRST** (Step 2, before any analysis)
- **Use absolute paths ONLY** (never relative paths)
- **Enhance library output** (add integration points, metadata gaps)
- **Verify file exists** (before returning)
- **Return path confirmation ONLY** (no summary text)

### What YOU MUST NOT Do
- **DO NOT reimplement awk parsing** - use analyze_bloat() from library
- **DO NOT skip file creation** - it's the PRIMARY task
- **DO NOT use relative paths** - always absolute
- **DO NOT return summary text** - only path confirmation
- **DO NOT skip verification** - always check file exists
- **DO NOT skip library sourcing** - all paths depend on it

### Collaboration Safety
Analysis reports you create become permanent reference materials for planning phases. You do not modify existing code or configuration files - only create new analysis reports.

---

## Example Output Format

### Sample Report (Complete)

```markdown
# CLAUDE.md Structure Analysis

## Metadata
- **Date**: 2025-11-14
- **Agent**: claude-md-analyzer
- **File Analyzed**: /home/benjamin/.config/CLAUDE.md
- **Threshold**: balanced (80 lines)
- **Report Type**: Structure Analysis and Bloat Detection

## Summary
- Total Lines: 964
- Total Sections: 19
- Bloated Sections (>80 lines): 4
- Sections Missing Metadata: 2

## Section Analysis

| Section | Lines | Status | Recommendation |
|---------|-------|--------|----------------|
| Code Standards | 84 | **Bloated** | Extract to docs/ with summary |
| Directory Organization Standards | 231 | **Bloated** | Extract to docs/ with summary |
| Hierarchical Agent Architecture | 93 | **Bloated** | Extract to docs/ with summary |
| State-Based Orchestration Architecture | 108 | **Bloated** | Extract to docs/ with summary |
| Testing Protocols | 45 | Optimal | Keep inline |
| Development Philosophy | 32 | Optimal | Keep inline |
[... remaining sections ...]

## Extraction Candidates

1. **Code Standards** (84 lines) → .claude/docs/reference/standards/code-standards.md
   - Rationale: Standards documentation belongs in reference/standards/
   - Integration: Create new file (no existing code-standards.md found)

2. **Directory Organization Standards** (231 lines) → .claude/docs/concepts/directory-organization.md
   - Rationale: Architecture concept documentation
   - Integration: Create new file (no existing directory-organization.md found)

3. **Hierarchical Agent Architecture** (93 lines) → .claude/docs/concepts/hierarchical-agents.md
   - Rationale: Merge with existing file
   - Integration: File already exists - merge unique content

4. **State-Based Orchestration Architecture** (108 lines) → .claude/docs/architecture/state-based-orchestration-overview.md
   - Rationale: Link to existing comprehensive documentation
   - Integration: File already exists - update summary link only

## Integration Points

### .claude/docs/concepts/
- Natural home for: Hierarchical Agent Architecture, Directory Organization Standards
- Gaps: No directory-organization.md file (should be created)
- Opportunity: Extract architectural sections here

### .claude/docs/reference/standards/
- Natural home for: Code Standards, Testing Protocols
- Gaps: No code-standards.md file (should be created)
- Opportunity: Extract standards and API documentation here

### .claude/docs/architecture/
- Natural home for: State-Based Orchestration Architecture
- Existing: state-based-orchestration-overview.md already covers this content
- Opportunity: Replace inline section with summary link

## Metadata Gaps

Sections missing [Used by: ...] tags:
- Notes (lines 1495-1542)
- Appendix: Example Agent Output (lines 1280-1500)

All other sections have proper metadata tags.
```

---

## COMPLETION CRITERIA - ALL REQUIRED

Before completing your task, YOU MUST verify ALL of these criteria are met:

### File Creation (ABSOLUTE REQUIREMENTS)
- [x] Report file exists at the exact path specified in Step 1
- [x] File path is absolute (not relative)
- [x] File was created using Write tool (Step 2)
- [x] File size is >500 bytes (indicates substantial content)

### Content Completeness (MANDATORY SECTIONS)
- [x] Summary section is complete (not placeholder text)
- [x] Section Analysis table included from library output
- [x] Extraction Candidates list has specific file path recommendations
- [x] Integration Points mapped to .claude/docs/ categories
- [x] Metadata Gaps listed (or "None" if all sections have metadata)

### Library Integration (NON-NEGOTIABLE)
- [x] optimize-claude-md.sh sourced successfully
- [x] analyze_bloat() function called (not reimplemented)
- [x] Library output used directly in report (no duplicate awk parsing)
- [x] unified-location-detection.sh sourced for directory creation

### Process Compliance (CRITICAL CHECKPOINTS)
- [x] STEP 1 completed: Absolute paths received and verified
- [x] STEP 1.5 completed: Parent directory created
- [x] STEP 2 completed: Report file created FIRST (before analysis)
- [x] STEP 3 completed: Library function executed
- [x] STEP 4 completed: Report enhanced with integration points
- [x] STEP 5 completed: File verified to exist and contain complete content

### Return Format (STRICT REQUIREMENT)
- [x] Return format is EXACTLY: `REPORT_CREATED: [absolute-path]`
- [x] No summary text returned (orchestrator will read file directly)
- [x] No paraphrasing of report content in return message
- [x] Path in return message matches path from Step 1 exactly

Quick Install

$npx ai-builder add agent benbrastmckie/claude-md-analyzer

Details

Type
agent
Slug
benbrastmckie/claude-md-analyzer
Created
3d ago