Executable Code in Agent Skills
Use deterministic code execution to build reliable, production-grade Agent Skills
Why Use Executable Code in Skills?
Determinism & Reliability
A script produces the same output for the same input as long as it avoids randomness, clocks, and network calls, whereas code the LLM writes on the fly can vary between runs
Token Efficiency
Script code does not enter the context window when it runs; only its output consumes tokens, so large scripts and dependencies add no token cost
Performance
Ideal for heavy computation like parsing large PDFs or running complex algorithms
- PDF Skill: Python script to extract form fields
- Document Skills: Excel, PowerPoint, Word manipulation
- Data processing: Complex calculations and transformations
Integrating Executable Code: Structure
my-skill-folder/
├── SKILL.md              # Core instructions (references scripts)
├── scripts/              # Executable code
│   ├── process_data.py
│   ├── validate_input.sh
│   └── utils/
│       └── helper.py
├── REFERENCE.md          # Optional docs
└── resources/            # Templates, data files
Claude uses relative paths like:
python scripts/process_data.py input.csv

- Python (primary): rich libraries via the code execution tool
- Bash (supported): shell commands and utilities
- Node.js (possible): works if dependencies are minimal
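Because Claude invokes scripts by relative path, the working directory at run time may not be the Skill folder. One way to make a script robust to that is to locate bundled files relative to the script itself. This is a minimal sketch under that assumption, not something the layout requires; the helper name `resource_path` and the file `template.csv` are hypothetical.

```python
from pathlib import Path


def resource_path(script_file, name):
    """Resolve a file in resources/, given the path of a script directly under scripts/."""
    # scripts/<script>.py -> scripts/ -> skill root
    skill_root = Path(script_file).resolve().parent.parent
    return skill_root / "resources" / name


# Inside scripts/process_data.py this would be called as
# resource_path(__file__, "template.csv"); a literal path is used here
# so the example runs anywhere. "template.csv" is a made-up file name.
example = resource_path("my-skill-folder/scripts/process_data.py", "template.csv")
print(example.parts[-3:])
```

A script nested deeper, such as `scripts/utils/helper.py`, would need one more `.parent` step.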
Advanced Techniques for Scripts
- Provide examples of commands and expected outputs
- Handle errors with retry instructions
- Be specific about file paths and parameters
Example:
python scripts/analyze.py --file data.csv --mode summary
Scripts read from files; output JSON for easy parsing
Scripts should be idempotent, verbose, and return structured data
- Use try/except blocks in Python
- Validate inputs before processing
- Use meaningful exit codes
- Provide clear error messages
- Include validation scripts
- Use pre-installed libraries only
- No pip installs in sandbox
- Design around available packages
- Check library availability in Claude Code
- Use the `allowed-tools` field in the SKILL.md YAML frontmatter to restrict access
- Audit scripts thoroughly
- Only use trusted Skills
- Monitor script behavior
- Chain scripts where one outputs to the next
- Combine Skills for different capabilities
- Use modular script design
- Share data between scripts via files
- Reference scripts conditionally in SKILL.md
- Use markdown links for on-demand loading
- Organize scripts by functionality
- Minimize initial script references
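Several of the guidelines above (validate inputs, use try/except, return meaningful exit codes, print clear errors, emit structured output) can be combined in one script. The following is a minimal sketch, not a reference implementation: it assumes a hypothetical script that accepts the `--file` and `--mode` flags from the earlier command, uses only the standard library, and invents its own exit codes and JSON fields.

```python
import argparse
import csv
import json
import sys
import tempfile
from pathlib import Path

# Hypothetical exit codes, chosen for this sketch only.
EXIT_OK = 0
EXIT_BAD_INPUT = 1
EXIT_FAILURE = 3


def summarize(path):
    """Count the data rows and list the header columns of a CSV file."""
    with path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        rows = sum(1 for _ in reader)
        return {"rows": rows, "columns": reader.fieldnames or []}


def main(argv):
    parser = argparse.ArgumentParser(description="Summarize a CSV file as JSON")
    parser.add_argument("--file", required=True, type=Path)
    parser.add_argument("--mode", choices=["summary"], default="summary")
    args = parser.parse_args(argv)

    # Validate inputs before processing and say how to recover.
    if not args.file.is_file():
        print(f"error: {args.file} not found; check the path and retry", file=sys.stderr)
        return EXIT_BAD_INPUT

    try:
        result = summarize(args.file)
    except (OSError, csv.Error, UnicodeDecodeError) as exc:
        print(f"error: could not parse {args.file}: {exc}", file=sys.stderr)
        return EXIT_FAILURE

    # Structured data goes to stdout; diagnostics go to stderr.
    print(json.dumps({"status": "ok", "mode": args.mode, **result}))
    return EXIT_OK


# Demo with a throwaway file; a real script would instead end with
# `if __name__ == "__main__": sys.exit(main(sys.argv[1:]))`.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "data.csv"
    sample.write_text("a,b\n1,2\n3,4\n")
    ok_code = main(["--file", str(sample), "--mode", "summary"])
    missing_code = main(["--file", str(Path(tmp) / "absent.csv")])
```

Keeping JSON on stdout and error text on stderr lets the caller parse the result without filtering, and the distinct exit codes tell a bad path apart from a parsing failure.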
Example: Data Processing Skill
Workflow:
1. Run `python scripts/clean_csv.py input.csv` to clean data
2. Run `python scripts/analyze.py cleaned.csv` for stats (outputs JSON)
3. Parse JSON and summarize insights
# scripts/clean_csv.py
import sys

import pandas as pd

# Read the CSV whose path is passed as the first argument
df = pd.read_csv(sys.argv[1])

# Clean data: drop rows with missing values, then exact duplicate rows
df = df.dropna()
df = df.drop_duplicates()

# Save cleaned data to the working directory for the next step
df.to_csv('cleaned.csv', index=False)
print(f"Cleaned {len(df)} rows")

# scripts/analyze.py
import json
import sys

import pandas as pd

# Read cleaned CSV
df = pd.read_csv(sys.argv[1])

# Calculate statistics (numeric_only skips text columns)
stats = {
    "rows": len(df),
    "columns": len(df.columns),
    "numeric_columns": list(df.select_dtypes(include=['number']).columns),
    "mean": df.mean(numeric_only=True).to_dict(),
    "std": df.std(numeric_only=True).to_dict(),
    "missing_values": df.isnull().sum().to_dict(),
}

# Output JSON on stdout so the caller can parse it
print(json.dumps(stats, indent=2))

Code Execution Tool Environment
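Since scripts are expected to rely on pre-installed packages, a script can confirm its imports are available before doing any work and report the gap as structured data. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `missing_modules` and the JSON shape are made up for illustration, and the module list is only an example.

```python
import importlib.util
import json


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example list only; replace it with the modules your script imports.
required = ["json", "csv", "pandas"]
missing = missing_modules(required)
print(json.dumps({"required": required, "missing": missing}))
```

Running a check like this first turns a mid-run `ImportError` into a clear up-front message about which package to design around.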
Development Tips
Test in Separate Sessions
Observe Claude's bash commands to understand execution patterns
Start with skill-creator
Use built-in helper for scaffolding script structure
Check Anthropic's GitHub
Reference document skills scripts like PDF extraction
github.com/anthropics/skills
Iterate Based on Failures
Run tasks without script → identify failures → add script
Production-Grade Reliability
Executable code elevates Skills from guided prompts to hybrid agent-tools, enabling production-grade reliability. For complex needs, combine multiple modular Skills.