CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
AI Research Skills Library - A comprehensive open-source library of 90 AI research skills enabling AI agents to autonomously conduct AI research — from idea to paper. Each skill provides expert-level guidance (200-500 lines) with real code examples, troubleshooting guides, and production-ready workflows.
Mission: Enable AI agents to autonomously conduct AI research from hypothesis to experimental verification, covering the full lifecycle: literature survey, ideation, dataset preparation, training pipelines, model deployment, evaluation, and paper writing.
Repository Architecture
Directory Structure (90 Skills Across 23 Categories)
Skills are organized into numbered categories representing the AI research lifecycle:
- 0-autoresearch-skill/ - Autonomous research orchestration (1 skill: Autoresearch, the central layer that manages the full lifecycle and routes to all other skills)
- 01-model-architecture/ - Model architectures (5 skills: Megatron-Core, LitGPT, Mamba, RWKV, NanoGPT)
- 02-tokenization/ - Tokenizers (2 skills: HuggingFace Tokenizers, SentencePiece)
- 03-fine-tuning/ - Fine-tuning frameworks (4 skills: Axolotl, LLaMA-Factory, Unsloth, PEFT)
- 04-mechanistic-interpretability/ - Interpretability tools (4 skills: TransformerLens, SAELens, NNsight, Pyvene)
- 05-data-processing/ - Data curation (2 skills: Ray Data, NeMo Curator)
- 06-post-training/ - RLHF/DPO/GRPO (8 skills: TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, torchforge)
- 07-safety-alignment/ - Safety and guardrails (4 skills: Constitutional AI, LlamaGuard, NeMo Guardrails, Prompt Guard)
- 08-distributed-training/ - Distributed systems (6 skills: Megatron-Core, DeepSpeed, FSDP, Accelerate, PyTorch Lightning, Ray Train)
- 09-infrastructure/ - Cloud compute (3 skills: Modal, SkyPilot, Lambda Labs)
- 10-optimization/ - Optimization techniques (6 skills: Flash Attention, bitsandbytes, GPTQ, AWQ, HQQ, GGUF)
- 11-evaluation/ - Benchmarking (3 skills: lm-evaluation-harness, BigCode, NeMo Evaluator)
- 12-inference-serving/ - Inference engines (4 skills: vLLM, TensorRT-LLM, llama.cpp, SGLang)
- 13-mlops/ - Experiment tracking (3 skills: Weights & Biases, MLflow, TensorBoard)
- 14-agents/ - Agent frameworks (4 skills: LangChain, LlamaIndex, CrewAI, AutoGPT)
- 15-rag/ - Retrieval-augmented generation (5 skills: Chroma, FAISS, Sentence Transformers, Pinecone, Qdrant)
- 16-prompt-engineering/ - Structured output (4 skills: DSPy, Instructor, Guidance, Outlines)
- 17-observability/ - LLM observability (2 skills: LangSmith, Phoenix)
- 18-multimodal/ - Vision and speech (7 skills: CLIP, Whisper, LLaVA, Stable Diffusion, SAM, BLIP-2, AudioCraft)
- 19-emerging-techniques/ - Advanced methods (6 skills: MoE Training, Model Merging, Long Context, Speculative Decoding, Knowledge Distillation, Model Pruning)
- 20-ml-paper-writing/ - Paper writing (1 skill: ML Paper Writing with LaTeX templates for NeurIPS, ICML, ICLR, ACL, AAAI, COLM)
- 21-research-ideation/ - Ideation (2 skills: Research Brainstorming, Creative Thinking)
- 22-agent-native-research-artifact/ - Agent-Native Research Artifact tooling (3 skills: ARA Compiler, ARA Research Manager, ARA Rigor Reviewer, covering ingestion, post-task provenance recording, and Seal Level 2 epistemic review)
Skill File Structure
Each skill follows a standardized format:
skill-name/
├── SKILL.md # Main guidance (200-500 lines with YAML frontmatter)
├── references/ # Deep documentation (300KB+ target)
│ ├── README.md # From official docs
│ ├── api.md # API reference
│ ├── tutorials.md # Step-by-step guides
│ ├── issues.md # Real GitHub issues & solutions
│ └── releases.md # Version history
├── scripts/ # Helper scripts (optional)
├── templates/ # Code templates (optional)
└── examples/ # Example implementations (optional)
Skill Quality Standards
YAML Frontmatter Requirements (CRITICAL)
All SKILL.md files MUST include YAML frontmatter with these exact fields:
---
name: skill-name-here # kebab-case, no quotes, gerund form preferred
description: Third-person description of what AND when to use this skill # No quotes, max 1024 chars
version: 1.0.0 # Semantic versioning
author: Orchestra Research # Standard author
license: MIT # Standard license
tags: [Tag One, Tag Two] # Title Case (except UPPERCASE acronyms like GRPO, TRL, RLHF)
dependencies: [pkg>=1.0.0] # Optional, with version constraints
---
Critical Rules:
- name: Use gerund form (e.g., serving-llms, processing-data, grpo-rl-training)
- description: Third person ("Provides guidance for..."), include WHAT it does AND WHEN to use it
- tags: Title Case for regular words, UPPERCASE for acronyms (GRPO, TRL, RLHF, DPO, PPO)
- No quotes around any field values (except in arrays)
- Dependencies should include version constraints: transformers>=4.47.0
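The frontmatter rules above can be checked mechanically. The sketch below is a minimal, standard-library-only illustration, not the repository's actual validator: the function names `parse_frontmatter` and `check_frontmatter` are hypothetical, and the parser only handles flat `key: value` lines rather than full YAML.

```python
import re

# Fields the frontmatter section lists as mandatory (dependencies is optional).
REQUIRED_FIELDS = ("name", "description", "version", "author", "license", "tags")
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def parse_frontmatter(text):
    """Return top-level `key: value` pairs from the leading --- block.

    Simplified on purpose: no nested YAML, values are kept as raw strings,
    and anything after " #" is treated as a trailing comment.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.split(" #", 1)[0].strip()
    return fields


def check_frontmatter(text):
    """Return a list of rule violations (empty when everything passes)."""
    fields = parse_frontmatter(text)
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in fields]
    name = fields.get("name", "")
    if name and not KEBAB_CASE.match(name):
        problems.append("name is not kebab-case")
    if len(fields.get("description", "")) > 1024:
        problems.append("description exceeds 1024 characters")
    version = fields.get("version", "")
    if version and not SEMVER.match(version):
        problems.append("version is not MAJOR.MINOR.PATCH")
    for field in ("name", "description"):
        if fields.get(field, "")[:1] in ("'", '"'):
            problems.append(f"{field} is quoted")
    return problems
```

Calling `check_frontmatter(open("skill-name/SKILL.md").read())` returns an empty list for a compliant file. Gerund form and tag casing are not checked here because they need judgment a regex cannot supply.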
Content Quality Standards
Core Requirements (based on Anthropic official best practices):
- ✅ SKILL.md body: 200-500 lines (under 500 lines is critical for performance)
- ✅ Progressive disclosure: SKILL.md as overview, details in separate reference files
- ✅ Workflows with copy-paste checklists for complex tasks
- ✅ "When to use vs alternatives" guidance section
- ✅ Common issues section with solutions
- ✅ Concise content: assume Claude is smart, no over-explaining basics
- ✅ Code examples with language detection (python, bash, etc.)
- ✅ References ONE level deep from SKILL.md (no nested references)
Gold Standard (aim for this - see 06-post-training/grpo-rl-training/):
- ✅ 2-3 complete workflows with step-by-step checklists
- ✅ Reference files for advanced topics (one level deep)
- ✅ Feedback loops (validate → fix → repeat) for quality-critical operations
- ✅ Consistent terminology throughout
- ✅ Concrete input/output examples
- ✅ Real GitHub issues with solutions (when available)
NOT Acceptable:
- ❌ SKILL.md over 500 lines (split into reference files instead)
- ❌ Over-explaining basics that Claude already knows
- ❌ First-person descriptions ("I can help you...")
- ❌ Vague skill names ("helper", "utils", "tools")
- ❌ Nested references (SKILL.md → ref1.md → ref2.md)
- ❌ Missing workflows with checklists for complex tasks
Development Workflow
Adding a New Skill
- Choose skill from roadmap (see CONTRIBUTING.md or README.md)
- Create directory structure in appropriate category (01-22)
- Write SKILL.md with YAML frontmatter following standards above
- Add reference documentation (target 300KB+ from official sources)
- Validate quality:
  - Check SKILL.md has YAML frontmatter
  - Verify SKILL.md is 200-500 lines
  - Ensure code blocks have language tags
  - Confirm references are one level deep from SKILL.md
  - Check documentation size: du -sh skill-name/references/
- Test the skill with real use cases before submitting
Improving Existing Skills
When updating skills:
- Maintain YAML frontmatter format and fields
- Keep SKILL.md under 500 lines - split into reference files if needed
- Add workflows with checklists for complex operations
- Update version number in YAML frontmatter
- Test changes with representative tasks
Quality Validation Commands
# Check YAML frontmatter exists
head -20 skill-name/SKILL.md
# Verify SKILL.md line count (target 200-500 lines)
wc -l skill-name/SKILL.md
# Check documentation size (target 300KB+)
du -sh skill-name/references/
# Verify code blocks have language tags
grep -A 1 '```' skill-name/SKILL.md | head -20
# Validate YAML frontmatter syntax
python -c "import yaml; yaml.safe_load(open('skill-name/SKILL.md').read().split('---')[1])"
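The line-count and language-tag checks above rely on reading command output by eye. As a rough alternative, the following sketch automates both; the function names are hypothetical, and it assumes fences use three backticks and are never nested.

```python
FENCE = "`" * 3  # built this way so the fence marker never appears literally here


def untagged_fences(markdown):
    """Return 1-based line numbers of opening code fences with no language tag."""
    missing = []
    inside_block = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if inside_block:
            inside_block = False  # this fence closes the current block
        else:
            inside_block = True  # this fence opens a new block
            if stripped == FENCE:
                missing.append(number)
    return missing


def body_line_count(markdown):
    """Count the lines after the closing --- of the frontmatter, if present."""
    lines = markdown.splitlines()
    if lines and lines[0].strip() == "---":
        for index, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return len(lines) - index - 1
    return len(lines)


def line_count_ok(markdown, low=200, high=500):
    """True when the body length falls inside the 200-500 line target."""
    return low <= body_line_count(markdown) <= high
```

Unlike `wc -l`, `body_line_count` excludes the frontmatter, which matches the "SKILL.md body" wording in the quality standards.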
Key Files
- README.md - Project overview, all 90 skills listed with descriptions and stats
- CONTRIBUTING.md - Complete contribution guidelines and quality standards
- SKILL_TEMPLATE.md - Copy-paste scaffold for new skills
- ROADMAP.md - Development roadmap (90 skills achieved)
- anthropic_official_docs/ - Anthropic's official best practices for skills
Git Workflow
Standard Git workflow:
# Create feature branch
git checkout -b add-skill-name
# Add and commit changes
git add category/skill-name/
git commit -m "Add [Skill Name] skill
- X lines of documentation
- Y GitHub issues with solutions
- API reference and examples included"
# Push to fork and create PR
git push origin add-skill-name
Automation: Orchestra Skill Marketplace Sync
How Auto-Sync Works
When skills are committed to the main branch, GitHub Actions automatically syncs them to the Orchestra skill marketplace:
- GitHub Actions detects changed skill folders on push to main
- For each changed skill:
  - Extracts metadata from SKILL.md frontmatter (name, author, etc.)
  - Creates ZIP file containing entire skill directory (SKILL.md, references/, scripts/, etc.)
  - Uploads to Orchestra API endpoint
- Orchestra stores ZIP in Supabase Storage and creates database record
- Skill appears in marketplace at https://orchestra.com/research-skills
Workflow File Location
- File: .github/workflows/sync-skills.yml
- Triggers: Push to main branch, manual workflow dispatch
- What syncs: Only skill directories that changed in the commit
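This file does not show how the workflow decides which skill directories changed, so the following is only an illustrative sketch of one way to do it. It assumes the changed paths come from something like `git diff --name-only` and that a skill directory is any directory containing a SKILL.md; the function name `changed_skill_dirs` is hypothetical.

```python
from pathlib import PurePosixPath


def changed_skill_dirs(changed_paths, skill_md_paths):
    """Map changed file paths to the skill directories that contain them.

    changed_paths:  repo-relative paths touched by the commit.
    skill_md_paths: repo-relative paths of every SKILL.md in the repository.
    """
    skill_dirs = {str(PurePosixPath(p).parent) for p in skill_md_paths}
    changed = set()
    for path in changed_paths:
        # Walk up from the file until a known skill directory is found.
        for parent in PurePosixPath(path).parents:
            if str(parent) in skill_dirs:
                changed.add(str(parent))
                break
    return sorted(changed)
```

Files outside any skill directory, such as the top-level README.md, map to nothing and are ignored.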
Author Detection (Orchestra vs Community)
The workflow reads the author: field from SKILL.md frontmatter to determine badge:
Official Orchestra Skill:
---
author: Orchestra Research # Contains "Orchestra"
---
- Result: Source = orchestra (Official badge)
- Storage: research-skills/orchestra/skill-name.zip
Community Skill:
---
author: Jane Doe # Does NOT contain "Orchestra"
---
- Result: Source = community (Community badge)
- Storage: research-skills/community/skill-name.zip
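The badge rule described above amounts to a substring test on the author field. A minimal sketch follows; the function names are hypothetical, and treating the match as case-sensitive is an assumption, since this file does not say how the workflow compares the string.

```python
def classify_author(author):
    """Return "orchestra" when the author contains "Orchestra", else "community".

    Assumption: the comparison is case-sensitive.
    """
    return "orchestra" if "Orchestra" in author else "community"


def storage_path(author, skill_name):
    """Build the storage key using the layout shown in the examples above."""
    return f"research-skills/{classify_author(author)}/{skill_name}.zip"
```

For example, `storage_path("Jane Doe", "serving-llms")` gives `research-skills/community/serving-llms.zip`.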
What Gets Synced
The workflow zips ALL contents of skill directory:
- ✅ SKILL.md
- ✅ references/ (all subdirectories)
- ✅ scripts/ (if exists)
- ✅ assets/ (if exists)
- ✅ examples/ (if exists)
- ✅ templates/ (if exists)
- ❌ Hidden files (.gitkeep, .DS_Store)
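To preview what a skill's archive would contain, the packaging rule above can be approximated locally. This is a sketch under assumptions, not the workflow's code: the function name `zip_skill` is hypothetical, and "hidden" is taken to mean any path component starting with a dot.

```python
import zipfile
from pathlib import Path


def zip_skill(skill_dir, zip_path):
    """Zip every non-hidden file under skill_dir and return the archived names.

    Keep zip_path outside skill_dir so the archive does not include itself.
    """
    skill_dir = Path(skill_dir)
    included = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in skill_dir.rglob("*"):
            relative = path.relative_to(skill_dir)
            # Skip directories and anything with a dot-prefixed path component.
            if not path.is_file() or any(part.startswith(".") for part in relative.parts):
                continue
            archive.write(path, relative.as_posix())
            included.append(relative.as_posix())
    return sorted(included)
```

The returned list makes it easy to confirm that files such as .gitkeep and .DS_Store were left out before pushing.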
Testing the Sync
Manual trigger:
- Go to GitHub Actions tab
- Select "Sync Skills to Orchestra" workflow
- Click "Run workflow"
Test with commit:
# Make a small change to any skill
printf '\n<!-- Updated %s -->\n' "$(date)" >> 01-model-architecture/litgpt/SKILL.md
# Commit and push to main
git add .
git commit -m "test: trigger auto-sync"
git push origin main
Verify sync worked:
- Check GitHub Actions tab for workflow run status
- Check Orchestra marketplace for updated skill
- Check Supabase Storage for ZIP file
Important Notes
- GitHub Secrets required: ORCHESTRA_API_URL, ORCHESTRA_SYNC_API_KEY (already configured)
- Only syncs changed skills: Workflow detects which skill directories changed in commit
- SKILL.md required: Skills without SKILL.md are skipped with warning
- See detailed setup: dev_data/GITHUB_SKILLS_SYNC_SETUP.md
npm Package Publishing
How It Works
The publish-npm.yml workflow auto-publishes to npm when the version in packages/ai-research-skills/package.json changes on main.
- Auth: Uses OIDC trusted publishing (no npm tokens). Configured on npmjs.com under the package's Trusted Publishers settings.
- Provenance: --provenance flag signs packages with Sigstore for supply chain security.
- Workflow: .github/workflows/publish-npm.yml
Bumping Versions
Always use npm version (not manual edits) to keep package-lock.json in sync:
cd packages/ai-research-skills
npm version patch # 1.3.6 → 1.3.7
npm version minor # 1.3.7 → 1.4.0
npm version major # 1.4.0 → 2.0.0
Use --no-git-tag-version if you want to commit manually.
Common Issues
- npm ci fails in CI: package-lock.json is out of sync. Run npm install locally and commit the lockfile.
- OIDC auth fails: The trusted publisher config on npmjs.com must match the repo exactly (case-sensitive: Orchestra-Research/AI-Research-SKILLs, workflow: publish-npm.yml).
- NODE_AUTH_TOKEN blocks OIDC: actions/setup-node with registry-url auto-sets this token. The workflow unsets it before publish so OIDC takes over.
- Version unchanged skip: The workflow compares HEAD vs HEAD~1. If only the lockfile changed (not package.json version), publish is skipped. Bump the version to trigger.
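The "version unchanged skip" behavior comes down to comparing one field across two revisions of package.json. The sketch below illustrates that comparison only; the function name is hypothetical, and it assumes the two JSON strings were obtained separately, for instance with `git show HEAD~1:packages/ai-research-skills/package.json` and the same command for `HEAD`.

```python
import json


def version_changed(previous_package_json, current_package_json):
    """Return True when the "version" field differs between two package.json texts."""
    previous = json.loads(previous_package_json).get("version")
    current = json.loads(current_package_json).get("version")
    return previous != current
```

A commit that touches only package-lock.json leaves both versions equal, so a publish step gated on this result would be skipped.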
Important Conventions
Naming Conventions
- Skill names: Use gerund form (verb + -ing) in kebab-case: processing-pdfs, serving-llms, grpo-rl-training
- Tags: Title Case for words, UPPERCASE for acronyms (GRPO, TRL, RLHF, DPO, PPO, FSDP, MoE)
- Descriptions: Third person, include what AND when to use
Code Examples
Always tag fenced code blocks with their language (python, bash, etc.):
# Good - has language tag
from transformers import AutoModel
NOT:
# Bad - no language tag
from transformers import AutoModel
Progressive Disclosure Pattern
SKILL.md should link directly to reference files (one level deep):
## Advanced Features
**API Reference**: See [references/api.md](references/api.md)
**Troubleshooting**: See [references/issues.md](references/issues.md)
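The one-level-deep rule can be spot-checked by looking for Markdown links inside reference files that point at other local .md files. This is a rough sketch with hypothetical function names; it only recognizes inline links of the form `[text](target.md)` and ignores reference-style links.

```python
import re

# Inline Markdown links whose target ends in .md, with an optional #anchor.
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def markdown_links(text):
    """Return the .md targets of all inline links in the text."""
    return MD_LINK.findall(text)


def nested_reference_links(reference_text):
    """Return local .md links found inside a reference file.

    A non-empty result suggests a SKILL.md -> ref1.md -> ref2.md chain.
    """
    return [
        target
        for target in markdown_links(reference_text)
        if not target.startswith(("http://", "https://"))
    ]
```

Running `nested_reference_links` over each file in references/ and expecting an empty list for all of them is one way to enforce the pattern.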
Philosophy
Quality over Quantity: This library maintains high standards by:
- Requiring 200-500 line SKILL.md files (focused, actionable guidance)
- Including 300KB+ documentation from official sources
- Providing real GitHub issues with solutions
- Following Anthropic's official best practices for skills
- Testing skills with real use cases before inclusion
Each skill represents expert-level knowledge distilled into a format optimized for AI agent consumption.