Files

T

nanobot b275af2b4d fix: dereference orchestra-skills submodule, add as plain files

2026-05-05 23:28:24 +02:00

8.3 KiB

Raw Blame History

Skill Routing: When to Use Which Domain Skill

The autoresearch skill orchestrates — domain skills execute. This reference maps research activities to the skills library.

Routing Principle

When you encounter a domain-specific task during research, search the skills library for the right tool. Read the SKILL.md of the relevant skill before starting — it contains workflows, common issues, and production-ready code examples.

Complete Routing Map

Data and Preprocessing

Task	Skill	Location
Large-scale data processing	Ray Data	`05-data-processing/ray-data/`
Data curation and filtering	NeMo Curator	`05-data-processing/nemo-curator/`
Custom tokenizer training	HuggingFace Tokenizers	`02-tokenization/hf-tokenizers/`
Subword tokenization	SentencePiece	`02-tokenization/sentencepiece/`

Model Architecture and Training

Task	Skill	Location
Large-scale pretraining	Megatron-Core	`01-model-architecture/megatron-core/`
Lightweight LLM training	LitGPT	`01-model-architecture/litgpt/`
State-space models	Mamba	`01-model-architecture/mamba/`
Linear attention models	RWKV	`01-model-architecture/rwkv/`
Small-scale pretraining	NanoGPT	`01-model-architecture/nanogpt/`

Fine-tuning

Task	Skill	Location
Multi-method fine-tuning	Axolotl	`03-fine-tuning/axolotl/`
Template-based fine-tuning	LLaMA-Factory	`03-fine-tuning/llama-factory/`
Fast LoRA fine-tuning	Unsloth	`03-fine-tuning/unsloth/`
PyTorch-native fine-tuning	Torchtune	`03-fine-tuning/torchtune/`

Post-training (RL / Alignment)

Task	Skill	Location
PPO, DPO, SFT pipelines	TRL	`06-post-training/trl/`
Group Relative Policy Optimization	GRPO	`06-post-training/grpo-rl-training/`
Scalable RLHF	OpenRLHF	`06-post-training/openrlhf/`
Reference-free alignment	SimPO	`06-post-training/simpo/`

Interpretability

Task	Skill	Location
Transformer circuit analysis	TransformerLens	`04-mechanistic-interpretability/transformerlens/`
Sparse autoencoder training	SAELens	`04-mechanistic-interpretability/saelens/`
Intervention experiments	NNsight	`04-mechanistic-interpretability/nnsight/`
Causal tracing	Pyvene	`04-mechanistic-interpretability/pyvene/`

Distributed Training

Task	Skill	Location
ZeRO optimization	DeepSpeed	`08-distributed-training/deepspeed/`
Fully sharded data parallel	FSDP	`08-distributed-training/fsdp/`
Multi-GPU abstraction	Accelerate	`08-distributed-training/accelerate/`
Training framework	PyTorch Lightning	`08-distributed-training/pytorch-lightning/`
Distributed data + training	Ray Train	`08-distributed-training/ray-train/`

Evaluation

Task	Skill	Location
Standard LLM benchmarks	lm-evaluation-harness	`11-evaluation/lm-eval-harness/`
NeMo-integrated evaluation	NeMo Evaluator	`11-evaluation/nemo-evaluator/`
Custom eval tasks	Inspect AI	`11-evaluation/inspect-ai/`

Inference and Serving

Task	Skill	Location
High-throughput serving	vLLM	`12-inference-serving/vllm/`
NVIDIA-optimized inference	TensorRT-LLM	`12-inference-serving/tensorrt-llm/`
CPU / edge inference	llama.cpp	`12-inference-serving/llama-cpp/`
Structured generation serving	SGLang	`12-inference-serving/sglang/`

Experiment Tracking

Task	Skill	Location
Full experiment tracking	Weights & Biases	`13-mlops/wandb/`
Open-source tracking	MLflow	`13-mlops/mlflow/`
Training visualization	TensorBoard	`13-mlops/tensorboard/`

Optimization Techniques

Task	Skill	Location
Efficient attention	Flash Attention	`10-optimization/flash-attention/`
4/8-bit quantization	bitsandbytes	`10-optimization/bitsandbytes/`
GPTQ quantization	GPTQ	`10-optimization/gptq/`
AWQ quantization	AWQ	`10-optimization/awq/`
GGUF format (llama.cpp)	GGUF	`10-optimization/gguf/`
PyTorch-native quantization	Quanto	`10-optimization/quanto/`

Safety and Alignment

Task	Skill	Location
Constitutional AI training	Constitutional AI	`07-safety-alignment/constitutional-ai/`
Content safety classification	LlamaGuard	`07-safety-alignment/llamaguard/`
Guardrail pipelines	NeMo Guardrails	`07-safety-alignment/nemo-guardrails/`
Prompt injection detection	Prompt Guard	`07-safety-alignment/prompt-guard/`

Infrastructure

Task	Skill	Location
Serverless GPU compute	Modal	`09-infrastructure/modal/`
Multi-cloud orchestration	SkyPilot	`09-infrastructure/skypilot/`
GPU cloud instances	Lambda Labs	`09-infrastructure/lambda-labs/`

Agents and RAG

Task	Skill	Location
Agent pipelines	LangChain	`14-agents/langchain/`
Knowledge retrieval agents	LlamaIndex	`14-agents/llamaindex/`
Lightweight agents	Smolagents	`14-agents/smolagents/`
Claude-based agents	Claude Agent SDK	`14-agents/claude-agent-sdk/`
Vector store (local)	Chroma	`15-rag/chroma/`
Vector similarity search	FAISS	`15-rag/faiss/`
Text embeddings	Sentence Transformers	`15-rag/sentence-transformers/`
Managed vector DB	Pinecone	`15-rag/pinecone/`
Scalable vector DB	Milvus	`15-rag/milvus/`

Prompt Engineering and Structured Output

Task	Skill	Location
Prompt optimization	DSPy	`16-prompt-engineering/dspy/`
Structured LLM output	Instructor	`16-prompt-engineering/instructor/`
Constrained generation	Guidance	`16-prompt-engineering/guidance/`
Grammar-based generation	Outlines	`16-prompt-engineering/outlines/`

Multimodal

Task	Skill	Location
Vision-language models	CLIP	`18-multimodal/clip/`
Speech recognition	Whisper	`18-multimodal/whisper/`
Visual instruction tuning	LLaVA	`18-multimodal/llava/`
Vision-language (Qwen)	Qwen2-VL	`18-multimodal/qwen2-vl/`
Vision-language (Mistral)	Pixtral	`18-multimodal/pixtral/`
Visual understanding	Florence-2	`18-multimodal/florence-2/`
Document retrieval	ColPali	`18-multimodal/colpali/`

Observability

Task	Skill	Location
LLM tracing and debugging	LangSmith	`17-observability/langsmith/`
LLM observability platform	Phoenix	`17-observability/phoenix/`

Emerging Techniques

Task	Skill	Location
Mixture of Experts training	MoE Training	`19-emerging-techniques/moe-training/`
Combining trained models	Model Merging	`19-emerging-techniques/model-merging/`
Extended context windows	Long Context	`19-emerging-techniques/long-context/`
Faster inference via drafting	Speculative Decoding	`19-emerging-techniques/speculative-decoding/`
Teacher-student compression	Knowledge Distillation	`19-emerging-techniques/knowledge-distillation/`
Reducing model size	Model Pruning	`19-emerging-techniques/model-pruning/`

Research Output

Task	Skill	Location
Generate research ideas	Research Ideation	`21-research-ideation/`
Write publication-ready paper	ML Paper Writing	`20-ml-paper-writing/`

Common Research Workflows

"I need to fine-tune a model and evaluate it"

Pick fine-tuning skill based on needs (Unsloth for speed, Axolotl for flexibility)
Use lm-evaluation-harness for standard benchmarks
Track with W&B or MLflow

"I need to understand what the model learned"

Use TransformerLens for circuit-level analysis
Train SAEs with SAELens for feature-level understanding
Run interventions with NNsight or Pyvene

"I need to do RL training"

Start with TRL for standard PPO/DPO
Use GRPO skill for DeepSeek-R1 style training
Scale with OpenRLHF if needed

"I need to run experiments on cloud GPUs"

Modal for quick serverless runs
SkyPilot for multi-cloud optimization
Lambda Labs for dedicated instances

Finding Skills

If you're not sure which skill to use:

# Search by keyword in skill names
ls */*/SKILL.md | head -20

# Search skill descriptions for a keyword
grep -l "keyword" */*/SKILL.md

Or search the repository's README.md which lists all skills with descriptions.

8.3 KiB Raw Blame History