Files
ara/orchestra-skills/0-autoresearch-skill/references/skill-routing.md
T

8.3 KiB

Skill Routing: When to Use Which Domain Skill

The autoresearch skill orchestrates — domain skills execute. This reference maps research activities to the skills library.

Routing Principle

When you encounter a domain-specific task during research, search the skills library for the right tool. Read the SKILL.md of the relevant skill before starting — it contains workflows, common issues, and production-ready code examples.

Complete Routing Map

Data and Preprocessing

Task Skill Location
Large-scale data processing Ray Data 05-data-processing/ray-data/
Data curation and filtering NeMo Curator 05-data-processing/nemo-curator/
Custom tokenizer training HuggingFace Tokenizers 02-tokenization/hf-tokenizers/
Subword tokenization SentencePiece 02-tokenization/sentencepiece/

Model Architecture and Training

Task Skill Location
Large-scale pretraining Megatron-Core 01-model-architecture/megatron-core/
Lightweight LLM training LitGPT 01-model-architecture/litgpt/
State-space models Mamba 01-model-architecture/mamba/
Linear attention models RWKV 01-model-architecture/rwkv/
Small-scale pretraining NanoGPT 01-model-architecture/nanogpt/

Fine-tuning

Task Skill Location
Multi-method fine-tuning Axolotl 03-fine-tuning/axolotl/
Template-based fine-tuning LLaMA-Factory 03-fine-tuning/llama-factory/
Fast LoRA fine-tuning Unsloth 03-fine-tuning/unsloth/
PyTorch-native fine-tuning Torchtune 03-fine-tuning/torchtune/

Post-training (RL / Alignment)

Task Skill Location
PPO, DPO, SFT pipelines TRL 06-post-training/trl/
Group Relative Policy Optimization GRPO 06-post-training/grpo-rl-training/
Scalable RLHF OpenRLHF 06-post-training/openrlhf/
Reference-free alignment SimPO 06-post-training/simpo/

Interpretability

Task Skill Location
Transformer circuit analysis TransformerLens 04-mechanistic-interpretability/transformerlens/
Sparse autoencoder training SAELens 04-mechanistic-interpretability/saelens/
Intervention experiments NNsight 04-mechanistic-interpretability/nnsight/
Causal tracing Pyvene 04-mechanistic-interpretability/pyvene/

Distributed Training

Task Skill Location
ZeRO optimization DeepSpeed 08-distributed-training/deepspeed/
Fully sharded data parallel FSDP 08-distributed-training/fsdp/
Multi-GPU abstraction Accelerate 08-distributed-training/accelerate/
Training framework PyTorch Lightning 08-distributed-training/pytorch-lightning/
Distributed data + training Ray Train 08-distributed-training/ray-train/

Evaluation

Task Skill Location
Standard LLM benchmarks lm-evaluation-harness 11-evaluation/lm-eval-harness/
NeMo-integrated evaluation NeMo Evaluator 11-evaluation/nemo-evaluator/
Custom eval tasks Inspect AI 11-evaluation/inspect-ai/

Inference and Serving

Task Skill Location
High-throughput serving vLLM 12-inference-serving/vllm/
NVIDIA-optimized inference TensorRT-LLM 12-inference-serving/tensorrt-llm/
CPU / edge inference llama.cpp 12-inference-serving/llama-cpp/
Structured generation serving SGLang 12-inference-serving/sglang/

Experiment Tracking

Task Skill Location
Full experiment tracking Weights & Biases 13-mlops/wandb/
Open-source tracking MLflow 13-mlops/mlflow/
Training visualization TensorBoard 13-mlops/tensorboard/

Optimization Techniques

Task Skill Location
Efficient attention Flash Attention 10-optimization/flash-attention/
4/8-bit quantization bitsandbytes 10-optimization/bitsandbytes/
GPTQ quantization GPTQ 10-optimization/gptq/
AWQ quantization AWQ 10-optimization/awq/
GGUF format (llama.cpp) GGUF 10-optimization/gguf/
PyTorch-native quantization Quanto 10-optimization/quanto/

Safety and Alignment

Task Skill Location
Constitutional AI training Constitutional AI 07-safety-alignment/constitutional-ai/
Content safety classification LlamaGuard 07-safety-alignment/llamaguard/
Guardrail pipelines NeMo Guardrails 07-safety-alignment/nemo-guardrails/
Prompt injection detection Prompt Guard 07-safety-alignment/prompt-guard/

Infrastructure

Task Skill Location
Serverless GPU compute Modal 09-infrastructure/modal/
Multi-cloud orchestration SkyPilot 09-infrastructure/skypilot/
GPU cloud instances Lambda Labs 09-infrastructure/lambda-labs/

Agents and RAG

Task Skill Location
Agent pipelines LangChain 14-agents/langchain/
Knowledge retrieval agents LlamaIndex 14-agents/llamaindex/
Lightweight agents Smolagents 14-agents/smolagents/
Claude-based agents Claude Agent SDK 14-agents/claude-agent-sdk/
Vector store (local) Chroma 15-rag/chroma/
Vector similarity search FAISS 15-rag/faiss/
Text embeddings Sentence Transformers 15-rag/sentence-transformers/
Managed vector DB Pinecone 15-rag/pinecone/
Scalable vector DB Milvus 15-rag/milvus/

Prompt Engineering and Structured Output

Task Skill Location
Prompt optimization DSPy 16-prompt-engineering/dspy/
Structured LLM output Instructor 16-prompt-engineering/instructor/
Constrained generation Guidance 16-prompt-engineering/guidance/
Grammar-based generation Outlines 16-prompt-engineering/outlines/

Multimodal

Task Skill Location
Vision-language models CLIP 18-multimodal/clip/
Speech recognition Whisper 18-multimodal/whisper/
Visual instruction tuning LLaVA 18-multimodal/llava/
Vision-language (Qwen) Qwen2-VL 18-multimodal/qwen2-vl/
Vision-language (Mistral) Pixtral 18-multimodal/pixtral/
Visual understanding Florence-2 18-multimodal/florence-2/
Document retrieval ColPali 18-multimodal/colpali/

Observability

Task Skill Location
LLM tracing and debugging LangSmith 17-observability/langsmith/
LLM observability platform Phoenix 17-observability/phoenix/

Emerging Techniques

Task Skill Location
Mixture of Experts training MoE Training 19-emerging-techniques/moe-training/
Combining trained models Model Merging 19-emerging-techniques/model-merging/
Extended context windows Long Context 19-emerging-techniques/long-context/
Faster inference via drafting Speculative Decoding 19-emerging-techniques/speculative-decoding/
Teacher-student compression Knowledge Distillation 19-emerging-techniques/knowledge-distillation/
Reducing model size Model Pruning 19-emerging-techniques/model-pruning/

Research Output

Task Skill Location
Generate research ideas Research Ideation 21-research-ideation/
Write publication-ready paper ML Paper Writing 20-ml-paper-writing/

Common Research Workflows

"I need to fine-tune a model and evaluate it"

  1. Pick fine-tuning skill based on needs (Unsloth for speed, Axolotl for flexibility)
  2. Use lm-evaluation-harness for standard benchmarks
  3. Track with W&B or MLflow

"I need to understand what the model learned"

  1. Use TransformerLens for circuit-level analysis
  2. Train SAEs with SAELens for feature-level understanding
  3. Run interventions with NNsight or Pyvene

"I need to do RL training"

  1. Start with TRL for standard PPO/DPO
  2. Use GRPO skill for DeepSeek-R1 style training
  3. Scale with OpenRLHF if needed

"I need to run experiments on cloud GPUs"

  1. Modal for quick serverless runs
  2. SkyPilot for multi-cloud optimization
  3. Lambda Labs for dedicated instances

Finding Skills

If you're not sure which skill to use:

# Search by keyword in skill names
ls */*/SKILL.md | head -20

# Search skill descriptions for a keyword
grep -l "keyword" */*/SKILL.md

Or search the repository's README.md which lists all skills with descriptions.