Skill Routing: When to Use Which Domain Skill
The autoresearch skill orchestrates — domain skills execute. This reference maps research activities to the skills library.
Routing Principle
When you encounter a domain-specific task during research, search the skills library for the right tool. Read the SKILL.md of the relevant skill before starting — it contains workflows, common issues, and production-ready code examples.
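The lookup described above can be sketched as a small table-driven function. This is only an illustration: `ROUTING_MAP` and `skill_doc_for` are hypothetical names, the three entries are copied from the routing tables in this reference, and the sketch assumes each skill's SKILL.md sits directly inside its listed location.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical excerpt of the routing map; entries are copied from this reference.
ROUTING_MAP = {
    "large-scale data processing": "05-data-processing/ray-data/",
    "fast lora fine-tuning": "03-fine-tuning/unsloth/",
    "standard llm benchmarks": "11-evaluation/lm-eval-harness/",
}


def skill_doc_for(task: str) -> Optional[PurePosixPath]:
    """Return the SKILL.md path for a task, or None if the task is not mapped."""
    location = ROUTING_MAP.get(task.strip().lower())
    if location is None:
        return None
    # Assumption: SKILL.md lives at the top level of the skill directory.
    return PurePosixPath(location) / "SKILL.md"
```

A `None` result means the task is not in the excerpt, in which case the full tables below are the place to look.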
Complete Routing Map
Data and Preprocessing
| Task | Skill | Location |
|---|---|---|
| Large-scale data processing | Ray Data | 05-data-processing/ray-data/ |
| Data curation and filtering | NeMo Curator | 05-data-processing/nemo-curator/ |
| Custom tokenizer training | HuggingFace Tokenizers | 02-tokenization/hf-tokenizers/ |
| Subword tokenization | SentencePiece | 02-tokenization/sentencepiece/ |
Model Architecture and Training
| Task | Skill | Location |
|---|---|---|
| Large-scale pretraining | Megatron-Core | 01-model-architecture/megatron-core/ |
| Lightweight LLM training | LitGPT | 01-model-architecture/litgpt/ |
| State-space models | Mamba | 01-model-architecture/mamba/ |
| Linear attention models | RWKV | 01-model-architecture/rwkv/ |
| Small-scale pretraining | NanoGPT | 01-model-architecture/nanogpt/ |
Fine-tuning
| Task | Skill | Location |
|---|---|---|
| Multi-method fine-tuning | Axolotl | 03-fine-tuning/axolotl/ |
| Template-based fine-tuning | LLaMA-Factory | 03-fine-tuning/llama-factory/ |
| Fast LoRA fine-tuning | Unsloth | 03-fine-tuning/unsloth/ |
| PyTorch-native fine-tuning | Torchtune | 03-fine-tuning/torchtune/ |
Post-training (RL / Alignment)
| Task | Skill | Location |
|---|---|---|
| PPO, DPO, SFT pipelines | TRL | 06-post-training/trl/ |
| Group Relative Policy Optimization | GRPO | 06-post-training/grpo-rl-training/ |
| Scalable RLHF | OpenRLHF | 06-post-training/openrlhf/ |
| Reference-free alignment | SimPO | 06-post-training/simpo/ |
Interpretability
| Task | Skill | Location |
|---|---|---|
| Transformer circuit analysis | TransformerLens | 04-mechanistic-interpretability/transformerlens/ |
| Sparse autoencoder training | SAELens | 04-mechanistic-interpretability/saelens/ |
| Intervention experiments | NNsight | 04-mechanistic-interpretability/nnsight/ |
| Causal tracing | Pyvene | 04-mechanistic-interpretability/pyvene/ |
Distributed Training
| Task | Skill | Location |
|---|---|---|
| ZeRO optimization | DeepSpeed | 08-distributed-training/deepspeed/ |
| Fully sharded data parallel | FSDP | 08-distributed-training/fsdp/ |
| Multi-GPU abstraction | Accelerate | 08-distributed-training/accelerate/ |
| Training framework | PyTorch Lightning | 08-distributed-training/pytorch-lightning/ |
| Distributed data + training | Ray Train | 08-distributed-training/ray-train/ |
Evaluation
| Task | Skill | Location |
|---|---|---|
| Standard LLM benchmarks | lm-evaluation-harness | 11-evaluation/lm-eval-harness/ |
| NeMo-integrated evaluation | NeMo Evaluator | 11-evaluation/nemo-evaluator/ |
| Custom eval tasks | Inspect AI | 11-evaluation/inspect-ai/ |
Inference and Serving
| Task | Skill | Location |
|---|---|---|
| High-throughput serving | vLLM | 12-inference-serving/vllm/ |
| NVIDIA-optimized inference | TensorRT-LLM | 12-inference-serving/tensorrt-llm/ |
| CPU / edge inference | llama.cpp | 12-inference-serving/llama-cpp/ |
| Structured generation serving | SGLang | 12-inference-serving/sglang/ |
Experiment Tracking
| Task | Skill | Location |
|---|---|---|
| Full experiment tracking | Weights & Biases | 13-mlops/wandb/ |
| Open-source tracking | MLflow | 13-mlops/mlflow/ |
| Training visualization | TensorBoard | 13-mlops/tensorboard/ |
Optimization Techniques
| Task | Skill | Location |
|---|---|---|
| Efficient attention | Flash Attention | 10-optimization/flash-attention/ |
| 4/8-bit quantization | bitsandbytes | 10-optimization/bitsandbytes/ |
| GPTQ quantization | GPTQ | 10-optimization/gptq/ |
| AWQ quantization | AWQ | 10-optimization/awq/ |
| GGUF format (llama.cpp) | GGUF | 10-optimization/gguf/ |
| PyTorch-native quantization | Quanto | 10-optimization/quanto/ |
Safety and Alignment
| Task | Skill | Location |
|---|---|---|
| Constitutional AI training | Constitutional AI | 07-safety-alignment/constitutional-ai/ |
| Content safety classification | LlamaGuard | 07-safety-alignment/llamaguard/ |
| Guardrail pipelines | NeMo Guardrails | 07-safety-alignment/nemo-guardrails/ |
| Prompt injection detection | Prompt Guard | 07-safety-alignment/prompt-guard/ |
Infrastructure
| Task | Skill | Location |
|---|---|---|
| Serverless GPU compute | Modal | 09-infrastructure/modal/ |
| Multi-cloud orchestration | SkyPilot | 09-infrastructure/skypilot/ |
| GPU cloud instances | Lambda Labs | 09-infrastructure/lambda-labs/ |
Agents and RAG
| Task | Skill | Location |
|---|---|---|
| Agent pipelines | LangChain | 14-agents/langchain/ |
| Knowledge retrieval agents | LlamaIndex | 14-agents/llamaindex/ |
| Lightweight agents | Smolagents | 14-agents/smolagents/ |
| Claude-based agents | Claude Agent SDK | 14-agents/claude-agent-sdk/ |
| Vector store (local) | Chroma | 15-rag/chroma/ |
| Vector similarity search | FAISS | 15-rag/faiss/ |
| Text embeddings | Sentence Transformers | 15-rag/sentence-transformers/ |
| Managed vector DB | Pinecone | 15-rag/pinecone/ |
| Scalable vector DB | Milvus | 15-rag/milvus/ |
Prompt Engineering and Structured Output
| Task | Skill | Location |
|---|---|---|
| Prompt optimization | DSPy | 16-prompt-engineering/dspy/ |
| Structured LLM output | Instructor | 16-prompt-engineering/instructor/ |
| Constrained generation | Guidance | 16-prompt-engineering/guidance/ |
| Grammar-based generation | Outlines | 16-prompt-engineering/outlines/ |
Multimodal
| Task | Skill | Location |
|---|---|---|
| Vision-language models | CLIP | 18-multimodal/clip/ |
| Speech recognition | Whisper | 18-multimodal/whisper/ |
| Visual instruction tuning | LLaVA | 18-multimodal/llava/ |
| Vision-language (Qwen) | Qwen2-VL | 18-multimodal/qwen2-vl/ |
| Vision-language (Mistral) | Pixtral | 18-multimodal/pixtral/ |
| Visual understanding | Florence-2 | 18-multimodal/florence-2/ |
| Document retrieval | ColPali | 18-multimodal/colpali/ |
Observability
| Task | Skill | Location |
|---|---|---|
| LLM tracing and debugging | LangSmith | 17-observability/langsmith/ |
| LLM observability platform | Phoenix | 17-observability/phoenix/ |
Emerging Techniques
| Task | Skill | Location |
|---|---|---|
| Mixture of Experts training | MoE Training | 19-emerging-techniques/moe-training/ |
| Combining trained models | Model Merging | 19-emerging-techniques/model-merging/ |
| Extended context windows | Long Context | 19-emerging-techniques/long-context/ |
| Faster inference via drafting | Speculative Decoding | 19-emerging-techniques/speculative-decoding/ |
| Teacher-student compression | Knowledge Distillation | 19-emerging-techniques/knowledge-distillation/ |
| Reducing model size | Model Pruning | 19-emerging-techniques/model-pruning/ |
Research Output
| Task | Skill | Location |
|---|---|---|
| Generate research ideas | Research Ideation | 21-research-ideation/ |
| Write publication-ready paper | ML Paper Writing | 20-ml-paper-writing/ |
Common Research Workflows
"I need to fine-tune a model and evaluate it"
- Pick a fine-tuning skill based on your needs (Unsloth for speed, Axolotl for flexibility)
- Use lm-evaluation-harness for standard benchmarks
- Track with W&B or MLflow
"I need to understand what the model learned"
- Use TransformerLens for circuit-level analysis
- Train SAEs with SAELens for feature-level understanding
- Run interventions with NNsight or Pyvene
"I need to do RL training"
- Start with TRL for standard PPO/DPO
- Use the GRPO skill for DeepSeek-R1-style training
- Scale with OpenRLHF if needed
"I need to run experiments on cloud GPUs"
- Modal for quick serverless runs
- SkyPilot for multi-cloud optimization
- Lambda Labs for dedicated instances
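Workflows like these can be written down as ordered steps that point at skill locations, which makes it easy to list the SKILL.md files to read before starting. The sketch below is an illustration only: `WORKFLOWS` and `reading_list` are hypothetical names, the step labels are informal, the locations are copied from the routing tables, and it assumes SKILL.md sits at the top level of each skill directory.

```python
from typing import List

# Hypothetical encoding of two workflows; locations are copied from the routing tables.
WORKFLOWS = {
    "fine-tune and evaluate": [
        ("fine-tune", ["03-fine-tuning/unsloth/", "03-fine-tuning/axolotl/"]),
        ("evaluate", ["11-evaluation/lm-eval-harness/"]),
        ("track", ["13-mlops/wandb/", "13-mlops/mlflow/"]),
    ],
    "rl training": [
        ("start", ["06-post-training/trl/"]),
        ("deepseek-r1 style", ["06-post-training/grpo-rl-training/"]),
        ("scale", ["06-post-training/openrlhf/"]),
    ],
}


def reading_list(workflow: str) -> List[str]:
    """Flatten a workflow into the ordered SKILL.md files to read first."""
    steps = WORKFLOWS[workflow]
    # Keep step order, and keep the order of alternatives within each step.
    return [f"{location}SKILL.md" for _, options in steps for location in options]
```

Where a step lists several alternatives, such as Unsloth and Axolotl, all of them appear in the reading list so the choice can be made after skimming each SKILL.md.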
Finding Skills
If you're not sure which skill to use:
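One option is to scan every SKILL.md in the library for a keyword. The following is a minimal sketch and not the repository's own tooling: `find_skills` is a hypothetical helper, and it assumes each skill directory holds a file named SKILL.md at its top level.

```python
from pathlib import Path
from typing import List


def find_skills(root: str, keyword: str) -> List[str]:
    """Return skill directories (relative to root) whose SKILL.md mentions keyword."""
    needle = keyword.lower()
    matches = []
    for skill_file in Path(root).rglob("SKILL.md"):
        # Ignore undecodable bytes so one odd file does not abort the scan.
        text = skill_file.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            matches.append(skill_file.parent.relative_to(root).as_posix())
    return sorted(matches)
```

Calling `find_skills(".", "quantization")` from the library root would list every skill whose SKILL.md mentions quantization. The match is a plain case-insensitive substring test, so broad keywords will return many skills.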
Or search the repository's README.md, which lists all skills with descriptions.