docs: add PR testing workflow guide
Comprehensive guide for using the staging environment:
- Quick start with test-pr.sh script
- Manual testing methods
- Cache verification procedures
- Session management
- Troubleshooting tips

Includes examples for multi-turn testing and cache validation.
@@ -0,0 +1,214 @@
# PR Testing Workflow

Guide for testing pull requests using the local staging environment.

## Quick Start

```bash
./test-pr.sh <pr-number> "test message"
```

## Staging Environment

**Location:** `/config/workspace/.nanobot-staging/`

**Components:**
- `config.json` - Staging configuration (channels disabled, shared OAuth)
- `workspace/` - Isolated workspace for tool operations
- `workspace/sessions/` - Session storage (separate from production)

**Key differences from production:**
- No external channels (Telegram disabled)
- Config file is selected via the `NANOBOT_CONFIG` environment variable
- Gateway runs on `localhost:18791` (production uses 18790)
- `restrictToWorkspace: true` for safety
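
Since every staging command needs the same `NANOBOT_CONFIG` prefix, a small shell wrapper can cut down on repetition. The `staging` function here is a hypothetical convenience, not something shipped in the repository:

```bash
# Hypothetical helper: runs any command with NANOBOT_CONFIG pointing at the
# staging config. The variable is set only for that one command, so the
# surrounding shell keeps whatever configuration it had before.
staging() {
    env NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json "$@"
}

# Example: staging .venv/bin/nanobot agent -m "test message"
```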

## Testing a PR

### Method 1: Helper Script (Recommended)

```bash
# Test PR with default message
./test-pr.sh 31

# Test with custom message
./test-pr.sh 31 "test the hidden message feature"
```

**What it does:**
1. Fetches the PR from the `wylab` remote (force-updating the local branch if it already exists)
2. Checks out the PR branch locally
3. Installs the package in editable mode with `uv pip install -e .`
4. Runs the test against the staging config via the `NANOBOT_CONFIG` env var
5. Leaves the PR branch checked out for further testing
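
The steps above could be sketched roughly as follows. This is not the actual `test-pr.sh`; it is a minimal illustration that reuses the `pr-<N>` branch naming and the commands from the manual method. The `run` helper, the `DRY_RUN` switch, and the default message are assumptions made for the example.

```bash
# Hypothetical sketch of a PR test helper; the real test-pr.sh may differ.
STAGING_CONFIG=/config/workspace/.nanobot-staging/config.json

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

test_pr() {
    pr="$1"
    message="${2:-hello}"   # default message is an assumption

    # Each step runs only if the previous one succeeded.
    # --force lets the fetch overwrite an existing local pr-<N> branch.
    run git fetch --force wylab "pull/${pr}/head:pr-${pr}" &&
    run git checkout "pr-${pr}" &&
    run uv pip install -e . &&
    # env sets NANOBOT_CONFIG for this single command only.
    run env "NANOBOT_CONFIG=${STAGING_CONFIG}" .venv/bin/nanobot agent -m "$message"
}
```

Setting `DRY_RUN=1` before calling `test_pr 31` prints the four commands without touching the repository. Note that git refuses to fetch into the branch that is currently checked out, so re-running this sketch for the same PR would require `git checkout main` first.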

**After testing:**
```bash
git checkout main  # Return to main branch
```

### Method 2: Manual Testing

```bash
# 1. Fetch and check out the PR
cd /config/workspace/nanobot-oauth-port/nanobot-fork
git fetch wylab pull/<N>/head:pr-<N>
git checkout pr-<N>

# 2. Install in editable mode
uv pip install -e .

# 3. Test with staging config
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
.venv/bin/nanobot agent -m "test message"

# 4. For multi-turn testing
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
.venv/bin/nanobot agent  # Interactive mode

# 5. Return to main
git checkout main
```

### Method 3: Gateway Validation

Test that the gateway starts without errors:

```bash
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
.venv/bin/nanobot gateway

# Kill with Ctrl+C when validated
```
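
For an unattended check, one option is to let the gateway run for a few seconds and treat "still running at the deadline" as success. The sketch below assumes GNU `timeout` (which exits with status 124 when the time limit is hit) and assumes that a failed startup makes the gateway exit on its own; `still_running_after` is a hypothetical helper name.

```bash
# Hypothetical helper: succeeds only if the command is still running when the
# time limit expires. Relies on GNU coreutils `timeout` returning status 124.
still_running_after() {
    secs="$1"
    shift
    status=0
    timeout "$secs" "$@" || status=$?
    [ "$status" -eq 124 ]
}

# Example (not executed here):
#   still_running_after 10 \
#     env NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
#     .venv/bin/nanobot gateway && echo "gateway stayed up"
```

A command that exits early, with or without an error, makes the helper fail, which is the case the manual Ctrl+C check is meant to catch.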

## Verifying Cache Behavior

To verify that prompt caching works correctly (important for performance):

```bash
# Enable logs to see cache metrics
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
.venv/bin/nanobot agent --logs -m "Turn 1: list files"

# Look for cache metrics in output:
# - cache_write: New cache entries created
# - cache_read: Tokens read from cache
```

**What to look for:**
- Turn 1: High `cache_write`, moderate `cache_read`
- Turn 2+: Low `cache_write`, high `cache_read` (reusing cache)
- `cache_read` should increase across turns as context grows

**Example healthy pattern:**
```
Turn 1: cache_write=354 cache_read=3563
Turn 2: cache_write=255 cache_read=3917  ← 3563 + 354 (Turn 1 read + write)
Turn 3: cache_write=113 cache_read=4172  ← 3917 + 255 (growing with context)
```
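
In the example above, each turn's `cache_read` equals the previous turn's `cache_read` plus its `cache_write`. A small check for that pattern might look like the sketch below. It assumes the metrics have already been reduced to lines in the `Turn N: cache_write=X cache_read=Y` form shown in the example; the real `--logs` output may be formatted differently, and `check_cache_growth` is a hypothetical helper.

```bash
# Hypothetical helper: reads "Turn N: cache_write=X cache_read=Y" lines on stdin.
# Prints "ok" if every cache_read equals the previous line's
# cache_read + cache_write; otherwise reports the first mismatch and fails.
check_cache_growth() {
    awk '
        {
            w = r = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^cache_write=/) { split($i, kv, "="); w = kv[2] + 0 }
                if ($i ~ /^cache_read=/)  { split($i, kv, "="); r = kv[2] + 0 }
            }
            if (w == "" || r == "") next    # skip lines without both metrics
            if (seen && r != expected) {
                printf "mismatch on line %d: cache_read=%d, expected %d\n", NR, r, expected
                bad = 1
                exit
            }
            expected = r + w
            seen = 1
        }
        END { if (bad) exit 1; print "ok" }
    '
}
```

Feeding it the three example lines passes the check, while a turn whose `cache_read` drops or jumps unexpectedly produces a mismatch message and a non-zero exit status.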

## Session Management

### Clear session for fresh test

```bash
rm -f /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl
```

### View session contents

```bash
jq . /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl
```

### Check for specific features (e.g., hidden signatures)

```bash
grep "_hidden_sig" /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl
```
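
To turn that check into a pass/fail test, counting the matching lines can help. This is a minimal sketch that assumes one JSON object per line, as the `.jsonl` extension suggests; `count_session_matches` is a hypothetical helper and `_hidden_sig` is just the example key from above.

```bash
# Hypothetical helper: prints how many lines of a session file contain the
# given fixed string. Prints 0 (and still succeeds) when the file is missing
# or nothing matches.
count_session_matches() {
    file="$1"
    needle="$2"
    if [ -f "$file" ]; then
        # grep -c exits non-zero on zero matches, so ignore its exit status.
        grep -cF -- "$needle" "$file" || true
    else
        echo 0
    fi
}

# Example:
#   count_session_matches \
#     /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl _hidden_sig
```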

## Common Testing Scenarios

### Test tool execution

```bash
./test-pr.sh 31 "List all Python files in the current directory"
```

### Test multi-turn conversation

```bash
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
.venv/bin/nanobot agent

# Then interact naturally:
> list files in current directory
> how many python files are there?
> what's the total size?
```

### Test error handling

```bash
./test-pr.sh 31 "Try to read a file that doesn't exist: /nonexistent.txt"
```

### Test with thinking mode

The staging config sets `thinking_budget: 10000` by default, so all tests use extended thinking.

## Troubleshooting

### "No API key configured" error

- **Cause:** `NANOBOT_CONFIG` env var is not set
- **Fix:** Prefix the command with `NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json`
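
A quick preflight check can catch this before a test run starts. The function below is a hypothetical sketch; it only verifies that the variable is set and points at a readable file, not that the file contains valid configuration.

```bash
# Hypothetical helper: fails with a message on stderr if NANOBOT_CONFIG is
# unset or does not point at a readable file.
check_staging_config() {
    if [ -z "${NANOBOT_CONFIG:-}" ]; then
        echo "NANOBOT_CONFIG is not set" >&2
        return 1
    fi
    if [ ! -r "$NANOBOT_CONFIG" ]; then
        echo "NANOBOT_CONFIG points to a missing or unreadable file: $NANOBOT_CONFIG" >&2
        return 1
    fi
}
```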

### "Module not found" after checkout

- **Cause:** The package was not reinstalled after switching branches
- **Fix:** Run `uv pip install -e .` after checkout

### Changes not applying

- **Cause:** Stale cached `.pyc` files are being used
- **Fix:** Clear pycache: `find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true`

### Session has stale data

- **Cause:** A previous test left session data behind
- **Fix:** `rm -f /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl`

## Best Practices

1. **Clear session between PR tests** to avoid cross-contamination
2. **Test with tool use** to trigger agentic behavior (not just simple Q&A)
3. **Check cache metrics** for performance-sensitive PRs
4. **Run with `--logs`** to see detailed behavior during development
5. **Return to main** after testing to avoid accidental commits on PR branches

## Integration with CI/CD

The staging environment is currently manual-only. Future enhancements:

- [ ] Automated PR testing via Gitea Actions
- [ ] Cache validation in CI pipeline
- [ ] Multi-PR parallel testing using git worktrees
- [ ] Regression test suite against production behavior

## File Locations Reference

| Path | Purpose |
|------|---------|
| `/config/workspace/nanobot-oauth-port/nanobot-fork/` | Local nanobot repository |
| `/config/workspace/.nanobot-staging/` | Staging environment root |
| `/config/workspace/.nanobot-staging/config.json` | Staging configuration |
| `/config/workspace/.nanobot-staging/workspace/` | Staging workspace |
| `/config/workspace/.nanobot-staging/workspace/sessions/` | Session storage |
| `/config/workspace/nanobot-oauth-port/nanobot-fork/test-pr.sh` | Helper script |

## Related Documentation

- [nanobot README](../README.md) - Main project documentation
- [CLAUDE.md](../CLAUDE.md) - Development guide for Claude Code
- [config/schema.py](../nanobot/config/schema.py) - Configuration schema