# PR Testing Workflow

Guide for testing Pull Requests using the local staging environment.

## Quick Start

```bash
./test-pr.sh <pr-number> "test message"
```

## Staging Environment

**Location:** `/config/workspace/.nanobot-staging/`

**Components:**
- `config.json` — Staging configuration (channels disabled, shared OAuth)
- `workspace/` — Isolated workspace for tool operations
- `workspace/sessions/` — Session storage (separate from production)

**Key differences from production:**
- No external channels (Telegram disabled)
- Selected by pointing the `NANOBOT_CONFIG` environment variable at the staging `config.json`
- Gateway runs on localhost:18791 (vs production's 18790)
- `restrictToWorkspace: true` for safety
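
Because every staging command needs the same `NANOBOT_CONFIG` value, a small shell wrapper can save retyping it. The sketch below is one possible way to do that; the function name `staging` and the variable `STAGING_CONFIG` are hypothetical helpers, not part of nanobot.

```bash
# Path taken from this guide; adjust it if your staging root differs.
STAGING_CONFIG="/config/workspace/.nanobot-staging/config.json"

# Run any command with NANOBOT_CONFIG pointing at the staging config.
# `staging` is a hypothetical helper name, not a nanobot command.
staging() {
  NANOBOT_CONFIG="$STAGING_CONFIG" "$@"
}
```

For example, `staging .venv/bin/nanobot agent -m "test message"` would run the agent against the staging config for that one command only.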

## Testing a PR

### Method 1: Helper Script (Recommended)

```bash
# Test PR with default message
./test-pr.sh 31

# Test with custom message
./test-pr.sh 31 "test the hidden message feature"
```

**What it does:**
1. Fetches the PR from the `wylab` remote (force-updates the local branch if it already exists)
2. Checks out the PR branch locally
3. Installs the package in editable mode with `uv pip install -e .`
4. Runs the test against the staging config via the `NANOBOT_CONFIG` env var
5. Leaves the PR branch checked out for further testing
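
The steps above can be sketched as a shell function. This is not the contents of `test-pr.sh`, which this guide does not show; it is a minimal approximation under stated assumptions. The `DRY_RUN` switch, the `run` helper, and the default message `hello` are all hypothetical.

```bash
STAGING_CONFIG="/config/workspace/.nanobot-staging/config.json"

# Execute a command, or only print it when DRY_RUN=1.
# DRY_RUN is an assumption of this sketch, not a feature of test-pr.sh.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

test_pr() {
  pr="$1"
  msg="${2:-hello}"   # placeholder default; the real default message is not documented here

  # The leading '+' lets git overwrite an existing local pr-<N> branch.
  run git fetch wylab "+pull/${pr}/head:pr-${pr}" || return 1
  run git checkout "pr-${pr}" || return 1
  run uv pip install -e . || return 1
  run env "NANOBOT_CONFIG=${STAGING_CONFIG}" .venv/bin/nanobot agent -m "$msg"
}
```

Running `DRY_RUN=1 test_pr 31 "hi"` prints the four commands without executing them. One caveat of this approach: git refuses to fetch into the branch that is currently checked out, so switch back to `main` before re-fetching the same PR.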

**After testing:**
```bash
git checkout main  # Return to main branch
```

### Method 2: Manual Testing

```bash
# 1. Fetch and checkout PR
cd /config/workspace/nanobot-oauth-port/nanobot-fork
git fetch wylab pull/<N>/head:pr-<N>
git checkout pr-<N>

# 2. Install in editable mode
uv pip install -e .

# 3. Test with staging config
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
  .venv/bin/nanobot agent -m "test message"

# 4. For multi-turn testing
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
  .venv/bin/nanobot agent  # Interactive mode

# 5. Return to main
git checkout main
```

### Method 3: Gateway Validation

Check that the gateway starts without errors:

```bash
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
  .venv/bin/nanobot gateway

# Kill with Ctrl+C when validated
```
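
Watching for a clean start by hand can be approximated with a small check that launches a command, waits a grace period, and reports whether the process is still alive. This is only a sketch: `survives_startup` is a hypothetical helper, and a process that is still running has not necessarily started correctly, so the gateway's own output remains the real evidence.

```bash
# Start a command in the background and report whether it is still
# running after a grace period (in seconds). Hypothetical helper.
survives_startup() {
  grace="$1"
  shift
  "$@" &
  pid=$!
  sleep "$grace"
  if kill -0 "$pid" 2>/dev/null; then
    # Still alive: stop it and collect its exit status.
    kill "$pid"
    wait "$pid" 2>/dev/null || true
    return 0
  fi
  # Already exited: treat this as a failed start.
  wait "$pid" 2>/dev/null || true
  return 1
}
```

With `NANOBOT_CONFIG` exported, `survives_startup 5 .venv/bin/nanobot gateway` would return success if the gateway is still up after five seconds and failure if it exited early.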

## Verifying Cache Behavior

To verify prompt caching works correctly (important for performance):

```bash
# Enable logs to see cache metrics
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
  .venv/bin/nanobot agent --logs -m "Turn 1: list files"

# Look for cache metrics in output:
# - cache_write: New cache entries created
# - cache_read: Tokens read from cache
```

**What to look for:**
- Turn 1: the largest `cache_write` of the session; `cache_read` may already be substantial
- Turn 2+: `cache_write` shrinks while `cache_read` stays high (the cache is being reused)
- `cache_read` should increase across turns as context grows

**Example healthy pattern:**
```
Turn 1: cache_write=354 cache_read=3563
Turn 2: cache_write=255 cache_read=3917  ← Turn 1 read + write (3563 + 354)
Turn 3: cache_write=113 cache_read=4172  ← Turn 2 read + write (3917 + 255)
```
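
In the example, each turn's `cache_read` equals the previous turn's `cache_read` plus `cache_write`. A check for that relationship could be sketched with `awk`, assuming the metrics have been collected one turn per line as `cache_write=<n> cache_read=<n>` tokens. That line format and the function name `check_cache_growth` are assumptions; the actual `--logs` output may look different, and exact equality may be too strict for real runs.

```bash
# Read per-turn metric lines on stdin and verify that each cache_read
# equals the previous turn's cache_read + cache_write.
# Prints "ok" on success; exits non-zero on a mismatch or empty input.
check_cache_growth() {
  awk '
    {
      w = ""; r = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^cache_write=[0-9]+$/) { split($i, kv, "="); w = kv[2] }
        if ($i ~ /^cache_read=[0-9]+$/)  { split($i, kv, "="); r = kv[2] }
      }
      if (w == "" || r == "") next      # skip lines without both metrics
      turn++
      if (turn > 1 && r + 0 != expected) {
        printf "turn %d: cache_read=%d, expected %d\n", turn, r, expected
        bad = 1
      }
      expected = r + w
    }
    END { if (turn == 0 || bad) exit 1; print "ok" }
  '
}
```

Piping the three example lines above into `check_cache_growth` prints `ok`.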

## Session Management

### Clear session for fresh test

```bash
rm -f /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl
```

### View session contents

```bash
jq . /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl
```

### Check for specific features (e.g., hidden signatures)

```bash
grep "_hidden_sig" /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl
```
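
For a quick pass/fail view rather than raw matches, the same check can be wrapped in a function that counts session entries and how many contain a given marker. This is a sketch only; `session_summary` is a hypothetical helper, and it treats each line of the `.jsonl` file as one entry.

```bash
# Print the number of lines in a session file and how many contain
# a literal marker string. Hypothetical helper.
session_summary() {
  file="$1"
  marker="$2"
  if [ ! -f "$file" ]; then
    echo "no session file: $file" >&2
    return 1
  fi
  # grep -c exits non-zero when the count is 0, so ignore its status.
  total="$(grep -c '' "$file" || true)"
  hits="$(grep -cF -- "$marker" "$file" || true)"
  echo "entries=$total matching=$hits"
}
```

Calling `session_summary /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl _hidden_sig` reports both counts on one line, which makes a missing feature (`matching=0`) easy to spot.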

## Common Testing Scenarios

### Test tool execution

```bash
./test-pr.sh 31 "List all Python files in the current directory"
```

### Test multi-turn conversation

```bash
NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json \
  .venv/bin/nanobot agent

# Then type prompts in the interactive session, for example:
#   list files in current directory
#   how many python files are there?
#   what's the total size?
```

### Test error handling

```bash
./test-pr.sh 31 "Try to read a file that doesn't exist: /nonexistent.txt"
```

### Test with thinking mode

The staging config sets `thinking_budget: 10000`, so all tests run with extended thinking.

## Troubleshooting

### "No API key configured" error

- **Cause:** `NANOBOT_CONFIG` env var not set
- **Fix:** Ensure you're using `NANOBOT_CONFIG=/config/workspace/.nanobot-staging/config.json`
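
A quick pre-flight check can rule this cause out before running a test. The sketch below only inspects the environment variable and the file it points to; `check_staging_env` is a hypothetical helper, and it does not validate the config contents.

```bash
# Verify that NANOBOT_CONFIG is set and points at a readable file.
# Hypothetical helper; it does not parse or validate the JSON.
check_staging_env() {
  if [ -z "${NANOBOT_CONFIG:-}" ]; then
    echo "NANOBOT_CONFIG is not set" >&2
    return 1
  fi
  if [ ! -r "$NANOBOT_CONFIG" ]; then
    echo "cannot read config: $NANOBOT_CONFIG" >&2
    return 1
  fi
  echo "using $NANOBOT_CONFIG"
}
```

Note that the inline form `NANOBOT_CONFIG=... command` used throughout this guide sets the variable for that single command only, so the check is most useful after an `export`.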

### "Module not found" after checkout

- **Cause:** Need to reinstall after switching branches
- **Fix:** Run `uv pip install -e .` after checkout

### Changes not applying

- **Cause:** Using cached `.pyc` files
- **Fix:** Clear pycache: `find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true`

### Session has stale data

- **Cause:** Previous test left session data
- **Fix:** `rm -f /config/workspace/.nanobot-staging/workspace/sessions/cli_direct.jsonl`

## Best Practices

1. **Clear session between PR tests** to avoid cross-contamination
2. **Test with tool use** to trigger agentic behavior (not just simple Q&A)
3. **Check cache metrics** for performance-sensitive PRs
4. **Run with `--logs`** to see detailed behavior during development
5. **Return to main** after testing to avoid accidental commits on PR branches

## Integration with CI/CD

The staging environment is currently manual-only. Future enhancements:

- [ ] Automated PR testing via Gitea Actions
- [ ] Cache validation in CI pipeline
- [ ] Multi-PR parallel testing using git worktrees
- [ ] Regression test suite against production behavior

## File Locations Reference

| Path | Purpose |
|------|---------|
| `/config/workspace/nanobot-oauth-port/nanobot-fork/` | Local nanobot repository |
| `/config/workspace/.nanobot-staging/` | Staging environment root |
| `/config/workspace/.nanobot-staging/config.json` | Staging configuration |
| `/config/workspace/.nanobot-staging/workspace/` | Staging workspace |
| `/config/workspace/.nanobot-staging/workspace/sessions/` | Session storage |
| `/config/workspace/nanobot-oauth-port/nanobot-fork/test-pr.sh` | Helper script |

## Related Documentation

- [nanobot README](../README.md) - Main project documentation
- [CLAUDE.md](../CLAUDE.md) - Development guide for Claude Code
- [config/schema.py](../nanobot/config/schema.py) - Configuration schema