- use YAML for test configurations - remove most uses of simple-responder, opting to use process.testHandler Fixes #655
7.0 KiB
Improve Testability (#655)
Current Pain Points
-
Tests bypass config loading - ~80% of tests build
config.Configstructs directly, skipping YAML parsing, env var substitution, macro expansion, and${PORT}assignment. Config bugs in those paths go untested. -
simple-responder is everywhere - Every proxy/routing test launches a real subprocess, waits for health checks (~healthCheckTimeout: 15), and manages process lifecycle just to test HTTP routing. Most of that overhead is wasted.
-
Port counter is fragile - A global
nextTestPortcounter starting at 12000 with a mutex. Parallel tests or leftover processes can collide.
Stages
Stage 1: YAML-based test config helper
Goal: Tests go through the real LoadConfigFromReader path instead of hand-building structs.
Effort: Low | Impact: Config bugs caught earlier | Risk: None
Create a test helper in proxy/helpers_test.go:
// testConfigFromYAML substitutes simple-responder paths and loads through
// the real config pipeline (env vars, macros, port assignment, etc.)
func testConfigFromYAML(t *testing.T, yamlTmpl string) config.Config {
t.Helper()
yamlStr := strings.ReplaceAll(yamlTmpl, "{{RESPONDER}}", filepath.ToSlash(simpleResponderPath))
cfg, err := config.LoadConfigFromReader(strings.NewReader(yamlStr))
require.NoError(t, err)
return cfg
}
Tests would then look like:
func TestProxyManager_SwapProcessCorrectly(t *testing.T) {
config := testConfigFromYAML(t, `
healthCheckTimeout: 15
logLevel: error
models:
model1:
cmd: {{RESPONDER}} --port ${PORT} -silent -respond model1
model2:
cmd: {{RESPONDER}} --port ${PORT} -silent -respond model2
`)
proxy := New(config)
// ... same assertions
}
Why this stage first: Zero production code changes. Pure test-side refactoring. Can be done incrementally - migrate tests one at a time. Each migrated test now validates the full config pipeline.
Scope: ~20-30 tests in proxymanager_test.go, processgroup_test.go, peerproxy_test.go.
Stage 2: Injected test handler (eliminate simple-responder for routing tests)
Goal: Replace simple-responder subprocess launches with an injected http.Handler for tests that don't specifically test process lifecycle.
Effort: Medium | Impact: 10-100x faster routing tests | Risk: Low (additive, no existing code broken)
Add a testHandler http.Handler field to Process. When set, ProxyRequest delegates directly to this handler instead of going through the reverse proxy. No subprocess, no health checks, no TCP roundtrip.
2a. Add testHandler to Process:
// In Process struct (process.go):
testHandler http.Handler // set only in tests; bypasses subprocess and reverse proxy
In Process.Start(), skip subprocess + health check when handler is set:
func (p *Process) start() error {
if p.testHandler != nil {
p.setState(StateReady)
return nil
}
// existing subprocess logic...
}
In Process.ProxyRequest(), delegate directly to the handler:
// Before the reverseProxy.ServeHTTP call:
if p.testHandler != nil {
p.testHandler.ServeHTTP(w, r)
return
}
2b. Test helper to create the handler:
// newTestHandler returns an http.Handler that mimics llama.cpp's API
// (same endpoints as simple-responder).
func newTestHandler(respond string) http.Handler {
mux := http.NewServeMux()
mux.HandleFunc("/v1/chat/completions", func(w http.ResponseWriter, r *http.Request) { ... })
mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) { ... })
// ... other endpoints
return mux
}
Tests for routing/auth/CORS/streaming then become:
func TestProxyManager_AuthRequired(t *testing.T) {
handler := newTestHandler("model1")
config := testConfigFromYAML(t, `
healthCheckTimeout: 15
logLevel: error
requiredAPIKeys: [test-key]
models:
model1:
cmd: {{RESPONDER}} --port ${PORT} -silent -respond model1
`)
pm := NewProxyManager(config)
// inject handler — skips subprocess, health check, port allocation
pm.processGroups["model1"].process.testHandler = handler
}
Why this matters: The handler is called directly in-process. No subprocess spawn, no health check timeout, no port allocation, no TCP roundtrip, no reverse proxy overhead. Routing tests go from ~100ms each (process startup + health check) to ~1ms. Unlike an httptest.Server approach, there are zero network hops.
Why not blank-cmd + proxy URL: A blank cmd with a proxy field pointing at httptest.Server still requires a real TCP roundtrip through the reverse proxy and introduces "external process" semantics to the config schema. Injecting the handler directly keeps it purely a test concern with no config changes.
Scope: Most tests in proxymanager_test.go (auth, CORS, model listing, streaming, peer proxy), peerproxy_test.go, metrics_monitor_test.go.
Stage 3: Migrate tests incrementally
Goal: Convert existing tests to use the Stage 1 + Stage 2 helpers.
Effort: Medium | Impact: Cleaner, more reliable tests | Risk: None
Priority order:
proxymanager_test.gorouting tests (highest count, most repetition)peerproxy_test.go(straightforward, all HTTP routing)metrics_monitor_test.go(capture logic doesn't need real processes)processgroup_test.goswap tests (keep simple-responder for actual swap lifecycle tests)
Tests that must keep simple-responder:
- Process lifecycle: start/stop, SIGKILL, SIGTERM, TTL expiry, health check failures, failed start counting
- ProcessGroup swap concurrency (the port-collision test in
TestProcessGroup_ProxyRequestSwapIsTrueParallel)
Scope: ~60-70% of tests can drop simple-responder.
Stage 4 (optional): Process interface for ProcessGroup
Goal: Enable pure unit tests of ProcessGroup's swap/exclusive/concurrency logic without any HTTP server at all.
Effort: High | Impact: Pure unit tests possible | Risk: Medium (refactor core code)
type ProcessController interface {
Start() error
Stop(StopStrategy)
ProxyRequest(http.ResponseWriter, *http.Request) error
CurrentState() ProcessState
ID() string
SetState(ProcessState) // for test setup
}
This requires:
- Extracting the interface
- A
MockProcessimplementation - Refactoring
ProcessGroupto use the interface instead of*Process
Recommendation: Only do this if ProcessGroup grows significantly more complex. Stages 1-3 give 80% of the benefit for 20% of the effort.
Effort/Impact Summary
| Stage | Effort | Impact | Risk |
|---|---|---|---|
| 1. YAML config helper | Low | Config bugs caught earlier | None |
| 2. Injected test handler | Medium | 10-100x faster routing tests | Low |
| 3. Migrate tests | Medium | Cleaner, more reliable tests | None |
| 4. Process interface | High | Pure unit tests possible | Medium |
Recommended approach: Do stages 1-3 in order. Each stage is independently valuable and can ship on its own. Stage 4 is deferred unless there's a specific need.