Pascal Wachowski 04b2771048 docs: add TurboQuant benchmark results and documentation
Performance (Qwen3-14B Q4_K_M, RX 9070 XT 16GB):
- turbo4: 1812 t/s pp512 / 49 t/s tg128 / PPL +0.010 vs f16 / 72% KV savings
- turbo3: 1836 t/s pp512 / 50 t/s tg128 / PPL +0.051 vs f16 / 78% KV savings
- All context lengths up to 40K work without OOM
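
The KV savings figures above can be turned into rough memory numbers. The sketch below is a back-of-the-envelope estimate, not the project's own accounting. It assumes the savings are measured against an f16 cache (16 bits per element), that Qwen3-14B uses 40 layers, 8 KV heads, and a head dimension of 128, and that "40K" means 40960 tokens. None of these are stated in the commit, and the helper names are hypothetical.

```python
F16_BITS = 16  # bits per KV element in the f16 baseline


def effective_bits(savings_pct: float) -> float:
    """Average bits per KV element implied by a savings percentage vs f16."""
    return F16_BITS * (100 - savings_pct) / 100


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bits_per_elem):
    """Size in bytes of the K and V caches for a single sequence."""
    n_elems = 2 * n_layers * n_kv_heads * head_dim * n_ctx  # 2 = K and V
    return n_elems * bits_per_elem / 8


# Assumed Qwen3-14B attention shape (not stated in the commit).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 40, 8, 128
N_CTX = 40 * 1024  # "40K" read as 40960 tokens, an assumption

GIB = 1024 ** 3
for name, pct in [("f16", 0), ("turbo4", 72), ("turbo3", 78)]:
    bits = effective_bits(pct)
    size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, N_CTX, bits)
    print(f"{name}: {bits:.2f} bits/elem, {size / GIB:.2f} GiB")
```

Under these assumptions the f16 cache at 40K context comes to about 6.25 GiB, while turbo4 and turbo3 need roughly 1.75 GiB and 1.4 GiB. That is consistent with long contexts fitting next to the model weights on a 16GB card.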

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 20:48:00 +02:00