From 03d58e53fa35ec26b03ce5117cf55bad6ead662c Mon Sep 17 00:00:00 2001
From: Benson Wong <83972+mostlygeek@users.noreply.github.com>
Date: Sat, 30 May 2026 17:04:30 -0700
Subject: [PATCH] Add load testing tool to the UI (#805)

Wouldn't it be nice to test the performance, swapping and concurrency
from the UI? Now we can!

This is a port of `cmd/test-concurrency` into the UI

Here's a demo of it working with a swap matrix:

https://github.com/user-attachments/assets/b6bb12ec-0381-46f1-a6b8-27d1c3c0ddb3
---
 .../playground/ConcurrencyInterface.svelte    | 632 ++++++++++++++++++
 ui-svelte/src/routes/Playground.svelte        |  18 +-
 2 files changed, 646 insertions(+), 4 deletions(-)
 create mode 100644 ui-svelte/src/components/playground/ConcurrencyInterface.svelte

diff --git a/ui-svelte/src/components/playground/ConcurrencyInterface.svelte b/ui-svelte/src/components/playground/ConcurrencyInterface.svelte
new file mode 100644
index 0000000..54eb76b
--- /dev/null
+++ b/ui-svelte/src/components/playground/ConcurrencyInterface.svelte
@@ -0,0 +1,632 @@
+
+
+
+ Fire several streaming chat completions at llama-swap at the same time to see how it handles parallel
+ loading and concurrent inference. Each request streams into its own panel with a live timer and status.
+max_tokens if you want.
+Tip: drag a result card's header to reorder, or hit × to drop it.
+