423 Commits

Author SHA1 Message Date
Benson Wong 6f8e7ccb57 .github/workflows: switch release.yml to build ui-svelte 2026-01-28 21:39:10 -08:00
Benson Wong 4384315b44 ui-svelte: add Svelte port of React UI (#487)
Trying out svelte for the UI. The port was done by Claude Code on the iOS app w/ Opus 4.5. 

---

* ui: add Svelte port of React UI

Port the React-based UI to Svelte 5 with the following changes:

- Create new ui-svelte directory with complete Svelte 5 implementation
- Use Svelte stores instead of React contexts for state management
- Implement custom ResizablePanels component to replace react-resizable-panels
- Port all pages: LogViewer, Models, Activity
- Port all components: Header, ConnectionStatus, LogPanel, ModelsPanel, etc.
- Use svelte-spa-router for client-side routing
- Same build output directory (proxy/ui_dist) and base path (/ui/)
- Tailwind CSS 4 with same theme configuration

https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP

* ui-svelte: simplify state management

- Remove redundant state syncing pattern in LogPanel and ModelsPanel
- Use store values directly with $ syntax instead of manual subscriptions
- Consolidate duplicate title sync logic in App.svelte
- Use existing syncTitleToDocument() from theme.ts

https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP

* ui-svelte: use idiomatic Svelte 5 patterns

- Use $effect for document side effects (theme, title) instead of
  store subscriptions
- Use class: directive for active nav links in Header
- Remove SSR guards (unnecessary for client-only SPA)
- Remove leaked subscription in syncThemeToDocument
- Simplify theme.ts by removing sync functions

https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP

* ui-svelte: fix build warnings and improve accessibility

Fix Svelte build warnings and add proper accessibility support
to interactive components.

- add aria-labels to buttons for screen readers
- implement keyboard navigation for resizable separator
- suppress intentional state initialization warnings
- update Makefile to use ui-svelte build directory
- add peer:true to package-lock.json dependencies

* ui-svelte: reorganize navigation and add log view toggle

Make Models the default landing page and add view mode toggle
to the Logs page with persistent state.

- set Models as default route at /
- move Logs to /logs route
- reorder navigation: Models, Activity, Logs
- add view toggle with three modes: Panels, Proxy only, Upstream only
- fix horizontal overflow with width constraints
2026-01-28 21:37:29 -08:00
Benson Wong 6439ab1515 ui: add peer:true in package-lock.json v186 2026-01-22 08:43:36 -08:00
dependabot[bot] f94226122c build(deps-dev): bump tar from 7.5.3 to 7.5.6 in /ui (#477)
Bumps [tar](https://github.com/isaacs/node-tar) from 7.5.3 to 7.5.6.
- [Release notes](https://github.com/isaacs/node-tar/releases)
- [Changelog](https://github.com/isaacs/node-tar/blob/main/CHANGELOG.md)
- [Commits](https://github.com/isaacs/node-tar/compare/v7.5.3...v7.5.6)

---
updated-dependencies:
- dependency-name: tar
  dependency-version: 7.5.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-21 22:55:02 -08:00
Ryan Voots 7493618fdc Add count_tokens api proxying (#476) 2026-01-20 09:34:42 -08:00
Benson Wong 205efd40a1 proxy: extend /running endpoint with additional process data (#474)
Extend the /running endpoint to return more details about running
processes beyond just model and state.

- add cmd field to show the command being executed
- add proxy field to show the proxy URL
- add ttl (UnloadAfter) for automatic unloading configuration
- add name and description for model metadata
- update tests to verify new fields are returned correctly

fixes #471
2026-01-19 17:37:00 -08:00
Benson Wong 14207f8492 ui: npm security update 2026-01-18 21:56:32 -08:00
Benson Wong 4e850c2834 config: refactor macro substitution in configuration (#470)
This commit simplifies substitution of environment variables into the configuration. There was a lot of repetitive code substituting ${env.VAR_NAME} into different fields after the configuration was parsed into a config.Config. This refactor uses a string substitution of env vars into the YAML config before it is fully parsed. This eliminates a lot of logic while maintaining backwards compatibility.
2026-01-18 21:52:34 -08:00
Benson Wong 75fced579e config: support macros in peer apiKey and filters (#469)
* config: support environment variable macros in peer apiKeys

Add ${env.VAR_NAME} substitution for peer apiKey fields, consistent
with existing env macro support for model fields and global apiKeys.

- Add env macro substitution for peers.{name}.apiKey in LoadConfigFromReader
- Add tests for peer apiKey env substitution
- Update config.example.yaml to show env macro usage

* config: support macros in peer apiKey and filters

Extend macro substitution to peer configuration fields:
- peers.{name}.apiKey supports both global macros and env macros
- peers.{name}.filters.stripParams supports both macro types
- peers.{name}.filters.setParams supports both macro types

Also renamed validateMetadataForUnknownMacros to validateNestedForUnknownMacros
for reuse across model metadata and peer filters validation.
v185
2026-01-16 23:10:50 -08:00
Benson Wong b73f367f22 config-schema.json,config.example.yaml: Update examples and schema v184 2026-01-16 22:43:25 -08:00
Benson Wong 8f2137c72b config: support environment variable macros in apiKeys (#467)
Add substituteEnvMacros support for apiKeys configuration field,
allowing API keys to be loaded from environment variables using
the ${env.VAR_NAME} syntax.

- Apply env macro substitution before validation
- Add tests for env macro substitution in apiKeys
2026-01-16 22:41:14 -08:00
Benson Wong 124007cc98 config: add environment variable macros (#466)
* config: add environment variable macros

Add support for ${env.VAR_NAME} syntax to pull values from system
environment variables during config loading.

- env macros processed before regular macros (allows macros to reference env vars)
- works in cmd, cmdStop, proxy, checkEndpoint, filters.stripParams, metadata
- returns error if env var is not set
- add comprehensive tests

fixes #462

* docs: add env macro example to config.example.yaml
2026-01-16 22:25:20 -08:00
Benson Wong eb5bfff0b0 proxy: unify filtering for local models and peers
This unifies the filtering capabilities for models and peers

- stripParams: removes params in the request
- setParams: sets params in the request

fixes #453
2026-01-15 18:59:43 -08:00
Benson Wong 3edb180c08 ci: free up disk space before ROCm container build (#460) 2026-01-14 22:03:42 -08:00
Benson Wong 66d555e625 Improve container build reliability (#457)
* docker: add .env usage in build-container.sh
* .github,docker: add rocm, improve logging
* .github,CLAUDE.md: fix workflow and update guidelines

Update containers workflow to only push images when triggered
manually or on schedule, not on workflow file changes.

- add push trigger for workflow file changes in containers.yml
- update push condition to skip on regular push events
- update CLAUDE.md commit message guidelines

* docker: remove comma in build-container.sh

* .github,docker: improve container build workflow

Add pagination support for fetching llama.cpp tags and improve debugging.

- add build-container.sh to workflow trigger paths
- implement fetch_llama_tag() with pagination support
- replace .env with local testing instructions
- add DEBUG_ABORT_BUILD flag for testing
2026-01-10 22:14:33 -08:00
Benson Wong 4f863fd9fc CLAUDE.md: tweak instructions 2026-01-09 21:42:06 -08:00
Benson Wong 267c030457 ui: update react-router-dom to 7.12.0 (#456)
Update react-router-dom from 7.6.2 to 7.12.0 to address security vulnerability.

- Updated dependency in package.json
- Regenerated package-lock.json
- Verified build passes successfully
- Confirmed 0 vulnerabilities with npm audit

Co-authored-by: Claude <noreply@anthropic.com>
v183
2026-01-08 16:13:09 -08:00
Benson Wong c19309fe7e CLAUDE.md: small instruction tweaks 2026-01-07 21:34:23 -08:00
Benson Wong 4413881b2d proxy: actually add /v1/responses endpoint (#449)
ref: #448
v182
2026-01-01 13:35:45 -08:00
Benson Wong 8df5e8563b proxy: add /v1/responses and /v1/audio/voices endpoints (#448)
Updates #433
Fixes #442 #226
v181
2026-01-01 12:52:12 -08:00
Benson Wong 7931212d3e proxy: add v1/images/edits API endpoint (#447)
Updates #433
2026-01-01 12:43:06 -08:00
Benson Wong 3dc36032fb proxy: skip very slow tests in -short test mode (#446)
* proxy: skip very slow tests in -short test mode
* CLAUDE.md: update testing instructions
2025-12-31 14:08:56 -08:00
Benson Wong addb98646f proxy: add support for basic authorization (#445)
Fixes #444 where the UI with api keys did not work. The choice to use
http basic authorization is for simple, automatic browser support. No
changes to the UI were necessary. Just use an API key as the password,
no user name is required.
2025-12-31 13:42:35 -08:00
Benson Wong 37d74efc2d proxy: add /v1/images/generations (#443)
Add support for the /v1/images/generations endpoint

Updates #433
Closes #191
v180
2025-12-30 21:04:58 -08:00
Benson Wong 22e098ac8b Add Peer Model Support (#438)
This PR allows a single llama-swap to be the central proxy for models served by other inference servers. The peer servers can be another llama-swap or any API that supports the /v1/* inference endpoint.

Updates: #433, #299
Closes: #296
v179
2025-12-27 20:18:06 -08:00
Benson Wong 9864f9f517 .coderabbit.yaml: disable annoying features 2025-12-23 23:53:06 -08:00
Benson Wong 53b32f3601 proxy: add API key support (#436)
Add configuration support for api keys that are enforced by llama-swap. Keys are stripped before sending them to upstream servers. 

Updates: #433, #50 and #251
2025-12-23 23:39:33 -08:00
Benson Wong 565c44766d config,proxy: add new configuration logToStdout (#432)
The new logToStdout option controls what is logged to stdout. The
default has been changed to just the proxy logs, which contain swap and
http request logs.

There are four supported settings: none, proxy, upstream, both. The
"both" setting is the legacy setting where everything was spewed to
stdout.
v178
2025-12-21 22:23:31 -08:00
Benson Wong e6a9e210ba proxy: fix path bug in /logs/stream/{model_id} (#431)
A {model_id} containing a forward slash trips up gin's path param
parsing. This updates /logs/stream to work like /upstream where the
model_id is built up in parts and searched for in the configuration.

Updates #421
2025-12-21 21:47:14 -08:00
Benson Wong d3f329f924 proxy: Improve logging performance and allow separate log streaming (#421)
Replace container/ring.Ring with a custom circularBuffer that uses a
single contiguous []byte slice. This fixes the original implementation
which created 10,240 ring elements instead of 10KB of storage.

GetHistory is now 139x faster (145μs → 1μs) and uses 117x less memory
(1.2MB → 10KB). Allocations reduced from 2 to 1 per write operation.

Create a LogMonitor per proxy.Process, replacing the usage
of a shared one. The buffer in LogMonitor is lazy allocated on the first
call to Write and freed when the Process is stopped. This reduces
unnecessary memory usage when a model is not active.

The /logs/stream/{model_id} endpoint was added to stream logs from a
specific process.
v177
2025-12-18 21:49:25 -08:00
Benson Wong 98879b38c1 docker: add /app to $PATH (#424)
Make it so llama-server can be called directly instead of with the full
path at /app/llama-server.

Fixes #423
Ref: #233
2025-12-06 22:58:29 -08:00
Benson Wong 7b3b0f5eae move header images around [skip ci] 2025-12-02 19:40:42 -08:00
Benson Wong 021ccceef1 README: update hero image 2025-12-02 19:37:03 -08:00
Benson Wong f03871c50a Update README.md
- add supported anthropic API 
- add example for docker hot reload support
2025-12-02 19:03:01 -08:00
Ryan Steed dc00d17abe docs: add documentation for non-root container images and security considerations (#416)
* docs: add documentation for non-root container images and security considerations
* docs: move container security section to dedicated file and update README links
2025-12-02 08:52:26 -08:00
Benson Wong dea98733c3 proxy: extract metrics for v1/messages (#419) v176 2025-11-29 23:51:20 -08:00
Benson Wong bccce5fa19 go.mod,ui/package-lock.json: dependency and security updates (#418) v175 2025-11-29 22:27:22 -08:00
Benson Wong c968da1b73 proxy: add support for anthropic v1/messages api (#417)
* proxy: add support for anthropic v1/messages api
* proxy: restrict loading message to /v1/chat/completions
2025-11-29 22:09:07 -08:00
Ryan Steed a883d68d4f feat: Add support for custom llama.cpp base image and forked llama-swap repositories (#396)
* feat: Add support for custom llama.cpp base image and forked llama-swap repositories

- Introduce BASE_LLAMACPP_IMAGE env var to customize llama.cpp base image
- Introduce LS_REPO env var to customize llama-swap source
- Use GITHUB_REPOSITORY env var to automatically detect forked repos
- Update container tagging to use dynamic repo paths
- Pass build args for BASE_IMAGE and LS_REPO to Containerfile
- Enable flexible release downloads from forked repositories

* chore: quote entire curl options, appease coderabbitai
2025-11-29 20:59:15 -08:00
Ryan Steed b1dec8b735 docker: build both root and non-root container images (#412)
Change the user back to root for containers. Additionally, built a "non-root" labeled container for users who wish to have the additional security of running llama-swap as a lower privileged user.
v174
2025-11-25 10:44:13 -08:00
Nikesh Parajuli 06523d8c1e feat: add platform-specific process attributes support (#411)
Fixes issues on Windows showing new windows for every process llama-swap spawns.
2025-11-24 21:39:56 -08:00
Ryan Steed 86e9b93c37 proxy,ui: add version endpoint and display version info in UI (#395)
- Add /api/version endpoint to ProxyManager that returns build date, commit hash, and version
- Implement SetVersion method to configure version info in ProxyManager
- Add version info fetching to APIProvider and display in ConnectionStatus component
- Include version info in UI context and update dependencies
- Add tests for version endpoint functionality
v173
2025-11-17 10:43:47 -08:00
Ryan Steed 3acace810f proxy: add configurable logging timestamp format (#401)
introduces a new configuration option logTimeFormat that allows customizing the timestamp in log messages using golang's built in time format constants. The default remains no timestamp.
2025-11-16 10:21:59 -08:00
Ryan Steed 554d29e87d feat: enhance model listing to include aliases (#400)
introduce includeAliasesInList as a new configuration setting (default false) that includes aliases in v1/models

Fixes #399
2025-11-15 14:35:26 -08:00
Benson Wong 3567b7df08 Update image in README.md for web UI section 2025-11-08 15:29:37 -08:00
Benson Wong 38738525c9 config.example.yaml: add modeline for schema validation 2025-11-08 15:08:55 -08:00
Benson Wong c0fc858193 Add configuration file JSON schema (#393)
* add json schema for configuration
* add GH action to validate schema
2025-11-08 15:04:14 -08:00
Benson Wong b429349e8a add /ui/ to wol-proxy polling (#388) 2025-11-08 14:16:12 -08:00
Ryan Steed eab2efd7b5 feat: improve llama.cpp base image tag for cpu (#391)
Refactor the container build script to resolve llama.cpp base image for CPU, also tag these builds accordingly.

- For CPU containers, now fetch the latest 'server' tagged llama.cpp image instead of using a generic 'server' tag
- Cleans up the docker build command to use dynamic BASE_TAG variable
- Maintains existing push functionality for built images
2025-11-08 09:56:49 -08:00
Benson Wong 6aedbe121a cmd/wol-proxy: show a loading page for / (#381)
When requesting / wol-proxy will show a loading page that polls /status
every second. When the upstream server is ready the loading page will
refresh causing the actual root page to be displayed
2025-11-03 19:37:06 -08:00