llama-swap-rocm

Author	SHA1	Message	Date
ClaudeBot	980a4e6baa	ci: adapt build for Gitea registry, drop ghcr.io dependency - Login to git.wylab.me instead of ghcr.io - Use Gitea-hosted llama.cpp-rocm base image instead of ghcr.io - Rewrite fetch_llama_tag to use anonymous OCI registry API - Add LS_UPSTREAM for release binary fetches on forks - Add REGISTRY and BASE_TAG overrides for self-hosted builds - Only build rocm platform Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 19:39:57 +02:00
Benson Wong	7b2b82777f	docker/unified: derive rootless image from root container (#644) Build the root image once, then derive the rootless variant from it using a small inline Dockerfile that adds the non-root user and chowns the writable directories. This halves the number of CI jobs (4 → 2) and eliminates the redundant full CUDA compilation for the rootless variant. - remove RUN_UID build arg from build-image.sh - derive rootless image inline after root build completes - collapse variant matrix out of unified-docker.yml - push both root and rootless tags in a single CI job Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 22:59:54 -07:00
Benson Wong	d87f0ce2c5	docker/unified: publish rootless image variant (#630)	2026-04-07 03:05:53 -07:00
Benson Wong	981910d734	ci: validate config.example.yaml against config-schema.json (#627) Extend the existing config-schema workflow to also validate config.example.yaml against config-schema.json using check-jsonschema. - add config.example.yaml to PR and push path triggers - install check-jsonschema via pip - run validation of config.example.yaml against schema https://claude.ai/code/session_01Y1oqwE6mwNs9UTJgZRgXtG --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-05 15:17:57 +08:00
Benson Wong	1dd1aadf93	docker/unified: add ik_llama.cpp to CUDA container (#620)	2026-04-03 15:16:30 +08:00
Benson Wong	1e440770ea	ci: fix matrix exclude for scheduled docker workflow (#610)	2026-03-29 20:04:28 +09:00
Benson Wong	8fabc75634	docker/unified: vulkan build fixes (#600) multiple fixes to vulkan build: - use ubuntu 26.04 to be compatible with AMD 395+ (Strix halo) hardware - add home directory in container - fix stable-diffusion install to actually enable vulkan --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-25 23:26:13 +09:00
Benson Wong	e5e7391b6d	.github,docker/unified: include vulkan build (#599) Update docker/unified scripts to support building both cuda and vulkan unified images.	2026-03-25 06:58:28 +09:00
Benson Wong	2c282dccad	.github,docker/unified: improve caching and fix bugs (#598) - set up a GHA scheduled job to build the container nightly - enabling pushing a llama-swap:unified and a llama-swap:unified-Y-M-D image to ghcr.io - tidy up Dockerfile to use a non-root user and llama-swap as an entry point	2026-03-23 22:24:40 +09:00
Benson Wong	916d13f5bd	.github/workflows,docker/unified: add cuda based unified container (#597) Add Docker build scripts for a unified cuda docker container with llama-server, stable-diffusion.cpp, whisper.cpp.	2026-03-22 21:11:54 +09:00
Benson Wong	15bd55d3a9	proxy, ui-svelte: add /sdapi/v1 endpoint support (#587) Add proxy routes for stable-diffusion.cpp's /sdapi/v1/txt2img, /sdapi/v1/img2img, and /sdapi/v1/loras endpoints. POST endpoints use proxyInferenceHandler (model in JSON body), GET /loras uses proxyGETModelHandler (model in query param). Update the image playground with a dual-mode UI supporting both OpenAI and SDAPI backends. In SDAPI mode, loras are fetched first to prime the server-side cache, and all txt2img parameters are exposed (negative prompt, steps, cfg_scale, seed, batch_size, clip_skip, sampler, scheduler, lora selection with multipliers). - Add 3 sdapi route registrations in proxymanager.go - Add sdApi.ts client with generateSdImage and fetchSdLoras - Add SDAPI types (SdApiTxt2ImgRequest, SdApiResponse, etc.) - Add /sdapi to vite dev proxy config - Add backend tests for sdapi routing - Support batch image display in gallery grid https://claude.ai/code/session_0186MGX6NXdHVBTv2KH45fqn --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-19 22:08:31 +09:00
pdscomp	181f71ca11	.github,docker: add cuda13 architecture support (#551) Add `cuda13` as a supported build architecture, targeting the `ghcr.io/ggml-org/llama.cpp:server-cuda13` upstream base image. The `server-cuda13` image ships with CUDA 13 libraries, providing improved performance on recent NVIDIA hardware compared to the existing `server-cuda` (CUDA 12) image. Users with newer GPUs (e.g., RTX 50-series) benefit from reduced model load latency and higher token throughput. - Add `cuda13` to the allowed architectures list in `docker/build-container.sh` - Add `cuda13` to the CI matrix in `.github/workflows/containers.yml` so the container is built and pushed automatically	2026-03-01 09:37:08 -08:00
Benson Wong	17e5263a76	.github/workflows: fix expired token in publishing images (#522) Fixes: #517	2026-02-14 10:06:05 -08:00
Benson Wong	bc01e6f539	build: add stable-diffusion server to musa and vulkan container images (#504) Add sd-server from stable-diffusion.cpp docker image for vulkan and musa containers. closes #450	2026-02-01 16:17:26 -08:00
Benson Wong	7b20fc011b	Add path filters to CI workflows and create UI test workflow (#501) * .github/workflows: add UI tests and path-filter Go CI Add ui-tests.yml workflow to run svelte type checking and vitest on push/PR to main when ui-svelte/ files change. - Add path filters to go-ci.yml and go-ci-windows.yml to skip Go tests when only non-backend files change - Filter on **/*.go, go.mod, go.sum, and Makefile https://claude.ai/code/session_01E6acq54D8JjuE7pczxPGT7 * ui-svelte: remove unused declarations in SpeechInterface Remove unused `generatedText` state and `clearAudio` function that caused svelte-check errors. https://claude.ai/code/session_01E6acq54D8JjuE7pczxPGT7 * .github/workflows: update Node.js to v24 Node 23 is end-of-life; bump to 24 in ui-tests.yml and release.yml. https://claude.ai/code/session_01E6acq54D8JjuE7pczxPGT7 --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-01 15:11:49 -08:00
Benson Wong	20738f3623	proxy,ui-svelte: replace old UI with svelte+playground Replace the legacy React UI with the new Svelte-based one. Introduce a Playground in the UI to quickly test out text, image, text to speech and speech to text models behind llama-swap. Key Changes New Svelte UI (ui-svelte/) - Multi-tab Playground with Chat, Image Generation, Audio Transcription, and Speech interfaces - Chat: message editing/regeneration, markdown rendering with LaTeX math support, image attachments, code syntax highlighting - Image: size selector, download/fullscreen viewing - Audio: transcription with peer support - Speech: voice caching with manual refresh, download button - Responsive mobile layout with collapsible navigation - XSS fixes and accessibility improvements Proxy Improvements - Add gzip/brotli compression for UI static assets (proxy/ui_compress.go) - Add GET /v1/audio/voices?model={model} endpoint for voice listing - Add peer support for /v1/audio/transcriptions	2026-01-31 22:49:13 -08:00
Benson Wong	6f8e7ccb57	.github/workflows: switch release.yml to build ui-svelte	2026-01-28 21:39:10 -08:00
Benson Wong	3edb180c08	ci: free up disk space before ROCm container build (#460)	2026-01-14 22:03:42 -08:00
Benson Wong	66d555e625	Improve container build reliability (#457) * docker: add .env usage in build-container.sh * .github,docker: add rocm, improve logging * .github,CLAUDE.md: fix workflow and update guidelines Update containers workflow to only push images when triggered manually or on schedule, not on workflow file changes. - add push trigger for workflow file changes in containers.yml - update push condition to skip on regular push events - update CLAUDE.md commit message guidelines * docker: remove comma in build-container.sh * .github,docker: improve container build workflow Add pagination support for fetching llama.cpp tags and improve debugging. - add build-container.sh to workflow trigger paths - implement fetch_llama_tag() with pagination support - replace .env with local testing instructions - add DEBUG_ABORT_BUILD flag for testing	2026-01-10 22:14:33 -08:00
Benson Wong	c0fc858193	Add configuration file JSON schema (#393) * add json schema for configuration * add GH action to validate schema	2025-11-08 15:04:14 -08:00
Benson Wong	4662cf7699	add 'unconfirmed bug' as default label in bug-report.md	2025-08-15 15:38:12 -07:00
Benson Wong	74556c3a36	Update bug-report.md [skip ci]	2025-08-08 09:52:05 -07:00
Benson Wong	5c381e4b30	Add gofmt linting to ci	2025-08-07 20:29:18 -07:00
Benson Wong	5672cb03fd	Update github actions for notifying homebrew build (#212) Combine homebrew-llama-swap event with the release action	2025-07-30 11:29:03 -07:00
Benson Wong	7905fa9ea3	Update trigger-homebrew-update.yml [skip ci]	2025-07-30 10:13:49 -07:00
Ian Sebastian Mathew	bbaf172956	add trigger to rebuild homebrew formula (#210)	2025-07-30 10:12:21 -07:00
Benson Wong	591a9cdf4d	update release.yml	2025-06-16 16:50:25 -07:00
Thammachart Chinvarapon	cc33b6c270	restore intel docker builds (#163)	2025-06-16 11:13:49 -07:00
Benson Wong	b2a891f8f4	Disable building of intel container until it's fixed upstream	2025-05-23 22:54:43 -07:00
Benson Wong	d7b390df74	Add GH Action for Testing on Windows (#132) * Add windows specific test changes * Change the command line parsing library - Possible breaking changes for windows users!	2025-05-14 21:51:53 -07:00
Benson Wong	5025c2f1f3	Add GH windows tests (not working yet)	2025-05-14 19:58:22 -07:00
Benson Wong	8ada72eb57	Update issue templates	2025-05-14 16:36:32 -07:00
Thammachart Chinvarapon	9548931258	ci: re-enabled intel build pipeline (#121)	2025-05-11 00:19:57 -07:00
Benson Wong	9667989727	Disabling intel container build since it's been broken for weeks.	2025-05-04 21:39:42 -07:00
Benson Wong	ec0348e431	Reduce stale time for issues	2025-04-29 21:16:34 -07:00
Benson Wong	9b2ed244e2	Improve Continuous integration and fix concurrency bugs (#66) - improvements to the continuous GH actions - fix edge case concurrency bugs with Process.start() and state transitions discovered setting up CI.	2025-03-11 10:39:14 -07:00
Benson Wong	eeb72297f7	add first version of CI for go	2025-03-11 08:45:56 -07:00
Benson Wong	eabfe70cc6	add GH action to close inactive issues	2025-03-09 19:51:48 -07:00
Benson Wong	29cd98878d	better container build logic when upstream containers do not exist	2025-03-09 13:02:06 -07:00
Benson Wong	1e25b44a06	add workflow_dispatch to release action	2025-02-18 17:27:43 -08:00
Benson Wong	ebabe55ff3	Delete untagged packages after build and push (#55)	2025-02-18 10:32:32 -08:00
Benson Wong	41a338297c	deletion of untagged containers happen after build-and-push	2025-02-18 10:11:59 -08:00
Benson Wong	7e3353efeb	add action step to remove untagged containers	2025-02-18 10:08:41 -08:00
Benson Wong	4ed58fb173	update container build action	2025-02-18 09:59:06 -08:00
Benson Wong	0acfdb9f78	update workflow to build `cpu` and disable `musa`	2025-02-14 15:26:59 -08:00
Benson Wong	f20f2c9b7a	add docs and container build improvements #43	2025-02-14 12:20:07 -08:00
Benson Wong	7a97c38828	enable parallel container built #46	2025-02-14 11:04:33 -08:00
Benson Wong	4885132565	more permissions futzing	2025-02-14 11:02:15 -08:00
Benson Wong	8b46a0b7f1	grant package:write to container workflow #46	2025-02-14 10:55:30 -08:00
Benson Wong	1b6736ec6f	rename workflow for containers	2025-02-14 10:50:15 -08:00

59 Commits