forked from wylab/llama-swap
.github,docker: add cuda13 architecture support (#551)
Add `cuda13` as a supported build architecture, targeting the `ghcr.io/ggml-org/llama.cpp:server-cuda13` upstream base image.

The `server-cuda13` image ships with CUDA 13 libraries, providing improved performance on recent NVIDIA hardware compared to the existing `server-cuda` (CUDA 12) image. Users with newer GPUs (e.g., RTX 50-series) benefit from reduced model load latency and higher token throughput.

- Add `cuda13` to the allowed architectures list in `docker/build-container.sh`
- Add `cuda13` to the CI matrix in `.github/workflows/containers.yml` so the container is built and pushed automatically
--- a/.github/workflows/containers.yml
+++ b/.github/workflows/containers.yml
@@ -29,7 +29,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        platform: [intel, cuda, vulkan, cpu, musa, rocm]
+        platform: [intel, cuda, cuda13, vulkan, cpu, musa, rocm]
       fail-fast: false
     steps:
       - name: Checkout code

--- a/docker/build-container.sh
+++ b/docker/build-container.sh
@@ -27,7 +27,7 @@ ARCH=$1
 PUSH_IMAGES=${2:-false}
 
 # List of allowed architectures
-ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cpu" "rocm")
+ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cuda13" "cpu" "rocm")
 
 # Check if ARCH is in the allowed list
 if [[ ! " ${ALLOWED_ARCHS[@]} " =~ " ${ARCH} " ]]; then
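
The allow-list check in `docker/build-container.sh` can be tried in isolation. The following is a minimal sketch, not code from this commit: the `is_allowed_arch` helper and the `cuda1` test value are hypothetical and exist only for illustration. It suggests why adding `cuda13` next to `cuda` is safe: both sides of the match are padded with spaces, so only whole entries match.

```shell
#!/usr/bin/env bash
# Sketch only: is_allowed_arch is a hypothetical helper, not part of
# docker/build-container.sh. The array mirrors the updated list in the diff.
ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cuda13" "cpu" "rocm")

is_allowed_arch() {
    local arch=$1
    # The array is joined with spaces and padded on both sides, and the
    # quoted right-hand side of =~ is matched literally. " cuda " therefore
    # matches the "cuda" entry but not a substring of "cuda13".
    [[ " ${ALLOWED_ARCHS[*]} " =~ " ${arch} " ]]
}

for arch in cuda cuda13 cuda1; do
    if is_allowed_arch "$arch"; then
        echo "$arch: allowed"
    else
        echo "$arch: rejected"
    fi
done
```

With this check in place, the script's first positional argument selects the architecture, so a local build would presumably be started as `docker/build-container.sh cuda13`, leaving the second argument (`PUSH_IMAGES`) at its default of `false`.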