fix: match Ollama's proven gfx1103 approach — gfx1102 target + rocBLAS

Remove GGML_CUDA_FORCE_MMQ so that rocBLAS handles large-batch GEMMs
using the gfx1102 TensileLibrary (available in ROCm 7.2). The gfx1103
GPU is presented to the runtime as gfx1102 by setting
HSA_OVERRIDE_GFX_VERSION=11.0.2, matching Ollama's working configuration.
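
As a rough illustration of the runtime override described above, the sketch below exports the variable and derives the gfx target name it corresponds to. The helper name gfx_target_from_override is hypothetical and not part of this repository or of ROCm, and the binary path in the final comment is an assumption based on the build directory used in the Dockerfile.

```shell
#!/usr/bin/env bash
# Assumption: the override must be set in the environment before the HIP
# runtime is loaded, e.g. exported in the shell or passed to the container.
export HSA_OVERRIDE_GFX_VERSION=11.0.2

# Hypothetical helper: map "major.minor.stepping" to a gfx target name.
# Major is printed in decimal; minor and stepping are printed as hex digits.
gfx_target_from_override() {
    local IFS=.
    # Re-split the first argument on dots into $1, $2, $3.
    set -- $1
    printf 'gfx%d%x%x\n' "$1" "$2" "$3"
}

gfx_target_from_override "$HSA_OVERRIDE_GFX_VERSION"

# Assumed usage (path and model are placeholders):
#   HSA_OVERRIDE_GFX_VERSION=11.0.2 ./build/bin/llama-cli -m model.gguf
```

The helper only makes the correspondence between the override value and the build target explicit; the value passed to -DAMDGPU_TARGETS at build time and the override used at runtime need to name the same architecture for the rocBLAS kernels to be found.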

FORCE_MMQ caused crashes because MMQ kernel launch_bounds are tuned
for GPUs with many CUs and cannot fit on the 6-CU iGPU for large
matrix dimensions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
claudeopus46
2026-04-03 03:05:06 +02:00
parent 94127d7b33
commit 457e76fc0e
@@ -38,7 +38,6 @@ RUN HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
 cmake -S . -B build \
 -DGGML_HIP=ON \
 -DGGML_HIP_ROCWMMA_FATTN=OFF \
--DGGML_CUDA_FORCE_MMQ=ON \
 -DAMDGPU_TARGETS="$ROCM_DOCKER_ARCH" \
 -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON \
 -DCMAKE_BUILD_TYPE=Release -DLLAMA_BUILD_TESTS=OFF \