fix: force MMQ kernels to bypass rocBLAS TensileLibrary on gfx1103

ROCm 7.2 rocBLAS has no TensileLibrary for gfx1103 (RDNA3 iGPU), and the gfx1102 library kernels crash due to register file differences.

Force the MMQ (matrix multiply quantized) kernels, which hipcc compiles for the actual target arch, bypassing rocBLAS entirely. This matches how Ollama successfully runs on AMD 780M / gfx1103.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -38,6 +38,7 @@ RUN HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
     cmake -S . -B build \
         -DGGML_HIP=ON \
         -DGGML_HIP_ROCWMMA_FATTN=OFF \
+        -DGGML_CUDA_FORCE_MMQ=ON \
         -DAMDGPU_TARGETS="$ROCM_DOCKER_ARCH" \
         -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON \
         -DCMAKE_BUILD_TYPE=Release -DLLAMA_BUILD_TESTS=OFF \
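
For trying the same configuration outside the Docker image, the flag set from this change could be assembled roughly as follows. This is only a sketch: `hip_cmake_flags` is a hypothetical helper that is not part of the repository, the default target `gfx1103` is assumed from the commit message, and the CPU/backend flags from the Dockerfile are left out for brevity.

```shell
#!/bin/sh
# Sketch: print the HIP-related CMake flags used by this change.
# hip_cmake_flags is a hypothetical helper, not part of the repository.
hip_cmake_flags() {
    # Assumption: default to the GPU target named in the commit message.
    arch="${1:-gfx1103}"
    printf '%s\n' \
        "-DGGML_HIP=ON" \
        "-DGGML_HIP_ROCWMMA_FATTN=OFF" \
        "-DGGML_CUDA_FORCE_MMQ=ON" \
        "-DAMDGPU_TARGETS=${arch}" \
        "-DCMAKE_BUILD_TYPE=Release"
}

# Only print the flags; nothing is configured or built here.
hip_cmake_flags gfx1103
```

On a machine with ROCm installed, the output could then be passed to the configure step in the same way the Dockerfile does, for example `HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" cmake -S . -B build $(hip_cmake_flags gfx1103)`.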