llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-29 17:17:40 +02:00

YiChen Lv f7c1df6502 metal : per-op source split + parallel compile (#24021)

* preliminary extract common header

* op source split

* split metallib into 8 libs && load in parallel

* derive kernel->library routing from functionNames

* x-macro lib list + underscore filenames, dedup QK_NL, MRC fixes

* op source split 8 to 20

* improve robustness of source fallback

* clean up

* change bool -> atomic_bool

* only prepend headers that source actually includes

* no semaphore, use GCD global queue

* dedup library compile path, fix NSError lifetime, rename gla

* relocate upstream concat/rope_back/repeat kernel changes into split files

* move ggml-common.h from common.h into dequantize.h to shrink binary size

---------

Co-authored-by: lvyichen <lvyichen@stepfun.com>

2026-06-27 12:15:51 +03:00

Name            Last commit                                              Date
cmake           ggml : Parallelize quant LUT init (#23595)               2026-05-25 10:15:46 +03:00
include         sycl : support --split-mode tensor (#24152)              2026-06-25 08:35:21 +03:00
src             metal : per-op source split + parallel compile (#24021)  2026-06-27 12:15:51 +03:00
.gitignore      vulkan : cmake integration (#8119)                       2024-07-13 18:12:39 +02:00
CMakeLists.txt  metal : per-op source split + parallel compile (#24021)  2026-06-27 12:15:51 +03:00