Default Branch

3ac3c20c96 · ggml-webgpu: Add clang-format job (#24308) · Updated 2026-06-09 05:54:24 +02:00

Branches

6eb6d84e46 · metal: add GDN partial rollback · Updated 2026-05-14 09:24:09 +02:00

439
12

8f71759570 · download: do not exit() on error · Updated 2026-05-13 14:02:26 +02:00

437
1

c8f8e2364c · cont : simplify · Updated 2026-05-11 09:54:07 +02:00

506
38

efa2f8e5a7 · naming : improve consistency · Updated 2026-05-08 11:24:57 +02:00

506
24

ba72d4d287 · ggml: update SCHED_DEBUG output to use ggml_op_desc() · Updated 2026-05-08 01:52:20 +02:00

503
1

0445829c1d · llama : enable layer input extraction · Updated 2026-05-05 19:50:20 +02:00

533
1

f84632951a · wip · Updated 2026-05-05 08:36:07 +02:00

541
23

82af405161 · arg : silence warnings about removed params · Updated 2026-05-04 09:07:57 +02:00

553
1

81eabb4781 · sync : ggml · Updated 2026-05-02 07:53:10 +02:00

568
2

1b2bd8699c · fix windows build · Updated 2026-04-30 21:52:31 +02:00

585
8

9d5887035f · testing · Updated 2026-04-30 18:18:57 +02:00

582
2

6eddb1c6e3 · pi : add rule to use gh CLI for GitHub resources · Updated 2026-04-30 08:49:54 +02:00

585
2

c6a04cb5c3 · ggml-metal: fix 2D async copy to use row-by-row transfers · Updated 2026-04-29 13:57:48 +02:00

595
3

fd6f79c7a4 · download : prefer q8_0 when q4_k not available · Updated 2026-04-27 11:08:25 +02:00

624
1

cb9fc575e4 · common : use pimpl in debug.h to reduce header dependencies · Updated 2026-04-26 08:49:28 +02:00

645
3

b9421898b6 · add for Q4_0 · Updated 2026-04-23 09:33:19 +02:00

787
2

a5355a0226 · server: keep router model refcount to avoid unloading models that have running requests · Updated 2026-04-22 10:07:13 +02:00

700
15

35df147d80 · cont : remove /api/tags · Updated 2026-04-20 14:45:42 +02:00

715
2

4943e3a396 · gen-libllama-abi: compile sort-key regex once outside the lambda · Updated 2026-04-15 14:04:44 +02:00

770
4

4cabbe36e0 · state · Updated 2026-04-09 13:00:31 +02:00

867
16