llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-09 07:16:44 +02:00

Files

Max Krasnyansky 7d2b45b4f7 mtp: support for gemma-4 E2B and E4B assistants (#24282)

* models: update converter to support smaller assistants

* models: add masked_embd tensors to gemma4-assist arch

* gemma-4: remove temp debug for conversion

* gemma-4-mtp: filter out masked_embedding tensors during conversion

2026-06-08 13:48:52 -07:00

scripts

ggml : add NVFP4 quantization type support (#19769)

2026-03-11 21:02:54 +01:00

__init__.py

convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (#7499)

2024-07-18 20:40:15 +10:00

constants.py

mtp: support for gemma-4 E2B and E4B assistants (#24282)

2026-06-08 13:48:52 -07:00

gguf_reader.py

ggml/gguf : prevent integer overflows (#19856)

2026-02-24 20:17:11 +02:00

gguf_writer.py

model, mtmd: Granite4 Vision (#23545)

2026-06-05 17:44:59 +02:00

gguf.py

gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)

2023-11-11 08:04:50 +03:00

lazy.py

ci : switch from pyright to ty (#20826)

2026-03-21 08:54:34 +01:00

metadata.py

chore : correct typos [no ci] (#20041)

2026-03-05 08:50:21 +01:00

py.typed

convert : various script cleanups/fixes + merges and special token handling (#2842)

2023-08-30 11:25:50 +03:00

quants.py

convert : minor fixes for numpy 2.x (#23571)

2026-05-24 09:51:31 +02:00

tensor_mapping.py

mtp: support for gemma-4 E2B and E4B assistants (#24282)

2026-06-08 13:48:52 -07:00

utility.py

gguf-py : do not align the data start offset (#18291)

2025-12-22 20:25:16 +01:00

vocab.py

vocab : add tokenizer support for jina-embeddings-v2-base-zh (#18756)

2026-05-31 12:37:35 +02:00