llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-30 17:47:40 +02:00

Author	SHA1	Message	Date
Xuan-Son Nguyen	060ce1bf72	mtmd: refactor llava-uhd overview image handling (always use ov_img_first) (#24769 ) * add dedicated "overview" for mtmd_image_preproc_out * corrections * correct (again) * nits * nits (2)	2026-06-18 18:53:49 +02:00
Xuan-Son Nguyen	24bba7b98e	mtmd: refactor preprocessor, add mtmd_image_preproc_out (#24736 ) * add mtmd_image_preproc_out * add dev docs * remove unused clip API * rm unused clip_image_f32_batch::grid * change preprocess() call signature	2026-06-18 12:04:39 +02:00
Xuan-Son Nguyen	f3e1828164	mtmd: llava_uhd should no longer use batch dim (#24732 )	2026-06-17 22:40:50 +02:00
Xuan-Son Nguyen	31e82494c0	mtmd: support "frame merge" for qwen-vl-based models (#21858 ) * feat: add video support for Qwen3.5 * various clean up * revise the design * fix llava-uhd case * nits * nits 2 --------- Co-authored-by: andrewmd5 <1297077+andrewmd5@users.noreply.github.com>	2026-06-06 21:17:25 +02:00
Xuan-Son Nguyen	f5c6ae1827	mtmd, server: add "placeholder bitmap" for counting tokens , add /input_tokens API (#23913 ) mtmd: add "placeholder bitmap" for counting tokens w/o preprocessing * fast path skip preproc for placeholder * fix build * correct the api * add server endpoint + tests * add object name * update docs * add proxy handling * fix build * fix audio input path * use is_placeholder in process_mtmd_prompt() * nits * nits (2) * docs: clarify chat/completions/input_tokens is not official * fix merge problem	2026-06-06 11:06:51 +02:00
Saba Fallah	da3f990a47	mtmd: Add DeepSeekOCR 2 Support (#20975 ) * mtmd: DeepSeek-OCR 2 support, with multi-tile dynamic resolution * introduced clip_image_f32::add_viewsep * address PR review - drop redundant ggml_cpy ops in both deepseekocr versions build - drop no-op ggml_cont in build_sam - assert num_image_tokens deepseekocr2 - view_seperator as (1, n_embd) at conversion (for both versions) - drop redundant ggml_reshape_2d * Update tools/mtmd/models/deepseekocr2.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2026-05-29 16:13:51 +02:00
Saba Fallah	a8681a0ed2	mtmd : DeepSeek-OCR image processing fixes, img_tool::resize padding refactor (#23345 ) * mtmd : deepseek-ocr fixes, improvements and refactoring - image processing changes to achieve full parity with Pillow (reference impl) - SAM mask casting only when flash-attn is on - SAM refactor (build_sam() extracted so deepseek-ocr-2 can reuse it) - llama-chat changes to fix server/WebUI issue (new media_markers_first()) - adapted test-chat-template and added test cases for deepseek-ocr - changed regression test for deepseek-ocr to use CER+chrF scores for ground-truth comparison; removed embedding-model - ty.toml ignore unresolved-import for tools/mtmd/tests/** * image-text reordering fix removed * refactor bool add_padding + pad_rounding enum into a single pad_style enum	2026-05-20 17:37:10 +02:00
tc-mb	2496f9c149	mtmd : support MiniCPM-V 4.6 (#22529 ) * Support MiniCPM-V 4.6 in new branch Signed-off-by: tc-mb <tianchi_cai@icloud.com> * fix code bug Signed-off-by: tc-mb <tianchi_cai@icloud.com> * fix pre-commit Signed-off-by: tc-mb <tianchi_cai@icloud.com> * fix convert Signed-off-by: tc-mb <tianchi_cai@icloud.com> * rename clip_graph_minicpmv4_6 Signed-off-by: tc-mb <tianchi_cai@icloud.com> * use new TYPE_MINICPMV4_6 Signed-off-by: tc-mb <tianchi_cai@icloud.com> * use build_attn to allow flash attention support Signed-off-by: tc-mb <tianchi_cai@icloud.com> * no use legacy code, restored here. Signed-off-by: tc-mb <tianchi_cai@icloud.com> * use the existing tensors name Signed-off-by: tc-mb <tianchi_cai@icloud.com> * unused ctx->model.hparams.minicpmv_version Signed-off-by: tc-mb <tianchi_cai@icloud.com> * use n_merge for slice alignment Signed-off-by: tc-mb <tianchi_cai@icloud.com> * borrow wa_layer_indexes for vit_merger insertion point Signed-off-by: tc-mb <tianchi_cai@icloud.com> * fix code style Signed-off-by: tc-mb <tianchi_cai@icloud.com> * Update convert_hf_to_gguf.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * use filter_tensors and add model.vision_tower Signed-off-by: tc-mb <tianchi_cai@icloud.com> * fix chkhsh Signed-off-by: tc-mb <tianchi_cai@icloud.com> * fix type check Signed-off-by: tc-mb <tianchi_cai@icloud.com> --------- Signed-off-by: tc-mb <tianchi_cai@icloud.com> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-05-06 21:54:09 +02:00
Sergiu	82764d8f40	mtmd: fix crash when sending image under 2x2 pixels (#21711 )	2026-04-12 23:59:21 +02:00
forforever73	09343c0198	model : support step3-vl-10b (#21287 ) * feat: support step3-vl-10b * use fused QKV && mapping tensor in tensor_mapping.py * guard hardcoded params and drop crop metadata * get understand_projector_stride from global config * img_u8_resize_bilinear_to_f32 move in step3vl class * Apply suggestions from code review Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * fix the \r\n mess * add width and heads to MmprojModel.set_gguf_parameters --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-04-08 09:51:31 +02:00
Xuan-Son Nguyen	871f1a2d2f	mtmd: add more sanity checks (#21047 )	2026-03-27 11:00:52 +01:00
Xuan-Son Nguyen	a73bbd5d92	mtmd: refactor image preprocessing (#21031 ) * mtmd: refactor image pre-processing * correct some places * correct lfm2 * fix deepseek-ocr on server * add comment to clarify about mtmd_image_preprocessor_dyn_size	2026-03-26 19:49:20 +01:00

12 Commits