Georgi Gerganov
27c8bb4f63
logs : reduce v2 (#25078)
* server : reduce logs
* cont : common
* cont : spec
* cont : CMN_ -> COM_
2026-06-28 08:52:15 +03:00
Georgi Gerganov
d8a24ccee2
fit : wrap llama_device_memory_data (#24522)
2026-06-13 08:09:52 +03:00
Aman Gupta
83eebe9d08
server: add margin for draft model for fit (#23485)
2026-05-24 14:43:08 +08:00
Georgi Gerganov
67b2b7f2f2
logs : reduce (#23021)
* logs : reduce
* args : fix envs
* server : fix build
* common : print verbosity level at start
* server : clean-up logs
* server : print prompt processing timings + sampling params
* minor : whitespaces
2026-05-14 13:05:52 +03:00
fl0rianr
a0101225bc
common: do not fit to unknown device memory (#22614)
* common: do not fit to unknown device memory
Signed-off-by: Florian Reinle <f.reinle@otec.de>
* common: preserve host fallback for non-GPU fit devices
Signed-off-by: Florian Reinle <f.reinle@otec.de>
* common: keep unknown GPU fit memory at zero
Signed-off-by: Florian Reinle <f.reinle@otec.de>
---------
Signed-off-by: Florian Reinle <f.reinle@otec.de>
2026-05-06 17:03:45 +02:00
rankaiyx
42401c72b8
Fix type casting for unaccounted memory calculation (#22424)
2026-04-27 14:31:13 +02:00
Georgi Gerganov
cfe9838d26
fit-params : refactor + add option to output estimated memory per device (#22171)
* fit-params : add option to output estimated memory per device
* cont : minor
* cont : refactor
* cont : move fit params implementation to libcommon
* cont : header
* cont : headers
* cont : codeowners
2026-04-21 09:54:36 +03:00