Pull requests: ggml-org/llama.cpp
server : make n_cache_reuse configurable per request
examples
server
#17858
opened Dec 8, 2025 by
ggerganov
[SYCL] support bfloat16 release package
devops
improvements to build systems and github actions
#17855
opened Dec 8, 2025 by
arthw
examples: fix memory leak for simple example
examples
#17854
opened Dec 8, 2025 by
lizhenneng
cuda : add FILL op support
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#17851
opened Dec 8, 2025 by
JayZenith
Webui: copy prompt and attachments
examples
server
#17841
opened Dec 7, 2025 by
ServeurpersoCom
[SYCL] fix softmax for iGPU
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#17838
opened Dec 7, 2025 by
NeoZhangJianyu
debug:Adding CPU-side visual trace for hexagon
ggml
changes relating to the ggml tensor library for machine learning
script
Script related
#17837
opened Dec 7, 2025 by
Ethan-a2
console: allow using arrow left/right, home/end keys and history mode
#17836
opened Dec 7, 2025 by
ngxson
server: delegate result_state creation to server_task
examples
server
#17835
opened Dec 6, 2025 by
ngxson
[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#17826
opened Dec 6, 2025 by
NeoZhangJianyu
cann : fix ops broken by circular padding guard
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#17825
opened Dec 6, 2025 by
CISC
llama : add token matching support to llama-grammar
testing
Everything test related
#17816
opened Dec 6, 2025 by
aldehir
3 tasks done
CANN: support gated linear attn
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#17814
opened Dec 6, 2025 by
YushengZhao
vulkan: faster q6_k matmul
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#17813
opened Dec 6, 2025 by
netrunnereve
model: support Rnj-1
model
Model specific
python
python script changes
#17811
opened Dec 6, 2025 by
philip-essential
webui: Fix parsing non-LaTeX occurrencies of \( or \)
examples
server
#17810
opened Dec 6, 2025 by
allozaur
[DRAFT] CUDA: Improve performance via less synchronizations between token
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs