tqcq / llama.cpp
Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-09-01 12:52:17 -04:00)
Branch: compilade/imatrix-neutral-prior
Path: llama.cpp / common
Latest commit: ec428b02c3 by Diego Devesa, 2025-08-05 01:05:36 +02:00
llama : add --n-cpu-moe option (#15077): keeps the MoE weights of the first N layers on the CPU
arg.cpp                     llama : add --n-cpu-moe option (#15077)                                            2025-08-05 01:05:36 +02:00
arg.h                       …
base64.hpp                  …
build-info.cpp.in           …
chat-parser.cpp             …
chat-parser.h               …
chat.cpp                    chat : fix multiple tool_calls on hermes-2-pro (#14962)                            2025-08-02 18:04:48 +08:00
chat.h                      …
CMakeLists.txt              cmake : do not search for curl libraries by ourselves (#14613)                     2025-07-10 15:29:05 +03:00
common.cpp                  llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)  2025-07-31 18:11:34 +02:00
common.h                    imatrix : warn when GGUF imatrix is saved without .gguf suffix (#15076)            2025-08-04 23:26:52 +02:00
console.cpp                 …
console.h                   …
json-partial.cpp            …
json-partial.h              …
json-schema-to-grammar.cpp  …
json-schema-to-grammar.h    …
llguidance.cpp              …
log.cpp                     …
log.h                       …
ngram-cache.cpp             …
ngram-cache.h               …
regex-partial.cpp           …
regex-partial.h             …
sampling.cpp                …
sampling.h                  …
speculative.cpp             server : implement universal assisted decoding (#12635)                            2025-07-31 14:25:23 +02:00
speculative.h               server : implement universal assisted decoding (#12635)                            2025-07-31 14:25:23 +02:00