08011c2ca1 | 2025-02-20 20:55:13 +02:00 | context : add llama_kv_cache_recurrent prototype
b1554be1d7 | 2025-02-20 18:30:04 +02:00 | context : add cache-less llama_context
f95b04a21c | 2025-02-19 18:52:20 +02:00 | model : fix order kvq -> qkv
2eacb4c1bf | 2025-02-19 18:43:49 +02:00 | graph : simplify attention api
e17e4b72d1 | 2025-02-19 16:07:27 +02:00 | context : add llama_context_recurrent
5f11a5502a | 2025-02-19 14:36:27 +02:00 | kv-cache : remove llama_kv_cache_i
f5cedbcaaa | 2025-02-18 21:28:58 +02:00 | kv-cache : prepare for abstraction
9e50456e19 | 2025-02-18 14:53:02 +02:00 | context : minor simplify
befe14f06f | 2025-02-18 14:47:53 +02:00 | llama : reorder encode/decode in sources
172f61690c | 2025-02-18 13:48:43 +02:00 | cont : return important tensors
c23590319a | 2025-02-18 13:48:21 +02:00 | graph : add llama_graph_result
1d801d27b9 | 2025-02-14 17:22:55 +02:00 | graph : update attn/kv_self names
828064564c | 2025-02-14 16:48:21 +02:00 | context : move common inputs to base class
d5e8e1a2ba | 2025-02-14 16:10:55 +02:00 | context : remove batch_manager
131743ff4f | 2025-02-13 17:17:51 +02:00 | context : abstract constructor and init
ed3cb55abe | 2025-02-13 15:53:15 +02:00 | context : abstract input
107d1e2c32 | 2025-02-13 15:42:14 +02:00 | context : move output functionality to base class
f7c7757bab | 2025-02-13 12:37:28 +02:00 | context : abstract state read/write
3a504d9a0b | 2025-02-13 12:25:54 +02:00 | llama : introduce llama_io interfaces
fbe6a07256 | 2025-02-12 17:16:44 +02:00 | context : rename to llama_context_kv_self
6ee86e5e0f | 2025-02-12 16:29:15 +02:00 | graph : restore ubatch in build_cb
f63aeecce6 | 2025-02-12 15:08:40 +02:00 | llama : models now build their graphs using llama_graph_i
e633dc171a | 2025-02-12 13:49:44 +02:00 | context : introduce llama_graph_i
5eae8e5183 | 2025-02-12 13:32:02 +02:00 | context : move build_rope_factors to base class
d146a14f77 | 2025-02-12 12:41:36 +02:00 | context : minor naming fix
8da7f612b7 | 2025-02-12 12:15:04 +02:00 | context : improve llama_context encapsulation
b52b79b048 | 2025-02-12 11:23:38 +02:00 | context : move encode/decode to llama-context.cpp
02ef4be975 | 2025-02-11 22:27:21 +02:00 | context : initial abstraction
2cd8a903c8 | 2025-02-10 17:01:27 +02:00 | context : make output functions members
ef358ee78f | 2025-02-10 16:14:13 +02:00 | context : add decode/encode
1eca8916b5 | 2025-02-03 14:17:50 +02:00 | llama : fix rwkv inference (#11618)
    Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
3e23be7911 | 2025-02-02 10:49:32 +02:00 | context : store graph build function callback
e665b57fa2 | 2025-01-27 14:09:22 +02:00 | Merge branch 'master' into gg/llama-kv-cache
a0c500b4dc | 2025-01-26 20:16:22 +02:00 | context : prepare for abstraction
99422dfa3f | 2025-01-26 20:16:22 +02:00 | context : introduce llama_batch_manager
133ad6a723 | 2025-01-26 20:16:22 +02:00 | context : initial need_reserve logic
f0713498fd | 2025-01-26 20:16:22 +02:00 | context : add get_ctx_padding()
b4ec1d4429 | 2025-01-26 20:16:21 +02:00 | cont : move kv_self update to llama_context
f2524c0e41 | 2025-01-26 20:16:21 +02:00 | llama : remove references to llama_kv_cache (wip)
    Intermediate step necessary to abstract the `llama_context` and `llama_kv_cache`.
a19f671fe0 | 2025-01-26 20:16:21 +02:00 | context : minor
afa8a9ec9b | 2025-01-12 11:32:42 +02:00 | llama : add llama_vocab, functions -> methods, naming (#11110)
    * llama : functions -> methods (#11110)
    * llama : add struct llama_vocab to the API (#11156)
    * hparams : move vocab params to llama_vocab (#11159)
    * vocab : more pimpl (#11165)
    * vocab : minor tokenization optimizations (#11160)
    * lora : update API names (#11167)
    * llama : update API names to use correct prefix (#11174)
    * vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174)
    * vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174)
    Co-authored-by: Diego Devesa <slarengh@gmail.com>
f66f582927 | 2025-01-03 10:18:53 +02:00 | llama : refactor src/llama.cpp (#10902)
    * llama : scatter llama.cpp into multiple modules (wip)
    * llama : control-vector -> adapter
    * llama : arch
    * llama : mmap
    * ci : remove BUILD_SHARED_LIBS=OFF
    * llama : chat
    * llama : model
    * llama : hparams
    * llama : adapter
    * examples : fix
    * llama : kv cache
    * llama : impl
    * llama : batch
    * llama : context
    * llama : model loader
    * common : update lora
    * llama : quant